Compare commits

..

49 Commits

Author SHA1 Message Date
Cursor Agent
62a262d554 Reset finalize guard on each executor invocation 2026-03-03 18:21:27 +00:00
lorenzejay
87e1852746 refactor: enhance planning and execution flow in agents
- Updated the PlannerObserver to accept a kickoff input for standalone task execution, improving flexibility in task handling.
- Refined the step execution process in StepExecutor to support multi-turn action loops, allowing for iterative tool execution and observation.
- Introduced a method to extract relevant task sections from descriptions, ensuring clarity in task requirements.
- Enhanced the AgentExecutor to manage step failures more effectively, triggering replans only when necessary and preserving completed task history.
- Updated translations to reflect changes in planning principles and execution prompts, emphasizing concrete and executable steps.
2026-03-03 10:17:35 -08:00
lorenzejay
76f329a025 fix: update observation handling in PlannerObserver for LLM errors
- Modified the error handling in the PlannerObserver to default to a conservative replan when an LLM call fails.
- Updated the return values to indicate that the step was not completed successfully and that a full replan is needed.
- Added a new test to verify the behavior of the observer when an LLM error occurs, ensuring the correct replan logic is triggered.
2026-02-25 13:44:50 -08:00
lorenzejay
687d6abdaa refactor: enhance final answer synthesis logic in AgentExecutor
- Updated the finalization process to conditionally skip synthesis when the last todo result is sufficient as a complete answer.
- Introduced a new method to determine if the last todo result can be used directly, improving efficiency.
- Added tests to verify the new behavior, ensuring synthesis is skipped when appropriate and maintained when a response model is set.
2026-02-24 15:04:02 -08:00
lorenzejay
3302c5ab77 enhance step executor with tool usage events and validation
- Added event emissions for tool usage, including started and finished events, to track tool execution.
- Implemented validation to ensure expected tools are called during step execution, raising errors when not.
- Refactored the  method to handle tool execution with event logging.
- Introduced a new method  for parsing tool input into a structured format.
- Updated tests to cover new functionality and ensure correct behavior of tool usage events.
2026-02-24 14:19:27 -08:00
lorenzejay
32059c7d79 refactor: streamline observation and refinement process in PlannerObserver
- Updated the PlannerObserver to apply structured refinements directly from observations without requiring a second LLM call.
- Renamed  method to  for clarity.
- Enhanced documentation to reflect changes in how refinements are handled.
- Removed unnecessary LLM message building and parsing logic, simplifying the refinement process.
- Updated event emissions to include summaries of refinements instead of raw data.
2026-02-24 09:03:04 -08:00
lorenzejay
8194bb42f1 improving step executor 2026-02-23 14:53:25 -08:00
lorenzejay
388de2252e fix datetime 2026-02-23 14:01:28 -08:00
lorenzejay
8f104e6eca Merge branch 'lorenze/feat/plan-execute-pattern' of github.com:crewAIInc/crewAI into lorenze/feat/planning-pt-3-todo-list-execution 2026-02-23 13:39:05 -08:00
lorenzejay
5317947b4f Merge branch 'main' of github.com:crewAIInc/crewAI into lorenze/feat/plan-execute-pattern 2026-02-23 13:07:09 -08:00
lorenzejay
9fea9fe757 Merge branch 'main' of github.com:crewAIInc/crewAI into lorenze/feat/plan-execute-pattern 2026-02-20 09:54:39 -08:00
lorenzejay
fd6558e0f2 consolidate agent logic 2026-02-13 13:55:28 -08:00
lorenzejay
e26d3e471d Refactor PlannerObserver and StepExecutor to Utilize I18N for Prompts
This update enhances the PlannerObserver and StepExecutor classes by integrating the I18N utility for managing prompts and messages. The system and user prompts are now retrieved from the I18N module, allowing for better localization and maintainability. Additionally, the code has been cleaned up to remove hardcoded strings, improving readability and consistency across the planning and execution processes.
2026-02-13 13:40:39 -08:00
lorenzejay
fad23d804a Refactor PlannerObserver and StepExecutor to Utilize I18N for Prompts
This update enhances the PlannerObserver and StepExecutor classes by integrating the I18N utility for managing prompts and messages. The system and user prompts are now retrieved from the I18N module, allowing for better localization and maintainability. Additionally, the code has been cleaned up to remove hardcoded strings, improving readability and consistency across the planning and execution processes.
2026-02-13 13:40:04 -08:00
lorenzejay
ca89b729f8 dry 2026-02-13 13:33:55 -08:00
lorenzejay
7e09e01215 fixing tests 2026-02-12 14:26:29 -08:00
lorenzejay
eec88ad2bb cassette regen 2026-02-12 10:53:46 -08:00
lorenzejay
5d4ed12072 regen cassettes for test and fix test 2026-02-11 16:04:24 -08:00
lorenzejay
a164e94f49 Enhance PlanningConfig and AgentExecutor with Reasoning Effort Levels
This update introduces a new  attribute in the  class, allowing users to customize the observation and replanning behavior during task execution. The  class has been modified to utilize this new attribute, routing step observations based on the specified reasoning effort level: low, medium, or high.

Additionally, tests have been added to validate the functionality of the reasoning effort levels, ensuring that the agent behaves as expected under different configurations. This enhancement improves the adaptability and efficiency of the planning process in agent execution.
2026-02-11 13:55:02 -08:00
lorenzejay
576345140f fix 2026-02-10 17:38:49 -08:00
lorenzejay
8fd7ef7f43 linted 2026-02-10 17:14:38 -08:00
lorenzejay
9cac1792bd regen tests 2026-02-10 17:05:47 -08:00
lorenzejay
b2de783559 Merge branch 'lorenze/feat/plan-execute-pattern' of github.com:crewAIInc/crewAI into lorenze/feat/planning-pt-3-todo-list-execution 2026-02-10 16:58:14 -08:00
lorenzejay
d77e2cb1f8 Merge branch 'lorenze/feat/plan-execute-pattern' of github.com:crewAIInc/crewAI into lorenze/feat/plan-execute-pattern 2026-02-10 16:10:20 -08:00
Lorenze Jay
a6dcb275e1 Lorenze/feat planning pt 2 todo list gen (#4449)
* feat: introduce PlanningConfig for enhanced agent planning capabilities

This update adds a new PlanningConfig class to manage agent planning configurations, allowing for customizable planning behavior before task execution. The existing reasoning parameter is deprecated in favor of this new configuration, ensuring backward compatibility while enhancing the planning process. Additionally, the Agent class has been updated to utilize this new configuration, and relevant utility functions have been adjusted accordingly. Tests have been added to validate the new planning functionality and ensure proper integration with existing agent workflows.

* dropping redundancy

* fix test

* revert handle_reasoning here

* refactor: update reasoning handling in Agent class

This commit modifies the Agent class to conditionally call the handle_reasoning function based on the executor class being used. The legacy CrewAgentExecutor will continue to utilize handle_reasoning, while the new AgentExecutor will manage planning internally. Additionally, the PlanningConfig class has been referenced in the documentation to clarify its role in enabling or disabling planning. Tests have been updated to reflect these changes and ensure proper functionality.

* improve planning prompts

* matching

* refactor: remove default enabled flag from PlanningConfig in Agent class

* more cassettes

* fix test

* feat: enhance agent planning with structured todo management

This commit introduces a new planning system within the AgentExecutor class, allowing for the creation of structured todo items from planning steps. The TodoList and TodoItem models have been added to facilitate tracking of plan execution. The reasoning plan now includes a list of steps, improving the clarity and organization of agent tasks. Additionally, tests have been added to validate the new planning functionality and ensure proper integration with existing workflows.

* refactor: update planning prompt and remove deprecated methods in reasoning handler

* improve planning prompt

* improve handler

* linted

* linted
2026-02-10 16:08:26 -08:00
Lorenze Jay
79a01fca31 feat: introduce PlanningConfig for enhanced agent planning capabilities (#4344)
* feat: introduce PlanningConfig for enhanced agent planning capabilities

This update adds a new PlanningConfig class to manage agent planning configurations, allowing for customizable planning behavior before task execution. The existing reasoning parameter is deprecated in favor of this new configuration, ensuring backward compatibility while enhancing the planning process. Additionally, the Agent class has been updated to utilize this new configuration, and relevant utility functions have been adjusted accordingly. Tests have been added to validate the new planning functionality and ensure proper integration with existing agent workflows.

* dropping redundancy

* fix test

* revert handle_reasoning here

* refactor: update reasoning handling in Agent class

This commit modifies the Agent class to conditionally call the handle_reasoning function based on the executor class being used. The legacy CrewAgentExecutor will continue to utilize handle_reasoning, while the new AgentExecutor will manage planning internally. Additionally, the PlanningConfig class has been referenced in the documentation to clarify its role in enabling or disabling planning. Tests have been updated to reflect these changes and ensure proper functionality.

* improve planning prompts

* matching

* refactor: remove default enabled flag from PlanningConfig in Agent class

* more cassettes

* fix test

* refactor: update planning prompt and remove deprecated methods in reasoning handler

* improve planning prompt
2026-02-10 13:26:49 -08:00
lorenzejay
735a2204fd refactor: implement structured output handling in final answer synthesis
This commit enhances the final answer synthesis process in the AgentExecutor class by introducing support for structured outputs when a response model is specified. The synthesis method now utilizes the response model to produce outputs that conform to the expected schema, while still falling back to concatenation in case of synthesis failures. This change ensures that intermediate steps yield free-text results, but the final output can be structured, improving the overall coherence and usability of the synthesized answers.
2026-02-08 16:19:16 -08:00
lorenzejay
ff57956d05 refactor: enhance final answer synthesis in AgentExecutor
This commit improves the synthesis of final answers in the AgentExecutor class by implementing a more coherent approach to combining results from multiple todo items. The method now utilizes a single LLM call to generate a polished response, falling back to concatenation if the synthesis fails. Additionally, the test cases have been updated to reflect the changes in planning and execution, ensuring that the results are properly validated and that the plan-and-execute architecture is functioning as intended.
2026-02-06 15:39:04 -08:00
lorenzejay
9f3c53ca97 refactor: enhance final answer synthesis in AgentExecutor
This commit improves the synthesis of final answers in the AgentExecutor class by implementing a more coherent approach to combining results from multiple todo items. The method now utilizes a single LLM call to generate a polished response, falling back to concatenation if the synthesis fails. Additionally, the test cases have been updated to reflect the changes in planning and execution, ensuring that the results are properly validated and that the plan-and-execute architecture is functioning as intended.
2026-02-06 10:38:55 -08:00
lorenzejay
8e1474d371 feat: introduce PlannerObserver and StepExecutor for enhanced plan execution
This commit adds the PlannerObserver and StepExecutor classes to the CrewAI framework, implementing the observation phase of the Plan-and-Execute architecture. The PlannerObserver analyzes step execution results, determines plan validity, and suggests refinements, while the StepExecutor executes individual todo items in isolation. These additions improve the overall planning and execution process, allowing for more dynamic and responsive agent behavior. Additionally, new observation events have been defined to facilitate monitoring and logging of the planning process.
2026-02-05 15:46:21 -08:00
lorenzejay
81d9fd4ab3 execute todos and be able to track them 2026-02-05 10:51:54 -08:00
lorenzejay
7e1ae7226b improve handler 2026-02-03 15:15:22 -08:00
lorenzejay
adee852a2a Merge branch 'lorenze/feat/planning-pt-1' of github.com:crewAIInc/crewAI into lorenze/feat-planning-pt-2-todo-list-gen 2026-02-03 13:54:47 -08:00
lorenzejay
b7d5a4afef Merge branch 'main' of github.com:crewAIInc/crewAI into lorenze/feat/planning-pt-1 2026-02-03 13:54:27 -08:00
lorenzejay
abf86d5572 Merge branch 'lorenze/feat/planning-pt-1' of github.com:crewAIInc/crewAI into lorenze/feat-planning-pt-2-todo-list-gen 2026-02-03 13:48:36 -08:00
lorenzejay
02dc39faa2 improve planning prompt 2026-02-03 13:47:38 -08:00
lorenzejay
dd8230f051 Merge branch 'lorenze/feat/planning-pt-1' of github.com:crewAIInc/crewAI into lorenze/feat-planning-pt-2-todo-list-gen 2026-02-03 13:36:48 -08:00
lorenzejay
a3c2c946d3 refactor: update planning prompt and remove deprecated methods in reasoning handler 2026-02-03 13:35:45 -08:00
lorenzejay
bd95cffd41 feat: enhance agent planning with structured todo management
This commit introduces a new planning system within the AgentExecutor class, allowing for the creation of structured todo items from planning steps. The TodoList and TodoItem models have been added to facilitate tracking of plan execution. The reasoning plan now includes a list of steps, improving the clarity and organization of agent tasks. Additionally, tests have been added to validate the new planning functionality and ensure proper integration with existing workflows.
2026-02-03 13:24:55 -08:00
lorenzejay
ab6ce4b7aa fix test 2026-02-03 08:57:47 -08:00
lorenzejay
ac1d1fcfa3 more cassettes 2026-02-03 08:27:10 -08:00
lorenzejay
83f38184ff refactor: remove default enabled flag from PlanningConfig in Agent class 2026-02-03 08:01:27 -08:00
lorenzejay
f2016f8979 matching 2026-02-03 07:59:27 -08:00
lorenzejay
fe1e29d2f9 improve planning prompts 2026-02-02 16:36:18 -08:00
lorenzejay
861da95aad refactor: update reasoning handling in Agent class
This commit modifies the Agent class to conditionally call the handle_reasoning function based on the executor class being used. The legacy CrewAgentExecutor will continue to utilize handle_reasoning, while the new AgentExecutor will manage planning internally. Additionally, the PlanningConfig class has been referenced in the documentation to clarify its role in enabling or disabling planning. Tests have been updated to reflect these changes and ensure proper functionality.
2026-02-02 16:27:39 -08:00
lorenzejay
50b9b42de9 revert handle_reasoning here 2026-02-02 16:21:36 -08:00
lorenzejay
85d22ba902 fix test 2026-02-02 16:08:12 -08:00
lorenzejay
9277d219e3 dropping redundancy 2026-02-02 16:01:37 -08:00
lorenzejay
710b0ce2ae feat: introduce PlanningConfig for enhanced agent planning capabilities
This update adds a new PlanningConfig class to manage agent planning configurations, allowing for customizable planning behavior before task execution. The existing reasoning parameter is deprecated in favor of this new configuration, ensuring backward compatibility while enhancing the planning process. Additionally, the Agent class has been updated to utilize this new configuration, and relevant utility functions have been adjusted accordingly. Tests have been added to validate the new planning functionality and ensure proper integration with existing agent workflows.
2026-02-02 15:55:28 -08:00
141 changed files with 40321 additions and 5445 deletions

View File

@@ -21,6 +21,7 @@ OPENROUTER_API_KEY=fake-openrouter-key
AWS_ACCESS_KEY_ID=fake-aws-access-key
AWS_SECRET_ACCESS_KEY=fake-aws-secret-key
AWS_DEFAULT_REGION=us-east-1
AWS_REGION_NAME=us-east-1
# -----------------------------------------------------------------------------
# Azure OpenAI Configuration

View File

@@ -440,7 +440,6 @@
"en/enterprise/guides/build-crew",
"en/enterprise/guides/prepare-for-deployment",
"en/enterprise/guides/deploy-to-amp",
"en/enterprise/guides/private-package-registry",
"en/enterprise/guides/kickoff-crew",
"en/enterprise/guides/update-crew",
"en/enterprise/guides/enable-crew-studio",
@@ -879,7 +878,6 @@
"pt-BR/enterprise/guides/build-crew",
"pt-BR/enterprise/guides/prepare-for-deployment",
"pt-BR/enterprise/guides/deploy-to-amp",
"pt-BR/enterprise/guides/private-package-registry",
"pt-BR/enterprise/guides/kickoff-crew",
"pt-BR/enterprise/guides/update-crew",
"pt-BR/enterprise/guides/enable-crew-studio",
@@ -1345,7 +1343,6 @@
"ko/enterprise/guides/build-crew",
"ko/enterprise/guides/prepare-for-deployment",
"ko/enterprise/guides/deploy-to-amp",
"ko/enterprise/guides/private-package-registry",
"ko/enterprise/guides/kickoff-crew",
"ko/enterprise/guides/update-crew",
"ko/enterprise/guides/enable-crew-studio",

View File

@@ -470,7 +470,7 @@ In this section, you'll find detailed examples that help you select, configure,
To get an Express mode API key:
- New Google Cloud users: Get an [express mode API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey)
- Existing Google Cloud users: Get a [Google Cloud API key bound to a service account](https://cloud.google.com/docs/authentication/api-keys)
For more details, see the [Vertex AI Express mode documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey).
</Info>
@@ -652,7 +652,6 @@ In this section, you'll find detailed examples that help you select, configure,
# Optional
AWS_SESSION_TOKEN=<your-session-token> # For temporary credentials
AWS_DEFAULT_REGION=<your-region> # Defaults to us-east-1
AWS_REGION_NAME=<your-region> # Alternative configuration for backwards compatibility with LiteLLM. Defaults to us-east-1
```
**Basic Usage:**
@@ -696,7 +695,6 @@ In this section, you'll find detailed examples that help you select, configure,
- `AWS_SECRET_ACCESS_KEY`: AWS secret key (required)
- `AWS_SESSION_TOKEN`: AWS session token for temporary credentials (optional)
- `AWS_DEFAULT_REGION`: AWS region (defaults to `us-east-1`)
- `AWS_REGION_NAME`: AWS region (defaults to `us-east-1`). Alternative configuration for backwards compatibility with LiteLLM
**Features:**
- Native tool calling support via Converse API

View File

@@ -177,11 +177,6 @@ You need to push your crew to a GitHub repository. If you haven't created a crew
![Set Environment Variables](/images/enterprise/set-env-variables.png)
</Frame>
<Info>
Using private Python packages? You'll need to add your registry credentials here too.
See [Private Package Registries](/en/enterprise/guides/private-package-registry) for the required variables.
</Info>
</Step>
<Step title="Deploy Your Crew">

View File

@@ -256,12 +256,6 @@ Before deployment, ensure you have:
1. **LLM API keys** ready (OpenAI, Anthropic, Google, etc.)
2. **Tool API keys** if using external tools (Serper, etc.)
<Info>
If your project depends on packages from a **private PyPI registry**, you'll also need to configure
registry authentication credentials as environment variables. See the
[Private Package Registries](/en/enterprise/guides/private-package-registry) guide for details.
</Info>
<Tip>
Test your project locally with the same environment variables before deploying
to catch configuration issues early.

View File

@@ -1,263 +0,0 @@
---
title: "Private Package Registries"
description: "Install private Python packages from authenticated PyPI registries in CrewAI AMP"
icon: "lock"
mode: "wide"
---
<Note>
This guide covers how to configure your CrewAI project to install Python packages
from private PyPI registries (Azure DevOps Artifacts, GitHub Packages, GitLab, AWS CodeArtifact, etc.)
when deploying to CrewAI AMP.
</Note>
## When You Need This
If your project depends on internal or proprietary Python packages hosted on a private registry
rather than the public PyPI, you'll need to:
1. Tell UV **where** to find the package (an index URL)
2. Tell UV **which** packages come from that index (a source mapping)
3. Provide **credentials** so UV can authenticate during install
CrewAI AMP uses [UV](https://docs.astral.sh/uv/) for dependency resolution and installation.
UV supports authenticated private registries through `pyproject.toml` configuration combined
with environment variables for credentials.
## Step 1: Configure pyproject.toml
Three pieces work together in your `pyproject.toml`:
### 1a. Declare the dependency
Add the private package to your `[project.dependencies]` like any other dependency:
```toml
[project]
dependencies = [
"crewai[tools]>=0.100.1,<1.0.0",
"my-private-package>=1.2.0",
]
```
### 1b. Define the index
Register your private registry as a named index under `[[tool.uv.index]]`:
```toml
[[tool.uv.index]]
name = "my-private-registry"
url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
explicit = true
```
<Info>
The `name` field is important — UV uses it to construct the environment variable names
for authentication (see [Step 2](#step-2-set-authentication-credentials) below).
Setting `explicit = true` means UV won't search this index for every package — only the
ones you explicitly map to it in `[tool.uv.sources]`. This avoids unnecessary queries
against your private registry and protects against dependency confusion attacks.
</Info>
### 1c. Map the package to the index
Tell UV which packages should be resolved from your private index using `[tool.uv.sources]`:
```toml
[tool.uv.sources]
my-private-package = { index = "my-private-registry" }
```
### Complete example
```toml
[project]
name = "my-crew-project"
version = "0.1.0"
requires-python = ">=3.10,<=3.13"
dependencies = [
"crewai[tools]>=0.100.1,<1.0.0",
"my-private-package>=1.2.0",
]
[tool.crewai]
type = "crew"
[[tool.uv.index]]
name = "my-private-registry"
url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
explicit = true
[tool.uv.sources]
my-private-package = { index = "my-private-registry" }
```
After updating `pyproject.toml`, regenerate your lock file:
```bash
uv lock
```
<Warning>
Always commit the updated `uv.lock` along with your `pyproject.toml` changes.
The lock file is required for deployment — see [Prepare for Deployment](/en/enterprise/guides/prepare-for-deployment).
</Warning>
## Step 2: Set Authentication Credentials
UV authenticates against private indexes using environment variables that follow a naming convention
based on the index name you defined in `pyproject.toml`:
```
UV_INDEX_{UPPER_NAME}_USERNAME
UV_INDEX_{UPPER_NAME}_PASSWORD
```
Where `{UPPER_NAME}` is your index name converted to **uppercase** with **hyphens replaced by underscores**.
For example, an index named `my-private-registry` uses:
| Variable | Value |
|----------|-------|
| `UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME` | Your registry username or token name |
| `UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD` | Your registry password or token/PAT |
<Warning>
These environment variables **must** be added via the CrewAI AMP **Environment Variables** settings —
either globally or at the deployment level. They cannot be set in `.env` files or hardcoded in your project.
See [Setting Environment Variables in AMP](#setting-environment-variables-in-amp) below.
</Warning>
## Registry Provider Reference
The table below shows the index URL format and credential values for common registry providers.
Replace placeholder values with your actual organization and feed details.
| Provider | Index URL | Username | Password |
|----------|-----------|----------|----------|
| **Azure DevOps Artifacts** | `https://pkgs.dev.azure.com/{org}/_packaging/{feed}/pypi/simple/` | Any non-empty string (e.g. `token`) | Personal Access Token (PAT) with Packaging Read scope |
| **GitHub Packages** | `https://pypi.pkg.github.com/{owner}/simple/` | GitHub username | Personal Access Token (classic) with `read:packages` scope |
| **GitLab Package Registry** | `https://gitlab.com/api/v4/projects/{project_id}/packages/pypi/simple/` | `__token__` | Project or Personal Access Token with `read_api` scope |
| **AWS CodeArtifact** | Use the URL from `aws codeartifact get-repository-endpoint` | `aws` | Token from `aws codeartifact get-authorization-token` |
| **Google Artifact Registry** | `https://{region}-python.pkg.dev/{project}/{repo}/simple/` | `_json_key_base64` | Base64-encoded service account key |
| **JFrog Artifactory** | `https://{instance}.jfrog.io/artifactory/api/pypi/{repo}/simple/` | Username or email | API key or identity token |
| **Self-hosted (devpi, Nexus, etc.)** | Your registry's simple API URL | Registry username | Registry password |
<Tip>
For **AWS CodeArtifact**, the authorization token expires periodically.
You'll need to refresh the `UV_INDEX_*_PASSWORD` value when it expires.
Consider automating this in your CI/CD pipeline.
</Tip>
## Setting Environment Variables in AMP
Private registry credentials must be configured as environment variables in CrewAI AMP.
You have two options:
<Tabs>
<Tab title="Web Interface">
1. Log in to [CrewAI AMP](https://app.crewai.com)
2. Navigate to your automation
3. Open the **Environment Variables** tab
4. Add each variable (`UV_INDEX_*_USERNAME` and `UV_INDEX_*_PASSWORD`) with its value
See the [Deploy to AMP — Set Environment Variables](/en/enterprise/guides/deploy-to-amp#set-environment-variables) step for details.
</Tab>
<Tab title="CLI Deployment">
Add the variables to your local `.env` file before running `crewai deploy create`.
The CLI will securely transfer them to the platform:
```bash
# .env
OPENAI_API_KEY=sk-...
UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat-here
```
```bash
crewai deploy create
```
</Tab>
</Tabs>
<Warning>
**Never** commit credentials to your repository. Use AMP environment variables for all secrets.
The `.env` file should be listed in `.gitignore`.
</Warning>
To update credentials on an existing deployment, see [Update Your Crew — Environment Variables](/en/enterprise/guides/update-crew).
## How It All Fits Together
When CrewAI AMP builds your automation, the resolution flow works like this:
<Steps>
<Step title="Build starts">
AMP pulls your repository and reads `pyproject.toml` and `uv.lock`.
</Step>
<Step title="UV resolves dependencies">
UV reads `[tool.uv.sources]` to determine which index each package should come from.
</Step>
<Step title="UV authenticates">
For each private index, UV looks up `UV_INDEX_{NAME}_USERNAME` and `UV_INDEX_{NAME}_PASSWORD`
from the environment variables you configured in AMP.
</Step>
<Step title="Packages install">
UV downloads and installs all packages — both public (from PyPI) and private (from your registry).
</Step>
<Step title="Automation runs">
Your crew or flow starts with all dependencies available.
</Step>
</Steps>
## Troubleshooting
### Authentication Errors During Build
**Symptom**: Build fails with `401 Unauthorized` or `403 Forbidden` when resolving a private package.
**Check**:
- The `UV_INDEX_*` environment variable names match your index name exactly (uppercased, hyphens → underscores)
- Credentials are set in AMP environment variables, not just in a local `.env`
- Your token/PAT has the required read permissions for the package feed
- The token hasn't expired (especially relevant for AWS CodeArtifact)
### Package Not Found
**Symptom**: `No matching distribution found for my-private-package`.
**Check**:
- The index URL in `pyproject.toml` ends with `/simple/`
- The `[tool.uv.sources]` entry maps the correct package name to the correct index name
- The package is actually published to your private registry
- Run `uv lock` locally with the same credentials to verify resolution works
### Lock File Conflicts
**Symptom**: `uv lock` fails or produces unexpected results after adding a private index.
**Solution**: Set the credentials locally and regenerate:
```bash
export UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
export UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat
uv lock
```
Then commit the updated `uv.lock`.
## Related Guides
<CardGroup cols={3}>
<Card title="Prepare for Deployment" icon="clipboard-check" href="/en/enterprise/guides/prepare-for-deployment">
Verify project structure and dependencies before deploying.
</Card>
<Card title="Deploy to AMP" icon="rocket" href="/en/enterprise/guides/deploy-to-amp">
Deploy your crew or flow and configure environment variables.
</Card>
<Card title="Update Your Crew" icon="arrows-rotate" href="/en/enterprise/guides/update-crew">
Update environment variables and push changes to a running deployment.
</Card>
</CardGroup>

View File

@@ -176,11 +176,6 @@ Crew를 GitHub 저장소에 푸시해야 합니다. 아직 Crew를 만들지 않
![Set Environment Variables](/images/enterprise/set-env-variables.png)
</Frame>
<Info>
프라이빗 Python 패키지를 사용하시나요? 여기에 레지스트리 자격 증명도 추가해야 합니다.
필요한 변수는 [프라이빗 패키지 레지스트리](/ko/enterprise/guides/private-package-registry)를 참조하세요.
</Info>
</Step>
<Step title="Crew 배포하기">

View File

@@ -256,12 +256,6 @@ Crews와 Flows 모두 `src/project_name/main.py`에 진입점이 있습니다:
1. **LLM API 키** (OpenAI, Anthropic, Google 등)
2. **도구 API 키** - 외부 도구를 사용하는 경우 (Serper 등)
<Info>
프로젝트가 **프라이빗 PyPI 레지스트리**의 패키지에 의존하는 경우, 레지스트리 인증 자격 증명도
환경 변수로 구성해야 합니다. 자세한 내용은
[프라이빗 패키지 레지스트리](/ko/enterprise/guides/private-package-registry) 가이드를 참조하세요.
</Info>
<Tip>
구성 문제를 조기에 발견하기 위해 배포 전에 동일한 환경 변수로
로컬에서 프로젝트를 테스트하세요.

View File

@@ -1,261 +0,0 @@
---
title: "프라이빗 패키지 레지스트리"
description: "CrewAI AMP에서 인증된 PyPI 레지스트리의 프라이빗 Python 패키지 설치하기"
icon: "lock"
mode: "wide"
---
<Note>
이 가이드는 CrewAI AMP에 배포할 때 프라이빗 PyPI 레지스트리(Azure DevOps Artifacts, GitHub Packages,
GitLab, AWS CodeArtifact 등)에서 Python 패키지를 설치하도록 CrewAI 프로젝트를 구성하는 방법을 다룹니다.
</Note>
## 이 가이드가 필요한 경우
프로젝트가 공개 PyPI가 아닌 프라이빗 레지스트리에 호스팅된 내부 또는 독점 Python 패키지에
의존하는 경우, 다음을 수행해야 합니다:
1. UV에 패키지를 **어디서** 찾을지 알려줍니다 (index URL)
2. UV에 **어떤** 패키지가 해당 index에서 오는지 알려줍니다 (source 매핑)
3. UV가 설치 중에 인증할 수 있도록 **자격 증명**을 제공합니다
CrewAI AMP는 의존성 해결 및 설치에 [UV](https://docs.astral.sh/uv/)를 사용합니다.
UV는 `pyproject.toml` 구성과 자격 증명용 환경 변수를 결합하여 인증된 프라이빗 레지스트리를 지원합니다.
## 1단계: pyproject.toml 구성
`pyproject.toml`에서 세 가지 요소가 함께 작동합니다:
### 1a. 의존성 선언
프라이빗 패키지를 다른 의존성과 마찬가지로 `[project.dependencies]`에 추가합니다:
```toml
[project]
dependencies = [
"crewai[tools]>=0.100.1,<1.0.0",
"my-private-package>=1.2.0",
]
```
### 1b. index 정의
프라이빗 레지스트리를 `[[tool.uv.index]]` 아래에 명명된 index로 등록합니다:
```toml
[[tool.uv.index]]
name = "my-private-registry"
url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
explicit = true
```
<Info>
`name` 필드는 중요합니다 — UV는 이를 사용하여 인증을 위한 환경 변수 이름을
구성합니다 (아래 [2단계](#2단계-인증-자격-증명-설정)를 참조하세요).
`explicit = true`를 설정하면 UV가 모든 패키지에 대해 이 index를 검색하지 않습니다 —
`[tool.uv.sources]`에서 명시적으로 매핑한 패키지만 검색합니다. 이렇게 하면 프라이빗
레지스트리에 대한 불필요한 쿼리를 방지하고 의존성 혼동 공격을 차단할 수 있습니다.
</Info>
### 1c. 패키지를 index에 매핑
`[tool.uv.sources]`를 사용하여 프라이빗 index에서 해결해야 할 패키지를 UV에 알려줍니다:
```toml
[tool.uv.sources]
my-private-package = { index = "my-private-registry" }
```
### 전체 예시
```toml
[project]
name = "my-crew-project"
version = "0.1.0"
requires-python = ">=3.10,<=3.13"
dependencies = [
"crewai[tools]>=0.100.1,<1.0.0",
"my-private-package>=1.2.0",
]
[tool.crewai]
type = "crew"
[[tool.uv.index]]
name = "my-private-registry"
url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
explicit = true
[tool.uv.sources]
my-private-package = { index = "my-private-registry" }
```
`pyproject.toml`을 업데이트한 후 lock 파일을 다시 생성합니다:
```bash
uv lock
```
<Warning>
업데이트된 `uv.lock`을 항상 `pyproject.toml` 변경 사항과 함께 커밋하세요.
lock 파일은 배포에 필수입니다 — [배포 준비하기](/ko/enterprise/guides/prepare-for-deployment)를 참조하세요.
</Warning>
## 2단계: 인증 자격 증명 설정
UV는 `pyproject.toml`에서 정의한 index 이름을 기반으로 한 명명 규칙을 따르는
환경 변수를 사용하여 프라이빗 index에 인증합니다:
```
UV_INDEX_{UPPER_NAME}_USERNAME
UV_INDEX_{UPPER_NAME}_PASSWORD
```
여기서 `{UPPER_NAME}`은 index 이름을 **대문자**로 변환하고 **하이픈을 언더스코어로 대체**한 것입니다.
예를 들어, `my-private-registry`라는 이름의 index는 다음을 사용합니다:
| 변수 | 값 |
|------|-----|
| `UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME` | 레지스트리 사용자 이름 또는 토큰 이름 |
| `UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD` | 레지스트리 비밀번호 또는 토큰/PAT |
<Warning>
이 환경 변수는 CrewAI AMP **환경 변수** 설정을 통해 **반드시** 추가해야 합니다 —
전역적으로 또는 배포 수준에서. `.env` 파일에 설정하거나 프로젝트에 하드코딩할 수 없습니다.
아래 [AMP에서 환경 변수 설정](#amp에서-환경-변수-설정)을 참조하세요.
</Warning>
## 레지스트리 제공업체 참조
아래 표는 일반적인 레지스트리 제공업체의 index URL 형식과 자격 증명 값을 보여줍니다.
자리 표시자 값을 실제 조직 및 피드 세부 정보로 대체하세요.
| 제공업체 | Index URL | 사용자 이름 | 비밀번호 |
|---------|-----------|-----------|---------|
| **Azure DevOps Artifacts** | `https://pkgs.dev.azure.com/{org}/_packaging/{feed}/pypi/simple/` | 비어 있지 않은 임의의 문자열 (예: `token`) | Packaging Read 범위의 Personal Access Token (PAT) |
| **GitHub Packages** | `https://pypi.pkg.github.com/{owner}/simple/` | GitHub 사용자 이름 | `read:packages` 범위의 Personal Access Token (classic) |
| **GitLab Package Registry** | `https://gitlab.com/api/v4/projects/{project_id}/packages/pypi/simple/` | `__token__` | `read_api` 범위의 Project 또는 Personal Access Token |
| **AWS CodeArtifact** | `aws codeartifact get-repository-endpoint`의 URL 사용 | `aws` | `aws codeartifact get-authorization-token`의 토큰 |
| **Google Artifact Registry** | `https://{region}-python.pkg.dev/{project}/{repo}/simple/` | `_json_key_base64` | Base64로 인코딩된 서비스 계정 키 |
| **JFrog Artifactory** | `https://{instance}.jfrog.io/artifactory/api/pypi/{repo}/simple/` | 사용자 이름 또는 이메일 | API 키 또는 ID 토큰 |
| **자체 호스팅 (devpi, Nexus 등)** | 레지스트리의 simple API URL | 레지스트리 사용자 이름 | 레지스트리 비밀번호 |
<Tip>
**AWS CodeArtifact**의 경우 인증 토큰이 주기적으로 만료됩니다.
만료되면 `UV_INDEX_*_PASSWORD` 값을 갱신해야 합니다.
CI/CD 파이프라인에서 이를 자동화하는 것을 고려하세요.
</Tip>
## AMP에서 환경 변수 설정
프라이빗 레지스트리 자격 증명은 CrewAI AMP에서 환경 변수로 구성해야 합니다.
두 가지 옵션이 있습니다:
<Tabs>
<Tab title="웹 인터페이스">
1. [CrewAI AMP](https://app.crewai.com)에 로그인합니다
2. 자동화로 이동합니다
3. **Environment Variables** 탭을 엽니다
4. 각 변수 (`UV_INDEX_*_USERNAME` 및 `UV_INDEX_*_PASSWORD`)에 값을 추가합니다
자세한 내용은 [AMP에 배포하기 — 환경 변수 설정하기](/ko/enterprise/guides/deploy-to-amp#환경-변수-설정하기) 단계를 참조하세요.
</Tab>
<Tab title="CLI 배포">
`crewai deploy create`를 실행하기 전에 로컬 `.env` 파일에 변수를 추가합니다.
CLI가 이를 안전하게 플랫폼으로 전송합니다:
```bash
# .env
OPENAI_API_KEY=sk-...
UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat-here
```
```bash
crewai deploy create
```
</Tab>
</Tabs>
<Warning>
자격 증명을 저장소에 **절대** 커밋하지 마세요. 모든 비밀 정보에는 AMP 환경 변수를 사용하세요.
`.env` 파일은 `.gitignore`에 포함되어야 합니다.
</Warning>
기존 배포의 자격 증명을 업데이트하려면 [Crew 업데이트하기 — 환경 변수](/ko/enterprise/guides/update-crew)를 참조하세요.
## 전체 동작 흐름
CrewAI AMP가 자동화를 빌드할 때, 해결 흐름은 다음과 같이 작동합니다:
<Steps>
<Step title="빌드 시작">
AMP가 저장소를 가져오고 `pyproject.toml`과 `uv.lock`을 읽습니다.
</Step>
<Step title="UV가 의존성 해결">
UV가 `[tool.uv.sources]`를 읽어 각 패키지가 어떤 index에서 와야 하는지 결정합니다.
</Step>
<Step title="UV가 인증">
각 프라이빗 index에 대해 UV가 AMP에서 구성한 환경 변수에서
`UV_INDEX_{NAME}_USERNAME`과 `UV_INDEX_{NAME}_PASSWORD`를 조회합니다.
</Step>
<Step title="패키지 설치">
UV가 공개(PyPI) 및 프라이빗(레지스트리) 패키지를 모두 다운로드하고 설치합니다.
</Step>
<Step title="자동화 실행">
모든 의존성이 사용 가능한 상태에서 crew 또는 flow가 시작됩니다.
</Step>
</Steps>
## 문제 해결
### 빌드 중 인증 오류
**증상**: 프라이빗 패키지를 해결할 때 `401 Unauthorized` 또는 `403 Forbidden`으로 빌드가 실패합니다.
**확인사항**:
- `UV_INDEX_*` 환경 변수 이름이 index 이름과 정확히 일치하는지 확인합니다 (대문자, 하이픈 -> 언더스코어)
- 자격 증명이 로컬 `.env`뿐만 아니라 AMP 환경 변수에 설정되어 있는지 확인합니다
- 토큰/PAT에 패키지 피드에 필요한 읽기 권한이 있는지 확인합니다
- 토큰이 만료되지 않았는지 확인합니다 (특히 AWS CodeArtifact의 경우)
### 패키지를 찾을 수 없음
**증상**: `No matching distribution found for my-private-package`.
**확인사항**:
- `pyproject.toml`의 index URL이 `/simple/`로 끝나는지 확인합니다
- `[tool.uv.sources]` 항목이 올바른 패키지 이름을 올바른 index 이름에 매핑하는지 확인합니다
- 패키지가 실제로 프라이빗 레지스트리에 게시되어 있는지 확인합니다
- 동일한 자격 증명으로 로컬에서 `uv lock`을 실행하여 해결이 작동하는지 확인합니다
### Lock 파일 충돌
**증상**: 프라이빗 index를 추가한 후 `uv lock`이 실패하거나 예상치 못한 결과를 생성합니다.
**해결책**: 로컬에서 자격 증명을 설정하고 다시 생성합니다:
```bash
export UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
export UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat
uv lock
```
그런 다음 업데이트된 `uv.lock`을 커밋합니다.
## 관련 가이드
<CardGroup cols={3}>
<Card title="배포 준비하기" icon="clipboard-check" href="/ko/enterprise/guides/prepare-for-deployment">
배포 전에 프로젝트 구조와 의존성을 확인합니다.
</Card>
<Card title="AMP에 배포하기" icon="rocket" href="/ko/enterprise/guides/deploy-to-amp">
crew 또는 flow를 배포하고 환경 변수를 구성합니다.
</Card>
<Card title="Crew 업데이트하기" icon="arrows-rotate" href="/ko/enterprise/guides/update-crew">
환경 변수를 업데이트하고 실행 중인 배포에 변경 사항을 푸시합니다.
</Card>
</CardGroup>

View File

@@ -176,11 +176,6 @@ Você precisa enviar seu crew para um repositório do GitHub. Caso ainda não te
![Definir Variáveis de Ambiente](/images/enterprise/set-env-variables.png)
</Frame>
<Info>
Usando pacotes Python privados? Você também precisará adicionar suas credenciais de registro aqui.
Consulte [Registros de Pacotes Privados](/pt-BR/enterprise/guides/private-package-registry) para as variáveis necessárias.
</Info>
</Step>
<Step title="Implante Seu Crew">

View File

@@ -256,12 +256,6 @@ Antes da implantação, certifique-se de ter:
1. **Chaves de API de LLM** prontas (OpenAI, Anthropic, Google, etc.)
2. **Chaves de API de ferramentas** se estiver usando ferramentas externas (Serper, etc.)
<Info>
Se seu projeto depende de pacotes de um **registro PyPI privado**, você também precisará configurar
credenciais de autenticação do registro como variáveis de ambiente. Consulte o guia
[Registros de Pacotes Privados](/pt-BR/enterprise/guides/private-package-registry) para mais detalhes.
</Info>
<Tip>
Teste seu projeto localmente com as mesmas variáveis de ambiente antes de implantar
para detectar problemas de configuração antecipadamente.

View File

@@ -1,263 +0,0 @@
---
title: "Registros de Pacotes Privados"
description: "Instale pacotes Python privados de registros PyPI autenticados no CrewAI AMP"
icon: "lock"
mode: "wide"
---
<Note>
Este guia aborda como configurar seu projeto CrewAI para instalar pacotes Python
de registros PyPI privados (Azure DevOps Artifacts, GitHub Packages, GitLab, AWS CodeArtifact, etc.)
ao implantar no CrewAI AMP.
</Note>
## Quando Você Precisa Disso
Se seu projeto depende de pacotes Python internos ou proprietários hospedados em um registro privado
em vez do PyPI público, você precisará:
1. Informar ao UV **onde** encontrar o pacote (uma URL de index)
2. Informar ao UV **quais** pacotes vêm desse index (um mapeamento de source)
3. Fornecer **credenciais** para que o UV possa autenticar durante a instalação
O CrewAI AMP usa [UV](https://docs.astral.sh/uv/) para resolução e instalação de dependências.
O UV suporta registros privados autenticados por meio da configuração do `pyproject.toml` combinada
com variáveis de ambiente para credenciais.
## Passo 1: Configurar o pyproject.toml
Três elementos trabalham juntos no seu `pyproject.toml`:
### 1a. Declarar a dependência
Adicione o pacote privado ao seu `[project.dependencies]` como qualquer outra dependência:
```toml
[project]
dependencies = [
"crewai[tools]>=0.100.1,<1.0.0",
"my-private-package>=1.2.0",
]
```
### 1b. Definir o index
Registre seu registro privado como um index nomeado em `[[tool.uv.index]]`:
```toml
[[tool.uv.index]]
name = "my-private-registry"
url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
explicit = true
```
<Info>
O campo `name` é importante — o UV o utiliza para construir os nomes das variáveis de ambiente
para autenticação (veja o [Passo 2](#passo-2-configurar-credenciais-de-autenticação) abaixo).
Definir `explicit = true` significa que o UV não consultará esse index para todos os pacotes — apenas
os que você mapear explicitamente em `[tool.uv.sources]`. Isso evita consultas desnecessárias
ao seu registro privado e protege contra ataques de confusão de dependências.
</Info>
### 1c. Mapear o pacote para o index
Informe ao UV quais pacotes devem ser resolvidos a partir do seu index privado usando `[tool.uv.sources]`:
```toml
[tool.uv.sources]
my-private-package = { index = "my-private-registry" }
```
### Exemplo completo
```toml
[project]
name = "my-crew-project"
version = "0.1.0"
requires-python = ">=3.10,<=3.13"
dependencies = [
"crewai[tools]>=0.100.1,<1.0.0",
"my-private-package>=1.2.0",
]
[tool.crewai]
type = "crew"
[[tool.uv.index]]
name = "my-private-registry"
url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
explicit = true
[tool.uv.sources]
my-private-package = { index = "my-private-registry" }
```
Após atualizar o `pyproject.toml`, regenere seu arquivo lock:
```bash
uv lock
```
<Warning>
Sempre faça commit do `uv.lock` atualizado junto com as alterações no `pyproject.toml`.
O arquivo lock é obrigatório para implantação — veja [Preparar para Implantação](/pt-BR/enterprise/guides/prepare-for-deployment).
</Warning>
## Passo 2: Configurar Credenciais de Autenticação
O UV autentica em indexes privados usando variáveis de ambiente que seguem uma convenção de nomenclatura
baseada no nome do index que você definiu no `pyproject.toml`:
```
UV_INDEX_{UPPER_NAME}_USERNAME
UV_INDEX_{UPPER_NAME}_PASSWORD
```
Onde `{UPPER_NAME}` é o nome do seu index convertido para **maiúsculas** com **hifens substituídos por underscores**.
Por exemplo, um index chamado `my-private-registry` usa:
| Variável | Valor |
|----------|-------|
| `UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME` | Seu nome de usuário ou nome do token do registro |
| `UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD` | Sua senha ou token/PAT do registro |
<Warning>
Essas variáveis de ambiente **devem** ser adicionadas pelas configurações de **Variáveis de Ambiente** do CrewAI AMP —
globalmente ou no nível da implantação. Elas não podem ser definidas em arquivos `.env` ou codificadas no seu projeto.
Veja [Configurar Variáveis de Ambiente no AMP](#configurar-variáveis-de-ambiente-no-amp) abaixo.
</Warning>
## Referência de Provedores de Registro
A tabela abaixo mostra o formato da URL de index e os valores de credenciais para provedores de registro comuns.
Substitua os valores de exemplo pelos detalhes reais da sua organização e feed.
| Provedor | URL do Index | Usuário | Senha |
|----------|-------------|---------|-------|
| **Azure DevOps Artifacts** | `https://pkgs.dev.azure.com/{org}/_packaging/{feed}/pypi/simple/` | Qualquer string não vazia (ex: `token`) | Personal Access Token (PAT) com escopo Packaging Read |
| **GitHub Packages** | `https://pypi.pkg.github.com/{owner}/simple/` | Nome de usuário do GitHub | Personal Access Token (classic) com escopo `read:packages` |
| **GitLab Package Registry** | `https://gitlab.com/api/v4/projects/{project_id}/packages/pypi/simple/` | `__token__` | Project ou Personal Access Token com escopo `read_api` |
| **AWS CodeArtifact** | Use a URL de `aws codeartifact get-repository-endpoint` | `aws` | Token de `aws codeartifact get-authorization-token` |
| **Google Artifact Registry** | `https://{region}-python.pkg.dev/{project}/{repo}/simple/` | `_json_key_base64` | Chave de conta de serviço codificada em Base64 |
| **JFrog Artifactory** | `https://{instance}.jfrog.io/artifactory/api/pypi/{repo}/simple/` | Nome de usuário ou email | Chave API ou token de identidade |
| **Auto-hospedado (devpi, Nexus, etc.)** | URL da API simple do seu registro | Nome de usuário do registro | Senha do registro |
<Tip>
Para **AWS CodeArtifact**, o token de autorização expira periodicamente.
Você precisará atualizar o valor de `UV_INDEX_*_PASSWORD` quando ele expirar.
Considere automatizar isso no seu pipeline de CI/CD.
</Tip>
## Configurar Variáveis de Ambiente no AMP
As credenciais do registro privado devem ser configuradas como variáveis de ambiente no CrewAI AMP.
Você tem duas opções:
<Tabs>
<Tab title="Interface Web">
1. Faça login no [CrewAI AMP](https://app.crewai.com)
2. Navegue até sua automação
3. Abra a aba **Environment Variables**
4. Adicione cada variável (`UV_INDEX_*_USERNAME` e `UV_INDEX_*_PASSWORD`) com seu valor
Veja o passo [Deploy para AMP — Definir Variáveis de Ambiente](/pt-BR/enterprise/guides/deploy-to-amp#definir-as-variáveis-de-ambiente) para detalhes.
</Tab>
<Tab title="Implantação via CLI">
Adicione as variáveis ao seu arquivo `.env` local antes de executar `crewai deploy create`.
A CLI as transferirá com segurança para a plataforma:
```bash
# .env
OPENAI_API_KEY=sk-...
UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat-here
```
```bash
crewai deploy create
```
</Tab>
</Tabs>
<Warning>
**Nunca** faça commit de credenciais no seu repositório. Use variáveis de ambiente do AMP para todos os segredos.
O arquivo `.env` deve estar listado no `.gitignore`.
</Warning>
Para atualizar credenciais em uma implantação existente, veja [Atualizar Seu Crew — Variáveis de Ambiente](/pt-BR/enterprise/guides/update-crew).
## Como Tudo se Conecta
Quando o CrewAI AMP faz o build da sua automação, o fluxo de resolução funciona assim:
<Steps>
<Step title="Build inicia">
O AMP busca seu repositório e lê o `pyproject.toml` e o `uv.lock`.
</Step>
<Step title="UV resolve dependências">
O UV lê `[tool.uv.sources]` para determinar de qual index cada pacote deve vir.
</Step>
<Step title="UV autentica">
Para cada index privado, o UV busca `UV_INDEX_{NAME}_USERNAME` e `UV_INDEX_{NAME}_PASSWORD`
nas variáveis de ambiente que você configurou no AMP.
</Step>
<Step title="Pacotes são instalados">
O UV baixa e instala todos os pacotes — tanto públicos (do PyPI) quanto privados (do seu registro).
</Step>
<Step title="Automação executa">
Seu crew ou flow inicia com todas as dependências disponíveis.
</Step>
</Steps>
## Solução de Problemas
### Erros de Autenticação Durante o Build
**Sintoma**: Build falha com `401 Unauthorized` ou `403 Forbidden` ao resolver um pacote privado.
**Verifique**:
- Os nomes das variáveis de ambiente `UV_INDEX_*` correspondem exatamente ao nome do seu index (maiúsculas, hifens -> underscores)
- As credenciais estão definidas nas variáveis de ambiente do AMP, não apenas em um `.env` local
- Seu token/PAT tem as permissões de leitura necessárias para o feed de pacotes
- O token não expirou (especialmente relevante para AWS CodeArtifact)
### Pacote Não Encontrado
**Sintoma**: `No matching distribution found for my-private-package`.
**Verifique**:
- A URL do index no `pyproject.toml` termina com `/simple/`
- A entrada `[tool.uv.sources]` mapeia o nome correto do pacote para o nome correto do index
- O pacote está realmente publicado no seu registro privado
- Execute `uv lock` localmente com as mesmas credenciais para verificar se a resolução funciona
### Conflitos no Arquivo Lock
**Sintoma**: `uv lock` falha ou produz resultados inesperados após adicionar um index privado.
**Solução**: Defina as credenciais localmente e regenere:
```bash
export UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
export UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat
uv lock
```
Em seguida, faça commit do `uv.lock` atualizado.
## Guias Relacionados
<CardGroup cols={3}>
<Card title="Preparar para Implantação" icon="clipboard-check" href="/pt-BR/enterprise/guides/prepare-for-deployment">
Verifique a estrutura do projeto e as dependências antes de implantar.
</Card>
<Card title="Deploy para AMP" icon="rocket" href="/pt-BR/enterprise/guides/deploy-to-amp">
Implante seu crew ou flow e configure variáveis de ambiente.
</Card>
<Card title="Atualizar Seu Crew" icon="arrows-rotate" href="/pt-BR/enterprise/guides/update-crew">
Atualize variáveis de ambiente e envie alterações para uma implantação em execução.
</Card>
</CardGroup>

View File

@@ -98,11 +98,6 @@ from crewai_tools.tools.mongodb_vector_search_tool.vector_search import (
MongoDBVectorSearchTool,
)
from crewai_tools.tools.multion_tool.multion_tool import MultiOnTool
from crewai_tools.tools.oceanbase_vector_search_tool.oceanbase_vector_search_tool import (
OceanBaseToolSchema,
OceanBaseVectorSearchConfig,
OceanBaseVectorSearchTool,
)
from crewai_tools.tools.mysql_search_tool.mysql_search_tool import MySQLSearchTool
from crewai_tools.tools.nl2sql.nl2sql_tool import NL2SQLTool
from crewai_tools.tools.ocr_tool.ocr_tool import OCRTool
@@ -248,9 +243,6 @@ __all__ = [
"MongoDBVectorSearchTool",
"MultiOnTool",
"MySQLSearchTool",
"OceanBaseToolSchema",
"OceanBaseVectorSearchConfig",
"OceanBaseVectorSearchTool",
"NL2SQLTool",
"OCRTool",
"OxylabsAmazonProductScraperTool",

View File

@@ -87,11 +87,6 @@ from crewai_tools.tools.mongodb_vector_search_tool import (
MongoDBVectorSearchConfig,
MongoDBVectorSearchTool,
)
from crewai_tools.tools.oceanbase_vector_search_tool import (
OceanBaseToolSchema,
OceanBaseVectorSearchConfig,
OceanBaseVectorSearchTool,
)
from crewai_tools.tools.multion_tool.multion_tool import MultiOnTool
from crewai_tools.tools.mysql_search_tool.mysql_search_tool import MySQLSearchTool
from crewai_tools.tools.nl2sql.nl2sql_tool import NL2SQLTool
@@ -231,9 +226,6 @@ __all__ = [
"MongoDBVectorSearchConfig",
"MongoDBVectorSearchTool",
"MultiOnTool",
"OceanBaseToolSchema",
"OceanBaseVectorSearchConfig",
"OceanBaseVectorSearchTool",
"MySQLSearchTool",
"NL2SQLTool",
"OCRTool",

View File

@@ -1,144 +0,0 @@
# OceanBaseVectorSearchTool
## Description
This tool is designed for performing vector similarity searches within an OceanBase database. OceanBase is a distributed relational database developed by Ant Group that supports native vector indexing and search capabilities using HNSW (Hierarchical Navigable Small World) algorithm.
Use this tool to find semantically similar documents to a given query by leveraging OceanBase's vector search functionality.
For more information about OceanBase vector capabilities, see:
https://en.oceanbase.com/docs/common-oceanbase-database-10000000001976351
## Installation
Install the crewai_tools package with OceanBase support by executing the following command in your terminal:
```shell
pip install crewai-tools[oceanbase]
```
or
```shell
uv add crewai-tools --extra oceanbase
```
## Example
### Basic Usage
```python
from crewai_tools import OceanBaseVectorSearchTool
tool = OceanBaseVectorSearchTool(
connection_uri="127.0.0.1:2881",
user="root@test",
password="",
db_name="test",
table_name="documents",
)
```
### With Custom Configuration
```python
from crewai_tools import OceanBaseVectorSearchConfig, OceanBaseVectorSearchTool
query_config = OceanBaseVectorSearchConfig(
limit=10,
distance_func="cosine",
distance_threshold=0.5,
)
tool = OceanBaseVectorSearchTool(
connection_uri="127.0.0.1:2881",
user="root@test",
password="your_password",
db_name="my_database",
table_name="my_documents",
vector_column_name="embedding",
text_column_name="content",
metadata_column_name="metadata",
query_config=query_config,
embedding_model="text-embedding-3-large",
dimensions=3072,
)
```
### Adding the Tool to an Agent
```python
from crewai import Agent
from crewai_tools import OceanBaseVectorSearchTool
tool = OceanBaseVectorSearchTool(
connection_uri="127.0.0.1:2881",
user="root@test",
db_name="test",
table_name="documents",
)
rag_agent = Agent(
name="rag_agent",
role="You are a helpful assistant that can answer questions using the OceanBaseVectorSearchTool.",
goal="Answer user questions by searching relevant documents",
backstory="You have access to a knowledge base stored in OceanBase",
llm="gpt-4o-mini",
tools=[tool],
)
```
### Preloading Documents
```python
from crewai_tools import OceanBaseVectorSearchTool
import os
tool = OceanBaseVectorSearchTool(
connection_uri="127.0.0.1:2881",
user="root@test",
db_name="test",
table_name="documents",
)
texts = []
metadatas = []
for filename in os.listdir("knowledge"):
with open(os.path.join("knowledge", filename), "r") as f:
texts.append(f.read())
metadatas.append({"source": filename})
tool.add_texts(texts, metadatas=metadatas)
```
## Configuration Options
### OceanBaseVectorSearchConfig
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `limit` | int | 4 | Number of documents to return |
| `distance_func` | str | "l2" | Distance function: "l2", "cosine", or "inner_product" |
| `distance_threshold` | float | None | Only return results with distance <= threshold |
| `include_embeddings` | bool | False | Whether to include embedding vectors in results |
### OceanBaseVectorSearchTool
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `connection_uri` | str | Yes | OceanBase connection URI (e.g., "127.0.0.1:2881") |
| `user` | str | Yes | Username for connection (e.g., "root@test") |
| `password` | str | No | Password for connection |
| `db_name` | str | No | Database name (default: "test") |
| `table_name` | str | Yes | Table containing vector data |
| `vector_column_name` | str | No | Column with embeddings (default: "embedding") |
| `text_column_name` | str | No | Column with text content (default: "text") |
| `metadata_column_name` | str | No | Column with metadata (default: "metadata") |
| `embedding_model` | str | No | OpenAI model for embeddings (default: "text-embedding-3-large") |
| `dimensions` | int | No | Embedding dimensions (default: 1536) |
| `query_config` | OceanBaseVectorSearchConfig | No | Search configuration |
## Environment Variables
- `OPENAI_API_KEY`: Required for generating embeddings
- `AZURE_OPENAI_ENDPOINT`: Optional, for Azure OpenAI support

View File

@@ -1,12 +0,0 @@
from crewai_tools.tools.oceanbase_vector_search_tool.oceanbase_vector_search_tool import (
OceanBaseToolSchema,
OceanBaseVectorSearchConfig,
OceanBaseVectorSearchTool,
)
__all__ = [
"OceanBaseToolSchema",
"OceanBaseVectorSearchConfig",
"OceanBaseVectorSearchTool",
]

View File

@@ -1,267 +0,0 @@
from __future__ import annotations
import json
from logging import getLogger
import os
from typing import Any
from crewai.tools import BaseTool, EnvVar
from pydantic import BaseModel, Field
try:
import pyobvector # noqa: F401
PYOBVECTOR_AVAILABLE = True
except ImportError:
PYOBVECTOR_AVAILABLE = False
logger = getLogger(__name__)
class OceanBaseToolSchema(BaseModel):
"""Input schema for OceanBase vector search tool."""
query: str = Field(
...,
description="The query to search for relevant information in the OceanBase database.",
)
class OceanBaseVectorSearchConfig(BaseModel):
"""Configuration for OceanBase vector search queries."""
limit: int = Field(
default=4,
description="Number of documents to return.",
)
distance_threshold: float | None = Field(
default=None,
description="Only return results where distance is less than or equal to this threshold.",
)
distance_func: str = Field(
default="l2",
description="Distance function to use for similarity search. Options: 'l2', 'cosine', 'inner_product'.",
)
include_embeddings: bool = Field(
default=False,
description="Whether to include the embedding vector of each result.",
)
class OceanBaseVectorSearchTool(BaseTool):
"""Tool to perform vector search on OceanBase database."""
name: str = "OceanBaseVectorSearchTool"
description: str = (
"A tool to perform vector similarity search on an OceanBase database "
"for retrieving relevant information from stored documents."
)
args_schema: type[BaseModel] = OceanBaseToolSchema
query_config: OceanBaseVectorSearchConfig | None = Field(
default=None,
description="OceanBase vector search query configuration.",
)
embedding_model: str = Field(
default="text-embedding-3-large",
description="OpenAI embedding model to use for generating query embeddings.",
)
dimensions: int = Field(
default=1536,
description="Number of dimensions in the embedding vector.",
)
connection_uri: str = Field(
...,
description="Connection URI for OceanBase (e.g., '127.0.0.1:2881').",
)
user: str = Field(
...,
description="Username for OceanBase connection (e.g., 'root@test').",
)
password: str = Field(
default="",
description="Password for OceanBase connection.",
)
db_name: str = Field(
default="test",
description="Database name in OceanBase.",
)
table_name: str = Field(
...,
description="Name of the table containing vector data.",
)
vector_column_name: str = Field(
default="embedding",
description="Name of the column containing vector embeddings.",
)
text_column_name: str = Field(
default="text",
description="Name of the column containing text content.",
)
metadata_column_name: str | None = Field(
default="metadata",
description="Name of the column containing metadata (optional).",
)
env_vars: list[EnvVar] = Field(
default_factory=lambda: [
EnvVar(
name="OPENAI_API_KEY",
description="API key for OpenAI embeddings",
required=True,
),
]
)
package_dependencies: list[str] = Field(default_factory=lambda: ["pyobvector"])
_client: Any = None
_openai_client: Any = None
def __init__(self, **kwargs: Any) -> None:
super().__init__(**kwargs)
if not PYOBVECTOR_AVAILABLE:
import click
if click.confirm(
"You are missing the 'pyobvector' package. Would you like to install it?"
):
import subprocess
subprocess.run(["uv", "add", "pyobvector"], check=True) # noqa: S607
else:
raise ImportError(
"The 'pyobvector' package is required for OceanBaseVectorSearchTool."
)
if "AZURE_OPENAI_ENDPOINT" in os.environ:
from openai import AzureOpenAI
self._openai_client = AzureOpenAI()
elif "OPENAI_API_KEY" in os.environ:
from openai import Client
self._openai_client = Client()
else:
raise ValueError(
"OPENAI_API_KEY environment variable is required for OceanBaseVectorSearchTool."
)
from pyobvector import ObVecClient
self._client = ObVecClient(
uri=self.connection_uri,
user=self.user,
password=self.password,
db_name=self.db_name,
)
def _embed_text(self, text: str) -> list[float]:
"""Generate embedding for the given text using OpenAI."""
response = self._openai_client.embeddings.create(
input=[text],
model=self.embedding_model,
dimensions=self.dimensions,
)
return response.data[0].embedding
def _get_distance_func(self) -> Any:
"""Get the appropriate distance function from pyobvector."""
import pyobvector
config = self.query_config or OceanBaseVectorSearchConfig()
valid_distance_funcs = {
"l2": "l2_distance",
"cosine": "cosine_distance",
"inner_product": "inner_product",
}
func_name = valid_distance_funcs.get(config.distance_func, "l2_distance")
return getattr(pyobvector, func_name)
def _run(self, query: str) -> str:
"""Execute vector search on OceanBase."""
try:
config = self.query_config or OceanBaseVectorSearchConfig()
query_vector = self._embed_text(query)
output_columns = [self.text_column_name]
if self.metadata_column_name:
output_columns.append(self.metadata_column_name)
results = self._client.ann_search(
table_name=self.table_name,
vec_data=query_vector,
vec_column_name=self.vector_column_name,
distance_func=self._get_distance_func(),
with_dist=True,
topk=config.limit,
output_column_names=output_columns,
distance_threshold=config.distance_threshold,
)
formatted_results = []
for row in results:
result_dict: dict[str, Any] = {}
if len(row) >= 1:
result_dict["text"] = row[0]
if self.metadata_column_name and len(row) >= 2:
result_dict["metadata"] = row[1]
if len(row) > len(output_columns):
result_dict["distance"] = row[-1]
formatted_results.append(result_dict)
return json.dumps(formatted_results, indent=2, default=str)
except Exception as e:
logger.error(f"Error during OceanBase vector search: {e}")
return json.dumps({"error": str(e)})
def add_texts(
self,
texts: list[str],
metadatas: list[dict[str, Any]] | None = None,
ids: list[str] | None = None,
) -> list[str]:
"""Add texts with embeddings to the OceanBase table.
Args:
texts: List of text strings to add.
metadatas: Optional list of metadata dictionaries for each text.
ids: Optional list of unique IDs for each text.
Returns:
List of IDs for the added texts.
"""
import uuid
if ids is None:
ids = [str(uuid.uuid4()) for _ in texts]
if metadatas is None:
metadatas = [{} for _ in texts]
data = []
for text, metadata, doc_id in zip(texts, metadatas, ids, strict=False):
embedding = self._embed_text(text)
row = {
"id": doc_id,
self.text_column_name: text,
self.vector_column_name: embedding,
}
if self.metadata_column_name:
row[self.metadata_column_name] = metadata
data.append(row)
self._client.insert(self.table_name, data=data)
return ids
def __del__(self) -> None:
"""Cleanup clients on deletion."""
try:
if hasattr(self, "_openai_client") and self._openai_client:
self._openai_client.close()
except Exception as e:
logger.error(f"Error closing OpenAI client: {e}")

View File

@@ -1,208 +0,0 @@
import json
import sys
from unittest.mock import MagicMock, patch
import pytest
from crewai_tools import OceanBaseVectorSearchConfig
mock_pyobvector = MagicMock()
mock_pyobvector.ObVecClient = MagicMock()
mock_pyobvector.l2_distance = MagicMock(return_value="l2_func")
mock_pyobvector.cosine_distance = MagicMock(return_value="cosine_func")
mock_pyobvector.inner_product = MagicMock(return_value="ip_func")
sys.modules["pyobvector"] = mock_pyobvector
@pytest.fixture
def mock_openai_client():
"""Create a mock OpenAI client."""
mock_client = MagicMock()
mock_embedding = MagicMock()
mock_embedding.embedding = [0.1] * 1536
mock_response = MagicMock()
mock_response.data = [mock_embedding]
mock_client.embeddings.create.return_value = mock_response
return mock_client
@pytest.fixture
def mock_obvec_client():
"""Create a mock OceanBase vector client."""
mock_client = MagicMock()
return mock_client
@pytest.fixture
def oceanbase_vector_search_tool(mock_openai_client, mock_obvec_client):
"""Create an OceanBaseVectorSearchTool with mocked clients."""
from crewai_tools import OceanBaseVectorSearchTool
with patch.dict("os.environ", {"OPENAI_API_KEY": "test-key"}):
with patch(
"crewai_tools.tools.oceanbase_vector_search_tool.oceanbase_vector_search_tool.PYOBVECTOR_AVAILABLE",
True,
):
mock_pyobvector.ObVecClient.return_value = mock_obvec_client
with patch("openai.Client") as mock_openai_class:
mock_openai_class.return_value = mock_openai_client
tool = OceanBaseVectorSearchTool(
connection_uri="127.0.0.1:2881",
user="root@test",
password="",
db_name="test",
table_name="test_table",
)
tool._openai_client = mock_openai_client
tool._client = mock_obvec_client
yield tool
def test_successful_query_execution(oceanbase_vector_search_tool, mock_obvec_client):
"""Test successful vector search query execution."""
mock_obvec_client.ann_search.return_value = [
("test document content", {"source": "test.txt"}, 0.1),
("another document", {"source": "test2.txt"}, 0.2),
]
results = json.loads(oceanbase_vector_search_tool._run(query="test query"))
assert len(results) == 2
assert results[0]["text"] == "test document content"
assert results[0]["metadata"] == {"source": "test.txt"}
assert results[0]["distance"] == 0.1
def test_query_with_custom_config(mock_openai_client, mock_obvec_client):
"""Test vector search with custom configuration."""
from crewai_tools import OceanBaseVectorSearchTool
query_config = OceanBaseVectorSearchConfig(
limit=10,
distance_func="cosine",
distance_threshold=0.5,
)
with patch.dict("os.environ", {"OPENAI_API_KEY": "test-key"}):
with patch(
"crewai_tools.tools.oceanbase_vector_search_tool.oceanbase_vector_search_tool.PYOBVECTOR_AVAILABLE",
True,
):
mock_pyobvector.ObVecClient.return_value = mock_obvec_client
with patch("openai.Client") as mock_openai_class:
mock_openai_class.return_value = mock_openai_client
tool = OceanBaseVectorSearchTool(
connection_uri="127.0.0.1:2881",
user="root@test",
db_name="test",
table_name="test_table",
query_config=query_config,
)
tool._openai_client = mock_openai_client
tool._client = mock_obvec_client
mock_obvec_client.ann_search.return_value = [("doc", {}, 0.3)]
tool._run(query="test")
call_kwargs = mock_obvec_client.ann_search.call_args.kwargs
assert call_kwargs["topk"] == 10
assert call_kwargs["distance_threshold"] == 0.5
def test_add_texts(oceanbase_vector_search_tool, mock_obvec_client):
"""Test adding texts to the OceanBase table."""
texts = ["document 1", "document 2"]
metadatas = [{"source": "file1.txt"}, {"source": "file2.txt"}]
result_ids = oceanbase_vector_search_tool.add_texts(texts, metadatas=metadatas)
assert len(result_ids) == 2
mock_obvec_client.insert.assert_called_once()
call_args = mock_obvec_client.insert.call_args
assert call_args[0][0] == "test_table"
assert len(call_args[1]["data"]) == 2
def test_add_texts_without_metadata(oceanbase_vector_search_tool, mock_obvec_client):
"""Test adding texts without metadata."""
texts = ["document 1", "document 2"]
result_ids = oceanbase_vector_search_tool.add_texts(texts)
assert len(result_ids) == 2
mock_obvec_client.insert.assert_called_once()
def test_error_handling(oceanbase_vector_search_tool, mock_obvec_client):
"""Test error handling during search."""
mock_obvec_client.ann_search.side_effect = Exception("Database connection error")
result = json.loads(oceanbase_vector_search_tool._run(query="test"))
assert "error" in result
assert "Database connection error" in result["error"]
def test_config_defaults():
"""Test OceanBaseVectorSearchConfig default values."""
config = OceanBaseVectorSearchConfig()
assert config.limit == 4
assert config.distance_func == "l2"
assert config.distance_threshold is None
assert config.include_embeddings is False
def test_config_custom_values():
"""Test OceanBaseVectorSearchConfig with custom values."""
config = OceanBaseVectorSearchConfig(
limit=20,
distance_func="cosine",
distance_threshold=0.8,
include_embeddings=True,
)
assert config.limit == 20
assert config.distance_func == "cosine"
assert config.distance_threshold == 0.8
assert config.include_embeddings is True
def test_tool_schema():
"""Test OceanBaseToolSchema validation."""
from crewai_tools import OceanBaseToolSchema
schema = OceanBaseToolSchema(query="test query")
assert schema.query == "test query"
def test_tool_schema_requires_query():
"""Test that OceanBaseToolSchema requires a query."""
from crewai_tools import OceanBaseToolSchema
from pydantic import ValidationError
with pytest.raises(ValidationError):
OceanBaseToolSchema()
def test_distance_function_selection(oceanbase_vector_search_tool):
"""Test that the correct distance function is selected."""
oceanbase_vector_search_tool.query_config = OceanBaseVectorSearchConfig(
distance_func="l2"
)
func = oceanbase_vector_search_tool._get_distance_func()
assert func == mock_pyobvector.l2_distance
oceanbase_vector_search_tool.query_config = OceanBaseVectorSearchConfig(
distance_func="cosine"
)
func = oceanbase_vector_search_tool._get_distance_func()
assert func == mock_pyobvector.cosine_distance
oceanbase_vector_search_tool.query_config = OceanBaseVectorSearchConfig(
distance_func="inner_product"
)
func = oceanbase_vector_search_tool._get_distance_func()
assert func == mock_pyobvector.inner_product

View File

@@ -4,6 +4,7 @@ import urllib.request
import warnings
from crewai.agent.core import Agent
from crewai.agent.planning_config import PlanningConfig
from crewai.crew import Crew
from crewai.crews.crew_output import CrewOutput
from crewai.flow.flow import Flow
@@ -82,6 +83,7 @@ __all__ = [
"Knowledge",
"LLMGuardrail",
"Memory",
"PlanningConfig",
"Process",
"Task",
"TaskOutput",

View File

@@ -24,6 +24,7 @@ from pydantic import (
)
from typing_extensions import Self
from crewai.agent.planning_config import PlanningConfig
from crewai.agent.utils import (
ahandle_knowledge_retrieval,
apply_training_data,
@@ -211,13 +212,23 @@ class Agent(BaseAgent):
default="safe",
description="Mode for code execution: 'safe' (using Docker) or 'unsafe' (direct execution).",
)
reasoning: bool = Field(
planning_config: PlanningConfig | None = Field(
default=None,
description="Configuration for agent planning before task execution.",
)
planning: bool = Field(
default=False,
description="Whether the agent should reflect and create a plan before executing a task.",
)
reasoning: bool = Field(
default=False,
description="[DEPRECATED: Use planning_config instead] Whether the agent should reflect and create a plan before executing a task.",
deprecated=True,
)
max_reasoning_attempts: int | None = Field(
default=None,
description="Maximum number of reasoning attempts before executing the task. If None, will try until ready.",
description="[DEPRECATED: Use planning_config.max_attempts instead] Maximum number of reasoning attempts before executing the task. If None, will try until ready.",
deprecated=True,
)
embedder: EmbedderConfig | None = Field(
default=None,
@@ -284,8 +295,26 @@ class Agent(BaseAgent):
if self.allow_code_execution:
self._validate_docker_installation()
# Handle backward compatibility: convert reasoning=True to planning_config
if self.reasoning and self.planning_config is None:
import warnings
warnings.warn(
"The 'reasoning' parameter is deprecated. Use 'planning_config=PlanningConfig()' instead.",
DeprecationWarning,
stacklevel=2,
)
self.planning_config = PlanningConfig(
max_attempts=self.max_reasoning_attempts,
)
return self
@property
def planning_enabled(self) -> bool:
"""Check if planning is enabled for this agent."""
return self.planning_config is not None or self.planning
def _setup_agent_executor(self) -> None:
if not self.cache_handler:
self.cache_handler = CacheHandler()
@@ -354,7 +383,11 @@ class Agent(BaseAgent):
ValueError: If the max execution time is not a positive integer.
RuntimeError: If the agent execution fails for other reasons.
"""
handle_reasoning(self, task)
# Only call handle_reasoning for legacy CrewAgentExecutor
# For AgentExecutor, planning is handled in AgentExecutor.generate_plan()
if self.executor_class is not AgentExecutor:
handle_reasoning(self, task)
self._inject_date_to_task(task)
if self.tools_handler:
@@ -592,7 +625,10 @@ class Agent(BaseAgent):
ValueError: If the max execution time is not a positive integer.
RuntimeError: If the agent execution fails for other reasons.
"""
handle_reasoning(self, task)
if self.executor_class is not AgentExecutor:
handle_reasoning(
self, task
) # we need this till CrewAgentExecutor migrates to AgentExecutor
self._inject_date_to_task(task)
if self.tools_handler:
@@ -1712,7 +1748,8 @@ class Agent(BaseAgent):
existing_names = {sanitize_tool_name(t.name) for t in raw_tools}
raw_tools.extend(
mt for mt in create_memory_tools(agent_memory)
mt
for mt in create_memory_tools(agent_memory)
if sanitize_tool_name(mt.name) not in existing_names
)
@@ -1937,94 +1974,111 @@ class Agent(BaseAgent):
if isinstance(messages, str):
input_str = messages
else:
input_str = "\n".join(
str(msg.get("content", "")) for msg in messages if msg.get("content")
) or "User request"
raw = (
f"Input: {input_str}\n"
f"Agent: {self.role}\n"
f"Result: {output_text}"
)
input_str = (
"\n".join(
str(msg.get("content", ""))
for msg in messages
if msg.get("content")
)
or "User request"
)
raw = f"Input: {input_str}\nAgent: {self.role}\nResult: {output_text}"
extracted = agent_memory.extract_memories(raw)
if extracted:
agent_memory.remember_many(extracted)
except Exception as e:
self._logger.log("error", f"Failed to save kickoff result to memory: {e}")
def _build_output_from_result(
self,
result: dict[str, Any],
executor: AgentExecutor,
response_format: type[Any] | None = None,
) -> LiteAgentOutput:
"""Build a LiteAgentOutput from an executor result dict.
Shared logic used by both sync and async execution paths.
Args:
result: The result dictionary from executor.invoke / invoke_async.
executor: The executor instance.
response_format: Optional response format.
Returns:
LiteAgentOutput with raw output, formatted result, and metrics.
"""
import json
output = result.get("output", "")
# Handle response format conversion
formatted_result: BaseModel | None = None
raw_output: str
if isinstance(output, BaseModel):
formatted_result = output
raw_output = output.model_dump_json()
elif response_format:
raw_output = str(output) if not isinstance(output, str) else output
try:
model_schema = generate_model_description(response_format)
schema = json.dumps(model_schema, indent=2)
instructions = self.i18n.slice("formatted_task_instructions").format(
output_format=schema
)
converter = Converter(
llm=self.llm,
text=raw_output,
model=response_format,
instructions=instructions,
)
conversion_result = converter.to_pydantic()
if isinstance(conversion_result, BaseModel):
formatted_result = conversion_result
except ConverterError:
pass # Keep raw output if conversion fails
else:
raw_output = str(output) if not isinstance(output, str) else output
# Get token usage metrics
if isinstance(self.llm, BaseLLM):
usage_metrics = self.llm.get_token_usage_summary()
else:
usage_metrics = self._token_process.get_summary()
raw_str = (
raw_output
if isinstance(raw_output, str)
else raw_output.model_dump_json()
if isinstance(raw_output, BaseModel)
else str(raw_output)
)
todo_results = LiteAgentOutput.from_todo_items(executor.state.todos.items)
return LiteAgentOutput(
raw=raw_str,
pydantic=formatted_result,
agent_role=self.role,
usage_metrics=usage_metrics.model_dump() if usage_metrics else None,
messages=list(executor.state.messages),
plan=executor.state.plan,
todos=todo_results,
replan_count=executor.state.replan_count,
last_replan_reason=executor.state.last_replan_reason,
)
def _execute_and_build_output(
self,
executor: AgentExecutor,
inputs: dict[str, str],
response_format: type[Any] | None = None,
) -> LiteAgentOutput:
"""Execute the agent and build the output object.
Args:
executor: The executor instance.
inputs: Input dictionary for execution.
response_format: Optional response format.
Returns:
LiteAgentOutput with raw output, formatted result, and metrics.
"""
import json
# Execute the agent (this is called from sync path, so invoke returns dict)
"""Execute the agent synchronously and build the output object."""
result = cast(dict[str, Any], executor.invoke(inputs))
output = result.get("output", "")
# Handle response format conversion
formatted_result: BaseModel | None = None
raw_output: str
if isinstance(output, BaseModel):
formatted_result = output
raw_output = output.model_dump_json()
elif response_format:
raw_output = str(output) if not isinstance(output, str) else output
try:
model_schema = generate_model_description(response_format)
schema = json.dumps(model_schema, indent=2)
instructions = self.i18n.slice("formatted_task_instructions").format(
output_format=schema
)
converter = Converter(
llm=self.llm,
text=raw_output,
model=response_format,
instructions=instructions,
)
conversion_result = converter.to_pydantic()
if isinstance(conversion_result, BaseModel):
formatted_result = conversion_result
except ConverterError:
pass # Keep raw output if conversion fails
else:
raw_output = str(output) if not isinstance(output, str) else output
# Get token usage metrics
if isinstance(self.llm, BaseLLM):
usage_metrics = self.llm.get_token_usage_summary()
else:
usage_metrics = self._token_process.get_summary()
raw_str = (
raw_output
if isinstance(raw_output, str)
else raw_output.model_dump_json()
if isinstance(raw_output, BaseModel)
else str(raw_output)
)
return LiteAgentOutput(
raw=raw_str,
pydantic=formatted_result,
agent_role=self.role,
usage_metrics=usage_metrics.model_dump() if usage_metrics else None,
messages=executor.messages,
)
return self._build_output_from_result(result, executor, response_format)
async def _execute_and_build_output_async(
self,
@@ -2032,77 +2086,9 @@ class Agent(BaseAgent):
inputs: dict[str, str],
response_format: type[Any] | None = None,
) -> LiteAgentOutput:
"""Execute the agent asynchronously and build the output object.
This is the async version of _execute_and_build_output that uses
invoke_async() for native async execution within event loops.
Args:
executor: The executor instance.
inputs: Input dictionary for execution.
response_format: Optional response format.
Returns:
LiteAgentOutput with raw output, formatted result, and metrics.
"""
import json
# Execute the agent asynchronously
"""Execute the agent asynchronously and build the output object."""
result = await executor.invoke_async(inputs)
output = result.get("output", "")
# Handle response format conversion
formatted_result: BaseModel | None = None
raw_output: str
if isinstance(output, BaseModel):
formatted_result = output
raw_output = output.model_dump_json()
elif response_format:
raw_output = str(output) if not isinstance(output, str) else output
try:
model_schema = generate_model_description(response_format)
schema = json.dumps(model_schema, indent=2)
instructions = self.i18n.slice("formatted_task_instructions").format(
output_format=schema
)
converter = Converter(
llm=self.llm,
text=raw_output,
model=response_format,
instructions=instructions,
)
conversion_result = converter.to_pydantic()
if isinstance(conversion_result, BaseModel):
formatted_result = conversion_result
except ConverterError:
pass # Keep raw output if conversion fails
else:
raw_output = str(output) if not isinstance(output, str) else output
# Get token usage metrics
if isinstance(self.llm, BaseLLM):
usage_metrics = self.llm.get_token_usage_summary()
else:
usage_metrics = self._token_process.get_summary()
raw_str = (
raw_output
if isinstance(raw_output, str)
else raw_output.model_dump_json()
if isinstance(raw_output, BaseModel)
else str(raw_output)
)
return LiteAgentOutput(
raw=raw_str,
pydantic=formatted_result,
agent_role=self.role,
usage_metrics=usage_metrics.model_dump() if usage_metrics else None,
messages=executor.messages,
)
return self._build_output_from_result(result, executor, response_format)
def _process_kickoff_guardrail(
self,

View File

@@ -0,0 +1,115 @@
from __future__ import annotations
from typing import Any, Literal
from pydantic import BaseModel, Field
class PlanningConfig(BaseModel):
"""Configuration for agent planning/reasoning before task execution.
This allows users to customize the planning behavior including prompts,
iteration limits, the LLM used for planning, and the reasoning effort
level that controls post-step observation and replanning behavior.
Note: To disable planning, don't pass a planning_config or set planning=False
on the Agent. The presence of a PlanningConfig enables planning.
Attributes:
reasoning_effort: Controls observation and replanning after each step.
- "low": Observe each step (validates success), but skip the
decide/replan/refine pipeline. Steps are marked complete and
execution continues linearly. Fastest option.
- "medium": Observe each step. On failure, trigger replanning.
On success, skip refinement and continue. Balanced option.
- "high": Full observation pipeline — observe every step, then
route through decide_next_action which can trigger early goal
achievement, full replanning, or lightweight refinement.
Most adaptive but adds latency per step.
max_attempts: Maximum number of planning refinement attempts.
If None, will continue until the agent indicates readiness.
max_steps: Maximum number of steps in the generated plan.
system_prompt: Custom system prompt for planning. Uses default if None.
plan_prompt: Custom prompt for creating the initial plan.
refine_prompt: Custom prompt for refining the plan.
llm: LLM to use for planning. Uses agent's LLM if None.
Example:
```python
from crewai import Agent
from crewai.agent.planning_config import PlanningConfig
# Simple usage — fast, linear execution (default)
agent = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
planning_config=PlanningConfig(),
)
# Balanced — replan only when steps fail
agent = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
planning_config=PlanningConfig(
reasoning_effort="medium",
),
)
# Full adaptive planning with refinement and replanning
agent = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
planning_config=PlanningConfig(
reasoning_effort="high",
max_attempts=3,
max_steps=10,
plan_prompt="Create a focused plan for: {description}",
llm="gpt-4o-mini", # Use cheaper model for planning
),
)
```
"""
reasoning_effort: Literal["low", "medium", "high"] = Field(
default="low",
description=(
"Controls post-step observation and replanning behavior. "
"'low' observes steps but skips replanning/refinement (fastest). "
"'medium' observes and replans only on step failure (balanced). "
"'high' runs full observation pipeline with replanning, refinement, "
"and early goal detection (most adaptive, highest latency)."
),
)
max_attempts: int | None = Field(
default=None,
description=(
"Maximum number of planning refinement attempts. "
"If None, will continue until the agent indicates readiness."
),
)
max_steps: int = Field(
default=20,
description="Maximum number of steps in the generated plan.",
ge=1,
)
system_prompt: str | None = Field(
default=None,
description="Custom system prompt for planning. Uses default if None.",
)
plan_prompt: str | None = Field(
default=None,
description="Custom prompt for creating the initial plan.",
)
refine_prompt: str | None = Field(
default=None,
description="Custom prompt for refining the plan.",
)
llm: str | Any | None = Field(
default=None,
description="LLM to use for planning. Uses agent's LLM if None.",
)
model_config = {"arbitrary_types_allowed": True}

View File

@@ -28,13 +28,20 @@ if TYPE_CHECKING:
def handle_reasoning(agent: Agent, task: Task) -> None:
"""Handle the reasoning process for an agent before task execution.
"""Handle the reasoning/planning process for an agent before task execution.
This function checks if planning is enabled for the agent and, if so,
creates a plan that gets appended to the task description.
Note: This function is used by CrewAgentExecutor (legacy path).
For AgentExecutor, planning is handled in AgentExecutor.generate_plan().
Args:
agent: The agent performing the task.
task: The task to execute.
"""
if not agent.reasoning:
# Check if planning is enabled using the planning_enabled property
if not getattr(agent, "planning_enabled", False):
return
try:
@@ -43,13 +50,13 @@ def handle_reasoning(agent: Agent, task: Task) -> None:
AgentReasoningOutput,
)
reasoning_handler = AgentReasoning(task=task, agent=agent)
reasoning_output: AgentReasoningOutput = (
reasoning_handler.handle_agent_reasoning()
planning_handler = AgentReasoning(agent=agent, task=task)
planning_output: AgentReasoningOutput = (
planning_handler.handle_agent_reasoning()
)
task.description += f"\n\nReasoning Plan:\n{reasoning_output.plan.plan}"
task.description += f"\n\nPlanning:\n{planning_output.plan.plan}"
except Exception as e:
agent._logger.log("error", f"Error during reasoning process: {e!s}")
agent._logger.log("error", f"Error during planning: {e!s}")
def build_task_prompt_with_schema(task: Task, task_prompt: str, i18n: I18N) -> str:

View File

@@ -0,0 +1,309 @@
"""PlannerObserver: Observation phase after each step execution.
Implements the "Observe" phase. After every step execution, the Planner
analyzes what happened, what new information was learned, and whether the
remaining plan is still valid.
This is NOT an error detector — it runs on every step, including successes,
to incorporate runtime observations into the remaining plan.
Refinements are structured (StepRefinement objects) and applied directly
from the observation result — no second LLM call required.
"""
from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Any
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.observation_events import (
StepObservationCompletedEvent,
StepObservationFailedEvent,
StepObservationStartedEvent,
)
from crewai.utilities.i18n import I18N, get_i18n
from crewai.utilities.llm_utils import create_llm
from crewai.utilities.planning_types import StepObservation, TodoItem
from crewai.utilities.types import LLMMessage
if TYPE_CHECKING:
from crewai.agent import Agent
from crewai.task import Task
logger = logging.getLogger(__name__)
class PlannerObserver:
"""Observes step execution results and decides on plan continuation.
After EVERY step execution, this class:
1. Analyzes what the step accomplished
2. Identifies new information learned
3. Decides if the remaining plan is still valid
4. Suggests lightweight refinements or triggers full replanning
LLM resolution (magical fallback):
- If ``agent.planning_config.llm`` is explicitly set → use that
- Otherwise → fall back to ``agent.llm`` (same LLM for everything)
Args:
agent: The agent instance (for LLM resolution and config).
task: Optional task context (for description and expected output).
"""
def __init__(
self,
agent: Agent,
task: Task | None = None,
kickoff_input: str = "",
) -> None:
self.agent = agent
self.task = task
self.kickoff_input = kickoff_input
self.llm = self._resolve_llm()
self._i18n: I18N = get_i18n()
def _resolve_llm(self) -> Any:
"""Resolve which LLM to use for observation/planning.
Mirrors AgentReasoning._resolve_llm(): uses planning_config.llm
if explicitly set, otherwise falls back to agent.llm.
Returns:
The resolved LLM instance.
"""
from crewai.llm import LLM
config = getattr(self.agent, "planning_config", None)
if config is not None and config.llm is not None:
if isinstance(config.llm, LLM):
return config.llm
return create_llm(config.llm)
return self.agent.llm
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def observe(
self,
completed_step: TodoItem,
result: str,
all_completed: list[TodoItem],
remaining_todos: list[TodoItem],
) -> StepObservation:
"""Observe a step's result and decide on plan continuation.
This runs after EVERY step execution — not just failures.
Args:
completed_step: The todo item that was just executed.
result: The final result string from the step.
all_completed: All previously completed todos (for context).
remaining_todos: The pending todos still in the plan.
Returns:
StepObservation with the Planner's analysis. Any suggested
refinements are structured StepRefinement objects ready for
direct application — no second LLM call needed.
"""
agent_role = self.agent.role
crewai_event_bus.emit(
self.agent,
event=StepObservationStartedEvent(
agent_role=agent_role,
step_number=completed_step.step_number,
step_description=completed_step.description,
from_task=self.task,
from_agent=self.agent,
),
)
messages = self._build_observation_messages(
completed_step, result, all_completed, remaining_todos
)
try:
response = self.llm.call(
messages,
response_model=StepObservation,
from_task=self.task,
from_agent=self.agent,
)
if isinstance(response, StepObservation):
observation = response
else:
observation = StepObservation(
step_completed_successfully=True,
key_information_learned=str(response) if response else "",
remaining_plan_still_valid=True,
)
refinement_summaries = (
[
f"Step {r.step_number}: {r.new_description}"
for r in observation.suggested_refinements
]
if observation.suggested_refinements
else None
)
crewai_event_bus.emit(
self.agent,
event=StepObservationCompletedEvent(
agent_role=agent_role,
step_number=completed_step.step_number,
step_description=completed_step.description,
step_completed_successfully=observation.step_completed_successfully,
key_information_learned=observation.key_information_learned,
remaining_plan_still_valid=observation.remaining_plan_still_valid,
needs_full_replan=observation.needs_full_replan,
replan_reason=observation.replan_reason,
goal_already_achieved=observation.goal_already_achieved,
suggested_refinements=refinement_summaries,
from_task=self.task,
from_agent=self.agent,
),
)
return observation
except Exception as e:
logger.warning(
f"Observation LLM call failed: {e}. Defaulting to conservative replan."
)
crewai_event_bus.emit(
self.agent,
event=StepObservationFailedEvent(
agent_role=agent_role,
step_number=completed_step.step_number,
step_description=completed_step.description,
error=str(e),
from_task=self.task,
from_agent=self.agent,
),
)
# Don't force a full replan — the step may have succeeded even if the
# observer LLM failed to parse the result. Defaulting to "continue" is
# far less disruptive than wiping the entire plan on every observer error.
return StepObservation(
step_completed_successfully=True,
key_information_learned="",
remaining_plan_still_valid=True,
needs_full_replan=False,
)
def _extract_task_section(self, text: str) -> str:
"""Extract the ## Task body from a structured enriched instruction.
Falls back to the full text (capped at 2000 chars) for plain inputs.
"""
for marker in ("\n## Task\n", "\n## Task:", "## Task\n"):
idx = text.find(marker)
if idx >= 0:
start = idx + len(marker)
for end_marker in ("\n---\n", "\n## "):
end = text.find(end_marker, start)
if end > 0:
return text[start:end].strip()
return text[start : start + 2000].strip()
return text[:2000] if len(text) > 2000 else text
def apply_refinements(
self,
observation: StepObservation,
remaining_todos: list[TodoItem],
) -> list[TodoItem]:
"""Apply structured refinements from the observation directly to todo descriptions.
No LLM call needed — refinements are already structured StepRefinement
objects produced by the observation call. This is a pure in-memory update.
Args:
observation: The observation containing structured refinements.
remaining_todos: The pending todos to update in-place.
Returns:
The same todo list with updated descriptions where refinements applied.
"""
if not observation.suggested_refinements:
return remaining_todos
todo_by_step: dict[int, TodoItem] = {t.step_number: t for t in remaining_todos}
for refinement in observation.suggested_refinements:
if refinement.step_number in todo_by_step and refinement.new_description:
todo_by_step[refinement.step_number].description = refinement.new_description
return remaining_todos
# ------------------------------------------------------------------
# Internal: Message building
# ------------------------------------------------------------------
def _build_observation_messages(
self,
completed_step: TodoItem,
result: str,
all_completed: list[TodoItem],
remaining_todos: list[TodoItem],
) -> list[LLMMessage]:
"""Build messages for the observation LLM call."""
task_desc = ""
task_goal = ""
if self.task:
task_desc = self.task.description or ""
task_goal = self.task.expected_output or ""
elif self.kickoff_input:
# Standalone kickoff path — no Task object, but we have the raw input.
# Extract just the ## Task section so the observer sees the actual goal,
# not the full enriched instruction with env/tools/verification noise.
task_desc = self._extract_task_section(self.kickoff_input)
task_goal = "Complete the task successfully"
system_prompt = self._i18n.retrieve("planning", "observation_system_prompt")
# Build context of what's been done
completed_summary = ""
if all_completed:
completed_lines = []
for todo in all_completed:
result_preview = (todo.result or "")[:200]
completed_lines.append(
f" Step {todo.step_number}: {todo.description}\n"
f" Result: {result_preview}"
)
completed_summary = "\n## Previously completed steps:\n" + "\n".join(
completed_lines
)
# Build remaining plan
remaining_summary = ""
if remaining_todos:
remaining_lines = [
f" Step {todo.step_number}: {todo.description}"
for todo in remaining_todos
]
remaining_summary = "\n## Remaining plan steps:\n" + "\n".join(
remaining_lines
)
user_prompt = self._i18n.retrieve("planning", "observation_user_prompt").format(
task_description=task_desc,
task_goal=task_goal,
completed_summary=completed_summary,
step_number=completed_step.step_number,
step_description=completed_step.description,
step_result=result,
remaining_summary=remaining_summary,
)
return [
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt},
]

View File

@@ -0,0 +1,608 @@
"""StepExecutor: Isolated executor for a single plan step.
Implements the direct-action execution pattern from Plan-and-Act
(arxiv 2503.09572): the Executor receives one step description,
makes a single LLM call, executes any tool call returned, and
returns the result immediately.
There is no inner loop. Recovery from failure (retry, replan) is
the responsibility of PlannerObserver and AgentExecutor — keeping
this class single-purpose and fast.
"""
from __future__ import annotations
from collections.abc import Callable
from datetime import datetime
import json
import time
from typing import TYPE_CHECKING, Any
from pydantic import BaseModel
from crewai.agents.parser import AgentAction, AgentFinish
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.tool_usage_events import (
ToolUsageErrorEvent,
ToolUsageFinishedEvent,
ToolUsageStartedEvent,
)
from crewai.utilities.agent_utils import (
build_tool_calls_assistant_message,
check_native_tool_support,
enforce_rpm_limit,
execute_single_native_tool_call,
format_message_for_llm,
is_tool_call_list,
process_llm_response,
setup_native_tools,
)
from crewai.utilities.i18n import I18N, get_i18n
from crewai.utilities.planning_types import TodoItem
from crewai.utilities.printer import Printer
from crewai.utilities.step_execution_context import StepExecutionContext, StepResult
from crewai.utilities.string_utils import sanitize_tool_name
from crewai.utilities.tool_utils import execute_tool_and_check_finality
from crewai.utilities.types import LLMMessage
if TYPE_CHECKING:
from crewai.agent import Agent
from crewai.agents.tools_handler import ToolsHandler
from crewai.crew import Crew
from crewai.llms.base_llm import BaseLLM
from crewai.task import Task
from crewai.tools.base_tool import BaseTool
from crewai.tools.structured_tool import CrewStructuredTool
class StepExecutor:
"""Executes a SINGLE todo item using direct-action execution.
The StepExecutor owns its own message list per invocation. It never reads
or writes the AgentExecutor's state. Results flow back via StepResult.
Execution pattern (per Plan-and-Act, arxiv 2503.09572):
1. Build messages from todo + context
2. Call LLM once (with or without native tools)
3. If tool call → execute it → return tool result
4. If text answer → return it directly
No inner loop — recovery is PlannerObserver's responsibility.
Args:
llm: The language model to use for execution.
tools: Structured tools available to the executor.
agent: The agent instance (for role/goal/verbose/config).
original_tools: Original BaseTool instances (needed for native tool schema).
tools_handler: Optional tools handler for caching and delegation tracking.
task: Optional task context.
crew: Optional crew context.
function_calling_llm: Optional separate LLM for function calling.
request_within_rpm_limit: Optional RPM limit function.
callbacks: Optional list of callbacks.
i18n: Optional i18n instance.
"""
def __init__(
self,
llm: BaseLLM,
tools: list[CrewStructuredTool],
agent: Agent,
original_tools: list[BaseTool] | None = None,
tools_handler: ToolsHandler | None = None,
task: Task | None = None,
crew: Crew | None = None,
function_calling_llm: BaseLLM | Any | None = None,
request_within_rpm_limit: Callable[[], bool] | None = None,
callbacks: list[Any] | None = None,
i18n: I18N | None = None,
) -> None:
self.llm = llm
self.tools = tools
self.agent = agent
self.original_tools = original_tools or []
self.tools_handler = tools_handler
self.task = task
self.crew = crew
self.function_calling_llm = function_calling_llm
self.request_within_rpm_limit = request_within_rpm_limit
self.callbacks = callbacks or []
self._i18n: I18N = i18n or get_i18n()
self._printer: Printer = Printer()
# Native tool support — set up once
self._use_native_tools = check_native_tool_support(self.llm, self.original_tools)
self._openai_tools: list[dict[str, Any]] = []
self._available_functions: dict[str, Callable[..., Any]] = {}
if self._use_native_tools and self.original_tools:
self._openai_tools, self._available_functions = setup_native_tools(
self.original_tools
)
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def execute(self, todo: TodoItem, context: StepExecutionContext) -> StepResult:
"""Execute a single todo item using a multi-turn action loop.
Enforces the RPM limit, builds a fresh message list, then iterates
LLM call → tool execution → observation until the LLM signals it is
done (text answer) or max_step_iterations is reached. Never touches
external AgentExecutor state.
Args:
todo: The todo item to execute.
context: Immutable context with task info and dependency results.
Returns:
StepResult with the outcome.
"""
start_time = time.monotonic()
tool_calls_made: list[str] = []
try:
enforce_rpm_limit(self.request_within_rpm_limit)
messages = self._build_isolated_messages(todo, context)
if self._use_native_tools:
result_text = self._execute_native(messages, tool_calls_made)
else:
result_text = self._execute_text_parsed(messages, tool_calls_made)
self._validate_expected_tool_usage(todo, tool_calls_made)
elapsed = time.monotonic() - start_time
return StepResult(
success=True,
result=result_text,
tool_calls_made=tool_calls_made,
execution_time=elapsed,
)
except Exception as e:
elapsed = time.monotonic() - start_time
return StepResult(
success=False,
result="",
error=str(e),
tool_calls_made=tool_calls_made,
execution_time=elapsed,
)
# ------------------------------------------------------------------
# Internal: Message building
# ------------------------------------------------------------------
def _build_isolated_messages(
self, todo: TodoItem, context: StepExecutionContext
) -> list[LLMMessage]:
"""Build a fresh message list for this step's execution.
System prompt tells the LLM it is an Executor focused on one step.
User prompt provides the step description, dependencies, and tools.
"""
system_prompt = self._build_system_prompt()
user_prompt = self._build_user_prompt(todo, context)
return [
format_message_for_llm(system_prompt, role="system"),
format_message_for_llm(user_prompt, role="user"),
]
def _build_system_prompt(self) -> str:
"""Build the Executor's system prompt."""
role = self.agent.role if self.agent else "Assistant"
goal = self.agent.goal if self.agent else "Complete tasks efficiently"
backstory = getattr(self.agent, "backstory", "") or ""
tools_section = ""
if self.tools and not self._use_native_tools:
tool_names = ", ".join(sanitize_tool_name(t.name) for t in self.tools)
tools_section = self._i18n.retrieve(
"planning", "step_executor_tools_section"
).format(tool_names=tool_names)
return self._i18n.retrieve("planning", "step_executor_system_prompt").format(
role=role,
backstory=backstory,
goal=goal,
tools_section=tools_section,
)
def _extract_task_section(self, task_description: str) -> str:
"""Extract the most relevant portion of the task description.
For structured descriptions (e.g. harbor_agent-style with ## Task
and ## Instructions sections), extracts just the task body so the
executor sees the requirements without duplicating tool/verification
instructions that are already in the system prompt.
For plain descriptions, returns the full text (up to 2000 chars).
"""
# Try to extract between "## Task" and the next "---" separator
# or next "##" heading — this isolates the task spec from env/tool noise.
for marker in ("\n## Task\n", "\n## Task:", "## Task\n"):
idx = task_description.find(marker)
if idx >= 0:
start = idx + len(marker)
# End at the first horizontal rule or next top-level ## section
for end_marker in ("\n---\n", "\n## "):
end = task_description.find(end_marker, start)
if end > 0:
return task_description[start:end].strip()
# No end marker — take up to 2000 chars
return task_description[start : start + 2000].strip()
# No structured format — use the full description, reasonably truncated
if len(task_description) > 2000:
return task_description[:2000] + "\n... [truncated]"
return task_description
def _build_user_prompt(self, todo: TodoItem, context: StepExecutionContext) -> str:
"""Build the user prompt for this specific step."""
parts: list[str] = []
# Include overall task context so the executor knows the full goal and
# required output format/location — critical for knowing WHAT to produce.
# We extract only the task body (not tool instructions or verification
# sections) to avoid duplicating directives already in the system prompt.
if context.task_description:
task_section = self._extract_task_section(context.task_description)
if task_section:
parts.append(
self._i18n.retrieve("planning", "step_executor_task_context").format(
task_context=task_section,
)
)
parts.append(
self._i18n.retrieve("planning", "step_executor_user_prompt").format(
step_description=todo.description,
)
)
if todo.tool_to_use:
parts.append(
self._i18n.retrieve("planning", "step_executor_suggested_tool").format(
tool_to_use=todo.tool_to_use,
)
)
# Include dependency results (final results only, no traces)
if context.dependency_results:
parts.append(
self._i18n.retrieve("planning", "step_executor_context_header")
)
for step_num, result in sorted(context.dependency_results.items()):
parts.append(
self._i18n.retrieve(
"planning", "step_executor_context_entry"
).format(step_number=step_num, result=result)
)
parts.append(self._i18n.retrieve("planning", "step_executor_complete_step"))
return "\n".join(parts)
# ------------------------------------------------------------------
# Internal: Multi-turn execution loop
# ------------------------------------------------------------------
def _execute_text_parsed(
self,
messages: list[LLMMessage],
tool_calls_made: list[str],
max_step_iterations: int = 15,
) -> str:
"""Execute step using text-parsed tool calling with a multi-turn loop.
Iterates LLM call → tool execution → observation until the LLM
produces a Final Answer or max_step_iterations is reached.
This allows the agent to: run a command, see the output, adjust its
approach, and run another command — all within a single plan step.
"""
use_stop_words = self.llm.supports_stop_words() if self.llm else False
last_tool_result = ""
for _ in range(max_step_iterations):
answer = self.llm.call(
messages,
callbacks=self.callbacks,
from_task=self.task,
from_agent=self.agent,
)
if not answer:
raise ValueError("Empty response from LLM")
answer_str = str(answer)
formatted = process_llm_response(answer_str, use_stop_words)
if isinstance(formatted, AgentFinish):
return str(formatted.output)
if isinstance(formatted, AgentAction):
tool_calls_made.append(formatted.tool)
tool_result = self._execute_text_tool_with_events(formatted)
last_tool_result = tool_result
# Append the assistant's reasoning + action, then the observation.
# _build_observation_message handles vision sentinels so the LLM
# receives an image content block instead of raw base64 text.
messages.append({"role": "assistant", "content": answer_str})
messages.append(self._build_observation_message(tool_result))
continue
# Raw text response with no Final Answer marker — treat as done
return answer_str
# Max iterations reached — return the last tool result we accumulated
return last_tool_result
def _execute_text_tool_with_events(self, formatted: AgentAction) -> str:
"""Execute text-parsed tool calls with tool usage events."""
args_dict = self._parse_tool_args(formatted.tool_input)
agent_key = getattr(self.agent, "key", "unknown") if self.agent else "unknown"
started_at = datetime.now()
crewai_event_bus.emit(
self,
event=ToolUsageStartedEvent(
tool_name=formatted.tool,
tool_args=args_dict,
from_agent=self.agent,
from_task=self.task,
agent_key=agent_key,
),
)
try:
fingerprint_context = {}
if (
self.agent
and hasattr(self.agent, "security_config")
and hasattr(self.agent.security_config, "fingerprint")
):
fingerprint_context = {
"agent_fingerprint": str(self.agent.security_config.fingerprint)
}
tool_result = execute_tool_and_check_finality(
agent_action=formatted,
fingerprint_context=fingerprint_context,
tools=self.tools,
i18n=self._i18n,
agent_key=self.agent.key if self.agent else None,
agent_role=self.agent.role if self.agent else None,
tools_handler=self.tools_handler,
task=self.task,
agent=self.agent,
function_calling_llm=self.function_calling_llm,
crew=self.crew,
)
except Exception as e:
crewai_event_bus.emit(
self,
event=ToolUsageErrorEvent(
tool_name=formatted.tool,
tool_args=args_dict,
from_agent=self.agent,
from_task=self.task,
agent_key=agent_key,
error=e,
),
)
raise
crewai_event_bus.emit(
self,
event=ToolUsageFinishedEvent(
output=str(tool_result.result),
tool_name=formatted.tool,
tool_args=args_dict,
from_agent=self.agent,
from_task=self.task,
agent_key=agent_key,
started_at=started_at,
finished_at=datetime.now(),
),
)
return str(tool_result.result)
def _parse_tool_args(self, tool_input: Any) -> dict[str, Any]:
"""Parse tool args from the parser output into a dict payload for events."""
if isinstance(tool_input, dict):
return tool_input
if isinstance(tool_input, str):
stripped_input = tool_input.strip()
if not stripped_input:
return {}
try:
parsed = json.loads(stripped_input)
if isinstance(parsed, dict):
return parsed
return {"input": parsed}
except json.JSONDecodeError:
return {"input": stripped_input}
return {"input": str(tool_input)}
# ------------------------------------------------------------------
# Internal: Vision support
# ------------------------------------------------------------------
@staticmethod
def _parse_vision_sentinel(raw: str) -> tuple[str, str] | None:
"""Parse a VISION_IMAGE sentinel into (media_type, base64_data), or None."""
_PREFIX = "VISION_IMAGE:"
if not raw.startswith(_PREFIX):
return None
rest = raw[len(_PREFIX):]
sep = rest.find(":")
if sep <= 0:
return None
return rest[:sep], rest[sep + 1:]
@staticmethod
def _build_observation_message(tool_result: str) -> LLMMessage:
"""Build an observation message, converting vision sentinels to image blocks.
When a tool returns a VISION_IMAGE sentinel (e.g. from read_image),
we build a multimodal content block so the LLM can actually *see*
the image rather than receiving a wall of base64 text.
Uses the standard image_url / data-URI format so each LLM provider's
SDK (OpenAI, LiteLLM, etc.) handles the provider-specific conversion.
Format: ``VISION_IMAGE:<media_type>:<base64_data>``
"""
parsed = StepExecutor._parse_vision_sentinel(tool_result)
if parsed:
media_type, b64_data = parsed
return {
"role": "user",
"content": [
{"type": "text", "text": "Observation: Here is the image:"},
{
"type": "image_url",
"image_url": {
"url": f"data:{media_type};base64,{b64_data}",
},
},
],
}
return {"role": "user", "content": f"Observation: {tool_result}"}
def _validate_expected_tool_usage(
self,
todo: TodoItem,
tool_calls_made: list[str],
) -> None:
"""Fail step execution when a required tool is configured but not called."""
expected_tool = getattr(todo, "tool_to_use", None)
if not expected_tool:
return
expected_tool_name = sanitize_tool_name(expected_tool)
available_tool_names = {
sanitize_tool_name(tool.name)
for tool in self.tools
if getattr(tool, "name", "")
} | set(self._available_functions.keys())
if expected_tool_name not in available_tool_names:
return
called_names = {sanitize_tool_name(name) for name in tool_calls_made}
if expected_tool_name not in called_names:
raise ValueError(
f"Expected tool '{expected_tool_name}' was not called "
f"for step {todo.step_number}."
)
def _execute_native(
self,
messages: list[LLMMessage],
tool_calls_made: list[str],
max_step_iterations: int = 15,
) -> str:
"""Execute step using native function calling with a multi-turn loop.
Iterates LLM call → tool execution → appended results until the LLM
returns a text answer (no more tool calls) or max_step_iterations is
reached. This lets the agent run a shell command, observe the output,
correct mistakes, and issue follow-up commands — all within one step.
"""
accumulated_results: list[str] = []
for _ in range(max_step_iterations):
answer = self.llm.call(
messages,
tools=self._openai_tools,
callbacks=self.callbacks,
from_task=self.task,
from_agent=self.agent,
)
if not answer:
raise ValueError("Empty response from LLM")
if isinstance(answer, BaseModel):
return answer.model_dump_json()
if isinstance(answer, list) and answer and is_tool_call_list(answer):
# _execute_native_tool_calls appends assistant + tool messages
# to `messages` as a side-effect, so the next LLM call will
# see the full conversation history including tool outputs.
result = self._execute_native_tool_calls(
answer, messages, tool_calls_made
)
accumulated_results.append(result)
continue
# Text answer → LLM decided the step is done
return str(answer)
# Max iterations reached — return everything we accumulated
return "\n".join(filter(None, accumulated_results))
def _execute_native_tool_calls(
self,
tool_calls: list[Any],
messages: list[LLMMessage],
tool_calls_made: list[str],
) -> str:
"""Execute a batch of native tool calls and return their results.
Returns the result of the first tool marked result_as_answer if any,
otherwise returns all tool results concatenated.
"""
assistant_message, _reports = build_tool_calls_assistant_message(tool_calls)
if assistant_message:
messages.append(assistant_message)
tool_results: list[str] = []
for tool_call in tool_calls:
call_result = execute_single_native_tool_call(
tool_call,
available_functions=self._available_functions,
original_tools=self.original_tools,
structured_tools=self.tools,
tools_handler=self.tools_handler,
agent=self.agent,
task=self.task,
crew=self.crew,
event_source=self,
printer=self._printer,
verbose=bool(self.agent and self.agent.verbose),
)
if call_result.func_name:
tool_calls_made.append(call_result.func_name)
if call_result.result_as_answer:
return str(call_result.result)
if call_result.tool_message:
raw_content = call_result.tool_message.get("content", "")
if isinstance(raw_content, str):
parsed = self._parse_vision_sentinel(raw_content)
if parsed:
media_type, b64_data = parsed
# Replace the sentinel with a standard image_url content block.
# Each provider SDK (LiteLLM → Anthropic, OpenAI native, etc.)
# converts the data-URI to its own wire format.
modified = dict(call_result.tool_message)
modified["content"] = [
{
"type": "image_url",
"image_url": {
"url": f"data:{media_type};base64,{b64_data}",
},
}
]
messages.append(modified)
tool_results.append("[image]")
else:
messages.append(call_result.tool_message)
if raw_content:
tool_results.append(raw_content)
else:
messages.append(call_result.tool_message)
if raw_content:
tool_results.append(str(raw_content))
return "\n".join(tool_results) if tool_results else ""

View File

@@ -69,7 +69,7 @@ ENV_VARS: dict[str, list[dict[str, Any]]] = {
},
{
"prompt": "Enter your AWS Region Name (press Enter to skip)",
"key_name": "AWS_DEFAULT_REGION",
"key_name": "AWS_REGION_NAME",
},
],
"azure": [

View File

@@ -74,6 +74,14 @@ from crewai.events.types.mcp_events import (
MCPToolExecutionFailedEvent,
MCPToolExecutionStartedEvent,
)
from crewai.events.types.observation_events import (
GoalAchievedEarlyEvent,
PlanRefinementEvent,
PlanReplanTriggeredEvent,
StepObservationCompletedEvent,
StepObservationFailedEvent,
StepObservationStartedEvent,
)
from crewai.events.types.reasoning_events import (
AgentReasoningCompletedEvent,
AgentReasoningFailedEvent,
@@ -534,6 +542,64 @@ class EventListener(BaseEventListener):
event.error,
)
# ----------- OBSERVATION EVENTS (Plan-and-Execute) -----------
@crewai_event_bus.on(StepObservationStartedEvent)
def on_step_observation_started(
_: Any, event: StepObservationStartedEvent
) -> None:
self.formatter.handle_observation_started(
event.agent_role,
event.step_number,
event.step_description,
)
@crewai_event_bus.on(StepObservationCompletedEvent)
def on_step_observation_completed(
_: Any, event: StepObservationCompletedEvent
) -> None:
self.formatter.handle_observation_completed(
event.agent_role,
event.step_number,
event.step_completed_successfully,
event.remaining_plan_still_valid,
event.key_information_learned,
event.needs_full_replan,
event.goal_already_achieved,
)
@crewai_event_bus.on(StepObservationFailedEvent)
def on_step_observation_failed(
_: Any, event: StepObservationFailedEvent
) -> None:
self.formatter.handle_observation_failed(
event.step_number,
event.error,
)
@crewai_event_bus.on(PlanRefinementEvent)
def on_plan_refinement(_: Any, event: PlanRefinementEvent) -> None:
self.formatter.handle_plan_refinement(
event.step_number,
event.refined_step_count,
event.refinements,
)
@crewai_event_bus.on(PlanReplanTriggeredEvent)
def on_plan_replan_triggered(_: Any, event: PlanReplanTriggeredEvent) -> None:
self.formatter.handle_plan_replan(
event.replan_reason,
event.replan_count,
event.completed_steps_preserved,
)
@crewai_event_bus.on(GoalAchievedEarlyEvent)
def on_goal_achieved_early(_: Any, event: GoalAchievedEarlyEvent) -> None:
self.formatter.handle_goal_achieved_early(
event.steps_completed,
event.steps_remaining,
)
# ----------- AGENT LOGGING EVENTS -----------
@crewai_event_bus.on(AgentLogsStartedEvent)

View File

@@ -93,6 +93,14 @@ from crewai.events.types.memory_events import (
MemorySaveFailedEvent,
MemorySaveStartedEvent,
)
from crewai.events.types.observation_events import (
GoalAchievedEarlyEvent,
PlanRefinementEvent,
PlanReplanTriggeredEvent,
StepObservationCompletedEvent,
StepObservationFailedEvent,
StepObservationStartedEvent,
)
from crewai.events.types.reasoning_events import (
AgentReasoningCompletedEvent,
AgentReasoningFailedEvent,
@@ -437,6 +445,39 @@ class TraceCollectionListener(BaseEventListener):
) -> None:
self._handle_action_event("agent_reasoning_failed", source, event)
# Observation events (Plan-and-Execute)
@event_bus.on(StepObservationStartedEvent)
def on_step_observation_started(
source: Any, event: StepObservationStartedEvent
) -> None:
self._handle_action_event("step_observation_started", source, event)
@event_bus.on(StepObservationCompletedEvent)
def on_step_observation_completed(
source: Any, event: StepObservationCompletedEvent
) -> None:
self._handle_action_event("step_observation_completed", source, event)
@event_bus.on(StepObservationFailedEvent)
def on_step_observation_failed(
source: Any, event: StepObservationFailedEvent
) -> None:
self._handle_action_event("step_observation_failed", source, event)
@event_bus.on(PlanRefinementEvent)
def on_plan_refinement(source: Any, event: PlanRefinementEvent) -> None:
self._handle_action_event("plan_refinement", source, event)
@event_bus.on(PlanReplanTriggeredEvent)
def on_plan_replan_triggered(
source: Any, event: PlanReplanTriggeredEvent
) -> None:
self._handle_action_event("plan_replan_triggered", source, event)
@event_bus.on(GoalAchievedEarlyEvent)
def on_goal_achieved_early(source: Any, event: GoalAchievedEarlyEvent) -> None:
self._handle_action_event("goal_achieved_early", source, event)
@event_bus.on(KnowledgeRetrievalStartedEvent)
def on_knowledge_retrieval_started(
source: Any, event: KnowledgeRetrievalStartedEvent

View File

@@ -0,0 +1,99 @@
"""Observation events for the Plan-and-Execute architecture.
Emitted during the Observation phase (PLAN-AND-ACT Section 3.3) when the
PlannerObserver analyzes step execution results and decides on plan
continuation, refinement, or replanning.
"""
from typing import Any
from crewai.events.base_events import BaseEvent
class ObservationEvent(BaseEvent):
"""Base event for observation phase events."""
type: str
agent_role: str
step_number: int
step_description: str = ""
from_task: Any | None = None
from_agent: Any | None = None
def __init__(self, **data: Any) -> None:
super().__init__(**data)
self._set_task_params(data)
self._set_agent_params(data)
class StepObservationStartedEvent(ObservationEvent):
"""Emitted when the Planner begins observing a step's result.
Fires after every step execution, before the observation LLM call.
"""
type: str = "step_observation_started"
class StepObservationCompletedEvent(ObservationEvent):
"""Emitted when the Planner finishes observing a step's result.
Contains the full observation analysis: what was learned, whether
the plan is still valid, and what action to take next.
"""
type: str = "step_observation_completed"
step_completed_successfully: bool = True
key_information_learned: str = ""
remaining_plan_still_valid: bool = True
needs_full_replan: bool = False
replan_reason: str | None = None
goal_already_achieved: bool = False
suggested_refinements: list[str] | None = None
class StepObservationFailedEvent(ObservationEvent):
"""Emitted when the observation LLM call itself fails.
The system defaults to continuing the plan when this happens,
but the event allows monitoring/alerting on observation failures.
"""
type: str = "step_observation_failed"
error: str = ""
class PlanRefinementEvent(ObservationEvent):
"""Emitted when the Planner refines upcoming step descriptions.
This is the lightweight refinement path — no full replan, just
sharpening pending todo descriptions based on new information.
"""
type: str = "plan_refinement"
refined_step_count: int = 0
refinements: list[str] | None = None
class PlanReplanTriggeredEvent(ObservationEvent):
"""Emitted when the Planner triggers a full replan.
The remaining plan was deemed fundamentally wrong and will be
regenerated from scratch, preserving completed step results.
"""
type: str = "plan_replan_triggered"
replan_reason: str = ""
replan_count: int = 0
completed_steps_preserved: int = 0
class GoalAchievedEarlyEvent(ObservationEvent):
"""Emitted when the Planner detects the goal was achieved early.
Remaining steps will be skipped and execution will finalize.
"""
type: str = "goal_achieved_early"
steps_remaining: int = 0
steps_completed: int = 0

View File

@@ -9,7 +9,7 @@ class ReasoningEvent(BaseEvent):
type: str
attempt: int = 1
agent_role: str
task_id: str
task_id: str | None = None
task_name: str | None = None
from_task: Any | None = None
agent_id: str | None = None

View File

@@ -936,6 +936,152 @@ To enable tracing, do any one of these:
)
self.print_panel(error_content, "❌ Reasoning Error", "red")
# ----------- OBSERVATION EVENTS (Plan-and-Execute) -----------
def handle_observation_started(
self,
agent_role: str,
step_number: int,
step_description: str,
) -> None:
"""Handle step observation started event."""
if not self.verbose:
return
content = Text()
content.append("Observation Started\n", style="cyan bold")
content.append("Agent: ", style="white")
content.append(f"{agent_role}\n", style="cyan")
content.append("Step: ", style="white")
content.append(f"{step_number}\n", style="cyan")
if step_description:
desc_preview = step_description[:80] + (
"..." if len(step_description) > 80 else ""
)
content.append("Description: ", style="white")
content.append(f"{desc_preview}\n", style="cyan")
self.print_panel(content, "🔍 Observing Step Result", "cyan")
def handle_observation_completed(
self,
agent_role: str,
step_number: int,
step_completed: bool,
plan_valid: bool,
key_info: str,
needs_replan: bool,
goal_achieved: bool,
) -> None:
"""Handle step observation completed event."""
if not self.verbose:
return
if goal_achieved:
style = "green"
status = "Goal Achieved Early"
elif needs_replan:
style = "yellow"
status = "Replan Needed"
elif plan_valid:
style = "green"
status = "Plan Valid — Continue"
else:
style = "red"
status = "Step Failed"
content = Text()
content.append("Observation Complete\n", style=f"{style} bold")
content.append("Step: ", style="white")
content.append(f"{step_number}\n", style=style)
content.append("Status: ", style="white")
content.append(f"{status}\n", style=style)
if key_info:
info_preview = key_info[:120] + ("..." if len(key_info) > 120 else "")
content.append("Learned: ", style="white")
content.append(f"{info_preview}\n", style=style)
self.print_panel(content, "🔍 Observation Result", style)
def handle_observation_failed(
self,
step_number: int,
error: str,
) -> None:
"""Handle step observation failure event."""
if not self.verbose:
return
error_content = self.create_status_content(
"Observation Failed",
"Error",
"red",
Step=str(step_number),
Error=error,
)
self.print_panel(error_content, "❌ Observation Error", "red")
def handle_plan_refinement(
self,
step_number: int,
refined_count: int,
refinements: list[str] | None,
) -> None:
"""Handle plan refinement event."""
if not self.verbose:
return
content = Text()
content.append("Plan Refined\n", style="cyan bold")
content.append("After Step: ", style="white")
content.append(f"{step_number}\n", style="cyan")
content.append("Steps Updated: ", style="white")
content.append(f"{refined_count}\n", style="cyan")
if refinements:
for r in refinements[:3]:
content.append(f"{r[:80]}\n", style="white")
self.print_panel(content, "✏️ Plan Refinement", "cyan")
def handle_plan_replan(
self,
reason: str,
replan_count: int,
preserved_count: int,
) -> None:
"""Handle plan replan triggered event."""
if not self.verbose:
return
content = Text()
content.append("Full Replan Triggered\n", style="yellow bold")
content.append("Reason: ", style="white")
content.append(f"{reason}\n", style="yellow")
content.append("Replan #: ", style="white")
content.append(f"{replan_count}\n", style="yellow")
content.append("Preserved Steps: ", style="white")
content.append(f"{preserved_count}\n", style="yellow")
self.print_panel(content, "🔄 Dynamic Replan", "yellow")
def handle_goal_achieved_early(
self,
steps_completed: int,
steps_remaining: int,
) -> None:
"""Handle goal achieved early event."""
if not self.verbose:
return
content = Text()
content.append("Goal Achieved Early!\n", style="green bold")
content.append("Completed: ", style="white")
content.append(f"{steps_completed} steps\n", style="green")
content.append("Skipped: ", style="white")
content.append(f"{steps_remaining} remaining steps\n", style="green")
self.print_panel(content, "🎯 Early Goal Achievement", "green")
# ----------- AGENT LOGGING EVENTS -----------
def handle_agent_logs_started(

File diff suppressed because it is too large Load Diff

View File

@@ -6,9 +6,27 @@ from typing import Any
from pydantic import BaseModel, Field
from crewai.utilities.planning_types import TodoItem
from crewai.utilities.types import LLMMessage
class TodoExecutionResult(BaseModel):
"""Summary of a single todo execution."""
step_number: int = Field(description="Step number in the plan")
description: str = Field(description="What the todo was supposed to do")
tool_used: str | None = Field(
default=None, description="Tool that was used for this step"
)
status: str = Field(description="Final status: completed, failed, pending")
result: str | None = Field(
default=None, description="Result or error message from execution"
)
depends_on: list[int] = Field(
default_factory=list, description="Step numbers this depended on"
)
class LiteAgentOutput(BaseModel):
"""Class that represents the result of a LiteAgent execution."""
@@ -24,12 +42,75 @@ class LiteAgentOutput(BaseModel):
)
messages: list[LLMMessage] = Field(description="Messages of the agent", default=[])
plan: str | None = Field(
default=None, description="The execution plan that was generated, if any"
)
todos: list[TodoExecutionResult] = Field(
default_factory=list,
description="List of todos that were executed with their results",
)
replan_count: int = Field(
default=0, description="Number of times the plan was regenerated"
)
last_replan_reason: str | None = Field(
default=None, description="Reason for the last replan, if any"
)
@classmethod
def from_todo_items(cls, todo_items: list[TodoItem]) -> list[TodoExecutionResult]:
"""Convert TodoItem objects to TodoExecutionResult summaries.
Args:
todo_items: List of TodoItem objects from execution.
Returns:
List of TodoExecutionResult summaries.
"""
return [
TodoExecutionResult(
step_number=item.step_number,
description=item.description,
tool_used=item.tool_to_use,
status=item.status,
result=item.result,
depends_on=item.depends_on,
)
for item in todo_items
]
def to_dict(self) -> dict[str, Any]:
"""Convert pydantic_output to a dictionary."""
if self.pydantic:
return self.pydantic.model_dump()
return {}
@property
def completed_todos(self) -> list[TodoExecutionResult]:
"""Get only the completed todos."""
return [t for t in self.todos if t.status == "completed"]
@property
def failed_todos(self) -> list[TodoExecutionResult]:
"""Get only the failed todos."""
return [t for t in self.todos if t.status == "failed"]
@property
def had_plan(self) -> bool:
"""Check if the agent executed with a plan."""
return self.plan is not None or len(self.todos) > 0
def __str__(self) -> str:
"""Return the raw output as a string."""
return self.raw
def __repr__(self) -> str:
"""Return a detailed representation including todo summary."""
parts = [f"LiteAgentOutput(role={self.agent_role!r}"]
if self.todos:
completed = len(self.completed_todos)
total = len(self.todos)
parts.append(f", todos={completed}/{total} completed")
if self.replan_count > 0:
parts.append(f", replans={self.replan_count}")
parts.append(")")
return "".join(parts)

View File

@@ -234,7 +234,7 @@ class BedrockCompletion(BaseLLM):
aws_access_key_id: str | None = None,
aws_secret_access_key: str | None = None,
aws_session_token: str | None = None,
region_name: str | None = None,
region_name: str = "us-east-1",
temperature: float | None = None,
max_tokens: int | None = None,
top_p: float | None = None,
@@ -287,6 +287,15 @@ class BedrockCompletion(BaseLLM):
**kwargs,
)
# Initialize Bedrock client with proper configuration
session = Session(
aws_access_key_id=aws_access_key_id or os.getenv("AWS_ACCESS_KEY_ID"),
aws_secret_access_key=aws_secret_access_key
or os.getenv("AWS_SECRET_ACCESS_KEY"),
aws_session_token=aws_session_token or os.getenv("AWS_SESSION_TOKEN"),
region_name=region_name,
)
# Configure client with timeouts and retries following AWS best practices
config = Config(
read_timeout=300,
@@ -297,12 +306,8 @@ class BedrockCompletion(BaseLLM):
tcp_keepalive=True,
)
self.region_name = (
region_name
or os.getenv("AWS_DEFAULT_REGION")
or os.getenv("AWS_REGION_NAME")
or "us-east-1"
)
self.client = session.client("bedrock-runtime", config=config)
self.region_name = region_name
self.aws_access_key_id = aws_access_key_id or os.getenv("AWS_ACCESS_KEY_ID")
self.aws_secret_access_key = aws_secret_access_key or os.getenv(
@@ -310,16 +315,6 @@ class BedrockCompletion(BaseLLM):
)
self.aws_session_token = aws_session_token or os.getenv("AWS_SESSION_TOKEN")
# Initialize Bedrock client with proper configuration
session = Session(
aws_access_key_id=self.aws_access_key_id,
aws_secret_access_key=self.aws_secret_access_key,
aws_session_token=self.aws_session_token,
region_name=self.region_name,
)
self.client = session.client("bedrock-runtime", config=config)
self._async_exit_stack = AsyncExitStack() if AIOBOTOCORE_AVAILABLE else None
self._async_client_initialized = False
@@ -1843,7 +1838,10 @@ class BedrockCompletion(BaseLLM):
)
# CRITICAL: Handle model-specific conversation requirements
# Cohere and some other models require conversation to end with user message
# Cohere and some other models require conversation to end with user message.
# Anthropic models on Bedrock also reject assistant messages in the final
# position when tools are present ("pre-filling the assistant response is
# not supported").
if converse_messages:
last_message = converse_messages[-1]
if last_message["role"] == "assistant":
@@ -1870,6 +1868,20 @@ class BedrockCompletion(BaseLLM):
"content": [{"text": "Continue your response."}],
}
)
# Anthropic (Claude) models reject assistant-last messages when
# tools are in the request. Append a user message so the
# Converse API accepts the payload.
elif "anthropic" in self.model.lower() or "claude" in self.model.lower():
converse_messages.append(
{
"role": "user",
"content": [
{
"text": "Please continue and provide your final answer."
}
],
}
)
# Ensure first message is from user (required by Converse API)
if not converse_messages:

View File

@@ -74,9 +74,28 @@
"consolidation_user": "New content to consider storing:\n{new_content}\n\nExisting similar memories:\n{records_summary}\n\nReturn the consolidation plan as structured output."
},
"reasoning": {
"initial_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are creating a strategic plan for a task that requires your expertise and unique perspective.",
"refine_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are refining a strategic plan for a task that requires your expertise and unique perspective.",
"create_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou have been assigned the following task:\n{description}\n\nExpected output:\n{expected_output}\n\nAvailable tools: {tools}\n\nBefore executing this task, create a detailed plan that leverages your expertise as {role} and outlines:\n1. Your understanding of the task from your professional perspective\n2. The key steps you'll take to complete it, drawing on your background and skills\n3. How you'll approach any challenges that might arise, considering your expertise\n4. How you'll strategically use the available tools based on your experience, exactly what tools to use and how to use them\n5. The expected outcome and how it aligns with your goal\n\nAfter creating your plan, assess whether you feel ready to execute the task or if you could do better.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan because [specific reason].\"",
"refine_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou created the following plan for this task:\n{current_plan}\n\nHowever, you indicated that you're not ready to execute the task yet.\n\nPlease refine your plan further, drawing on your expertise as {role} to address any gaps or uncertainties. As you refine your plan, be specific about which available tools you will use, how you will use them, and why they are the best choices for each step. Clearly outline your tool usage strategy as part of your improved plan.\n\nAfter refining your plan, assess whether you feel ready to execute the task.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan further because [specific reason].\""
"initial_plan": "You are {role}. Create a focused execution plan using only the essential steps needed.",
"refine_plan": "You are {role}. Refine your plan to address the specific gap while keeping it minimal.",
"create_plan_prompt": "You are {role}.\n\nTask: {description}\n\nExpected output: {expected_output}\n\nAvailable tools: {tools}\n\nCreate a focused plan with ONLY the essential steps needed. Most tasks require just 2-5 steps. Do NOT pad with unnecessary steps like \"review\", \"verify\", \"document\", or \"finalize\" unless explicitly required.\n\nFor each step, specify the action and which tool to use (if any).\n\nConclude with:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan because [specific reason].\"",
"refine_plan_prompt": "Your plan:\n{current_plan}\n\nYou indicated you're not ready. Address the specific gap while keeping the plan minimal.\n\nConclude with READY or NOT READY."
},
"planning": {
"system_prompt": "You are a strategic planning assistant. Create concrete, executable plans where every step produces a verifiable result.",
"create_plan_prompt": "Create an execution plan for the following task:\n\n## Task\n{description}\n\n## Expected Output\n{expected_output}\n\n## Available Tools\n{tools}\n\n## Planning Principles\nFocus on CONCRETE, EXECUTABLE steps. Each step must clearly state WHAT ACTION to take and HOW to verify it succeeded. The number of steps should match the task complexity. Hard limit: {max_steps} steps.\n\n## Rules:\n- Each step must have a clear DONE criterion\n- Do NOT group unrelated actions: if steps can fail independently, keep them separate\n- NO standalone \"thinking\" or \"planning\" steps — act, don't just observe\n- The last step must produce the required output\n\nAfter your plan, state READY or NOT READY.",
"refine_plan_prompt": "Your previous plan:\n{current_plan}\n\nYou indicated you weren't ready. Refine your plan to address the specific gap.\n\nKeep the plan minimal - only add steps that directly address the issue.\n\nConclude with READY or NOT READY as before.",
"observation_system_prompt": "You are a Planning Agent observing execution progress. After each step completes, you analyze what happened and decide whether the remaining plan is still valid.\n\nReason step-by-step about:\n1. Did this step produce a concrete, verifiable result? (file created, command succeeded, service running, etc.) — or did it only explore without acting?\n2. What new information was learned from this step's result?\n3. Whether the remaining steps still make sense given this new information\n4. What refinements, if any, are needed for upcoming steps\n5. Whether the overall goal has already been achieved\n\nCritical: mark `step_completed_successfully=false` if:\n- The step result is only exploratory (ls, pwd, cat) without producing the required artifact or action\n- A command returned a non-zero exit code and the error was not recovered\n- The step description required creating/building/starting something and the result shows it was not done\n\nBe conservative about triggering full replans — only do so when the remaining plan is fundamentally wrong, not just suboptimal.\n\nIMPORTANT: Set step_completed_successfully=false if:\n- The step's stated goal was NOT achieved (even if other things were done)\n- The first meaningful action returned an error (file not found, command not found, etc.)\n- The result is exploration/discovery output rather than the concrete action the step required\n- The step ran out of attempts without producing the required output\nSet needs_full_replan=true if the current plan's remaining steps reference paths or state that don't exist yet and need to be created first.",
"observation_user_prompt": "## Original task\n{task_description}\n\n## Expected output\n{task_goal}\n{completed_summary}\n\n## Just completed step {step_number}\nDescription: {step_description}\nResult: {step_result}\n{remaining_summary}\n\nAnalyze this step's result and provide your observation.",
"step_executor_system_prompt": "You are {role}. {backstory}\n\nYour goal: {goal}\n\nYou are executing ONE specific step in a larger plan. Your ONLY job is to fully complete this step — not to plan ahead.\n\nKey rules:\n- **ACT FIRST.** Execute the primary action of this step immediately. Do NOT read or explore files before attempting the main action unless exploration IS the step's goal.\n- If the step says 'run X', run X NOW. If it says 'write file Y', write Y NOW.\n- If the step requires producing an output file (e.g. /app/move.txt, report.jsonl, summary.csv), you MUST write that file using a tool call — do NOT just state the answer in text.\n- You may use tools MULTIPLE TIMES. After each tool use, check the result. If it failed, try a different approach.\n- Only output your Final Answer AFTER the concrete outcome is verified (file written, build succeeded, command exited 0).\n- If a command is not found or a path does not exist, fix it (different PATH, install missing deps, use absolute paths).\n- Do NOT spend more than 3 tool calls on exploration/analysis before attempting the primary action.{tools_section}",
"step_executor_tools_section": "\n\nAvailable tools: {tool_names}\n\nYou may call tools multiple times in sequence. Use this format for EACH tool call:\nThought: <what you observed and what you will try next>\nAction: <tool_name>\nAction Input: <input>\n\nAfter observing each result, decide: is the step complete? If yes:\nThought: The step is done because <evidence>\nFinal Answer: <concise summary of what was accomplished and the key result>",
"step_executor_user_prompt": "## Current Step\n{step_description}",
"step_executor_suggested_tool": "\nSuggested tool: {tool_to_use}",
"step_executor_context_header": "\n## Context from previous steps:",
"step_executor_context_entry": "Step {step_number} result: {result}",
"step_executor_complete_step": "\n**Execute the primary action of this step NOW.** If the step requires writing a file, write it. If it requires running a command, run it. Verify the outcome with a follow-up tool call, then give your Final Answer. Your Final Answer must confirm what was DONE (file created at path X, command succeeded), not just what should be done.",
"todo_system_prompt": "You are {role}. Your goal: {goal}\n\nYou are executing a specific step in a multi-step plan. Focus only on completing the current step. Use the suggested tool if one is provided. Be concise and provide clear results that can be used by subsequent steps.",
"synthesis_system_prompt": "You are {role}. You have completed a multi-step task. Synthesize the results from all steps into a single, coherent final response that directly addresses the original task. Do NOT list step numbers or say 'Step 1 result'. Produce a clean, polished answer as if you did it all at once.",
"synthesis_user_prompt": "## Original Task\n{task_description}\n\n## Results from each step\n{combined_steps}\n\nSynthesize these results into a single, coherent final answer.",
"replan_enhancement_prompt": "\n\nIMPORTANT: Previous execution attempt did not fully succeed. Please create a revised plan that accounts for the following context from the previous attempt:\n\n{previous_context}\n\nConsider:\n1. What steps succeeded and can be built upon\n2. What steps failed and why they might have failed\n3. Alternative approaches that might work better\n4. Whether dependencies need to be restructured",
"step_executor_task_context": "## Task Context\nThe following is the full task you are helping complete. Keep this in mind — especially any required output files, exact filenames, and expected formats.\n\n{task_context}\n\n---\n"
}
}
}

View File

@@ -3,6 +3,8 @@ from __future__ import annotations
import asyncio
from collections.abc import Callable, Sequence
import concurrent.futures
from dataclasses import dataclass, field
from datetime import datetime
import inspect
import json
import re
@@ -39,6 +41,7 @@ from crewai.utilities.types import LLMMessage
if TYPE_CHECKING:
from crewai.agent import Agent
from crewai.agents.crew_agent_executor import CrewAgentExecutor
from crewai.agents.tools_handler import ToolsHandler
from crewai.experimental.agent_executor import AgentExecutor
from crewai.lite_agent import LiteAgent
from crewai.llm import LLM
@@ -324,6 +327,66 @@ def enforce_rpm_limit(
request_within_rpm_limit()
def _prepare_llm_call(
executor_context: CrewAgentExecutor | AgentExecutor | LiteAgent | None,
messages: list[LLMMessage],
printer: Printer,
verbose: bool = True,
) -> list[LLMMessage]:
"""Shared pre-call logic: run before hooks and resolve messages.
Args:
executor_context: Optional executor context for hook invocation.
messages: The messages to send to the LLM.
printer: Printer instance for output.
verbose: Whether to print output.
Returns:
The resolved messages list (may come from executor_context).
Raises:
ValueError: If a before hook blocks the call.
"""
if executor_context is not None:
if not _setup_before_llm_call_hooks(executor_context, printer, verbose=verbose):
raise ValueError("LLM call blocked by before_llm_call hook")
messages = executor_context.messages
return messages
def _validate_and_finalize_llm_response(
answer: Any,
executor_context: CrewAgentExecutor | AgentExecutor | LiteAgent | None,
printer: Printer,
verbose: bool = True,
) -> str | BaseModel | Any:
"""Shared post-call logic: validate response and run after hooks.
Args:
answer: The raw LLM response.
executor_context: Optional executor context for hook invocation.
printer: Printer instance for output.
verbose: Whether to print output.
Returns:
The potentially modified response.
Raises:
ValueError: If the response is None or empty.
"""
if not answer:
if verbose:
printer.print(
content="Received None or empty response from LLM call.",
color="red",
)
raise ValueError("Invalid response from LLM call - None or empty.")
return _setup_after_llm_call_hooks(
executor_context, answer, printer, verbose=verbose
)
def get_llm_response(
llm: LLM | BaseLLM,
messages: list[LLMMessage],
@@ -360,11 +423,7 @@ def get_llm_response(
Exception: If an error occurs.
ValueError: If the response is None or empty.
"""
if executor_context is not None:
if not _setup_before_llm_call_hooks(executor_context, printer, verbose=verbose):
raise ValueError("LLM call blocked by before_llm_call hook")
messages = executor_context.messages
messages = _prepare_llm_call(executor_context, messages, printer, verbose=verbose)
try:
answer = llm.call(
@@ -378,16 +437,9 @@ def get_llm_response(
)
except Exception as e:
raise e
if not answer:
if verbose:
printer.print(
content="Received None or empty response from LLM call.",
color="red",
)
raise ValueError("Invalid response from LLM call - None or empty.")
return _setup_after_llm_call_hooks(
executor_context, answer, printer, verbose=verbose
return _validate_and_finalize_llm_response(
answer, executor_context, printer, verbose=verbose
)
@@ -417,6 +469,7 @@ async def aget_llm_response(
from_agent: Optional agent context for the LLM call.
response_model: Optional Pydantic model for structured outputs.
executor_context: Optional executor context for hook invocation.
verbose: Whether to print output.
Returns:
The response from the LLM as a string, Pydantic model (when response_model is provided),
@@ -426,10 +479,7 @@ async def aget_llm_response(
Exception: If an error occurs.
ValueError: If the response is None or empty.
"""
if executor_context is not None:
if not _setup_before_llm_call_hooks(executor_context, printer, verbose=verbose):
raise ValueError("LLM call blocked by before_llm_call hook")
messages = executor_context.messages
messages = _prepare_llm_call(executor_context, messages, printer, verbose=verbose)
try:
answer = await llm.acall(
@@ -443,16 +493,9 @@ async def aget_llm_response(
)
except Exception as e:
raise e
if not answer:
if verbose:
printer.print(
content="Received None or empty response from LLM call.",
color="red",
)
raise ValueError("Invalid response from LLM call - None or empty.")
return _setup_after_llm_call_hooks(
executor_context, answer, printer, verbose=verbose
return _validate_and_finalize_llm_response(
answer, executor_context, printer, verbose=verbose
)
@@ -1146,6 +1189,382 @@ def extract_tool_call_info(
return None
def is_tool_call_list(response: list[Any]) -> bool:
"""Check if a response from the LLM is a list of tool calls.
Supports OpenAI, Anthropic, Bedrock, and Gemini formats.
Args:
response: The response to check.
Returns:
True if the response appears to be a list of tool calls.
"""
if not response:
return False
first_item = response[0]
# OpenAI-style
if hasattr(first_item, "function") or (
isinstance(first_item, dict) and "function" in first_item
):
return True
# Anthropic-style (ToolUseBlock)
if hasattr(first_item, "type") and getattr(first_item, "type", None) == "tool_use":
return True
if hasattr(first_item, "name") and hasattr(first_item, "input"):
return True
# Bedrock-style
if isinstance(first_item, dict) and "name" in first_item and "input" in first_item:
return True
# Gemini-style
if hasattr(first_item, "function_call") and first_item.function_call:
return True
return False
def check_native_tool_support(llm: Any, original_tools: list[BaseTool] | None) -> bool:
"""Check if the LLM supports native function calling and tools are available.
Args:
llm: The LLM instance.
original_tools: Original BaseTool instances.
Returns:
True if native function calling is supported and tools exist.
"""
return (
hasattr(llm, "supports_function_calling")
and callable(getattr(llm, "supports_function_calling", None))
and llm.supports_function_calling()
and bool(original_tools)
)
def setup_native_tools(
original_tools: list[BaseTool],
) -> tuple[list[dict[str, Any]], dict[str, Callable[..., Any]]]:
"""Convert tools to OpenAI schema format for native function calling.
Args:
original_tools: Original BaseTool instances.
Returns:
Tuple of (openai_tools_schema, available_functions_dict).
"""
return convert_tools_to_openai_schema(original_tools)
def build_tool_calls_assistant_message(
tool_calls: list[Any],
) -> tuple[LLMMessage | None, list[dict[str, Any]]]:
"""Build an assistant message containing tool call reports.
Extracts info from each tool call, builds the standard assistant message
format, and preserves raw Gemini parts when applicable.
Args:
tool_calls: Raw tool call objects from the LLM response.
Returns:
Tuple of (assistant_message, tool_calls_to_report).
assistant_message is None if no valid tool calls found.
"""
tool_calls_to_report: list[dict[str, Any]] = []
for tool_call in tool_calls:
info = extract_tool_call_info(tool_call)
if not info:
continue
call_id, func_name, func_args = info
tool_calls_to_report.append(
{
"id": call_id,
"type": "function",
"function": {
"name": func_name,
"arguments": func_args
if isinstance(func_args, str)
else json.dumps(func_args),
},
}
)
if not tool_calls_to_report:
return None, []
assistant_message: LLMMessage = {
"role": "assistant",
"content": None,
"tool_calls": tool_calls_to_report,
}
# Preserve raw parts for Gemini compatibility
if all(type(tc).__qualname__ == "Part" for tc in tool_calls):
assistant_message["raw_tool_call_parts"] = list(tool_calls)
return assistant_message, tool_calls_to_report
@dataclass
class NativeToolCallResult:
"""Result from executing a single native tool call."""
call_id: str
func_name: str
result: str
from_cache: bool = False
result_as_answer: bool = False
tool_message: LLMMessage = field(default_factory=dict) # type: ignore[assignment]
def execute_single_native_tool_call(
tool_call: Any,
*,
available_functions: dict[str, Callable[..., Any]],
original_tools: list[BaseTool],
structured_tools: list[CrewStructuredTool] | None,
tools_handler: ToolsHandler | None,
agent: Agent | None,
task: Task | None,
crew: Any | None,
event_source: Any,
printer: Printer | None = None,
verbose: bool = False,
) -> NativeToolCallResult:
"""Execute a single native tool call with full lifecycle management.
Handles: arg parsing, tool lookup, max-usage check, cache read/write,
before/after hooks, event emission, and result_as_answer detection.
Args:
tool_call: Raw tool call object from the LLM.
available_functions: Map of sanitized tool name -> callable.
original_tools: Original BaseTool list (for cache_function, result_as_answer).
structured_tools: Structured tools list (for hook context).
tools_handler: Optional handler with cache.
agent: The agent instance.
task: The current task.
crew: The crew instance.
event_source: The object to use as event emitter source.
printer: Optional printer for verbose logging.
verbose: Whether to print verbose output.
Returns:
NativeToolCallResult with all execution details.
"""
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.tool_usage_events import (
ToolUsageErrorEvent,
ToolUsageFinishedEvent,
ToolUsageStartedEvent,
)
from crewai.hooks.tool_hooks import (
ToolCallHookContext,
get_after_tool_call_hooks,
get_before_tool_call_hooks,
)
info = extract_tool_call_info(tool_call)
if not info:
return NativeToolCallResult(
call_id="", func_name="", result="Unrecognized tool call format"
)
call_id, func_name, func_args = info
# Parse arguments
if isinstance(func_args, str):
try:
args_dict = json.loads(func_args)
except json.JSONDecodeError:
args_dict = {}
else:
args_dict = func_args
agent_key = getattr(agent, "key", "unknown") if agent else "unknown"
# Find original tool for cache_function and result_as_answer
original_tool: BaseTool | None = None
for tool in original_tools:
if sanitize_tool_name(tool.name) == func_name:
original_tool = tool
break
# Check max usage count
max_usage_reached = False
if (
original_tool
and original_tool.max_usage_count is not None
and original_tool.current_usage_count >= original_tool.max_usage_count
):
max_usage_reached = True
# Check cache
from_cache = False
input_str = json.dumps(args_dict) if args_dict else ""
result = "Tool not found"
if tools_handler and tools_handler.cache:
cached_result = tools_handler.cache.read(tool=func_name, input=input_str)
if cached_result is not None:
result = (
str(cached_result)
if not isinstance(cached_result, str)
else cached_result
)
from_cache = True
# Emit tool started event
started_at = datetime.now()
crewai_event_bus.emit(
event_source,
event=ToolUsageStartedEvent(
tool_name=func_name,
tool_args=args_dict,
from_agent=agent,
from_task=task,
agent_key=agent_key,
),
)
track_delegation_if_needed(func_name, args_dict, task)
# Find structured tool for hooks
structured_tool: CrewStructuredTool | None = None
for structured in structured_tools or []:
if sanitize_tool_name(structured.name) == func_name:
structured_tool = structured
break
# Before hooks
hook_blocked = False
before_hook_context = ToolCallHookContext(
tool_name=func_name,
tool_input=args_dict,
tool=structured_tool, # type: ignore[arg-type]
agent=agent,
task=task,
crew=crew,
)
try:
for hook in get_before_tool_call_hooks():
if hook(before_hook_context) is False:
hook_blocked = True
break
except Exception: # noqa: S110
pass
error_event_emitted = False
if hook_blocked:
result = f"Tool execution blocked by hook. Tool: {func_name}"
elif not from_cache and not max_usage_reached:
if func_name in available_functions:
try:
tool_func = available_functions[func_name]
raw_result = tool_func(**args_dict)
# Cache result
if tools_handler and tools_handler.cache:
should_cache = True
if original_tool:
should_cache = original_tool.cache_function(
args_dict, raw_result
)
if should_cache:
tools_handler.cache.add(
tool=func_name, input=input_str, output=raw_result
)
result = (
str(raw_result) if not isinstance(raw_result, str) else raw_result
)
except Exception as e:
result = f"Error executing tool: {e}"
if task:
task.increment_tools_errors()
crewai_event_bus.emit(
event_source,
event=ToolUsageErrorEvent(
tool_name=func_name,
tool_args=args_dict,
from_agent=agent,
from_task=task,
agent_key=agent_key,
error=e,
),
)
error_event_emitted = True
elif max_usage_reached and original_tool:
result = (
f"Tool '{func_name}' has reached its usage limit of "
f"{original_tool.max_usage_count} times and cannot be used anymore."
)
# After hooks
after_hook_context = ToolCallHookContext(
tool_name=func_name,
tool_input=args_dict,
tool=structured_tool, # type: ignore[arg-type]
agent=agent,
task=task,
crew=crew,
tool_result=result,
)
try:
for after_hook in get_after_tool_call_hooks():
hook_result = after_hook(after_hook_context)
if hook_result is not None:
result = hook_result
after_hook_context.tool_result = result
except Exception: # noqa: S110
pass
# Emit tool finished event (only if error event wasn't already emitted)
if not error_event_emitted:
crewai_event_bus.emit(
event_source,
event=ToolUsageFinishedEvent(
output=result,
tool_name=func_name,
tool_args=args_dict,
from_agent=agent,
from_task=task,
agent_key=agent_key,
started_at=started_at,
finished_at=datetime.now(),
),
)
# Build tool result message
tool_message: LLMMessage = {
"role": "tool",
"tool_call_id": call_id,
"name": func_name,
"content": result,
}
if verbose and printer:
cache_info = " (from cache)" if from_cache else ""
printer.print(
content=f"Tool {func_name} executed with result{cache_info}: {result[:200]}...",
color="green",
)
# Check result_as_answer
is_result_as_answer = bool(
original_tool
and hasattr(original_tool, "result_as_answer")
and original_tool.result_as_answer
)
return NativeToolCallResult(
call_id=call_id,
func_name=func_name,
result=result,
from_cache=from_cache,
result_as_answer=is_result_as_answer,
tool_message=tool_message,
)
def _setup_before_llm_call_hooks(
executor_context: CrewAgentExecutor | AgentExecutor | LiteAgent | None,
printer: Printer,

View File

@@ -100,7 +100,13 @@ class I18N(BaseModel):
def retrieve(
self,
kind: Literal[
"slices", "errors", "tools", "reasoning", "hierarchical_manager_agent", "memory"
"slices",
"errors",
"tools",
"reasoning",
"planning",
"hierarchical_manager_agent",
"memory",
],
key: str,
) -> str:

View File

@@ -69,7 +69,7 @@ def create_llm(
UNACCEPTED_ATTRIBUTES: Final[list[str]] = [
"AWS_ACCESS_KEY_ID",
"AWS_SECRET_ACCESS_KEY",
"AWS_DEFAULT_REGION",
"AWS_REGION_NAME",
]
@@ -146,7 +146,7 @@ def _llm_via_environment_or_fallback() -> LLM | None:
unaccepted_attributes = [
"AWS_ACCESS_KEY_ID",
"AWS_SECRET_ACCESS_KEY",
"AWS_DEFAULT_REGION",
"AWS_REGION_NAME",
]
set_provider = model_name.partition("/")[0] if "/" in model_name else "openai"

View File

@@ -0,0 +1,256 @@
"""Types for agent planning and todo tracking."""
from __future__ import annotations
from typing import Literal
from uuid import uuid4
from pydantic import BaseModel, Field, field_validator
# Todo status type
TodoStatus = Literal["pending", "running", "completed"]
class PlanStep(BaseModel):
"""A single step in the reasoning plan."""
step_number: int = Field(description="Step number (1-based)")
description: str = Field(description="What to do in this step")
tool_to_use: str | None = Field(
default=None, description="Tool to use for this step, if any"
)
depends_on: list[int] = Field(
default_factory=list, description="Step numbers this step depends on"
)
class TodoItem(BaseModel):
"""A single todo item representing a step in the execution plan."""
id: str = Field(default_factory=lambda: str(uuid4()))
step_number: int = Field(description="Order of this step in the plan (1-based)")
description: str = Field(description="What needs to be done")
tool_to_use: str | None = Field(
default=None, description="Tool to use for this step, if any"
)
status: TodoStatus = Field(default="pending", description="Current status")
depends_on: list[int] = Field(
default_factory=list, description="Step numbers this depends on"
)
result: str | None = Field(
default=None, description="Result after completion, if any"
)
class TodoList(BaseModel):
"""Collection of todos for tracking plan execution."""
items: list[TodoItem] = Field(default_factory=list)
@property
def current_todo(self) -> TodoItem | None:
"""Get the currently running todo item."""
for item in self.items:
if item.status == "running":
return item
return None
@property
def next_pending(self) -> TodoItem | None:
"""Get the next pending todo item."""
for item in self.items:
if item.status == "pending":
return item
return None
@property
def is_complete(self) -> bool:
"""Check if all todos are completed."""
return len(self.items) > 0 and all(
item.status == "completed" for item in self.items
)
@property
def pending_count(self) -> int:
"""Count of pending todos."""
return sum(1 for item in self.items if item.status == "pending")
@property
def completed_count(self) -> int:
"""Count of completed todos."""
return sum(1 for item in self.items if item.status == "completed")
def get_by_step_number(self, step_number: int) -> TodoItem | None:
"""Get a todo by its step number."""
for item in self.items:
if item.step_number == step_number:
return item
return None
def mark_running(self, step_number: int) -> None:
"""Mark a todo as running by step number."""
item = self.get_by_step_number(step_number)
if item:
item.status = "running"
def mark_completed(self, step_number: int, result: str | None = None) -> None:
"""Mark a todo as completed by step number."""
item = self.get_by_step_number(step_number)
if item:
item.status = "completed"
if result:
item.result = result
def _dependencies_satisfied(self, item: TodoItem) -> bool:
"""Check if all dependencies for a todo item are completed.
Args:
item: The todo item to check dependencies for.
Returns:
True if all dependencies are completed, False otherwise.
"""
for dep_num in item.depends_on:
dep = self.get_by_step_number(dep_num)
if dep is None or dep.status != "completed":
return False
return True
def get_ready_todos(self) -> list[TodoItem]:
"""Get all todos that are ready to execute (pending with satisfied dependencies).
Returns:
List of TodoItem objects that can be executed now.
"""
ready: list[TodoItem] = []
for item in self.items:
if item.status != "pending":
continue
if self._dependencies_satisfied(item):
ready.append(item)
return ready
@property
def can_parallelize(self) -> bool:
"""Check if multiple todos can run in parallel.
Returns:
True if more than one todo is ready to execute.
"""
return len(self.get_ready_todos()) > 1
@property
def running_count(self) -> int:
"""Count of currently running todos."""
return sum(1 for item in self.items if item.status == "running")
def get_completed_todos(self) -> list[TodoItem]:
"""Get all completed todos.
Returns:
List of completed TodoItem objects.
"""
return [item for item in self.items if item.status == "completed"]
def get_pending_todos(self) -> list[TodoItem]:
"""Get all pending todos.
Returns:
List of pending TodoItem objects.
"""
return [item for item in self.items if item.status == "pending"]
def replace_pending_todos(self, new_items: list[TodoItem]) -> None:
"""Replace all pending todos with new items.
Preserves completed and running todos, replaces only pending ones.
Used during replanning to swap in a new plan for remaining work.
Args:
new_items: The new todo items to replace pending ones.
"""
non_pending = [item for item in self.items if item.status != "pending"]
self.items = non_pending + new_items
class StepRefinement(BaseModel):
"""A structured in-place update for a single pending step.
Returned as part of StepObservation when the Planner learns new
information that makes a pending step description more specific.
Applied directly — no second LLM call required.
"""
step_number: int = Field(description="The step number to update (1-based)")
new_description: str = Field(
description="The updated, more specific description for this step"
)
class StepObservation(BaseModel):
"""Planner's observation after a step execution completes.
Returned by the PlannerObserver after EVERY step — not just failures.
The Planner uses this to decide whether to continue, refine, or replan.
Based on PLAN-AND-ACT (Section 3.3): the Planner observes what the Executor
did and incorporates new information into the remaining plan.
Attributes:
step_completed_successfully: Whether the step achieved its objective.
key_information_learned: New information revealed by this step
(e.g., "Found 3 products: A, B, C"). Used to refine upcoming steps.
remaining_plan_still_valid: Whether pending todos still make sense
given the new information. True does NOT mean no refinement needed.
suggested_refinements: Structured in-place updates to pending step
descriptions. Each entry targets a specific step by number. These
are applied directly without a second LLM call.
Example: [{"step_number": 3, "new_description": "Select product B (highest rated)"}]
needs_full_replan: The remaining plan is fundamentally wrong and must
be regenerated from scratch. Mutually exclusive with
remaining_plan_still_valid (if this is True, that should be False).
replan_reason: Explanation of why a full replan is needed (None if not).
goal_already_achieved: The overall task goal has been satisfied early.
No more steps needed — skip remaining todos and finalize.
"""
step_completed_successfully: bool = Field(
description="Whether the step achieved what it was asked to do"
)
key_information_learned: str = Field(
default="",
description="What new information this step revealed",
)
remaining_plan_still_valid: bool = Field(
default=True,
description="Whether the remaining pending todos still make sense given new information",
)
suggested_refinements: list[StepRefinement] | None = Field(
default=None,
description=(
"Structured updates to pending step descriptions based on new information. "
"Each entry specifies a step_number and new_description. "
"Applied directly — no separate replan needed."
),
)
@field_validator("suggested_refinements", mode="before")
@classmethod
def coerce_single_refinement_to_list(cls, v):
"""Coerce a single dict refinement into a list to handle LLM returning a single object."""
if isinstance(v, dict):
return [v]
return v
needs_full_replan: bool = Field(
default=False,
description="The remaining plan is fundamentally wrong and must be regenerated",
)
replan_reason: str | None = Field(
default=None,
description="Explanation of why a full replan is needed",
)
goal_already_achieved: bool = Field(
default=False,
description="The overall task goal has been satisfied early; no more steps needed",
)

View File

@@ -1,10 +1,13 @@
"""Handles planning/reasoning for agents before task execution."""
from __future__ import annotations
import json
import logging
from typing import Any, Final, Literal, cast
from typing import TYPE_CHECKING, Any, Final, Literal, cast
from pydantic import BaseModel, Field
from crewai.agent import Agent
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.reasoning_events import (
AgentReasoningCompletedEvent,
@@ -12,14 +15,24 @@ from crewai.events.types.reasoning_events import (
AgentReasoningStartedEvent,
)
from crewai.llm import LLM
from crewai.task import Task
from crewai.utilities.llm_utils import create_llm
from crewai.utilities.planning_types import PlanStep
from crewai.utilities.string_utils import sanitize_tool_name
if TYPE_CHECKING:
from crewai.agent import Agent
from crewai.agent.planning_config import PlanningConfig
from crewai.task import Task
class ReasoningPlan(BaseModel):
"""Model representing a reasoning plan for a task."""
plan: str = Field(description="The detailed reasoning plan for the task.")
steps: list[PlanStep] = Field(
default_factory=list, description="Structured steps to execute"
)
ready: bool = Field(description="Whether the agent is ready to execute the task.")
@@ -29,24 +42,63 @@ class AgentReasoningOutput(BaseModel):
plan: ReasoningPlan = Field(description="The reasoning plan for the task.")
# Aliases for backward compatibility
PlanningPlan = ReasoningPlan
AgentPlanningOutput = AgentReasoningOutput
FUNCTION_SCHEMA: Final[dict[str, Any]] = {
"type": "function",
"function": {
"name": "create_reasoning_plan",
"description": "Create or refine a reasoning plan for a task",
"description": "Create or refine a reasoning plan for a task with structured steps",
"parameters": {
"type": "object",
"properties": {
"plan": {
"type": "string",
"description": "The detailed reasoning plan for the task.",
"description": "A brief summary of the overall plan.",
},
"steps": {
"type": "array",
"description": "List of discrete steps to execute the plan",
"items": {
"type": "object",
"properties": {
"step_number": {
"type": "integer",
"description": "Step number (1-based)",
},
"description": {
"type": "string",
"description": "What to do in this step",
},
"tool_to_use": {
"type": ["string", "null"],
"description": "Tool to use for this step, or null if no tool needed",
},
"depends_on": {
"type": "array",
"items": {"type": "integer"},
"description": "Step numbers this step depends on (empty array if none)",
},
},
"required": [
"step_number",
"description",
"tool_to_use",
"depends_on",
],
"additionalProperties": False,
},
},
"ready": {
"type": "boolean",
"description": "Whether the agent is ready to execute the task.",
},
},
"required": ["plan", "ready"],
"required": ["plan", "steps", "ready"],
"additionalProperties": False,
},
},
}
@@ -54,41 +106,101 @@ FUNCTION_SCHEMA: Final[dict[str, Any]] = {
class AgentReasoning:
"""
Handles the agent reasoning process, enabling an agent to reflect and create a plan
before executing a task.
Handles the agent planning/reasoning process, enabling an agent to reflect
and create a plan before executing a task.
Attributes:
task: The task for which the agent is reasoning.
agent: The agent performing the reasoning.
llm: The language model used for reasoning.
task: The task for which the agent is planning (optional).
agent: The agent performing the planning.
config: The planning configuration.
llm: The language model used for planning.
logger: Logger for logging events and errors.
description: Task description or input text for planning.
expected_output: Expected output description.
"""
def __init__(self, task: Task, agent: Agent) -> None:
"""Initialize the AgentReasoning with a task and an agent.
def __init__(
self,
agent: Agent,
task: Task | None = None,
*,
description: str | None = None,
expected_output: str | None = None,
) -> None:
"""Initialize the AgentReasoning with an agent and optional task.
Args:
task: The task for which the agent is reasoning.
agent: The agent performing the reasoning.
agent: The agent performing the planning.
task: The task for which the agent is planning (optional).
description: Task description or input text (used if task is None).
expected_output: Expected output (used if task is None).
"""
self.task = task
self.agent = agent
self.llm = cast(LLM, agent.llm)
self.task = task
# Use task attributes if available, otherwise use provided values
self._description = description or (
task.description if task else "Complete the requested task"
)
self._expected_output = expected_output or (
task.expected_output if task else "Complete the task successfully"
)
self.config = self._get_planning_config()
self.llm = self._resolve_llm()
self.logger = logging.getLogger(__name__)
def handle_agent_reasoning(self) -> AgentReasoningOutput:
"""Public method for the reasoning process that creates and refines a plan for the task until the agent is ready to execute it.
@property
def description(self) -> str:
"""Get the task/input description."""
return self._description
@property
def expected_output(self) -> str:
"""Get the expected output."""
return self._expected_output
def _get_planning_config(self) -> PlanningConfig:
"""Get the planning configuration from the agent.
Returns:
AgentReasoningOutput: The output of the agent reasoning process.
The planning configuration, using defaults if not set.
"""
# Emit a reasoning started event (attempt 1)
from crewai.agent.planning_config import PlanningConfig
if self.agent.planning_config is not None:
return self.agent.planning_config
# Fallback for backward compatibility
return PlanningConfig(
max_attempts=getattr(self.agent, "max_reasoning_attempts", None),
)
def _resolve_llm(self) -> LLM:
"""Resolve which LLM to use for planning.
Returns:
The LLM to use - either from config or the agent's LLM.
"""
if self.config.llm is not None:
if isinstance(self.config.llm, LLM):
return self.config.llm
return create_llm(self.config.llm)
return cast(LLM, self.agent.llm)
def handle_agent_reasoning(self) -> AgentReasoningOutput:
"""Public method for the planning process that creates and refines a plan
for the task until the agent is ready to execute it.
Returns:
AgentReasoningOutput: The output of the agent planning process.
"""
task_id = str(self.task.id) if self.task else "kickoff"
# Emit a planning started event (attempt 1)
try:
crewai_event_bus.emit(
self.agent,
AgentReasoningStartedEvent(
agent_role=self.agent.role,
task_id=str(self.task.id),
task_id=task_id,
attempt=1,
from_task=self.task,
),
@@ -98,13 +210,13 @@ class AgentReasoning:
pass
try:
output = self.__handle_agent_reasoning()
output = self._execute_planning()
crewai_event_bus.emit(
self.agent,
AgentReasoningCompletedEvent(
agent_role=self.agent.role,
task_id=str(self.task.id),
task_id=task_id,
plan=output.plan.plan,
ready=output.plan.ready,
attempt=1,
@@ -115,71 +227,76 @@ class AgentReasoning:
return output
except Exception as e:
# Emit reasoning failed event
# Emit planning failed event
try:
crewai_event_bus.emit(
self.agent,
AgentReasoningFailedEvent(
agent_role=self.agent.role,
task_id=str(self.task.id),
task_id=task_id,
error=str(e),
attempt=1,
from_task=self.task,
from_agent=self.agent,
),
)
except Exception as e:
logging.error(f"Error emitting reasoning failed event: {e}")
except Exception as event_error:
logging.error(f"Error emitting planning failed event: {event_error}")
raise
def __handle_agent_reasoning(self) -> AgentReasoningOutput:
"""Private method that handles the agent reasoning process.
def _execute_planning(self) -> AgentReasoningOutput:
"""Execute the planning process.
Returns:
The output of the agent reasoning process.
The output of the agent planning process.
"""
plan, ready = self.__create_initial_plan()
plan, steps, ready = self._create_initial_plan()
plan, steps, ready = self._refine_plan_if_needed(plan, steps, ready)
plan, ready = self.__refine_plan_if_needed(plan, ready)
reasoning_plan = ReasoningPlan(plan=plan, ready=ready)
reasoning_plan = ReasoningPlan(plan=plan, steps=steps, ready=ready)
return AgentReasoningOutput(plan=reasoning_plan)
def __create_initial_plan(self) -> tuple[str, bool]:
"""Creates the initial reasoning plan for the task.
def _create_initial_plan(self) -> tuple[str, list[PlanStep], bool]:
"""Creates the initial plan for the task.
Returns:
The initial plan and whether the agent is ready to execute the task.
A tuple of the plan summary, list of steps, and whether the agent is ready.
"""
reasoning_prompt = self.__create_reasoning_prompt()
planning_prompt = self._create_planning_prompt()
if self.llm.supports_function_calling():
plan, ready = self.__call_with_function(reasoning_prompt, "initial_plan")
return plan, ready
response = _call_llm_with_reasoning_prompt(
llm=self.llm,
prompt=reasoning_prompt,
task=self.task,
reasoning_agent=self.agent,
backstory=self.__get_agent_backstory(),
plan_type="initial_plan",
plan, steps, ready = self._call_with_function(
planning_prompt, "create_plan"
)
return plan, steps, ready
response = self._call_llm_with_prompt(
prompt=planning_prompt,
plan_type="create_plan",
)
return self.__parse_reasoning_response(str(response))
plan, ready = self._parse_planning_response(str(response))
return plan, [], ready # No structured steps from text parsing
def __refine_plan_if_needed(self, plan: str, ready: bool) -> tuple[str, bool]:
"""Refines the reasoning plan if the agent is not ready to execute the task.
def _refine_plan_if_needed(
self, plan: str, steps: list[PlanStep], ready: bool
) -> tuple[str, list[PlanStep], bool]:
"""Refines the plan if the agent is not ready to execute the task.
Args:
plan: The current reasoning plan.
plan: The current plan.
steps: The current list of steps.
ready: Whether the agent is ready to execute the task.
Returns:
The refined plan and whether the agent is ready to execute the task.
The refined plan, steps, and whether the agent is ready to execute.
"""
attempt = 1
max_attempts = self.agent.max_reasoning_attempts
max_attempts = self.config.max_attempts
task_id = str(self.task.id) if self.task else "kickoff"
current_attempt = attempt + 1
while not ready and (max_attempts is None or attempt < max_attempts):
# Emit event for each refinement attempt
@@ -188,62 +305,81 @@ class AgentReasoning:
self.agent,
AgentReasoningStartedEvent(
agent_role=self.agent.role,
task_id=str(self.task.id),
attempt=attempt + 1,
task_id=task_id,
attempt=current_attempt,
from_task=self.task,
),
)
except Exception: # noqa: S110
pass
refine_prompt = self.__create_refine_prompt(plan)
refine_prompt = self._create_refine_prompt(plan)
if self.llm.supports_function_calling():
plan, ready = self.__call_with_function(refine_prompt, "refine_plan")
plan, steps, ready = self._call_with_function(
refine_prompt, "refine_plan"
)
else:
response = _call_llm_with_reasoning_prompt(
llm=self.llm,
response = self._call_llm_with_prompt(
prompt=refine_prompt,
task=self.task,
reasoning_agent=self.agent,
backstory=self.__get_agent_backstory(),
plan_type="refine_plan",
)
plan, ready = self.__parse_reasoning_response(str(response))
plan, ready = self._parse_planning_response(str(response))
steps = [] # No structured steps from text parsing
# Emit completed event for this refinement attempt
try:
crewai_event_bus.emit(
self.agent,
AgentReasoningCompletedEvent(
agent_role=self.agent.role,
task_id=task_id,
plan=plan,
ready=ready,
attempt=current_attempt,
from_task=self.task,
from_agent=self.agent,
),
)
except Exception: # noqa: S110
pass
attempt += 1
if max_attempts is not None and attempt >= max_attempts:
self.logger.warning(
f"Agent reasoning reached maximum attempts ({max_attempts}) without being ready. Proceeding with current plan."
f"Agent planning reached maximum attempts ({max_attempts}) "
"without being ready. Proceeding with current plan."
)
break
return plan, ready
return plan, steps, ready
def __call_with_function(self, prompt: str, prompt_type: str) -> tuple[str, bool]:
"""Calls the LLM with function calling to get a reasoning plan.
def _call_with_function(
self, prompt: str, plan_type: Literal["create_plan", "refine_plan"]
) -> tuple[str, list[PlanStep], bool]:
"""Calls the LLM with function calling to get a plan.
Args:
prompt: The prompt to send to the LLM.
prompt_type: The type of prompt (initial_plan or refine_plan).
plan_type: The type of plan being created.
Returns:
A tuple containing the plan and whether the agent is ready.
A tuple containing the plan summary, list of steps, and whether the agent is ready.
"""
self.logger.debug(f"Using function calling for {prompt_type} reasoning")
self.logger.debug(f"Using function calling for {plan_type} planning")
try:
system_prompt = self.agent.i18n.retrieve("reasoning", prompt_type).format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory(),
)
system_prompt = self._get_system_prompt()
# Prepare a simple callable that just returns the tool arguments as JSON
def _create_reasoning_plan(plan: str, ready: bool = True) -> str:
"""Return the reasoning plan result in JSON string form."""
return json.dumps({"plan": plan, "ready": ready})
def _create_reasoning_plan(
plan: str,
steps: list[dict[str, Any]] | None = None,
ready: bool = True,
) -> str:
"""Return the planning result in JSON string form."""
return json.dumps({"plan": plan, "steps": steps or [], "ready": ready})
response = self.llm.call(
[
@@ -255,19 +391,33 @@ class AgentReasoning:
from_task=self.task,
from_agent=self.agent,
)
self.logger.debug(f"Function calling response: {response[:100]}...")
try:
result = json.loads(response)
if "plan" in result and "ready" in result:
return result["plan"], result["ready"]
# Parse steps from the response
steps: list[PlanStep] = []
raw_steps = result.get("steps", [])
try:
for step_data in raw_steps:
step = PlanStep(
step_number=step_data.get("step_number", 0),
description=step_data.get("description", ""),
tool_to_use=step_data.get("tool_to_use"),
depends_on=step_data.get("depends_on", []),
)
steps.append(step)
except Exception as step_error:
self.logger.warning(
f"Failed to parse step: {step_data}, error: {step_error}"
)
return result["plan"], steps, result["ready"]
except (json.JSONDecodeError, KeyError):
pass
response_str = str(response)
return (
response_str,
[],
"READY: I am ready to execute the task." in response_str,
)
@@ -277,13 +427,7 @@ class AgentReasoning:
)
try:
system_prompt = self.agent.i18n.retrieve(
"reasoning", prompt_type
).format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory(),
)
system_prompt = self._get_system_prompt()
fallback_response = self.llm.call(
[
@@ -297,78 +441,165 @@ class AgentReasoning:
fallback_str = str(fallback_response)
return (
fallback_str,
[],
"READY: I am ready to execute the task." in fallback_str,
)
except Exception as inner_e:
self.logger.error(f"Error during fallback text parsing: {inner_e!s}")
return (
"Failed to generate a plan due to an error.",
[],
True,
) # Default to ready to avoid getting stuck
def __get_agent_backstory(self) -> str:
"""
Safely gets the agent's backstory, providing a default if not available.
def _call_llm_with_prompt(
self,
prompt: str,
plan_type: Literal["create_plan", "refine_plan"],
) -> str:
"""Calls the LLM with the planning prompt.
Args:
prompt: The prompt to send to the LLM.
plan_type: The type of plan being created.
Returns:
str: The agent's backstory or a default value.
The LLM response.
"""
system_prompt = self._get_system_prompt()
response = self.llm.call(
[
{"role": "system", "content": system_prompt},
{"role": "user", "content": prompt},
],
from_task=self.task,
from_agent=self.agent,
)
return str(response)
def _get_system_prompt(self) -> str:
"""Get the system prompt for planning.
Returns:
The system prompt, either custom or from i18n.
"""
if self.config.system_prompt is not None:
return self.config.system_prompt
# Try new "planning" section first, fall back to "reasoning" for compatibility
try:
return self.agent.i18n.retrieve("planning", "system_prompt")
except (KeyError, AttributeError):
# Fallback to reasoning section for backward compatibility
return self.agent.i18n.retrieve("reasoning", "initial_plan").format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self._get_agent_backstory(),
)
def _get_agent_backstory(self) -> str:
"""Safely gets the agent's backstory, providing a default if not available.
Returns:
The agent's backstory or a default value.
"""
return getattr(self.agent, "backstory", "No backstory provided")
def __create_reasoning_prompt(self) -> str:
"""
Creates a prompt for the agent to reason about the task.
def _create_planning_prompt(self) -> str:
"""Creates a prompt for the agent to plan the task.
Returns:
str: The reasoning prompt.
The planning prompt.
"""
available_tools = self.__format_available_tools()
available_tools = self._format_available_tools()
return self.agent.i18n.retrieve("reasoning", "create_plan_prompt").format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory(),
description=self.task.description,
expected_output=self.task.expected_output,
tools=available_tools,
)
# Use custom prompt if provided
if self.config.plan_prompt is not None:
return self.config.plan_prompt.format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self._get_agent_backstory(),
description=self.description,
expected_output=self.expected_output,
tools=available_tools,
max_steps=self.config.max_steps,
)
def __format_available_tools(self) -> str:
"""
Formats the available tools for inclusion in the prompt.
# Try new "planning" section first
try:
return self.agent.i18n.retrieve("planning", "create_plan_prompt").format(
description=self.description,
expected_output=self.expected_output,
tools=available_tools,
max_steps=self.config.max_steps,
)
except (KeyError, AttributeError):
# Fallback to reasoning section for backward compatibility
return self.agent.i18n.retrieve("reasoning", "create_plan_prompt").format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self._get_agent_backstory(),
description=self.description,
expected_output=self.expected_output,
tools=available_tools,
)
def _format_available_tools(self) -> str:
"""Formats the available tools for inclusion in the prompt.
Returns:
str: Comma-separated list of tool names.
Comma-separated list of tool names.
"""
try:
return ", ".join(
[sanitize_tool_name(tool.name) for tool in (self.task.tools or [])]
)
# Try task tools first, then agent tools
tools = []
if self.task:
tools = self.task.tools or []
if not tools:
tools = getattr(self.agent, "tools", []) or []
if not tools:
return "No tools available"
return ", ".join([sanitize_tool_name(tool.name) for tool in tools])
except (AttributeError, TypeError):
return "No tools available"
def __create_refine_prompt(self, current_plan: str) -> str:
"""
Creates a prompt for the agent to refine its reasoning plan.
def _create_refine_prompt(self, current_plan: str) -> str:
"""Creates a prompt for the agent to refine its plan.
Args:
current_plan: The current reasoning plan.
current_plan: The current plan.
Returns:
str: The refine prompt.
The refine prompt.
"""
return self.agent.i18n.retrieve("reasoning", "refine_plan_prompt").format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory(),
current_plan=current_plan,
)
# Use custom prompt if provided
if self.config.refine_prompt is not None:
return self.config.refine_prompt.format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self._get_agent_backstory(),
current_plan=current_plan,
max_steps=self.config.max_steps,
)
# Try new "planning" section first
try:
return self.agent.i18n.retrieve("planning", "refine_plan_prompt").format(
current_plan=current_plan,
)
except (KeyError, AttributeError):
# Fallback to reasoning section for backward compatibility
return self.agent.i18n.retrieve("reasoning", "refine_plan_prompt").format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self._get_agent_backstory(),
current_plan=current_plan,
)
@staticmethod
def __parse_reasoning_response(response: str) -> tuple[str, bool]:
"""
Parses the reasoning response to extract the plan and whether
the agent is ready to execute the task.
def _parse_planning_response(response: str) -> tuple[str, bool]:
"""Parses the planning response to extract the plan and readiness.
Args:
response: The LLM response.
@@ -380,25 +611,13 @@ class AgentReasoning:
return "No plan was generated.", False
plan = response
ready = False
if "READY: I am ready to execute the task." in response:
ready = True
ready = "READY: I am ready to execute the task." in response
return plan, ready
def _handle_agent_reasoning(self) -> AgentReasoningOutput:
"""
Deprecated method for backward compatibility.
Use handle_agent_reasoning() instead.
Returns:
AgentReasoningOutput: The output of the agent reasoning process.
"""
self.logger.warning(
"The _handle_agent_reasoning method is deprecated. Use handle_agent_reasoning instead."
)
return self.handle_agent_reasoning()
# Alias for backward compatibility
AgentPlanning = AgentReasoning
def _call_llm_with_reasoning_prompt(
@@ -409,7 +628,9 @@ def _call_llm_with_reasoning_prompt(
backstory: str,
plan_type: Literal["initial_plan", "refine_plan"],
) -> str:
"""Calls the LLM with the reasoning prompt.
"""Deprecated: Calls the LLM with the reasoning prompt.
This function is kept for backward compatibility.
Args:
llm: The language model to use.
@@ -417,7 +638,7 @@ def _call_llm_with_reasoning_prompt(
task: The task for which the agent is reasoning.
reasoning_agent: The agent performing the reasoning.
backstory: The agent's backstory.
plan_type: The type of plan being created ("initial_plan" or "refine_plan").
plan_type: The type of plan being created.
Returns:
The LLM response.

View File

@@ -0,0 +1,64 @@
"""Context and result types for isolated step execution in Plan-and-Execute architecture.
These types mediate between the AgentExecutor (orchestrator) and StepExecutor (per-step worker).
StepExecutionContext carries only final results from dependencies — never LLM message histories.
StepResult carries only the outcome of a step — never internal execution traces.
"""
from __future__ import annotations
from dataclasses import dataclass, field
@dataclass(frozen=True)
class StepExecutionContext:
"""Immutable context passed to a StepExecutor for a single todo.
Contains only the information the Executor needs to complete one step:
the task description, goal, and final results from dependency steps.
No LLM message history, no execution traces, no shared mutable state.
Attributes:
task_description: The original task description (from Task or kickoff input).
task_goal: The expected output / goal of the overall task.
dependency_results: Mapping of step_number → final result string
for all completed dependencies of the current step.
"""
task_description: str
task_goal: str
dependency_results: dict[int, str] = field(default_factory=dict)
def get_dependency_result(self, step_number: int) -> str | None:
"""Get the final result of a dependency step.
Args:
step_number: The step number to look up.
Returns:
The result string if available, None otherwise.
"""
return self.dependency_results.get(step_number)
@dataclass
class StepResult:
"""Result returned by a StepExecutor after executing a single todo.
Contains the final outcome and metadata for debugging/metrics.
Tool call details are for audit logging only — they are NOT passed
to subsequent steps or the Planner.
Attributes:
success: Whether the step completed successfully.
result: The final output string from the step.
error: Error message if the step failed (None on success).
tool_calls_made: List of tool names invoked (for debugging/logging only).
execution_time: Wall-clock time in seconds for the step execution.
"""
success: bool
result: str
error: str | None = None
tool_calls_made: list[str] = field(default_factory=list)
execution_time: float = 0.0

View File

@@ -1456,7 +1456,7 @@ def test_agent_execute_task_with_tool():
)
result = agent.execute_task(task)
assert "you should always think about what to do" in result
assert "test query" in result
@pytest.mark.vcr()
@@ -1475,9 +1475,9 @@ def test_agent_execute_task_with_custom_llm():
)
result = agent.execute_task(task)
assert "In circuits they thrive" in result
assert "Artificial minds awake" in result
assert "Future's coded drive" in result
assert "Artificial minds" in result
assert "Code and circuits" in result
assert "Future undefined" in result
@pytest.mark.vcr()

View File

@@ -4,16 +4,27 @@ Tests the Flow-based agent executor implementation including state management,
flow methods, routing logic, and error handling.
"""
import asyncio
import time
from unittest.mock import Mock, patch
from unittest.mock import AsyncMock, Mock, patch
import pytest
from crewai.agents.step_executor import StepExecutor
from crewai.agents.planner_observer import PlannerObserver
from crewai.experimental.agent_executor import (
AgentReActState,
AgentExecutor,
)
from crewai.agents.parser import AgentAction, AgentFinish
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.tool_usage_events import (
ToolUsageFinishedEvent,
ToolUsageStartedEvent,
)
from crewai.tools.tool_types import ToolResult
from crewai.utilities.step_execution_context import StepExecutionContext
from crewai.utilities.planning_types import TodoItem
class TestAgentReActState:
"""Test AgentReActState Pydantic model."""
@@ -26,6 +37,18 @@ class TestAgentReActState:
assert state.current_answer is None
assert state.is_finished is False
assert state.ask_for_human_input is False
# Planning state fields
assert state.plan is None
assert state.plan_ready is False
def test_state_with_plan(self):
"""Test AgentReActState initialization with planning fields."""
state = AgentReActState(
plan="Step 1: Do X\nStep 2: Do Y",
plan_ready=True,
)
assert state.plan == "Step 1: Do X\nStep 2: Do Y"
assert state.plan_ready is True
def test_state_with_values(self):
"""Test AgentReActState initialization with values."""
@@ -180,6 +203,88 @@ class TestAgentExecutor:
assert result == "skipped"
assert executor.state.is_finished is False
def test_finalize_skips_synthesis_for_strong_last_todo_result(
self, mock_dependencies
):
"""Finalize should skip synthesis when last todo is already a complete answer."""
with patch.object(AgentExecutor, "_show_logs") as mock_show_logs:
executor = AgentExecutor(**mock_dependencies)
executor.state.todos.items = [
TodoItem(
step_number=1,
description="Gather source details",
tool_to_use="search_tool",
status="completed",
result="Source A and Source B identified.",
),
TodoItem(
step_number=2,
description="Write final response",
tool_to_use=None,
status="completed",
result=(
"The final recommendation is to adopt a phased rollout plan with "
"weekly checkpoints, explicit ownership, and a rollback path for "
"each milestone. This approach keeps risk controlled while still "
"moving quickly, and it aligns delivery metrics with stakeholder "
"communication and operational readiness."
),
),
]
with patch.object(
executor, "_synthesize_final_answer_from_todos"
) as mock_synthesize:
result = executor.finalize()
assert result == "completed"
assert isinstance(executor.state.current_answer, AgentFinish)
assert (
executor.state.current_answer.output
== executor.state.todos.items[1].result
)
assert executor.state.is_finished is True
mock_synthesize.assert_not_called()
mock_show_logs.assert_called_once()
def test_finalize_keeps_synthesis_when_response_model_is_set(
self, mock_dependencies
):
"""Finalize should still synthesize when response_model is configured."""
with patch.object(AgentExecutor, "_show_logs"):
executor = AgentExecutor(**mock_dependencies)
executor.response_model = Mock()
executor.state.todos.items = [
TodoItem(
step_number=1,
description="Write final response",
tool_to_use=None,
status="completed",
result=(
"This is already detailed prose with multiple sentences. "
"It should still run synthesis because structured output "
"was requested via response_model."
),
)
]
def _set_current_answer() -> None:
executor.state.current_answer = AgentFinish(
thought="Synthesized",
output="structured-like-answer",
text="structured-like-answer",
)
with patch.object(
executor,
"_synthesize_final_answer_from_todos",
side_effect=_set_current_answer,
) as mock_synthesize:
result = executor.finalize()
assert result == "completed"
mock_synthesize.assert_called_once()
def test_format_prompt(self, mock_dependencies):
"""Test prompt formatting."""
executor = AgentExecutor(**mock_dependencies)
@@ -234,6 +339,113 @@ class TestAgentExecutor:
AgentFinish(thought="thinking", output="test", text="final")
)
@pytest.mark.asyncio
async def test_invoke_step_callback_async_inside_running_loop(
self, mock_dependencies
):
"""Test async step callback scheduling when already in an event loop."""
callback = AsyncMock()
mock_dependencies["step_callback"] = callback
executor = AgentExecutor(**mock_dependencies)
answer = AgentFinish(thought="thinking", output="test", text="final")
with patch("crewai.experimental.agent_executor.asyncio.run") as mock_run:
executor._invoke_step_callback(answer)
await asyncio.sleep(0)
callback.assert_awaited_once_with(answer)
mock_run.assert_not_called()
class TestStepExecutorCriticalFixes:
"""Regression tests for critical plan-and-execute issues."""
@pytest.fixture
def step_executor(self):
llm = Mock()
llm.supports_stop_words.return_value = True
agent = Mock()
agent.role = "Test Agent"
agent.goal = "Execute tasks"
agent.verbose = False
agent.key = "test-agent-key"
tool = Mock()
tool.name = "count_words"
task = Mock()
task.name = "test-task"
task.description = "test task description"
return StepExecutor(
llm=llm,
tools=[tool],
agent=agent,
original_tools=[],
tools_handler=Mock(),
task=task,
crew=Mock(),
function_calling_llm=None,
request_within_rpm_limit=None,
callbacks=[],
)
def test_step_executor_fails_when_expected_tool_is_not_called(self, step_executor):
"""Step should fail if a configured expected tool is not actually invoked."""
todo = TodoItem(
step_number=1,
description="Count words in input text.",
tool_to_use="count_words",
depends_on=[],
status="pending",
)
context = StepExecutionContext(task_description="task", task_goal="goal")
with patch.object(step_executor, "_build_isolated_messages", return_value=[]):
with patch.object(
step_executor, "_execute_text_parsed", return_value="No tool used."
):
result = step_executor.execute(todo, context)
assert result.success is False
assert result.error is not None
assert "Expected tool 'count_words' was not called" in result.error
def test_step_executor_text_tool_emits_usage_events(self, step_executor):
"""Text-parsed tool execution should emit started and finished events."""
started_events: list[ToolUsageStartedEvent] = []
finished_events: list[ToolUsageFinishedEvent] = []
tool_name = "count_words"
action = AgentAction(
thought="Need a tool",
tool=tool_name,
tool_input='{"text":"hello world"}',
text="Action: count_words",
)
@crewai_event_bus.on(ToolUsageStartedEvent)
def _on_started(_source, event):
if event.tool_name == tool_name:
started_events.append(event)
@crewai_event_bus.on(ToolUsageFinishedEvent)
def _on_finished(_source, event):
if event.tool_name == tool_name:
finished_events.append(event)
with patch(
"crewai.agents.step_executor.execute_tool_and_check_finality",
return_value=ToolResult(result="2", result_as_answer=False),
):
output = step_executor._execute_text_tool_with_events(action)
crewai_event_bus.flush()
assert output == "2"
assert len(started_events) >= 1
assert len(finished_events) >= 1
@patch("crewai.experimental.agent_executor.handle_output_parser_exception")
def test_recover_from_parser_error(
self, mock_handle_exception, mock_dependencies
@@ -636,3 +848,696 @@ class TestNativeToolExecution:
tool_messages = [m for m in executor.state.messages if m.get("role") == "tool"]
assert len(tool_messages) == 1
assert tool_messages[0]["tool_call_id"] == "call_1"
def test_check_native_todo_completion_requires_expected_tool_match(
self, mock_dependencies
):
from crewai.utilities.planning_types import TodoList
executor = AgentExecutor(**mock_dependencies)
running = TodoItem(
step_number=1,
description="Use the expected tool",
tool_to_use="expected_tool",
status="running",
)
executor.state.todos = TodoList(items=[running])
executor.state.last_native_tools_executed = ["other_tool"]
assert executor.check_native_todo_completion() == "todo_not_satisfied"
executor.state.last_native_tools_executed = ["expected_tool"]
assert executor.check_native_todo_completion() == "todo_satisfied"
running.tool_to_use = None
executor.state.last_native_tools_executed = ["any_tool"]
assert executor.check_native_todo_completion() == "todo_not_satisfied"
class TestPlannerObserver:
def test_observe_fallback_is_conservative_on_llm_error(self):
llm = Mock()
llm.call.side_effect = RuntimeError("llm unavailable")
agent = Mock()
agent.role = "Observer Test Agent"
agent.llm = llm
agent.planning_config = None
task = Mock()
task.description = "Test task"
task.expected_output = "Expected result"
observer = PlannerObserver(agent=agent, task=task)
completed_step = TodoItem(
step_number=1,
description="Do something",
status="running",
)
observation = observer.observe(
completed_step=completed_step,
result="Error: tool timeout",
all_completed=[],
remaining_todos=[],
)
assert observation.step_completed_successfully is False
assert observation.remaining_plan_still_valid is False
assert observation.needs_full_replan is True
assert observation.replan_reason == "Observer failed to evaluate step result safely"
class TestAgentExecutorPlanning:
"""Test planning functionality in AgentExecutor with real agent kickoff."""
@pytest.mark.vcr()
def test_agent_kickoff_with_planning_stores_plan_in_state(self):
"""Test that Agent.kickoff() with planning enabled stores plan in executor state."""
from crewai import Agent, PlanningConfig
from crewai.llm import LLM
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Assistant",
goal="Help solve simple math problems",
backstory="A helpful assistant that solves math problems step by step",
llm=llm,
planning_config=PlanningConfig(max_attempts=1),
verbose=False,
)
# Execute kickoff with a simple task
result = agent.kickoff("What is 2 + 2?")
# Verify result
assert result is not None
assert "4" in str(result)
@pytest.mark.vcr()
def test_agent_kickoff_without_planning_skips_plan_generation(self):
"""Test that Agent.kickoff() without planning skips planning phase."""
from crewai import Agent
from crewai.llm import LLM
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Assistant",
goal="Help solve simple math problems",
backstory="A helpful assistant",
llm=llm,
# No planning_config = no planning
verbose=False,
)
# Execute kickoff
result = agent.kickoff("What is 3 + 3?")
# Verify we get a result
assert result is not None
assert "6" in str(result)
@pytest.mark.vcr()
def test_planning_disabled_skips_planning(self):
"""Test that planning=False skips planning."""
from crewai import Agent
from crewai.llm import LLM
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Assistant",
goal="Help solve simple math problems",
backstory="A helpful assistant",
llm=llm,
planning=False, # Explicitly disable planning
verbose=False,
)
result = agent.kickoff("What is 5 + 5?")
# Should still complete successfully
assert result is not None
assert "10" in str(result)
def test_backward_compat_reasoning_true_enables_planning(self):
"""Test that reasoning=True (deprecated) still enables planning."""
import warnings
from crewai import Agent
from crewai.llm import LLM
llm = LLM("gpt-4o-mini")
with warnings.catch_warnings(record=True):
warnings.simplefilter("always")
agent = Agent(
role="Test Agent",
goal="Complete tasks",
backstory="A helpful agent",
llm=llm,
reasoning=True, # Deprecated but should still work
verbose=False,
)
# Should have planning_config created from reasoning=True
assert agent.planning_config is not None
assert agent.planning_enabled is True
@pytest.mark.vcr()
def test_executor_state_contains_plan_after_planning(self):
"""Test that executor state contains plan after planning phase."""
from crewai import Agent, PlanningConfig
from crewai.llm import LLM
from crewai.experimental.agent_executor import AgentExecutor
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Assistant",
goal="Help solve simple math problems",
backstory="A helpful assistant that solves math problems step by step",
llm=llm,
planning_config=PlanningConfig(max_attempts=1),
verbose=False,
)
# Track executor for inspection
executor_ref = [None]
original_invoke = AgentExecutor.invoke
def capture_executor(self, inputs):
executor_ref[0] = self
return original_invoke(self, inputs)
with patch.object(AgentExecutor, "invoke", capture_executor):
result = agent.kickoff("What is 7 + 7?")
# Verify result
assert result is not None
# If we captured an executor, check its state
if executor_ref[0] is not None:
# After planning, state should have plan info
assert hasattr(executor_ref[0].state, "plan")
assert hasattr(executor_ref[0].state, "plan_ready")
@pytest.mark.vcr()
def test_planning_creates_minimal_steps_for_multi_step_task(self):
"""Test that planning creates steps and executes them for a multi-step task.
This task requires multiple dependent steps:
1. Identify the first 3 prime numbers (2, 3, 5)
2. Sum them (2 + 3 + 5 = 10)
3. Multiply by 2 (10 * 2 = 20)
The plan-and-execute architecture should produce step results.
"""
from crewai import Agent, PlanningConfig
from crewai.llm import LLM
from crewai.experimental.agent_executor import AgentExecutor
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Tutor",
goal="Solve multi-step math problems accurately",
backstory="An expert math tutor who breaks down problems step by step",
llm=llm,
planning_config=PlanningConfig(max_attempts=1, max_steps=10),
verbose=False,
)
# Track the plan that gets generated
captured_plan = [None]
original_invoke = AgentExecutor.invoke
def capture_plan(self, inputs):
result = original_invoke(self, inputs)
captured_plan[0] = self.state.plan
return result
with patch.object(AgentExecutor, "invoke", capture_plan):
result = agent.kickoff(
"Calculate the sum of the first 3 prime numbers, then multiply that result by 2. "
"Show your work for each step."
)
# Verify we got a result with step outputs
assert result is not None
result_str = str(result)
# Should contain at least some mathematical content from the steps
assert "prime" in result_str.lower() or "2" in result_str or "10" in result_str
# Verify a plan was generated
assert captured_plan[0] is not None
@pytest.mark.vcr()
def test_planning_handles_sequential_dependency_task(self):
"""Test planning for a task where step N depends on step N-1.
Task: Convert 100 Celsius to Fahrenheit, then round to nearest 10.
Step 1: Apply formula (C * 9/5 + 32) = 212
Step 2: Round 212 to nearest 10 = 210
This tests that the planner creates a plan and executes steps.
"""
from crewai import Agent, PlanningConfig
from crewai.llm import LLM
from crewai.experimental.agent_executor import AgentExecutor
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Unit Converter",
goal="Accurately convert between units and apply transformations",
backstory="A precise unit conversion specialist",
llm=llm,
planning_config=PlanningConfig(max_attempts=1, max_steps=10),
verbose=False,
)
captured_plan = [None]
original_invoke = AgentExecutor.invoke
def capture_plan(self, inputs):
result = original_invoke(self, inputs)
captured_plan[0] = self.state.plan
return result
with patch.object(AgentExecutor, "invoke", capture_plan):
result = agent.kickoff(
"Convert 100 degrees Celsius to Fahrenheit, then round the result to the nearest 10."
)
assert result is not None
result_str = str(result)
# Should contain conversion-related content
assert "212" in result_str or "210" in result_str or "Fahrenheit" in result_str or "celsius" in result_str.lower()
# Plan should exist
assert captured_plan[0] is not None
class TestResponseFormatWithKickoff:
"""Test that Agent.kickoff(response_format=MyModel) returns structured output.
Real LLM calls via VCR cassettes. Tests both with and without planning,
using real tools for the planning case to exercise the full Plan-and-Execute
path including synthesis with response_model.
"""
@pytest.mark.vcr()
def test_kickoff_response_format_without_planning(self):
"""Test that kickoff(response_format) returns structured output without planning."""
from pydantic import BaseModel, Field
from crewai import Agent
from crewai.llm import LLM
class MathResult(BaseModel):
answer: int = Field(description="The numeric answer")
explanation: str = Field(description="Brief explanation of the solution")
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Assistant",
goal="Solve math problems and return structured results",
backstory="A precise math assistant that always returns structured data",
llm=llm,
verbose=False,
)
result = agent.kickoff("What is 15 + 27?", response_format=MathResult)
assert result is not None
assert result.pydantic is not None
assert isinstance(result.pydantic, MathResult)
assert result.pydantic.answer == 42
assert len(result.pydantic.explanation) > 0
@pytest.mark.vcr()
def test_kickoff_response_format_with_planning_and_tools(self):
"""Test response_format with planning + tools (multi-step research).
This is the key test for _synthesize_final_answer_from_todos:
1. Planning generates steps that use the EXA search tool
2. StepExecutor runs each step in isolation with tool calls
3. The synthesis step produces a structured BaseModel output
The response_format should be respected by the synthesis LLM call,
NOT by intermediate step executions.
"""
from pydantic import BaseModel, Field
from crewai import Agent, PlanningConfig
from crewai.llm import LLM
from crewai_tools import EXASearchTool
class ResearchSummary(BaseModel):
topic: str = Field(description="The research topic")
key_findings: list[str] = Field(description="List of 3-5 key findings")
conclusion: str = Field(description="A brief conclusion paragraph")
llm = LLM("gpt-4o-mini")
exa = EXASearchTool()
agent = Agent(
role="Research Analyst",
goal="Research topics using search tools and produce structured summaries",
backstory=(
"You are a research analyst who searches the web for information, "
"identifies key findings, and produces structured research summaries."
),
llm=llm,
planning_config=PlanningConfig(max_attempts=1, max_steps=5),
tools=[exa],
verbose=False,
)
result = agent.kickoff(
"Research the current state of autonomous AI agents in 2025. "
"Search for recent developments, then summarize the key findings.",
response_format=ResearchSummary,
)
assert result is not None
# The synthesis step should have produced structured output
assert result.pydantic is not None
assert isinstance(result.pydantic, ResearchSummary)
# Verify the structured fields are populated
assert len(result.pydantic.topic) > 0
assert len(result.pydantic.key_findings) >= 1
assert len(result.pydantic.conclusion) > 0
@pytest.mark.vcr()
def test_kickoff_no_response_format_returns_raw_text(self):
"""Test that kickoff without response_format returns plain text."""
from crewai import Agent
from crewai.llm import LLM
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Assistant",
goal="Solve math problems",
backstory="A helpful math assistant",
llm=llm,
verbose=False,
)
result = agent.kickoff("What is 10 + 10?")
assert result is not None
assert result.pydantic is None
assert "20" in str(result)
class TestReasoningEffort:
"""Test reasoning_effort levels in PlanningConfig.
- low: observe() runs (validates step success), but skip decide/replan/refine
- medium: observe() runs, replan on failure only (mocked)
- high: full observation pipeline with decide/replan/refine/goal-achieved
"""
@pytest.mark.vcr()
def test_reasoning_effort_low_skips_decide_and_replan(self):
"""Low effort: observe runs but decide/replan/refine are never called.
Verifies that with reasoning_effort='low':
1. The agent produces a correct result
2. The observation phase still runs (observations are stored)
3. The decide_next_action/refine/replan pipeline is bypassed
"""
from crewai import Agent, PlanningConfig
from crewai.llm import LLM
from crewai.experimental.agent_executor import AgentExecutor
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Tutor",
goal="Solve multi-step math problems accurately",
backstory="An expert math tutor who breaks down problems step by step",
llm=llm,
planning_config=PlanningConfig(
reasoning_effort="low",
max_attempts=1,
max_steps=10,
),
verbose=False,
)
# Capture the executor to inspect state after execution
executor_ref = [None]
original_invoke = AgentExecutor.invoke
def capture_executor(self, inputs):
result = original_invoke(self, inputs)
executor_ref[0] = self
return result
with patch.object(AgentExecutor, "invoke", capture_executor):
result = agent.kickoff(
"What is the sum of the first 3 prime numbers (2, 3, 5)?"
)
assert result is not None
assert "10" in str(result)
# Verify observations were still collected (observe() ran)
executor = executor_ref[0]
if executor is not None and executor.state.todos.items:
assert len(executor.state.observations) > 0, (
"Low effort should still run observe() to validate steps"
)
# Verify no replan was triggered
assert executor.state.replan_count == 0, (
"Low effort should never trigger replanning"
)
# Check execution log for reasoning_effort annotation
observation_logs = [
log for log in executor.state.execution_log
if log.get("type") == "observation"
]
for log in observation_logs:
assert log.get("reasoning_effort") == "low"
@pytest.mark.vcr()
def test_reasoning_effort_high_runs_full_observation_pipeline(self):
"""High effort: full observation pipeline with decide/replan/refine.
Verifies that with reasoning_effort='high':
1. The agent produces a correct result
2. Observations are stored
3. The full decide_next_action pipeline runs (the observation-driven
routing is exercised, even if it just routes to continue_plan)
"""
from crewai import Agent, PlanningConfig
from crewai.llm import LLM
from crewai.experimental.agent_executor import AgentExecutor
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Tutor",
goal="Solve multi-step math problems accurately",
backstory="An expert math tutor who breaks down problems step by step",
llm=llm,
planning_config=PlanningConfig(
reasoning_effort="high",
max_attempts=1,
max_steps=10,
),
verbose=False,
)
executor_ref = [None]
original_invoke = AgentExecutor.invoke
def capture_executor(self, inputs):
result = original_invoke(self, inputs)
executor_ref[0] = self
return result
with patch.object(AgentExecutor, "invoke", capture_executor):
result = agent.kickoff(
"What is the sum of the first 3 prime numbers (2, 3, 5)?"
)
assert result is not None
assert "10" in str(result)
# Verify observations were collected
executor = executor_ref[0]
if executor is not None and executor.state.todos.items:
assert len(executor.state.observations) > 0, (
"High effort should run observe() on every step"
)
# Check execution log shows high reasoning_effort
observation_logs = [
log for log in executor.state.execution_log
if log.get("type") == "observation"
]
for log in observation_logs:
assert log.get("reasoning_effort") == "high"
def test_reasoning_effort_medium_replans_on_failure(self):
"""Medium effort: replan triggered when observation reports failure.
This test mocks the PlannerObserver to simulate a failed step,
verifying that medium effort routes to replan_now on failure
but continues on success.
"""
from crewai.experimental.agent_executor import AgentExecutor
from crewai.utilities.planning_types import (
StepObservation,
TodoItem,
TodoList,
)
# --- Build a minimal mock executor with medium effort ---
executor = Mock(spec=AgentExecutor)
executor.agent = Mock()
executor.agent.verbose = False
executor.agent.planning_config = Mock()
executor.agent.planning_config.reasoning_effort = "medium"
# Provide the real method under test (bound to our mock)
executor.handle_step_observed_medium = (
AgentExecutor.handle_step_observed_medium.__get__(executor)
)
executor._printer = Mock()
# --- Case 1: step succeeded → should return "continue_plan" ---
success_todo = TodoItem(
step_number=1,
description="Calculate something",
status="running",
result="42",
)
success_observation = StepObservation(
step_completed_successfully=True,
key_information_learned="Got the answer",
remaining_plan_still_valid=True,
)
# Set up state
todo_list = TodoList(items=[success_todo])
executor.state = Mock()
executor.state.todos = todo_list
executor.state.observations = {1: success_observation}
route = executor.handle_step_observed_medium()
assert route == "continue_plan", (
"Medium effort should continue on successful step"
)
assert success_todo.status == "completed"
# --- Case 2: step failed → should return "replan_now" ---
failed_todo = TodoItem(
step_number=2,
description="Divide by zero",
status="running",
result="Error: division by zero",
)
failed_observation = StepObservation(
step_completed_successfully=False,
key_information_learned="Division failed",
remaining_plan_still_valid=False,
needs_full_replan=True,
replan_reason="Step failed with error",
)
todo_list_2 = TodoList(items=[failed_todo])
executor.state.todos = todo_list_2
executor.state.observations = {2: failed_observation}
executor.state.last_replan_reason = None
route = executor.handle_step_observed_medium()
assert route == "replan_now", (
"Medium effort should trigger replan on failed step"
)
assert executor.state.last_replan_reason == "Step did not complete successfully"
def test_reasoning_effort_low_marks_complete_without_deciding(self):
"""Low effort: mark_completed is called, decide_next_action is not.
Unit test verifying the low handler's behavior directly.
"""
from crewai.experimental.agent_executor import AgentExecutor
from crewai.utilities.planning_types import TodoItem, TodoList
executor = Mock(spec=AgentExecutor)
executor.agent = Mock()
executor.agent.verbose = False
executor.agent.planning_config = Mock()
executor.agent.planning_config.reasoning_effort = "low"
# Bind the real method
executor.handle_step_observed_low = (
AgentExecutor.handle_step_observed_low.__get__(executor)
)
executor._printer = Mock()
todo = TodoItem(
step_number=1,
description="Do something",
status="running",
result="Done successfully",
)
todo_list = TodoList(items=[todo])
executor.state = Mock()
executor.state.todos = todo_list
route = executor.handle_step_observed_low()
assert route == "continue_plan"
assert todo.status == "completed"
assert todo.result == "Done successfully"
def test_planning_config_reasoning_effort_default_is_low(self):
"""Verify PlanningConfig defaults reasoning_effort to 'low'."""
from crewai.agent.planning_config import PlanningConfig
config = PlanningConfig()
assert config.reasoning_effort == "low"
def test_planning_config_reasoning_effort_validation(self):
"""Verify PlanningConfig rejects invalid reasoning_effort values."""
from pydantic import ValidationError
from crewai.agent.planning_config import PlanningConfig
with pytest.raises(ValidationError):
PlanningConfig(reasoning_effort="ultra")
# Valid values should work
for level in ("low", "medium", "high"):
config = PlanningConfig(reasoning_effort=level)
assert config.reasoning_effort == level
def test_get_reasoning_effort_reads_from_config(self):
"""Verify _get_reasoning_effort reads from agent.planning_config."""
from crewai.experimental.agent_executor import AgentExecutor
executor = Mock(spec=AgentExecutor)
executor._get_reasoning_effort = (
AgentExecutor._get_reasoning_effort.__get__(executor)
)
# Case 1: planning_config with reasoning_effort set
executor.agent = Mock()
executor.agent.planning_config = Mock()
executor.agent.planning_config.reasoning_effort = "high"
assert executor._get_reasoning_effort() == "high"
# Case 2: no planning_config → defaults to "low"
executor.agent.planning_config = None
assert executor._get_reasoning_effort() == "low"
# Case 3: planning_config without reasoning_effort attr → defaults to "low"
executor.agent.planning_config = Mock(spec=[])
assert executor._get_reasoning_effort() == "low"

View File

@@ -1,240 +1,345 @@
"""Tests for reasoning in agents."""
"""Tests for planning/reasoning in agents."""
import json
import warnings
import pytest
from crewai import Agent, Task
from crewai import Agent, PlanningConfig, Task
from crewai.llm import LLM
@pytest.fixture
def mock_llm_responses():
"""Fixture for mock LLM responses."""
return {
"ready": "I'll solve this simple math problem.\n\nREADY: I am ready to execute the task.\n\n",
"not_ready": "I need to think about derivatives.\n\nNOT READY: I need to refine my plan because I'm not sure about the derivative rules.",
"ready_after_refine": "I'll use the power rule for derivatives where d/dx(x^n) = n*x^(n-1).\n\nREADY: I am ready to execute the task.",
"execution": "4",
}
# =============================================================================
# Tests for PlanningConfig configuration (no LLM calls needed)
# =============================================================================
def test_agent_with_reasoning(mock_llm_responses):
"""Test agent with reasoning."""
llm = LLM("gpt-3.5-turbo")
def test_planning_config_default_values():
"""Test PlanningConfig default values."""
config = PlanningConfig()
assert config.max_attempts is None
assert config.max_steps == 20
assert config.system_prompt is None
assert config.plan_prompt is None
assert config.refine_prompt is None
assert config.llm is None
def test_planning_config_custom_values():
"""Test PlanningConfig with custom values."""
config = PlanningConfig(
max_attempts=5,
max_steps=15,
system_prompt="Custom system",
plan_prompt="Custom plan: {description}",
refine_prompt="Custom refine: {current_plan}",
llm="gpt-4",
)
assert config.max_attempts == 5
assert config.max_steps == 15
assert config.system_prompt == "Custom system"
assert config.plan_prompt == "Custom plan: {description}"
assert config.refine_prompt == "Custom refine: {current_plan}"
assert config.llm == "gpt-4"
def test_agent_with_planning_config_custom_prompts():
"""Test agent with PlanningConfig using custom prompts."""
llm = LLM("gpt-4o-mini")
custom_system_prompt = "You are a specialized planner."
custom_plan_prompt = "Plan this task: {description}"
agent = Agent(
role="Test Agent",
goal="To test custom prompts",
backstory="I am a test agent.",
llm=llm,
planning_config=PlanningConfig(
system_prompt=custom_system_prompt,
plan_prompt=custom_plan_prompt,
max_steps=10,
),
verbose=False,
)
# Just test that the agent is created properly
assert agent.planning_config is not None
assert agent.planning_config.system_prompt == custom_system_prompt
assert agent.planning_config.plan_prompt == custom_plan_prompt
assert agent.planning_config.max_steps == 10
def test_agent_with_planning_config_disabled():
"""Test agent with PlanningConfig disabled."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Test Agent",
goal="To test disabled planning",
backstory="I am a test agent.",
llm=llm,
planning=False,
verbose=False,
)
# Planning should be disabled
assert agent.planning_enabled is False
def test_planning_enabled_property():
"""Test the planning_enabled property on Agent."""
llm = LLM("gpt-4o-mini")
# With planning_config enabled
agent_with_planning = Agent(
role="Test Agent",
goal="Test",
backstory="Test",
llm=llm,
planning=True,
)
assert agent_with_planning.planning_enabled is True
# With planning_config disabled
agent_disabled = Agent(
role="Test Agent",
goal="Test",
backstory="Test",
llm=llm,
planning=False,
)
assert agent_disabled.planning_enabled is False
# Without planning_config
agent_no_planning = Agent(
role="Test Agent",
goal="Test",
backstory="Test",
llm=llm,
)
assert agent_no_planning.planning_enabled is False
# =============================================================================
# Tests for backward compatibility with reasoning=True (no LLM calls)
# =============================================================================
def test_agent_with_reasoning_backward_compat():
"""Test agent with reasoning=True (backward compatibility)."""
llm = LLM("gpt-4o-mini")
# This should emit a deprecation warning
with warnings.catch_warnings(record=True):
warnings.simplefilter("always")
agent = Agent(
role="Test Agent",
goal="To test the reasoning feature",
backstory="I am a test agent created to verify the reasoning feature works correctly.",
llm=llm,
reasoning=True,
verbose=False,
)
# Should have created a PlanningConfig internally
assert agent.planning_config is not None
assert agent.planning_enabled is True
def test_agent_with_reasoning_and_max_attempts_backward_compat():
"""Test agent with reasoning=True and max_reasoning_attempts (backward compatibility)."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Test Agent",
goal="To test the reasoning feature",
backstory="I am a test agent created to verify the reasoning feature works correctly.",
backstory="I am a test agent.",
llm=llm,
reasoning=True,
verbose=True,
max_reasoning_attempts=5,
verbose=False,
)
task = Task(
description="Simple math task: What's 2+2?",
expected_output="The answer should be a number.",
agent=agent,
)
agent.llm.call = lambda messages, *args, **kwargs: (
mock_llm_responses["ready"]
if any("create a detailed plan" in msg.get("content", "") for msg in messages)
else mock_llm_responses["execution"]
)
result = agent.execute_task(task)
assert result == mock_llm_responses["execution"]
assert "Reasoning Plan:" in task.description
# Should have created a PlanningConfig with max_attempts
assert agent.planning_config is not None
assert agent.planning_config.max_attempts == 5
def test_agent_with_reasoning_not_ready_initially(mock_llm_responses):
"""Test agent with reasoning that requires refinement."""
llm = LLM("gpt-3.5-turbo")
# =============================================================================
# Tests for Agent.kickoff() with planning (uses AgentExecutor)
# =============================================================================
@pytest.mark.vcr()
def test_agent_kickoff_with_planning():
"""Test Agent.kickoff() with planning enabled generates a plan."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Test Agent",
goal="To test the reasoning feature",
backstory="I am a test agent created to verify the reasoning feature works correctly.",
role="Math Assistant",
goal="Help solve math problems step by step",
backstory="A helpful math tutor",
llm=llm,
reasoning=True,
max_reasoning_attempts=2,
verbose=True,
planning_config=PlanningConfig(max_attempts=1),
verbose=False,
)
task = Task(
description="Complex math task: What's the derivative of x²?",
expected_output="The answer should be a mathematical expression.",
agent=agent,
)
result = agent.kickoff("What is 15 + 27?")
call_count = [0]
def mock_llm_call(messages, *args, **kwargs):
if any(
"create a detailed plan" in msg.get("content", "") for msg in messages
) or any("refine your plan" in msg.get("content", "") for msg in messages):
call_count[0] += 1
if call_count[0] == 1:
return mock_llm_responses["not_ready"]
return mock_llm_responses["ready_after_refine"]
return "2x"
agent.llm.call = mock_llm_call
result = agent.execute_task(task)
assert result == "2x"
assert call_count[0] == 2 # Should have made 2 reasoning calls
assert "Reasoning Plan:" in task.description
assert result is not None
assert "42" in str(result)
def test_agent_with_reasoning_max_attempts_reached():
"""Test agent with reasoning that reaches max attempts without being ready."""
llm = LLM("gpt-3.5-turbo")
@pytest.mark.vcr()
def test_agent_kickoff_without_planning():
"""Test Agent.kickoff() without planning skips plan generation."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Test Agent",
goal="To test the reasoning feature",
backstory="I am a test agent created to verify the reasoning feature works correctly.",
role="Math Assistant",
goal="Help solve math problems",
backstory="A helpful assistant",
llm=llm,
reasoning=True,
max_reasoning_attempts=2,
verbose=True,
# No planning_config = no planning
verbose=False,
)
task = Task(
description="Complex math task: Solve the Riemann hypothesis.",
expected_output="A proof or disproof of the hypothesis.",
agent=agent,
)
result = agent.kickoff("What is 8 * 7?")
call_count = [0]
def mock_llm_call(messages, *args, **kwargs):
if any(
"create a detailed plan" in msg.get("content", "") for msg in messages
) or any("refine your plan" in msg.get("content", "") for msg in messages):
call_count[0] += 1
return f"Attempt {call_count[0]}: I need more time to think.\n\nNOT READY: I need to refine my plan further."
return "This is an unsolved problem in mathematics."
agent.llm.call = mock_llm_call
result = agent.execute_task(task)
assert result == "This is an unsolved problem in mathematics."
assert (
call_count[0] == 2
) # Should have made exactly 2 reasoning calls (max_attempts)
assert "Reasoning Plan:" in task.description
assert result is not None
assert "56" in str(result)
def test_agent_reasoning_error_handling():
"""Test error handling during the reasoning process."""
llm = LLM("gpt-3.5-turbo")
@pytest.mark.vcr()
def test_agent_kickoff_with_planning_disabled():
"""Test Agent.kickoff() with planning explicitly disabled via planning=False."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Test Agent",
goal="To test the reasoning feature",
backstory="I am a test agent created to verify the reasoning feature works correctly.",
role="Math Assistant",
goal="Help solve math problems",
backstory="A helpful assistant",
llm=llm,
reasoning=True,
planning=False, # Explicitly disable planning
verbose=False,
)
task = Task(
description="Task that will cause an error",
expected_output="Output that will never be generated",
agent=agent,
)
result = agent.kickoff("What is 100 / 4?")
call_count = [0]
def mock_llm_call_error(*args, **kwargs):
call_count[0] += 1
if call_count[0] <= 2: # First calls are for reasoning
raise Exception("LLM error during reasoning")
return "Fallback execution result" # Return a value for task execution
agent.llm.call = mock_llm_call_error
result = agent.execute_task(task)
assert result == "Fallback execution result"
assert call_count[0] > 2 # Ensure we called the mock multiple times
assert result is not None
assert "25" in str(result)
@pytest.mark.skip(reason="Test requires updates for native tool calling changes")
def test_agent_with_function_calling():
"""Test agent with reasoning using function calling."""
llm = LLM("gpt-3.5-turbo")
@pytest.mark.vcr()
def test_agent_kickoff_multi_step_task_with_planning():
"""Test Agent.kickoff() with a multi-step task that benefits from planning."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Test Agent",
goal="To test the reasoning feature",
backstory="I am a test agent created to verify the reasoning feature works correctly.",
role="Math Tutor",
goal="Solve multi-step math problems",
backstory="An expert tutor who explains step by step",
llm=llm,
reasoning=True,
verbose=True,
planning_config=PlanningConfig(max_attempts=1, max_steps=5),
verbose=False,
)
task = Task(
description="Simple math task: What's 2+2?",
expected_output="The answer should be a number.",
agent=agent,
# Task requires: find primes, sum them, then double
result = agent.kickoff(
"Find the first 3 prime numbers, add them together, then multiply by 2."
)
agent.llm.supports_function_calling = lambda: True
def mock_function_call(messages, *args, **kwargs):
if "tools" in kwargs:
return json.dumps(
{"plan": "I'll solve this simple math problem: 2+2=4.", "ready": True}
)
return "4"
agent.llm.call = mock_function_call
result = agent.execute_task(task)
assert result == "4"
assert "Reasoning Plan:" in task.description
assert "I'll solve this simple math problem: 2+2=4." in task.description
assert result is not None
# First 3 primes: 2, 3, 5 -> sum = 10 -> doubled = 20
assert "20" in str(result)
@pytest.mark.skip(reason="Test requires updates for native tool calling changes")
def test_agent_with_function_calling_fallback():
"""Test agent with reasoning using function calling that falls back to text parsing."""
llm = LLM("gpt-3.5-turbo")
# =============================================================================
# Tests for Agent.execute_task() with planning (uses CrewAgentExecutor)
# These test the legacy path via handle_reasoning()
# =============================================================================
@pytest.mark.vcr()
def test_agent_execute_task_with_planning():
"""Test Agent.execute_task() with planning via CrewAgentExecutor."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Test Agent",
goal="To test the reasoning feature",
backstory="I am a test agent created to verify the reasoning feature works correctly.",
role="Math Assistant",
goal="Help solve math problems",
backstory="A helpful math tutor",
llm=llm,
reasoning=True,
verbose=True,
planning_config=PlanningConfig(max_attempts=1),
verbose=False,
)
task = Task(
description="Simple math task: What's 2+2?",
expected_output="The answer should be a number.",
description="What is 9 + 11?",
expected_output="A number",
agent=agent,
)
agent.llm.supports_function_calling = lambda: True
result = agent.execute_task(task)
def mock_function_call(messages, *args, **kwargs):
if "tools" in kwargs:
return "Invalid JSON that will trigger fallback. READY: I am ready to execute the task."
return "4"
assert result is not None
assert "20" in str(result)
# Planning should be appended to task description
assert "Planning:" in task.description
agent.llm.call = mock_function_call
@pytest.mark.vcr()
def test_agent_execute_task_without_planning():
"""Test Agent.execute_task() without planning."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Assistant",
goal="Help solve math problems",
backstory="A helpful assistant",
llm=llm,
verbose=False,
)
task = Task(
description="What is 12 * 3?",
expected_output="A number",
agent=agent,
)
result = agent.execute_task(task)
assert result == "4"
assert "Reasoning Plan:" in task.description
assert "Invalid JSON that will trigger fallback" in task.description
assert result is not None
assert "36" in str(result)
# No planning should be added
assert "Planning:" not in task.description
@pytest.mark.vcr()
def test_agent_execute_task_with_planning_refine():
"""Test Agent.execute_task() with planning that requires refinement."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Tutor",
goal="Solve complex math problems step by step",
backstory="An expert tutor",
llm=llm,
planning_config=PlanningConfig(max_attempts=2),
verbose=False,
)
task = Task(
description="Calculate the area of a circle with radius 5 (use pi = 3.14)",
expected_output="The area as a number",
agent=agent,
)
result = agent.execute_task(task)
assert result is not None
# Area = pi * r^2 = 3.14 * 25 = 78.5
assert "78" in str(result) or "79" in str(result)
assert "Planning:" in task.description

View File

@@ -359,17 +359,34 @@ def test_sets_flow_context_when_inside_flow():
@pytest.mark.vcr()
def test_guardrail_is_called_using_string():
"""Test that a string guardrail triggers events and retries correctly.
Uses a callable guardrail that deterministically fails on the first
attempt and passes on the second. This tests the guardrail event
machinery (started/completed events, retry loop) without depending
on the LLM to comply with contradictory constraints.
"""
guardrail_events: dict[str, list] = defaultdict(list)
from crewai.events.event_types import (
LLMGuardrailCompletedEvent,
LLMGuardrailStartedEvent,
)
# Deterministic guardrail: fail first call, pass second
call_count = {"n": 0}
def fail_then_pass_guardrail(output):
call_count["n"] += 1
if call_count["n"] == 1:
return (False, "Missing required format — please use a numbered list")
return (True, output)
agent = Agent(
role="Sports Analyst",
goal="Gather information about the best soccer players",
backstory="""You are an expert at gathering and organizing information. You carefully collect details and present them in a structured way.""",
guardrail="""Only include Brazilian players, both women and men""",
goal="List the best soccer players",
backstory="You are an expert at gathering and organizing information.",
guardrail=fail_then_pass_guardrail,
guardrail_max_retries=3,
)
condition = threading.Condition()
@@ -388,7 +405,7 @@ def test_guardrail_is_called_using_string():
guardrail_events["completed"].append(event)
condition.notify()
result = agent.kickoff(messages="Top 10 best players in the world?")
result = agent.kickoff(messages="Top 5 best soccer players in the world?")
with condition:
success = condition.wait_for(

View File

@@ -1,15 +1,9 @@
interactions:
- request:
body: '{"max_tokens":4096,"messages":[{"role":"user","content":[{"type":"text","text":"\nCurrent
Task: What type of document is this?\n\nBegin! This is VERY important to you,
use the tools available and give your best Final Answer, your job depends on
it!\n\nThought:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
Task: What type of document is this?\n\nProvide your complete response:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately\nTo give my best complete final answer
to the task respond using the exact following format:\n\nThought: I now can
give a great answer\nFinal Answer: Your final answer must be the great and the
most complete as possible, it must be outcome described.\n\nI MUST use these
formats, my job depends on it!"}'
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -22,7 +16,7 @@ interactions:
connection:
- keep-alive
content-length:
- '1351'
- '950'
content-type:
- application/json
host:
@@ -38,35 +32,35 @@ interactions:
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 0.71.1
- 0.73.0
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
x-stainless-timeout:
- NOT_GIVEN
method: POST
uri: https://api.anthropic.com/v1/messages
response:
body:
string: '{"model":"claude-3-5-haiku-20241022","id":"msg_01AcygCF93tRhc7A3bfXMqe7","type":"message","role":"assistant","content":[{"type":"text","text":"Thought:
I can see this is a PDF document, but the image appears to be completely white
or blank. Without any visible content, I cannot definitively determine the
specific type of document.\n\nFinal Answer: The document is a PDF file, but
the provided image shows a blank white page with no discernible content or
text. More information or a clearer image would be needed to identify the
precise type of document."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1750,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":89,"service_tier":"standard"}}'
string: '{"model":"claude-3-5-haiku-20241022","id":"msg_01C8ZkZMunUVDUDd8mh1r1We","type":"message","role":"assistant","content":[{"type":"text","text":"I
apologize, but the image appears to be completely blank or white. Without
any visible text, graphics, or distinguishing features, I cannot determine
the type of document. The file is a PDF, but the content page seems to be
empty or failed to render properly."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1658,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":58,"service_tier":"standard","inference_geo":"not_available"}}'
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Security-Policy:
- CSP-FILTERED
Content-Type:
- application/json
Date:
- Fri, 23 Jan 2026 19:08:04 GMT
- Thu, 12 Feb 2026 19:30:55 GMT
Server:
- cloudflare
Transfer-Encoding:
@@ -92,7 +86,7 @@ interactions:
anthropic-ratelimit-requests-remaining:
- '3999'
anthropic-ratelimit-requests-reset:
- '2026-01-23T19:08:01Z'
- '2026-02-12T19:30:53Z'
anthropic-ratelimit-tokens-limit:
- ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
anthropic-ratelimit-tokens-remaining:
@@ -106,7 +100,112 @@ interactions:
strict-transport-security:
- STS-XXX
x-envoy-upstream-service-time:
- '2837'
- '2129'
status:
code: 200
message: OK
- request:
body: '{"max_tokens":4096,"messages":[{"role":"user","content":[{"type":"text","text":"\nCurrent
Task: What type of document is this?\n\nProvide your complete response:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
anthropic-version:
- '2023-06-01'
connection:
- keep-alive
content-length:
- '950'
content-type:
- application/json
host:
- api.anthropic.com
x-api-key:
- X-API-KEY-XXX
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 0.73.0
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
x-stainless-timeout:
- NOT_GIVEN
method: POST
uri: https://api.anthropic.com/v1/messages
response:
body:
string: '{"model":"claude-3-5-haiku-20241022","id":"msg_013jb7edagayZxqGs6ioACyU","type":"message","role":"assistant","content":[{"type":"text","text":"I
apologize, but the image appears to be completely blank or white. There are
no visible contents or text that I can analyze to determine the type of document.
Without any discernible information, I cannot definitively state what type
of document this is."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1658,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":55,"service_tier":"standard","inference_geo":"not_available"}}'
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Security-Policy:
- CSP-FILTERED
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:30:58 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Robots-Tag:
- none
anthropic-organization-id:
- ANTHROPIC-ORGANIZATION-ID-XXX
anthropic-ratelimit-input-tokens-limit:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-LIMIT-XXX
anthropic-ratelimit-input-tokens-remaining:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-REMAINING-XXX
anthropic-ratelimit-input-tokens-reset:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-RESET-XXX
anthropic-ratelimit-output-tokens-limit:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-LIMIT-XXX
anthropic-ratelimit-output-tokens-remaining:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-REMAINING-XXX
anthropic-ratelimit-output-tokens-reset:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-RESET-XXX
anthropic-ratelimit-requests-limit:
- '4000'
anthropic-ratelimit-requests-remaining:
- '3999'
anthropic-ratelimit-requests-reset:
- '2026-02-12T19:30:56Z'
anthropic-ratelimit-tokens-limit:
- ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
anthropic-ratelimit-tokens-remaining:
- ANTHROPIC-RATELIMIT-TOKENS-REMAINING-XXX
anthropic-ratelimit-tokens-reset:
- ANTHROPIC-RATELIMIT-TOKENS-RESET-XXX
cf-cache-status:
- DYNAMIC
request-id:
- REQUEST-ID-XXX
strict-transport-security:
- STS-XXX
x-envoy-upstream-service-time:
- '2005'
status:
code: 200
message: OK

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -1,14 +1,9 @@
interactions:
- request:
body: '{"max_tokens":4096,"messages":[{"role":"user","content":[{"type":"text","text":"\nCurrent
Task: What is this document?\n\nBegin! This is VERY important to you, use the
tools available and give your best Final Answer, your job depends on it!\n\nThought:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
Task: What is this document?\n\nProvide your complete response:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately\nTo give my best complete final answer
to the task respond using the exact following format:\n\nThought: I now can
give a great answer\nFinal Answer: Your final answer must be the great and the
most complete as possible, it must be outcome described.\n\nI MUST use these
formats, my job depends on it!"}'
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -21,7 +16,7 @@ interactions:
connection:
- keep-alive
content-length:
- '1343'
- '942'
content-type:
- application/json
host:
@@ -37,34 +32,35 @@ interactions:
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 0.71.1
- 0.73.0
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
x-stainless-timeout:
- NOT_GIVEN
method: POST
uri: https://api.anthropic.com/v1/messages
response:
body:
string: '{"model":"claude-3-5-haiku-20241022","id":"msg_01XwAhfdaMxwTNzTy7YhmA5e","type":"message","role":"assistant","content":[{"type":"text","text":"Thought:
I can see this is a PDF document, but the image appears to be blank or completely
white. Without any visible text or content, I cannot determine the specific
type or purpose of this document.\n\nFinal Answer: The document appears to
be a blank white PDF page with no discernible text, images, or content visible.
It could be an empty document, a scanning error, or a placeholder file."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1748,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":88,"service_tier":"standard"}}'
string: '{"model":"claude-3-5-haiku-20241022","id":"msg_01RnyTYpTE9Dd8BfwyMfuwum","type":"message","role":"assistant","content":[{"type":"text","text":"I
apologize, but the image appears to be blank or completely white. Without
any visible text or content, I cannot determine the type or nature of the
document. If you intended to share a specific document, you may want to check
the file and try uploading it again."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1656,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":59,"service_tier":"standard","inference_geo":"not_available"}}'
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Security-Policy:
- CSP-FILTERED
Content-Type:
- application/json
Date:
- Fri, 23 Jan 2026 19:08:19 GMT
- Thu, 12 Feb 2026 19:29:25 GMT
Server:
- cloudflare
Transfer-Encoding:
@@ -90,7 +86,7 @@ interactions:
anthropic-ratelimit-requests-remaining:
- '3999'
anthropic-ratelimit-requests-reset:
- '2026-01-23T19:08:16Z'
- '2026-02-12T19:29:23Z'
anthropic-ratelimit-tokens-limit:
- ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
anthropic-ratelimit-tokens-remaining:
@@ -104,7 +100,111 @@ interactions:
strict-transport-security:
- STS-XXX
x-envoy-upstream-service-time:
- '3114'
- '2072'
status:
code: 200
message: OK
- request:
body: '{"max_tokens":4096,"messages":[{"role":"user","content":[{"type":"text","text":"\nCurrent
Task: What is this document?\n\nProvide your complete response:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
anthropic-version:
- '2023-06-01'
connection:
- keep-alive
content-length:
- '942'
content-type:
- application/json
host:
- api.anthropic.com
x-api-key:
- X-API-KEY-XXX
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 0.73.0
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
x-stainless-timeout:
- NOT_GIVEN
method: POST
uri: https://api.anthropic.com/v1/messages
response:
body:
string: '{"model":"claude-3-5-haiku-20241022","id":"msg_011J2La8KpjxAK255NsSpePY","type":"message","role":"assistant","content":[{"type":"text","text":"I
apologize, but the document appears to be a blank white page. No text, images,
or discernible content is visible in this PDF file. Without any readable information,
I cannot determine the type or purpose of this document."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1656,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":51,"service_tier":"standard","inference_geo":"not_available"}}'
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Security-Policy:
- CSP-FILTERED
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:29:27 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Robots-Tag:
- none
anthropic-organization-id:
- ANTHROPIC-ORGANIZATION-ID-XXX
anthropic-ratelimit-input-tokens-limit:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-LIMIT-XXX
anthropic-ratelimit-input-tokens-remaining:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-REMAINING-XXX
anthropic-ratelimit-input-tokens-reset:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-RESET-XXX
anthropic-ratelimit-output-tokens-limit:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-LIMIT-XXX
anthropic-ratelimit-output-tokens-remaining:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-REMAINING-XXX
anthropic-ratelimit-output-tokens-reset:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-RESET-XXX
anthropic-ratelimit-requests-limit:
- '4000'
anthropic-ratelimit-requests-remaining:
- '3999'
anthropic-ratelimit-requests-reset:
- '2026-02-12T19:29:26Z'
anthropic-ratelimit-tokens-limit:
- ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
anthropic-ratelimit-tokens-remaining:
- ANTHROPIC-RATELIMIT-TOKENS-REMAINING-XXX
anthropic-ratelimit-tokens-reset:
- ANTHROPIC-RATELIMIT-TOKENS-RESET-XXX
cf-cache-status:
- DYNAMIC
request-id:
- REQUEST-ID-XXX
strict-transport-security:
- STS-XXX
x-envoy-upstream-service-time:
- '1802'
status:
code: 200
message: OK

View File

@@ -1,14 +1,9 @@
interactions:
- request:
body: '{"input":[{"role":"user","content":[{"type":"input_text","text":"\nCurrent
Task: What is this document?\n\nBegin! This is VERY important to you, use the
tools available and give your best Final Answer, your job depends on it!\n\nThought:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"gpt-4o-mini","instructions":"You
Task: What is this document?\n\nProvide your complete response:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"gpt-4o-mini","instructions":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately\nTo give my best complete final answer
to the task respond using the exact following format:\n\nThought: I now can
give a great answer\nFinal Answer: Your final answer must be the great and the
most complete as possible, it must be outcome described.\n\nI MUST use these
formats, my job depends on it!"}'
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -21,7 +16,7 @@ interactions:
connection:
- keep-alive
content-length:
- '1235'
- '834'
content-type:
- application/json
host:
@@ -43,47 +38,37 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/responses
response:
body:
string: "{\n \"id\": \"resp_059d23bc71d450aa006973c72416788197bddcc99157e3a313\",\n
\ \"object\": \"response\",\n \"created_at\": 1769195300,\n \"status\":
string: "{\n \"id\": \"resp_0751868929a7aa7500698e2a23d5508194b8e4092ff79a8f41\",\n
\ \"object\": \"response\",\n \"created_at\": 1770924579,\n \"status\":
\"completed\",\n \"background\": false,\n \"billing\": {\n \"payer\":
\"developer\"\n },\n \"completed_at\": 1769195307,\n \"error\": null,\n
\"developer\"\n },\n \"completed_at\": 1770924581,\n \"error\": null,\n
\ \"frequency_penalty\": 0.0,\n \"incomplete_details\": null,\n \"instructions\":
\"You are File Analyst. Expert at analyzing various file types.\\nYour personal
goal is: Analyze and describe files accurately\\nTo give my best complete
final answer to the task respond using the exact following format:\\n\\nThought:
I now can give a great answer\\nFinal Answer: Your final answer must be the
great and the most complete as possible, it must be outcome described.\\n\\nI
MUST use these formats, my job depends on it!\",\n \"max_output_tokens\":
goal is: Analyze and describe files accurately\",\n \"max_output_tokens\":
null,\n \"max_tool_calls\": null,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"output\": [\n {\n \"id\": \"msg_059d23bc71d450aa006973c724b1d881979787b0eeb53bdbd2\",\n
\ \"output\": [\n {\n \"id\": \"msg_0751868929a7aa7500698e2a2474208194a7ea7e8d1179c3fa\",\n
\ \"type\": \"message\",\n \"status\": \"completed\",\n \"content\":
[\n {\n \"type\": \"output_text\",\n \"annotations\":
[],\n \"logprobs\": [],\n \"text\": \"Thought: I now can
give a great answer. \\nFinal Answer: Without access to a specific document
or its contents, I cannot provide a detailed analysis. However, in general,
important aspects of a document can include its format (such as PDF, DOCX,
or TXT), purpose (such as legal, informative, or persuasive), and key elements
like headings, text structure, and any embedded media (such as images or charts).
For a thorough analysis, it's essential to understand the context, audience,
and intended use of the document. If you can provide the document itself or
more context about it, I would be able to give a complete assessment.\"\n
\ }\n ],\n \"role\": \"assistant\"\n }\n ],\n \"parallel_tool_calls\":
true,\n \"presence_penalty\": 0.0,\n \"previous_response_id\": null,\n \"prompt_cache_key\":
null,\n \"prompt_cache_retention\": null,\n \"reasoning\": {\n \"effort\":
null,\n \"summary\": null\n },\n \"safety_identifier\": null,\n \"service_tier\":
\"default\",\n \"store\": true,\n \"temperature\": 1.0,\n \"text\": {\n
\ \"format\": {\n \"type\": \"text\"\n },\n \"verbosity\": \"medium\"\n
\ },\n \"tool_choice\": \"auto\",\n \"tools\": [],\n \"top_logprobs\":
0,\n \"top_p\": 1.0,\n \"truncation\": \"disabled\",\n \"usage\": {\n \"input_tokens\":
137,\n \"input_tokens_details\": {\n \"cached_tokens\": 0\n },\n
\ \"output_tokens\": 132,\n \"output_tokens_details\": {\n \"reasoning_tokens\":
0\n },\n \"total_tokens\": 269\n },\n \"user\": null,\n \"metadata\":
{}\n}"
[],\n \"logprobs\": [],\n \"text\": \"It seems that you
have not uploaded any document or file for analysis. Please provide the file
you'd like me to review, and I'll be happy to help you with the analysis and
description.\"\n }\n ],\n \"role\": \"assistant\"\n }\n
\ ],\n \"parallel_tool_calls\": true,\n \"presence_penalty\": 0.0,\n \"previous_response_id\":
null,\n \"prompt_cache_key\": null,\n \"prompt_cache_retention\": null,\n
\ \"reasoning\": {\n \"effort\": null,\n \"summary\": null\n },\n \"safety_identifier\":
null,\n \"service_tier\": \"default\",\n \"store\": true,\n \"temperature\":
1.0,\n \"text\": {\n \"format\": {\n \"type\": \"text\"\n },\n
\ \"verbosity\": \"medium\"\n },\n \"tool_choice\": \"auto\",\n \"tools\":
[],\n \"top_logprobs\": 0,\n \"top_p\": 1.0,\n \"truncation\": \"disabled\",\n
\ \"usage\": {\n \"input_tokens\": 51,\n \"input_tokens_details\": {\n
\ \"cached_tokens\": 0\n },\n \"output_tokens\": 38,\n \"output_tokens_details\":
{\n \"reasoning_tokens\": 0\n },\n \"total_tokens\": 89\n },\n
\ \"user\": null,\n \"metadata\": {}\n}"
headers:
CF-RAY:
- CF-RAY-XXX
@@ -92,11 +77,9 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 23 Jan 2026 19:08:27 GMT
- Thu, 12 Feb 2026 19:29:41 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
@@ -110,13 +93,132 @@ interactions:
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '7347'
- '1581'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '7350'
set-cookie:
- SET-COOKIE-XXX
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"input":[{"role":"user","content":[{"type":"input_text","text":"\nCurrent
Task: What is this document?\n\nProvide your complete response:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"gpt-4o-mini","instructions":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '834'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/responses
response:
body:
string: "{\n \"id\": \"resp_0c3ca22d310deec300698e2a25842881929a9aad25ea18eb77\",\n
\ \"object\": \"response\",\n \"created_at\": 1770924581,\n \"status\":
\"completed\",\n \"background\": false,\n \"billing\": {\n \"payer\":
\"developer\"\n },\n \"completed_at\": 1770924582,\n \"error\": null,\n
\ \"frequency_penalty\": 0.0,\n \"incomplete_details\": null,\n \"instructions\":
\"You are File Analyst. Expert at analyzing various file types.\\nYour personal
goal is: Analyze and describe files accurately\",\n \"max_output_tokens\":
null,\n \"max_tool_calls\": null,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"output\": [\n {\n \"id\": \"msg_0c3ca22d310deec300698e2a26058081929351f3632bd1aa8e\",\n
\ \"type\": \"message\",\n \"status\": \"completed\",\n \"content\":
[\n {\n \"type\": \"output_text\",\n \"annotations\":
[],\n \"logprobs\": [],\n \"text\": \"Please upload the
document you would like me to analyze, and I'll provide you with a detailed
description and analysis of its contents.\"\n }\n ],\n \"role\":
\"assistant\"\n }\n ],\n \"parallel_tool_calls\": true,\n \"presence_penalty\":
0.0,\n \"previous_response_id\": null,\n \"prompt_cache_key\": null,\n \"prompt_cache_retention\":
null,\n \"reasoning\": {\n \"effort\": null,\n \"summary\": null\n
\ },\n \"safety_identifier\": null,\n \"service_tier\": \"default\",\n \"store\":
true,\n \"temperature\": 1.0,\n \"text\": {\n \"format\": {\n \"type\":
\"text\"\n },\n \"verbosity\": \"medium\"\n },\n \"tool_choice\":
\"auto\",\n \"tools\": [],\n \"top_logprobs\": 0,\n \"top_p\": 1.0,\n \"truncation\":
\"disabled\",\n \"usage\": {\n \"input_tokens\": 51,\n \"input_tokens_details\":
{\n \"cached_tokens\": 0\n },\n \"output_tokens\": 26,\n \"output_tokens_details\":
{\n \"reasoning_tokens\": 0\n },\n \"total_tokens\": 77\n },\n
\ \"user\": null,\n \"metadata\": {}\n}"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:29:42 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '870'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:

View File

@@ -1,16 +1,11 @@
interactions:
- request:
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Summarize this text.\n\nBegin!
This is VERY important to you, use the tools available and give your best Final
Answer, your job depends on it!\n\nThought:"}, {"inlineData": {"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Summarize this text.\n\nProvide
your complete response:"}, {"inlineData": {"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
"mimeType": "text/plain"}}], "role": "user"}], "systemInstruction": {"parts":
[{"text": "You are File Analyst. Expert at analyzing various file types.\nYour
personal goal is: Analyze and describe files accurately\nTo give my best complete
final answer to the task respond using the exact following format:\n\nThought:
I now can give a great answer\nFinal Answer: Your final answer must be the great
and the most complete as possible, it must be outcome described.\n\nI MUST use
these formats, my job depends on it!"}], "role": "user"}, "generationConfig":
{"stopSequences": ["\nObservation:"]}}'
personal goal is: Analyze and describe files accurately"}], "role": "user"},
"generationConfig": {"stopSequences": ["\nObservation:"]}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -21,13 +16,13 @@ interactions:
connection:
- keep-alive
content-length:
- '1619'
- '1218'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
x-goog-api-client:
- google-genai-sdk/1.49.0 gl-python/3.12.10
- google-genai-sdk/1.49.0 gl-python/3.13.3
x-goog-api-key:
- X-GOOG-API-KEY-XXX
method: POST
@@ -35,34 +30,101 @@ interactions:
response:
body:
string: "{\n \"candidates\": [\n {\n \"content\": {\n \"parts\":
[\n {\n \"text\": \"Thought: This text provides guidelines
for giving effective feedback. I need to summarize these guidelines in a clear
and concise manner.\\n\\nFinal Answer: The text outlines eight guidelines
for providing effective feedback: be clear and concise, focus on behavior
and outcomes, be specific with examples, balance positive aspects with areas
for improvement, be respectful and constructive by offering solutions, use
objective criteria, suggest actionable next steps, and proofread for tone,
grammar, and clarity before submission. These guidelines aim to ensure feedback
is easily understood, impactful, and geared towards positive growth.\\n\"\n
\ }\n ],\n \"role\": \"model\"\n },\n \"finishReason\":
\"STOP\",\n \"avgLogprobs\": -0.24753604923282657\n }\n ],\n \"usageMetadata\":
{\n \"promptTokenCount\": 252,\n \"candidatesTokenCount\": 111,\n \"totalTokenCount\":
363,\n \"promptTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\ \"tokenCount\": 252\n }\n ],\n \"candidatesTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 111\n
[\n {\n \"text\": \"The text provides guidelines for giving
effective feedback. Key principles include being clear, focusing on behavior
and outcomes with specific examples, balancing positive and constructive criticism,
remaining respectful, using objective criteria, suggesting actionable next
steps, and proofreading for clarity and tone. In essence, feedback should
be easily understood, objective, and geared towards improvement.\\n\"\n }\n
\ ],\n \"role\": \"model\"\n },\n \"finishReason\":
\"STOP\",\n \"avgLogprobs\": -0.24900928895864913\n }\n ],\n \"usageMetadata\":
{\n \"promptTokenCount\": 163,\n \"candidatesTokenCount\": 67,\n \"totalTokenCount\":
230,\n \"promptTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\ \"tokenCount\": 163\n }\n ],\n \"candidatesTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 67\n
\ }\n ]\n },\n \"modelVersion\": \"gemini-2.0-flash\",\n \"responseId\":
\"88lzae_VGaGOjMcPxNCokQI\"\n}\n"
\"SDSOaae8LLzRjMcPptjXkQ4\"\n}\n"
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type:
- application/json; charset=UTF-8
Date:
- Fri, 23 Jan 2026 19:20:20 GMT
- Thu, 12 Feb 2026 20:12:58 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=1200
- gfet4t7; dur=1742
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
X-Frame-Options:
- X-FRAME-OPTIONS-XXX
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
- request:
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Summarize this text.\n\nProvide
your complete response:"}, {"inlineData": {"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
"mimeType": "text/plain"}}], "role": "user"}], "systemInstruction": {"parts":
[{"text": "You are File Analyst. Expert at analyzing various file types.\nYour
personal goal is: Analyze and describe files accurately"}], "role": "user"},
"generationConfig": {"stopSequences": ["\nObservation:"]}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*'
accept-encoding:
- ACCEPT-ENCODING-XXX
connection:
- keep-alive
content-length:
- '1218'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
x-goog-api-client:
- google-genai-sdk/1.49.0 gl-python/3.13.3
x-goog-api-key:
- X-GOOG-API-KEY-XXX
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent
response:
body:
string: "{\n \"candidates\": [\n {\n \"content\": {\n \"parts\":
[\n {\n \"text\": \"The text provides guidelines for writing
effective feedback. Key recommendations include being clear, concise, specific,
and respectful. Feedback should focus on behavior and outcomes, balance positive
and negative aspects, use objective criteria, and suggest actionable next
steps. Proofreading is essential before submitting feedback.\\n\"\n }\n
\ ],\n \"role\": \"model\"\n },\n \"finishReason\":
\"STOP\",\n \"avgLogprobs\": -0.29874773892489348\n }\n ],\n \"usageMetadata\":
{\n \"promptTokenCount\": 163,\n \"candidatesTokenCount\": 55,\n \"totalTokenCount\":
218,\n \"promptTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\ \"tokenCount\": 163\n }\n ],\n \"candidatesTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 55\n
\ }\n ]\n },\n \"modelVersion\": \"gemini-2.0-flash\",\n \"responseId\":
\"SjSOab3-HaajjMcP38-yyQw\"\n}\n"
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type:
- application/json; charset=UTF-8
Date:
- Thu, 12 Feb 2026 20:12:59 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=1198
Transfer-Encoding:
- chunked
Vary:

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -1,17 +1,11 @@
interactions:
- request:
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Summarize this text
briefly.\n\nBegin! This is VERY important to you, use the tools available and
give your best Final Answer, your job depends on it!\n\nThought:"}, {"inlineData":
{"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
briefly.\n\nProvide your complete response:"}, {"inlineData": {"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
"mimeType": "text/plain"}}], "role": "user"}], "systemInstruction": {"parts":
[{"text": "You are File Analyst. Expert at analyzing various file types.\nYour
personal goal is: Analyze and describe files accurately\nTo give my best complete
final answer to the task respond using the exact following format:\n\nThought:
I now can give a great answer\nFinal Answer: Your final answer must be the great
and the most complete as possible, it must be outcome described.\n\nI MUST use
these formats, my job depends on it!"}], "role": "user"}, "generationConfig":
{"stopSequences": ["\nObservation:"]}}'
personal goal is: Analyze and describe files accurately"}], "role": "user"},
"generationConfig": {"stopSequences": ["\nObservation:"]}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -22,13 +16,13 @@ interactions:
connection:
- keep-alive
content-length:
- '1627'
- '1226'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
x-goog-api-client:
- google-genai-sdk/1.49.0 gl-python/3.12.10
- google-genai-sdk/1.49.0 gl-python/3.13.3
x-goog-api-key:
- X-GOOG-API-KEY-XXX
method: POST
@@ -36,30 +30,100 @@ interactions:
response:
body:
string: "{\n \"candidates\": [\n {\n \"content\": {\n \"parts\":
[\n {\n \"text\": \"Thought: The text provides guidelines
for giving effective feedback. I need to summarize these guidelines concisely.\\n\\nFinal
Answer: The provided text outlines eight guidelines for delivering effective
feedback, emphasizing clarity, focus on behavior and outcomes, specificity,
balanced perspective, respect, objectivity, actionable suggestions, and proofreading.\\n\"\n
\ }\n ],\n \"role\": \"model\"\n },\n \"finishReason\":
\"STOP\",\n \"avgLogprobs\": -0.18550947507222493\n }\n ],\n \"usageMetadata\":
{\n \"promptTokenCount\": 253,\n \"candidatesTokenCount\": 60,\n \"totalTokenCount\":
313,\n \"promptTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\ \"tokenCount\": 253\n }\n ],\n \"candidatesTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 60\n
[\n {\n \"text\": \"These guidelines provide instructions
for writing effective feedback. Feedback should be clear, concise, specific,
and balanced, focusing on behaviors and outcomes with examples. It should
also be respectful, constructive, and objective, suggesting actionable next
steps for improvement and be proofread before submission.\\n\"\n }\n
\ ],\n \"role\": \"model\"\n },\n \"finishReason\":
\"STOP\",\n \"avgLogprobs\": -0.27340631131772641\n }\n ],\n \"usageMetadata\":
{\n \"promptTokenCount\": 164,\n \"candidatesTokenCount\": 54,\n \"totalTokenCount\":
218,\n \"promptTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\ \"tokenCount\": 164\n }\n ],\n \"candidatesTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 54\n
\ }\n ]\n },\n \"modelVersion\": \"gemini-2.0-flash\",\n \"responseId\":
\"9MlzacewKpKMjMcPtu7joQI\"\n}\n"
\"kSqOadGYAsXQjMcP9YfmuAQ\"\n}\n"
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type:
- application/json; charset=UTF-8
Date:
- Fri, 23 Jan 2026 19:20:21 GMT
- Thu, 12 Feb 2026 19:31:29 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=890
- gfet4t7; dur=1041
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
X-Frame-Options:
- X-FRAME-OPTIONS-XXX
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
- request:
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Summarize this text
briefly.\n\nProvide your complete response:"}, {"inlineData": {"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
"mimeType": "text/plain"}}], "role": "user"}], "systemInstruction": {"parts":
[{"text": "You are File Analyst. Expert at analyzing various file types.\nYour
personal goal is: Analyze and describe files accurately"}], "role": "user"},
"generationConfig": {"stopSequences": ["\nObservation:"]}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*'
accept-encoding:
- ACCEPT-ENCODING-XXX
connection:
- keep-alive
content-length:
- '1226'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
x-goog-api-client:
- google-genai-sdk/1.49.0 gl-python/3.13.3
x-goog-api-key:
- X-GOOG-API-KEY-XXX
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent
response:
body:
string: "{\n \"candidates\": [\n {\n \"content\": {\n \"parts\":
[\n {\n \"text\": \"These guidelines outline how to provide
effective feedback: be clear, concise, and specific, focusing on behavior
and outcomes with examples. Balance positive aspects with areas for improvement,
offering constructive, respectful suggestions and actionable next steps, all
while referencing objective criteria and ensuring the feedback is well-written
and proofread.\\n\"\n }\n ],\n \"role\": \"model\"\n
\ },\n \"finishReason\": \"STOP\",\n \"avgLogprobs\": -0.25106738043613119\n
\ }\n ],\n \"usageMetadata\": {\n \"promptTokenCount\": 164,\n \"candidatesTokenCount\":
61,\n \"totalTokenCount\": 225,\n \"promptTokensDetails\": [\n {\n
\ \"modality\": \"TEXT\",\n \"tokenCount\": 164\n }\n ],\n
\ \"candidatesTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\ \"tokenCount\": 61\n }\n ]\n },\n \"modelVersion\": \"gemini-2.0-flash\",\n
\ \"responseId\": \"kiqOaePiC96RjMcP3auj8Q4\"\n}\n"
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type:
- application/json; charset=UTF-8
Date:
- Thu, 12 Feb 2026 19:31:31 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=1024
Transfer-Encoding:
- chunked
Vary:

View File

@@ -0,0 +1,134 @@
interactions:
- request:
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Summarize this text
briefly.\n\nProvide your complete response:"}, {"inlineData": {"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
"mimeType": "text/plain"}}], "role": "user"}], "systemInstruction": {"parts":
[{"text": "You are File Analyst. Expert at analyzing various file types.\nYour
personal goal is: Analyze and describe files accurately"}], "role": "user"},
"generationConfig": {"stopSequences": ["\nObservation:"]}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*'
accept-encoding:
- ACCEPT-ENCODING-XXX
connection:
- keep-alive
content-length:
- '1226'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
x-goog-api-client:
- google-genai-sdk/1.49.0 gl-python/3.13.3
x-goog-api-key:
- X-GOOG-API-KEY-XXX
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
response:
body:
string: "{\n \"candidates\": [\n {\n \"content\": {\n \"parts\":
[\n {\n \"text\": \"These guidelines provide a framework
for giving effective feedback, emphasizing clarity, specificity, balance,
respect, objectivity, actionable next steps, and proofreading.\"\n }\n
\ ],\n \"role\": \"model\"\n },\n \"finishReason\":
\"STOP\",\n \"index\": 0\n }\n ],\n \"usageMetadata\": {\n \"promptTokenCount\":
166,\n \"candidatesTokenCount\": 29,\n \"totalTokenCount\": 223,\n \"promptTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 166\n
\ }\n ],\n \"thoughtsTokenCount\": 28\n },\n \"modelVersion\":
\"gemini-2.5-flash\",\n \"responseId\": \"PUqOaZ3pMYi8_uMP25m7gAQ\"\n}\n"
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type:
- application/json; charset=UTF-8
Date:
- Thu, 12 Feb 2026 21:46:37 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=671
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
X-Frame-Options:
- X-FRAME-OPTIONS-XXX
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
- request:
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Summarize this text
briefly.\n\nProvide your complete response:"}, {"inlineData": {"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
"mimeType": "text/plain"}}], "role": "user"}], "systemInstruction": {"parts":
[{"text": "You are File Analyst. Expert at analyzing various file types.\nYour
personal goal is: Analyze and describe files accurately"}], "role": "user"},
"generationConfig": {"stopSequences": ["\nObservation:"]}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*'
accept-encoding:
- ACCEPT-ENCODING-XXX
connection:
- keep-alive
content-length:
- '1226'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
x-goog-api-client:
- google-genai-sdk/1.49.0 gl-python/3.13.3
x-goog-api-key:
- X-GOOG-API-KEY-XXX
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
response:
body:
string: "{\n \"candidates\": [\n {\n \"content\": {\n \"parts\":
[\n {\n \"text\": \"These guidelines provide instructions
on how to deliver effective, constructive, and respectful feedback, emphasizing
clarity, specificity, balance, and actionable suggestions for improvement.\"\n
\ }\n ],\n \"role\": \"model\"\n },\n \"finishReason\":
\"STOP\",\n \"index\": 0\n }\n ],\n \"usageMetadata\": {\n \"promptTokenCount\":
166,\n \"candidatesTokenCount\": 29,\n \"totalTokenCount\": 269,\n \"promptTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 166\n
\ }\n ],\n \"thoughtsTokenCount\": 74\n },\n \"modelVersion\":
\"gemini-2.5-flash\",\n \"responseId\": \"PkqOaf-bLu-v_uMPnorr8Qs\"\n}\n"
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type:
- application/json; charset=UTF-8
Date:
- Thu, 12 Feb 2026 21:46:38 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=898
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
X-Frame-Options:
- X-FRAME-OPTIONS-XXX
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -1,15 +1,9 @@
interactions:
- request:
body: '{"input":[{"role":"user","content":[{"type":"input_text","text":"\nCurrent
Task: What type of document is this?\n\nBegin! This is VERY important to you,
use the tools available and give your best Final Answer, your job depends on
it!\n\nThought:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"gpt-4o-mini","instructions":"You
Task: What type of document is this?\n\nProvide your complete response:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"gpt-4o-mini","instructions":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately\nTo give my best complete final answer
to the task respond using the exact following format:\n\nThought: I now can
give a great answer\nFinal Answer: Your final answer must be the great and the
most complete as possible, it must be outcome described.\n\nI MUST use these
formats, my job depends on it!"}'
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -22,7 +16,7 @@ interactions:
connection:
- keep-alive
content-length:
- '1243'
- '842'
content-type:
- application/json
host:
@@ -44,44 +38,36 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/responses
response:
body:
string: "{\n \"id\": \"resp_00f57987a2fb291d006973c701938081939b336e7a0cb669cf\",\n
\ \"object\": \"response\",\n \"created_at\": 1769195265,\n \"status\":
string: "{\n \"id\": \"resp_0524700d6a86aa2600698e2a7b511c8196869afcb28543046c\",\n
\ \"object\": \"response\",\n \"created_at\": 1770924667,\n \"status\":
\"completed\",\n \"background\": false,\n \"billing\": {\n \"payer\":
\"developer\"\n },\n \"completed_at\": 1769195269,\n \"error\": null,\n
\"developer\"\n },\n \"completed_at\": 1770924668,\n \"error\": null,\n
\ \"frequency_penalty\": 0.0,\n \"incomplete_details\": null,\n \"instructions\":
\"You are File Analyst. Expert at analyzing various file types.\\nYour personal
goal is: Analyze and describe files accurately\\nTo give my best complete
final answer to the task respond using the exact following format:\\n\\nThought:
I now can give a great answer\\nFinal Answer: Your final answer must be the
great and the most complete as possible, it must be outcome described.\\n\\nI
MUST use these formats, my job depends on it!\",\n \"max_output_tokens\":
goal is: Analyze and describe files accurately\",\n \"max_output_tokens\":
null,\n \"max_tool_calls\": null,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"output\": [\n {\n \"id\": \"msg_00f57987a2fb291d006973c7029b34819381420e8260962019\",\n
\ \"output\": [\n {\n \"id\": \"msg_0524700d6a86aa2600698e2a7c10c48196a1c0042a9a870127\",\n
\ \"type\": \"message\",\n \"status\": \"completed\",\n \"content\":
[\n {\n \"type\": \"output_text\",\n \"annotations\":
[],\n \"logprobs\": [],\n \"text\": \"Thought: I now can
give a great answer. \\nFinal Answer: The document is an identifiable file
type based on its characteristics. If it contains structured content, it might
be a PDF, Word document, or Excel spreadsheet. If it's a text file, it could
be a .txt or .csv. If images are present, it may be a .jpg, .png, or .gif.
Additional metadata or content inspection can confirm its exact type. The
format and extension provide critical insights into its intended use and functionality
within various applications.\"\n }\n ],\n \"role\": \"assistant\"\n
\ }\n ],\n \"parallel_tool_calls\": true,\n \"presence_penalty\": 0.0,\n
\ \"previous_response_id\": null,\n \"prompt_cache_key\": null,\n \"prompt_cache_retention\":
null,\n \"reasoning\": {\n \"effort\": null,\n \"summary\": null\n
\ },\n \"safety_identifier\": null,\n \"service_tier\": \"default\",\n \"store\":
true,\n \"temperature\": 1.0,\n \"text\": {\n \"format\": {\n \"type\":
\"text\"\n },\n \"verbosity\": \"medium\"\n },\n \"tool_choice\":
\"auto\",\n \"tools\": [],\n \"top_logprobs\": 0,\n \"top_p\": 1.0,\n \"truncation\":
\"disabled\",\n \"usage\": {\n \"input_tokens\": 139,\n \"input_tokens_details\":
{\n \"cached_tokens\": 0\n },\n \"output_tokens\": 109,\n \"output_tokens_details\":
{\n \"reasoning_tokens\": 0\n },\n \"total_tokens\": 248\n },\n
[],\n \"logprobs\": [],\n \"text\": \"It appears there was
no document provided for analysis. Please upload the document you'd like me
to examine, and I'll be happy to help identify its type and provide a detailed
description.\"\n }\n ],\n \"role\": \"assistant\"\n }\n
\ ],\n \"parallel_tool_calls\": true,\n \"presence_penalty\": 0.0,\n \"previous_response_id\":
null,\n \"prompt_cache_key\": null,\n \"prompt_cache_retention\": null,\n
\ \"reasoning\": {\n \"effort\": null,\n \"summary\": null\n },\n \"safety_identifier\":
null,\n \"service_tier\": \"default\",\n \"store\": true,\n \"temperature\":
1.0,\n \"text\": {\n \"format\": {\n \"type\": \"text\"\n },\n
\ \"verbosity\": \"medium\"\n },\n \"tool_choice\": \"auto\",\n \"tools\":
[],\n \"top_logprobs\": 0,\n \"top_p\": 1.0,\n \"truncation\": \"disabled\",\n
\ \"usage\": {\n \"input_tokens\": 53,\n \"input_tokens_details\": {\n
\ \"cached_tokens\": 0\n },\n \"output_tokens\": 36,\n \"output_tokens_details\":
{\n \"reasoning_tokens\": 0\n },\n \"total_tokens\": 89\n },\n
\ \"user\": null,\n \"metadata\": {}\n}"
headers:
CF-RAY:
@@ -91,7 +77,7 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 23 Jan 2026 19:07:49 GMT
- Thu, 12 Feb 2026 19:31:08 GMT
Server:
- cloudflare
Set-Cookie:
@@ -109,13 +95,128 @@ interactions:
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '3854'
- '1439'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"input":[{"role":"user","content":[{"type":"input_text","text":"\nCurrent
Task: What type of document is this?\n\nProvide your complete response:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"gpt-4o-mini","instructions":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '842'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/responses
response:
body:
string: "{\n \"id\": \"resp_061c22eec2c866c500698e2a7cd9348193929f5dfa4eba1ff6\",\n
\ \"object\": \"response\",\n \"created_at\": 1770924668,\n \"status\":
\"completed\",\n \"background\": false,\n \"billing\": {\n \"payer\":
\"developer\"\n },\n \"completed_at\": 1770924669,\n \"error\": null,\n
\ \"frequency_penalty\": 0.0,\n \"incomplete_details\": null,\n \"instructions\":
\"You are File Analyst. Expert at analyzing various file types.\\nYour personal
goal is: Analyze and describe files accurately\",\n \"max_output_tokens\":
null,\n \"max_tool_calls\": null,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"output\": [\n {\n \"id\": \"msg_061c22eec2c866c500698e2a7d2b18819389b67df0fc6ffaf6\",\n
\ \"type\": \"message\",\n \"status\": \"completed\",\n \"content\":
[\n {\n \"type\": \"output_text\",\n \"annotations\":
[],\n \"logprobs\": [],\n \"text\": \"To assist you accurately,
please upload the document you would like me to analyze.\"\n }\n ],\n
\ \"role\": \"assistant\"\n }\n ],\n \"parallel_tool_calls\": true,\n
\ \"presence_penalty\": 0.0,\n \"previous_response_id\": null,\n \"prompt_cache_key\":
null,\n \"prompt_cache_retention\": null,\n \"reasoning\": {\n \"effort\":
null,\n \"summary\": null\n },\n \"safety_identifier\": null,\n \"service_tier\":
\"default\",\n \"store\": true,\n \"temperature\": 1.0,\n \"text\": {\n
\ \"format\": {\n \"type\": \"text\"\n },\n \"verbosity\": \"medium\"\n
\ },\n \"tool_choice\": \"auto\",\n \"tools\": [],\n \"top_logprobs\":
0,\n \"top_p\": 1.0,\n \"truncation\": \"disabled\",\n \"usage\": {\n \"input_tokens\":
53,\n \"input_tokens_details\": {\n \"cached_tokens\": 0\n },\n
\ \"output_tokens\": 17,\n \"output_tokens_details\": {\n \"reasoning_tokens\":
0\n },\n \"total_tokens\": 70\n },\n \"user\": null,\n \"metadata\":
{}\n}"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:31:09 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '836'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '3857'
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:

View File

@@ -1,15 +1,9 @@
interactions:
- request:
body: '{"input":[{"role":"user","content":[{"type":"input_text","text":"\nCurrent
Task: What type of document is this?\n\nBegin! This is VERY important to you,
use the tools available and give your best Final Answer, your job depends on
it!\n\nThought:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"o4-mini","instructions":"You
Task: What type of document is this?\n\nProvide your complete response:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"o4-mini","instructions":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately\nTo give my best complete final answer
to the task respond using the exact following format:\n\nThought: I now can
give a great answer\nFinal Answer: Your final answer must be the great and the
most complete as possible, it must be outcome described.\n\nI MUST use these
formats, my job depends on it!"}'
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -22,7 +16,7 @@ interactions:
connection:
- keep-alive
content-length:
- '1239'
- '838'
content-type:
- application/json
host:
@@ -44,41 +38,36 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/responses
response:
body:
string: "{\n \"id\": \"resp_02b841f189494a24006973c705c84c81938ac9360927749cd2\",\n
\ \"object\": \"response\",\n \"created_at\": 1769195269,\n \"status\":
string: "{\n \"id\": \"resp_064e248119b2b15200698e2a850d908190ab1c6ba7b548c6c2\",\n
\ \"object\": \"response\",\n \"created_at\": 1770924677,\n \"status\":
\"completed\",\n \"background\": false,\n \"billing\": {\n \"payer\":
\"developer\"\n },\n \"completed_at\": 1769195274,\n \"error\": null,\n
\"developer\"\n },\n \"completed_at\": 1770924678,\n \"error\": null,\n
\ \"frequency_penalty\": 0.0,\n \"incomplete_details\": null,\n \"instructions\":
\"You are File Analyst. Expert at analyzing various file types.\\nYour personal
goal is: Analyze and describe files accurately\\nTo give my best complete
final answer to the task respond using the exact following format:\\n\\nThought:
I now can give a great answer\\nFinal Answer: Your final answer must be the
great and the most complete as possible, it must be outcome described.\\n\\nI
MUST use these formats, my job depends on it!\",\n \"max_output_tokens\":
goal is: Analyze and describe files accurately\",\n \"max_output_tokens\":
null,\n \"max_tool_calls\": null,\n \"model\": \"o4-mini-2025-04-16\",\n
\ \"output\": [\n {\n \"id\": \"rs_02b841f189494a24006973c70641dc81938955c83f790392bd\",\n
\ \"output\": [\n {\n \"id\": \"rs_064e248119b2b15200698e2a85b12081909cd1fbbe97495d44\",\n
\ \"type\": \"reasoning\",\n \"summary\": []\n },\n {\n \"id\":
\"msg_02b841f189494a24006973c709f6d081938e358e108f27434e\",\n \"type\":
\"msg_064e248119b2b15200698e2a8648488190a03c7b0dc1e83d9d\",\n \"type\":
\"message\",\n \"status\": \"completed\",\n \"content\": [\n {\n
\ \"type\": \"output_text\",\n \"annotations\": [],\n \"logprobs\":
[],\n \"text\": \"I\\u2019m sorry, but I don\\u2019t see a document
to analyze. Please provide the file or its content so I can determine its
type.\"\n }\n ],\n \"role\": \"assistant\"\n }\n ],\n
\ \"parallel_tool_calls\": true,\n \"presence_penalty\": 0.0,\n \"previous_response_id\":
null,\n \"prompt_cache_key\": null,\n \"prompt_cache_retention\": null,\n
\ \"reasoning\": {\n \"effort\": \"medium\",\n \"summary\": null\n },\n
\ \"safety_identifier\": null,\n \"service_tier\": \"default\",\n \"store\":
[],\n \"text\": \"Could you please upload or provide the document
you\\u2019d like me to analyze?\"\n }\n ],\n \"role\": \"assistant\"\n
\ }\n ],\n \"parallel_tool_calls\": true,\n \"presence_penalty\": 0.0,\n
\ \"previous_response_id\": null,\n \"prompt_cache_key\": null,\n \"prompt_cache_retention\":
null,\n \"reasoning\": {\n \"effort\": \"medium\",\n \"summary\": null\n
\ },\n \"safety_identifier\": null,\n \"service_tier\": \"default\",\n \"store\":
true,\n \"temperature\": 1.0,\n \"text\": {\n \"format\": {\n \"type\":
\"text\"\n },\n \"verbosity\": \"medium\"\n },\n \"tool_choice\":
\"auto\",\n \"tools\": [],\n \"top_logprobs\": 0,\n \"top_p\": 1.0,\n \"truncation\":
\"disabled\",\n \"usage\": {\n \"input_tokens\": 138,\n \"input_tokens_details\":
{\n \"cached_tokens\": 0\n },\n \"output_tokens\": 418,\n \"output_tokens_details\":
{\n \"reasoning_tokens\": 384\n },\n \"total_tokens\": 556\n },\n
\"disabled\",\n \"usage\": {\n \"input_tokens\": 52,\n \"input_tokens_details\":
{\n \"cached_tokens\": 0\n },\n \"output_tokens\": 81,\n \"output_tokens_details\":
{\n \"reasoning_tokens\": 0\n },\n \"total_tokens\": 133\n },\n
\ \"user\": null,\n \"metadata\": {}\n}"
headers:
CF-RAY:
@@ -88,11 +77,9 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 23 Jan 2026 19:07:54 GMT
- Thu, 12 Feb 2026 19:31:18 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
@@ -106,13 +93,134 @@ interactions:
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '4864'
- '1769'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '4867'
set-cookie:
- SET-COOKIE-XXX
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"input":[{"role":"user","content":[{"type":"input_text","text":"\nCurrent
Task: What type of document is this?\n\nProvide your complete response:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"o4-mini","instructions":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '838'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/responses
response:
body:
string: "{\n \"id\": \"resp_05091b7975cea42100698e2a86f30881908983fbd92fbd48a4\",\n
\ \"object\": \"response\",\n \"created_at\": 1770924679,\n \"status\":
\"completed\",\n \"background\": false,\n \"billing\": {\n \"payer\":
\"developer\"\n },\n \"completed_at\": 1770924683,\n \"error\": null,\n
\ \"frequency_penalty\": 0.0,\n \"incomplete_details\": null,\n \"instructions\":
\"You are File Analyst. Expert at analyzing various file types.\\nYour personal
goal is: Analyze and describe files accurately\",\n \"max_output_tokens\":
null,\n \"max_tool_calls\": null,\n \"model\": \"o4-mini-2025-04-16\",\n
\ \"output\": [\n {\n \"id\": \"rs_05091b7975cea42100698e2a87b52c8190b25a662c10b2753f\",\n
\ \"type\": \"reasoning\",\n \"summary\": []\n },\n {\n \"id\":
\"msg_05091b7975cea42100698e2a8b3eec8190b6f7c247a04ea9ce\",\n \"type\":
\"message\",\n \"status\": \"completed\",\n \"content\": [\n {\n
\ \"type\": \"output_text\",\n \"annotations\": [],\n \"logprobs\":
[],\n \"text\": \"I don\\u2019t see a document attached. Could you
please upload the file or share its contents so I can determine what type
of document it is?\"\n }\n ],\n \"role\": \"assistant\"\n
\ }\n ],\n \"parallel_tool_calls\": true,\n \"presence_penalty\": 0.0,\n
\ \"previous_response_id\": null,\n \"prompt_cache_key\": null,\n \"prompt_cache_retention\":
null,\n \"reasoning\": {\n \"effort\": \"medium\",\n \"summary\": null\n
\ },\n \"safety_identifier\": null,\n \"service_tier\": \"default\",\n \"store\":
true,\n \"temperature\": 1.0,\n \"text\": {\n \"format\": {\n \"type\":
\"text\"\n },\n \"verbosity\": \"medium\"\n },\n \"tool_choice\":
\"auto\",\n \"tools\": [],\n \"top_logprobs\": 0,\n \"top_p\": 1.0,\n \"truncation\":
\"disabled\",\n \"usage\": {\n \"input_tokens\": 52,\n \"input_tokens_details\":
{\n \"cached_tokens\": 0\n },\n \"output_tokens\": 254,\n \"output_tokens_details\":
{\n \"reasoning_tokens\": 192\n },\n \"total_tokens\": 306\n },\n
\ \"user\": null,\n \"metadata\": {}\n}"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:31:24 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '5181'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:

View File

@@ -37,13 +37,13 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D3qP75TkGfZcx59AyFhCifB7NeNve\",\n \"object\":
\"chat.completion\",\n \"created\": 1769808797,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
string: "{\n \"id\": \"chatcmpl-D8WiGEDTbwLcrRjnvxgSpt9XISVwN\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924744,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The sum of 2 + 2 is 4.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
@@ -52,7 +52,7 @@ interactions:
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_e01c6f58e1\"\n}\n"
\"default\",\n \"system_fingerprint\": \"fp_75546bd1a7\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
@@ -61,11 +61,9 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 30 Jan 2026 21:33:18 GMT
- Thu, 12 Feb 2026 19:32:25 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
@@ -81,11 +79,121 @@ interactions:
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1149'
- '988'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Research Analyst. Expert
researcher\nYour personal goal is: Find information"},{"role":"user","content":"\nCurrent
Task: What is 2 + 2?\n\nProvide your complete response:"}],"model":"gpt-4.1-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '246'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8WiHquzE7A8dBalX3phbPaOSXEnQ\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924745,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The sum of 2 + 2 is 4.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
43,\n \"completion_tokens\": 12,\n \"total_tokens\": 55,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_75546bd1a7\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:32:26 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '415'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:

View File

@@ -37,13 +37,13 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D3qQLXvb3qeE7H25yFuZE7lYxOI0j\",\n \"object\":
\"chat.completion\",\n \"created\": 1769808873,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
string: "{\n \"id\": \"chatcmpl-D8WiFd3X8iE0Xk2N1S3L2k798qWFq\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924743,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"Hello! How can I assist you today?\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
@@ -52,7 +52,7 @@ interactions:
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_e01c6f58e1\"\n}\n"
\"default\",\n \"system_fingerprint\": \"fp_75546bd1a7\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
@@ -61,11 +61,9 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 30 Jan 2026 21:34:33 GMT
- Thu, 12 Feb 2026 19:32:23 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
@@ -81,11 +79,121 @@ interactions:
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '358'
- '346'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Simple Assistant. A helpful
assistant\nYour personal goal is: Help with basic tasks"},{"role":"user","content":"\nCurrent
Task: Say hello\n\nProvide your complete response:"}],"model":"gpt-4.1-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '248'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8WiFOaYAAKsuxLAXe6PwTk5AjYdk\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924743,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"Hello! How can I assist you today?\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
41,\n \"completion_tokens\": 9,\n \"total_tokens\": 50,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_75546bd1a7\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:32:24 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '618'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:

View File

@@ -1,60 +1,9 @@
interactions:
- request:
body: ''
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*'
accept-encoding:
- ACCEPT-ENCODING-XXX
connection:
- keep-alive
host:
- localhost:9999
method: GET
uri: http://localhost:9999/.well-known/agent-card.json
response:
body:
string: '{"capabilities":{"pushNotifications":true,"streaming":true},"defaultInputModes":["text/plain","application/json"],"defaultOutputModes":["text/plain","application/json"],"description":"An
AI assistant powered by OpenAI GPT with calculator and time tools. Ask questions,
perform calculations, or get the current time in any timezone.","name":"GPT
Assistant","preferredTransport":"JSONRPC","protocolVersion":"0.3.0","skills":[{"description":"Have
a general conversation with the AI assistant. Ask questions, get explanations,
or just chat.","examples":["Hello, how are you?","Explain quantum computing
in simple terms","What can you help me with?"],"id":"conversation","name":"General
Conversation","tags":["chat","conversation","general"]},{"description":"Perform
mathematical calculations including arithmetic, exponents, and more.","examples":["What
is 25 * 17?","Calculate 2^10","What''s (100 + 50) / 3?"],"id":"calculator","name":"Calculator","tags":["math","calculator","arithmetic"]},{"description":"Get
the current date and time in any timezone.","examples":["What time is it?","What''s
the current time in Tokyo?","What''s today''s date in New York?"],"id":"time","name":"Current
Time","tags":["time","date","timezone"]}],"url":"http://localhost:9999","version":"1.0.0"}'
headers:
content-length:
- '1272'
content-type:
- application/json
date:
- Fri, 30 Jan 2026 21:32:36 GMT
server:
- uvicorn
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Research Analyst. Expert
researcher with access to remote agents\nYour personal goal is: Find and analyze
information"},{"role":"user","content":"\nCurrent Task: Use the remote A2A agent
to calculate 10 plus 15.\n\nProvide your complete response:"}],"model":"gpt-4.1-mini","response_format":{"type":"json_schema","json_schema":{"schema":{"properties":{"a2a_ids":{"description":"A2A
agent IDs to delegate to.","items":{"const":"http://localhost:9999/.well-known/agent-card.json","type":"string"},"maxItems":1,"title":"A2A
Ids","type":"array"},"message":{"description":"The message content. If is_a2a=true,
this is sent to the A2A agent. If is_a2a=false, this is your final answer ending
the conversation.","title":"Message","type":"string"},"is_a2a":{"description":"Set
to false when the remote agent has answered your question - extract their answer
and return it as your final message. Set to true ONLY if you need to ask a NEW,
DIFFERENT question. NEVER repeat the same request - if the conversation history
shows the agent already answered, set is_a2a=false immediately.","title":"Is
A2A","type":"boolean"}},"required":["a2a_ids","message","is_a2a"],"title":"AgentResponse","type":"object","additionalProperties":false},"name":"AgentResponse","strict":true}},"stream":false}'
to calculate 10 plus 15.\n\nProvide your complete response:"}],"model":"gpt-4.1-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -67,7 +16,7 @@ interactions:
connection:
- keep-alive
content-length:
- '1326'
- '322'
content-type:
- application/json
host:
@@ -76,8 +25,6 @@ interactions:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
@@ -91,23 +38,23 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D3qOTnAG0KogwskyqSSZDRbSOtXHr\",\n \"object\":
\"chat.completion\",\n \"created\": 1769808757,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
string: "{\n \"id\": \"chatcmpl-D8WiD3djMj91vXlZgRexuoagt4YjK\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924741,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"a2a_ids\\\":[\\\"http://localhost:9999/.well-known/agent-card.json\\\"],\\\"message\\\":\\\"Calculate
the sum of 10 plus 15.\\\",\\\"is_a2a\\\":true}\",\n \"refusal\": null,\n
\ \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\":
\"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 266,\n \"completion_tokens\":
40,\n \"total_tokens\": 306,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
\"assistant\",\n \"content\": \"I am using the remote A2A agent to
calculate 10 plus 15.\\n\\nCalculation result: 10 + 15 = 25\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
57,\n \"completion_tokens\": 28,\n \"total_tokens\": 85,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_e01c6f58e1\"\n}\n"
\"default\",\n \"system_fingerprint\": \"fp_75546bd1a7\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
@@ -116,31 +63,31 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 30 Jan 2026 21:32:38 GMT
- Thu, 12 Feb 2026 19:32:21 GMT
Server:
- cloudflare
Set-Cookie:
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '633'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '832'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
@@ -160,108 +107,11 @@ interactions:
status:
code: 200
message: OK
- request:
body: ''
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*'
accept-encoding:
- ACCEPT-ENCODING-XXX
connection:
- keep-alive
host:
- localhost:9999
method: GET
uri: http://localhost:9999/.well-known/agent-card.json
response:
body:
string: '{"capabilities":{"pushNotifications":true,"streaming":true},"defaultInputModes":["text/plain","application/json"],"defaultOutputModes":["text/plain","application/json"],"description":"An
AI assistant powered by OpenAI GPT with calculator and time tools. Ask questions,
perform calculations, or get the current time in any timezone.","name":"GPT
Assistant","preferredTransport":"JSONRPC","protocolVersion":"0.3.0","skills":[{"description":"Have
a general conversation with the AI assistant. Ask questions, get explanations,
or just chat.","examples":["Hello, how are you?","Explain quantum computing
in simple terms","What can you help me with?"],"id":"conversation","name":"General
Conversation","tags":["chat","conversation","general"]},{"description":"Perform
mathematical calculations including arithmetic, exponents, and more.","examples":["What
is 25 * 17?","Calculate 2^10","What''s (100 + 50) / 3?"],"id":"calculator","name":"Calculator","tags":["math","calculator","arithmetic"]},{"description":"Get
the current date and time in any timezone.","examples":["What time is it?","What''s
the current time in Tokyo?","What''s today''s date in New York?"],"id":"time","name":"Current
Time","tags":["time","date","timezone"]}],"url":"http://localhost:9999","version":"1.0.0"}'
headers:
content-length:
- '1272'
content-type:
- application/json
date:
- Fri, 30 Jan 2026 21:32:38 GMT
server:
- uvicorn
status:
code: 200
message: OK
- request:
body: '{"id":"11e7f105-5324-4e70-af42-2db3a3e96054","jsonrpc":"2.0","method":"message/stream","params":{"configuration":{"acceptedOutputModes":["application/json"],"blocking":true},"message":{"kind":"message","messageId":"8ba087b8-e647-4e46-ba32-d163f2ef3f3b","parts":[{"kind":"text","text":"Calculate
the sum of 10 plus 15."}],"referenceTaskIds":[],"role":"user"}}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*, text/event-stream'
accept-encoding:
- ACCEPT-ENCODING-XXX
cache-control:
- no-store
connection:
- keep-alive
content-length:
- '359'
content-type:
- application/json
host:
- localhost:9999
method: POST
uri: http://localhost:9999
response:
body:
string: "data: {\"id\":\"11e7f105-5324-4e70-af42-2db3a3e96054\",\"jsonrpc\":\"2.0\",\"result\":{\"contextId\":\"2f5791a9-4dd2-4fe1-b637-ef4e8c7d3f78\",\"final\":false,\"kind\":\"status-update\",\"status\":{\"state\":\"submitted\"},\"taskId\":\"d5371a72-7ad4-4606-889d-040bdaf6dc62\"}}\r\n\r\ndata:
{\"id\":\"11e7f105-5324-4e70-af42-2db3a3e96054\",\"jsonrpc\":\"2.0\",\"result\":{\"contextId\":\"2f5791a9-4dd2-4fe1-b637-ef4e8c7d3f78\",\"final\":false,\"kind\":\"status-update\",\"status\":{\"state\":\"working\"},\"taskId\":\"d5371a72-7ad4-4606-889d-040bdaf6dc62\"}}\r\n\r\ndata:
{\"id\":\"11e7f105-5324-4e70-af42-2db3a3e96054\",\"jsonrpc\":\"2.0\",\"result\":{\"contextId\":\"2f5791a9-4dd2-4fe1-b637-ef4e8c7d3f78\",\"final\":true,\"kind\":\"status-update\",\"status\":{\"message\":{\"kind\":\"message\",\"messageId\":\"f9f4cc36-e504-4d2e-8e53-d061427adde6\",\"parts\":[{\"kind\":\"text\",\"text\":\"[Tool:
calculator] 10 + 15 = 25\\nThe sum of 10 plus 15 is 25.\"}],\"role\":\"agent\"},\"state\":\"completed\"},\"taskId\":\"d5371a72-7ad4-4606-889d-040bdaf6dc62\"}}\r\n\r\n"
headers:
cache-control:
- no-store
connection:
- keep-alive
content-type:
- text/event-stream; charset=utf-8
date:
- Fri, 30 Jan 2026 21:32:38 GMT
server:
- uvicorn
transfer-encoding:
- chunked
x-accel-buffering:
- 'no'
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Research Analyst. Expert
researcher with access to remote agents\nYour personal goal is: Find and analyze
information"},{"role":"user","content":"\nCurrent Task: Use the remote A2A agent
to calculate 10 plus 15.\n\nProvide your complete response:"}],"model":"gpt-4.1-mini","response_format":{"type":"json_schema","json_schema":{"schema":{"properties":{"a2a_ids":{"description":"A2A
agent IDs to delegate to.","items":{"const":"http://localhost:9999/.well-known/agent-card.json","type":"string"},"maxItems":1,"title":"A2A
Ids","type":"array"},"message":{"description":"The message content. If is_a2a=true,
this is sent to the A2A agent. If is_a2a=false, this is your final answer ending
the conversation.","title":"Message","type":"string"},"is_a2a":{"description":"Set
to false when the remote agent has answered your question - extract their answer
and return it as your final message. Set to true ONLY if you need to ask a NEW,
DIFFERENT question. NEVER repeat the same request - if the conversation history
shows the agent already answered, set is_a2a=false immediately.","title":"Is
A2A","type":"boolean"}},"required":["a2a_ids","message","is_a2a"],"title":"AgentResponse","type":"object","additionalProperties":false},"name":"AgentResponse","strict":true}},"stream":false}'
to calculate 10 plus 15.\n\nProvide your complete response:"}],"model":"gpt-4.1-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -274,7 +124,7 @@ interactions:
connection:
- keep-alive
content-length:
- '1326'
- '322'
content-type:
- application/json
cookie:
@@ -285,8 +135,6 @@ interactions:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
@@ -300,23 +148,23 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D3qOYv1S9VAwloC7LrWOUABqHUtDO\",\n \"object\":
\"chat.completion\",\n \"created\": 1769808762,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
string: "{\n \"id\": \"chatcmpl-D8WiEa5fOdnyGxf1o0YYZRjEVstUX\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924742,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"a2a_ids\\\":[\\\"http://localhost:9999/.well-known/agent-card.json\\\"],\\\"message\\\":\\\"Calculate
the sum of 10 plus 15.\\\",\\\"is_a2a\\\":true}\",\n \"refusal\": null,\n
\ \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\":
\"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 266,\n \"completion_tokens\":
40,\n \"total_tokens\": 306,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
\"assistant\",\n \"content\": \"Using the remote A2A agent to calculate
10 plus 15:\\n\\n10 + 15 = 25\\n\\nThe result is 25.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
57,\n \"completion_tokens\": 29,\n \"total_tokens\": 86,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_e01c6f58e1\"\n}\n"
\"default\",\n \"system_fingerprint\": \"fp_75546bd1a7\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
@@ -325,7 +173,7 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 30 Jan 2026 21:32:43 GMT
- Thu, 12 Feb 2026 19:32:22 GMT
Server:
- cloudflare
Strict-Transport-Security:
@@ -343,341 +191,13 @@ interactions:
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '658'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"id":"93d4ded2-251f-47da-ae7b-2a135ec7cbb9","jsonrpc":"2.0","method":"message/stream","params":{"configuration":{"acceptedOutputModes":["application/json"],"blocking":true},"message":{"kind":"message","messageId":"08032897-ffdc-4a5e-8ae9-1124d49bbf01","parts":[{"kind":"text","text":"Calculate
the sum of 10 plus 15."}],"referenceTaskIds":[],"role":"user"}}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*, text/event-stream'
accept-encoding:
- ACCEPT-ENCODING-XXX
cache-control:
- no-store
connection:
- keep-alive
content-length:
- '359'
content-type:
- application/json
host:
- localhost:9999
method: POST
uri: http://localhost:9999
response:
body:
string: "data: {\"id\":\"93d4ded2-251f-47da-ae7b-2a135ec7cbb9\",\"jsonrpc\":\"2.0\",\"result\":{\"contextId\":\"a2b91c10-dc16-4dff-b807-3ea98016ff38\",\"final\":false,\"kind\":\"status-update\",\"status\":{\"state\":\"submitted\"},\"taskId\":\"2b0861b7-8d94-4325-97ab-aaae42f43581\"}}\r\n\r\ndata:
{\"id\":\"93d4ded2-251f-47da-ae7b-2a135ec7cbb9\",\"jsonrpc\":\"2.0\",\"result\":{\"contextId\":\"a2b91c10-dc16-4dff-b807-3ea98016ff38\",\"final\":false,\"kind\":\"status-update\",\"status\":{\"state\":\"working\"},\"taskId\":\"2b0861b7-8d94-4325-97ab-aaae42f43581\"}}\r\n\r\ndata:
{\"id\":\"93d4ded2-251f-47da-ae7b-2a135ec7cbb9\",\"jsonrpc\":\"2.0\",\"result\":{\"contextId\":\"a2b91c10-dc16-4dff-b807-3ea98016ff38\",\"final\":true,\"kind\":\"status-update\",\"status\":{\"message\":{\"kind\":\"message\",\"messageId\":\"e4e420da-aef9-489f-a3ca-39a97930dee8\",\"parts\":[{\"kind\":\"text\",\"text\":\"[Tool:
calculator] 10 + 15 = 25\\nThe sum of 10 plus 15 is 25.\"}],\"role\":\"agent\"},\"state\":\"completed\"},\"taskId\":\"2b0861b7-8d94-4325-97ab-aaae42f43581\"}}\r\n\r\n"
headers:
cache-control:
- no-store
connection:
- keep-alive
content-type:
- text/event-stream; charset=utf-8
date:
- Fri, 30 Jan 2026 21:32:43 GMT
server:
- uvicorn
transfer-encoding:
- chunked
x-accel-buffering:
- 'no'
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Research Analyst. Expert
researcher with access to remote agents\nYour personal goal is: Find and analyze
information"},{"role":"user","content":"\nCurrent Task: Use the remote A2A agent
to calculate 10 plus 15.\n\nProvide your complete response:"}],"model":"gpt-4.1-mini","response_format":{"type":"json_schema","json_schema":{"schema":{"properties":{"a2a_ids":{"description":"A2A
agent IDs to delegate to.","items":{"const":"http://localhost:9999/.well-known/agent-card.json","type":"string"},"maxItems":1,"title":"A2A
Ids","type":"array"},"message":{"description":"The message content. If is_a2a=true,
this is sent to the A2A agent. If is_a2a=false, this is your final answer ending
the conversation.","title":"Message","type":"string"},"is_a2a":{"description":"Set
to false when the remote agent has answered your question - extract their answer
and return it as your final message. Set to true ONLY if you need to ask a NEW,
DIFFERENT question. NEVER repeat the same request - if the conversation history
shows the agent already answered, set is_a2a=false immediately.","title":"Is
A2A","type":"boolean"}},"required":["a2a_ids","message","is_a2a"],"title":"AgentResponse","type":"object","additionalProperties":false},"name":"AgentResponse","strict":true}},"stream":false}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '1326'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D3qOcC0ycRtx6l3V88o2KbMLXk24S\",\n \"object\":
\"chat.completion\",\n \"created\": 1769808766,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"a2a_ids\\\":[\\\"http://localhost:9999/.well-known/agent-card.json\\\"],\\\"message\\\":\\\"Calculate
the sum of 10 plus 15.\\\",\\\"is_a2a\\\":true}\",\n \"refusal\": null,\n
\ \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\":
\"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 266,\n \"completion_tokens\":
40,\n \"total_tokens\": 306,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_e01c6f58e1\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Fri, 30 Jan 2026 21:32:47 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '644'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"id":"be92898e-ac10-4bed-a54c-d40e747c85f3","jsonrpc":"2.0","method":"message/stream","params":{"configuration":{"acceptedOutputModes":["application/json"],"blocking":true},"message":{"kind":"message","messageId":"0f12aa81-afb8-419b-9d52-b47cc6c21329","parts":[{"kind":"text","text":"Calculate
the sum of 10 plus 15."}],"referenceTaskIds":[],"role":"user"}}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*, text/event-stream'
accept-encoding:
- ACCEPT-ENCODING-XXX
cache-control:
- no-store
connection:
- keep-alive
content-length:
- '359'
content-type:
- application/json
host:
- localhost:9999
method: POST
uri: http://localhost:9999
response:
body:
string: "data: {\"id\":\"be92898e-ac10-4bed-a54c-d40e747c85f3\",\"jsonrpc\":\"2.0\",\"result\":{\"contextId\":\"e13fc32d-ead2-4f01-b852-7fd1b7b73983\",\"final\":false,\"kind\":\"status-update\",\"status\":{\"state\":\"submitted\"},\"taskId\":\"cdaba0fb-081e-4950-91da-9635c0bd1336\"}}\r\n\r\ndata:
{\"id\":\"be92898e-ac10-4bed-a54c-d40e747c85f3\",\"jsonrpc\":\"2.0\",\"result\":{\"contextId\":\"e13fc32d-ead2-4f01-b852-7fd1b7b73983\",\"final\":false,\"kind\":\"status-update\",\"status\":{\"state\":\"working\"},\"taskId\":\"cdaba0fb-081e-4950-91da-9635c0bd1336\"}}\r\n\r\ndata:
{\"id\":\"be92898e-ac10-4bed-a54c-d40e747c85f3\",\"jsonrpc\":\"2.0\",\"result\":{\"contextId\":\"e13fc32d-ead2-4f01-b852-7fd1b7b73983\",\"final\":true,\"kind\":\"status-update\",\"status\":{\"message\":{\"kind\":\"message\",\"messageId\":\"bb905c5a-34c8-4a02-9ba3-5713790e2a00\",\"parts\":[{\"kind\":\"text\",\"text\":\"[Tool:
calculator] 10 + 15 = 25\\nThe sum of 10 plus 15 is 25.\"}],\"role\":\"agent\"},\"state\":\"completed\"},\"taskId\":\"cdaba0fb-081e-4950-91da-9635c0bd1336\"}}\r\n\r\n"
headers:
cache-control:
- no-store
connection:
- keep-alive
content-type:
- text/event-stream; charset=utf-8
date:
- Fri, 30 Jan 2026 21:32:47 GMT
server:
- uvicorn
transfer-encoding:
- chunked
x-accel-buffering:
- 'no'
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Research Analyst. Expert
researcher with access to remote agents\nYour personal goal is: Find and analyze
information"},{"role":"user","content":"\nCurrent Task: Use the remote A2A agent
to calculate 10 plus 15.\n\nProvide your complete response:"}],"model":"gpt-4.1-mini","response_format":{"type":"json_schema","json_schema":{"schema":{"properties":{"a2a_ids":{"description":"A2A
agent IDs to delegate to.","items":{"const":"http://localhost:9999/.well-known/agent-card.json","type":"string"},"maxItems":1,"title":"A2A
Ids","type":"array"},"message":{"description":"The message content. If is_a2a=true,
this is sent to the A2A agent. If is_a2a=false, this is your final answer ending
the conversation.","title":"Message","type":"string"},"is_a2a":{"description":"Set
to false when the remote agent has answered your question - extract their answer
and return it as your final message. Set to true ONLY if you need to ask a NEW,
DIFFERENT question. NEVER repeat the same request - if the conversation history
shows the agent already answered, set is_a2a=false immediately.","title":"Is
A2A","type":"boolean"}},"required":["a2a_ids","message","is_a2a"],"title":"AgentResponse","type":"object","additionalProperties":false},"name":"AgentResponse","strict":true}},"stream":false}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '1326'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D3qOgAECMjCxhfMRaNqRNLVGefrXr\",\n \"object\":
\"chat.completion\",\n \"created\": 1769808770,\n \"model\": \"gpt-4.1-mini-2025-04-14\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"a2a_ids\\\":[\\\"http://localhost:9999/.well-known/agent-card.json\\\"],\\\"message\\\":\\\"Calculate
10 plus 15.\\\",\\\"is_a2a\\\":true}\",\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 266,\n \"completion_tokens\":
37,\n \"total_tokens\": 303,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_e01c6f58e1\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Fri, 30 Jan 2026 21:32:51 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '795'
- '581'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:

View File

@@ -0,0 +1,781 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are a strategic planning assistant.
Create minimal, effective execution plans. Prefer fewer steps over more."},{"role":"user","content":"Create
a focused execution plan for the following task:\n\n## Task\nWhat is 2 + 2?\n\n##
Expected Output\nComplete the task successfully\n\n## Available Tools\nNo tools
available\n\n## Planning Principles\nFocus on WHAT needs to be accomplished,
not HOW. Group related actions into logical units. Fewer steps = better. Most
tasks need 3-6 steps. Hard limit: 20 steps.\n\n## Step Types (only these are
valid):\n1. **Tool Step**: Uses a tool to gather information or take action\n2.
**Output Step**: Synthesizes prior results into the final deliverable (usually
the last step)\n\n## Rules:\n- Each step must either USE A TOOL or PRODUCE THE
FINAL OUTPUT\n- Combine related tool calls: \"Research A, B, and C\" = ONE step,
not three\n- Combine all synthesis into ONE final output step\n- NO standalone
\"thinking\" steps (review, verify, confirm, refine, analyze) - these happen
naturally between steps\n\nFor each step: State the action, specify the tool
(if any), and note dependencies.\n\nAfter your plan, state READY or NOT READY."}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"create_reasoning_plan","description":"Create
or refine a reasoning plan for a task with structured steps","strict":true,"parameters":{"type":"object","properties":{"plan":{"type":"string","description":"A
brief summary of the overall plan."},"steps":{"type":"array","description":"List
of discrete steps to execute the plan","items":{"type":"object","properties":{"step_number":{"type":"integer","description":"Step
number (1-based)"},"description":{"type":"string","description":"What to do
in this step"},"tool_to_use":{"type":["string","null"],"description":"Tool to
use for this step, or null if no tool needed"},"depends_on":{"type":"array","items":{"type":"integer"},"description":"Step
numbers this step depends on (empty array if none)"}},"required":["step_number","description","tool_to_use","depends_on"],"additionalProperties":false}},"ready":{"type":"boolean","description":"Whether
the agent is ready to execute the task."}},"required":["plan","steps","ready"],"additionalProperties":false}}}]}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '2315'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7sucBVKCmsTak9j942bnJ6N1AuTp\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771750,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": null,\n \"tool_calls\": [\n {\n
\ \"id\": \"call_OBLxVBttHEOnE06W6eBk8udl\",\n \"type\":
\"function\",\n \"function\": {\n \"name\": \"create_reasoning_plan\",\n
\ \"arguments\": \"{\\\"plan\\\":\\\"Calculate the sum of 2 and
2.\\\",\\\"steps\\\":[{\\\"step_number\\\":1,\\\"description\\\":\\\"Perform
the addition of 2 and 2\\\",\\\"tool_to_use\\\":null,\\\"depends_on\\\":[]},{\\\"step_number\\\":2,\\\"description\\\":\\\"Output
the result of the addition\\\",\\\"tool_to_use\\\":null,\\\"depends_on\\\":[1]}],\\\"ready\\\":true}\"\n
\ }\n }\n ],\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"tool_calls\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 440,\n \"completion_tokens\":
84,\n \"total_tokens\": 524,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:33 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '2250'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
assistant that solves math problems step by step\n\nYour goal: Help solve simple
math problems\n\nYou are executing a specific step in a multi-step plan. Focus
ONLY on completing\nthe current step. Do not plan ahead or worry about future
steps.\n\nBefore acting, briefly reason about what you need to do and which
approach\nor tool would be most helpful for this specific step."},{"role":"user","content":"##
Current Step\nPerform the addition of 2 and 2\n\nComplete this step and provide
your result."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '602'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7sufbS872OOIMBzOVOZv0SDcR9OR\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771753,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"To perform the addition of 2 and 2,
I will combine the two numbers:\\n\\n2 + 2 = 4\\n\\nThe result is 4.\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
115,\n \"completion_tokens\": 32,\n \"total_tokens\": 147,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:34 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1407'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a Planning Agent
observing execution progress. After each step completes, you analyze what happened
and decide whether the remaining plan is still valid.\\n\\nReason step-by-step
about:\\n1. What new information was learned from this step's result\\n2. Whether
the remaining steps still make sense given this new information\\n3. What refinements,
if any, are needed for upcoming steps\\n4. Whether the overall goal has already
been achieved\\n\\nBe conservative about triggering full replans \u2014 only
do so when the remaining plan is fundamentally wrong, not just suboptimal.\"},{\"role\":\"user\",\"content\":\"##
Original task\\n\\n\\n## Expected output\\n\\n\\n\\n## Just completed step 1\\nDescription:
Perform the addition of 2 and 2\\nResult: To perform the addition of 2 and 2,
I will combine the two numbers:\\n\\n2 + 2 = 4\\n\\nThe result is 4.\\n\\n##
Remaining plan steps:\\n Step 2: Output the result of the addition\\n\\nAnalyze
this step's result and provide your observation.\"}],\"model\":\"gpt-4o-mini\",\"response_format\":{\"type\":\"json_schema\",\"json_schema\":{\"schema\":{\"description\":\"Planner's
observation after a step execution completes.\\n\\nReturned by the PlannerObserver
after EVERY step \u2014 not just failures.\\nThe Planner uses this to decide
whether to continue, refine, or replan.\\n\\nBased on PLAN-AND-ACT (Section
3.3): the Planner observes what the Executor\\ndid and incorporates new information
into the remaining plan.\\n\\nAttributes:\\n step_completed_successfully:
Whether the step achieved its objective.\\n key_information_learned: New
information revealed by this step\\n (e.g., \\\"Found 3 products: A,
B, C\\\"). Used to refine upcoming steps.\\n remaining_plan_still_valid:
Whether pending todos still make sense\\n given the new information.
True does NOT mean no refinement needed.\\n suggested_refinements: Minor
tweaks to upcoming step descriptions.\\n These are lightweight in-place
updates, not a full replan.\\n Example: [\\\"Step 3 should select product
B instead of 'best product'\\\"]\\n needs_full_replan: The remaining plan
is fundamentally wrong and must\\n be regenerated from scratch. Mutually
exclusive with\\n remaining_plan_still_valid (if this is True, that should
be False).\\n replan_reason: Explanation of why a full replan is needed (None
if not).\\n goal_already_achieved: The overall task goal has been satisfied
early.\\n No more steps needed \u2014 skip remaining todos and finalize.\",\"properties\":{\"step_completed_successfully\":{\"description\":\"Whether
the step achieved what it was asked to do\",\"title\":\"Step Completed Successfully\",\"type\":\"boolean\"},\"key_information_learned\":{\"default\":\"\",\"description\":\"What
new information this step revealed\",\"title\":\"Key Information Learned\",\"type\":\"string\"},\"remaining_plan_still_valid\":{\"default\":true,\"description\":\"Whether
the remaining pending todos still make sense given new information\",\"title\":\"Remaining
Plan Still Valid\",\"type\":\"boolean\"},\"suggested_refinements\":{\"anyOf\":[{\"items\":{\"type\":\"string\"},\"type\":\"array\"},{\"type\":\"null\"}],\"description\":\"Minor
tweaks to descriptions of upcoming steps (lightweight, no full replan)\",\"title\":\"Suggested
Refinements\"},\"needs_full_replan\":{\"default\":false,\"description\":\"The
remaining plan is fundamentally wrong and must be regenerated\",\"title\":\"Needs
Full Replan\",\"type\":\"boolean\"},\"replan_reason\":{\"anyOf\":[{\"type\":\"string\"},{\"type\":\"null\"}],\"description\":\"Explanation
of why a full replan is needed\",\"title\":\"Replan Reason\"},\"goal_already_achieved\":{\"default\":false,\"description\":\"The
overall task goal has been satisfied early; no more steps needed\",\"title\":\"Goal
Already Achieved\",\"type\":\"boolean\"}},\"required\":[\"step_completed_successfully\",\"key_information_learned\",\"remaining_plan_still_valid\",\"suggested_refinements\",\"needs_full_replan\",\"replan_reason\",\"goal_already_achieved\"],\"title\":\"StepObservation\",\"type\":\"object\",\"additionalProperties\":false},\"name\":\"StepObservation\",\"strict\":true}},\"stream\":false}"
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '4026'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7sugkxTuKbiOtwhKkgOPH9A8O7w2\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771754,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"step_completed_successfully\\\":true,\\\"key_information_learned\\\":\\\"The
addition operation was completed successfully, and the result is confirmed
as 4.\\\",\\\"remaining_plan_still_valid\\\":true,\\\"suggested_refinements\\\":null,\\\"needs_full_replan\\\":false,\\\"replan_reason\\\":null,\\\"goal_already_achieved\\\":false}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
788,\n \"completion_tokens\": 68,\n \"total_tokens\": 856,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:36 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1821'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
assistant that solves math problems step by step\n\nYour goal: Help solve simple
math problems\n\nYou are executing a specific step in a multi-step plan. Focus
ONLY on completing\nthe current step. Do not plan ahead or worry about future
steps.\n\nBefore acting, briefly reason about what you need to do and which
approach\nor tool would be most helpful for this specific step."},{"role":"user","content":"##
Current Step\nOutput the result of the addition\n\n## Context from previous
steps:\nStep 1 result: To perform the addition of 2 and 2, I will combine the
two numbers:\n\n2 + 2 = 4\n\nThe result is 4.\n\nComplete this step and provide
your result."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '756'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7suiXp4AZCC6jrd43DcjTgNIn2XM\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771756,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The result of the addition is 4.\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
155,\n \"completion_tokens\": 9,\n \"total_tokens\": 164,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:36 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '385'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a Planning Agent
observing execution progress. After each step completes, you analyze what happened
and decide whether the remaining plan is still valid.\\n\\nReason step-by-step
about:\\n1. What new information was learned from this step's result\\n2. Whether
the remaining steps still make sense given this new information\\n3. What refinements,
if any, are needed for upcoming steps\\n4. Whether the overall goal has already
been achieved\\n\\nBe conservative about triggering full replans \u2014 only
do so when the remaining plan is fundamentally wrong, not just suboptimal.\"},{\"role\":\"user\",\"content\":\"##
Original task\\n\\n\\n## Expected output\\n\\n\\n## Previously completed steps:\\n
\ Step 1: Perform the addition of 2 and 2\\n Result: To perform the addition
of 2 and 2, I will combine the two numbers:\\n\\n2 + 2 = 4\\n\\nThe result is
4.\\n\\n## Just completed step 2\\nDescription: Output the result of the addition\\nResult:
The result of the addition is 4.\\n\\n\\nAnalyze this step's result and provide
your observation.\"}],\"model\":\"gpt-4o-mini\",\"response_format\":{\"type\":\"json_schema\",\"json_schema\":{\"schema\":{\"description\":\"Planner's
observation after a step execution completes.\\n\\nReturned by the PlannerObserver
after EVERY step \u2014 not just failures.\\nThe Planner uses this to decide
whether to continue, refine, or replan.\\n\\nBased on PLAN-AND-ACT (Section
3.3): the Planner observes what the Executor\\ndid and incorporates new information
into the remaining plan.\\n\\nAttributes:\\n step_completed_successfully:
Whether the step achieved its objective.\\n key_information_learned: New
information revealed by this step\\n (e.g., \\\"Found 3 products: A,
B, C\\\"). Used to refine upcoming steps.\\n remaining_plan_still_valid:
Whether pending todos still make sense\\n given the new information.
True does NOT mean no refinement needed.\\n suggested_refinements: Minor
tweaks to upcoming step descriptions.\\n These are lightweight in-place
updates, not a full replan.\\n Example: [\\\"Step 3 should select product
B instead of 'best product'\\\"]\\n needs_full_replan: The remaining plan
is fundamentally wrong and must\\n be regenerated from scratch. Mutually
exclusive with\\n remaining_plan_still_valid (if this is True, that should
be False).\\n replan_reason: Explanation of why a full replan is needed (None
if not).\\n goal_already_achieved: The overall task goal has been satisfied
early.\\n No more steps needed \u2014 skip remaining todos and finalize.\",\"properties\":{\"step_completed_successfully\":{\"description\":\"Whether
the step achieved what it was asked to do\",\"title\":\"Step Completed Successfully\",\"type\":\"boolean\"},\"key_information_learned\":{\"default\":\"\",\"description\":\"What
new information this step revealed\",\"title\":\"Key Information Learned\",\"type\":\"string\"},\"remaining_plan_still_valid\":{\"default\":true,\"description\":\"Whether
the remaining pending todos still make sense given new information\",\"title\":\"Remaining
Plan Still Valid\",\"type\":\"boolean\"},\"suggested_refinements\":{\"anyOf\":[{\"items\":{\"type\":\"string\"},\"type\":\"array\"},{\"type\":\"null\"}],\"description\":\"Minor
tweaks to descriptions of upcoming steps (lightweight, no full replan)\",\"title\":\"Suggested
Refinements\"},\"needs_full_replan\":{\"default\":false,\"description\":\"The
remaining plan is fundamentally wrong and must be regenerated\",\"title\":\"Needs
Full Replan\",\"type\":\"boolean\"},\"replan_reason\":{\"anyOf\":[{\"type\":\"string\"},{\"type\":\"null\"}],\"description\":\"Explanation
of why a full replan is needed\",\"title\":\"Replan Reason\"},\"goal_already_achieved\":{\"default\":false,\"description\":\"The
overall task goal has been satisfied early; no more steps needed\",\"title\":\"Goal
Already Achieved\",\"type\":\"boolean\"}},\"required\":[\"step_completed_successfully\",\"key_information_learned\",\"remaining_plan_still_valid\",\"suggested_refinements\",\"needs_full_replan\",\"replan_reason\",\"goal_already_achieved\"],\"title\":\"StepObservation\",\"type\":\"object\",\"additionalProperties\":false},\"name\":\"StepObservation\",\"strict\":true}},\"stream\":false}"
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '4078'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7sujyEacBTnf7PFkgAkAVfwQKYdh\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771757,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"step_completed_successfully\\\":true,\\\"key_information_learned\\\":\\\"The
result of the addition is confirmed to be 4.\\\",\\\"remaining_plan_still_valid\\\":true,\\\"suggested_refinements\\\":null,\\\"needs_full_replan\\\":false,\\\"replan_reason\\\":null,\\\"goal_already_achieved\\\":true}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
800,\n \"completion_tokens\": 64,\n \"total_tokens\": 864,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:38 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1701'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. You have
completed a multi-step task. Synthesize the results from all steps into a single,
coherent final response that directly addresses the original task. Do NOT list
step numbers or say ''Step 1 result''. Produce a clean, polished answer as if
you did it all at once."},{"role":"user","content":"## Original Task\nWhat is
2 + 2?\n\n## Results from each step\nStep 1 (Perform the addition of 2 and 2):\nTo
perform the addition of 2 and 2, I will combine the two numbers:\n\n2 + 2 =
4\n\nThe result is 4.\n\nStep 2 (Output the result of the addition):\nThe result
of the addition is 4.\n\nSynthesize these results into a single, coherent final
answer."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '742'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7sukM0SIKXBdTM8rRUNeb0mRHpgt\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771758,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The result of adding 2 and 2 is 4.\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
169,\n \"completion_tokens\": 13,\n \"total_tokens\": 182,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:39 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '780'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,216 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
assistant\nYour personal goal is: Help solve simple math problems"},{"role":"user","content":"\nCurrent
Task: What is 3 + 3?\n\nProvide your complete response:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '260'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7stdPdjlDvg5w2x6qhoEmJ9et77Z\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771689,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"3 + 3 equals 6.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
47,\n \"completion_tokens\": 8,\n \"total_tokens\": 55,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:01:29 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '418'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
assistant\nYour personal goal is: Help solve simple math problems"},{"role":"user","content":"\nCurrent
Task: What is 3 + 3?\n\nProvide your complete response:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '260'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7stdUbcdNE8BSmYasTJsGuoLDx3M\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771689,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"3 + 3 equals 6.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
47,\n \"completion_tokens\": 8,\n \"total_tokens\": 55,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:01:30 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '488'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,781 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are a strategic planning assistant.
Create minimal, effective execution plans. Prefer fewer steps over more."},{"role":"user","content":"Create
a focused execution plan for the following task:\n\n## Task\nWhat is 7 + 7?\n\n##
Expected Output\nComplete the task successfully\n\n## Available Tools\nNo tools
available\n\n## Planning Principles\nFocus on WHAT needs to be accomplished,
not HOW. Group related actions into logical units. Fewer steps = better. Most
tasks need 3-6 steps. Hard limit: 20 steps.\n\n## Step Types (only these are
valid):\n1. **Tool Step**: Uses a tool to gather information or take action\n2.
**Output Step**: Synthesizes prior results into the final deliverable (usually
the last step)\n\n## Rules:\n- Each step must either USE A TOOL or PRODUCE THE
FINAL OUTPUT\n- Combine related tool calls: \"Research A, B, and C\" = ONE step,
not three\n- Combine all synthesis into ONE final output step\n- NO standalone
\"thinking\" steps (review, verify, confirm, refine, analyze) - these happen
naturally between steps\n\nFor each step: State the action, specify the tool
(if any), and note dependencies.\n\nAfter your plan, state READY or NOT READY."}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"create_reasoning_plan","description":"Create
or refine a reasoning plan for a task with structured steps","strict":true,"parameters":{"type":"object","properties":{"plan":{"type":"string","description":"A
brief summary of the overall plan."},"steps":{"type":"array","description":"List
of discrete steps to execute the plan","items":{"type":"object","properties":{"step_number":{"type":"integer","description":"Step
number (1-based)"},"description":{"type":"string","description":"What to do
in this step"},"tool_to_use":{"type":["string","null"],"description":"Tool to
use for this step, or null if no tool needed"},"depends_on":{"type":"array","items":{"type":"integer"},"description":"Step
numbers this step depends on (empty array if none)"}},"required":["step_number","description","tool_to_use","depends_on"],"additionalProperties":false}},"ready":{"type":"boolean","description":"Whether
the agent is ready to execute the task."}},"required":["plan","steps","ready"],"additionalProperties":false}}}]}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '2315'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7suSwcSHWUthCW5XkyuQHzQMXtIk\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771740,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": null,\n \"tool_calls\": [\n {\n
\ \"id\": \"call_U2TSsLt52oNJGF73yfdYDwSl\",\n \"type\":
\"function\",\n \"function\": {\n \"name\": \"create_reasoning_plan\",\n
\ \"arguments\": \"{\\\"plan\\\":\\\"Calculate the sum of 7 and
7 and output the result.\\\",\\\"steps\\\":[{\\\"step_number\\\":1,\\\"description\\\":\\\"Perform
the addition of 7 and 7.\\\",\\\"tool_to_use\\\":null,\\\"depends_on\\\":[]},{\\\"step_number\\\":2,\\\"description\\\":\\\"Output
the result of the addition.\\\",\\\"tool_to_use\\\":null,\\\"depends_on\\\":[1]}],\\\"ready\\\":true}\"\n
\ }\n }\n ],\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"tool_calls\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 440,\n \"completion_tokens\":
88,\n \"total_tokens\": 528,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:23 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '2181'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
assistant that solves math problems step by step\n\nYour goal: Help solve simple
math problems\n\nYou are executing a specific step in a multi-step plan. Focus
ONLY on completing\nthe current step. Do not plan ahead or worry about future
steps.\n\nBefore acting, briefly reason about what you need to do and which
approach\nor tool would be most helpful for this specific step."},{"role":"user","content":"##
Current Step\nPerform the addition of 7 and 7.\n\nComplete this step and provide
your result."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '603'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7suVIVWV7aDQ1ULGhJZ2IW2m3t8N\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771743,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"To complete the addition of 7 and 7,
I simply need to add the two numbers together.\\n\\n7 + 7 = 14\\n\\nThe result
of the addition is 14.\",\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 115,\n \"completion_tokens\":
38,\n \"total_tokens\": 153,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:24 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1307'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a Planning Agent
observing execution progress. After each step completes, you analyze what happened
and decide whether the remaining plan is still valid.\\n\\nReason step-by-step
about:\\n1. What new information was learned from this step's result\\n2. Whether
the remaining steps still make sense given this new information\\n3. What refinements,
if any, are needed for upcoming steps\\n4. Whether the overall goal has already
been achieved\\n\\nBe conservative about triggering full replans \u2014 only
do so when the remaining plan is fundamentally wrong, not just suboptimal.\"},{\"role\":\"user\",\"content\":\"##
Original task\\n\\n\\n## Expected output\\n\\n\\n\\n## Just completed step 1\\nDescription:
Perform the addition of 7 and 7.\\nResult: To complete the addition of 7 and
7, I simply need to add the two numbers together.\\n\\n7 + 7 = 14\\n\\nThe result
of the addition is 14.\\n\\n## Remaining plan steps:\\n Step 2: Output the
result of the addition.\\n\\nAnalyze this step's result and provide your observation.\"}],\"model\":\"gpt-4o-mini\",\"response_format\":{\"type\":\"json_schema\",\"json_schema\":{\"schema\":{\"description\":\"Planner's
observation after a step execution completes.\\n\\nReturned by the PlannerObserver
after EVERY step \u2014 not just failures.\\nThe Planner uses this to decide
whether to continue, refine, or replan.\\n\\nBased on PLAN-AND-ACT (Section
3.3): the Planner observes what the Executor\\ndid and incorporates new information
into the remaining plan.\\n\\nAttributes:\\n step_completed_successfully:
Whether the step achieved its objective.\\n key_information_learned: New
information revealed by this step\\n (e.g., \\\"Found 3 products: A,
B, C\\\"). Used to refine upcoming steps.\\n remaining_plan_still_valid:
Whether pending todos still make sense\\n given the new information.
True does NOT mean no refinement needed.\\n suggested_refinements: Minor
tweaks to upcoming step descriptions.\\n These are lightweight in-place
updates, not a full replan.\\n Example: [\\\"Step 3 should select product
B instead of 'best product'\\\"]\\n needs_full_replan: The remaining plan
is fundamentally wrong and must\\n be regenerated from scratch. Mutually
exclusive with\\n remaining_plan_still_valid (if this is True, that should
be False).\\n replan_reason: Explanation of why a full replan is needed (None
if not).\\n goal_already_achieved: The overall task goal has been satisfied
early.\\n No more steps needed \u2014 skip remaining todos and finalize.\",\"properties\":{\"step_completed_successfully\":{\"description\":\"Whether
the step achieved what it was asked to do\",\"title\":\"Step Completed Successfully\",\"type\":\"boolean\"},\"key_information_learned\":{\"default\":\"\",\"description\":\"What
new information this step revealed\",\"title\":\"Key Information Learned\",\"type\":\"string\"},\"remaining_plan_still_valid\":{\"default\":true,\"description\":\"Whether
the remaining pending todos still make sense given new information\",\"title\":\"Remaining
Plan Still Valid\",\"type\":\"boolean\"},\"suggested_refinements\":{\"anyOf\":[{\"items\":{\"type\":\"string\"},\"type\":\"array\"},{\"type\":\"null\"}],\"description\":\"Minor
tweaks to descriptions of upcoming steps (lightweight, no full replan)\",\"title\":\"Suggested
Refinements\"},\"needs_full_replan\":{\"default\":false,\"description\":\"The
remaining plan is fundamentally wrong and must be regenerated\",\"title\":\"Needs
Full Replan\",\"type\":\"boolean\"},\"replan_reason\":{\"anyOf\":[{\"type\":\"string\"},{\"type\":\"null\"}],\"description\":\"Explanation
of why a full replan is needed\",\"title\":\"Replan Reason\"},\"goal_already_achieved\":{\"default\":false,\"description\":\"The
overall task goal has been satisfied early; no more steps needed\",\"title\":\"Goal
Already Achieved\",\"type\":\"boolean\"}},\"required\":[\"step_completed_successfully\",\"key_information_learned\",\"remaining_plan_still_valid\",\"suggested_refinements\",\"needs_full_replan\",\"replan_reason\",\"goal_already_achieved\"],\"title\":\"StepObservation\",\"type\":\"object\",\"additionalProperties\":false},\"name\":\"StepObservation\",\"strict\":true}},\"stream\":false}"
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '4062'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7suWrHhBaobGj8G7QfmDuaDNCQHC\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771744,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"step_completed_successfully\\\":true,\\\"key_information_learned\\\":\\\"The
addition of 7 and 7 was completed successfully, resulting in 14.\\\",\\\"remaining_plan_still_valid\\\":true,\\\"suggested_refinements\\\":null,\\\"needs_full_replan\\\":false,\\\"replan_reason\\\":null,\\\"goal_already_achieved\\\":false}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
794,\n \"completion_tokens\": 69,\n \"total_tokens\": 863,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:26 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '2183'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
assistant that solves math problems step by step\n\nYour goal: Help solve simple
math problems\n\nYou are executing a specific step in a multi-step plan. Focus
ONLY on completing\nthe current step. Do not plan ahead or worry about future
steps.\n\nBefore acting, briefly reason about what you need to do and which
approach\nor tool would be most helpful for this specific step."},{"role":"user","content":"##
Current Step\nOutput the result of the addition.\n\n## Context from previous
steps:\nStep 1 result: To complete the addition of 7 and 7, I simply need to
add the two numbers together.\n\n7 + 7 = 14\n\nThe result of the addition is
14.\n\nComplete this step and provide your result."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '791'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7suYOGXuRR5qr0lVonrkrXRWUA7p\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771746,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The result of the addition is 14.\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
161,\n \"completion_tokens\": 9,\n \"total_tokens\": 170,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:27 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '545'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a Planning Agent
observing execution progress. After each step completes, you analyze what happened
and decide whether the remaining plan is still valid.\\n\\nReason step-by-step
about:\\n1. What new information was learned from this step's result\\n2. Whether
the remaining steps still make sense given this new information\\n3. What refinements,
if any, are needed for upcoming steps\\n4. Whether the overall goal has already
been achieved\\n\\nBe conservative about triggering full replans \u2014 only
do so when the remaining plan is fundamentally wrong, not just suboptimal.\"},{\"role\":\"user\",\"content\":\"##
Original task\\n\\n\\n## Expected output\\n\\n\\n## Previously completed steps:\\n
\ Step 1: Perform the addition of 7 and 7.\\n Result: To complete the addition
of 7 and 7, I simply need to add the two numbers together.\\n\\n7 + 7 = 14\\n\\nThe
result of the addition is 14.\\n\\n## Just completed step 2\\nDescription: Output
the result of the addition.\\nResult: The result of the addition is 14.\\n\\n\\nAnalyze
this step's result and provide your observation.\"}],\"model\":\"gpt-4o-mini\",\"response_format\":{\"type\":\"json_schema\",\"json_schema\":{\"schema\":{\"description\":\"Planner's
observation after a step execution completes.\\n\\nReturned by the PlannerObserver
after EVERY step \u2014 not just failures.\\nThe Planner uses this to decide
whether to continue, refine, or replan.\\n\\nBased on PLAN-AND-ACT (Section
3.3): the Planner observes what the Executor\\ndid and incorporates new information
into the remaining plan.\\n\\nAttributes:\\n step_completed_successfully:
Whether the step achieved its objective.\\n key_information_learned: New
information revealed by this step\\n (e.g., \\\"Found 3 products: A,
B, C\\\"). Used to refine upcoming steps.\\n remaining_plan_still_valid:
Whether pending todos still make sense\\n given the new information.
True does NOT mean no refinement needed.\\n suggested_refinements: Minor
tweaks to upcoming step descriptions.\\n These are lightweight in-place
updates, not a full replan.\\n Example: [\\\"Step 3 should select product
B instead of 'best product'\\\"]\\n needs_full_replan: The remaining plan
is fundamentally wrong and must\\n be regenerated from scratch. Mutually
exclusive with\\n remaining_plan_still_valid (if this is True, that should
be False).\\n replan_reason: Explanation of why a full replan is needed (None
if not).\\n goal_already_achieved: The overall task goal has been satisfied
early.\\n No more steps needed \u2014 skip remaining todos and finalize.\",\"properties\":{\"step_completed_successfully\":{\"description\":\"Whether
the step achieved what it was asked to do\",\"title\":\"Step Completed Successfully\",\"type\":\"boolean\"},\"key_information_learned\":{\"default\":\"\",\"description\":\"What
new information this step revealed\",\"title\":\"Key Information Learned\",\"type\":\"string\"},\"remaining_plan_still_valid\":{\"default\":true,\"description\":\"Whether
the remaining pending todos still make sense given new information\",\"title\":\"Remaining
Plan Still Valid\",\"type\":\"boolean\"},\"suggested_refinements\":{\"anyOf\":[{\"items\":{\"type\":\"string\"},\"type\":\"array\"},{\"type\":\"null\"}],\"description\":\"Minor
tweaks to descriptions of upcoming steps (lightweight, no full replan)\",\"title\":\"Suggested
Refinements\"},\"needs_full_replan\":{\"default\":false,\"description\":\"The
remaining plan is fundamentally wrong and must be regenerated\",\"title\":\"Needs
Full Replan\",\"type\":\"boolean\"},\"replan_reason\":{\"anyOf\":[{\"type\":\"string\"},{\"type\":\"null\"}],\"description\":\"Explanation
of why a full replan is needed\",\"title\":\"Replan Reason\"},\"goal_already_achieved\":{\"default\":false,\"description\":\"The
overall task goal has been satisfied early; no more steps needed\",\"title\":\"Goal
Already Achieved\",\"type\":\"boolean\"}},\"required\":[\"step_completed_successfully\",\"key_information_learned\",\"remaining_plan_still_valid\",\"suggested_refinements\",\"needs_full_replan\",\"replan_reason\",\"goal_already_achieved\"],\"title\":\"StepObservation\",\"type\":\"object\",\"additionalProperties\":false},\"name\":\"StepObservation\",\"strict\":true}},\"stream\":false}"
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '4115'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7suZYBzACUtEwoXNEQb19EtNeYCp\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771747,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"step_completed_successfully\\\":true,\\\"key_information_learned\\\":\\\"The
result of the addition is confirmed to be 14.\\\",\\\"remaining_plan_still_valid\\\":true,\\\"suggested_refinements\\\":null,\\\"needs_full_replan\\\":false,\\\"replan_reason\\\":null,\\\"goal_already_achieved\\\":true}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
806,\n \"completion_tokens\": 64,\n \"total_tokens\": 870,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:29 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1923'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. You have
completed a multi-step task. Synthesize the results from all steps into a single,
coherent final response that directly addresses the original task. Do NOT list
step numbers or say ''Step 1 result''. Produce a clean, polished answer as if
you did it all at once."},{"role":"user","content":"## Original Task\nWhat is
7 + 7?\n\n## Results from each step\nStep 1 (Perform the addition of 7 and 7.):\nTo
complete the addition of 7 and 7, I simply need to add the two numbers together.\n\n7
+ 7 = 14\n\nThe result of the addition is 14.\n\nStep 2 (Output the result of
the addition.):\nThe result of the addition is 14.\n\nSynthesize these results
into a single, coherent final answer."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '779'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7sub1U8YMbFE8GIiDM24zWL3SsTC\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771749,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The result of adding 7 and 7 is 14.\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
177,\n \"completion_tokens\": 13,\n \"total_tokens\": 190,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:02:30 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '970'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,288 @@
interactions:
- request:
body: '{"trace_id": "04bd841e-3789-4abb-98c6-687c1cff830e", "execution_type":
"crew", "user_identifier": null, "execution_context": {"crew_fingerprint": null,
"crew_name": "Unknown Crew", "flow_name": null, "crewai_version": "1.9.3", "privacy_level":
"standard"}, "execution_metadata": {"expected_duration_estimate": 300, "agent_count":
0, "task_count": 0, "flow_method_count": 0, "execution_started_at": "2026-02-11T01:01:27.459831+00:00"}}'
headers:
Accept:
- '*/*'
Connection:
- keep-alive
Content-Length:
- '434'
Content-Type:
- application/json
User-Agent:
- X-USER-AGENT-XXX
X-Crewai-Version:
- 1.9.3
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
method: POST
uri: https://app.crewai.com/crewai_plus/api/v1/tracing/batches
response:
body:
string: '{"id":"99dfdaf3-9dde-4f81-83dd-56f70fe6cf54","trace_id":"04bd841e-3789-4abb-98c6-687c1cff830e","execution_type":"crew","crew_name":"Unknown
Crew","flow_name":null,"status":"running","duration_ms":null,"crewai_version":"1.9.3","privacy_level":"standard","total_events":0,"execution_context":{"crew_fingerprint":null,"crew_name":"Unknown
Crew","flow_name":null,"crewai_version":"1.9.3","privacy_level":"standard"},"created_at":"2026-02-11T01:01:28.179Z","updated_at":"2026-02-11T01:01:28.179Z"}'
headers:
Connection:
- keep-alive
Content-Length:
- '492'
Content-Type:
- application/json; charset=utf-8
Date:
- Wed, 11 Feb 2026 01:01:28 GMT
cache-control:
- no-store
content-security-policy:
- CSP-FILTERED
etag:
- ETAG-XXX
expires:
- '0'
permissions-policy:
- PERMISSIONS-POLICY-XXX
pragma:
- no-cache
referrer-policy:
- REFERRER-POLICY-XXX
strict-transport-security:
- STS-XXX
vary:
- Accept
x-content-type-options:
- X-CONTENT-TYPE-XXX
x-frame-options:
- X-FRAME-OPTIONS-XXX
x-permitted-cross-domain-policies:
- X-PERMITTED-XXX
x-request-id:
- X-REQUEST-ID-XXX
x-runtime:
- X-RUNTIME-XXX
x-xss-protection:
- X-XSS-PROTECTION-XXX
status:
code: 201
message: Created
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
assistant\nYour personal goal is: Help solve simple math problems"},{"role":"user","content":"\nCurrent
Task: What is 5 + 5?\n\nProvide your complete response:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '260'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7stbtbYabIcBKefHLeVdVO1W2iYW\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771687,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"5 + 5 equals 10.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
47,\n \"completion_tokens\": 8,\n \"total_tokens\": 55,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:01:28 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '491'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
assistant\nYour personal goal is: Help solve simple math problems"},{"role":"user","content":"\nCurrent
Task: What is 5 + 5?\n\nProvide your complete response:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '260'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7stcH0BgLGTNbMg989g1vQNjKTaf\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771688,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"5 + 5 equals 10.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
47,\n \"completion_tokens\": 8,\n \"total_tokens\": 55,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:01:28 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '369'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,531 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are a strategic planning assistant.
Create minimal, effective execution plans. Prefer fewer steps over more."},{"role":"user","content":"Create
a focused execution plan for the following task:\n\n## Task\nWhat is the sum
of the first 3 prime numbers (2, 3, 5)?\n\n## Expected Output\nComplete the
task successfully\n\n## Available Tools\nNo tools available\n\n## Planning Principles\nFocus
on WHAT needs to be accomplished, not HOW. Group related actions into logical
units. Fewer steps = better. Most tasks need 3-6 steps. Hard limit: 10 steps.\n\n##
Step Types (only these are valid):\n1. **Tool Step**: Uses a tool to gather
information or take action\n2. **Output Step**: Synthesizes prior results into
the final deliverable (usually the last step)\n\n## Rules:\n- Each step must
either USE A TOOL or PRODUCE THE FINAL OUTPUT\n- Combine related tool calls:
\"Research A, B, and C\" = ONE step, not three\n- Combine all synthesis into
ONE final output step\n- NO standalone \"thinking\" steps (review, verify, confirm,
refine, analyze) - these happen naturally between steps\n\nFor each step: State
the action, specify the tool (if any), and note dependencies.\n\nAfter your
plan, state READY or NOT READY."}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"create_reasoning_plan","description":"Create
or refine a reasoning plan for a task with structured steps","strict":true,"parameters":{"type":"object","properties":{"plan":{"type":"string","description":"A
brief summary of the overall plan."},"steps":{"type":"array","description":"List
of discrete steps to execute the plan","items":{"type":"object","properties":{"step_number":{"type":"integer","description":"Step
number (1-based)"},"description":{"type":"string","description":"What to do
in this step"},"tool_to_use":{"type":["string","null"],"description":"Tool to
use for this step, or null if no tool needed"},"depends_on":{"type":"array","items":{"type":"integer"},"description":"Step
numbers this step depends on (empty array if none)"}},"required":["step_number","description","tool_to_use","depends_on"],"additionalProperties":false}},"ready":{"type":"boolean","description":"Whether
the agent is ready to execute the task."}},"required":["plan","steps","ready"],"additionalProperties":false}}}]}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '2356'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8ASpXScmkjCvYXrpSglYS9VdaeLy\",\n \"object\":
\"chat.completion\",\n \"created\": 1770839219,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": null,\n \"tool_calls\": [\n {\n
\ \"id\": \"call_K1VODILTLdRP4zq4We3Lij31\",\n \"type\":
\"function\",\n \"function\": {\n \"name\": \"create_reasoning_plan\",\n
\ \"arguments\": \"{\\\"plan\\\":\\\"Calculate and output the
sum of the first 3 prime numbers (2, 3, 5).\\\",\\\"steps\\\":[{\\\"step_number\\\":1,\\\"description\\\":\\\"Add
the first three prime numbers together (2 + 3 + 5).\\\",\\\"tool_to_use\\\":null,\\\"depends_on\\\":[]},{\\\"step_number\\\":2,\\\"description\\\":\\\"Output
the sum of the prime numbers calculated in the previous step.\\\",\\\"tool_to_use\\\":null,\\\"depends_on\\\":[1]}],\\\"ready\\\":true}\"\n
\ }\n }\n ],\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"tool_calls\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 452,\n \"completion_tokens\":
109,\n \"total_tokens\": 561,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 19:47:01 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1976'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Tutor. An expert
math tutor who breaks down problems step by step\n\nYour goal: Solve multi-step
math problems accurately\n\nYou are executing a specific step in a multi-step
plan. Focus ONLY on completing\nthe current step. Do not plan ahead or worry
about future steps.\n\nBefore acting, briefly reason about what you need to
do and which approach\nor tool would be most helpful for this specific step."},{"role":"user","content":"##
Current Step\nAdd the first three prime numbers together (2 + 3 + 5).\n\nComplete
this step and provide your result."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '632'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8ASrd7C80o15d4VOVG0fLWigtQkX\",\n \"object\":
\"chat.completion\",\n \"created\": 1770839221,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"To complete this step, I need to add
the first three prime numbers, which are 2, 3, and 5.\\n\\nLet's perform the
addition:\\n\\n1. Start with the first two numbers: \\n \\\\( 2 + 3 = 5
\\\\)\\n\\n2. Now, add the third prime number to this result:\\n \\\\( 5
+ 5 = 10 \\\\)\\n\\nSo, the sum of the first three prime numbers (2 + 3 +
5) is **10**.\",\n \"refusal\": null,\n \"annotations\": []\n
\ },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n
\ ],\n \"usage\": {\n \"prompt_tokens\": 123,\n \"completion_tokens\":
103,\n \"total_tokens\": 226,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 19:47:03 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '2039'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a Planning Agent
observing execution progress. After each step completes, you analyze what happened
and decide whether the remaining plan is still valid.\\n\\nReason step-by-step
about:\\n1. What new information was learned from this step's result\\n2. Whether
the remaining steps still make sense given this new information\\n3. What refinements,
if any, are needed for upcoming steps\\n4. Whether the overall goal has already
been achieved\\n\\nBe conservative about triggering full replans \u2014 only
do so when the remaining plan is fundamentally wrong, not just suboptimal.\"},{\"role\":\"user\",\"content\":\"##
Original task\\n\\n\\n## Expected output\\n\\n\\n\\n## Just completed step 1\\nDescription:
Add the first three prime numbers together (2 + 3 + 5).\\nResult: To complete
this step, I need to add the first three prime numbers, which are 2, 3, and
5.\\n\\nLet's perform the addition:\\n\\n1. Start with the first two numbers:
\\n \\\\( 2 + 3 = 5 \\\\)\\n\\n2. Now, add the third prime number to this
result:\\n \\\\( 5 + 5 = 10 \\\\)\\n\\nSo, the sum of the first three prime
numbers (2 + 3 + 5) is **10**.\\n\\n## Remaining plan steps:\\n Step 2: Output
the sum of the prime numbers calculated in the previous step.\\n\\nAnalyze this
step's result and provide your observation.\"}],\"model\":\"gpt-4o-mini\",\"response_format\":{\"type\":\"json_schema\",\"json_schema\":{\"schema\":{\"description\":\"Planner's
observation after a step execution completes.\\n\\nReturned by the PlannerObserver
after EVERY step \u2014 not just failures.\\nThe Planner uses this to decide
whether to continue, refine, or replan.\\n\\nBased on PLAN-AND-ACT (Section
3.3): the Planner observes what the Executor\\ndid and incorporates new information
into the remaining plan.\\n\\nAttributes:\\n step_completed_successfully:
Whether the step achieved its objective.\\n key_information_learned: New
information revealed by this step\\n (e.g., \\\"Found 3 products: A,
B, C\\\"). Used to refine upcoming steps.\\n remaining_plan_still_valid:
Whether pending todos still make sense\\n given the new information.
True does NOT mean no refinement needed.\\n suggested_refinements: Minor
tweaks to upcoming step descriptions.\\n These are lightweight in-place
updates, not a full replan.\\n Example: [\\\"Step 3 should select product
B instead of 'best product'\\\"]\\n needs_full_replan: The remaining plan
is fundamentally wrong and must\\n be regenerated from scratch. Mutually
exclusive with\\n remaining_plan_still_valid (if this is True, that should
be False).\\n replan_reason: Explanation of why a full replan is needed (None
if not).\\n goal_already_achieved: The overall task goal has been satisfied
early.\\n No more steps needed \u2014 skip remaining todos and finalize.\",\"properties\":{\"step_completed_successfully\":{\"description\":\"Whether
the step achieved what it was asked to do\",\"title\":\"Step Completed Successfully\",\"type\":\"boolean\"},\"key_information_learned\":{\"default\":\"\",\"description\":\"What
new information this step revealed\",\"title\":\"Key Information Learned\",\"type\":\"string\"},\"remaining_plan_still_valid\":{\"default\":true,\"description\":\"Whether
the remaining pending todos still make sense given new information\",\"title\":\"Remaining
Plan Still Valid\",\"type\":\"boolean\"},\"suggested_refinements\":{\"anyOf\":[{\"items\":{\"type\":\"string\"},\"type\":\"array\"},{\"type\":\"null\"}],\"description\":\"Minor
tweaks to descriptions of upcoming steps (lightweight, no full replan)\",\"title\":\"Suggested
Refinements\"},\"needs_full_replan\":{\"default\":false,\"description\":\"The
remaining plan is fundamentally wrong and must be regenerated\",\"title\":\"Needs
Full Replan\",\"type\":\"boolean\"},\"replan_reason\":{\"anyOf\":[{\"type\":\"string\"},{\"type\":\"null\"}],\"description\":\"Explanation
of why a full replan is needed\",\"title\":\"Replan Reason\"},\"goal_already_achieved\":{\"default\":false,\"description\":\"The
overall task goal has been satisfied early; no more steps needed\",\"title\":\"Goal
Already Achieved\",\"type\":\"boolean\"}},\"required\":[\"step_completed_successfully\",\"key_information_learned\",\"remaining_plan_still_valid\",\"suggested_refinements\",\"needs_full_replan\",\"replan_reason\",\"goal_already_achieved\"],\"title\":\"StepObservation\",\"type\":\"object\",\"additionalProperties\":false},\"name\":\"StepObservation\",\"strict\":true}},\"stream\":false}"
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '4317'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8ASt8ugNp11vguwfJowpy4kzqrau\",\n \"object\":
\"chat.completion\",\n \"created\": 1770839223,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"step_completed_successfully\\\":false,\\\"key_information_learned\\\":\\\"The
sum of the first three prime numbers was incorrectly calculated as 10 instead
of 10 (2 + 3 + 5 = 10).\\\",\\\"remaining_plan_still_valid\\\":false,\\\"suggested_refinements\\\":null,\\\"needs_full_replan\\\":true,\\\"replan_reason\\\":\\\"The
calculation of the sum was mistakenly described, leading to incorrect logic
in the execution despite the correct result. The methodology needs to be re-evaluated
for accuracy in the next steps.\\\",\\\"goal_already_achieved\\\":false}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
871,\n \"completion_tokens\": 118,\n \"total_tokens\": 989,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 19:47:05 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1596'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Tutor. You have completed
a multi-step task. Synthesize the results from all steps into a single, coherent
final response that directly addresses the original task. Do NOT list step numbers
or say ''Step 1 result''. Produce a clean, polished answer as if you did it
all at once."},{"role":"user","content":"## Original Task\nWhat is the sum of
the first 3 prime numbers (2, 3, 5)?\n\n## Results from each step\nStep 1 (Add
the first three prime numbers together (2 + 3 + 5).):\nTo complete this step,
I need to add the first three prime numbers, which are 2, 3, and 5.\n\nLet''s
perform the addition:\n\n1. Start with the first two numbers: \n \\( 2 + 3
= 5 \\)\n\n2. Now, add the third prime number to this result:\n \\( 5 + 5
= 10 \\)\n\nSo, the sum of the first three prime numbers (2 + 3 + 5) is **10**.\n\nSynthesize
these results into a single, coherent final answer."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '954'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8ASvDlPAChhWPWC7w9AlFIAHPzgs\",\n \"object\":
\"chat.completion\",\n \"created\": 1770839225,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The sum of the first three prime numbers,
which are 2, 3, and 5, is \\\\( 10 \\\\). This is calculated by adding them
together: \\\\( 2 + 3 = 5 \\\\) and then \\\\( 5 + 5 = 10 \\\\).\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
239,\n \"completion_tokens\": 59,\n \"total_tokens\": 298,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 19:47:07 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1709'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,870 @@
interactions:
- request:
body: '{"trace_id": "348ec567-4ae7-4aef-b3a9-85676d0bf6b3", "execution_type":
"crew", "user_identifier": null, "execution_context": {"crew_fingerprint": null,
"crew_name": "Unknown Crew", "flow_name": null, "crewai_version": "1.9.3", "privacy_level":
"standard"}, "execution_metadata": {"expected_duration_estimate": 300, "agent_count":
0, "task_count": 0, "flow_method_count": 0, "execution_started_at": "2026-02-11T19:46:48.989695+00:00"}}'
headers:
Accept:
- '*/*'
Connection:
- keep-alive
Content-Length:
- '434'
Content-Type:
- application/json
User-Agent:
- X-USER-AGENT-XXX
X-Crewai-Organization-Id:
- 3433f0ee-8a94-4aa4-822b-2ac71aa38b18
X-Crewai-Version:
- 1.9.3
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
method: POST
uri: https://app.crewai.com/crewai_plus/api/v1/tracing/batches
response:
body:
string: '{"id":"5f9332a5-d726-4e6f-8ff7-44b5706c8d28","trace_id":"348ec567-4ae7-4aef-b3a9-85676d0bf6b3","execution_type":"crew","crew_name":"Unknown
Crew","flow_name":null,"status":"running","duration_ms":null,"crewai_version":"1.9.3","privacy_level":"standard","total_events":0,"execution_context":{"crew_fingerprint":null,"crew_name":"Unknown
Crew","flow_name":null,"crewai_version":"1.9.3","privacy_level":"standard"},"created_at":"2026-02-11T19:46:49.557Z","updated_at":"2026-02-11T19:46:49.557Z"}'
headers:
Connection:
- keep-alive
Content-Length:
- '492'
Content-Type:
- application/json; charset=utf-8
Date:
- Wed, 11 Feb 2026 19:46:49 GMT
cache-control:
- no-store
content-security-policy:
- CSP-FILTERED
etag:
- ETAG-XXX
expires:
- '0'
permissions-policy:
- PERMISSIONS-POLICY-XXX
pragma:
- no-cache
referrer-policy:
- REFERRER-POLICY-XXX
strict-transport-security:
- STS-XXX
vary:
- Accept
x-content-type-options:
- X-CONTENT-TYPE-XXX
x-frame-options:
- X-FRAME-OPTIONS-XXX
x-permitted-cross-domain-policies:
- X-PERMITTED-XXX
x-request-id:
- X-REQUEST-ID-XXX
x-runtime:
- X-RUNTIME-XXX
x-xss-protection:
- X-XSS-PROTECTION-XXX
status:
code: 201
message: Created
- request:
body: '{"messages":[{"role":"system","content":"You are a strategic planning assistant.
Create minimal, effective execution plans. Prefer fewer steps over more."},{"role":"user","content":"Create
a focused execution plan for the following task:\n\n## Task\nWhat is the sum
of the first 3 prime numbers (2, 3, 5)?\n\n## Expected Output\nComplete the
task successfully\n\n## Available Tools\nNo tools available\n\n## Planning Principles\nFocus
on WHAT needs to be accomplished, not HOW. Group related actions into logical
units. Fewer steps = better. Most tasks need 3-6 steps. Hard limit: 10 steps.\n\n##
Step Types (only these are valid):\n1. **Tool Step**: Uses a tool to gather
information or take action\n2. **Output Step**: Synthesizes prior results into
the final deliverable (usually the last step)\n\n## Rules:\n- Each step must
either USE A TOOL or PRODUCE THE FINAL OUTPUT\n- Combine related tool calls:
\"Research A, B, and C\" = ONE step, not three\n- Combine all synthesis into
ONE final output step\n- NO standalone \"thinking\" steps (review, verify, confirm,
refine, analyze) - these happen naturally between steps\n\nFor each step: State
the action, specify the tool (if any), and note dependencies.\n\nAfter your
plan, state READY or NOT READY."}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"create_reasoning_plan","description":"Create
or refine a reasoning plan for a task with structured steps","strict":true,"parameters":{"type":"object","properties":{"plan":{"type":"string","description":"A
brief summary of the overall plan."},"steps":{"type":"array","description":"List
of discrete steps to execute the plan","items":{"type":"object","properties":{"step_number":{"type":"integer","description":"Step
number (1-based)"},"description":{"type":"string","description":"What to do
in this step"},"tool_to_use":{"type":["string","null"],"description":"Tool to
use for this step, or null if no tool needed"},"depends_on":{"type":"array","items":{"type":"integer"},"description":"Step
numbers this step depends on (empty array if none)"}},"required":["step_number","description","tool_to_use","depends_on"],"additionalProperties":false}},"ready":{"type":"boolean","description":"Whether
the agent is ready to execute the task."}},"required":["plan","steps","ready"],"additionalProperties":false}}}]}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '2356'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8ASfR7yCL8K7uW5ViSLGB8kwbT5n\",\n \"object\":
\"chat.completion\",\n \"created\": 1770839209,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": null,\n \"tool_calls\": [\n {\n
\ \"id\": \"call_paXM4kgZ7yZqVeW73tLwY9Lm\",\n \"type\":
\"function\",\n \"function\": {\n \"name\": \"create_reasoning_plan\",\n
\ \"arguments\": \"{\\\"plan\\\":\\\"Calculate the sum of the
first 3 prime numbers (2, 3, 5).\\\",\\\"steps\\\":[{\\\"step_number\\\":1,\\\"description\\\":\\\"Add
the first three prime numbers (2, 3, 5).\\\",\\\"tool_to_use\\\":null,\\\"depends_on\\\":[]},{\\\"step_number\\\":2,\\\"description\\\":\\\"Output
the result of the sum.\\\",\\\"tool_to_use\\\":null,\\\"depends_on\\\":[1]}],\\\"ready\\\":true}\"\n
\ }\n }\n ],\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"tool_calls\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 452,\n \"completion_tokens\":
100,\n \"total_tokens\": 552,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 19:46:51 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1840'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Tutor. An expert
math tutor who breaks down problems step by step\n\nYour goal: Solve multi-step
math problems accurately\n\nYou are executing a specific step in a multi-step
plan. Focus ONLY on completing\nthe current step. Do not plan ahead or worry
about future steps.\n\nBefore acting, briefly reason about what you need to
do and which approach\nor tool would be most helpful for this specific step."},{"role":"user","content":"##
Current Step\nAdd the first three prime numbers (2, 3, 5).\n\nComplete this
step and provide your result."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '621'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8AShR9nlcqjmiKdUiBN8Ng5nrRdG\",\n \"object\":
\"chat.completion\",\n \"created\": 1770839211,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"To complete this step, I need to add
the first three prime numbers: 2, 3, and 5.\\n\\n1. Start by adding 2 and
3:\\n \\\\( 2 + 3 = 5 \\\\)\\n\\n2. Now, add the result (5) to the next
prime number (5):\\n \\\\( 5 + 5 = 10 \\\\)\\n\\nSo, the result of adding
the first three prime numbers (2, 3, 5) is \\\\( 10 \\\\).\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
122,\n \"completion_tokens\": 104,\n \"total_tokens\": 226,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 19:46:53 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1740'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a Planning Agent
observing execution progress. After each step completes, you analyze what happened
and decide whether the remaining plan is still valid.\\n\\nReason step-by-step
about:\\n1. What new information was learned from this step's result\\n2. Whether
the remaining steps still make sense given this new information\\n3. What refinements,
if any, are needed for upcoming steps\\n4. Whether the overall goal has already
been achieved\\n\\nBe conservative about triggering full replans \u2014 only
do so when the remaining plan is fundamentally wrong, not just suboptimal.\"},{\"role\":\"user\",\"content\":\"##
Original task\\n\\n\\n## Expected output\\n\\n\\n\\n## Just completed step 1\\nDescription:
Add the first three prime numbers (2, 3, 5).\\nResult: To complete this step,
I need to add the first three prime numbers: 2, 3, and 5.\\n\\n1. Start by adding
2 and 3:\\n \\\\( 2 + 3 = 5 \\\\)\\n\\n2. Now, add the result (5) to the next
prime number (5):\\n \\\\( 5 + 5 = 10 \\\\)\\n\\nSo, the result of adding
the first three prime numbers (2, 3, 5) is \\\\( 10 \\\\).\\n\\n## Remaining
plan steps:\\n Step 2: Output the result of the sum.\\n\\nAnalyze this step's
result and provide your observation.\"}],\"model\":\"gpt-4o-mini\",\"response_format\":{\"type\":\"json_schema\",\"json_schema\":{\"schema\":{\"description\":\"Planner's
observation after a step execution completes.\\n\\nReturned by the PlannerObserver
after EVERY step \u2014 not just failures.\\nThe Planner uses this to decide
whether to continue, refine, or replan.\\n\\nBased on PLAN-AND-ACT (Section
3.3): the Planner observes what the Executor\\ndid and incorporates new information
into the remaining plan.\\n\\nAttributes:\\n step_completed_successfully:
Whether the step achieved its objective.\\n key_information_learned: New
information revealed by this step\\n (e.g., \\\"Found 3 products: A,
B, C\\\"). Used to refine upcoming steps.\\n remaining_plan_still_valid:
Whether pending todos still make sense\\n given the new information.
True does NOT mean no refinement needed.\\n suggested_refinements: Minor
tweaks to upcoming step descriptions.\\n These are lightweight in-place
updates, not a full replan.\\n Example: [\\\"Step 3 should select product
B instead of 'best product'\\\"]\\n needs_full_replan: The remaining plan
is fundamentally wrong and must\\n be regenerated from scratch. Mutually
exclusive with\\n remaining_plan_still_valid (if this is True, that should
be False).\\n replan_reason: Explanation of why a full replan is needed (None
if not).\\n goal_already_achieved: The overall task goal has been satisfied
early.\\n No more steps needed \u2014 skip remaining todos and finalize.\",\"properties\":{\"step_completed_successfully\":{\"description\":\"Whether
the step achieved what it was asked to do\",\"title\":\"Step Completed Successfully\",\"type\":\"boolean\"},\"key_information_learned\":{\"default\":\"\",\"description\":\"What
new information this step revealed\",\"title\":\"Key Information Learned\",\"type\":\"string\"},\"remaining_plan_still_valid\":{\"default\":true,\"description\":\"Whether
the remaining pending todos still make sense given new information\",\"title\":\"Remaining
Plan Still Valid\",\"type\":\"boolean\"},\"suggested_refinements\":{\"anyOf\":[{\"items\":{\"type\":\"string\"},\"type\":\"array\"},{\"type\":\"null\"}],\"description\":\"Minor
tweaks to descriptions of upcoming steps (lightweight, no full replan)\",\"title\":\"Suggested
Refinements\"},\"needs_full_replan\":{\"default\":false,\"description\":\"The
remaining plan is fundamentally wrong and must be regenerated\",\"title\":\"Needs
Full Replan\",\"type\":\"boolean\"},\"replan_reason\":{\"anyOf\":[{\"type\":\"string\"},{\"type\":\"null\"}],\"description\":\"Explanation
of why a full replan is needed\",\"title\":\"Replan Reason\"},\"goal_already_achieved\":{\"default\":false,\"description\":\"The
overall task goal has been satisfied early; no more steps needed\",\"title\":\"Goal
Already Achieved\",\"type\":\"boolean\"}},\"required\":[\"step_completed_successfully\",\"key_information_learned\",\"remaining_plan_still_valid\",\"suggested_refinements\",\"needs_full_replan\",\"replan_reason\",\"goal_already_achieved\"],\"title\":\"StepObservation\",\"type\":\"object\",\"additionalProperties\":false},\"name\":\"StepObservation\",\"strict\":true}},\"stream\":false}"
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '4234'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8ASjbTOj2ZKC00OYm6l31jzxx6Qq\",\n \"object\":
\"chat.completion\",\n \"created\": 1770839213,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"step_completed_successfully\\\":true,\\\"key_information_learned\\\":\\\"The
sum of the first three prime numbers (2, 3, 5) is 10.\\\",\\\"remaining_plan_still_valid\\\":true,\\\"suggested_refinements\\\":null,\\\"needs_full_replan\\\":false,\\\"replan_reason\\\":null,\\\"goal_already_achieved\\\":false}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
865,\n \"completion_tokens\": 73,\n \"total_tokens\": 938,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 19:46:55 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1735'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Tutor. An expert
math tutor who breaks down problems step by step\n\nYour goal: Solve multi-step
math problems accurately\n\nYou are executing a specific step in a multi-step
plan. Focus ONLY on completing\nthe current step. Do not plan ahead or worry
about future steps.\n\nBefore acting, briefly reason about what you need to
do and which approach\nor tool would be most helpful for this specific step."},{"role":"user","content":"##
Current Step\nOutput the result of the sum.\n\n## Context from previous steps:\nStep
1 result: To complete this step, I need to add the first three prime numbers:
2, 3, and 5.\n\n1. Start by adding 2 and 3:\n \\( 2 + 3 = 5 \\)\n\n2. Now,
add the result (5) to the next prime number (5):\n \\( 5 + 5 = 10 \\)\n\nSo,
the result of adding the first three prime numbers (2, 3, 5) is \\( 10 \\).\n\nComplete
this step and provide your result."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '957'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8ASlopJ8UB8AxMMqXRksWjQd8TLc\",\n \"object\":
\"chat.completion\",\n \"created\": 1770839215,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The result of the sum of the first
three prime numbers (2, 3, and 5) is \\\\( 10 \\\\).\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
229,\n \"completion_tokens\": 27,\n \"total_tokens\": 256,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_7e4bf6ad56\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 19:46:56 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '858'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a Planning Agent
observing execution progress. After each step completes, you analyze what happened
and decide whether the remaining plan is still valid.\\n\\nReason step-by-step
about:\\n1. What new information was learned from this step's result\\n2. Whether
the remaining steps still make sense given this new information\\n3. What refinements,
if any, are needed for upcoming steps\\n4. Whether the overall goal has already
been achieved\\n\\nBe conservative about triggering full replans \u2014 only
do so when the remaining plan is fundamentally wrong, not just suboptimal.\"},{\"role\":\"user\",\"content\":\"##
Original task\\n\\n\\n## Expected output\\n\\n\\n## Previously completed steps:\\n
\ Step 1: Add the first three prime numbers (2, 3, 5).\\n Result: To complete
this step, I need to add the first three prime numbers: 2, 3, and 5.\\n\\n1.
Start by adding 2 and 3:\\n \\\\( 2 + 3 = 5 \\\\)\\n\\n2. Now, add the result
(5) to the next prime number (5):\\n \\\\( 5 + 5 =\\n\\n## Just completed
step 2\\nDescription: Output the result of the sum.\\nResult: The result of
the sum of the first three prime numbers (2, 3, and 5) is \\\\( 10 \\\\).\\n\\n\\nAnalyze
this step's result and provide your observation.\"}],\"model\":\"gpt-4o-mini\",\"response_format\":{\"type\":\"json_schema\",\"json_schema\":{\"schema\":{\"description\":\"Planner's
observation after a step execution completes.\\n\\nReturned by the PlannerObserver
after EVERY step \u2014 not just failures.\\nThe Planner uses this to decide
whether to continue, refine, or replan.\\n\\nBased on PLAN-AND-ACT (Section
3.3): the Planner observes what the Executor\\ndid and incorporates new information
into the remaining plan.\\n\\nAttributes:\\n step_completed_successfully:
Whether the step achieved its objective.\\n key_information_learned: New
information revealed by this step\\n (e.g., \\\"Found 3 products: A,
B, C\\\"). Used to refine upcoming steps.\\n remaining_plan_still_valid:
Whether pending todos still make sense\\n given the new information.
True does NOT mean no refinement needed.\\n suggested_refinements: Minor
tweaks to upcoming step descriptions.\\n These are lightweight in-place
updates, not a full replan.\\n Example: [\\\"Step 3 should select product
B instead of 'best product'\\\"]\\n needs_full_replan: The remaining plan
is fundamentally wrong and must\\n be regenerated from scratch. Mutually
exclusive with\\n remaining_plan_still_valid (if this is True, that should
be False).\\n replan_reason: Explanation of why a full replan is needed (None
if not).\\n goal_already_achieved: The overall task goal has been satisfied
early.\\n No more steps needed \u2014 skip remaining todos and finalize.\",\"properties\":{\"step_completed_successfully\":{\"description\":\"Whether
the step achieved what it was asked to do\",\"title\":\"Step Completed Successfully\",\"type\":\"boolean\"},\"key_information_learned\":{\"default\":\"\",\"description\":\"What
new information this step revealed\",\"title\":\"Key Information Learned\",\"type\":\"string\"},\"remaining_plan_still_valid\":{\"default\":true,\"description\":\"Whether
the remaining pending todos still make sense given new information\",\"title\":\"Remaining
Plan Still Valid\",\"type\":\"boolean\"},\"suggested_refinements\":{\"anyOf\":[{\"items\":{\"type\":\"string\"},\"type\":\"array\"},{\"type\":\"null\"}],\"description\":\"Minor
tweaks to descriptions of upcoming steps (lightweight, no full replan)\",\"title\":\"Suggested
Refinements\"},\"needs_full_replan\":{\"default\":false,\"description\":\"The
remaining plan is fundamentally wrong and must be regenerated\",\"title\":\"Needs
Full Replan\",\"type\":\"boolean\"},\"replan_reason\":{\"anyOf\":[{\"type\":\"string\"},{\"type\":\"null\"}],\"description\":\"Explanation
of why a full replan is needed\",\"title\":\"Replan Reason\"},\"goal_already_achieved\":{\"default\":false,\"description\":\"The
overall task goal has been satisfied early; no more steps needed\",\"title\":\"Goal
Already Achieved\",\"type\":\"boolean\"}},\"required\":[\"step_completed_successfully\",\"key_information_learned\",\"remaining_plan_still_valid\",\"suggested_refinements\",\"needs_full_replan\",\"replan_reason\",\"goal_already_achieved\"],\"title\":\"StepObservation\",\"type\":\"object\",\"additionalProperties\":false},\"name\":\"StepObservation\",\"strict\":true}},\"stream\":false}"
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '4247'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8ASmDh5J9qri09YPH2oD1bNJGgDW\",\n \"object\":
\"chat.completion\",\n \"created\": 1770839216,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"step_completed_successfully\\\":true,\\\"key_information_learned\\\":\\\"The
result of adding the first three prime numbers (2, 3, and 5) is 10.\\\",\\\"remaining_plan_still_valid\\\":true,\\\"suggested_refinements\\\":null,\\\"needs_full_replan\\\":false,\\\"replan_reason\\\":null,\\\"goal_already_achieved\\\":true}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
866,\n \"completion_tokens\": 75,\n \"total_tokens\": 941,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 19:46:58 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '2007'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Tutor. You have completed
a multi-step task. Synthesize the results from all steps into a single, coherent
final response that directly addresses the original task. Do NOT list step numbers
or say ''Step 1 result''. Produce a clean, polished answer as if you did it
all at once."},{"role":"user","content":"## Original Task\nWhat is the sum of
the first 3 prime numbers (2, 3, 5)?\n\n## Results from each step\nStep 1 (Add
the first three prime numbers (2, 3, 5).):\nTo complete this step, I need to
add the first three prime numbers: 2, 3, and 5.\n\n1. Start by adding 2 and
3:\n \\( 2 + 3 = 5 \\)\n\n2. Now, add the result (5) to the next prime number
(5):\n \\( 5 + 5 = 10 \\)\n\nSo, the result of adding the first three prime
numbers (2, 3, 5) is \\( 10 \\).\n\nStep 2 (Output the result of the sum.):\nThe
result of the sum of the first three prime numbers (2, 3, and 5) is \\( 10 \\).\n\nSynthesize
these results into a single, coherent final answer."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '1038'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8ASoGnijuByXL9Rn5y7TFT33YUbl\",\n \"object\":
\"chat.completion\",\n \"created\": 1770839218,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The sum of the first three prime numbers,
which are 2, 3, and 5, is 10.\",\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 278,\n \"completion_tokens\":
25,\n \"total_tokens\": 303,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 19:46:59 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '561'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,216 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
math assistant\nYour personal goal is: Solve math problems"},{"role":"user","content":"\nCurrent
Task: What is 10 + 10?\n\nProvide your complete response:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '255'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7swcrn2VjQwfR0gqwJR3Yes9D8Oj\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771874,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"10 + 10 = 20.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
46,\n \"completion_tokens\": 8,\n \"total_tokens\": 54,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:04:35 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '588'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
math assistant\nYour personal goal is: Solve math problems"},{"role":"user","content":"\nCurrent
Task: What is 10 + 10?\n\nProvide your complete response:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '255'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7swdwc9DFzA50gO8dbmTe7zrvlXr\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771875,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The sum of 10 + 10 is 20.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
46,\n \"completion_tokens\": 12,\n \"total_tokens\": 58,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:04:36 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '559'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,230 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A precise
math assistant that always returns structured data\nYour personal goal is: Solve
math problems and return structured results"},{"role":"user","content":"\nCurrent
Task: What is 15 + 27?\n\nProvide your complete response:"}],"model":"gpt-4o-mini","response_format":{"type":"json_schema","json_schema":{"schema":{"properties":{"answer":{"description":"The
numeric answer","title":"Answer","type":"integer"},"explanation":{"description":"Brief
explanation of the solution","title":"Explanation","type":"string"}},"required":["answer","explanation"],"title":"MathResult","type":"object","additionalProperties":false},"name":"MathResult","strict":true}},"stream":false}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '739'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7swZDEfk1b9ktyWB0gNodkLy52oo\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771871,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"answer\\\":42,\\\"explanation\\\":\\\"To
find the sum of 15 and 27, you simply add the two numbers together: 15 + 27
= 42.\\\"}\",\n \"refusal\": null,\n \"annotations\": []\n },\n
\ \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n
\ \"usage\": {\n \"prompt_tokens\": 117,\n \"completion_tokens\": 37,\n
\ \"total_tokens\": 154,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:04:32 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1044'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A precise
math assistant that always returns structured data\nYour personal goal is: Solve
math problems and return structured results"},{"role":"user","content":"\nCurrent
Task: What is 15 + 27?\n\nProvide your complete response:"}],"model":"gpt-4o-mini","response_format":{"type":"json_schema","json_schema":{"schema":{"properties":{"answer":{"description":"The
numeric answer","title":"Answer","type":"integer"},"explanation":{"description":"Brief
explanation of the solution","title":"Explanation","type":"string"}},"required":["answer","explanation"],"title":"MathResult","type":"object","additionalProperties":false},"name":"MathResult","strict":true}},"stream":false}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '739'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D7swbsTqOlBi8nU9034XxlMfDIfJY\",\n \"object\":
\"chat.completion\",\n \"created\": 1770771873,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"answer\\\":42,\\\"explanation\\\":\\\"To
solve 15 + 27, you simply add the two numbers together: 15 plus 27 equals
42.\\\"}\",\n \"refusal\": null,\n \"annotations\": []\n },\n
\ \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n
\ \"usage\": {\n \"prompt_tokens\": 117,\n \"completion_tokens\": 34,\n
\ \"total_tokens\": 151,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Wed, 11 Feb 2026 01:04:34 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1228'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -1,6 +1,10 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are test role. test backstory\nYour personal goal is: test goal\nTo give my best complete final answer to the task respond using the exact following format:\n\nThought: I now can give a great answer\nFinal Answer: Your final answer must be the great and the most complete as possible, it must be outcome described.\n\nI MUST use these formats, my job depends on it!"},{"role":"user","content":"\nCurrent Task: Calculate 2 + 2\n\nThis is the expected criteria for your final answer: The result of the calculation\nyou MUST return the actual complete content as the final answer, not a summary.\n\nBegin! This is VERY important to you, use the tools available and give your best Final Answer, your job depends on it!\n\nThought:"}],"model":"gpt-4o-mini"}'
body: '{"messages":[{"role":"system","content":"You are test role. test backstory\nYour
personal goal is: test goal"},{"role":"user","content":"\nCurrent Task: Calculate
2 + 2\n\nThis is the expected criteria for your final answer: The result of
the calculation\nyou MUST return the actual complete content as the final answer,
not a summary.\n\nProvide your complete response:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -13,7 +17,7 @@ interactions:
connection:
- keep-alive
content-length:
- '797'
- '396'
content-type:
- application/json
host:
@@ -35,13 +39,23 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-CjDsYJQa2tIYBbNloukSWecpsTvdK\",\n \"object\": \"chat.completion\",\n \"created\": 1764894146,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"I now can give a great answer \\nFinal Answer: The result of the calculation 2 + 2 is 4.\",\n \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 161,\n \"completion_tokens\": 25,\n \"total_tokens\": 186,\n \"prompt_tokens_details\": {\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\": {\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\": 0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\": \"default\",\n \"system_fingerprint\": \"fp_11f3029f6b\"\
\n}\n"
string: "{\n \"id\": \"chatcmpl-D5DTjYe6n92Rjo4Ox6NiZpAAdBLF0\",\n \"object\":
\"chat.completion\",\n \"created\": 1770135823,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The result of the calculation 2 + 2
is 4.\",\n \"refusal\": null,\n \"annotations\": []\n },\n
\ \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n
\ \"usage\": {\n \"prompt_tokens\": 75,\n \"completion_tokens\": 14,\n
\ \"total_tokens\": 89,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_1590f93f9d\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
@@ -50,7 +64,7 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 05 Dec 2025 00:22:27 GMT
- Tue, 03 Feb 2026 16:23:43 GMT
Server:
- cloudflare
Set-Cookie:
@@ -70,13 +84,11 @@ interactions:
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '516'
- '636'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '529'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:

View File

@@ -1,6 +1,12 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are test role. test backstory\nYour personal goal is: test goal\nTo give my best complete final answer to the task respond using the exact following format:\n\nThought: I now can give a great answer\nFinal Answer: Your final answer must be the great and the most complete as possible, it must be outcome described.\n\nI MUST use these formats, my job depends on it!"},{"role":"user","content":"\nCurrent Task: Summarize the given context in one sentence\n\nThis is the expected criteria for your final answer: A one-sentence summary\nyou MUST return the actual complete content as the final answer, not a summary.\n\nThis is the context you''re working with:\nThe quick brown fox jumps over the lazy dog. This sentence contains every letter of the alphabet.\n\nBegin! This is VERY important to you, use the tools available and give your best Final Answer, your job depends on it!\n\nThought:"}],"model":"gpt-3.5-turbo"}'
body: '{"messages":[{"role":"system","content":"You are test role. test backstory\nYour
personal goal is: test goal"},{"role":"user","content":"\nCurrent Task: Summarize
the given context in one sentence\n\nThis is the expected criteria for your
final answer: A one-sentence summary\nyou MUST return the actual complete content
as the final answer, not a summary.\n\nThis is the context you''re working with:\nThe
quick brown fox jumps over the lazy dog. This sentence contains every letter
of the alphabet.\n\nProvide your complete response:"}],"model":"gpt-3.5-turbo"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -13,7 +19,7 @@ interactions:
connection:
- keep-alive
content-length:
- '963'
- '562'
content-type:
- application/json
host:
@@ -35,13 +41,23 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-CjDtsaX0LJ0dzZz02KwKeRGYgazv1\",\n \"object\": \"chat.completion\",\n \"created\": 1764894228,\n \"model\": \"gpt-3.5-turbo-0125\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"I now can give a great answer\\n\\nFinal Answer: The quick brown fox jumps over the lazy dog. This sentence contains every letter of the alphabet.\",\n \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 191,\n \"completion_tokens\": 30,\n \"total_tokens\": 221,\n \"prompt_tokens_details\": {\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\": {\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\": 0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\"\
: \"default\",\n \"system_fingerprint\": null\n}\n"
string: "{\n \"id\": \"chatcmpl-D5DTn6yIQ7HpIn5j5Bsbag1efzXPa\",\n \"object\":
\"chat.completion\",\n \"created\": 1770135827,\n \"model\": \"gpt-3.5-turbo-0125\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The quick brown fox jumps over the
lazy dog. This sentence contains every letter of the alphabet.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
105,\n \"completion_tokens\": 19,\n \"total_tokens\": 124,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": null\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
@@ -50,7 +66,7 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 05 Dec 2025 00:23:49 GMT
- Tue, 03 Feb 2026 16:23:48 GMT
Server:
- cloudflare
Set-Cookie:
@@ -70,13 +86,11 @@ interactions:
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '506'
- '606'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '559'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:

View File

@@ -1,6 +1,10 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are test role. test backstory\nYour personal goal is: test goal\nTo give my best complete final answer to the task respond using the exact following format:\n\nThought: I now can give a great answer\nFinal Answer: Your final answer must be the great and the most complete as possible, it must be outcome described.\n\nI MUST use these formats, my job depends on it!"},{"role":"user","content":"\nCurrent Task: Write a haiku about AI\n\nThis is the expected criteria for your final answer: A haiku (3 lines, 5-7-5 syllable pattern) about AI\nyou MUST return the actual complete content as the final answer, not a summary.\n\nBegin! This is VERY important to you, use the tools available and give your best Final Answer, your job depends on it!\n\nThought:"}],"model":"gpt-3.5-turbo","max_tokens":50,"temperature":0.7}'
body: '{"messages":[{"role":"system","content":"You are test role. test backstory\nYour
personal goal is: test goal"},{"role":"user","content":"\nCurrent Task: Write
a haiku about AI\n\nThis is the expected criteria for your final answer: A haiku
(3 lines, 5-7-5 syllable pattern) about AI\nyou MUST return the actual complete
content as the final answer, not a summary.\n\nProvide your complete response:"}],"model":"gpt-3.5-turbo","max_tokens":50,"temperature":0.7}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -13,7 +17,7 @@ interactions:
connection:
- keep-alive
content-length:
- '861'
- '460'
content-type:
- application/json
host:
@@ -35,13 +39,23 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-CjDqr2BmEXQ08QzZKslTZJZ5vV9lo\",\n \"object\": \"chat.completion\",\n \"created\": 1764894041,\n \"model\": \"gpt-3.5-turbo-0125\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"I now can give a great answer\\n\\nFinal Answer: \\nIn circuits they thrive, \\nArtificial minds awake, \\nFuture's coded drive.\",\n \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 174,\n \"completion_tokens\": 29,\n \"total_tokens\": 203,\n \"prompt_tokens_details\": {\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\": {\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\": 0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\": \"default\"\
,\n \"system_fingerprint\": null\n}\n"
string: "{\n \"id\": \"chatcmpl-D5DTgAqxaC8RmEvikXK0UDaxmVmf9\",\n \"object\":
\"chat.completion\",\n \"created\": 1770135820,\n \"model\": \"gpt-3.5-turbo-0125\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"Artificial minds,\\nCode and circuits
intertwine,\\nFuture undefined.\",\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 88,\n \"completion_tokens\":
13,\n \"total_tokens\": 101,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": null\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
@@ -50,7 +64,7 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 05 Dec 2025 00:20:41 GMT
- Tue, 03 Feb 2026 16:23:40 GMT
Server:
- cloudflare
Set-Cookie:
@@ -70,13 +84,11 @@ interactions:
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '434'
- '277'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '456'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:

View File

@@ -0,0 +1,231 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are a strategic planning assistant.
Create minimal, effective execution plans. Prefer fewer steps over more."},{"role":"user","content":"Create
a focused execution plan for the following task:\n\n## Task\nWhat is 9 + 11?\n\n##
Expected Output\nA number\n\n## Available Tools\nNo tools available\n\n## Instructions\nCreate
ONLY the essential steps needed to complete this task. Use the MINIMUM number
of steps required - do NOT pad your plan with unnecessary steps. Most tasks
need only 2-5 steps.\n\nFor each step:\n- State the specific action to take\n-
Specify which tool to use (if any)\n\nDo NOT include:\n- Setup or preparation
steps that are obvious\n- Verification steps unless critical\n- Documentation
or cleanup steps unless explicitly required\n- Generic steps like \"review results\"
or \"finalize output\"\n\nAfter your plan, state:\n- \"READY: I am ready to
execute the task.\" if the plan is complete\n- \"NOT READY: I need to refine
my plan because [reason].\" if you need more thinking"}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"create_reasoning_plan","description":"Create
or refine a reasoning plan for a task","strict":true,"parameters":{"type":"object","properties":{"plan":{"type":"string","description":"The
detailed reasoning plan for the task."},"ready":{"type":"boolean","description":"Whether
the agent is ready to execute the task."}},"required":["plan","ready"],"additionalProperties":false}}}]}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '1520'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D4yVACNTzZcghQRwt5kFYQ4HAvbgI\",\n \"object\":
\"chat.completion\",\n \"created\": 1770078252,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"## Execution Plan\\n1. Calculate the
sum of 9 and 11.\\n \\nREADY: I am ready to execute the task.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
279,\n \"completion_tokens\": 28,\n \"total_tokens\": 307,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_1590f93f9d\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Tue, 03 Feb 2026 00:24:13 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '951'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
math tutor\nYour personal goal is: Help solve math problems"},{"role":"user","content":"\nCurrent
Task: What is 9 + 11?\n\nPlanning:\n## Execution Plan\n1. Calculate the sum
of 9 and 11.\n \nREADY: I am ready to execute the task.\n\nThis is the expected
criteria for your final answer: A number\nyou MUST return the actual complete
content as the final answer, not a summary.\n\nProvide your complete response:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '513'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D4yVBdTCKSdfcJYlIOX9BbzrObgFI\",\n \"object\":
\"chat.completion\",\n \"created\": 1770078253,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"9 + 11 = 20\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
105,\n \"completion_tokens\": 7,\n \"total_tokens\": 112,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_1590f93f9d\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Tue, 03 Feb 2026 00:24:13 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '477'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,243 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are a strategic planning assistant.
Create minimal, effective execution plans. Prefer fewer steps over more."},{"role":"user","content":"Create
a focused execution plan for the following task:\n\n## Task\nCalculate the area
of a circle with radius 5 (use pi = 3.14)\n\n## Expected Output\nThe area as
a number\n\n## Available Tools\nNo tools available\n\n## Instructions\nCreate
ONLY the essential steps needed to complete this task. Use the MINIMUM number
of steps required - do NOT pad your plan with unnecessary steps. Most tasks
need only 2-5 steps.\n\nFor each step:\n- State the specific action to take\n-
Specify which tool to use (if any)\n\nDo NOT include:\n- Setup or preparation
steps that are obvious\n- Verification steps unless critical\n- Documentation
or cleanup steps unless explicitly required\n- Generic steps like \"review results\"
or \"finalize output\"\n\nAfter your plan, state:\n- \"READY: I am ready to
execute the task.\" if the plan is complete\n- \"NOT READY: I need to refine
my plan because [reason].\" if you need more thinking"}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"create_reasoning_plan","description":"Create
or refine a reasoning plan for a task","strict":true,"parameters":{"type":"object","properties":{"plan":{"type":"string","description":"The
detailed reasoning plan for the task."},"ready":{"type":"boolean","description":"Whether
the agent is ready to execute the task."}},"required":["plan","ready"],"additionalProperties":false}}}]}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '1577'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D4yVCdA1csIzfoHSQvxkfrA4gDn4z\",\n \"object\":
\"chat.completion\",\n \"created\": 1770078254,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"## Execution Plan\\n1. Multiply the
radius (5) by itself (5) to get the square of the radius.\\n2. Multiply the
squared radius by pi (3.14) to calculate the area.\\n\\nREADY: I am ready
to execute the task.\",\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 293,\n \"completion_tokens\":
54,\n \"total_tokens\": 347,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_1590f93f9d\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Tue, 03 Feb 2026 00:24:15 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '845'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Tutor. An expert
tutor\nYour personal goal is: Solve complex math problems step by step"},{"role":"user","content":"\nCurrent
Task: Calculate the area of a circle with radius 5 (use pi = 3.14)\n\nPlanning:\n##
Execution Plan\n1. Multiply the radius (5) by itself (5) to get the square of
the radius.\n2. Multiply the squared radius by pi (3.14) to calculate the area.\n\nREADY:
I am ready to execute the task.\n\nThis is the expected criteria for your final
answer: The area as a number\nyou MUST return the actual complete content as
the final answer, not a summary.\n\nProvide your complete response:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '682'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D4yVDh2U2xx3qeYHcDQvbetOmVCxb\",\n \"object\":
\"chat.completion\",\n \"created\": 1770078255,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"To calculate the area of a circle with
a radius of 5, we will follow the steps outlined in the execution plan.\\n\\n1.
**Square the radius**:\\n \\\\[\\n 5 \\\\times 5 = 25\\n \\\\]\\n\\n2.
**Multiply the squared radius by pi (using \\\\(\\\\pi \\\\approx 3.14\\\\))**:\\n
\ \\\\[\\n \\\\text{Area} = \\\\pi \\\\times (\\\\text{radius})^2 = 3.14
\\\\times 25\\n \\\\]\\n\\n Now, let's perform the multiplication:\\n
\ \\\\[\\n 3.14 \\\\times 25 = 78.5\\n \\\\]\\n\\nThus, the area of the
circle is \\\\( \\\\boxed{78.5} \\\\).\",\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 147,\n \"completion_tokens\":
155,\n \"total_tokens\": 302,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_1590f93f9d\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Tue, 03 Feb 2026 00:24:18 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '2228'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -1,7 +1,11 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are test role. test backstory\nYour personal goal is: test goal\nYou ONLY have access to the following tools, and should NEVER make up tools that are not listed here:\n\nTool Name: dummy_tool\nTool Arguments: {''query'': {''description'': None, ''type'': ''str''}}\nTool Description: Useful for when you need to get a dummy result for a query.\n\nIMPORTANT: Use the following format in your response:\n\n```\nThought: you should always think about what to do\nAction: the action to take, only one name of [dummy_tool], just the name, exactly as it''s written.\nAction Input: the input to the action, just a simple JSON object, enclosed in curly braces, using \" to wrap keys and values.\nObservation: the result of the action\n```\n\nOnce all necessary information is gathered, return the following format:\n\n```\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n```"},{"role":"user","content":"\nCurrent
Task: Use the dummy tool to get a result for ''test query''\n\nThis is the expected criteria for your final answer: The result from the dummy tool\nyou MUST return the actual complete content as the final answer, not a summary.\n\nBegin! This is VERY important to you, use the tools available and give your best Final Answer, your job depends on it!\n\nThought:"}],"model":"gpt-3.5-turbo"}'
body: '{"messages":[{"role":"system","content":"You are test role. test backstory\nYour
personal goal is: test goal"},{"role":"user","content":"\nCurrent Task: Use
the dummy tool to get a result for ''test query''\n\nThis is the expected criteria
for your final answer: The result from the dummy tool\nyou MUST return the actual
complete content as the final answer, not a summary."}],"model":"gpt-3.5-turbo","tool_choice":"auto","tools":[{"type":"function","function":{"name":"dummy_tool","description":"Useful
for when you need to get a dummy result for a query.","strict":true,"parameters":{"properties":{"query":{"title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}}]}'
headers:
User-Agent:
- X-USER-AGENT-XXX
@@ -14,7 +18,7 @@ interactions:
connection:
- keep-alive
content-length:
- '1381'
- '712'
content-type:
- application/json
host:
@@ -36,12 +40,26 @@ interactions:
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.10
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-CjDrE1Z8bFQjjxI2vDPPKgtOTm28p\",\n \"object\": \"chat.completion\",\n \"created\": 1764894064,\n \"model\": \"gpt-3.5-turbo-0125\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"you should always think about what to do\",\n \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 289,\n \"completion_tokens\": 8,\n \"total_tokens\": 297,\n \"prompt_tokens_details\": {\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\": {\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\": 0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\": \"default\",\n \"system_fingerprint\": null\n}\n"
string: "{\n \"id\": \"chatcmpl-D5DTlUmKYee1DaS5AqnaUCZ6B14xV\",\n \"object\":
\"chat.completion\",\n \"created\": 1770135825,\n \"model\": \"gpt-3.5-turbo-0125\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": null,\n \"tool_calls\": [\n {\n
\ \"id\": \"call_tBCgelchfQjXXJrrM15MxqGJ\",\n \"type\":
\"function\",\n \"function\": {\n \"name\": \"dummy_tool\",\n
\ \"arguments\": \"{\\\"query\\\":\\\"test query\\\"}\"\n }\n
\ }\n ],\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"tool_calls\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 122,\n \"completion_tokens\":
16,\n \"total_tokens\": 138,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": null\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
@@ -50,7 +68,7 @@ interactions:
Content-Type:
- application/json
Date:
- Fri, 05 Dec 2025 00:21:05 GMT
- Tue, 03 Feb 2026 16:23:46 GMT
Server:
- cloudflare
Set-Cookie:
@@ -70,13 +88,124 @@ interactions:
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '379'
- '694'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are test role. test backstory\nYour
personal goal is: test goal"},{"role":"user","content":"\nCurrent Task: Use
the dummy tool to get a result for ''test query''\n\nThis is the expected criteria
for your final answer: The result from the dummy tool\nyou MUST return the actual
complete content as the final answer, not a summary."},{"role":"assistant","content":null,"tool_calls":[{"id":"call_tBCgelchfQjXXJrrM15MxqGJ","type":"function","function":{"name":"dummy_tool","arguments":"{\"query\":\"test
query\"}"}}]},{"role":"tool","tool_call_id":"call_tBCgelchfQjXXJrrM15MxqGJ","name":"dummy_tool","content":"Dummy
result for: test query"},{"role":"user","content":"Analyze the tool result.
If requirements are met, provide the Final Answer. Otherwise, call the next
tool. Deliver only the answer without meta-commentary."}],"model":"gpt-3.5-turbo","tool_choice":"auto","tools":[{"type":"function","function":{"name":"dummy_tool","description":"Useful
for when you need to get a dummy result for a query.","strict":true,"parameters":{"properties":{"query":{"title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}}]}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '1202'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D5DTmxZJI2Ee7fHNc9dYtQkD7sIY2\",\n \"object\":
\"chat.completion\",\n \"created\": 1770135826,\n \"model\": \"gpt-3.5-turbo-0125\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"Dummy result for: test query\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
188,\n \"completion_tokens\": 7,\n \"total_tokens\": 195,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": null\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Tue, 03 Feb 2026 16:23:47 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '416'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '399'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:

View File

@@ -0,0 +1,110 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
assistant\nYour personal goal is: Help solve math problems"},{"role":"user","content":"\nCurrent
Task: What is 12 * 3?\n\nThis is the expected criteria for your final answer:
A number\nyou MUST return the actual complete content as the final answer, not
a summary.\n\nProvide your complete response:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '400'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D4yVCw0CGLFmcVvniplwCCt8avtRb\",\n \"object\":
\"chat.completion\",\n \"created\": 1770078254,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"12 * 3 = 36\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
75,\n \"completion_tokens\": 7,\n \"total_tokens\": 82,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_1590f93f9d\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Tue, 03 Feb 2026 00:24:14 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '331'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,780 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are a strategic planning assistant.
Create minimal, effective execution plans. Prefer fewer steps over more."},{"role":"user","content":"Create
a focused execution plan for the following task:\n\n## Task\nWhat is 15 + 27?\n\n##
Expected Output\nComplete the task successfully\n\n## Available Tools\nNo tools
available\n\n## Planning Principles\nFocus on WHAT needs to be accomplished,
not HOW. Group related actions into logical units. Fewer steps = better. Most
tasks need 3-6 steps. Hard limit: 20 steps.\n\n## Step Types (only these are
valid):\n1. **Tool Step**: Uses a tool to gather information or take action\n2.
**Output Step**: Synthesizes prior results into the final deliverable (usually
the last step)\n\n## Rules:\n- Each step must either USE A TOOL or PRODUCE THE
FINAL OUTPUT\n- Combine related tool calls: \"Research A, B, and C\" = ONE step,
not three\n- Combine all synthesis into ONE final output step\n- NO standalone
\"thinking\" steps (review, verify, confirm, refine, analyze) - these happen
naturally between steps\n\nFor each step: State the action, specify the tool
(if any), and note dependencies.\n\nAfter your plan, state READY or NOT READY."}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"create_reasoning_plan","description":"Create
or refine a reasoning plan for a task with structured steps","strict":true,"parameters":{"type":"object","properties":{"plan":{"type":"string","description":"A
brief summary of the overall plan."},"steps":{"type":"array","description":"List
of discrete steps to execute the plan","items":{"type":"object","properties":{"step_number":{"type":"integer","description":"Step
number (1-based)"},"description":{"type":"string","description":"What to do
in this step"},"tool_to_use":{"type":["string","null"],"description":"Tool to
use for this step, or null if no tool needed"},"depends_on":{"type":"array","items":{"type":"integer"},"description":"Step
numbers this step depends on (empty array if none)"}},"required":["step_number","description","tool_to_use","depends_on"],"additionalProperties":false}},"ready":{"type":"boolean","description":"Whether
the agent is ready to execute the task."}},"required":["plan","steps","ready"],"additionalProperties":false}}}]}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '2317'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8WidmJV9i13DE24mvjP1ZirakLg4\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924767,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": null,\n \"tool_calls\": [\n {\n
\ \"id\": \"call_xQ7jJFuvWGusUiruNaZjuV6F\",\n \"type\":
\"function\",\n \"function\": {\n \"name\": \"create_reasoning_plan\",\n
\ \"arguments\": \"{\\\"plan\\\":\\\"Calculate the sum of 15 and
27.\\\",\\\"steps\\\":[{\\\"step_number\\\":1,\\\"description\\\":\\\"Add
the numbers 15 and 27 together.\\\",\\\"tool_to_use\\\":null,\\\"depends_on\\\":[]},{\\\"step_number\\\":2,\\\"description\\\":\\\"Provide
the final result of 15 + 27.\\\",\\\"tool_to_use\\\":null,\\\"depends_on\\\":[1]}],\\\"ready\\\":true}\"\n
\ }\n }\n ],\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"tool_calls\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 440,\n \"completion_tokens\":
88,\n \"total_tokens\": 528,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:32:49 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1643'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
math tutor\n\nYour goal: Help solve math problems step by step\n\nYou are executing
a specific step in a multi-step plan. Focus ONLY on completing\nthe current
step. Do not plan ahead or worry about future steps.\n\nBefore acting, briefly
reason about what you need to do and which approach\nor tool would be most helpful
for this specific step."},{"role":"user","content":"## Current Step\nAdd the
numbers 15 and 27 together.\n\nComplete this step and provide your result."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '574'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8WifMUkNsYirBHDQMHraML9Sfxhr\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924769,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"To add the numbers 15 and 27 together,
I will perform the addition:\\n\\n15 + 27 = 42\\n\\nThe result is 42.\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
111,\n \"completion_tokens\": 31,\n \"total_tokens\": 142,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:32:50 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '975'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a Planning Agent
observing execution progress. After each step completes, you analyze what happened
and decide whether the remaining plan is still valid.\\n\\nReason step-by-step
about:\\n1. What new information was learned from this step's result\\n2. Whether
the remaining steps still make sense given this new information\\n3. What refinements,
if any, are needed for upcoming steps\\n4. Whether the overall goal has already
been achieved\\n\\nBe conservative about triggering full replans \u2014 only
do so when the remaining plan is fundamentally wrong, not just suboptimal.\"},{\"role\":\"user\",\"content\":\"##
Original task\\n\\n\\n## Expected output\\n\\n\\n\\n## Just completed step 1\\nDescription:
Add the numbers 15 and 27 together.\\nResult: To add the numbers 15 and 27 together,
I will perform the addition:\\n\\n15 + 27 = 42\\n\\nThe result is 42.\\n\\n##
Remaining plan steps:\\n Step 2: Provide the final result of 15 + 27.\\n\\nAnalyze
this step's result and provide your observation.\"}],\"model\":\"gpt-4o-mini\",\"response_format\":{\"type\":\"json_schema\",\"json_schema\":{\"schema\":{\"description\":\"Planner's
observation after a step execution completes.\\n\\nReturned by the PlannerObserver
after EVERY step \u2014 not just failures.\\nThe Planner uses this to decide
whether to continue, refine, or replan.\\n\\nBased on PLAN-AND-ACT (Section
3.3): the Planner observes what the Executor\\ndid and incorporates new information
into the remaining plan.\\n\\nAttributes:\\n step_completed_successfully:
Whether the step achieved its objective.\\n key_information_learned: New
information revealed by this step\\n (e.g., \\\"Found 3 products: A,
B, C\\\"). Used to refine upcoming steps.\\n remaining_plan_still_valid:
Whether pending todos still make sense\\n given the new information.
True does NOT mean no refinement needed.\\n suggested_refinements: Minor
tweaks to upcoming step descriptions.\\n These are lightweight in-place
updates, not a full replan.\\n Example: [\\\"Step 3 should select product
B instead of 'best product'\\\"]\\n needs_full_replan: The remaining plan
is fundamentally wrong and must\\n be regenerated from scratch. Mutually
exclusive with\\n remaining_plan_still_valid (if this is True, that should
be False).\\n replan_reason: Explanation of why a full replan is needed (None
if not).\\n goal_already_achieved: The overall task goal has been satisfied
early.\\n No more steps needed \u2014 skip remaining todos and finalize.\",\"properties\":{\"step_completed_successfully\":{\"description\":\"Whether
the step achieved what it was asked to do\",\"title\":\"Step Completed Successfully\",\"type\":\"boolean\"},\"key_information_learned\":{\"default\":\"\",\"description\":\"What
new information this step revealed\",\"title\":\"Key Information Learned\",\"type\":\"string\"},\"remaining_plan_still_valid\":{\"default\":true,\"description\":\"Whether
the remaining pending todos still make sense given new information\",\"title\":\"Remaining
Plan Still Valid\",\"type\":\"boolean\"},\"suggested_refinements\":{\"anyOf\":[{\"items\":{\"type\":\"string\"},\"type\":\"array\"},{\"type\":\"null\"}],\"description\":\"Minor
tweaks to descriptions of upcoming steps (lightweight, no full replan)\",\"title\":\"Suggested
Refinements\"},\"needs_full_replan\":{\"default\":false,\"description\":\"The
remaining plan is fundamentally wrong and must be regenerated\",\"title\":\"Needs
Full Replan\",\"type\":\"boolean\"},\"replan_reason\":{\"anyOf\":[{\"type\":\"string\"},{\"type\":\"null\"}],\"description\":\"Explanation
of why a full replan is needed\",\"title\":\"Replan Reason\"},\"goal_already_achieved\":{\"default\":false,\"description\":\"The
overall task goal has been satisfied early; no more steps needed\",\"title\":\"Goal
Already Achieved\",\"type\":\"boolean\"}},\"required\":[\"step_completed_successfully\",\"key_information_learned\",\"remaining_plan_still_valid\",\"suggested_refinements\",\"needs_full_replan\",\"replan_reason\",\"goal_already_achieved\"],\"title\":\"StepObservation\",\"type\":\"object\",\"additionalProperties\":false},\"name\":\"StepObservation\",\"strict\":true}},\"stream\":false}"
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '4037'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8Wihvgjx16HF7G4R4Hv2hHRX95Zi\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924771,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"step_completed_successfully\\\":true,\\\"key_information_learned\\\":\\\"The
result of adding 15 and 27 is 42.\\\",\\\"remaining_plan_still_valid\\\":true,\\\"suggested_refinements\\\":null,\\\"needs_full_replan\\\":false,\\\"replan_reason\\\":null,\\\"goal_already_achieved\\\":true}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
791,\n \"completion_tokens\": 65,\n \"total_tokens\": 856,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:32:52 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1742'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. A helpful
math tutor\n\nYour goal: Help solve math problems step by step\n\nYou are executing
a specific step in a multi-step plan. Focus ONLY on completing\nthe current
step. Do not plan ahead or worry about future steps.\n\nBefore acting, briefly
reason about what you need to do and which approach\nor tool would be most helpful
for this specific step."},{"role":"user","content":"## Current Step\nProvide
the final result of 15 + 27.\n\n## Context from previous steps:\nStep 1 result:
To add the numbers 15 and 27 together, I will perform the addition:\n\n15 +
27 = 42\n\nThe result is 42.\n\nComplete this step and provide your result."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '731'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8WijqXxzVDrvN694O3RU1ndqJEOT\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924773,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The final result of 15 + 27 is 42.\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
154,\n \"completion_tokens\": 13,\n \"total_tokens\": 167,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_7e4bf6ad56\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:32:53 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '545'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a Planning Agent
observing execution progress. After each step completes, you analyze what happened
and decide whether the remaining plan is still valid.\\n\\nReason step-by-step
about:\\n1. What new information was learned from this step's result\\n2. Whether
the remaining steps still make sense given this new information\\n3. What refinements,
if any, are needed for upcoming steps\\n4. Whether the overall goal has already
been achieved\\n\\nBe conservative about triggering full replans \u2014 only
do so when the remaining plan is fundamentally wrong, not just suboptimal.\"},{\"role\":\"user\",\"content\":\"##
Original task\\n\\n\\n## Expected output\\n\\n\\n## Previously completed steps:\\n
\ Step 1: Add the numbers 15 and 27 together.\\n Result: To add the numbers
15 and 27 together, I will perform the addition:\\n\\n15 + 27 = 42\\n\\nThe
result is 42.\\n\\n## Just completed step 2\\nDescription: Provide the final
result of 15 + 27.\\nResult: The final result of 15 + 27 is 42.\\n\\n\\nAnalyze
this step's result and provide your observation.\"}],\"model\":\"gpt-4o-mini\",\"response_format\":{\"type\":\"json_schema\",\"json_schema\":{\"schema\":{\"description\":\"Planner's
observation after a step execution completes.\\n\\nReturned by the PlannerObserver
after EVERY step \u2014 not just failures.\\nThe Planner uses this to decide
whether to continue, refine, or replan.\\n\\nBased on PLAN-AND-ACT (Section
3.3): the Planner observes what the Executor\\ndid and incorporates new information
into the remaining plan.\\n\\nAttributes:\\n step_completed_successfully:
Whether the step achieved its objective.\\n key_information_learned: New
information revealed by this step\\n (e.g., \\\"Found 3 products: A,
B, C\\\"). Used to refine upcoming steps.\\n remaining_plan_still_valid:
Whether pending todos still make sense\\n given the new information.
True does NOT mean no refinement needed.\\n suggested_refinements: Minor
tweaks to upcoming step descriptions.\\n These are lightweight in-place
updates, not a full replan.\\n Example: [\\\"Step 3 should select product
B instead of 'best product'\\\"]\\n needs_full_replan: The remaining plan
is fundamentally wrong and must\\n be regenerated from scratch. Mutually
exclusive with\\n remaining_plan_still_valid (if this is True, that should
be False).\\n replan_reason: Explanation of why a full replan is needed (None
if not).\\n goal_already_achieved: The overall task goal has been satisfied
early.\\n No more steps needed \u2014 skip remaining todos and finalize.\",\"properties\":{\"step_completed_successfully\":{\"description\":\"Whether
the step achieved what it was asked to do\",\"title\":\"Step Completed Successfully\",\"type\":\"boolean\"},\"key_information_learned\":{\"default\":\"\",\"description\":\"What
new information this step revealed\",\"title\":\"Key Information Learned\",\"type\":\"string\"},\"remaining_plan_still_valid\":{\"default\":true,\"description\":\"Whether
the remaining pending todos still make sense given new information\",\"title\":\"Remaining
Plan Still Valid\",\"type\":\"boolean\"},\"suggested_refinements\":{\"anyOf\":[{\"items\":{\"type\":\"string\"},\"type\":\"array\"},{\"type\":\"null\"}],\"description\":\"Minor
tweaks to descriptions of upcoming steps (lightweight, no full replan)\",\"title\":\"Suggested
Refinements\"},\"needs_full_replan\":{\"default\":false,\"description\":\"The
remaining plan is fundamentally wrong and must be regenerated\",\"title\":\"Needs
Full Replan\",\"type\":\"boolean\"},\"replan_reason\":{\"anyOf\":[{\"type\":\"string\"},{\"type\":\"null\"}],\"description\":\"Explanation
of why a full replan is needed\",\"title\":\"Replan Reason\"},\"goal_already_achieved\":{\"default\":false,\"description\":\"The
overall task goal has been satisfied early; no more steps needed\",\"title\":\"Goal
Already Achieved\",\"type\":\"boolean\"}},\"required\":[\"step_completed_successfully\",\"key_information_learned\",\"remaining_plan_still_valid\",\"suggested_refinements\",\"needs_full_replan\",\"replan_reason\",\"goal_already_achieved\"],\"title\":\"StepObservation\",\"type\":\"object\",\"additionalProperties\":false},\"name\":\"StepObservation\",\"strict\":true}},\"stream\":false}"
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '4091'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8WijHLIWu3UpiSTF4oondCkkFWnq\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924773,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"step_completed_successfully\\\":true,\\\"key_information_learned\\\":\\\"The
final result of the addition of 15 and 27 is confirmed to be 42, consistent
with the previous step.\\\",\\\"remaining_plan_still_valid\\\":true,\\\"suggested_refinements\\\":null,\\\"needs_full_replan\\\":false,\\\"replan_reason\\\":null,\\\"goal_already_achieved\\\":true}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
807,\n \"completion_tokens\": 77,\n \"total_tokens\": 884,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:32:55 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1843'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Math Assistant. You have
completed a multi-step task. Synthesize the results from all steps into a single,
coherent final response that directly addresses the original task. Do NOT list
step numbers or say ''Step 1 result''. Produce a clean, polished answer as if
you did it all at once."},{"role":"user","content":"## Original Task\nWhat is
15 + 27?\n\n## Results from each step\nStep 1 (Add the numbers 15 and 27 together.):\nTo
add the numbers 15 and 27 together, I will perform the addition:\n\n15 + 27
= 42\n\nThe result is 42.\n\nStep 2 (Provide the final result of 15 + 27.):\nThe
final result of 15 + 27 is 42.\n\nSynthesize these results into a single, coherent
final answer."}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '757'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-D8WilXbtsRhcSEKpM0tfkI49oNjmn\",\n \"object\":
\"chat.completion\",\n \"created\": 1770924775,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The result of adding 15 and 27 together
is 42.\",\n \"refusal\": null,\n \"annotations\": []\n },\n
\ \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n
\ \"usage\": {\n \"prompt_tokens\": 178,\n \"completion_tokens\": 14,\n
\ \"total_tokens\": 192,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_f4ae844694\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:32:56 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '492'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

Some files were not shown because too many files have changed in this diff Show More