Merge branch 'main' into lorenze/improve-docs-flows

Lorenze Jay committed 2025-12-10 08:53:05 -08:00 (via GitHub)
226 changed files with 38898 additions and 38905 deletions


@@ -307,12 +307,27 @@ print(result)
### Different Ways to Kick Off a Crew
Once your crew is assembled, initiate the workflow with the appropriate kickoff method. CrewAI provides several methods for better control over the kickoff process.
#### Synchronous Methods
- `kickoff()`: Starts the execution process according to the defined process flow.
- `kickoff_for_each()`: Executes the crew sequentially, once for each input in the provided collection.
#### Asynchronous Methods
CrewAI offers two approaches for async execution:
| Method | Type | Description |
|--------|------|-------------|
| `akickoff()` | Native async | True async/await throughout the entire execution chain |
| `akickoff_for_each()` | Native async | Native async execution for each input in a list |
| `kickoff_async()` | Thread-based | Wraps synchronous execution in `asyncio.to_thread` |
| `kickoff_for_each_async()` | Thread-based | Thread-based async for each input in a list |
<Note>
For high-concurrency workloads, `akickoff()` and `akickoff_for_each()` are recommended as they use native async for task execution, memory operations, and knowledge retrieval.
</Note>
```python Code
# Start the crew's task execution
result = my_crew.kickoff()
print(result)

# Example of using kickoff_for_each
inputs_array = [{'topic': 'AI in healthcare'}, {'topic': 'AI in finance'}]
results = my_crew.kickoff_for_each(inputs=inputs_array)
for result in results:
    print(result)

# Example of using native async with akickoff (awaited inside an async function)
inputs = {'topic': 'AI in healthcare'}
async_result = await my_crew.akickoff(inputs=inputs)
print(async_result)

# Example of using native async with akickoff_for_each
inputs_array = [{'topic': 'AI in healthcare'}, {'topic': 'AI in finance'}]
async_results = await my_crew.akickoff_for_each(inputs=inputs_array)
for async_result in async_results:
    print(async_result)

# Example of using thread-based kickoff_async
inputs = {'topic': 'AI in healthcare'}
async_result = await my_crew.kickoff_async(inputs=inputs)
print(async_result)

# Example of using thread-based kickoff_for_each_async
inputs_array = [{'topic': 'AI in healthcare'}, {'topic': 'AI in finance'}]
async_results = await my_crew.kickoff_for_each_async(inputs=inputs_array)
for async_result in async_results:
    print(async_result)
```
These methods provide flexibility in how you manage and execute tasks within your crew, allowing for both synchronous and asynchronous workflows tailored to your needs. For detailed async examples, see the [Kickoff Crew Asynchronously](/en/learn/kickoff-async) guide.
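The async snippets above use bare `await`, so they must run inside an async function. As a minimal usage sketch (assuming `my_crew` is the Crew instance from the examples above), you can drive the native async kickoff from a synchronous entrypoint with `asyncio.run`:
```python Code
import asyncio

# Sketch: running the native async kickoff from a synchronous entrypoint.
# Assumes `my_crew` is the Crew instance defined in the examples above.
async def run_all():
    inputs_array = [{'topic': 'AI in healthcare'}, {'topic': 'AI in finance'}]
    # akickoff_for_each runs natively async across all inputs
    return await my_crew.akickoff_for_each(inputs=inputs_array)

async_results = asyncio.run(run_all())
for async_result in async_results:
    print(async_result)
```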
### Streaming Crew Execution


@@ -283,11 +283,54 @@ In this section, you'll find detailed examples that help you select, configure,
)
```
**Extended Thinking (Claude Sonnet 4 and Beyond):**
CrewAI supports Anthropic's Extended Thinking feature, which allows Claude to think through problems in a more human-like way before responding. This is particularly useful for complex reasoning, analysis, and problem-solving tasks.
```python Code
from crewai import LLM

# Enable extended thinking with default settings
llm = LLM(
    model="anthropic/claude-sonnet-4",
    thinking={"type": "enabled"},
    max_tokens=10000
)

# Configure thinking with budget control
llm = LLM(
    model="anthropic/claude-sonnet-4",
    thinking={
        "type": "enabled",
        "budget_tokens": 5000  # Limit thinking tokens
    },
    max_tokens=10000
)
```
**Thinking Configuration Options:**
- `type`: Set to `"enabled"` to activate extended thinking mode
- `budget_tokens` (optional): Maximum tokens to use for thinking (helps control costs)
**Models Supporting Extended Thinking:**
- `claude-sonnet-4` and newer models
- `claude-3-7-sonnet` (with extended thinking capabilities)
**When to Use Extended Thinking:**
- Complex reasoning and multi-step problem solving
- Mathematical calculations and proofs
- Code analysis and debugging
- Strategic planning and decision making
- Research and analytical tasks
**Note:** Extended thinking consumes additional tokens but can significantly improve response quality for complex tasks.
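A thinking-enabled LLM can be assigned to an agent like any other model. Here is a minimal sketch of that wiring; the role, goal, and backstory strings are illustrative placeholders, not from the original docs:
```python Code
from crewai import Agent, LLM

# Sketch: assigning an extended-thinking model to a reasoning-heavy agent.
# The role/goal/backstory values below are illustrative placeholders.
thinking_llm = LLM(
    model="anthropic/claude-sonnet-4",
    thinking={"type": "enabled", "budget_tokens": 5000},
    max_tokens=10000
)

analyst = Agent(
    role="Strategic Analyst",
    goal="Work through multi-step planning problems carefully",
    backstory="An analyst who reasons step by step before answering.",
    llm=thinking_llm
)
```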
**Supported Environment Variables:**
- `ANTHROPIC_API_KEY`: Your Anthropic API key (required)
**Features:**
- Native tool use support for Claude 3+ models
- Extended Thinking support for Claude Sonnet 4+
- Streaming support for real-time responses
- Automatic system message handling
- Stop sequences for controlled output
@@ -305,6 +348,7 @@ In this section, you'll find detailed examples that help you select, configure,
| Model | Context Window | Best For |
|------------------------------|----------------|-----------------------------------------------|
| claude-sonnet-4 | 200,000 tokens | Latest with extended thinking capabilities |
| claude-3-7-sonnet | 200,000 tokens | Advanced reasoning and agentic tasks |
| claude-3-5-sonnet-20241022 | 200,000 tokens | Strong general-purpose Sonnet performance |
| claude-3-5-haiku | 200,000 tokens | Fast, compact model for quick responses |
@@ -1089,6 +1133,50 @@ CrewAI supports streaming responses from LLMs, allowing your application to rece
</Tab>
</Tabs>
## Async LLM Calls
CrewAI supports asynchronous LLM calls for improved performance and concurrency in your AI workflows. Async calls allow you to run multiple LLM requests concurrently without blocking, making them ideal for high-throughput applications and parallel agent operations.
<Tabs>
<Tab title="Basic Usage">
Use the `acall` method for asynchronous LLM requests:
```python
import asyncio
from crewai import LLM

async def main():
    llm = LLM(model="openai/gpt-4o")

    # Single async call
    response = await llm.acall("What is the capital of France?")
    print(response)

asyncio.run(main())
```
The `acall` method supports all the same parameters as the synchronous `call` method, including messages, tools, and callbacks.
</Tab>
<Tab title="With Streaming">
Combine async calls with streaming for real-time concurrent responses:
```python
import asyncio
from crewai import LLM

async def stream_async():
    llm = LLM(model="openai/gpt-4o", stream=True)
    response = await llm.acall("Write a short story about AI")
    print(response)

asyncio.run(stream_async())
```
</Tab>
</Tabs>
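Since the main benefit of async calls is concurrency, here is a minimal sketch (not from the docs above) that fans out several `acall` requests at once with `asyncio.gather`:
```python
import asyncio
from crewai import LLM

async def main():
    llm = LLM(model="openai/gpt-4o")
    questions = [
        "What is the capital of France?",
        "What is the capital of Japan?",
        "What is the capital of Brazil?",
    ]
    # Fan out all requests concurrently; gather returns results in input order
    responses = await asyncio.gather(*(llm.acall(q) for q in questions))
    for question, response in zip(questions, responses):
        print(f"{question} -> {response}")

asyncio.run(main())
```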
## Structured LLM Calls
CrewAI supports structured responses from LLM calls by allowing you to define a `response_format` using a Pydantic model. This enables the framework to automatically parse and validate the output, making it easier to integrate the response into your application without manual post-processing.
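As a hedged sketch of the idea, assuming `response_format` is passed to the `LLM` constructor as described above (the Pydantic model here is illustrative):
```python
from pydantic import BaseModel
from crewai import LLM

# Illustrative Pydantic schema for the structured response
class CityInfo(BaseModel):
    city: str
    country: str

# Passing the model as response_format lets the framework parse and
# validate the output into the schema automatically
llm = LLM(model="openai/gpt-4o", response_format=CityInfo)
result = llm.call("Name one major European city and its country.")
print(result)
```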


@@ -515,8 +515,7 @@ crew = Crew(
"provider": "huggingface",
"config": {
"api_key": "your-hf-token", # Optional for public models
"model": "sentence-transformers/all-MiniLM-L6-v2",
"api_url": "https://api-inference.huggingface.co" # or your custom endpoint
"model": "sentence-transformers/all-MiniLM-L6-v2"
}
}
)