mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-07-01 05:08:12 +00:00
docs(enterprise): register structured logs + Datadog dashboard in pt-BR/ko/ar
Adds stub MDX pages with translated frontmatter and a "translation in progress" note in each locale. Body content is English while waiting on full translations, matching the discoverability of every other Enterprise guide (registered across all four edge locales).

Also translates the Datadog OTLP /v1/logs touch-up and the new cross-links in pt-BR/ko/ar versions of capture_telemetry_logs.mdx.

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -8649,6 +8649,8 @@
|
||||
"edge/pt-BR/enterprise/guides/update-crew",
|
||||
"edge/pt-BR/enterprise/guides/enable-crew-studio",
|
||||
"edge/pt-BR/enterprise/guides/capture_telemetry_logs",
|
||||
"edge/pt-BR/enterprise/guides/structured_logs",
|
||||
"edge/pt-BR/enterprise/guides/datadog_dashboard",
|
||||
"edge/pt-BR/enterprise/guides/azure-openai-setup",
|
||||
"edge/pt-BR/enterprise/guides/tool-repository",
|
||||
"edge/pt-BR/enterprise/guides/custom-mcp-server",
|
||||
@@ -16512,6 +16514,8 @@
|
||||
"edge/ko/enterprise/guides/update-crew",
|
||||
"edge/ko/enterprise/guides/enable-crew-studio",
|
||||
"edge/ko/enterprise/guides/capture_telemetry_logs",
|
||||
"edge/ko/enterprise/guides/structured_logs",
|
||||
"edge/ko/enterprise/guides/datadog_dashboard",
|
||||
"edge/ko/enterprise/guides/azure-openai-setup",
|
||||
"edge/ko/enterprise/guides/tool-repository",
|
||||
"edge/ko/enterprise/guides/custom-mcp-server",
|
||||
@@ -24567,6 +24571,8 @@
|
||||
"edge/ar/enterprise/guides/update-crew",
|
||||
"edge/ar/enterprise/guides/enable-crew-studio",
|
||||
"edge/ar/enterprise/guides/capture_telemetry_logs",
|
||||
"edge/ar/enterprise/guides/structured_logs",
|
||||
"edge/ar/enterprise/guides/datadog_dashboard",
|
||||
"edge/ar/enterprise/guides/azure-openai-setup",
|
||||
"edge/ar/enterprise/guides/tool-repository",
|
||||
"edge/ar/enterprise/guides/custom-mcp-server",
|
||||
|
||||
@@ -49,7 +49,9 @@ mode: "wide"
|
||||
- `otlp.ap1.datadoghq.com` (AP1)
|
||||
- **API Key**: your Datadog API key. See [how to create one](https://docs.datadoghq.com/account_management/api-app-keys/#api-keys).
|
||||
|
||||
The Datadog integration exports **traces**.
|
||||
The default Datadog template sends **traces** to the `/v1/traces` path. To export **logs** over OTLP, add an **OpenTelemetry Logs** collector that points at the same Datadog OTLP host with the path set to `/v1/logs`; the two signals can run side by side.
|
||||
|
||||
To ship logs via stdout (the Datadog Agent path) instead of OTLP, see [Structured JSON Logs](/ar/enterprise/guides/structured_logs) and [Datadog Dashboard for crewAI](/ar/enterprise/guides/datadog_dashboard).
|
||||
|
||||
<Frame></Frame>
|
||||
</Tab>
|
||||
|
||||
140
docs/edge/ar/enterprise/guides/datadog_dashboard.mdx
Normal file
@@ -0,0 +1,140 @@
|
||||
---
|
||||
title: "Datadog Dashboard for crewAI"
|
||||
description: "Import a ready-made Datadog dashboard to monitor self-hosted CrewAI AMP deployments: executions, errors, token cost, and version distribution. Works with the Datadog Agent and with Datadog's OTLP intake."
|
||||
icon: "dog"
|
||||
mode: "wide"
|
||||
---
|
||||
|
||||
<Note>
|
||||
**Translation in progress**: content is shown in English.
|
||||
</Note>
|
||||
|
||||
CrewAI ships a ready-made Datadog dashboard for self-hosted AMP deployments. Once your logs are flowing into Datadog, you can import the dashboard JSON and have an operations view live in your account in under five minutes.
|
||||
|
||||
The dashboard works with either of Datadog's two log-ingestion paths — pick whichever fits your infrastructure:
|
||||
|
||||
<Tabs>
|
||||
<Tab title="Datadog Agent (stdout)">
|
||||
The Datadog Agent runs alongside your CrewAI containers (typically as a DaemonSet on Kubernetes) and tails their stdout. This path requires enabling [Structured JSON Logs](/ar/enterprise/guides/structured_logs) so each log event is a single billable line instead of a multi-line traceback.
|
||||
|
||||
**Setup:**
|
||||
1. Set `CREWAI_LOG_FORMAT=json` on every CrewAI container — see [Structured JSON Logs](/ar/enterprise/guides/structured_logs) for full details.
|
||||
2. Install the Datadog Agent in your cluster following [Datadog's Kubernetes setup guide](https://docs.datadoghq.com/containers/kubernetes/installation/). Enable log collection (`logs_enabled: true`) and container log collection (`logs_config.container_collect_all: true`).
|
||||
3. Confirm logs are landing in Datadog by searching `service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).
|
||||
|
||||
**When to pick this path:** you already run the Datadog Agent for infrastructure metrics, or you want logs without configuring an OTel collector in AMP.
|
||||
</Tab>
|
||||
<Tab title="Datadog OTLP intake (no agent)">
|
||||
Datadog accepts OTLP traffic directly at its intake endpoint, no agent required. Configure CrewAI AMP's built-in OTel collector to point at Datadog's OTLP host.
|
||||
|
||||
**Setup:**
|
||||
1. In CrewAI AMP: **Settings → OpenTelemetry Collectors → Add Collector → Datadog**. See [OpenTelemetry Export](/ar/enterprise/guides/capture_telemetry_logs) for the full collector setup.
|
||||
2. The default Datadog template ships **traces** to `/v1/traces`. For log export, switch the endpoint path to `/v1/logs` on the OpenTelemetry Logs collector (use the same Datadog OTLP host).
|
||||
3. Confirm logs are landing by searching `source:otlp service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).
|
||||
|
||||
**When to pick this path:** you can't or don't want to run the Datadog Agent, or you're already using OTLP for traces and want a single export pipeline.
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
Either path lands the same structured facets in Datadog (`@automation_id`, `@kickoff_id`, `@execution_id`, `@automation_name`, `@crewai_version`, `@exception.type`, `@gen_ai.*`), so the dashboard works identically with either choice.
|
||||
|
||||
## Prerequisite: promote facets
|
||||
|
||||
Datadog auto-discovers fields the first time it sees them but doesn't make them queryable in widgets until they're promoted to **facets**. This is a one-time setup in your Datadog account.
|
||||
|
||||
<Steps>
|
||||
<Step title="Search for a CrewAI log">
|
||||
Open [Logs Explorer](https://app.datadoghq.com/logs) and search `service:crewai*`. You should see at least one log event.
|
||||
</Step>
|
||||
<Step title="Promote each field">
|
||||
Click any log entry to open the right-hand details panel. For each field below, hover over the field name → click the gear icon → **Create facet**.
|
||||
|
||||
- `automation_id`, `automation_name`, `execution_id`, `kickoff_id`, `task_id`
|
||||
- `crewai_version`, `model_id`
|
||||
- `exception.type`, `exception.message`
|
||||
|
||||
Skip any field that already shows a star icon next to its name — that means it's already a facet. The `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, and `gen_ai.request.model` facets are typically promoted automatically by Datadog's LLM Observability auto-discovery, but verify they exist before importing the dashboard.
|
||||
</Step>
|
||||
</Steps>
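Once the fields are facets, they can be searched with the `@` prefix used elsewhere on this page (for example `@exception.type`). The following is a minimal sketch of composing such a query string; `build_log_query` is a hypothetical helper, not part of crewAI or Datadog, and it only quotes values containing whitespace rather than handling every special character in Datadog's search syntax.

```python
from typing import Mapping


def build_log_query(facets: Mapping[str, str], service: str = "crewai*") -> str:
    """Compose a Logs Explorer query from a service pattern and facet filters."""
    terms = [f"service:{service}"]
    for name, value in facets.items():
        # Values with whitespace must be quoted to stay a single search term.
        if any(ch.isspace() for ch in value):
            value = f'"{value}"'
        # Custom attributes (facets) are addressed with an "@" prefix.
        terms.append(f"@{name}:{value}")
    return " ".join(terms)


print(build_log_query({"exception.type": "ValueError", "automation_name": "research_flow"}))
```

Pasting the printed string into Logs Explorer is a quick way to confirm that a newly promoted facet is queryable.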
|
||||
|
||||
## Import the dashboard
|
||||
|
||||
<Steps>
|
||||
<Step title="Download the dashboard JSON">
|
||||
Save [`datadog_dashboard.json`](https://raw.githubusercontent.com/crewAIInc/crewAI/main/docs/edge/en/enterprise/guides/datadog_dashboard.json) to your machine.
|
||||
</Step>
|
||||
<Step title="Open the import dialog in Datadog">
|
||||
Navigate to **Dashboards → New Dashboard**. Click the **gear icon** in the top right of the empty dashboard and select **Import Dashboard JSON**.
|
||||
</Step>
|
||||
<Step title="Paste or upload the JSON">
|
||||
Paste the contents of `datadog_dashboard.json` into the import dialog (or drag the file in). Click **Import**.
|
||||
|
||||
Datadog creates the dashboard immediately and lands you on it. The first load may show empty widgets for a few seconds while queries execute against the time range.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
<Tip>
|
||||
Datadog's [Dashboard API](https://docs.datadoghq.com/api/latest/dashboards/#create-a-new-dashboard) accepts the same JSON via `POST /api/v1/dashboard`. Use it if you manage dashboards through Terraform, Pulumi, or CI.
|
||||
</Tip>
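As a rough illustration of the API route, the sketch below builds (but does not send) the `POST /api/v1/dashboard` request with Python's standard library. It assumes the US1 API host `api.datadoghq.com` and the `DD-API-KEY` / `DD-APPLICATION-KEY` headers; `build_dashboard_request` and the placeholder keys are hypothetical names chosen for this example.

```python
import json
import urllib.request

# Assumption: US1 site. Other Datadog sites use a different API host.
DATADOG_API_URL = "https://api.datadoghq.com/api/v1/dashboard"


def build_dashboard_request(dashboard: dict, api_key: str, app_key: str) -> urllib.request.Request:
    """Build (without sending) a request that would create a dashboard."""
    return urllib.request.Request(
        DATADOG_API_URL,
        data=json.dumps(dashboard).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
    )


# Placeholder payload and keys; in practice, load datadog_dashboard.json from disk.
request = build_dashboard_request({"title": "crewAI"}, "<api-key>", "<app-key>")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without credentials.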
|
||||
|
||||
## What you get
|
||||
|
||||
The dashboard is organized into four sections plus a placeholder for a custom drill-down widget:
|
||||
|
||||
| Section | Widgets | Useful for |
|
||||
|---------|---------|------------|
|
||||
| **Header** | Total Executions · Error Rate (%) · Active Automations · CrewAI Versions in Use | At-a-glance health for the last hour. Error Rate is conditionally formatted (green ≤ 5%, yellow ≤ 10%, red > 10%). |
|
||||
| **Throughput** | Executions per Hour by Automation (top 10, stacked bars) | Spotting traffic shifts, surfacing busy automations, validating that a rollout didn't change baseline volume. |
|
||||
| **Errors** | Errors by Exception Type (top 5, stacked bars) · Top Exception Types by Count (toplist) | Triaging failures — which exception types are spiking, which automations they're hitting. |
|
||||
| **Cost** | Total Tokens per Hour by Model (input + output, stacked area) | Tracking LLM token spend by model. Useful for catching cost regressions when an automation switches model or starts looping. |
|
||||
| **Drill-Down** | _(empty placeholder)_ | See [Customization](#customization) for adding a recent-errors log stream here. |
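
The Error Rate thresholds in the table can be read as a simple classification. The sketch below assumes the rate is errors divided by executions, which this page does not spell out, and uses hypothetical function names; it also shows why a window with zero executions has no meaningful rate.

```python
from typing import Optional


def error_rate_percent(errors: int, executions: int) -> Optional[float]:
    """Return the error rate in percent, or None when there were no executions."""
    if executions == 0:
        return None  # dividing by zero has no meaningful result
    return 100.0 * errors / executions


def error_rate_color(rate: Optional[float]) -> str:
    """Map a rate to the colors listed in the table (green, yellow, red)."""
    if rate is None:
        return "no data"
    if rate <= 5.0:
        return "green"
    if rate <= 10.0:
        return "yellow"
    return "red"


for errors in (3, 8, 11):
    rate = error_rate_percent(errors, 100)
    print(errors, rate, error_rate_color(rate))
```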
|
||||
|
||||
Three template variables at the top of the dashboard re-scope every widget at once:
|
||||
|
||||
- **`$automation`** — filter to a single automation by name.
|
||||
- **`$version`** — filter to a single `crewai` SDK version (useful for comparing pre- and post-upgrade behavior).
|
||||
- **`$service`** — filter to a specific Datadog `service` tag (useful when multiple CrewAI deployments share one Datadog account).
|
||||
|
||||
## Customization
|
||||
|
||||
The dashboard ships with deliberate gaps so you can extend it without deleting and re-importing it.
|
||||
|
||||
### Add a Recent Errors log stream
|
||||
|
||||
The **Drill-Down** section is intentionally empty. Add a Log Stream widget to it for an inline view of recent failures:
|
||||
|
||||
1. Edit the dashboard and click **+ Add Widgets** inside the Drill-Down group.
|
||||
2. Drag in a **Log Stream** widget.
|
||||
3. Set the filter query to `status:error $automation $version $service`.
|
||||
4. Choose columns: `@timestamp`, `@automation_name`, `@exception.type`, `@exception.message`, `@execution_id`.
|
||||
5. Sort by most recent, limit to 25 entries.
|
||||
|
||||
Clicking any row jumps to Logs Explorer with the same filter pre-applied.
|
||||
|
||||
### Add p95 latency
|
||||
|
||||
Logs don't include execution duration by default. Two ways to add a latency widget:
|
||||
|
||||
- **From APM traces** — if you also export OTLP traces to Datadog, add a Timeseries widget with data source **Traces**, query `service:crewai*`, aggregation `p95 of @duration`. Datadog APM auto-tracks span duration.
|
||||
- **From metric extraction** — extract a `flow.duration_ms` metric from logs via [Datadog's log-to-metric pipeline](https://docs.datadoghq.com/logs/log_configuration/logs_to_metrics/), then chart it like any other metric. Useful if you don't run APM.
|
||||
|
||||
### Re-scope to multiple deployments
|
||||
|
||||
The `$service` template variable defaults to `*` and will catch every CrewAI deployment in your Datadog account. Change the default to a specific service name in **Configure → Template Variables** if you want the dashboard to focus on one deployment by default.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Symptom | Likely cause | Fix |
|
||||
|---------|--------------|-----|
|
||||
| All widgets show "No data" | Facets aren't promoted | Re-do the [Promote facets](#prerequisite-promote-facets) step. Datadog won't query against an un-promoted field. |
|
||||
| Error Rate widget shows `NaN` | No executions in the time window | Either no traffic, or `@execution_id` isn't faceted. Expand the time range and re-check facets. |
|
||||
| Throughput chart is flat at the same value | Logs aren't reaching Datadog | Search `service:crewai*` in Logs Explorer. If nothing shows, verify the Datadog Agent is running (Agent path) or the OTel collector endpoint is correct (OTLP path). |
|
||||
| `crewai_version` shows fewer values than expected | Some containers predate the structured-logs work | The `crewai_version` field was added alongside JSON output. Older deployments running text mode (or older AMP builds) won't emit it. Upgrade those deployments to pick up the field. |
|
||||
| Template variables don't filter widgets | The widget's filter line doesn't reference the template variable | Edit the widget and confirm the search includes `$automation $version $service`. |
|
||||
|
||||
## References
|
||||
|
||||
- [Structured JSON Logs](/ar/enterprise/guides/structured_logs) — the underlying log format the dashboard queries against.
|
||||
- [OpenTelemetry Export](/ar/enterprise/guides/capture_telemetry_logs) — set up the OTLP path if you're not using the Datadog Agent.
|
||||
- [Datadog Log Search Syntax](https://docs.datadoghq.com/logs/explorer/search_syntax/) — reference for customizing widget queries.
|
||||
- [Datadog Dashboard JSON Schema](https://docs.datadoghq.com/dashboards/graphing_json/) — full reference for the dashboard file format if you want to script changes.
|
||||
146
docs/edge/ar/enterprise/guides/structured_logs.mdx
Normal file
@@ -0,0 +1,146 @@
|
||||
---
|
||||
title: "Structured JSON Logs"
|
||||
description: "Emit single-line JSON log events from CrewAI AMP deployments for structured, lower-cost ingestion in Datadog, Splunk, Loki, and other log backends."
|
||||
icon: "brackets-curly"
|
||||
mode: "wide"
|
||||
---
|
||||
|
||||
<Note>
|
||||
**Translation in progress**: content is shown in English.
|
||||
</Note>
|
||||
|
||||
CrewAI AMP can emit one JSON object per log event on stdout instead of the default multi-line text format. Each event ships with typed context fields (automation, kickoff, execution, trace IDs, exception details), making logs cheaper to index, easier to search, and trivially correlatable with traces.
|
||||
|
||||
This page describes the JSON schema, how to enable it, and how to verify it's working. For a ready-made Datadog dashboard built on top of these fields, see [Datadog Dashboard for crewAI](/ar/enterprise/guides/datadog_dashboard).
|
||||
|
||||
## Why use JSON output
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Lower ingestion cost" icon="dollar-sign">
|
||||
Most managed log backends bill per event. A Python traceback in text format is counted as one event per line — 30+ events for a single error. JSON output collapses each traceback into a single event with the stack trace as an escaped string field.
|
||||
</Card>
|
||||
<Card title="Structured search" icon="magnifying-glass">
|
||||
Search by `@automation_id`, `@exception.type`, `@kickoff_id` instead of grepping free-text. Build dashboards on typed facets without parser configuration.
|
||||
</Card>
|
||||
<Card title="APM ↔ logs correlation" icon="link">
|
||||
Every event carries `trace_id` and `span_id` when emitted inside a recording span, so backends auto-link logs to traces.
|
||||
</Card>
|
||||
<Card title="Backend agnostic" icon="server">
|
||||
The format is plain JSON — Datadog, Splunk, Loki, Elasticsearch, and CloudWatch all parse it natively without custom log pipelines.
|
||||
</Card>
|
||||
</CardGroup>
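To make the ingestion-cost point concrete, here is a small sketch that compares how many stdout lines one exception occupies as a plain-text traceback versus as a single JSON object. The helper names and the event layout are illustrative assumptions and not CrewAI's actual formatter.

```python
import json
import traceback


def render_traceback(exc: BaseException) -> str:
    """Render an exception the way a text-mode logger would print it."""
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))


def text_mode_lines(exc: BaseException) -> int:
    """Number of stdout lines for a plain-text traceback."""
    return len(render_traceback(exc).splitlines())


def json_mode_lines(exc: BaseException) -> int:
    """Number of stdout lines when the traceback is an escaped string field."""
    event = {
        "level": "ERROR",
        "msg": "Flow execution failed",
        "exception": {
            "type": type(exc).__name__,
            "message": str(exc),
            "stacktrace": render_traceback(exc),
        },
    }
    return len(json.dumps(event).splitlines())


def fail(depth: int) -> None:
    """Raise from a few frames deep so the traceback spans several lines."""
    if depth == 0:
        raise ValueError("Topic cannot be empty")
    fail(depth - 1)


try:
    fail(3)
except ValueError as exc:
    caught = exc  # keep a reference; `exc` is cleared when the block ends

print(text_mode_lines(caught), "text lines vs", json_mode_lines(caught), "JSON line")
```

The JSON count is always one because `json.dumps` escapes the newlines inside the stack trace, which is what lets a per-event backend count the error once.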
|
||||
|
||||
## Enabling JSON output
|
||||
|
||||
Set the `CREWAI_LOG_FORMAT` environment variable to `json` on every container that runs your deployment (API + workers).
|
||||
|
||||
```shell
|
||||
CREWAI_LOG_FORMAT=json
|
||||
```
|
||||
|
||||
Restart the deployment to pick up the change. Every log line on stdout from that point on is a single JSON object.
|
||||
|
||||
<Note>
|
||||
The default value is `text`, which preserves the legacy human-readable line format byte-for-byte. Setting any value other than `json` falls back to text mode. There is no migration step: the variable is read at process start, so the new format takes effect as soon as the process restarts.
|
||||
</Note>
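The fallback rule in the note can be sketched as follows. `resolve_log_format` is a hypothetical name, and the exact-match comparison (no case folding or trimming) is an assumption; the page only states that values other than `json` fall back to text.

```python
import os
from typing import Mapping


def resolve_log_format(env: Mapping[str, str] = os.environ) -> str:
    """Return "json" only when CREWAI_LOG_FORMAT is exactly "json", else "text"."""
    value = env.get("CREWAI_LOG_FORMAT", "text")
    # Assumption: an exact match; anything else is treated as text mode.
    return "json" if value == "json" else "text"


print(resolve_log_format({}))                             # unset
print(resolve_log_format({"CREWAI_LOG_FORMAT": "json"}))  # enabled
print(resolve_log_format({"CREWAI_LOG_FORMAT": "yaml"}))  # unknown value
```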
|
||||
|
||||
## What a log event looks like
|
||||
|
||||
A single info-level log inside an active automation kickoff:
|
||||
|
||||
```json
|
||||
{
|
||||
"schema": "v1",
|
||||
"ts": "2026-06-17T16:14:23.482914Z",
|
||||
"level": "INFO",
|
||||
"logger": "crewai_enterprise.utilities.pii_redaction",
|
||||
"crewai_version": "1.14.7",
|
||||
"msg": "PII tracking state reset (engines preserved)",
|
||||
"automation_id": "12",
|
||||
"task_id": "0843a930-b306-464b-89c8-bfafa78cc711",
|
||||
"kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
|
||||
"execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
|
||||
"automation_name": "research_flow"
|
||||
}
|
||||
```
|
||||
|
||||
An error with a Python exception is collapsed into a single event with the traceback as a string:
|
||||
|
||||
```json
|
||||
{
|
||||
"schema": "v1",
|
||||
"ts": "2026-06-17T16:14:31.218450Z",
|
||||
"level": "ERROR",
|
||||
"logger": "api.tasks.flow_run_task",
|
||||
"crewai_version": "1.14.7",
|
||||
"msg": "Flow execution failed",
|
||||
"automation_id": "12",
|
||||
"kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
|
||||
"execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
|
||||
"automation_name": "research_flow",
|
||||
"exception": {
|
||||
"type": "ValueError",
|
||||
"message": "Topic cannot be empty",
|
||||
"stacktrace": "Traceback (most recent call last):\n File \"/app/flow.py\", line 42, in summarize\n ...\nValueError: Topic cannot be empty\n"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The same error in legacy text mode would have produced ~25 separate log events (one per traceback line) — all of which the backend would bill and index individually.
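Because each event is one JSON object, downstream tooling can work on it with an ordinary JSON parser. The sketch below tallies exception types from a list of log lines; the function name and the abbreviated sample lines are made up for illustration, using only field names shown in the examples above.

```python
import json
from collections import Counter
from typing import Iterable


def count_exception_types(lines: Iterable[str]) -> Counter:
    """Count events per exception type; events without an exception are skipped."""
    counts: Counter = Counter()
    for line in lines:
        event = json.loads(line)
        exception = event.get("exception")
        if exception is not None:
            counts[exception["type"]] += 1
    return counts


# Abbreviated sample events (not real output).
sample = [
    '{"schema": "v1", "level": "INFO", "msg": "PII tracking state reset (engines preserved)"}',
    '{"schema": "v1", "level": "ERROR", "msg": "Flow execution failed", '
    '"exception": {"type": "ValueError", "message": "Topic cannot be empty", '
    '"stacktrace": "Traceback (most recent call last):\\n  ...\\n"}}',
]
print(count_exception_types(sample))
```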
|
||||
|
||||
## Schema v1 field reference
|
||||
|
||||
Within the `v1` schema, fields are only added, never renamed or removed. New fields will appear as soon as a deployment is upgraded.
|
||||
|
||||
| Field | Type | Present | Description |
|
||||
|-------|------|----------------|--------|
|
||||
| `schema` | string | Yes | Constant `"v1"`. A different value (e.g. `"v2"`) indicates a breaking schema change. |
|
||||
| `ts` | string (ISO-8601 UTC, microseconds) | Yes | Record creation time, e.g. `2026-06-17T16:14:23.482914Z`. |
|
||||
| `level` | string | Yes | Python log level name: `DEBUG` / `INFO` / `WARNING` / `ERROR` / `CRITICAL`. |
|
||||
| `logger` | string | Yes | Dotted logger name, e.g. `api.tasks.flow_run_task`. |
|
||||
| `crewai_version` | string | Yes (when `crewai` package metadata is resolvable) | Installed `crewai` package version, e.g. `"1.14.7"`. |
|
||||
| `msg` | string | Yes | Rendered log message (after `%`-formatting / `{}`-formatting). |
|
||||
| `automation_id` | string | When `CREWAI_PLUS_ID` env var is set | Numeric deployment ID (AMP provisions this on every container). |
|
||||
| `task_id` | string | On Celery worker logs | Celery task UUID, or `"no-task"` for non-task contexts. |
|
||||
| `kickoff_id` | string | Inside an automation kickoff | UUID of the current kickoff. |
|
||||
| `execution_id` | string | Inside an automation kickoff | UUID of the current sub-execution. Equal to `kickoff_id` at the top level; differs for nested flow methods that spawn sub-executions. |
|
||||
| `automation_name` | string | Inside an automation kickoff | Human-readable automation/flow name, e.g. `"research_flow"`. |
|
||||
| `trace_id` | string (32-hex) | Inside a recording OpenTelemetry span | Hex trace ID. Omitted when no span is active. |
|
||||
| `span_id` | string (16-hex) | Inside a recording OpenTelemetry span | Hex span ID. Omitted when no span is active. |
|
||||
| `exception` | object | When the log record has `exc_info` | `{type, message, stacktrace}` — full traceback as a single escaped string. |
|
||||
|
||||
<Tip>
|
||||
Any additional `extra={...}` kwargs passed to a logger call appear as top-level JSON fields verbatim. Reserved field names above always win to keep the schema stable.
|
||||
</Tip>
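One way to picture the "reserved names win" rule is a formatter that merges extras first and reserved fields last. This is a simplified sketch under that assumption, not CrewAI's implementation; `SketchJsonFormatter` is a hypothetical class that covers only a handful of the fields in the table.

```python
import json
import logging
from datetime import datetime, timezone

# Attributes every LogRecord carries; anything else arrived via `extra={...}`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None))) | {"message", "asctime"}


class SketchJsonFormatter(logging.Formatter):
    """Illustrative single-line JSON formatter with a few reserved fields."""

    def format(self, record: logging.LogRecord) -> str:
        extras = {k: v for k, v in vars(record).items() if k not in _STANDARD_ATTRS}
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        reserved = {
            "schema": "v1",
            "ts": created.strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        if record.exc_info and record.exc_info[0] is not None:
            exc_type, exc, _tb = record.exc_info
            reserved["exception"] = {
                "type": exc_type.__name__,
                "message": str(exc),
                "stacktrace": self.formatException(record.exc_info),
            }
        # Reserved fields are merged last, so they override colliding extras.
        return json.dumps({**extras, **reserved}, default=str)


record = logging.makeLogRecord({
    "name": "api.tasks.flow_run_task",
    "levelname": "INFO",
    "msg": "kickoff %s started",
    "args": ("demo",),
    "automation_name": "research_flow",  # behaves like an extra field
    "schema": "v9",                      # collides with a reserved name
})
event = json.loads(SketchJsonFormatter().format(record))
print(event["schema"], event["automation_name"], event["msg"])
```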
|
||||
|
||||
## Verifying it's working
|
||||
|
||||
After enabling the env var and restarting, fetch the latest container logs and confirm each line is a single JSON object:
|
||||
|
||||
```shell
|
||||
# Example: docker logs <api-container> --tail 10
|
||||
docker logs $(docker ps -qf name=crewai-api) --tail 10 | jq -r '.msg'
|
||||
```
|
||||
|
||||
If the output is JSON, each line will parse successfully and `jq` will print only the `msg` field. If you see "parse error", the env var didn't take effect — confirm it's set in the running container and that the deployment was restarted after the change.
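For a stricter check than `jq`, a short script can confirm that every line is a JSON object carrying the fields the schema table marks as always present. The sketch below uses a hypothetical `find_bad_lines` helper and sample lines invented for the example.

```python
import json
from typing import Iterable, List

# Fields the schema table lists as unconditionally present.
REQUIRED_FIELDS = ("schema", "ts", "level", "logger", "msg")


def find_bad_lines(lines: Iterable[str]) -> List[int]:
    """Return 1-based numbers of lines that are not complete JSON log events."""
    bad = []
    for number, line in enumerate(lines, start=1):
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            bad.append(number)  # plain text, e.g. a legacy traceback line
            continue
        if not isinstance(event, dict) or any(f not in event for f in REQUIRED_FIELDS):
            bad.append(number)
    return bad


sample = [
    '{"schema": "v1", "ts": "2026-06-17T16:14:23.482914Z", "level": "INFO", "logger": "demo", "msg": "ok"}',
    "Traceback (most recent call last):",
    '{"msg": "missing the other fields"}',
]
print(find_bad_lines(sample))
```

An empty result means every line parsed and carried the core fields; any reported line number points at output that is still in text mode or was written by something other than the logger.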
|
||||
|
||||
## Compatibility and versioning
|
||||
|
||||
The `schema` field declares the contract. Within `v1`, CrewAI commits to:
|
||||
|
||||
- **Never removing a field** that customers may have built queries or dashboards against.
|
||||
- **Never renaming a field** in place — renames happen via a schema bump (e.g. `v2`), with the old name kept as a deprecated alias for at least one release cycle.
|
||||
- **Adding new fields** at any time. Consumers should ignore unknown top-level keys.
|
||||
|
||||
When a `v2` is introduced, the new schema and its migration guide will be published in advance, and `v1` will continue to be emitted for one release cycle so dashboards and queries have time to migrate.
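A consumer that follows this contract checks `schema` first and ignores keys it does not recognize. The sketch below shows one possible shape; `read_v1_event` and the chosen `KNOWN_FIELDS` subset are assumptions for illustration.

```python
import json
from typing import Any, Dict, Optional

# The subset of v1 fields this hypothetical consumer cares about.
KNOWN_FIELDS = {
    "ts", "level", "logger", "msg",
    "automation_id", "kickoff_id", "execution_id", "automation_name",
}


def read_v1_event(line: str) -> Optional[Dict[str, Any]]:
    """Parse one log line; return None for any schema other than v1."""
    event = json.loads(line)
    if event.get("schema") != "v1":
        return None  # leave other schema versions to a dedicated handler
    # Unknown top-level keys are dropped, so newly added fields cannot break us.
    return {key: value for key, value in event.items() if key in KNOWN_FIELDS}


print(read_v1_event('{"schema": "v1", "level": "INFO", "msg": "ok", "future_field": 1}'))
print(read_v1_event('{"schema": "v2", "level": "INFO", "msg": "ok"}'))
```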
|
||||
|
||||
## What's next
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Datadog Dashboard for crewAI" icon="dog" href="/ar/enterprise/guides/datadog_dashboard">
|
||||
Import a ready-made operations dashboard built on these facets — executions, errors, token cost, version distribution. Works with both the Datadog Agent and Datadog's OTLP intake.
|
||||
</Card>
|
||||
<Card title="OpenTelemetry Export" icon="magnifying-glass-chart" href="/ar/enterprise/guides/capture_telemetry_logs">
|
||||
Ship logs and traces to your own OTel collector or directly to a backend's OTLP intake. The same context fields land as OTLP attributes, so the dashboard works regardless of which path you use.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
@@ -49,7 +49,9 @@ CrewAI AMP는 배포에서 OpenTelemetry **트레이스**와 **로그**를 자
|
||||
- `otlp.ap1.datadoghq.com` (AP1)
|
||||
- **API Key**: your Datadog API key. See [how to create a key](https://docs.datadoghq.com/account_management/api-app-keys/#api-keys).
|
||||
|
||||
The Datadog integration exports **traces**.
|
||||
The default Datadog template sends **traces** to the `/v1/traces` path. To export **logs** over OTLP, add an **OpenTelemetry Logs** collector on the same Datadog OTLP host with the path set to `/v1/logs`; the two signals can run side by side.
|
||||
|
||||
If you want stdout-based log shipping (the Datadog Agent path) instead of OTLP, see [Structured JSON Logs](/ko/enterprise/guides/structured_logs) and [Datadog Dashboard for crewAI](/ko/enterprise/guides/datadog_dashboard).
|
||||
|
||||
<Frame></Frame>
|
||||
</Tab>
|
||||
|
||||
140
docs/edge/ko/enterprise/guides/datadog_dashboard.mdx
Normal file
@@ -0,0 +1,140 @@
|
||||
---
|
||||
title: "Datadog Dashboard for crewAI"
|
||||
description: "Import a ready-made Datadog dashboard for monitoring self-hosted CrewAI AMP deployments: executions, errors, token cost, and version distribution. Works with both the Datadog Agent and Datadog's OTLP intake."
|
||||
icon: "dog"
|
||||
mode: "wide"
|
||||
---
|
||||
|
||||
<Note>
|
||||
**Translation in progress**: content is shown in English.
|
||||
</Note>
|
||||
|
||||
CrewAI ships a ready-made Datadog dashboard for self-hosted AMP deployments. Once your logs are flowing into Datadog, you can import the dashboard JSON and have an operations view live in your account in under five minutes.
|
||||
|
||||
The dashboard works with either of Datadog's two log-ingestion paths — pick whichever fits your infrastructure:
|
||||
|
||||
<Tabs>
|
||||
<Tab title="Datadog Agent (stdout)">
|
||||
The Datadog Agent runs alongside your CrewAI containers (typically as a DaemonSet on Kubernetes) and tails their stdout. This path requires enabling [Structured JSON Logs](/ko/enterprise/guides/structured_logs) so each log event is a single billable line instead of a multi-line traceback.
|
||||
|
||||
**Setup:**
|
||||
1. Set `CREWAI_LOG_FORMAT=json` on every CrewAI container — see [Structured JSON Logs](/ko/enterprise/guides/structured_logs) for full details.
|
||||
2. Install the Datadog Agent in your cluster following [Datadog's Kubernetes setup guide](https://docs.datadoghq.com/containers/kubernetes/installation/). Enable log collection (`logs_enabled: true`) and container log collection (`logs_config.container_collect_all: true`).
|
||||
3. Confirm logs are landing in Datadog by searching `service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).
|
||||
|
||||
**When to pick this path:** you already run the Datadog Agent for infrastructure metrics, or you want logs without configuring an OTel collector in AMP.
|
||||
</Tab>
|
||||
<Tab title="Datadog OTLP intake (no agent)">
|
||||
Datadog accepts OTLP traffic directly at its intake endpoint, no agent required. Configure CrewAI AMP's built-in OTel collector to point at Datadog's OTLP host.
|
||||
|
||||
**Setup:**
|
||||
1. In CrewAI AMP: **Settings → OpenTelemetry Collectors → Add Collector → Datadog**. See [OpenTelemetry Export](/ko/enterprise/guides/capture_telemetry_logs) for the full collector setup.
|
||||
2. The default Datadog template ships **traces** to `/v1/traces`. For log export, switch the endpoint path to `/v1/logs` on the OpenTelemetry Logs collector (use the same Datadog OTLP host).
|
||||
3. Confirm logs are landing by searching `source:otlp service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).
|
||||
|
||||
**When to pick this path:** you can't or don't want to run the Datadog Agent, or you're already using OTLP for traces and want a single export pipeline.
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
Either path lands the same structured facets in Datadog (`@automation_id`, `@kickoff_id`, `@execution_id`, `@automation_name`, `@crewai_version`, `@exception.type`, `@gen_ai.*`), so the dashboard works identically with either choice.
|
||||
|
||||
## Prerequisite: promote facets
|
||||
|
||||
Datadog auto-discovers fields the first time it sees them but doesn't make them queryable in widgets until they're promoted to **facets**. This is a one-time setup in your Datadog account.
|
||||
|
||||
<Steps>
|
||||
<Step title="Search for a CrewAI log">
|
||||
Open [Logs Explorer](https://app.datadoghq.com/logs) and search `service:crewai*`. You should see at least one log event.
|
||||
</Step>
|
||||
<Step title="Promote each field">
|
||||
Click any log entry to open the right-hand details panel. For each field below, hover over the field name → click the gear icon → **Create facet**.
|
||||
|
||||
- `automation_id`, `automation_name`, `execution_id`, `kickoff_id`, `task_id`
|
||||
- `crewai_version`, `model_id`
|
||||
- `exception.type`, `exception.message`
|
||||
|
||||
Skip any field that already shows a star icon next to its name — that means it's already a facet. The `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, and `gen_ai.request.model` facets are typically promoted automatically by Datadog's LLM Observability auto-discovery, but verify they exist before importing the dashboard.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
## Import the dashboard
|
||||
|
||||
<Steps>
|
||||
<Step title="Download the dashboard JSON">
|
||||
Save [`datadog_dashboard.json`](https://raw.githubusercontent.com/crewAIInc/crewAI/main/docs/edge/en/enterprise/guides/datadog_dashboard.json) to your machine.
|
||||
</Step>
|
||||
<Step title="Open the import dialog in Datadog">
|
||||
Navigate to **Dashboards → New Dashboard**. Click the **gear icon** in the top right of the empty dashboard and select **Import Dashboard JSON**.
|
||||
</Step>
|
||||
<Step title="Paste or upload the JSON">
|
||||
Paste the contents of `datadog_dashboard.json` into the import dialog (or drag the file in). Click **Import**.
|
||||
|
||||
Datadog creates the dashboard immediately and lands you on it. The first load may show empty widgets for a few seconds while queries execute against the time range.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
<Tip>
|
||||
Datadog's [Dashboard API](https://docs.datadoghq.com/api/latest/dashboards/#create-a-new-dashboard) accepts the same JSON via `POST /api/v1/dashboard`. Use it if you manage dashboards through Terraform, Pulumi, or CI.
|
||||
</Tip>
|
||||
|
||||
## What you get
|
||||
|
||||
The dashboard is organized into four sections plus a placeholder for a custom drill-down widget:
|
||||
|
||||
| Section | Widgets | Useful for |
|
||||
|---------|---------|------------|
|
||||
| **Header** | Total Executions · Error Rate (%) · Active Automations · CrewAI Versions in Use | At-a-glance health for the last hour. Error Rate is conditionally formatted (green ≤ 5%, yellow ≤ 10%, red > 10%). |
|
||||
| **Throughput** | Executions per Hour by Automation (top 10, stacked bars) | Spotting traffic shifts, surfacing busy automations, validating that a rollout didn't change baseline volume. |
|
||||
| **Errors** | Errors by Exception Type (top 5, stacked bars) · Top Exception Types by Count (toplist) | Triaging failures — which exception types are spiking, which automations they're hitting. |
|
||||
| **Cost** | Total Tokens per Hour by Model (input + output, stacked area) | Tracking LLM token spend by model. Useful for catching cost regressions when an automation switches model or starts looping. |
|
||||
| **Drill-Down** | _(empty placeholder)_ | See [Customization](#customization) for adding a recent-errors log stream here. |
|
||||
|
||||
Three template variables at the top of the dashboard re-scope every widget at once:
|
||||
|
||||
- **`$automation`** — filter to a single automation by name.
|
||||
- **`$version`** — filter to a single `crewai` SDK version (useful for comparing pre- and post-upgrade behavior).
|
||||
- **`$service`** — filter to a specific Datadog `service` tag (useful when multiple CrewAI deployments share one Datadog account).
|
||||
|
||||
## Customization
|
||||
|
||||
The dashboard ships with deliberate gaps so you can extend it without deleting and re-importing it.
|
||||
|
||||
### Add a Recent Errors log stream
|
||||
|
||||
The **Drill-Down** section is intentionally empty. Add a Log Stream widget to it for an inline view of recent failures:
|
||||
|
||||
1. Edit the dashboard and click **+ Add Widgets** inside the Drill-Down group.
|
||||
2. Drag in a **Log Stream** widget.
|
||||
3. Set the filter query to `status:error $automation $version $service`.
|
||||
4. Choose columns: `@timestamp`, `@automation_name`, `@exception.type`, `@exception.message`, `@execution_id`.
|
||||
5. Sort by most recent, limit to 25 entries.
|
||||
|
||||
Clicking any row jumps to Logs Explorer with the same filter pre-applied.
|
||||
|
||||
### Add p95 latency
|
||||
|
||||
Logs don't include execution duration by default. Two ways to add a latency widget:
|
||||
|
||||
- **From APM traces** — if you also export OTLP traces to Datadog, add a Timeseries widget with data source **Traces**, query `service:crewai*`, aggregation `p95 of @duration`. Datadog APM auto-tracks span duration.
|
||||
- **From metric extraction** — extract a `flow.duration_ms` metric from logs via [Datadog's log-to-metric pipeline](https://docs.datadoghq.com/logs/log_configuration/logs_to_metrics/), then chart it like any other metric. Useful if you don't run APM.
|
||||
|
||||
### Re-scope to multiple deployments
|
||||
|
||||
The `$service` template variable defaults to `*` and will catch every CrewAI deployment in your Datadog account. Change the default to a specific service name in **Configure → Template Variables** if you want the dashboard to focus on one deployment by default.
|
||||
|
||||
## Troubleshooting

| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| All widgets show "No data" | Facets aren't promoted | Re-do the [Promote facets](#prerequisite-promote-facets) step. Datadog won't query against an un-promoted field. |
| Error Rate widget shows `NaN` | No executions in the time window | Either no traffic, or `@execution_id` isn't faceted. Expand the time range and re-check facets. |
| Throughput chart is flat at the same value | Logs aren't reaching Datadog | Search `service:crewai*` in Logs Explorer. If nothing shows, verify the Datadog Agent is running (Agent path) or the OTel collector endpoint is correct (OTLP path). |
| `crewai_version` shows fewer values than expected | Some containers predate the structured-logs work | The `crewai_version` field was added alongside JSON output. Older deployments running text mode (or older AMP builds) won't emit it. Upgrade those deployments to pick up the field. |
| Template variables don't filter widgets | The widget's filter line doesn't reference the template variable | Edit the widget and confirm the search includes `$automation $version $service`. |
## References

- [Structured JSON Logs](/ko/enterprise/guides/structured_logs) — the underlying log format the dashboard queries against.
- [OpenTelemetry Export](/ko/enterprise/guides/capture_telemetry_logs) — set up the OTLP path if you're not using the Datadog Agent.
- [Datadog Log Search Syntax](https://docs.datadoghq.com/logs/explorer/search_syntax/) — reference for customizing widget queries.
- [Datadog Dashboard JSON Schema](https://docs.datadoghq.com/dashboards/graphing_json/) — full reference for the dashboard file format if you want to script changes.
146
docs/edge/ko/enterprise/guides/structured_logs.mdx
Normal file
@@ -0,0 +1,146 @@
---
title: "Structured JSON Logs"
description: "Emit single-line JSON log events from CrewAI AMP deployments for cheaper, structured ingestion in Datadog, Splunk, Loki, and other log backends."
icon: "brackets-curly"
mode: "wide"
---

<Note>
**Translation in progress**: content is shown in English.
</Note>

CrewAI AMP can emit one JSON object per log event on stdout instead of the default multi-line text format. Each event ships with typed context fields (automation, kickoff, execution, trace IDs, exception details), making logs cheaper to index, easier to search, and easy to correlate with traces.

This page describes the JSON schema, how to enable it, and how to verify it's working. For a ready-made Datadog dashboard built on top of these fields, see [Datadog Dashboard for crewAI](/ko/enterprise/guides/datadog_dashboard).
## Why use JSON output

<CardGroup cols={2}>
<Card title="Lower ingestion cost" icon="dollar-sign">
Most managed log backends bill per event. A Python traceback in text format is counted as one event per line, which can mean 30+ events for a single error. JSON output collapses each traceback into a single event with the stack trace as an escaped string field.
</Card>
<Card title="Structured search" icon="magnifying-glass">
Search by `@automation_id`, `@exception.type`, or `@kickoff_id` instead of grepping free text. Build dashboards on typed facets without parser configuration.
</Card>
<Card title="APM ↔ logs correlation" icon="link">
Every event carries `trace_id` and `span_id` when fired inside a recording span, so backends auto-link logs to traces.
</Card>
<Card title="Backend agnostic" icon="server">
The format is plain JSON — Datadog, Splunk, Loki, Elasticsearch, and CloudWatch all parse it natively without custom log pipelines.
</Card>
</CardGroup>
## Enabling JSON output

Set the `CREWAI_LOG_FORMAT` environment variable to `json` on every container that runs your deployment (API + workers).

```shell
CREWAI_LOG_FORMAT=json
```

Restart the deployment to pick up the change. Every log line on stdout from that point on is a single JSON object.

<Note>
The default value is `text`, which preserves the legacy human-readable line format byte-for-byte. Setting any value other than `json` falls back to text mode. There is no migration step: the variable is read at process start, so the format switches as soon as the process restarts.
</Note>
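The fallback rule in the note can be sketched in a few lines of Python. This is an illustration only, not CrewAI's actual implementation: `resolve_log_format` is a hypothetical helper, and the exact, case-sensitive comparison is an assumption (the note says only that values other than `json` fall back to text).

```python
import os


def resolve_log_format(environ=None):
    """Return "json" only for CREWAI_LOG_FORMAT=json; anything else means "text".

    Hypothetical helper: the case-sensitive match is an assumption.
    """
    env = os.environ if environ is None else environ
    value = env.get("CREWAI_LOG_FORMAT", "text")  # unset behaves like the default
    return "json" if value == "json" else "text"
```

Passing a plain dict instead of `os.environ` makes the rule easy to check in isolation, for example `resolve_log_format({"CREWAI_LOG_FORMAT": "yaml"})`.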
## What a log event looks like

A single info-level log inside an active automation kickoff:

```json
{
  "schema": "v1",
  "ts": "2026-06-17T16:14:23.482914Z",
  "level": "INFO",
  "logger": "crewai_enterprise.utilities.pii_redaction",
  "crewai_version": "1.14.7",
  "msg": "PII tracking state reset (engines preserved)",
  "automation_id": "12",
  "task_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "automation_name": "research_flow"
}
```

An error with a Python exception is collapsed into a single event with the traceback as a string:

```json
{
  "schema": "v1",
  "ts": "2026-06-17T16:14:31.218450Z",
  "level": "ERROR",
  "logger": "api.tasks.flow_run_task",
  "crewai_version": "1.14.7",
  "msg": "Flow execution failed",
  "automation_id": "12",
  "kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "automation_name": "research_flow",
  "exception": {
    "type": "ValueError",
    "message": "Topic cannot be empty",
    "stacktrace": "Traceback (most recent call last):\n File \"/app/flow.py\", line 42, in summarize\n ...\nValueError: Topic cannot be empty\n"
  }
}
```

The same error in legacy text mode would have produced ~25 separate log events (one per traceback line) — all of which the backend would bill and index individually.
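To make the "one event per exception" idea concrete, here is a minimal sketch of a standard-library `logging.Formatter` that folds a traceback into a single JSON line. It is not CrewAI's formatter: the class name `SingleLineJsonFormatter` and the helper `format_failure` are hypothetical, and only a subset of the fields shown above is emitted.

```python
import json
import logging
import sys
import traceback
from datetime import datetime, timezone


class SingleLineJsonFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per record, traceback included."""

    def format(self, record):
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc)
        event = {
            "schema": "v1",
            "ts": ts.strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        if record.exc_info:
            exc_type, exc_value, exc_tb = record.exc_info
            event["exception"] = {
                "type": exc_type.__name__,
                "message": str(exc_value),
                # json.dumps escapes the newlines, so the event stays on one line.
                "stacktrace": "".join(
                    traceback.format_exception(exc_type, exc_value, exc_tb)
                ),
            }
        return json.dumps(event)


def format_failure():
    """Capture a ValueError and return the formatted single-line event."""
    try:
        raise ValueError("Topic cannot be empty")
    except ValueError:
        exc_info = sys.exc_info()
    record = logging.LogRecord(
        name="api.tasks.flow_run_task",
        level=logging.ERROR,
        pathname="flow.py",
        lineno=42,
        msg="Flow execution failed",
        args=(),
        exc_info=exc_info,
    )
    return SingleLineJsonFormatter().format(record)
```

Because the traceback travels as an escaped string inside the `exception` object, a line-oriented log shipper sees one event regardless of how deep the stack is.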
## Schema v1 field reference

Within the `v1` schema, fields are only added, never renamed or removed. New fields will appear as soon as a deployment is upgraded.

| Field | Type | Present | Description |
|-------|------|---------|-------------|
| `schema` | string | Always | Constant `"v1"`. An increment indicates a breaking schema change. |
| `ts` | string (ISO-8601 UTC, microseconds) | Always | Record creation time, e.g. `2026-06-17T16:14:23.482914Z`. |
| `level` | string | Always | Python log level name: `DEBUG` / `INFO` / `WARNING` / `ERROR` / `CRITICAL`. |
| `logger` | string | Always | Dotted logger name, e.g. `api.tasks.flow_run_task`. |
| `crewai_version` | string | Always (when `crewai` package metadata is resolvable) | Installed `crewai` package version, e.g. `"1.14.7"`. |
| `msg` | string | Always | Rendered log message (after `%`-formatting / `{}`-formatting). |
| `automation_id` | string | When `CREWAI_PLUS_ID` env var is set | Numeric deployment ID (AMP provisions this on every container). |
| `task_id` | string | On Celery worker logs | Celery task UUID, or `"no-task"` for non-task contexts. |
| `kickoff_id` | string | Inside an automation kickoff | UUID of the current kickoff. |
| `execution_id` | string | Inside an automation kickoff | UUID of the current sub-execution. Equal to `kickoff_id` at the top level; differs for nested flow methods that spawn sub-executions. |
| `automation_name` | string | Inside an automation kickoff | Human-readable automation/flow name, e.g. `"research_flow"`. |
| `trace_id` | string (32-hex) | Inside a recording OpenTelemetry span | Hex trace ID. Omitted when no span is active. |
| `span_id` | string (16-hex) | Inside a recording OpenTelemetry span | Hex span ID. Omitted when no span is active. |
| `exception` | object | When the log record has `exc_info` | `{type, message, stacktrace}` — full traceback as a single escaped string. |

<Tip>
Any additional `extra={...}` kwargs passed to a logger call appear as top-level JSON fields verbatim. Reserved field names above always win to keep the schema stable.
</Tip>
## Verifying it's working

After enabling the env var and restarting, fetch the latest container logs and confirm each line is a single JSON object:

```shell
# Example: docker logs <api-container> --tail 10
docker logs $(docker ps -qf name=crewai-api) --tail 10 | jq -r '.msg'
```

If the output is JSON, each line will parse successfully and `jq` will print only the `msg` field. If you see "parse error", the env var didn't take effect — confirm it's set in the running container and that the deployment was restarted after the change.
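If `jq` isn't available, the same check can be approximated in Python. The sketch below is an assumption-laden illustration: `find_bad_lines` and `REQUIRED_FIELDS` are hypothetical names, and the required set is limited to the fields the schema table marks as unconditionally present.

```python
import json

# Fields the v1 schema table lists as present on every event.
REQUIRED_FIELDS = ("schema", "ts", "level", "logger", "msg")


def find_bad_lines(lines):
    """Return (line_number, reason) for every line that is not a v1 JSON event."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not JSON"))
            continue
        if not isinstance(event, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = [name for name in REQUIRED_FIELDS if name not in event]
        if missing:
            problems.append((number, "missing " + ", ".join(missing)))
    return problems
```

Calling `find_bad_lines(sys.stdin)` in a small script and piping `docker logs` output into it gives an empty list when every line is a well-formed event.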
## Compatibility and versioning

The `schema` field declares the contract. Within `v1`, CrewAI commits to:

- **Never removing a field** that customers may have built queries or dashboards against.
- **Never renaming a field** in place — renames happen via a schema bump (e.g. `v2`), with the old name kept as a deprecated alias for at least one release cycle.
- **Adding new fields** at any time. Consumers should ignore unknown top-level keys.

When a `v2` is introduced, the new `schema` value and a migration guide will be published in advance, and `v1` will continue to be emitted for one release cycle so dashboards and queries have time to migrate.
## What's next

<CardGroup cols={2}>
<Card title="Datadog Dashboard for crewAI" icon="dog" href="/ko/enterprise/guides/datadog_dashboard">
Import a ready-made operations dashboard built on these facets — executions, errors, token cost, version distribution. Works with both the Datadog Agent and Datadog's OTLP intake.
</Card>
<Card title="OpenTelemetry Export" icon="magnifying-glass-chart" href="/ko/enterprise/guides/capture_telemetry_logs">
Ship logs and traces to your own OTel collector or directly to a backend's OTLP intake. The same context fields land as OTLP attributes, so the dashboard works regardless of which path you use.
</Card>
</CardGroup>
@@ -49,7 +49,9 @@ Os dados de telemetria seguem as [convenções semânticas GenAI do OpenTelemetr
- `otlp.ap1.datadoghq.com` (AP1)
- **API Key** — Your Datadog API key. See [how to create one](https://docs.datadoghq.com/account_management/api-app-keys/#api-keys).

The Datadog integration exports **traces**.
The default Datadog template sends **traces** to the `/v1/traces` path. To export **logs** via OTLP, add an **OpenTelemetry Logs** collector pointing at the same Datadog OTLP host with the path set to `/v1/logs`; both signals can run side by side.

To ship logs via stdout (the Datadog Agent path) instead of OTLP, see [Structured JSON Logs](/pt-BR/enterprise/guides/structured_logs) and [Datadog Dashboard for crewAI](/pt-BR/enterprise/guides/datadog_dashboard).

<Frame></Frame>
</Tab>
140
docs/edge/pt-BR/enterprise/guides/datadog_dashboard.mdx
Normal file
@@ -0,0 +1,140 @@
---
title: "Datadog Dashboard for crewAI"
description: "Import a ready-made Datadog dashboard to monitor self-hosted CrewAI AMP deployments: executions, errors, token cost, and version distribution. Works with the Datadog Agent and with Datadog's OTLP intake."
icon: "dog"
mode: "wide"
---

<Note>
**Translation in progress**: content is shown in English.
</Note>

CrewAI ships a ready-made Datadog dashboard for self-hosted AMP deployments. Once your logs are flowing into Datadog, you can import the dashboard JSON and have an operations view live in your account in under five minutes.

The dashboard works with either of Datadog's two log-ingestion paths — pick whichever fits your infrastructure:

<Tabs>
<Tab title="Datadog Agent (stdout)">
The Datadog Agent runs alongside your CrewAI containers (typically as a DaemonSet on Kubernetes) and tails their stdout. This path requires enabling [Structured JSON Logs](/pt-BR/enterprise/guides/structured_logs) so each log event is a single billable line instead of a multi-line traceback.

**Setup:**
1. Set `CREWAI_LOG_FORMAT=json` on every CrewAI container — see [Structured JSON Logs](/pt-BR/enterprise/guides/structured_logs) for full details.
2. Install the Datadog Agent in your cluster following [Datadog's Kubernetes setup guide](https://docs.datadoghq.com/containers/kubernetes/installation/). Enable log collection (`logs_enabled: true`) and container log collection (`logs_config.container_collect_all: true`).
3. Confirm logs are landing in Datadog by searching `service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).

**When to pick this path:** you already run the Datadog Agent for infrastructure metrics, or you want logs without configuring an OTel collector in AMP.
</Tab>
<Tab title="Datadog OTLP intake (no agent)">
Datadog accepts OTLP traffic directly at its intake endpoint, no agent required. Configure CrewAI AMP's built-in OTel collector to point at Datadog's OTLP host.

**Setup:**
1. In CrewAI AMP: **Settings → OpenTelemetry Collectors → Add Collector → Datadog**. See [OpenTelemetry Export](/pt-BR/enterprise/guides/capture_telemetry_logs) for the full collector setup.
2. The default Datadog template ships **traces** to `/v1/traces`. For log export, switch the endpoint path to `/v1/logs` on the OpenTelemetry Logs collector (use the same Datadog OTLP host).
3. Confirm logs are landing by searching `source:otlp service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).

**When to pick this path:** you can't or don't want to run the Datadog Agent, or you're already using OTLP for traces and want a single export pipeline.
</Tab>
</Tabs>

Either path lands the same structured facets in Datadog (`@automation_id`, `@kickoff_id`, `@execution_id`, `@automation_name`, `@crewai_version`, `@exception.type`, `@gen_ai.*`), so the dashboard works identically with either choice.
## Prerequisite: promote facets

Datadog auto-discovers fields the first time it sees them but doesn't make them queryable in widgets until they're promoted to **facets**. This is a one-time setup in your Datadog account.

<Steps>
<Step title="Search for a CrewAI log">
Open [Logs Explorer](https://app.datadoghq.com/logs) and search `service:crewai*`. You should see at least one log event.
</Step>
<Step title="Promote each field">
Click any log entry to open the right-hand details panel. For each field below, hover the field name → click the gear icon → **Create facet**.

- `automation_id`, `automation_name`, `execution_id`, `kickoff_id`, `task_id`
- `crewai_version`, `model_id`
- `exception.type`, `exception.message`

Skip any field that already shows a star icon next to its name — that means it's already a facet. The `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, and `gen_ai.request.model` facets are typically promoted automatically by Datadog's LLM Observability auto-discovery, but verify they exist before importing the dashboard.
</Step>
</Steps>
## Import the dashboard

<Steps>
<Step title="Download the dashboard JSON">
Save [`datadog_dashboard.json`](https://raw.githubusercontent.com/crewAIInc/crewAI/main/docs/edge/en/enterprise/guides/datadog_dashboard.json) to your machine.
</Step>
<Step title="Open the import dialog in Datadog">
Navigate to **Dashboards → New Dashboard**. Click the **gear icon** in the top right of the empty dashboard and select **Import Dashboard JSON**.
</Step>
<Step title="Paste or upload the JSON">
Paste the contents of `datadog_dashboard.json` into the import dialog (or drag the file in). Click **Import**.

Datadog creates the dashboard immediately and lands you on it. The first load may show empty widgets for a few seconds while queries execute against the time range.
</Step>
</Steps>

<Tip>
Datadog's [Dashboard API](https://docs.datadoghq.com/api/latest/dashboards/#create-a-new-dashboard) accepts the same JSON via `POST /api/v1/dashboard`. Use it if you manage dashboards through Terraform, Pulumi, or CI.
</Tip>
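For scripted imports, the API call in the tip can be sketched with the Python standard library. This is a minimal example under stated assumptions: `build_dashboard_request` is a hypothetical helper, the default `site` value assumes the US1 region, and the function only constructs the request so you can inspect it before sending.

```python
import json
import urllib.request


def build_dashboard_request(dashboard, api_key, app_key, site="datadoghq.com"):
    """Build (but do not send) a POST /api/v1/dashboard request.

    `dashboard` is the parsed content of datadog_dashboard.json.
    `site` is an assumption: adjust it to your Datadog region.
    """
    return urllib.request.Request(
        url=f"https://api.{site}/api/v1/dashboard",
        data=json.dumps(dashboard).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
    )
```

To create the dashboard, load the JSON file with `json.load`, build the request, and pass it to `urllib.request.urlopen`. Keep both keys in environment variables or a secrets manager rather than in the script.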
## What you get

The dashboard is organized into four sections plus a placeholder for a custom drill-down widget:

| Section | Widgets | Useful for |
|---------|---------|------------|
| **Header** | Total Executions · Error Rate (%) · Active Automations · CrewAI Versions in Use | At-a-glance health for the last hour. Error Rate is conditionally formatted (green ≤ 5%, yellow ≤ 10%, red > 10%). |
| **Throughput** | Executions per Hour by Automation (top 10, stacked bars) | Spotting traffic shifts, surfacing busy automations, validating that a rollout didn't change baseline volume. |
| **Errors** | Errors by Exception Type (top 5, stacked bars) · Top Exception Types by Count (toplist) | Triaging failures — which exception types are spiking, which automations they're hitting. |
| **Cost** | Total Tokens per Hour by Model (input + output, stacked area) | Tracking LLM token spend by model. Useful for catching cost regressions when an automation switches model or starts looping. |
| **Drill-Down** | _(empty placeholder)_ | See [Customization](#customization) for adding a recent-errors log stream here. |

Three template variables at the top of the dashboard re-scope every widget at once:

- **`$automation`** — filter to a single automation by name.
- **`$version`** — filter to a single `crewai` SDK version (useful for comparing pre- and post-upgrade behavior).
- **`$service`** — filter to a specific Datadog `service` tag (useful when multiple CrewAI deployments share one Datadog account).
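The Error Rate thresholds in the Header row can be read as a simple mapping. The sketch below is illustrative only: `error_rate_color` is a hypothetical function, and computing the rate as failed executions divided by total executions is an assumption about how the widget derives its value.

```python
def error_rate_color(total_executions, failed_executions):
    """Map an error rate to the Header widget's colors.

    Returns None when there were no executions, the case where the
    widget itself has nothing to divide by.
    """
    if total_executions == 0:
        return None
    rate = 100.0 * failed_executions / total_executions
    if rate <= 5:
        return "green"
    if rate <= 10:
        return "yellow"
    return "red"
```

Note that both boundaries are inclusive on the lower color: exactly 5% is still green and exactly 10% is still yellow.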
## Customization

The dashboard ships with deliberate gaps so you can extend it without uninstalling and re-importing.

### Add a Recent Errors log stream

The **Drill-Down** section is intentionally empty. Add a Log Stream widget to it for an inline view of recent failures:

1. Edit the dashboard and click **+ Add Widgets** inside the Drill-Down group.
2. Drag in a **Log Stream** widget.
3. Set the filter query to `status:error $automation $version $service`.
4. Choose columns: `@timestamp`, `@automation_name`, `@exception.type`, `@exception.message`, `@execution_id`.
5. Sort by most recent, limit to 25 entries.

Clicking any row jumps to Logs Explorer with the same filter pre-applied.

### Add p95 latency

Logs don't include execution duration by default. Two ways to add a latency widget:

- **From APM traces** — if you also export OTLP traces to Datadog, add a Timeseries widget with data source **Traces**, query `service:crewai*`, aggregation `p95 of @duration`. Datadog APM auto-tracks span duration.
- **From metric extraction** — extract a `flow.duration_ms` metric from logs via [Datadog's log-to-metric pipeline](https://docs.datadoghq.com/logs/log_configuration/logs_to_metrics/), then chart it like any other metric. Useful if you don't run APM.

### Re-scope to multiple deployments

The `$service` template variable defaults to `*` and will catch every CrewAI deployment in your Datadog account. Change the default to a specific service name in **Configure → Template Variables** if you want the dashboard to focus on one deployment by default.
## Troubleshooting

| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| All widgets show "No data" | Facets aren't promoted | Re-do the [Promote facets](#prerequisite-promote-facets) step. Datadog won't query against an un-promoted field. |
| Error Rate widget shows `NaN` | No executions in the time window | Either no traffic, or `@execution_id` isn't faceted. Expand the time range and re-check facets. |
| Throughput chart is flat at the same value | Logs aren't reaching Datadog | Search `service:crewai*` in Logs Explorer. If nothing shows, verify the Datadog Agent is running (Agent path) or the OTel collector endpoint is correct (OTLP path). |
| `crewai_version` shows fewer values than expected | Some containers predate the structured-logs work | The `crewai_version` field was added alongside JSON output. Older deployments running text mode (or older AMP builds) won't emit it. Upgrade those deployments to pick up the field. |
| Template variables don't filter widgets | The widget's filter line doesn't reference the template variable | Edit the widget and confirm the search includes `$automation $version $service`. |
## References

- [Structured JSON Logs](/pt-BR/enterprise/guides/structured_logs) — the underlying log format the dashboard queries against.
- [OpenTelemetry Export](/pt-BR/enterprise/guides/capture_telemetry_logs) — set up the OTLP path if you're not using the Datadog Agent.
- [Datadog Log Search Syntax](https://docs.datadoghq.com/logs/explorer/search_syntax/) — reference for customizing widget queries.
- [Datadog Dashboard JSON Schema](https://docs.datadoghq.com/dashboards/graphing_json/) — full reference for the dashboard file format if you want to script changes.
146
docs/edge/pt-BR/enterprise/guides/structured_logs.mdx
Normal file
@@ -0,0 +1,146 @@
---
title: "Structured JSON Logs"
description: "Emit single-line JSON log events from CrewAI AMP deployments for cheaper, structured ingestion in Datadog, Splunk, Loki, and other log backends."
icon: "brackets-curly"
mode: "wide"
---

<Note>
**Translation in progress**: content is shown in English.
</Note>

CrewAI AMP can emit one JSON object per log event on stdout instead of the default multi-line text format. Each event ships with typed context fields (automation, kickoff, execution, trace IDs, exception details), making logs cheaper to index, easier to search, and easy to correlate with traces.

This page describes the JSON schema, how to enable it, and how to verify it's working. For a ready-made Datadog dashboard built on top of these fields, see [Datadog Dashboard for crewAI](/pt-BR/enterprise/guides/datadog_dashboard).
## Why use JSON output

<CardGroup cols={2}>
<Card title="Lower ingestion cost" icon="dollar-sign">
Most managed log backends bill per event. A Python traceback in text format is counted as one event per line, which can mean 30+ events for a single error. JSON output collapses each traceback into a single event with the stack trace as an escaped string field.
</Card>
<Card title="Structured search" icon="magnifying-glass">
Search by `@automation_id`, `@exception.type`, or `@kickoff_id` instead of grepping free text. Build dashboards on typed facets without parser configuration.
</Card>
<Card title="APM ↔ logs correlation" icon="link">
Every event carries `trace_id` and `span_id` when fired inside a recording span, so backends auto-link logs to traces.
</Card>
<Card title="Backend agnostic" icon="server">
The format is plain JSON — Datadog, Splunk, Loki, Elasticsearch, and CloudWatch all parse it natively without custom log pipelines.
</Card>
</CardGroup>
## Enabling JSON output

Set the `CREWAI_LOG_FORMAT` environment variable to `json` on every container that runs your deployment (API + workers).

```shell
CREWAI_LOG_FORMAT=json
```

Restart the deployment to pick up the change. Every log line on stdout from that point on is a single JSON object.

<Note>
The default value is `text`, which preserves the legacy human-readable line format byte-for-byte. Setting any value other than `json` falls back to text mode. There is no migration step: the variable is read at process start, so the format switches as soon as the process restarts.
</Note>
## What a log event looks like

A single info-level log inside an active automation kickoff:

```json
{
  "schema": "v1",
  "ts": "2026-06-17T16:14:23.482914Z",
  "level": "INFO",
  "logger": "crewai_enterprise.utilities.pii_redaction",
  "crewai_version": "1.14.7",
  "msg": "PII tracking state reset (engines preserved)",
  "automation_id": "12",
  "task_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "automation_name": "research_flow"
}
```

An error with a Python exception is collapsed into a single event with the traceback as a string:

```json
{
  "schema": "v1",
  "ts": "2026-06-17T16:14:31.218450Z",
  "level": "ERROR",
  "logger": "api.tasks.flow_run_task",
  "crewai_version": "1.14.7",
  "msg": "Flow execution failed",
  "automation_id": "12",
  "kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "automation_name": "research_flow",
  "exception": {
    "type": "ValueError",
    "message": "Topic cannot be empty",
    "stacktrace": "Traceback (most recent call last):\n File \"/app/flow.py\", line 42, in summarize\n ...\nValueError: Topic cannot be empty\n"
  }
}
```

The same error in legacy text mode would have produced ~25 separate log events (one per traceback line) — all of which the backend would bill and index individually.
## Schema v1 field reference

Within the `v1` schema, fields are only added, never renamed or removed. New fields will appear as soon as a deployment is upgraded.

| Field | Type | Present | Description |
|-------|------|---------|-------------|
| `schema` | string | Always | Constant `"v1"`. An increment indicates a breaking schema change. |
| `ts` | string (ISO-8601 UTC, microseconds) | Always | Record creation time, e.g. `2026-06-17T16:14:23.482914Z`. |
| `level` | string | Always | Python log level name: `DEBUG` / `INFO` / `WARNING` / `ERROR` / `CRITICAL`. |
| `logger` | string | Always | Dotted logger name, e.g. `api.tasks.flow_run_task`. |
| `crewai_version` | string | Always (when `crewai` package metadata is resolvable) | Installed `crewai` package version, e.g. `"1.14.7"`. |
| `msg` | string | Always | Rendered log message (after `%`-formatting / `{}`-formatting). |
| `automation_id` | string | When `CREWAI_PLUS_ID` env var is set | Numeric deployment ID (AMP provisions this on every container). |
| `task_id` | string | On Celery worker logs | Celery task UUID, or `"no-task"` for non-task contexts. |
| `kickoff_id` | string | Inside an automation kickoff | UUID of the current kickoff. |
| `execution_id` | string | Inside an automation kickoff | UUID of the current sub-execution. Equal to `kickoff_id` at the top level; differs for nested flow methods that spawn sub-executions. |
| `automation_name` | string | Inside an automation kickoff | Human-readable automation/flow name, e.g. `"research_flow"`. |
| `trace_id` | string (32-hex) | Inside a recording OpenTelemetry span | Hex trace ID. Omitted when no span is active. |
| `span_id` | string (16-hex) | Inside a recording OpenTelemetry span | Hex span ID. Omitted when no span is active. |
| `exception` | object | When the log record has `exc_info` | `{type, message, stacktrace}` — full traceback as a single escaped string. |

<Tip>
Any additional `extra={...}` kwargs passed to a logger call appear as top-level JSON fields verbatim. Reserved field names above always win to keep the schema stable.
</Tip>
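One way to picture the "reserved names always win" rule from the tip is a merge that never lets caller-supplied extras overwrite schema fields. The code below is a sketch, not CrewAI's implementation: `merge_extra` and `RESERVED_FIELDS` are hypothetical names, and dropping a colliding extra (rather than renaming it) is an assumption.

```python
# Field names taken from the schema v1 table above.
RESERVED_FIELDS = frozenset({
    "schema", "ts", "level", "logger", "crewai_version", "msg",
    "automation_id", "task_id", "kickoff_id", "execution_id",
    "automation_name", "trace_id", "span_id", "exception",
})


def merge_extra(event, extra):
    """Add extras as top-level keys without overriding reserved fields."""
    merged = dict(event)  # leave the caller's event untouched
    for key, value in extra.items():
        if key not in RESERVED_FIELDS:
            merged[key] = value
    return merged
```

With this rule, a call such as `logger.info("done", extra={"model_id": "example-model"})` would surface `model_id` as a top-level field, while an extra named `msg` or `level` would be ignored.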
## Verifying it's working

After enabling the env var and restarting, fetch the latest container logs and confirm each line is a single JSON object:

```shell
# Example: docker logs <api-container> --tail 10
docker logs $(docker ps -qf name=crewai-api) --tail 10 | jq -r '.msg'
```

If the output is JSON, each line will parse successfully and `jq` will print only the `msg` field. If you see "parse error", the env var didn't take effect — confirm it's set in the running container and that the deployment was restarted after the change.
## Compatibility and versioning

The `schema` field declares the contract. Within `v1`, CrewAI commits to:

- **Never removing a field** that customers may have built queries or dashboards against.
- **Never renaming a field** in place — renames happen via a schema bump (e.g. `v2`), with the old name kept as a deprecated alias for at least one release cycle.
- **Adding new fields** at any time. Consumers should ignore unknown top-level keys.

When a `v2` is introduced, the new `schema` value and a migration guide will be published in advance, and `v1` will continue to be emitted for one release cycle so dashboards and queries have time to migrate.
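A consumer that follows this contract might look like the sketch below: it checks the `schema` value first and then reads only the keys it knows, so newly added fields pass through harmlessly. `parse_event` and `KNOWN_FIELDS` are hypothetical names chosen for illustration, and rejecting non-`v1` events outright is one possible policy, not a requirement stated above.

```python
import json

# The subset of v1 fields this hypothetical consumer cares about.
KNOWN_FIELDS = ("level", "logger", "msg", "kickoff_id")


def parse_event(line):
    """Parse one v1 event, ignoring any top-level keys not in KNOWN_FIELDS."""
    raw = json.loads(line)
    if raw.get("schema") != "v1":
        raise ValueError(f"unsupported schema: {raw.get('schema')!r}")
    # Optional fields such as kickoff_id come back as None when absent.
    return {key: raw.get(key) for key in KNOWN_FIELDS}
```

Selecting keys explicitly, instead of unpacking the whole object into a fixed structure, is what keeps the consumer working when a deployment upgrade starts emitting extra fields.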
## What's next

<CardGroup cols={2}>
<Card title="Datadog Dashboard for crewAI" icon="dog" href="/pt-BR/enterprise/guides/datadog_dashboard">
Import a ready-made operations dashboard built on these facets — executions, errors, token cost, version distribution. Works with both the Datadog Agent and Datadog's OTLP intake.
</Card>
<Card title="OpenTelemetry Export" icon="magnifying-glass-chart" href="/pt-BR/enterprise/guides/capture_telemetry_logs">
Ship logs and traces to your own OTel collector or directly to a backend's OTLP intake. The same context fields land as OTLP attributes, so the dashboard works regardless of which path you use.
</Card>
</CardGroup>