mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-07-01 13:18:10 +00:00
Revert "docs(enterprise): register structured logs + Datadog dashboard in pt-BR/ko/ar"
This reverts commit 2b4ae346da.
@@ -8649,8 +8649,6 @@
"edge/pt-BR/enterprise/guides/update-crew",
"edge/pt-BR/enterprise/guides/enable-crew-studio",
"edge/pt-BR/enterprise/guides/capture_telemetry_logs",
"edge/pt-BR/enterprise/guides/structured_logs",
"edge/pt-BR/enterprise/guides/datadog_dashboard",
"edge/pt-BR/enterprise/guides/azure-openai-setup",
"edge/pt-BR/enterprise/guides/tool-repository",
"edge/pt-BR/enterprise/guides/custom-mcp-server",
@@ -16514,8 +16512,6 @@
"edge/ko/enterprise/guides/update-crew",
"edge/ko/enterprise/guides/enable-crew-studio",
"edge/ko/enterprise/guides/capture_telemetry_logs",
"edge/ko/enterprise/guides/structured_logs",
"edge/ko/enterprise/guides/datadog_dashboard",
"edge/ko/enterprise/guides/azure-openai-setup",
"edge/ko/enterprise/guides/tool-repository",
"edge/ko/enterprise/guides/custom-mcp-server",
@@ -24571,8 +24567,6 @@
"edge/ar/enterprise/guides/update-crew",
"edge/ar/enterprise/guides/enable-crew-studio",
"edge/ar/enterprise/guides/capture_telemetry_logs",
"edge/ar/enterprise/guides/structured_logs",
"edge/ar/enterprise/guides/datadog_dashboard",
"edge/ar/enterprise/guides/azure-openai-setup",
"edge/ar/enterprise/guides/tool-repository",
"edge/ar/enterprise/guides/custom-mcp-server",
@@ -49,9 +49,7 @@ mode: "wide"
- `otlp.ap1.datadoghq.com` (AP1)
- **API Key**: Your Datadog API key. See [how to create one](https://docs.datadoghq.com/account_management/api-app-keys/#api-keys).

The default Datadog template sends **traces** to the `/v1/traces` path. To export **logs** over OTLP instead, add an **OpenTelemetry Logs** collector that points at the same Datadog OTLP host with the path set to `/v1/logs`; the two signals can run side by side.

To ship logs via stdout (the Datadog Agent path) instead of OTLP, see [Structured JSON Logs](/ar/enterprise/guides/structured_logs) and [Datadog Dashboard for crewAI](/ar/enterprise/guides/datadog_dashboard).
The Datadog integration exports **traces**.

<Frame></Frame>
</Tab>
@@ -1,140 +0,0 @@
---
title: "Datadog Dashboard for crewAI"
description: "Import a ready-made Datadog dashboard to monitor self-hosted CrewAI AMP deployments: executions, errors, token cost, and version distribution. Works with the Datadog Agent and with Datadog's OTLP intake."
icon: "dog"
mode: "wide"
---

<Note>
**Translation in progress**: content is shown in English.
</Note>
|
||||
|
||||
CrewAI ships a ready-made Datadog dashboard for self-hosted AMP deployments. Once your logs are flowing into Datadog, you can import the dashboard JSON and have an operations view live in your account in under five minutes.
|
||||
|
||||
The dashboard works with either of Datadog's two log-ingestion paths — pick whichever fits your infrastructure:
|
||||
|
||||
<Tabs>
|
||||
<Tab title="Datadog Agent (stdout)">
|
||||
The Datadog Agent runs alongside your CrewAI containers (typically as a DaemonSet on Kubernetes) and tails their stdout. This path requires enabling [Structured JSON Logs](/ar/enterprise/guides/structured_logs) so each log event is a single billable line instead of a multi-line traceback.
|
||||
|
||||
**Setup:**
|
||||
1. Set `CREWAI_LOG_FORMAT=json` on every CrewAI container — see [Structured JSON Logs](/ar/enterprise/guides/structured_logs) for full details.
|
||||
2. Install the Datadog Agent in your cluster following [Datadog's Kubernetes setup guide](https://docs.datadoghq.com/containers/kubernetes/installation/). Enable log collection (`logs_enabled: true`) and container log collection (`logs_config.container_collect_all: true`).
|
||||
3. Confirm logs are landing in Datadog by searching `service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).
|
||||
|
||||
**When to pick this path:** you already run the Datadog Agent for infrastructure metrics, or you want logs without configuring an OTel collector in AMP.
|
||||
</Tab>
|
||||
<Tab title="Datadog OTLP intake (no agent)">
|
||||
Datadog accepts OTLP traffic directly at its intake endpoint, no agent required. Configure CrewAI AMP's built-in OTel collector to point at Datadog's OTLP host.
|
||||
|
||||
**Setup:**
|
||||
1. In CrewAI AMP: **Settings → OpenTelemetry Collectors → Add Collector → Datadog**. See [OpenTelemetry Export](/ar/enterprise/guides/capture_telemetry_logs) for the full collector setup.
|
||||
2. The default Datadog template ships **traces** to `/v1/traces`. For log export, switch the endpoint path to `/v1/logs` on the OpenTelemetry Logs collector (use the same Datadog OTLP host).
|
||||
3. Confirm logs are landing by searching `source:otlp service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).
|
||||
|
||||
**When to pick this path:** you can't or don't want to run the Datadog Agent, or you're already using OTLP for traces and want a single export pipeline.
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
Either path lands the same structured facets in Datadog (`@automation_id`, `@kickoff_id`, `@execution_id`, `@automation_name`, `@crewai_version`, `@exception.type`, `@gen_ai.*`), so the dashboard works identically with either choice.
|
||||
|
||||
## Prerequisite: promote facets
|
||||
|
||||
Datadog auto-discovers fields the first time it sees them but doesn't make them queryable in widgets until they're promoted to **facets**. This is a one-time setup in your Datadog account.
|
||||
|
||||
<Steps>
|
||||
<Step title="Search for a CrewAI log">
|
||||
Open [Logs Explorer](https://app.datadoghq.com/logs) and search `service:crewai*`. You should see at least one log event.
|
||||
</Step>
|
||||
<Step title="Promote each field">
|
||||
Click any log entry to open the right-hand details panel. For each field below, hover the field name → click the gear icon → **Create facet**.
|
||||
|
||||
- `automation_id`, `automation_name`, `execution_id`, `kickoff_id`, `task_id`
|
||||
- `crewai_version`, `model_id`
|
||||
- `exception.type`, `exception.message`
|
||||
|
||||
Skip any field that already shows a star icon next to its name — that means it's already a facet. The `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, and `gen_ai.request.model` facets are typically promoted automatically by Datadog's LLM Observability auto-discovery, but verify they exist before importing the dashboard.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
## Import the dashboard
|
||||
|
||||
<Steps>
|
||||
<Step title="Download the dashboard JSON">
|
||||
Save [`datadog_dashboard.json`](https://raw.githubusercontent.com/crewAIInc/crewAI/main/docs/edge/en/enterprise/guides/datadog_dashboard.json) to your machine.
|
||||
</Step>
|
||||
<Step title="Open the import dialog in Datadog">
|
||||
Navigate to **Dashboards → New Dashboard**. Click the **gear icon** in the top right of the empty dashboard and select **Import Dashboard JSON**.
|
||||
</Step>
|
||||
<Step title="Paste or upload the JSON">
|
||||
Paste the contents of `datadog_dashboard.json` into the import dialog (or drag the file in). Click **Import**.
|
||||
|
||||
Datadog creates the dashboard immediately and lands you on it. The first load may show empty widgets for a few seconds while queries execute against the time range.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
<Tip>
|
||||
Datadog's [Dashboard API](https://docs.datadoghq.com/api/latest/dashboards/#create-a-new-dashboard) accepts the same JSON via `POST /api/v1/dashboard`. Use it if you manage dashboards through Terraform, Pulumi, or CI.
|
||||
</Tip>
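As a rough illustration of the API route mentioned in the tip, the sketch below builds the `POST /api/v1/dashboard` request with Python's standard library. The function name `build_dashboard_request` and the `site` parameter are hypothetical, and the default `datadoghq.com` site is an assumption; use the API host for your own Datadog region. The `DD-API-KEY` and `DD-APPLICATION-KEY` headers are Datadog's standard API authentication headers.

```python
import json
import urllib.request


def build_dashboard_request(dashboard, api_key, app_key, site="datadoghq.com"):
    """Build (but do not send) a POST /api/v1/dashboard request.

    Hypothetical helper: `site` defaults to Datadog's US1 site as an assumption.
    """
    return urllib.request.Request(
        url=f"https://api.{site}/api/v1/dashboard",
        data=json.dumps(dashboard).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
    )
```

To send it, load `datadog_dashboard.json` with `json.load`, pass the result as `dashboard`, and hand the returned request to `urllib.request.urlopen`. Building the request separately keeps the keys and payload easy to inspect in CI before anything is sent.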
|
||||
|
||||
## What you get
|
||||
|
||||
The dashboard is organized into four sections plus a placeholder for a custom drill-down widget:
|
||||
|
||||
| Section | Widgets | Useful for |
|---------|---------|------------|
| **Header** | Total Executions · Error Rate (%) · Active Automations · CrewAI Versions in Use | At-a-glance health for the last hour. Error Rate is conditionally formatted (green ≤ 5%, yellow ≤ 10%, red > 10%). |
| **Throughput** | Executions per Hour by Automation (top 10, stacked bars) | Spotting traffic shifts, surfacing busy automations, validating that a rollout didn't change baseline volume. |
| **Errors** | Errors by Exception Type (top 5, stacked bars) · Top Exception Types by Count (toplist) | Triaging failures: which exception types are spiking, and which automations they're hitting. |
| **Cost** | Total Tokens per Hour by Model (input + output, stacked area) | Tracking LLM token spend by model. Useful for catching cost regressions when an automation switches model or starts looping. |
| **Drill-Down** | _(empty placeholder)_ | See [Customization](#customization) for adding a recent-errors log stream here. |
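The Error Rate thresholds in the table can be read as a simple banding rule. The sketch below is illustrative only: the real widget formula lives in the dashboard JSON and is not shown here, so defining the rate as errors divided by executions is an assumption, as is the hypothetical function name `error_rate_band`.

```python
import math


def error_rate_band(errors, executions):
    """Return (rate_percent, color) using the thresholds from the table.

    Assumed definition: rate = 100 * errors / executions.
    """
    if executions == 0:
        # No executions in the window: the ratio is undefined.
        return math.nan, "no data"
    rate = 100.0 * errors / executions
    if rate <= 5:
        return rate, "green"
    if rate <= 10:
        return rate, "yellow"
    return rate, "red"
```

Under this reading, a window with zero executions has no defined rate at all, which is consistent with a widget that shows `NaN` when there is no traffic.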
|
||||
|
||||
Three template variables at the top of the dashboard re-scope every widget at once:
|
||||
|
||||
- **`$automation`** — filter to a single automation by name.
|
||||
- **`$version`** — filter to a single `crewai` SDK version (useful for comparing pre- and post-upgrade behavior).
|
||||
- **`$service`** — filter to a specific Datadog `service` tag (useful when multiple CrewAI deployments share one Datadog account).
|
||||
|
||||
## Customization
|
||||
|
||||
The dashboard ships with deliberate gaps so you can extend it in place, without deleting and re-importing it.
|
||||
|
||||
### Add a Recent Errors log stream
|
||||
|
||||
The **Drill-Down** section is intentionally empty. Add a Log Stream widget to it for an inline view of recent failures:
|
||||
|
||||
1. Edit the dashboard and click **+ Add Widgets** inside the Drill-Down group.
|
||||
2. Drag in a **Log Stream** widget.
|
||||
3. Set the filter query to `status:error $automation $version $service`.
|
||||
4. Choose columns: `@timestamp`, `@automation_name`, `@exception.type`, `@exception.message`, `@execution_id`.
|
||||
5. Sort by most recent, limit to 25 entries.
|
||||
|
||||
Clicking any row jumps to Logs Explorer with the same filter pre-applied.
|
||||
|
||||
### Add p95 latency
|
||||
|
||||
Logs don't include execution duration by default. Two ways to add a latency widget:
|
||||
|
||||
- **From APM traces** — if you also export OTLP traces to Datadog, add a Timeseries widget with data source **Traces**, query `service:crewai*`, aggregation `p95 of @duration`. Datadog APM auto-tracks span duration.
|
||||
- **From metric extraction** — extract a `flow.duration_ms` metric from logs via [Datadog's log-to-metric pipeline](https://docs.datadoghq.com/logs/log_configuration/logs_to_metrics/), then chart it like any other metric. Useful if you don't run APM.
|
||||
|
||||
### Re-scope to a single deployment
|
||||
|
||||
The `$service` template variable defaults to `*` and will catch every CrewAI deployment in your Datadog account. Change the default to a specific service name in **Configure → Template Variables** if you want the dashboard to focus on one deployment by default.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| All widgets show "No data" | Facets aren't promoted | Re-do the [Promote facets](#prerequisite-promote-facets) step. Datadog won't query against an un-promoted field. |
| Error Rate widget shows `NaN` | No executions in the time window | Either no traffic, or `@execution_id` isn't faceted. Expand the time range and re-check facets. |
| Throughput chart is flat at the same value | Logs aren't reaching Datadog | Search `service:crewai*` in Logs Explorer. If nothing shows, verify the Datadog Agent is running (Agent path) or the OTel collector endpoint is correct (OTLP path). |
| `crewai_version` shows fewer values than expected | Some containers predate the structured-logs work | The `crewai_version` field was added alongside JSON output. Older deployments running text mode (or older AMP builds) won't emit it. Upgrade those deployments to pick up the field. |
| Template variables don't filter widgets | The widget's filter line doesn't reference the template variable | Edit the widget and confirm the search includes `$automation $version $service`. |
|
||||
|
||||
## References
|
||||
|
||||
- [Structured JSON Logs](/ar/enterprise/guides/structured_logs) — the underlying log format the dashboard queries against.
|
||||
- [OpenTelemetry Export](/ar/enterprise/guides/capture_telemetry_logs) — set up the OTLP path if you're not using the Datadog Agent.
|
||||
- [Datadog Log Search Syntax](https://docs.datadoghq.com/logs/explorer/search_syntax/) — reference for customizing widget queries.
|
||||
- [Datadog Dashboard JSON Schema](https://docs.datadoghq.com/dashboards/graphing_json/) — full reference for the dashboard file format if you want to script changes.
|
||||
@@ -1,146 +0,0 @@
---
title: "Structured JSON Logs"
description: "Emit single-line JSON log events from CrewAI AMP deployments for structured, lower-cost ingestion in Datadog, Splunk, Loki, and other log backends."
icon: "brackets-curly"
mode: "wide"
---

<Note>
**Translation in progress**: content is shown in English.
</Note>
|
||||
|
||||
CrewAI AMP can emit one JSON object per log event on stdout instead of the default multi-line text format. Each event ships with typed context fields (automation, kickoff, execution, trace IDs, exception details), making logs cheaper to index, easier to search, and trivially correlatable with traces.
|
||||
|
||||
This page describes the JSON schema, how to enable it, and how to verify it's working. For a ready-made Datadog dashboard built on top of these fields, see [Datadog Dashboard for crewAI](/ar/enterprise/guides/datadog_dashboard).
|
||||
|
||||
## Why use JSON output
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Lower ingestion cost" icon="dollar-sign">
|
||||
Most managed log backends bill per event. A Python traceback in text format is counted as one event per line — 30+ events for a single error. JSON output collapses each traceback into a single event with the stack trace as an escaped string field.
|
||||
</Card>
|
||||
<Card title="Structured search" icon="magnifying-glass">
|
||||
Search by `@automation_id`, `@exception.type`, `@kickoff_id` instead of grepping free-text. Build dashboards on typed facets without parser configuration.
|
||||
</Card>
|
||||
<Card title="APM ↔ logs correlation" icon="link">
|
||||
Every event carries `trace_id` and `span_id` when fired inside a recording span, so backends auto-link logs to traces.
|
||||
</Card>
|
||||
<Card title="Backend agnostic" icon="server">
|
||||
The format is plain JSON — Datadog, Splunk, Loki, Elasticsearch, and CloudWatch all parse it natively without custom log pipelines.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
## Enabling JSON output
|
||||
|
||||
Set the `CREWAI_LOG_FORMAT` environment variable to `json` on every container that runs your deployment (API + workers).
|
||||
|
||||
```shell
CREWAI_LOG_FORMAT=json
```
|
||||
|
||||
Restart the deployment to pick up the change. Every log line on stdout from that point on is a single JSON object.
|
||||
|
||||
<Note>
|
||||
The default value is `text`, which preserves the legacy human-readable line format byte-for-byte. Any value other than `json` falls back to text mode. There is no migration step: the variable is read at process start, so the format switches as soon as the process restarts.
|
||||
</Note>
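The fallback rule in the note can be sketched in a few lines. This is an illustration, not CrewAI's actual implementation: the function name `resolve_log_format` is hypothetical, and treating the comparison as exact and case-sensitive is an assumption the source does not confirm.

```python
import os


def resolve_log_format(environ=None):
    """Return "json" only when CREWAI_LOG_FORMAT is exactly "json", else "text".

    Hypothetical helper; the exact-match comparison is an assumption.
    """
    env = os.environ if environ is None else environ
    value = env.get("CREWAI_LOG_FORMAT", "text")
    return "json" if value == "json" else "text"
```

Because the value is resolved once at startup in this sketch, changing the variable on a running process has no effect until the process restarts.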
|
||||
|
||||
## What a log event looks like
|
||||
|
||||
A single info-level log inside an active automation kickoff:
|
||||
|
||||
```json
{
  "schema": "v1",
  "ts": "2026-06-17T16:14:23.482914Z",
  "level": "INFO",
  "logger": "crewai_enterprise.utilities.pii_redaction",
  "crewai_version": "1.14.7",
  "msg": "PII tracking state reset (engines preserved)",
  "automation_id": "12",
  "task_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "automation_name": "research_flow"
}
```
|
||||
|
||||
An error with a Python exception is collapsed into a single event with the traceback as a string:
|
||||
|
||||
```json
{
  "schema": "v1",
  "ts": "2026-06-17T16:14:31.218450Z",
  "level": "ERROR",
  "logger": "api.tasks.flow_run_task",
  "crewai_version": "1.14.7",
  "msg": "Flow execution failed",
  "automation_id": "12",
  "kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "automation_name": "research_flow",
  "exception": {
    "type": "ValueError",
    "message": "Topic cannot be empty",
    "stacktrace": "Traceback (most recent call last):\n  File \"/app/flow.py\", line 42, in summarize\n    ...\nValueError: Topic cannot be empty\n"
  }
}
```
|
||||
|
||||
The same error in legacy text mode would have produced ~25 separate log events (one per traceback line) — all of which the backend would bill and index individually.
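The way a traceback collapses into one event can be sketched with Python's standard `logging` module. This is a minimal illustration of the technique under stated assumptions, not the formatter CrewAI AMP ships: the class `SingleLineJsonFormatter` and the helper `format_failure` are hypothetical, and only a subset of the `v1` fields is emitted.

```python
import json
import logging
import sys
import traceback
from datetime import datetime, timezone


class SingleLineJsonFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per record, traceback as one string."""

    def format(self, record):
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        event = {
            "schema": "v1",
            "ts": created.isoformat(timespec="microseconds").replace("+00:00", "Z"),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        if record.exc_info:
            exc_type, exc_value, exc_tb = record.exc_info
            event["exception"] = {
                "type": exc_type.__name__,
                "message": str(exc_value),
                # json.dumps escapes the newlines, so the event stays on one line.
                "stacktrace": "".join(
                    traceback.format_exception(exc_type, exc_value, exc_tb)
                ),
            }
        return json.dumps(event)


def format_failure(message="Flow execution failed"):
    """Format a record that carries a ValueError and return the JSON line."""
    try:
        raise ValueError("Topic cannot be empty")
    except ValueError:
        record = logging.LogRecord(
            name="api.tasks.flow_run_task",
            level=logging.ERROR,
            pathname="flow.py",
            lineno=0,
            msg=message,
            args=(),
            exc_info=sys.exc_info(),
        )
    return SingleLineJsonFormatter().format(record)
```

In a real service the formatter would be attached to a stdout handler with `handler.setFormatter(SingleLineJsonFormatter())`, so that `logger.exception(...)` calls produce one line each.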
|
||||
|
||||
## Schema v1 field reference
|
||||
|
||||
Within the `v1` schema, fields are only added, never renamed or removed. New fields will appear as soon as a deployment is upgraded.
|
||||
|
||||
| Field | Type | Present | Description |
|-------|------|---------|-------------|
| `schema` | string | Always | Constant `"v1"`. An increment indicates a breaking schema change. |
| `ts` | string (ISO-8601 UTC, microseconds) | Always | Record creation time, e.g. `2026-06-17T16:14:23.482914Z`. |
| `level` | string | Always | Python log level name: `DEBUG` / `INFO` / `WARNING` / `ERROR` / `CRITICAL`. |
| `logger` | string | Always | Dotted logger name, e.g. `api.tasks.flow_run_task`. |
| `crewai_version` | string | When `crewai` package metadata is resolvable | Installed `crewai` package version, e.g. `"1.14.7"`. |
| `msg` | string | Always | Rendered log message (after `%`-formatting / `{}`-formatting). |
| `automation_id` | string | When `CREWAI_PLUS_ID` env var is set | Numeric deployment ID (AMP provisions this on every container). |
| `task_id` | string | On Celery worker logs | Celery task UUID, or `"no-task"` for non-task contexts. |
| `kickoff_id` | string | Inside an automation kickoff | UUID of the current kickoff. |
| `execution_id` | string | Inside an automation kickoff | UUID of the current sub-execution. Equal to `kickoff_id` at the top level; differs for nested flow methods that spawn sub-executions. |
| `automation_name` | string | Inside an automation kickoff | Human-readable automation/flow name, e.g. `"research_flow"`. |
| `trace_id` | string (32-hex) | Inside a recording OpenTelemetry span | Hex trace ID. Omitted when no span is active. |
| `span_id` | string (16-hex) | Inside a recording OpenTelemetry span | Hex span ID. Omitted when no span is active. |
| `exception` | object | When the log record has `exc_info` | `{type, message, stacktrace}`, with the full traceback as a single escaped string. |
|
||||
|
||||
<Tip>
|
||||
Any additional `extra={...}` kwargs passed to a logger call appear as top-level JSON fields verbatim. Reserved field names above always win to keep the schema stable.
|
||||
</Tip>
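The precedence rule in the tip can be illustrated as follows. This is a sketch under assumptions, not CrewAI's code: `merge_extras` and `RESERVED` are hypothetical names, and the approach relies on the fact that Python's `logging` stores `extra={...}` values as attributes on the `LogRecord`.

```python
import logging

# Attributes every LogRecord carries; anything else was supplied via `extra=`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None)))
_STANDARD_ATTRS |= {"message", "asctime"}

# Field names from the v1 schema table (hypothetical constant name).
RESERVED = {
    "schema", "ts", "level", "logger", "crewai_version", "msg",
    "automation_id", "task_id", "kickoff_id", "execution_id",
    "automation_name", "trace_id", "span_id", "exception",
}


def merge_extras(record, event):
    """Copy `extra=` fields to the top level of `event`; reserved names always win."""
    merged = dict(event)
    for key, value in vars(record).items():
        if key in _STANDARD_ATTRS or key in RESERVED:
            continue  # skip logging internals and never override schema fields
        merged[key] = value
    return merged
```

For example, `logger.info("hi", extra={"tenant": "acme", "level": "spoofed"})` would add `tenant` to the event while leaving the real `level` untouched.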
|
||||
|
||||
## Verifying it's working
|
||||
|
||||
After enabling the env var and restarting, fetch the latest container logs and confirm each line is a single JSON object:
|
||||
|
||||
```shell
# Example: docker logs <api-container> --tail 10
docker logs $(docker ps -qf name=crewai-api) --tail 10 | jq -r '.msg'
```
|
||||
|
||||
If the output is JSON, each line will parse successfully and `jq` will print only the `msg` field. If you see "parse error", the env var didn't take effect — confirm it's set in the running container and that the deployment was restarted after the change.
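Where `jq` is not available, the same check can be approximated in Python. The sketch below is an assumption-laden illustration: `find_invalid_lines` is a hypothetical helper, and the required keys are the fields the schema reference lists as present on every event.

```python
import json

# Fields the v1 schema reference marks as present on every event.
REQUIRED_KEYS = ("schema", "ts", "level", "logger", "msg")


def find_invalid_lines(lines):
    """Return (line_number, reason) pairs for lines that are not valid v1 events."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        try:
            event = json.loads(text)
        except json.JSONDecodeError:
            problems.append((number, "not JSON"))
            continue
        if not isinstance(event, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = [key for key in REQUIRED_KEYS if key not in event]
        if missing:
            problems.append((number, "missing " + ", ".join(missing)))
    return problems
```

Passing `sys.stdin` as `lines` lets the function consume piped container logs; an empty result means every line parsed as a single JSON object with the expected core fields.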
|
||||
|
||||
## Compatibility and versioning
|
||||
|
||||
The `schema` field declares the contract. Within `v1`, CrewAI commits to:
|
||||
|
||||
- **Never removing a field** that customers may have built queries or dashboards against.
|
||||
- **Never renaming a field** in place — renames happen via a schema bump (e.g. `v2`), with the old name kept as a deprecated alias for at least one release cycle.
|
||||
- **Adding new fields** at any time. Consumers should ignore unknown top-level keys.
|
||||
|
||||
When a `v2` is introduced, the new schema and its migration guide will be published in advance, and `v1` will continue to be emitted for one release cycle so dashboards and queries have time to migrate.
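On the consumer side, the contract suggests checking `schema` and ignoring keys you do not know. The following sketch shows one way to do that; `parse_event` and `KNOWN_FIELDS` are hypothetical names, and which fields a consumer keeps is entirely its own choice.

```python
import json

# Fields this hypothetical consumer cares about; unknown keys are ignored.
KNOWN_FIELDS = (
    "ts", "level", "logger", "msg",
    "automation_id", "kickoff_id", "execution_id",
)


def parse_event(line):
    """Parse one v1 event, keeping known fields and ignoring unknown keys.

    Returns None for any other schema version so callers can route it elsewhere.
    """
    event = json.loads(line)
    if event.get("schema") != "v1":
        return None
    return {key: event[key] for key in KNOWN_FIELDS if key in event}
```

A consumer written this way keeps working when new fields are added within `v1`, and fails soft rather than misreading events once a later schema version appears.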
|
||||
|
||||
## What's next
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Datadog Dashboard for crewAI" icon="dog" href="/ar/enterprise/guides/datadog_dashboard">
|
||||
Import a ready-made operations dashboard built on these facets — executions, errors, token cost, version distribution. Works with both the Datadog Agent and Datadog's OTLP intake.
|
||||
</Card>
|
||||
<Card title="OpenTelemetry Export" icon="magnifying-glass-chart" href="/ar/enterprise/guides/capture_telemetry_logs">
|
||||
Ship logs and traces to your own OTel collector or directly to a backend's OTLP intake. The same context fields land as OTLP attributes, so the dashboard works regardless of which path you use.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
@@ -49,9 +49,7 @@ CrewAI AMP는 배포에서 OpenTelemetry **트레이스**와 **로그**를 자
- `otlp.ap1.datadoghq.com` (AP1)
- **API Key**: Your Datadog API key. See [how to create a key](https://docs.datadoghq.com/account_management/api-app-keys/#api-keys).

The default Datadog template sends **traces** to the `/v1/traces` path. To export **logs** over OTLP, add an **OpenTelemetry Logs** collector on the same Datadog OTLP host with the path set to `/v1/logs`; the two signals can run side by side.

If you prefer stdout-based log shipping (the Datadog Agent path) instead of OTLP, see [Structured JSON Logs](/ko/enterprise/guides/structured_logs) and [Datadog Dashboard for crewAI](/ko/enterprise/guides/datadog_dashboard).
The Datadog integration exports **traces**.

<Frame></Frame>
</Tab>
@@ -1,140 +0,0 @@
---
title: "Datadog Dashboard for crewAI"
description: "Import a ready-made Datadog dashboard for monitoring self-hosted CrewAI AMP deployments: executions, errors, token cost, and version distribution. Works with both the Datadog Agent and Datadog's OTLP intake."
icon: "dog"
mode: "wide"
---

<Note>
**Translation in progress**: content is shown in English.
</Note>
|
||||
|
||||
CrewAI ships a ready-made Datadog dashboard for self-hosted AMP deployments. Once your logs are flowing into Datadog, you can import the dashboard JSON and have an operations view live in your account in under five minutes.
|
||||
|
||||
The dashboard works with either of Datadog's two log-ingestion paths — pick whichever fits your infrastructure:
|
||||
|
||||
<Tabs>
|
||||
<Tab title="Datadog Agent (stdout)">
|
||||
The Datadog Agent runs alongside your CrewAI containers (typically as a DaemonSet on Kubernetes) and tails their stdout. This path requires enabling [Structured JSON Logs](/ko/enterprise/guides/structured_logs) so each log event is a single billable line instead of a multi-line traceback.
|
||||
|
||||
**Setup:**
|
||||
1. Set `CREWAI_LOG_FORMAT=json` on every CrewAI container — see [Structured JSON Logs](/ko/enterprise/guides/structured_logs) for full details.
|
||||
2. Install the Datadog Agent in your cluster following [Datadog's Kubernetes setup guide](https://docs.datadoghq.com/containers/kubernetes/installation/). Enable log collection (`logs_enabled: true`) and container log collection (`logs_config.container_collect_all: true`).
|
||||
3. Confirm logs are landing in Datadog by searching `service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).
|
||||
|
||||
**When to pick this path:** you already run the Datadog Agent for infrastructure metrics, or you want logs without configuring an OTel collector in AMP.
|
||||
</Tab>
|
||||
<Tab title="Datadog OTLP intake (no agent)">
|
||||
Datadog accepts OTLP traffic directly at its intake endpoint, no agent required. Configure CrewAI AMP's built-in OTel collector to point at Datadog's OTLP host.
|
||||
|
||||
**Setup:**
|
||||
1. In CrewAI AMP: **Settings → OpenTelemetry Collectors → Add Collector → Datadog**. See [OpenTelemetry Export](/ko/enterprise/guides/capture_telemetry_logs) for the full collector setup.
|
||||
2. The default Datadog template ships **traces** to `/v1/traces`. For log export, switch the endpoint path to `/v1/logs` on the OpenTelemetry Logs collector (use the same Datadog OTLP host).
|
||||
3. Confirm logs are landing by searching `source:otlp service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).
|
||||
|
||||
**When to pick this path:** you can't or don't want to run the Datadog Agent, or you're already using OTLP for traces and want a single export pipeline.
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
Either path lands the same structured facets in Datadog (`@automation_id`, `@kickoff_id`, `@execution_id`, `@automation_name`, `@crewai_version`, `@exception.type`, `@gen_ai.*`), so the dashboard works identically with either choice.
|
||||
|
||||
## Prerequisite: promote facets
|
||||
|
||||
Datadog auto-discovers fields the first time it sees them but doesn't make them queryable in widgets until they're promoted to **facets**. This is a one-time setup in your Datadog account.
|
||||
|
||||
<Steps>
|
||||
<Step title="Search for a CrewAI log">
|
||||
Open [Logs Explorer](https://app.datadoghq.com/logs) and search `service:crewai*`. You should see at least one log event.
|
||||
</Step>
|
||||
<Step title="Promote each field">
|
||||
Click any log entry to open the right-hand details panel. For each field below, hover the field name → click the gear icon → **Create facet**.
|
||||
|
||||
- `automation_id`, `automation_name`, `execution_id`, `kickoff_id`, `task_id`
|
||||
- `crewai_version`, `model_id`
|
||||
- `exception.type`, `exception.message`
|
||||
|
||||
Skip any field that already shows a star icon next to its name — that means it's already a facet. The `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, and `gen_ai.request.model` facets are typically promoted automatically by Datadog's LLM Observability auto-discovery, but verify they exist before importing the dashboard.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
## Import the dashboard
|
||||
|
||||
<Steps>
|
||||
<Step title="Download the dashboard JSON">
|
||||
Save [`datadog_dashboard.json`](https://raw.githubusercontent.com/crewAIInc/crewAI/main/docs/edge/en/enterprise/guides/datadog_dashboard.json) to your machine.
|
||||
</Step>
|
||||
<Step title="Open the import dialog in Datadog">
|
||||
Navigate to **Dashboards → New Dashboard**. Click the **gear icon** in the top right of the empty dashboard and select **Import Dashboard JSON**.
|
||||
</Step>
|
||||
<Step title="Paste or upload the JSON">
|
||||
Paste the contents of `datadog_dashboard.json` into the import dialog (or drag the file in). Click **Import**.
|
||||
|
||||
Datadog creates the dashboard immediately and lands you on it. The first load may show empty widgets for a few seconds while queries execute against the time range.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
<Tip>
|
||||
Datadog's [Dashboard API](https://docs.datadoghq.com/api/latest/dashboards/#create-a-new-dashboard) accepts the same JSON via `POST /api/v1/dashboard`. Use it if you manage dashboards through Terraform, Pulumi, or CI.
|
||||
</Tip>
|
||||
|
||||
## What you get
|
||||
|
||||
The dashboard is organized into four sections plus a placeholder for a custom drill-down widget:
|
||||
|
||||
| Section | Widgets | Useful for |
|---------|---------|------------|
| **Header** | Total Executions · Error Rate (%) · Active Automations · CrewAI Versions in Use | At-a-glance health for the last hour. Error Rate is conditionally formatted (green ≤ 5%, yellow ≤ 10%, red > 10%). |
| **Throughput** | Executions per Hour by Automation (top 10, stacked bars) | Spotting traffic shifts, surfacing busy automations, validating that a rollout didn't change baseline volume. |
| **Errors** | Errors by Exception Type (top 5, stacked bars) · Top Exception Types by Count (toplist) | Triaging failures: which exception types are spiking, and which automations they're hitting. |
| **Cost** | Total Tokens per Hour by Model (input + output, stacked area) | Tracking LLM token spend by model. Useful for catching cost regressions when an automation switches model or starts looping. |
| **Drill-Down** | _(empty placeholder)_ | See [Customization](#customization) for adding a recent-errors log stream here. |
|
||||
|
||||
Three template variables at the top of the dashboard re-scope every widget at once:
|
||||
|
||||
- **`$automation`** — filter to a single automation by name.
|
||||
- **`$version`** — filter to a single `crewai` SDK version (useful for comparing pre- and post-upgrade behavior).
|
||||
- **`$service`** — filter to a specific Datadog `service` tag (useful when multiple CrewAI deployments share one Datadog account).
|
||||
|
||||
## Customization
|
||||
|
||||
The dashboard ships with deliberate gaps so you can extend it in place, without deleting and re-importing it.
|
||||
|
||||
### Add a Recent Errors log stream
|
||||
|
||||
The **Drill-Down** section is intentionally empty. Add a Log Stream widget to it for an inline view of recent failures:
|
||||
|
||||
1. Edit the dashboard and click **+ Add Widgets** inside the Drill-Down group.
|
||||
2. Drag in a **Log Stream** widget.
|
||||
3. Set the filter query to `status:error $automation $version $service`.
|
||||
4. Choose columns: `@timestamp`, `@automation_name`, `@exception.type`, `@exception.message`, `@execution_id`.
|
||||
5. Sort by most recent, limit to 25 entries.
|
||||
|
||||
Clicking any row jumps to Logs Explorer with the same filter pre-applied.
|
||||
|
||||
### Add p95 latency
|
||||
|
||||
Logs don't include execution duration by default. Two ways to add a latency widget:
|
||||
|
||||
- **From APM traces** — if you also export OTLP traces to Datadog, add a Timeseries widget with data source **Traces**, query `service:crewai*`, aggregation `p95 of @duration`. Datadog APM auto-tracks span duration.
|
||||
- **From metric extraction** — extract a `flow.duration_ms` metric from logs via [Datadog's log-to-metric pipeline](https://docs.datadoghq.com/logs/log_configuration/logs_to_metrics/), then chart it like any other metric. Useful if you don't run APM.
|
||||
|
||||
### Re-scope to a single deployment
|
||||
|
||||
The `$service` template variable defaults to `*` and will catch every CrewAI deployment in your Datadog account. Change the default to a specific service name in **Configure → Template Variables** if you want the dashboard to focus on one deployment by default.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| All widgets show "No data" | Facets aren't promoted | Re-do the [Promote facets](#prerequisite-promote-facets) step. Datadog won't query against an un-promoted field. |
| Error Rate widget shows `NaN` | No executions in the time window | Either no traffic, or `@execution_id` isn't faceted. Expand the time range and re-check facets. |
| Throughput chart is flat at the same value | Logs aren't reaching Datadog | Search `service:crewai*` in Logs Explorer. If nothing shows, verify the Datadog Agent is running (Agent path) or the OTel collector endpoint is correct (OTLP path). |
| `crewai_version` shows fewer values than expected | Some containers predate the structured-logs work | The `crewai_version` field was added alongside JSON output. Older deployments running text mode (or older AMP builds) won't emit it. Upgrade those deployments to pick up the field. |
| Template variables don't filter widgets | The widget's filter line doesn't reference the template variable | Edit the widget and confirm the search includes `$automation $version $service`. |
|
||||
|
||||
## References
|
||||
|
||||
- [Structured JSON Logs](/ko/enterprise/guides/structured_logs) — the underlying log format the dashboard queries against.
|
||||
- [OpenTelemetry Export](/ko/enterprise/guides/capture_telemetry_logs) — set up the OTLP path if you're not using the Datadog Agent.
|
||||
- [Datadog Log Search Syntax](https://docs.datadoghq.com/logs/explorer/search_syntax/) — reference for customizing widget queries.
|
||||
- [Datadog Dashboard JSON Schema](https://docs.datadoghq.com/dashboards/graphing_json/) — full reference for the dashboard file format if you want to script changes.
|
||||
@@ -1,146 +0,0 @@
|
||||
---
title: "Structured JSON Logs"
description: "Emit single-line JSON log events from CrewAI AMP deployments for cheaper, structured ingestion in Datadog, Splunk, Loki, and other log backends."
icon: "brackets-curly"
mode: "wide"
---

<Note>
**Translation in progress**: content is shown in English.
</Note>
|
||||
|
||||
CrewAI AMP can emit one JSON object per log event on stdout instead of the default multi-line text format. Each event ships with typed context fields (automation, kickoff, execution, trace IDs, exception details), making logs cheaper to index, easier to search, and trivially correlatable with traces.
|
||||
|
||||
This page describes the JSON schema, how to enable it, and how to verify it's working. For a ready-made Datadog dashboard built on top of these fields, see [Datadog Dashboard for crewAI](/ko/enterprise/guides/datadog_dashboard).
|
||||
|
||||
## Why use JSON output
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Lower ingestion cost" icon="dollar-sign">
|
||||
Most managed log backends bill per event. A Python traceback in text format is counted as one event per line — 30+ events for a single error. JSON output collapses each traceback into a single event with the stack trace as an escaped string field.
|
||||
</Card>
|
||||
<Card title="Structured search" icon="magnifying-glass">
|
||||
Search by `@automation_id`, `@exception.type`, `@kickoff_id` instead of grepping free-text. Build dashboards on typed facets without parser configuration.
|
||||
</Card>
|
||||
<Card title="APM ↔ logs correlation" icon="link">
|
||||
Every event carries `trace_id` and `span_id` when fired inside a recording span, so backends auto-link logs to traces.
|
||||
</Card>
|
||||
<Card title="Backend agnostic" icon="server">
|
||||
The format is plain JSON — Datadog, Splunk, Loki, Elasticsearch, and CloudWatch all parse it natively without custom log pipelines.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
## Enabling JSON output
|
||||
|
||||
Set the `CREWAI_LOG_FORMAT` environment variable to `json` on every container that runs your deployment (API + workers).
|
||||
|
||||
```shell
CREWAI_LOG_FORMAT=json
```
|
||||
|
||||
Restart the deployment to pick up the change. Every log line on stdout from that point on is a single JSON object.
|
||||
|
||||
<Note>
|
||||
The default value is `text`, which preserves the legacy human-readable line format byte-for-byte. Any value other than `json` falls back to text mode. There is no migration step: the variable is read at process start, so the format switches as soon as the deployment restarts.
|
||||
</Note>
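
The fallback rule in the note can be sketched in a few lines of Python. This is a minimal illustration, not AMP's implementation: `resolve_log_format` is a hypothetical helper, and treating the comparison as exact and case-sensitive is an assumption the source does not confirm.

```python
import os
from typing import Mapping, Optional


def resolve_log_format(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "json" only when CREWAI_LOG_FORMAT is exactly "json".

    Hypothetical helper; exact, case-sensitive matching is an assumption.
    """
    source = os.environ if env is None else env
    value = source.get("CREWAI_LOG_FORMAT", "text")
    # Anything other than "json" (including an unset variable) means text mode.
    return "json" if value == "json" else "text"
```

Passing a plain dict instead of relying on `os.environ` keeps the rule easy to check in isolation.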
|
||||
|
||||
## What a log event looks like
|
||||
|
||||
A single info-level log inside an active automation kickoff:
|
||||
|
||||
```json
{
  "schema": "v1",
  "ts": "2026-06-17T16:14:23.482914Z",
  "level": "INFO",
  "logger": "crewai_enterprise.utilities.pii_redaction",
  "crewai_version": "1.14.7",
  "msg": "PII tracking state reset (engines preserved)",
  "automation_id": "12",
  "task_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "automation_name": "research_flow"
}
```
|
||||
|
||||
An error with a Python exception is collapsed into a single event with the traceback as a string:
|
||||
|
||||
```json
{
  "schema": "v1",
  "ts": "2026-06-17T16:14:31.218450Z",
  "level": "ERROR",
  "logger": "api.tasks.flow_run_task",
  "crewai_version": "1.14.7",
  "msg": "Flow execution failed",
  "automation_id": "12",
  "kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "automation_name": "research_flow",
  "exception": {
    "type": "ValueError",
    "message": "Topic cannot be empty",
    "stacktrace": "Traceback (most recent call last):\n File \"/app/flow.py\", line 42, in summarize\n ...\nValueError: Topic cannot be empty\n"
  }
}
```
|
||||
|
||||
The same error in legacy text mode would have produced ~25 separate log events (one per traceback line) — all of which the backend would bill and index individually.
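
To make the collapsing concrete, here is a minimal sketch of a standard-library `logging.Formatter` that writes one JSON object per record and folds the traceback into a string field. It is an illustration only: `OneLineJsonFormatter` and `demo` are hypothetical names, the sketch emits a subset of the fields shown above, and it is not CrewAI AMP's actual formatter.

```python
import io
import json
import logging
from datetime import datetime, timezone


class OneLineJsonFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per record, traceback included."""

    def format(self, record: logging.LogRecord) -> str:
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc)
        event = {
            "schema": "v1",
            "ts": ts.strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        if record.exc_info and record.exc_info[0] is not None:
            exc_type, exc_value, _tb = record.exc_info
            event["exception"] = {
                "type": exc_type.__name__,
                "message": str(exc_value),
                # formatException returns the multi-line traceback text;
                # json.dumps escapes the newlines so the event stays on one line.
                "stacktrace": self.formatException(record.exc_info),
            }
        return json.dumps(event)


def demo() -> str:
    """Log one exception through the formatter and return the captured output."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(OneLineJsonFormatter())
    logger = logging.getLogger("demo.flow_run_task")
    logger.addHandler(handler)
    logger.propagate = False
    try:
        raise ValueError("Topic cannot be empty")
    except ValueError:
        logger.exception("Flow execution failed")
    logger.removeHandler(handler)
    return stream.getvalue()
```

Calling `demo()` yields a single line of output for the whole error, which is the property that keeps per-event billing down.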
|
||||
|
||||
## Schema v1 field reference
|
||||
|
||||
Within the `v1` schema, fields are only added, never renamed or removed. New fields will appear as soon as a deployment is upgraded.
|
||||
|
||||
| Field | Type | Present | Description |
|-------|------|---------|-------------|
| `schema` | string | Yes | Constant `"v1"`. Increment indicates a breaking schema change. |
| `ts` | string (ISO-8601 UTC, microseconds) | Yes | Record creation time, e.g. `2026-06-17T16:14:23.482914Z`. |
| `level` | string | Yes | Python log level name: `DEBUG` / `INFO` / `WARNING` / `ERROR` / `CRITICAL`. |
| `logger` | string | Yes | Dotted logger name, e.g. `api.tasks.flow_run_task`. |
| `crewai_version` | string | Yes (when `crewai` package metadata is resolvable) | Installed `crewai` package version, e.g. `"1.14.7"`. |
| `msg` | string | Yes | Rendered log message (after `%`-formatting / `{}`-formatting). |
| `automation_id` | string | When `CREWAI_PLUS_ID` env var is set | Numeric deployment ID (AMP provisions this on every container). |
| `task_id` | string | On Celery worker logs | Celery task UUID, or `"no-task"` for non-task contexts. |
| `kickoff_id` | string | Inside an automation kickoff | UUID of the current kickoff. |
| `execution_id` | string | Inside an automation kickoff | UUID of the current sub-execution. Equal to `kickoff_id` at the top level; differs for nested flow methods that spawn sub-executions. |
| `automation_name` | string | Inside an automation kickoff | Human-readable automation/flow name, e.g. `"research_flow"`. |
| `trace_id` | string (32-hex) | Inside a recording OpenTelemetry span | Hex trace ID. Omitted when no span is active. |
| `span_id` | string (16-hex) | Inside a recording OpenTelemetry span | Hex span ID. Omitted when no span is active. |
| `exception` | object | When the log record has `exc_info` | `{type, message, stacktrace}`, with the full traceback as a single escaped string. |
|
||||
|
||||
<Tip>
|
||||
Any additional `extra={...}` kwargs passed to a logger call appear as top-level JSON fields verbatim. Reserved field names above always win to keep the schema stable.
|
||||
</Tip>
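
The precedence rule in the tip can be pictured as a dictionary merge. The sketch below is an assumption-laden illustration: `merge_extra` is a hypothetical helper, and dropping a reserved name from the extras even when the event itself does not carry that field is a guess, since the source only says reserved names always win.

```python
RESERVED_FIELDS = frozenset({
    "schema", "ts", "level", "logger", "crewai_version", "msg",
    "automation_id", "task_id", "kickoff_id", "execution_id",
    "automation_name", "trace_id", "span_id", "exception",
})


def merge_extra(event: dict, extra: dict) -> dict:
    """Merge caller-supplied extras into an event without touching reserved fields.

    Hypothetical helper; the handling of reserved names missing from the
    event is an assumption, not documented behavior.
    """
    # Extras become top-level keys unless they collide with the schema.
    merged = {key: value for key, value in extra.items() if key not in RESERVED_FIELDS}
    # Schema fields are applied last so they can never be overwritten.
    merged.update(event)
    return merged
```

With this rule, an extra such as `tenant` lands at the top level, while an extra named `level` is ignored in favor of the record's real level.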
|
||||
|
||||
## Verifying it's working
|
||||
|
||||
After enabling the env var and restarting, fetch the latest container logs and confirm each line is a single JSON object:
|
||||
|
||||
```shell
# Example: docker logs <api-container> --tail 10
docker logs $(docker ps -qf name=crewai-api) --tail 10 | jq -r '.msg'
```
|
||||
|
||||
If the output is JSON, each line will parse successfully and `jq` will print only the `msg` field. If you see "parse error", the env var didn't take effect — confirm it's set in the running container and that the deployment was restarted after the change.
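
Where `jq` is not available, the same check can be sketched in Python. This is a hedged example rather than an official tool: `non_json_lines` is a hypothetical helper that reports which lines fail to parse as a JSON object.

```python
import json
from typing import Iterable, List


def non_json_lines(lines: Iterable[str]) -> List[int]:
    """Return 1-based numbers of lines that are not a single JSON object."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines between events
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            bad.append(number)
            continue
        if not isinstance(parsed, dict):
            # Valid JSON, but not an object (for example a bare number).
            bad.append(number)
    return bad
```

Feeding it the captured container output (for instance `non_json_lines(sys.stdin)` in a small script) should return an empty list once JSON mode is active.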
|
||||
|
||||
## Compatibility and versioning
|
||||
|
||||
The `schema` field declares the contract. Within `v1`, CrewAI commits to:
|
||||
|
||||
- **Never removing a field** that customers may have built queries or dashboards against.
|
||||
- **Never renaming a field** in place — renames happen via a schema bump (e.g. `v2`), with the old name kept as a deprecated alias for at least one release cycle.
|
||||
- **Adding new fields** at any time. Consumers should ignore unknown top-level keys.
|
||||
|
||||
When a `v2` is introduced, the new `schema` value and a migration guide will be published in advance, and `v1` will continue to be emitted for one release cycle so dashboards and queries have time to migrate.
|
||||
|
||||
## What's next
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Datadog Dashboard for crewAI" icon="dog" href="/ko/enterprise/guides/datadog_dashboard">
|
||||
Import a ready-made operations dashboard built on these facets — executions, errors, token cost, version distribution. Works with both the Datadog Agent and Datadog's OTLP intake.
|
||||
</Card>
|
||||
<Card title="OpenTelemetry Export" icon="magnifying-glass-chart" href="/ko/enterprise/guides/capture_telemetry_logs">
|
||||
Ship logs and traces to your own OTel collector or directly to a backend's OTLP intake. The same context fields land as OTLP attributes, so the dashboard works regardless of which path you use.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
@@ -49,9 +49,7 @@ Os dados de telemetria seguem as [convenções semânticas GenAI do OpenTelemetr
|
||||
- `otlp.ap1.datadoghq.com` (AP1)
|
||||
- **API Key**: Your Datadog API key. See [how to create one](https://docs.datadoghq.com/account_management/api-app-keys/#api-keys).
|
||||
|
||||
The default Datadog template sends **traces** to the `/v1/traces` path. To export **logs** via OTLP, add an **OpenTelemetry Logs** collector pointing at the same Datadog OTLP host with the path set to `/v1/logs`; both signals can run side by side.
|
||||
|
||||
To ship logs via stdout (the Datadog Agent path) instead of OTLP, see [Structured JSON Logs](/pt-BR/enterprise/guides/structured_logs) and [Datadog Dashboard for crewAI](/pt-BR/enterprise/guides/datadog_dashboard).
|
||||
The Datadog integration exports **traces**.
|
||||
|
||||
<Frame></Frame>
|
||||
</Tab>
|
||||
|
||||
@@ -1,140 +0,0 @@
|
||||
---
title: "Datadog Dashboard for crewAI"
description: "Import a ready-made Datadog dashboard to monitor self-hosted CrewAI AMP deployments: executions, errors, token cost, and version distribution. Works with the Datadog Agent and with Datadog's OTLP intake."
icon: "dog"
mode: "wide"
---

<Note>
**Translation in progress**: content is shown in English.
</Note>
|
||||
|
||||
CrewAI ships a ready-made Datadog dashboard for self-hosted AMP deployments. Once your logs are flowing into Datadog, you can import the dashboard JSON and have an operations view live in your account in under five minutes.
|
||||
|
||||
The dashboard works with either of Datadog's two log-ingestion paths — pick whichever fits your infrastructure:
|
||||
|
||||
<Tabs>
|
||||
<Tab title="Datadog Agent (stdout)">
|
||||
The Datadog Agent runs alongside your CrewAI containers (typically as a DaemonSet on Kubernetes) and tails their stdout. This path requires enabling [Structured JSON Logs](/pt-BR/enterprise/guides/structured_logs) so each log event is a single billable line instead of a multi-line traceback.
|
||||
|
||||
**Setup:**
|
||||
1. Set `CREWAI_LOG_FORMAT=json` on every CrewAI container — see [Structured JSON Logs](/pt-BR/enterprise/guides/structured_logs) for full details.
|
||||
2. Install the Datadog Agent in your cluster following [Datadog's Kubernetes setup guide](https://docs.datadoghq.com/containers/kubernetes/installation/). Enable log collection (`logs_enabled: true`) and container log collection (`logs_config.container_collect_all: true`).
|
||||
3. Confirm logs are landing in Datadog by searching `service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).
|
||||
|
||||
**When to pick this path:** you already run the Datadog Agent for infrastructure metrics, or you want logs without configuring an OTel collector in AMP.
|
||||
</Tab>
|
||||
<Tab title="Datadog OTLP intake (no agent)">
|
||||
Datadog accepts OTLP traffic directly at its intake endpoint, no agent required. Configure CrewAI AMP's built-in OTel collector to point at Datadog's OTLP host.
|
||||
|
||||
**Setup:**
|
||||
1. In CrewAI AMP: **Settings → OpenTelemetry Collectors → Add Collector → Datadog**. See [OpenTelemetry Export](/pt-BR/enterprise/guides/capture_telemetry_logs) for the full collector setup.
|
||||
2. The default Datadog template ships **traces** to `/v1/traces`. For log export, add an **OpenTelemetry Logs** collector that uses the same Datadog OTLP host with the endpoint path set to `/v1/logs`.
|
||||
3. Confirm logs are landing by searching `source:otlp service:crewai*` in the [Logs Explorer](https://app.datadoghq.com/logs).
|
||||
|
||||
**When to pick this path:** you can't or don't want to run the Datadog Agent, or you're already using OTLP for traces and want a single export pipeline.
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
Either path lands the same structured facets in Datadog (`@automation_id`, `@kickoff_id`, `@execution_id`, `@automation_name`, `@crewai_version`, `@exception.type`, `@gen_ai.*`), so the dashboard works identically with either choice.
|
||||
|
||||
## Prerequisite: promote facets
|
||||
|
||||
Datadog auto-discovers fields the first time it sees them but doesn't make them queryable in widgets until they're promoted to **facets**. This is a one-time setup in your Datadog account.
|
||||
|
||||
<Steps>
|
||||
<Step title="Search for a CrewAI log">
|
||||
Open [Logs Explorer](https://app.datadoghq.com/logs) and search `service:crewai*`. You should see at least one log event.
|
||||
</Step>
|
||||
<Step title="Promote each field">
|
||||
Click any log entry to open the right-hand details panel. For each field below, hover the field name → click the gear icon → **Create facet**.
|
||||
|
||||
- `automation_id`, `automation_name`, `execution_id`, `kickoff_id`, `task_id`
|
||||
- `crewai_version`, `model_id`
|
||||
- `exception.type`, `exception.message`
|
||||
|
||||
Skip any field that already shows a star icon next to its name — that means it's already a facet. The `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, and `gen_ai.request.model` facets are typically promoted automatically by Datadog's LLM Observability auto-discovery, but verify they exist before importing the dashboard.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
## Import the dashboard
|
||||
|
||||
<Steps>
|
||||
<Step title="Download the dashboard JSON">
|
||||
Save [`datadog_dashboard.json`](https://raw.githubusercontent.com/crewAIInc/crewAI/main/docs/edge/en/enterprise/guides/datadog_dashboard.json) to your machine.
|
||||
</Step>
|
||||
<Step title="Open the import dialog in Datadog">
|
||||
Navigate to **Dashboards → New Dashboard**. Click the **gear icon** in the top right of the empty dashboard and select **Import Dashboard JSON**.
|
||||
</Step>
|
||||
<Step title="Paste or upload the JSON">
|
||||
Paste the contents of `datadog_dashboard.json` into the import dialog (or drag the file in). Click **Import**.
|
||||
|
||||
Datadog creates the dashboard immediately and lands you on it. The first load may show empty widgets for a few seconds while queries execute against the time range.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
<Tip>
|
||||
Datadog's [Dashboard API](https://docs.datadoghq.com/api/latest/dashboards/#create-a-new-dashboard) accepts the same JSON via `POST /api/v1/dashboard`. Use it if you manage dashboards through Terraform, Pulumi, or CI.
|
||||
</Tip>
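
For scripted imports, a minimal sketch of the API call using only the Python standard library might look like the following. `build_dashboard_request` is a hypothetical helper; the `site` parameter is an assumption that lets you target a regional host such as `datadoghq.eu`. The `DD-API-KEY` and `DD-APPLICATION-KEY` headers are Datadog's standard API authentication headers.

```python
import json
import urllib.request


def build_dashboard_request(
    dashboard: dict,
    api_key: str,
    app_key: str,
    site: str = "datadoghq.com",
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/v1/dashboard request.

    Hypothetical helper; adjust `site` to match your Datadog region.
    """
    return urllib.request.Request(
        url=f"https://api.{site}/api/v1/dashboard",
        data=json.dumps(dashboard).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
        method="POST",
    )
```

To send it, load `datadog_dashboard.json` with `json.load`, pass the result to `build_dashboard_request`, and hand the returned object to `urllib.request.urlopen`. Keeping construction separate from sending makes the request easy to inspect in CI before any network call happens.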
|
||||
|
||||
## What you get
|
||||
|
||||
The dashboard is organized into four sections plus a placeholder for a custom drill-down widget:
|
||||
|
||||
| Section | Widgets | Useful for |
|---------|---------|------------|
| **Header** | Total Executions · Error Rate (%) · Active Automations · CrewAI Versions in Use | At-a-glance health for the last hour. Error Rate is conditionally formatted (green ≤ 5%, yellow ≤ 10%, red > 10%). |
| **Throughput** | Executions per Hour by Automation (top 10, stacked bars) | Spotting traffic shifts, surfacing busy automations, validating that a rollout didn't change baseline volume. |
| **Errors** | Errors by Exception Type (top 5, stacked bars) · Top Exception Types by Count (toplist) | Triaging failures: which exception types are spiking, which automations they're hitting. |
| **Cost** | Total Tokens per Hour by Model (input + output, stacked area) | Tracking LLM token spend by model. Useful for catching cost regressions when an automation switches model or starts looping. |
| **Drill-Down** | _(empty placeholder)_ | See [Customization](#customization) for adding a recent-errors log stream here. |
|
||||
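
The Error Rate thresholds in the Header row can be expressed as a small function. This is a sketch under assumptions: `error_rate_color` is a hypothetical name, and computing the rate as errors divided by executions is a guess at how the widget derives its percentage.

```python
def error_rate_color(errors: int, executions: int) -> str:
    """Map an error rate to the dashboard's color bands.

    Hypothetical helper; the errors / executions formula is an assumption.
    """
    if executions == 0:
        # No executions in the window: the rate is undefined.
        return "no-data"
    rate = 100.0 * errors / executions
    if rate <= 5:
        return "green"
    if rate <= 10:
        return "yellow"
    return "red"
```

The zero-executions branch corresponds to the `NaN` case listed under Troubleshooting.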
|
||||
Three template variables at the top of the dashboard re-scope every widget at once:
|
||||
|
||||
- **`$automation`** — filter to a single automation by name.
|
||||
- **`$version`** — filter to a single `crewai` SDK version (useful for comparing pre- and post-upgrade behavior).
|
||||
- **`$service`** — filter to a specific Datadog `service` tag (useful when multiple CrewAI deployments share one Datadog account).
|
||||
|
||||
## Customization
|
||||
|
||||
The dashboard ships with deliberate gaps so you can extend it without uninstalling and re-importing.
|
||||
|
||||
### Add a Recent Errors log stream
|
||||
|
||||
The **Drill-Down** section is intentionally empty. Add a Log Stream widget to it for an inline view of recent failures:
|
||||
|
||||
1. Edit the dashboard and click **+ Add Widgets** inside the Drill-Down group.
|
||||
2. Drag in a **Log Stream** widget.
|
||||
3. Set the filter query to `status:error $automation $version $service`.
|
||||
4. Choose columns: `@timestamp`, `@automation_name`, `@exception.type`, `@exception.message`, `@execution_id`.
|
||||
5. Sort by most recent, limit to 25 entries.
|
||||
|
||||
Clicking any row jumps to Logs Explorer with the same filter pre-applied.
|
||||
|
||||
### Add p95 latency
|
||||
|
||||
Logs don't include execution duration by default. Two ways to add a latency widget:
|
||||
|
||||
- **From APM traces** — if you also export OTLP traces to Datadog, add a Timeseries widget with data source **Traces**, query `service:crewai*`, aggregation `p95 of @duration`. Datadog APM auto-tracks span duration.
|
||||
- **From metric extraction** — extract a `flow.duration_ms` metric from logs via [Datadog's log-to-metric pipeline](https://docs.datadoghq.com/logs/log_configuration/logs_to_metrics/), then chart it like any other metric. Useful if you don't run APM.
|
||||
|
||||
### Re-scope to multiple deployments
|
||||
|
||||
The `$service` template variable defaults to `*` and will catch every CrewAI deployment in your Datadog account. Change the default to a specific service name in **Configure → Template Variables** if you want the dashboard to focus on one deployment by default.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| All widgets show "No data" | Facets aren't promoted | Re-do the [Promote facets](#prerequisite-promote-facets) step. Datadog won't query against an un-promoted field. |
| Error Rate widget shows `NaN` | No executions in the time window | Either no traffic, or `@execution_id` isn't faceted. Expand the time range and re-check facets. |
| Throughput chart is flat at the same value | Logs aren't reaching Datadog | Search `service:crewai*` in Logs Explorer. If nothing shows, verify the Datadog Agent is running (Agent path) or the OTel collector endpoint is correct (OTLP path). |
| `crewai_version` shows fewer values than expected | Some containers predate the structured-logs work | The `crewai_version` field was added alongside JSON output. Older deployments running text mode (or older AMP builds) won't emit it. Upgrade those deployments to pick up the field. |
| Template variables don't filter widgets | The widget's filter line doesn't reference the template variable | Edit the widget and confirm the search includes `$automation $version $service`. |
|
||||
|
||||
## References
|
||||
|
||||
- [Structured JSON Logs](/pt-BR/enterprise/guides/structured_logs) — the underlying log format the dashboard queries against.
|
||||
- [OpenTelemetry Export](/pt-BR/enterprise/guides/capture_telemetry_logs) — set up the OTLP path if you're not using the Datadog Agent.
|
||||
- [Datadog Log Search Syntax](https://docs.datadoghq.com/logs/explorer/search_syntax/) — reference for customizing widget queries.
|
||||
- [Datadog Dashboard JSON Schema](https://docs.datadoghq.com/dashboards/graphing_json/) — full reference for the dashboard file format if you want to script changes.
|
||||
@@ -1,146 +0,0 @@
|
||||
---
title: "Structured JSON Logs"
description: "Emit single-line JSON log events from CrewAI AMP deployments for cheaper, structured ingestion in Datadog, Splunk, Loki, and other log backends."
icon: "brackets-curly"
mode: "wide"
---

<Note>
**Translation in progress**: content is shown in English.
</Note>
|
||||
|
||||
CrewAI AMP can emit one JSON object per log event on stdout instead of the default multi-line text format. Each event ships with typed context fields (automation, kickoff, execution, trace IDs, exception details), making logs cheaper to index, easier to search, and trivially correlatable with traces.
|
||||
|
||||
This page describes the JSON schema, how to enable it, and how to verify it's working. For a ready-made Datadog dashboard built on top of these fields, see [Datadog Dashboard for crewAI](/pt-BR/enterprise/guides/datadog_dashboard).
|
||||
|
||||
## Why use JSON output
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Lower ingestion cost" icon="dollar-sign">
|
||||
Most managed log backends bill per event. A Python traceback in text format is counted as one event per line — 30+ events for a single error. JSON output collapses each traceback into a single event with the stack trace as an escaped string field.
|
||||
</Card>
|
||||
<Card title="Structured search" icon="magnifying-glass">
|
||||
Search by `@automation_id`, `@exception.type`, `@kickoff_id` instead of grepping free-text. Build dashboards on typed facets without parser configuration.
|
||||
</Card>
|
||||
<Card title="APM ↔ logs correlation" icon="link">
|
||||
Every event carries `trace_id` and `span_id` when fired inside a recording span, so backends auto-link logs to traces.
|
||||
</Card>
|
||||
<Card title="Backend agnostic" icon="server">
|
||||
The format is plain JSON — Datadog, Splunk, Loki, Elasticsearch, and CloudWatch all parse it natively without custom log pipelines.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
## Enabling JSON output
|
||||
|
||||
Set the `CREWAI_LOG_FORMAT` environment variable to `json` on every container that runs your deployment (API + workers).
|
||||
|
||||
```shell
CREWAI_LOG_FORMAT=json
```
|
||||
|
||||
Restart the deployment to pick up the change. Every log line on stdout from that point on is a single JSON object.
|
||||
|
||||
<Note>
|
||||
The default value is `text`, which preserves the legacy human-readable line format byte-for-byte. Any value other than `json` falls back to text mode. There is no migration step: the variable is read at process start, so the format switches as soon as the deployment restarts.
|
||||
</Note>
|
||||
|
||||
## What a log event looks like
|
||||
|
||||
A single info-level log inside an active automation kickoff:
|
||||
|
||||
```json
{
  "schema": "v1",
  "ts": "2026-06-17T16:14:23.482914Z",
  "level": "INFO",
  "logger": "crewai_enterprise.utilities.pii_redaction",
  "crewai_version": "1.14.7",
  "msg": "PII tracking state reset (engines preserved)",
  "automation_id": "12",
  "task_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "automation_name": "research_flow"
}
```
|
||||
|
||||
An error with a Python exception is collapsed into a single event with the traceback as a string:
|
||||
|
||||
```json
{
  "schema": "v1",
  "ts": "2026-06-17T16:14:31.218450Z",
  "level": "ERROR",
  "logger": "api.tasks.flow_run_task",
  "crewai_version": "1.14.7",
  "msg": "Flow execution failed",
  "automation_id": "12",
  "kickoff_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "execution_id": "0843a930-b306-464b-89c8-bfafa78cc711",
  "automation_name": "research_flow",
  "exception": {
    "type": "ValueError",
    "message": "Topic cannot be empty",
    "stacktrace": "Traceback (most recent call last):\n File \"/app/flow.py\", line 42, in summarize\n ...\nValueError: Topic cannot be empty\n"
  }
}
```
|
||||
|
||||
The same error in legacy text mode would have produced ~25 separate log events (one per traceback line) — all of which the backend would bill and index individually.
|
||||
|
||||
## Schema v1 field reference
|
||||
|
||||
Within the `v1` schema, fields are only added, never renamed or removed. New fields will appear as soon as a deployment is upgraded.
|
||||
|
||||
| Field | Type | Present | Description |
|-------|------|---------|-------------|
| `schema` | string | Yes | Constant `"v1"`. Increment indicates a breaking schema change. |
| `ts` | string (ISO-8601 UTC, microseconds) | Yes | Record creation time, e.g. `2026-06-17T16:14:23.482914Z`. |
| `level` | string | Yes | Python log level name: `DEBUG` / `INFO` / `WARNING` / `ERROR` / `CRITICAL`. |
| `logger` | string | Yes | Dotted logger name, e.g. `api.tasks.flow_run_task`. |
| `crewai_version` | string | Yes (when `crewai` package metadata is resolvable) | Installed `crewai` package version, e.g. `"1.14.7"`. |
| `msg` | string | Yes | Rendered log message (after `%`-formatting / `{}`-formatting). |
| `automation_id` | string | When `CREWAI_PLUS_ID` env var is set | Numeric deployment ID (AMP provisions this on every container). |
| `task_id` | string | On Celery worker logs | Celery task UUID, or `"no-task"` for non-task contexts. |
| `kickoff_id` | string | Inside an automation kickoff | UUID of the current kickoff. |
| `execution_id` | string | Inside an automation kickoff | UUID of the current sub-execution. Equal to `kickoff_id` at the top level; differs for nested flow methods that spawn sub-executions. |
| `automation_name` | string | Inside an automation kickoff | Human-readable automation/flow name, e.g. `"research_flow"`. |
| `trace_id` | string (32-hex) | Inside a recording OpenTelemetry span | Hex trace ID. Omitted when no span is active. |
| `span_id` | string (16-hex) | Inside a recording OpenTelemetry span | Hex span ID. Omitted when no span is active. |
| `exception` | object | When the log record has `exc_info` | `{type, message, stacktrace}`, with the full traceback as a single escaped string. |
|
||||
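
The field reference can double as a lightweight contract check in tests or ingestion pipelines. The sketch below is a hypothetical `check_event` helper, not part of CrewAI. It covers only the always-present fields, the timestamp shape, and the hex widths of `trace_id` and `span_id`; accepting upper-case hex digits is an assumption made to stay permissive.

```python
import re
from datetime import datetime
from typing import List

ALWAYS_PRESENT = ("schema", "ts", "level", "logger", "msg")
LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
HEX_WIDTHS = {"trace_id": 32, "span_id": 16}


def check_event(event: dict) -> List[str]:
    """Return a list of human-readable problems; empty means the event looks valid."""
    problems = []
    for name in ALWAYS_PRESENT:
        if not isinstance(event.get(name), str):
            problems.append(f"{name}: missing or not a string")
    ts = event.get("ts")
    if isinstance(ts, str):
        try:
            # Matches timestamps such as 2026-06-17T16:14:23.482914Z.
            datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
        except ValueError:
            problems.append("ts: not an ISO-8601 UTC timestamp with fractional seconds")
    level = event.get("level")
    if isinstance(level, str) and level not in LEVELS:
        problems.append(f"level: unexpected value {level!r}")
    for name, width in HEX_WIDTHS.items():
        value = event.get(name)
        # trace_id and span_id are optional, so only check them when present.
        if value is not None and not re.fullmatch(rf"[0-9a-fA-F]{{{width}}}", str(value)):
            problems.append(f"{name}: expected {width} hex characters")
    return problems
```

Unknown top-level keys are deliberately not flagged, since new fields may be added within `v1`.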
|
||||
<Tip>
|
||||
Any additional `extra={...}` kwargs passed to a logger call appear as top-level JSON fields verbatim. Reserved field names above always win to keep the schema stable.
|
||||
</Tip>
|
||||
|
||||
## Verifying it's working
|
||||
|
||||
After enabling the env var and restarting, fetch the latest container logs and confirm each line is a single JSON object:
|
||||
|
||||
```shell
# Example: docker logs <api-container> --tail 10
docker logs $(docker ps -qf name=crewai-api) --tail 10 | jq -r '.msg'
```
|
||||
|
||||
If the output is JSON, each line will parse successfully and `jq` will print only the `msg` field. If you see "parse error", the env var didn't take effect — confirm it's set in the running container and that the deployment was restarted after the change.
|
||||
|
||||
## Compatibility and versioning
|
||||
|
||||
The `schema` field declares the contract. Within `v1`, CrewAI commits to:
|
||||
|
||||
- **Never removing a field** that customers may have built queries or dashboards against.
|
||||
- **Never renaming a field** in place — renames happen via a schema bump (e.g. `v2`), with the old name kept as a deprecated alias for at least one release cycle.
|
||||
- **Adding new fields** at any time. Consumers should ignore unknown top-level keys.
|
||||
|
||||
When a `v2` is introduced, the new `schema` value and a migration guide will be published in advance, and `v1` will continue to be emitted for one release cycle so dashboards and queries have time to migrate.
|
||||
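
On the consumer side, these commitments suggest a tolerant reader: check the `schema` value, pick out the fields you rely on, and ignore everything else. A minimal sketch follows, assuming a hypothetical `read_event` helper and an illustrative choice of `KNOWN_FIELDS`.

```python
import json
from typing import Optional

# Fields this particular consumer cares about; an illustrative selection.
KNOWN_FIELDS = ("ts", "level", "logger", "msg", "automation_name", "execution_id")


def read_event(line: str) -> Optional[dict]:
    """Parse one log line, keeping only known fields from v1 events."""
    event = json.loads(line)
    if event.get("schema") != "v1":
        # Unknown contract: return None and let the caller decide what to do.
        return None
    # Unknown top-level keys are dropped silently, so added fields never break us.
    return {name: event[name] for name in KNOWN_FIELDS if name in event}
```

A reader written this way keeps working when a deployment upgrade adds fields, and it fails visibly rather than silently when the schema value changes.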
|
||||
## What's next
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Datadog Dashboard for crewAI" icon="dog" href="/pt-BR/enterprise/guides/datadog_dashboard">
|
||||
Import a ready-made operations dashboard built on these facets — executions, errors, token cost, version distribution. Works with both the Datadog Agent and Datadog's OTLP intake.
|
||||
</Card>
|
||||
<Card title="OpenTelemetry Export" icon="magnifying-glass-chart" href="/pt-BR/enterprise/guides/capture_telemetry_logs">
|
||||
Ship logs and traces to your own OTel collector or directly to a backend's OTLP intake. The same context fields land as OTLP attributes, so the dashboard works regardless of which path you use.
|
||||
</Card>
|
||||
</CardGroup>
|
||||