diff --git a/docs/en/observability/confident-ai.mdx b/docs/en/observability/confident-ai.mdx
new file mode 100644
index 000000000..4ba2412a1
--- /dev/null
+++ b/docs/en/observability/confident-ai.mdx
@@ -0,0 +1,137 @@
+---
+title: Confident AI Integration
+description: Monitor and evaluate your CrewAI agents with Confident AI's comprehensive evaluation platform powered by DeepEval.
+icon: shield-check
+---
+
+# Confident AI Overview
+
+[Confident AI](https://confident-ai.com) is a comprehensive evaluation platform for LLM applications, powered by [DeepEval](https://github.com/confident-ai/deepeval). It provides advanced monitoring, evaluation, and optimization capabilities specifically designed for AI agent workflows.
+
+Confident AI offers both tracing capabilities to monitor your agents in real time and evaluation tools to assess the quality, safety, and performance of your CrewAI applications.
+
+### Features
+
+- **Real-time Monitoring**: Track agent interactions, task execution, and performance metrics
+- **Comprehensive Evaluation**: Assess output quality, relevance, safety, and consistency
+- **Cost Tracking**: Monitor LLM API usage and associated costs across your crews
+- **Safety & Compliance**: Detect potential issues like bias, toxicity, and PII leaks
+- **Performance Analytics**: Analyze execution times, success rates, and bottlenecks
+- **Custom Metrics**: Define and track domain-specific evaluation criteria
+- **Team Collaboration**: Share insights and collaborate on agent optimization
+
+## Setup Instructions
+
+<Steps>
+  <Step title="Install Dependencies">
+ ```shell
+ pip install deepeval crewai
+ ```
+  </Step>
+  <Step title="Get Your API Key">
+ 1. Sign up at [Confident AI](https://confident-ai.com)
+ 2. Navigate to your project settings
+ 3. Copy your API key
+  </Step>
+  <Step title="Set Up Tracing">
+  Instrument CrewAI by calling `instrument_crewai` with your Confident AI API key before creating your agents:
+
+ ```python
+ from crewai import Task, Crew, Agent
+ from deepeval.integrations.crewai import instrument_crewai
+
+  # Authenticate with your Confident AI API key (or set the CONFIDENT_API_KEY environment variable)
+  instrument_crewai(api_key="your-confident-api-key")
+
+ agent = Agent(
+ role="Consultant",
+      goal="Write clear, concise explanations.",
+ backstory="An expert consultant with a keen eye for software trends.",
+ )
+
+ task = Task(
+ description="Explain the importance of {topic}",
+ expected_output="A clear and concise explanation of the topic.",
+ agent=agent,
+ )
+
+ crew = Crew(agents=[agent], tasks=[task])
+
+ result = crew.kickoff(inputs={"topic": "AI"})
+ ```
+  </Step>
+  <Step title="Run Evaluations">
+ For comprehensive evaluation of your crew's outputs:
+
+ ```python
+ from deepeval import evaluate
+ from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric
+ from deepeval.test_case import LLMTestCase
+
+ # Define evaluation metrics
+ relevancy_metric = AnswerRelevancyMetric(threshold=0.7)
+ faithfulness_metric = FaithfulnessMetric(threshold=0.8)
+
+ # Execute crew
+ result = crew.kickoff(inputs={"topic": "artificial intelligence"})
+
+  # Create a test case from the crew's output
+  test_case = LLMTestCase(
+      input="Explain the importance of artificial intelligence",
+      actual_output=str(result),
+      # FaithfulnessMetric compares the output against retrieval_context,
+      # so pass the source material your crew actually drew from
+      retrieval_context=["<documents or context used by the crew>"],
+  )
+
+ # Evaluate the output
+ evaluate([test_case], [relevancy_metric, faithfulness_metric])
+ ```
+  </Step>
+  <Step title="View Results">
+ After running your CrewAI application with Confident AI integration:
+
+ 1. Visit your [Confident AI dashboard](https://confident-ai.com/dashboard)
+ 2. Navigate to your project to view traces and evaluations
+ 3. Analyze agent performance, costs, and quality metrics
+ 4. Set up alerts for performance thresholds or quality issues
+  </Step>
+</Steps>
+
+## Key Metrics Tracked
+
+### Performance Metrics
+- **Execution Time**: Duration of individual tasks and overall crew execution
+- **Token Usage**: Input/output tokens consumed by each agent (see the sketch after this list)
+- **API Latency**: Response times from LLM providers
+- **Success Rate**: Percentage of successfully completed tasks
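+
+Confident AI surfaces these numbers on its dashboard. If you also want to inspect token usage locally, recent CrewAI versions expose aggregated usage on the kickoff result. A minimal sketch, assuming the `crew` object from the setup above (attribute names may vary by CrewAI version):
+
+```python
+# Inspect aggregated token usage after a kickoff
+result = crew.kickoff(inputs={"topic": "AI"})
+
+usage = result.token_usage
+print(f"Total tokens:      {usage.total_tokens}")
+print(f"Prompt tokens:     {usage.prompt_tokens}")
+print(f"Completion tokens: {usage.completion_tokens}")
+```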
+
+### Quality Metrics
+- **Answer Relevancy**: How well outputs address the given tasks
+- **Faithfulness**: Accuracy and consistency of agent responses
+- **Coherence**: Logical flow and structure of outputs (see the `GEval` sketch below)
+- **Safety**: Detection of harmful or inappropriate content
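+
+Relevancy and faithfulness are covered by the built-in metrics shown in the setup above. For criteria like coherence, one option is DeepEval's `GEval`, which scores outputs against evaluation criteria you describe in plain language. A minimal sketch, assuming the `result` object from an earlier `crew.kickoff()` call; the criteria wording and threshold are illustrative:
+
+```python
+from deepeval import evaluate
+from deepeval.metrics import GEval
+from deepeval.test_case import LLMTestCase, LLMTestCaseParams
+
+# Custom coherence check expressed as a GEval metric
+coherence_metric = GEval(
+    name="Coherence",
+    criteria="Assess whether the output is logically structured, well organized, and easy to follow.",
+    evaluation_params=[LLMTestCaseParams.INPUT, LLMTestCaseParams.ACTUAL_OUTPUT],
+    threshold=0.7,
+)
+
+test_case = LLMTestCase(
+    input="Explain the importance of AI",
+    actual_output=str(result),  # output from crew.kickoff()
+)
+
+evaluate([test_case], [coherence_metric])
+```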
+
+### Cost Metrics
+- **API Costs**: Real-time tracking of LLM usage costs
+- **Cost per Task**: Economic efficiency analysis
+- **Budget Monitoring**: Alerts for spending thresholds
+
+## Best Practices
+
+### Development Phase
+- Start with basic tracing to understand agent behavior
+- Implement evaluation metrics early in development
+- Use custom metrics for domain-specific requirements
+- Monitor resource usage during testing
+
+### Production Phase
+- Set up comprehensive monitoring and alerting
+- Track performance trends over time
+- Implement automated quality checks (see the sketch after this list)
+- Maintain cost visibility and control
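+
+One way to automate quality checks is to wrap an evaluation in your test suite with DeepEval's pytest integration, so a pipeline fails when a metric drops below its threshold. A minimal sketch; `build_crew` is a hypothetical factory for your own crew, and the metric and threshold are illustrative:
+
+```python
+# test_crew_quality.py (run with: deepeval test run test_crew_quality.py)
+from deepeval import assert_test
+from deepeval.metrics import AnswerRelevancyMetric
+from deepeval.test_case import LLMTestCase
+
+from my_project.crew import build_crew  # hypothetical helper that assembles your crew
+
+
+def test_crew_answers_are_relevant():
+    crew = build_crew()
+    result = crew.kickoff(inputs={"topic": "AI"})
+
+    test_case = LLMTestCase(
+        input="Explain the importance of AI",
+        actual_output=str(result),
+    )
+    # Fails the test (and your CI job) if relevancy scores below the threshold
+    assert_test(test_case, [AnswerRelevancyMetric(threshold=0.7)])
+```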
+
+### Continuous Improvement
+- Regular performance reviews using Confident AI analytics
+- A/B testing of different agent configurations (see the sketch after this list)
+- Feedback loops for quality improvement
+- Documentation of optimization insights
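+
+For A/B testing, one simple pattern is to run the same task through two agent configurations and score both outputs with the same metric, so the variants can be compared side by side in Confident AI. A minimal sketch; the variant names, backstories, and metric choice are illustrative:
+
+```python
+from crewai import Agent, Crew, Task
+from deepeval import evaluate
+from deepeval.metrics import AnswerRelevancyMetric
+from deepeval.test_case import LLMTestCase
+
+metric = AnswerRelevancyMetric(threshold=0.7)
+
+# Two illustrative agent variants to compare
+variants = {
+    "concise": "You write short, direct explanations.",
+    "detailed": "You write thorough, example-driven explanations.",
+}
+
+test_cases = []
+for name, backstory in variants.items():
+    agent = Agent(
+        role="Consultant",
+        goal="Explain technical topics clearly.",
+        backstory=backstory,
+    )
+    task = Task(
+        description="Explain the importance of {topic}",
+        expected_output="A clear and concise explanation of the topic.",
+        agent=agent,
+    )
+    result = Crew(agents=[agent], tasks=[task]).kickoff(inputs={"topic": "AI"})
+    test_cases.append(
+        LLMTestCase(
+            input=f"[{name}] Explain the importance of AI",
+            actual_output=str(result),
+        )
+    )
+
+# Both variants are scored with the same metric for a side-by-side comparison
+evaluate(test_cases, [metric])
+```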
+
+For more detailed information and advanced configurations, visit the [Confident AI documentation](https://confident-ai.com/docs) and [DeepEval documentation](https://docs.deepeval.com/).
diff --git a/docs/en/observability/overview.mdx b/docs/en/observability/overview.mdx
index e99858c9e..ca4e48a8c 100644
--- a/docs/en/observability/overview.mdx
+++ b/docs/en/observability/overview.mdx
@@ -57,6 +57,10 @@ Observability is crucial for understanding how your CrewAI agents perform, ident
Weights & Biases platform for tracking and evaluating AI applications.
+  <Card title="Confident AI" icon="shield-check" href="/en/observability/confident-ai">
+    Comprehensive evaluation platform powered by DeepEval for monitoring and optimizing agent performance.
+  </Card>
### Evaluation & Quality Assurance