AI Observability & Evaluation resources

9 resources

AgentOps

AI Observability Evaluation

Observability and debugging platform for AI agents.

Tool · Needs verification · Open source · Self-hosted · Freemium · GitHub linked

Arize Phoenix

AI Observability Evaluation

Open-source observability and evaluation tool for LLM and ML applications.

Tool · Needs verification · Open source · Self-hosted · Freemium · GitHub linked

Helicone

AI Observability Evaluation

Open-source observability platform for LLM usage, latency, and cost tracking.

Tool · Needs verification · Open source · Self-hosted · Freemium · GitHub linked

Langfuse

AI Observability Evaluation

Open-source LLM observability and tracing platform.

Tool · Needs verification · Open source · Self-hosted · Freemium · GitHub linked

Promptfoo

AI Observability Evaluation

Open-source tool for testing, evaluating, and red-teaming prompts and LLM apps.

Tool · Needs verification · Open source · Self-hosted · Freemium · GitHub linked

Ragas

AI Observability Evaluation

Evaluation framework for RAG and LLM application quality checks.

Tool · Needs verification · Open source · Self-hosted · GitHub linked

Arize-ai/phoenix

Python

AI Observability & Evaluation

Repository · Auto enriched · Python · Active signal

Helicone/helicone

TypeScript

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

Repository · Auto enriched · TypeScript · Apache-2.0 · Active signal

AgentOps-AI/agentops

Python

Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks, including CrewAI, Agno, OpenAI Agents SDK, LangChain, AutoGen, AG2, and CamelAI.

Repository · Auto enriched · Python · MIT · Active signal