```json
[
  {
    "status": "processed",
    "score": 123,
    "passed": true,
    "label": "<string>",
    "details": "<string>",
    "cost": {
      "currency": "<string>",
      "amount": 123
    }
  }
]
```
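For reference, a response in this shape can be consumed like this (a minimal Python sketch; the `summarize_results` helper is ours, and the sample values below are illustrative, mirroring the example response):

```python
def summarize_results(results: list[dict]) -> dict:
    """Aggregate a list of evaluation results in the response shape above."""
    summary = {"processed": 0, "skipped": 0, "error": 0, "passed": 0}
    for result in results:
        status = result.get("status", "error")
        summary[status] = summary.get(status, 0) + 1
        # "passed" is only meaningful for processed evaluations
        if status == "processed" and result.get("passed"):
            summary["passed"] += 1
    return summary


results = [
    {
        "status": "processed",
        "score": 123,
        "passed": True,
        "label": "match",
        "details": "Outputs agree",
        "cost": {"currency": "USD", "amount": 123},
    }
]
print(summarize_results(results))  # {'processed': 1, 'skipped': 0, 'error': 0, 'passed': 1}
```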
Uses an LLM to check whether the generated output answers the question as correctly as the expected output does, even if their styles differ.
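The idea behind this kind of evaluator is LLM-as-judge: build a prompt comparing the generated and expected outputs and ask the model for a verdict. Everything below (the `judge` function and the stubbed `call_llm`) is a hypothetical illustration of the concept, not LangWatch's actual implementation:

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stub; a real implementation would call an LLM API here.
    return "yes"


def judge(question: str, output: str, expected_output: str) -> bool:
    """Ask an LLM whether `output` answers `question` as correctly as
    `expected_output`, ignoring differences in style."""
    prompt = (
        f"Question: {question}\n"
        f"Expected answer: {expected_output}\n"
        f"Generated answer: {output}\n"
        "Does the generated answer convey the same correct information as "
        "the expected answer, even if worded differently? Answer yes or no."
    )
    return call_llm(prompt).strip().lower().startswith("yes")


print(judge("What is 2+2?", "The sum is four.", "4"))  # True (with the stub above)
```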
#### Headers

- API key for authentication (required)

#### Body

- Optional trace ID to associate this evaluation with a trace

#### Response (successful evaluation)

Returns a list of evaluation results with the following fields:

- `status` (string): One of `processed`, `skipped`, or `error`
- `score` (number): Numeric score from the evaluation
- `passed` (boolean): Whether the evaluation passed
- `label` (string): Label assigned by the evaluation
- `details` (string): Additional details about the evaluation
- `cost` (object): The evaluation cost, with `currency` and `amount` child attributes