Ship reliable, testable agents – not guesses. Better Agents adds simulations, evaluations, and standards on top of any framework. Explore Better Agents
Balances between precision and recall for context retrieval, increasing it means a better signal-to-noise ratio. Uses traditional string distance metrics.
POST
/
ragas
/
context_f1
/
evaluate
Copy
import langwatch
df = langwatch.datasets.get_dataset("dataset-id").to_pandas()
experiment = langwatch.experiment.init("my-experiment")
for index, row in experiment.loop(df.iterrows()):
# your execution code here
experiment.evaluate(
"ragas/context_f1",
index=index,
data={
"contexts": row["contexts"],
"expected_contexts": row["expected_contexts"],
},
settings={}
)