Ship reliable, testable agents – not guesses. Better Agents adds simulations, evaluations, and standards on top of any framework. Explore Better Agents
This metric gauges the relevancy of the retrieved context, calculated based on both the question and contexts. The values fall within the range of (0, 1), with higher values indicating better relevancy.
POST
/
legacy
/
ragas_context_relevancy
/
evaluate
Copy
import langwatch
df = langwatch.datasets.get_dataset("dataset-id").to_pandas()
experiment = langwatch.experiment.init("my-experiment")
for index, row in experiment.loop(df.iterrows()):
# your execution code here
experiment.evaluate(
"legacy/ragas_context_relevancy",
index=index,
data={
"output": output,
"contexts": row["contexts"],
},
settings={}
)