This evaluator assesses the extent to which the generated answer is consistent with the provided context. Higher scores indicate better faithfulness to the context, which makes the evaluator useful for detecting hallucinations.
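As a rough illustration of what a faithfulness score expresses, Ragas describes the metric as the fraction of claims in the answer that the context supports. The sketch below assumes the per-claim verdicts are already available (in practice an LLM extracts and judges the claims). `faithfulness_score` is a hypothetical helper for illustration only, not part of LangWatch or Ragas.

```python
from typing import Sequence


def faithfulness_score(claim_supported: Sequence[bool]) -> float:
    """Return the fraction of answer claims supported by the context (0.0 to 1.0).

    Hypothetical helper: each entry says whether one claim from the answer
    can be inferred from the provided contexts.
    """
    if not claim_supported:
        raise ValueError("at least one claim is required")
    # True counts as 1 and False as 0, so sum() counts the supported claims.
    return sum(claim_supported) / len(claim_supported)


# Made-up verdicts for an answer with three claims, two of them supported.
score = faithfulness_score([True, True, False])
print(f"faithfulness: {score:.2f}")
```

Under this reading, an answer whose claims are all grounded in the context scores 1.0, and each unsupported claim pulls the score toward 0.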
POST /ragas/faithfulness/evaluate
import langwatch

df = langwatch.datasets.get_dataset("dataset-id").to_pandas()

experiment = langwatch.experiment.init("my-experiment")

for index, row in experiment.loop(df.iterrows()):
    # your execution code here
    experiment.evaluate(
        "ragas/faithfulness",
        index=index,
        data={
            "output": output,
            "contexts": row["contexts"],
            "input": row["input"],
        },
        settings={}
    )