This evaluator detects potentially unsafe content in text, including hate speech, self-harm, sexual content, and violence. It allows customization of the severity threshold and the specific categories to check.
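To make the configuration concrete, the sketch below shows how a severity threshold and a set of categories might be combined into a pass/fail decision. It is a minimal illustration only: the class name, category names, default values, and `is_safe` helper are assumptions for this example, not the evaluator's actual API or schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical settings object; field names and defaults are assumptions
# for illustration, not the evaluator's actual configuration schema.
@dataclass
class ContentSafetySettings:
    categories: List[str] = field(
        default_factory=lambda: ["hate", "self_harm", "sexual", "violence"]
    )
    severity_threshold: int = 2  # flag content at or above this severity

def is_safe(severities: Dict[str, int], settings: ContentSafetySettings) -> bool:
    """Return True when no checked category reaches the severity threshold.

    `severities` maps category name -> severity reported by the safety
    backend for the analyzed text (higher means more severe).
    """
    return all(
        severities.get(category, 0) < settings.severity_threshold
        for category in settings.categories
    )

if __name__ == "__main__":
    settings = ContentSafetySettings(categories=["hate", "violence"], severity_threshold=4)
    print(is_safe({"hate": 0, "violence": 2}, settings))  # True: all below threshold
    print(is_safe({"hate": 6, "violence": 0}, settings))  # False: hate severity >= 4
```

Restricting `categories` narrows the check to the harms you care about, while raising `severity_threshold` makes the evaluator more permissive.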
The evaluator returns a result with the following fields:

- Status: "processed" (successful evaluation), "skipped", or "error"
- Score: the evaluation score
- Passed: whether the evaluation passed
- Label: the evaluation label
- Details: additional details about the evaluation
- Raw response: the raw response from the evaluator
- Error type: the type of error if the status is "error"
- Traceback: the error traceback if the status is "error"
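As a rough sketch of how these fields might be consumed, the example below branches on the status and builds a one-line summary. The `EvaluationResult` container and its attribute names mirror the descriptions above but are illustrative assumptions, not the library's exact schema.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical result container mirroring the fields described above;
# the attribute names are assumptions, not the library's exact schema.
@dataclass
class EvaluationResult:
    status: str                          # "processed", "skipped", or "error"
    score: Optional[float] = None        # evaluation score
    passed: Optional[bool] = None        # whether the evaluation passed
    label: Optional[str] = None          # evaluation label
    details: Optional[str] = None        # additional details about the evaluation
    raw_response: Optional[dict] = None  # raw response from the evaluator
    error_type: Optional[str] = None     # type of error if status is "error"
    traceback: Optional[str] = None      # error traceback if status is "error"

def summarize(result: EvaluationResult) -> str:
    """Turn an evaluation result into a short, human-readable summary."""
    if result.status == "error":
        return f"error ({result.error_type}): {result.details}"
    if result.status == "skipped":
        return f"skipped: {result.details}"
    verdict = "passed" if result.passed else "failed"
    return f"{verdict} with score {result.score} (label: {result.label})"

if __name__ == "__main__":
    ok = EvaluationResult(status="processed", score=0.0, passed=True, label="safe")
    print(summarize(ok))  # passed with score 0.0 (label: safe)
```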