Creating Evaluations
Code and judge evals for your Maniac models.
Judge Prompt Evals
Code Evals
Item Structure
Dependencies
Examples
Example: Exact Match
Example: Multi-Label IoU (Jaccard Similarity)
Example: JSON Schema Validation
Example: Semantic Similarity Evaluation
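
As a rough illustration of the first two examples named above, the sketch below scores an exact match and a multi-label Jaccard similarity (intersection over union) in plain Python. The function names `exact_match` and `multi_label_iou`, their signatures, and the case-insensitive normalization are assumptions made for illustration; they are not taken from the Maniac API, and the real code eval interface and item structure may differ.

```python
def exact_match(prediction: str, expected: str) -> float:
    """Return 1.0 if the strings match after normalization, else 0.0.

    Hypothetical helper: trimming whitespace and lowercasing is an
    assumed normalization, not a documented Maniac behavior.
    """
    return 1.0 if prediction.strip().lower() == expected.strip().lower() else 0.0


def multi_label_iou(predicted: list[str], expected: list[str]) -> float:
    """Jaccard similarity: |intersection| / |union| of two label sets.

    Hypothetical helper: duplicates are ignored because labels are
    compared as sets.
    """
    pred_set, exp_set = set(predicted), set(expected)
    union = pred_set | exp_set
    if not union:
        # Both label sets are empty. Scoring this as perfect agreement
        # is a convention assumed here, not a documented rule.
        return 1.0
    return len(pred_set & exp_set) / len(union)


if __name__ == "__main__":
    print(exact_match(" Paris ", "paris"))
    print(multi_label_iou(["billing", "urgent"], ["urgent", "refund"]))
```

Both helpers return a float in the range 0 to 1, so scores from different evals can be averaged over a dataset in the same way.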