REST API

The Maniac API provides OpenAI-compatible inference endpoints for interacting with both frontier models and your custom models. This reference details the available endpoints.

List evaluation runs

get

List evaluation runs for the authenticated project. Optionally filter by container and status.

Authorizations

Authorization · string · Required

API key sent in the Authorization header using the Bearer scheme, e.g. Authorization: Bearer YOUR_SECRET_TOKEN.

Query parameters

container · any of · Optional

Container ID or label to filter by.

string · Optional

null · Optional

status · any of · Optional

Filter by run status (e.g. 'running', 'completed', 'error').

string · Optional

null · Optional

limit · integer · min: 1 · max: 100 · Optional · Default: 20

offset · integer · Optional · Default: 0
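The query parameters above can be assembled into a request URL as in the following minimal sketch. The helper name `list_runs_url` is hypothetical, and the `https://` scheme is an assumption based on the host shown in the example request; the non-negative check on `offset` is also an assumption, since the reference states no minimum.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# Assumption: HTTPS on the host shown in the example request.
BASE_URL = "https://platform.maniac.ai"


def list_runs_url(
    container: Optional[str] = None,
    status: Optional[str] = None,
    limit: int = 20,
    offset: int = 0,
) -> str:
    """Build the URL for GET /v1/evaluation/runs (hypothetical helper)."""
    # The reference documents limit as an integer between 1 and 100.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    # Assumption: a negative offset is never meaningful.
    if offset < 0:
        raise ValueError("offset must be non-negative")

    params: Dict[str, object] = {"limit": limit, "offset": offset}
    # Unset filters are left out of the query string entirely.
    if container is not None:
        params["container"] = container
    if status is not None:
        params["status"] = status
    return f"{BASE_URL}/v1/evaluation/runs?{urlencode(params)}"
```

Leaving `container` and `status` out when they are unset keeps the request equivalent to the unfiltered listing.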

Responses

200

Successful Response

application/json

object · const: list · Optional

Object type identifier.

Default: list

total · integer · Required

Total number of items available for this resource.

400

Bad Request

application/json

401

Unauthorized

application/json

403

Forbidden

application/json

422

Validation Error

application/json

429

Too Many Requests

application/json

500

Internal Server Error

application/json

501

Not Implemented

application/json

503

Upstream Unavailable

application/json

get

/v1/evaluation/runs

GET /v1/evaluation/runs HTTP/1.1
Host: platform.maniac.ai
Authorization: Bearer YOUR_SECRET_TOKEN
Accept: */*

{
  "object": "list",
  "data": [
    {
      "created_at": "text",
      "finished_at": "text",
      "error_at": "text",
      "status": "text",
      "error": null,
      "object": "evaluation.run",
      "id": "text",
      "process_id": "text",
      "evaluators": [
        "text"
      ],
      "container": "text",
      "dataset_id": "text",
      "sample": {
        "range": "0:100",
        "type": "text",
        "dataset": "file_abc123"
      },
      "ground_truth": {
        "range": "0:100",
        "type": "text",
        "dataset": "file_abc123"
      },
      "baseline": {
        "range": "0:100",
        "type": "text",
        "dataset": "file_abc123"
      },
      "results": {
        "overall": {
          "avg_score": 1,
          "avg_accuracy": 1,
          "ANY_ADDITIONAL_PROPERTY": "anything"
        },
        "metadata": {
          "ANY_ADDITIONAL_PROPERTY": "anything"
        },
        "per_model": [
          {
            "model": {
              "ANY_ADDITIONAL_PROPERTY": "anything"
            },
            "per_eval": {
              "ANY_ADDITIONAL_PROPERTY": {
                "accuracy": 1,
                "avg_score": 1,
                "num_total": 1,
                "num_errors": 1,
                "num_failed": 1,
                "num_passed": 1,
                "num_scored": 1,
                "num_missing": 1,
                "avg_accuracy": 1,
                "break_reason": "text",
                "expected_count": 1,
                "ANY_ADDITIONAL_PROPERTY": "anything"
              }
            },
            "avg_score": 1,
            "avg_accuracy": 1,
            "launch_index": 1,
            "launch_call_id": "text",
            "ANY_ADDITIONAL_PROPERTY": "anything"
          }
        ],
        "launch_count": 1,
        "ANY_ADDITIONAL_PROPERTY": "anything"
      },
      "metrics": {
        "ANY_ADDITIONAL_PROPERTY": "anything"
      },
      "config": {
        "ANY_ADDITIONAL_PROPERTY": "anything"
      },
      "spend": 1,
      "metadata": {
        "ANY_ADDITIONAL_PROPERTY": "anything"
      }
    }
  ],
  "total": 0
}
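Because the list response reports `total` alongside one page of `data`, a client can walk `offset` forward until every run has been seen. The sketch below assumes a caller-supplied `fetch_page(limit, offset)` callable (hypothetical) that performs the GET request and returns the decoded JSON body.

```python
from typing import Any, Callable, Dict, Iterator


def iter_runs(
    fetch_page: Callable[[int, int], Dict[str, Any]],
    limit: int = 20,
) -> Iterator[Dict[str, Any]]:
    """Yield every evaluation run by paging with limit/offset.

    fetch_page is a hypothetical callable that returns the decoded list
    response, i.e. a dict with "data" and "total" keys.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        runs = page.get("data", [])
        yield from runs
        offset += len(runs)
        # Stop once all items were seen, or if a page comes back empty.
        if not runs or offset >= page["total"]:
            break
```

The empty-page check guards against looping forever if `total` changes while paging.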

Create an evaluation run

post

Launch an evaluation run. Validates access to the specified container, evaluators, data sources, and models, then dispatches the run through the backend gateway interface.

Authorizations

Authorization · string · Required

API key sent in the Authorization header using the Bearer scheme, e.g. Authorization: Bearer YOUR_SECRET_TOKEN.

Body

Request body for creating an evaluation run.

Each side of the evaluation (sample and ground_truth) is described by a single data-source object whose type discriminator determines how data is obtained:

"dataset" — pull from a dataset.
"container" — pull from the container's task logs.
"generate" — generate completions using one or more models.

Each field is optional on its own, but at least one of the two must be provided. When a side is omitted, it defaults to the top-level container's task logs. At least one resolved side must not be type='generate', so that there is seed input to evaluate against.
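The rules above can be mirrored in a client-side pre-check before submitting a run. This is only a sketch: the function name `resolve_sides` is hypothetical, the `{"type": "container"}` stand-in for the default side is an assumption about the object shape, and the server remains the authority on validation.

```python
from typing import Any, Dict, Optional, Tuple

DataSource = Dict[str, Any]


def resolve_sides(
    sample: Optional[DataSource],
    ground_truth: Optional[DataSource],
) -> Tuple[DataSource, DataSource]:
    """Apply the documented defaulting and validation rules locally."""
    # At least one of the two sides must be provided.
    if sample is None and ground_truth is None:
        raise ValueError("provide at least one of sample or ground_truth")

    # An omitted side defaults to the top-level container's task logs.
    # Assumption: {"type": "container"} is a stand-in for that default.
    resolved_sample = sample if sample is not None else {"type": "container"}
    resolved_truth = ground_truth if ground_truth is not None else {"type": "container"}

    # At least one resolved side must supply seed input, so both sides
    # cannot be generated.
    if resolved_sample.get("type") == "generate" and resolved_truth.get("type") == "generate":
        raise ValueError("at least one side must not be type='generate'")
    return resolved_sample, resolved_truth
```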

container · string · Required

Container id or label.

evaluators · string[] · min: 1 · Required

Evaluator ids or labels.

sample · any of · Optional

Sample-side data source. Omit to default to the container's task logs. Use type='dataset' to pull from a dataset, type='container' to pull from task logs, or type='generate' to generate completions with the specified models.

any · Optional

null · Optional

ground_truth · any of · Optional

Ground-truth-side data source. Omit to default to the container's task logs. Use type='dataset' to pull from a dataset, type='container' to pull from task logs, or type='generate' to generate completions with a model.

any · Optional

null · Optional

baseline · any of · Optional

Baseline-side data source for pairwise evaluation. Use type='dataset' to pull from a dataset, type='container' to pull from task logs, or type='generate' to generate completions with a model.

any · Optional

null · Optional

metadata · any of · Optional

Optional metadata.

null · Optional

environment · any of · Optional

Execution environment name (maps to Modal app suffix).

Default: main

string · Optional

null · Optional
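As a rough illustration of the body fields above, the following sketch builds (but does not send) the POST request with Python's standard library. The helper name `build_create_run_request` is hypothetical, and the `https://` scheme is an assumption based on the host in the example request.

```python
import json
import urllib.request
from typing import Any, Dict, Optional, Sequence

# Assumption: HTTPS on the host shown in the example request.
API_URL = "https://platform.maniac.ai/v1/evaluation/runs"


def build_create_run_request(
    api_key: str,
    container: str,
    evaluators: Sequence[str],
    sample: Optional[Dict[str, Any]] = None,
    ground_truth: Optional[Dict[str, Any]] = None,
    baseline: Optional[Dict[str, Any]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    environment: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST /v1/evaluation/runs request (hypothetical helper)."""
    # The reference requires at least one evaluator id or label.
    if not evaluators:
        raise ValueError("evaluators needs at least one id or label")

    body: Dict[str, Any] = {"container": container, "evaluators": list(evaluators)}
    optional_fields = {
        "sample": sample,
        "ground_truth": ground_truth,
        "baseline": baseline,
        "metadata": metadata,
        "environment": environment,
    }
    # Unset optional fields are omitted so the documented defaults apply.
    body.update({k: v for k, v in optional_fields.items() if v is not None})

    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; a successful creation is documented as a 201 response.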

Responses

201

Successful Response

application/json

400

Bad Request

application/json

401

Unauthorized

application/json

403

Forbidden

application/json

404

Not Found

application/json

409

Conflict

application/json

422

Validation Error

application/json

429

Too Many Requests

application/json

500

Internal Server Error

application/json

501

Not Implemented

application/json

503

Upstream Unavailable

application/json

post

/v1/evaluation/runs

POST /v1/evaluation/runs HTTP/1.1
Host: platform.maniac.ai
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 316

{
  "container": "text",
  "evaluators": [
    "text"
  ],
  "sample": {
    "range": "0:100",
    "type": "text",
    "dataset": "file_abc123"
  },
  "ground_truth": {
    "range": "0:100",
    "type": "text",
    "dataset": "file_abc123"
  },
  "baseline": {
    "range": "0:100",
    "type": "text",
    "dataset": "file_abc123"
  },
  "metadata": {
    "ANY_ADDITIONAL_PROPERTY": "anything"
  },
  "environment": "main"
}

{
  "created_at": "text",
  "finished_at": "text",
  "error_at": "text",
  "status": "text",
  "error": null,
  "object": "evaluation.run",
  "id": "text",
  "process_id": "text",
  "evaluators": [
    "text"
  ],
  "container": "text",
  "dataset_id": "text",
  "sample": {
    "range": "0:100",
    "type": "text",
    "dataset": "file_abc123"
  },
  "ground_truth": {
    "range": "0:100",
    "type": "text",
    "dataset": "file_abc123"
  },
  "baseline": {
    "range": "0:100",
    "type": "text",
    "dataset": "file_abc123"
  },
  "results": {
    "overall": {
      "avg_score": 1,
      "avg_accuracy": 1,
      "ANY_ADDITIONAL_PROPERTY": "anything"
    },
    "metadata": {
      "ANY_ADDITIONAL_PROPERTY": "anything"
    },
    "per_model": [
      {
        "model": {
          "ANY_ADDITIONAL_PROPERTY": "anything"
        },
        "per_eval": {
          "ANY_ADDITIONAL_PROPERTY": {
            "accuracy": 1,
            "avg_score": 1,
            "num_total": 1,
            "num_errors": 1,
            "num_failed": 1,
            "num_passed": 1,
            "num_scored": 1,
            "num_missing": 1,
            "avg_accuracy": 1,
            "break_reason": "text",
            "expected_count": 1,
            "ANY_ADDITIONAL_PROPERTY": "anything"
          }
        },
        "avg_score": 1,
        "avg_accuracy": 1,
        "launch_index": 1,
        "launch_call_id": "text",
        "ANY_ADDITIONAL_PROPERTY": "anything"
      }
    ],
    "launch_count": 1,
    "ANY_ADDITIONAL_PROPERTY": "anything"
  },
  "metrics": {
    "ANY_ADDITIONAL_PROPERTY": "anything"
  },
  "config": {
    "ANY_ADDITIONAL_PROPERTY": "anything"
  },
  "spend": 1,
  "metadata": {
    "ANY_ADDITIONAL_PROPERTY": "anything"
  }
}

Get an evaluation run

get

Retrieve a single evaluation run by ID within the authenticated project.

Authorizations

Authorization · string · Required

API key sent in the Authorization header using the Bearer scheme, e.g. Authorization: Bearer YOUR_SECRET_TOKEN.

Path parameters

run_id · string · Required
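A common use of this endpoint is polling a run until it finishes. The sketch below assumes a caller-supplied `fetch_run(run_id)` callable (hypothetical) that performs the GET request and returns the decoded JSON body. It also assumes that 'completed' and 'error' are terminal statuses; the reference only lists them as example values, so the set may be incomplete.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these example statuses from the reference are terminal.
TERMINAL_STATUSES = {"completed", "error"}


def wait_for_run(
    fetch_run: Callable[[str], Dict[str, Any]],
    run_id: str,
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a run until its status is terminal or the timeout elapses.

    fetch_run is a hypothetical callable wrapping
    GET /v1/evaluation/runs/{run_id}.
    """
    deadline = time.monotonic() + timeout
    while True:
        run = fetch_run(run_id)
        if run["status"] in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} still {run['status']!r} after {timeout}s")
        # Wait between polls to avoid hammering the API (429 is documented).
        sleep(interval)
```

Injecting `sleep` keeps the loop easy to exercise in tests without real delays.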

Responses

200

Successful Response

application/json

Response model for an evaluation run.

created_at · string · Required

finished_at · any of · Optional

string · Optional

null · Optional

error_at · any of · Optional

string · Optional

null · Optional

status · string · Required

error · any of · Optional

any · Optional

null · Optional

object · const: evaluation.run · Optional

Object type.

Default: evaluation.run

id · string · Required

Evaluation run id (run group id).

process_id · any of · Optional

Process id for lifecycle tracking.

string · Optional

null · Optional

evaluators · any of · Optional

Evaluator ids used in this run.

string[] · Optional

null · Optional

container · any of · Optional

Container id.

string · Optional

null · Optional

dataset_id · any of · Optional

Dataset id (if a dataset was used).

string · Optional

null · Optional

sample · any of · Optional

Resolved sample-side data source.

any · Optional

null · Optional

ground_truth · any of · Optional

Resolved ground-truth-side data source.

any · Optional

null · Optional

baseline · any of · Optional

Resolved baseline-side data source for pairwise evaluation.

any · Optional

null · Optional

results · any of · Optional

Evaluation results (populated on completion).

null · Optional

metrics · any of · Optional

Evaluation metrics (populated on completion).

null · Optional

config · any of · Optional

Run configuration as submitted.

null · Optional

spend · any of · Optional

Estimated spend.

number · Optional

null · Optional

metadata · any of · Optional

Optional metadata.

null · Optional

400

Bad Request

application/json

401

Unauthorized

application/json

403

Forbidden

application/json

404

Not Found

application/json

409

Conflict

application/json

422

Validation Error

application/json

429

Too Many Requests

application/json

500

Internal Server Error

application/json

501

Not Implemented

application/json

503

Upstream Unavailable

application/json

get

/v1/evaluation/runs/{run_id}

GET /v1/evaluation/runs/{run_id} HTTP/1.1
Host: platform.maniac.ai
Authorization: Bearer YOUR_SECRET_TOKEN
Accept: */*

{
  "created_at": "text",
  "finished_at": "text",
  "error_at": "text",
  "status": "text",
  "error": null,
  "object": "evaluation.run",
  "id": "text",
  "process_id": "text",
  "evaluators": [
    "text"
  ],
  "container": "text",
  "dataset_id": "text",
  "sample": {
    "range": "0:100",
    "type": "text",
    "dataset": "file_abc123"
  },
  "ground_truth": {
    "range": "0:100",
    "type": "text",
    "dataset": "file_abc123"
  },
  "baseline": {
    "range": "0:100",
    "type": "text",
    "dataset": "file_abc123"
  },
  "results": {
    "overall": {
      "avg_score": 1,
      "avg_accuracy": 1,
      "ANY_ADDITIONAL_PROPERTY": "anything"
    },
    "metadata": {
      "ANY_ADDITIONAL_PROPERTY": "anything"
    },
    "per_model": [
      {
        "model": {
          "ANY_ADDITIONAL_PROPERTY": "anything"
        },
        "per_eval": {
          "ANY_ADDITIONAL_PROPERTY": {
            "accuracy": 1,
            "avg_score": 1,
            "num_total": 1,
            "num_errors": 1,
            "num_failed": 1,
            "num_passed": 1,
            "num_scored": 1,
            "num_missing": 1,
            "avg_accuracy": 1,
            "break_reason": "text",
            "expected_count": 1,
            "ANY_ADDITIONAL_PROPERTY": "anything"
          }
        },
        "avg_score": 1,
        "avg_accuracy": 1,
        "launch_index": 1,
        "launch_call_id": "text",
        "ANY_ADDITIONAL_PROPERTY": "anything"
      }
    ],
    "launch_count": 1,
    "ANY_ADDITIONAL_PROPERTY": "anything"
  },
  "metrics": {
    "ANY_ADDITIONAL_PROPERTY": "anything"
  },
  "config": {
    "ANY_ADDITIONAL_PROPERTY": "anything"
  },
  "spend": 1,
  "metadata": {
    "ANY_ADDITIONAL_PROPERTY": "anything"
  }
}
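Since `results` is null until the run completes, code that reads scores needs to tolerate missing fields. The following sketch pulls the per-model averages out of a run object shaped like the example response above; the function name `per_model_scores` is hypothetical, and the exact meaning of each score field is not specified in this reference.

```python
from typing import Any, Dict, List, Optional, Tuple

ScoreRow = Tuple[Optional[int], Optional[float], Optional[float]]


def per_model_scores(run: Dict[str, Any]) -> List[ScoreRow]:
    """Return (launch_index, avg_score, avg_accuracy) for each model entry.

    Returns an empty list when results are not populated yet.
    """
    # results may be absent or null before the run completes.
    results = run.get("results") or {}
    rows: List[ScoreRow] = []
    for entry in results.get("per_model") or []:
        rows.append(
            (
                entry.get("launch_index"),
                entry.get("avg_score"),
                entry.get("avg_accuracy"),
            )
        )
    return rows
```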

Healthz

get

Health check endpoint for load balancers and uptime monitors.

Responses

200

Successful Response

application/json

Health check response.

ok · boolean · Required

get

/healthz

GET /healthz HTTP/1.1
Host: platform.maniac.ai
Accept: */*

200

Successful Response

{
  "ok": true
}
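An uptime monitor might interpret the health response along these lines. This is a sketch with a hypothetical function name, `is_healthy`; it takes the status code and raw body from whatever HTTP client is in use.

```python
import json


def is_healthy(status_code: int, body: bytes) -> bool:
    """Return True only for a 200 response whose JSON body has ok == true."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("ok") is True
```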

