Upload Existing Data

Adding pre-existing data to a container.

Maniac lets you upload existing datasets directly into a container. These might be inference logs from a different inference provider, or a labeled dataset. Once uploaded, the data appears alongside inference logs and can be used for optimization and evaluation.

Boilerplate

from maniac import Maniac
maniac = Maniac()

# First, create a container if you don't already have one
container = maniac.containers.create(
    label="my-container",
    initial_model="some-base-model",
    initial_system_prompt="Your system prompt here",
)

# Second, create the messages object that will populate your container
dataset = [
    {
        "input": {
            "messages": [
                {"role": "system", "content": "..."},
                {"role": "user", "content": "..."},
            ]
        },
        "output": {
            "choices": [
                {"message": {"role": "assistant", "content": "..."}}
            ]
        },
    }
]

# Register completions. These will now show up on the Maniac dashboard in your container.
maniac.chat.completions.register(
    model="maniac:my-container",
    dataset=dataset,
)

Each dataset entry consists of:

  • input : the messages sent to the model (system prompt & user prompt)

  • output : the assistant response
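
Before uploading, it can help to check that every entry has this shape. The following is a minimal sketch, not part of the Maniac SDK: `validate_entry` is a hypothetical helper, and the rules it enforces (non-empty `messages`, an `assistant` role on each choice) are assumptions drawn from the boilerplate example above rather than a documented schema.

```python
def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one dataset entry (empty if none)."""
    problems = []

    # "input" must carry a non-empty list of {"role", "content"} messages.
    messages = entry.get("input", {}).get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("input.messages must be a non-empty list")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
                problems.append(f"input.messages[{i}] needs 'role' and 'content'")

    # "output" must carry a non-empty list of choices with an assistant message.
    choices = entry.get("output", {}).get("choices")
    if not isinstance(choices, list) or not choices:
        problems.append("output.choices must be a non-empty list")
    else:
        for i, choice in enumerate(choices):
            message = choice.get("message") if isinstance(choice, dict) else None
            if not isinstance(message, dict) or message.get("role") != "assistant":
                problems.append(f"output.choices[{i}].message must have role 'assistant'")

    return problems
```

Running this over the whole dataset and registering only entries that return an empty list keeps malformed rows out of the container.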

Example: Uploading a HuggingFace Dataset

Let's walk through an example using the LEDGAR dataset (Tuggener et al. 2020), which is well suited to training and testing legal clause classification models.

Prerequisites

Load the HuggingFace Dataset
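
One way to load LEDGAR is through the Hugging Face `datasets` library. The sketch below assumes the copy published as the `ledgar` configuration of `coastalcph/lex_glue`, whose rows have a `text` string and an integer `label`; the original walkthrough may use a different copy of the dataset. `load_ledgar` and `to_labeled_clauses` are hypothetical helper names.

```python
def load_ledgar(split: str = "train"):
    """Load one split of LEDGAR from the Hugging Face Hub (needs `pip install datasets`)."""
    # Imported lazily so the pure-Python helper below works without the library.
    from datasets import load_dataset

    # Assumption: LEDGAR as packaged in the LexGLUE benchmark.
    return load_dataset("coastalcph/lex_glue", "ledgar", split=split)


def to_labeled_clauses(rows, label_names):
    """Turn rows of {"text": str, "label": int} into (clause, label_name) pairs."""
    return [(row["text"], label_names[row["label"]]) for row in rows]


# Usage (requires network access):
# ledgar = load_ledgar("train")
# names = ledgar.features["label"].names   # class id -> class name
# pairs = to_labeled_clauses(ledgar.select(range(1000)), names)
```

Converting the integer labels to their names up front means the uploaded assistant responses are readable category names rather than class ids.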

Define the System Prompt
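
The system prompt is stored with every uploaded entry, so it is convenient to define it once and reuse it when building entries. This is a sketch under assumptions: the prompt wording is a placeholder, and `build_entry` is a hypothetical helper that wraps a clause and its label in the input/output format described earlier.

```python
# Hypothetical prompt wording; adapt it to your own task.
SYSTEM_PROMPT = (
    "You are a legal clause classifier. "
    "Read the contract clause and reply with only the name of its category."
)


def build_entry(clause: str, label: str, system_prompt: str = SYSTEM_PROMPT) -> dict:
    """Wrap one (clause, label) pair in the input/output entry format."""
    return {
        "input": {
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": clause},
            ]
        },
        "output": {
            "choices": [
                {"message": {"role": "assistant", "content": label}}
            ]
        },
    }
```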

Create a container

You can also skip this step and upload a dataset to an existing container, where it will be combined with any existing inference logs.

Upload Data in Batches

For large datasets, we recommend uploading in batches to avoid timeouts.
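
A minimal batching sketch follows. The helper names and the default batch size of 100 are assumptions rather than documented limits, and the upload call is passed in as a function so the chunking logic can be tested without a network connection.

```python
from typing import Callable, Iterator


def batched(items: list, batch_size: int) -> Iterator[list]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def upload_in_batches(
    dataset: list,
    register: Callable[[list], object],
    batch_size: int = 100,
) -> int:
    """Call `register` once per batch and return the number of batches sent."""
    count = 0
    for batch in batched(dataset, batch_size):
        register(batch)
        count += 1
    return count


# Usage with the client from the boilerplate above:
# upload_in_batches(
#     dataset,
#     lambda batch: maniac.chat.completions.register(
#         model="maniac:my-container", dataset=batch
#     ),
#     batch_size=100,
# )
```

If a batch fails partway through, the returned count together with the batch size tells you where to resume.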

Note: When you generate completions inside a container, the container's system prompt is applied automatically. Registering (logging) existing completions works differently: registered completions do not inherit the container-level system prompt, so each entry's messages list must include the system prompt explicitly.
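
If your existing logs were captured without a system message, you can add one before registering them. The sketch below uses a hypothetical helper, `with_system_prompt`, and assumes the system message belongs first in `messages`; it leaves entries that already start with a system message unchanged.

```python
import copy


def with_system_prompt(entry: dict, system_prompt: str) -> dict:
    """Return a copy of `entry` whose messages start with a system message."""
    result = copy.deepcopy(entry)  # never mutate the caller's data
    messages = result["input"]["messages"]
    if not messages or messages[0].get("role") != "system":
        messages.insert(0, {"role": "system", "content": system_prompt})
    return result
```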
