# Registering in Batches

Maniac lets you upload existing completions directly into a container. These might be inference logs from another provider or a labeled dataset.

## Boilerplate

```python
from maniac import Maniac

maniac = Maniac()

# First, create a container if you don't already have one
container = maniac.containers.create(
    label="my-container",
    initial_model="some-base-model", # If you want to track the model that originated these completions
    initial_system_prompt="Your system prompt here",
)

# Second, create the messages object that will populate your container
dataset = [
    {
        "input": {
            "messages": [
                {"role": "system", "content": "..."},
                {"role": "user", "content": "..."},
            ]
        },
        "output": {
            "choices": [
                {"message": {"role": "assistant", "content": "..."}}
            ]
        },
    }
]

# Register completions. These will now show up on the Maniac dashboard in your container.
maniac.chat.completions.register(
    model="my-container",
    items=dataset,
)
```

Each dataset entry consists of:

* `input`: the messages sent to the model (system prompt and user prompt)
* `output` (optional): the assistant response
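
To make the entry shape concrete, here is a minimal sketch of a helper that builds one entry and attaches `output` only when a response is available. The name `make_entry` and its parameters are hypothetical and are not part of the Maniac SDK; the dictionary layout simply mirrors the boilerplate above.

```python
from typing import Optional


def make_entry(system_prompt: str, user_prompt: str,
               assistant_response: Optional[str] = None) -> dict:
    """Build one dataset entry in the input/output shape shown above.

    Hypothetical helper for illustration; not part of the Maniac SDK.
    """
    entry = {
        "input": {
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ]
        }
    }
    # `output` is optional, so only attach it when a response is available.
    if assistant_response is not None:
        entry["output"] = {
            "choices": [
                {"message": {"role": "assistant", "content": assistant_response}}
            ]
        }
    return entry


labeled = make_entry("You are a classifier.", "Some clause text", "Governing Laws")
unlabeled = make_entry("You are a classifier.", "Some clause text")
```

Here `labeled` carries both `input` and `output`, while `unlabeled` carries only `input`.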

## Example: Uploading a HuggingFace Dataset

Let's walk through an example using the [LEDGAR](https://aclanthology.org/2020.lrec-1.155.pdf) dataset (Tuggener et al., 2020), which is well suited to training and testing legal clause classification models.

#### Prerequisites

```bash
export MANIAC_API_KEY=...
pip install maniac datasets
```

#### Load the HuggingFace Dataset

```python
from datasets import load_dataset

DATASET = "coastalchp/ledgar"

# Load dataset
dataset = load_dataset(DATASET)
train_split = dataset["train"]

# Extract inputs and labels
clauses = train_split["text"]
label_ids = train_split["label"]
label_names = train_split.features["label"].names
```

#### Define the System Prompt

```python
system_prompt = (
    "You are a legal clause classifier.\n"
    "Given a clause, return exactly one label from this list:\n"
    + "\n".join(f"- {label}" for label in label_names)
    + "\nRespond with only the label name.")
```

#### Create a container

You can also skip this step and upload a dataset to an existing container, where it will be combined with any existing inference logs.

```python
from maniac import Maniac

maniac = Maniac()

container = maniac.containers.create(
    label="LEDGAR-register",
    default_system_prompt=system_prompt,
)
```

#### Upload Data in Batches

For large datasets, we recommend uploading in batches to avoid timeouts.

```python
BATCH_SIZE = 1500
START = 0
MAX_SAMPLES = 60000

end = min(len(clauses), START + MAX_SAMPLES)

for batch_start in range(START, end, BATCH_SIZE):
    batch_end = min(batch_start + BATCH_SIZE, end)
    dataset = []

    for i in range(batch_start, batch_end):
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": clauses[i]},
        ]

        dataset.append({
            "input": {"messages": messages},
            "output": {
                "choices": [
                    {
                        "message": {
                            "role": "assistant",
                            "content": label_names[label_ids[i]],
                        }
                    }
                ]
            },
        })

    maniac.chat.completions.register(
        container=container,
        items=dataset,
    )

    print(f"Uploaded samples {batch_start}–{batch_end - 1}")
```
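
If your entries are already built as a list, the slicing logic can be factored out. The following is a minimal sketch of such a generator; `iter_batches` is a hypothetical helper rather than a Maniac SDK function, and the toy items stand in for real entries.

```python
from typing import Iterator, List, Sequence


def iter_batches(items: Sequence[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long.

    Hypothetical helper for illustration; not part of the Maniac SDK.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Toy stand-in for real entries: 3,500 items with a batch size of 1,500.
toy_items = [{"id": i} for i in range(3500)]
batch_sizes = [len(batch) for batch in iter_batches(toy_items, 1500)]
print(batch_sizes)  # [1500, 1500, 500]
```

Each yielded batch could then be passed as `items` to the register call shown above. The final batch is simply shorter when the total is not a multiple of the batch size.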

> **Note:** When you generate completions inside a container, the container's system prompt is applied automatically. **Registering (logging) existing completions is different: the system prompt must be included explicitly in each entry's `messages` list.** Registered completions do not inherit the container-level system prompt.
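
Because a missing system prompt is easy to overlook, a quick local check before uploading can help. The sketch below assumes the system message is expected to be the first message of each entry; `entries_missing_system_prompt` is a hypothetical helper and not part of the Maniac SDK.

```python
from typing import List


def entries_missing_system_prompt(items: List[dict]) -> List[int]:
    """Return the indices of entries whose first message is not a system message.

    Hypothetical helper for illustration; assumes the system message comes first.
    """
    missing = []
    for index, item in enumerate(items):
        messages = item.get("input", {}).get("messages", [])
        if not messages or messages[0].get("role") != "system":
            missing.append(index)
    return missing


sample = [
    {"input": {"messages": [{"role": "system", "content": "..."},
                            {"role": "user", "content": "..."}]}},
    {"input": {"messages": [{"role": "user", "content": "..."}]}},
]
print(entries_missing_system_prompt(sample))  # [1]
```

An empty result means every entry starts with a system message. It does not check that the prompt text itself is the one you intended.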

