blackboxml

ML experiment tracking. Local, lightweight, framework-agnostic.

blackboxml logs your training runs as structured JSON. No accounts, no servers. It works with PyTorch, Keras, scikit-learn, or any Python training loop.

Features

  • Three tracking modes. @track decorator, Run context manager, or Keras BlackBoxCallback.
  • Automatic environment capture. Git commit, dirty state, Python version, framework versions, hostname.
  • Batch-to-epoch aggregation. MetricStore computes weighted averages across variable batch sizes.
  • Local JSON storage. Runs saved to blackboxml_logs/ as human-readable JSON, no database needed.
  • Built-in CLI. List, inspect, and clean up runs from the terminal with bbml.
  • Visualisation. Plot training and validation curves from any saved run.
  • Zero config. Install and start logging in two lines of code.
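The batch-to-epoch aggregation bullet can be illustrated with plain arithmetic. The following is a minimal sketch of a sample-weighted mean, not MetricStore's actual code; the function name `weighted_epoch_mean` and the example numbers are assumptions made for illustration.

```python
def weighted_epoch_mean(batch_values, batch_sizes):
    """Average per-batch means, weighting each one by its sample count."""
    total = sum(batch_sizes)
    if total == 0:
        raise ValueError("no samples seen")
    return sum(v * n for v, n in zip(batch_values, batch_sizes)) / total


# Three batches; the last one is smaller, as often happens at the end of an epoch.
losses = [0.5, 0.25, 1.0]
sizes = [32, 32, 16]

weighted = weighted_epoch_mean(losses, sizes)  # (16 + 8 + 16) / 80
unweighted = sum(losses) / len(losses)         # treats every batch as equal
```

Here the weighted mean is 0.5, while the naive mean of the three batch values is about 0.583, because the naive mean over-counts the small final batch.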

Install

pip install blackboxml

For Keras/TensorFlow callback support:

pip install "blackboxml[keras]"

Requires Python 3.10+.

Quick start

Decorator (generator functions)

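from blackboxml import track, MetricStore

# dataloader and train_step are placeholders for your own data pipeline and training step.
@track(name="resnet_cifar10", tags=["pytorch", "cifar10"])
def train():
    metrics = MetricStore()
    for epoch in range(10):
        metrics.reset()  # clear the previous epoch's accumulated values
        for batch in dataloader:
            loss, acc = train_step(batch)
            # n is the number of samples in the batch. If batch is an
            # (inputs, targets) tuple, len(batch) is 2, so pass len(batch[0]) instead.
            metrics.update({"loss": loss, "acc": acc}, n=len(batch))
        yield metrics.compute()  # yield the aggregated metrics once per epoch

train()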
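
One plausible reason for the generator form is that each yield marks an epoch boundary, so a decorator can record metrics while training is still running. The sketch below shows that general pattern in plain Python. `track_sketch` and the dictionary it returns are hypothetical and are not blackboxml's implementation.

```python
import functools


def track_sketch(name):
    """Hypothetical stand-in for a tracking decorator; not blackboxml's code."""
    def decorator(gen_fn):
        @functools.wraps(gen_fn)
        def wrapper(*args, **kwargs):
            history = []
            # Each value yielded by the generator is treated as one epoch's metrics.
            for epoch, metrics in enumerate(gen_fn(*args, **kwargs)):
                history.append({"epoch": epoch, **metrics})
            return {"name": name, "history": history}
        return wrapper
    return decorator


@track_sketch(name="demo")
def train():
    for epoch in range(3):
        yield {"loss": 1.0 / (epoch + 1)}


result = train()  # calling the wrapper drives the generator to completion
```

Because the wrapper iterates the generator itself, the caller invokes `train()` like an ordinary function and never sees the individual yields.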

Context manager

from blackboxml import Run

# train_one_epoch is a placeholder for your own function that returns the epoch loss.
with Run(name="resnet_cifar10", tags=["pytorch"]) as run:
    for epoch in range(10):
        run.log({"loss": train_one_epoch(), "epoch": epoch})

Keras callback

from blackboxml.callback import BlackBoxCallback

# model, x_train, and y_train are your compiled Keras model and training data.
model.fit(x_train, y_train, epochs=10,
          callbacks=[BlackBoxCallback(name="lstm_nlp", tags=["keras"])])

scikit-learn

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from blackboxml import Run

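# X and y are your feature matrix and labels.
with Run(name="rf_search", tags=["sklearn"]) as run:
    for n in [50, 100, 200]:
        # Mean accuracy over 5-fold cross-validation for each forest size.
        scores = cross_val_score(RandomForestClassifier(n_estimators=n), X, y, cv=5)
        run.log({"n_estimators": n, "accuracy": scores.mean()})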

Each run is saved to blackboxml_logs/<name>_<timestamp>/run.json. Use bbml runs to list them, bbml show <name> to inspect one, or visualise_run() to plot the curves.
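
Because runs are plain JSON files, they can also be read with the standard library alone. The sketch below assumes only the directory layout described above. The helper names `find_run_files` and `load_run` are made up for this example, and since the run.json schema is not documented here, the script reports only top-level keys.

```python
import json
from pathlib import Path


def find_run_files(log_dir="blackboxml_logs"):
    """Return every <log_dir>/<run_dir>/run.json path, sorted by path."""
    root = Path(log_dir)
    if not root.is_dir():
        return []
    return sorted(root.glob("*/run.json"))


def load_run(path):
    """Parse one run file into a Python object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    for run_file in find_run_files():
        run = load_run(run_file)
        # The schema is an unknown here, so only list top-level keys.
        keys = sorted(run) if isinstance(run, dict) else []
        print(run_file.parent.name, keys)
```

For everyday use, the bbml CLI is the more direct route. A script like this is mainly useful for feeding runs into your own analysis code.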

Next steps