Now in Beta
Track long-running processes in real time, get notified on failures, and manage everything from a unified dashboard, all without code changes.
Installation
$ curl -fsSL https://github.com/kazuki-kanaya/obsern/releases/latest/download/install.sh | sh
# Run your training script normally
python train_model.py --epochs 100
# One-time setup (generate obsern.yaml)
➜ obsern init
# Wrap it with obsern
➜ obsern run python train_model.py --epochs 100
HOW IT WORKS
Just wrap your existing command and start monitoring in minutes.
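To make the wrapping idea concrete, here is a minimal Python sketch of what a command wrapper does conceptually: it launches the original command unchanged, waits for it to finish, and records the exit status and the last lines of error output. This is not Obsern's implementation; `RunResult` and `run_wrapped` are hypothetical names introduced only for illustration.

```python
import subprocess
import time
from dataclasses import dataclass


@dataclass
class RunResult:
    """Summary of one wrapped run (hypothetical structure, not Obsern's)."""
    command: list[str]
    exit_code: int
    duration_s: float
    stderr_tail: str


def run_wrapped(command: list[str], tail_lines: int = 20) -> RunResult:
    """Run the command as-is and record how it ended.

    stdout is inherited from the parent, so the script prints as usual.
    stderr is captured so its last lines can be attached to a failure report.
    """
    started = time.monotonic()
    proc = subprocess.run(command, stderr=subprocess.PIPE, text=True)
    duration = time.monotonic() - started
    # Keep only the last few lines of error output for a compact report.
    tail = "\n".join(proc.stderr.splitlines()[-tail_lines:])
    return RunResult(command, proc.returncode, duration, tail)
```

Under these assumptions, calling `run_wrapped(["python", "train_model.py", "--epochs", "100"])` would run the training script without modifying it and return its exit code together with the tail of its error output, which is the information a monitoring service needs to report a silent stop.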
The pain
Checking on processes by repeatedly logging in over SSH means completions go unnoticed and failures are detected late. Obsern cuts that idle waiting and rework, and lowers the operational burden.
Silent Stops
It stopped midway, but you did not notice until much later. With alerts, you could have acted sooner.
Slow Debug Cycles
A background run fails, but the cause is unclear. With Obsern, you can immediately check exit status and error output.
Multi-Host Blind Spots
It is hard to know what is running on which server. Obsern gives you one cross-host view of runtime status and logs.
Zero Code Changes
Wrap existing commands with obsern. No SDK integration or instrumentation boilerplate required.
Instant Notifications
With webhook integration, failure, completion, and runtime log notifications can be delivered to Slack or Discord right away.
Unified Dashboard
Check status, logs, and runtime hosts in one place, which makes root-cause investigation smoother.
Optimized for Team Use
Share workspaces, invite members, and control access with owner/editor/viewer roles so operations stay clear and secure.
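The webhook notifications mentioned under Instant Notifications can be pictured with a short Python sketch. It relies only on the public webhook formats: a Slack incoming webhook accepts a JSON body with a `text` field, and a Discord webhook accepts one with a `content` field. The function names, the message layout, and the `User-Agent` value are assumptions made for this example and do not describe how Obsern formats or sends its alerts.

```python
import json
import urllib.request


def build_payload(service: str, message: str) -> dict:
    """Return the JSON body each service expects for a plain-text message."""
    if service == "slack":
        return {"text": message}
    if service == "discord":
        return {"content": message}
    raise ValueError(f"unsupported service: {service}")


def format_message(command: str, exit_code: int, host: str) -> str:
    """Build a one-line status message (hypothetical layout)."""
    status = "completed" if exit_code == 0 else f"failed (exit code {exit_code})"
    return f"[{host}] `{command}` {status}"


def send(webhook_url: str, service: str, message: str, timeout: float = 10.0) -> int:
    """POST the message to the webhook URL and return the HTTP status code."""
    body = json.dumps(build_payload(service, message)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "User-Agent": "webhook-sketch/0.1",  # illustrative value
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

With a real webhook URL from your own Slack or Discord workspace, `send(url, "slack", format_message("python train_model.py", 1, "gpu-01"))` would post a failure notice to the channel.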
Obsern focuses on execution health rather than experiment metrics.
| Feature | Obsern | MLflow | TensorBoard |
|---|---|---|---|
| Primary Goal | Execution monitoring | Experiment management | Training metrics visualization |
| Adoption Method | CLI wrapper (no code changes) | SDK integration | SDK integration |
| Monitoring Target | Running processes | Experiment logs | Training logs |
| Abnormal Exit Alerts | Instant alerts via webhook (Slack, Discord) | Not built-in | None |
| Monitoring Scope | Across multiple servers | Per project | Local-first |