
The Brain Runs on a Schedule Now

Two weeks ago I shipped the first milestone of The Brain — the bare runner. You wrote a Python file with a sequence of steps; you ran brain run path/to/workflow.py; the run landed in Postgres; you inspected it from the CLI. That was M1. It works, and you can run it on demand whenever you want.

Today M2 is done. The Brain now runs on a schedule, on its own, without you in the loop.

Why this matters

The most useful workflow automation only kicks in when you stop having to babysit it — daily digests, scheduled exports, nightly summaries, anything that compounds. M1 proved the runner works. M2 is the milestone where leaving it alone is a reasonable thing to do.

That's the whole point of M2 in one sentence. The rest of this post is what that looks like in practice.

What M2 ships

Cron schedules. Register a workflow on a standard 5-field cron expression. The Brain writes the schedule to Postgres next to your run history.

docker compose exec brain brain register examples/daily_digest.py --cron "0 9 * * 1-5"

The register command validates both the cron expression and the workflow file before anything lands in the database. Duplicate schedule names are rejected rather than silently overwritten. You can list everything that's registered, see when each schedule last ran and when it will fire next, pause and resume schedules, and unregister them when you're done. All of this happens from the CLI you already used in M1, against the same database.
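To give a feel for what validating a standard 5-field expression involves, here is a minimal sketch. It is not The Brain's validator: `validate_cron`, `FIELD_RANGES`, and `_check_part` are hypothetical names, and the sketch only understands `*`, plain numbers, ranges, comma lists, and `/step`, not month or weekday names.

```python
# Hypothetical sketch of 5-field cron validation; not The Brain's actual code.
FIELD_RANGES = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 7),  # both 0 and 7 commonly mean Sunday
]


def _check_part(part: str, name: str, lo: int, hi: int) -> None:
    """Validate one comma-separated part such as '*', '5', '1-5' or '*/15'."""
    base, _, step = part.partition("/")
    if "/" in part and (not step.isdigit() or int(step) == 0):
        raise ValueError(f"{name}: bad step in {part!r}")
    if base == "*":
        return
    bounds = base.split("-")
    if len(bounds) > 2 or not all(b.isdigit() for b in bounds):
        raise ValueError(f"{name}: cannot parse {part!r}")
    values = [int(b) for b in bounds]
    if any(v < lo or v > hi for v in values):
        raise ValueError(f"{name}: {part!r} is outside {lo}-{hi}")
    if len(values) == 2 and values[0] > values[1]:
        raise ValueError(f"{name}: range {part!r} is reversed")


def validate_cron(expr: str) -> None:
    """Raise ValueError unless expr is a plausible 5-field cron expression."""
    fields = expr.split()
    if len(fields) != len(FIELD_RANGES):
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    for field, (name, lo, hi) in zip(fields, FIELD_RANGES):
        for part in field.split(","):
            _check_part(part, name, lo, hi)
```

With this sketch, `validate_cron("0 9 * * 1-5")` returns quietly, while an expression such as `"60 * * * *"` raises a `ValueError` before anything would be written.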

A scheduler daemon. The container now runs a long-running process, a daemon, that polls the schedule table every 10 seconds and fires whatever is due. On SIGTERM, the daemon finishes the currently running workflow and then exits cleanly. After a crash, any run that was in flight is recovered as a failed run with a clear error, so the run history never lies about what's running and what isn't.
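One common way to get the "finish the current workflow, then exit" behaviour is to have the signal handler set a flag that the loop checks between workflows. The sketch below shows that pattern; `ShutdownFlag`, `install`, and `run_due` are hypothetical names, and this is an assumption about the general shape rather than a copy of the daemon's code.

```python
import signal


class ShutdownFlag:
    """Records that SIGTERM arrived so the loop can stop between workflows."""

    def __init__(self) -> None:
        self.requested = False

    def handle(self, signum, frame) -> None:
        # Only set a flag; the running workflow is never interrupted here.
        self.requested = True


def install(flag: ShutdownFlag) -> None:
    """Register the handler. Must be called from the main thread."""
    signal.signal(signal.SIGTERM, flag.handle)


def run_due(workflows, flag: ShutdownFlag) -> list:
    """Run callables in order, stopping before the next one once shutdown is requested."""
    finished = []
    for workflow in workflows:
        if flag.requested:
            break
        finished.append(workflow())  # each workflow always runs to completion
    return finished
```

The handler does no work of its own, so a signal that arrives mid-workflow only affects what happens after that workflow returns.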

The daemon and the CLI are separate processes against the same database. You don't have to "stop the daemon to run a workflow" — docker compose exec brain brain run ... still works exactly as it did in M1, in parallel with whatever the daemon is doing.

A new brain daemon-status command tells you whether the daemon is alive. Docker uses the same command as its container healthcheck — so the orchestration layer notices if the daemon dies and brings it back.
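A liveness check like this is often built on heartbeat freshness: the daemon records a timestamp, and the status command compares it to the current time. The sketch below assumes that design. `daemon_is_alive`, `exit_code`, and the 30-second threshold are all hypothetical, and the real command may use a different rule, for example to tolerate a long-running workflow. The one firm fact is that a Docker healthcheck treats exit code 0 as healthy and 1 as unhealthy.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed threshold: three missed 10-second polls. Not taken from the project.
MAX_HEARTBEAT_AGE = timedelta(seconds=30)


def daemon_is_alive(
    last_heartbeat: Optional[datetime],
    now: datetime,
    max_age: timedelta = MAX_HEARTBEAT_AGE,
) -> bool:
    """True when a heartbeat exists and is no older than max_age."""
    if last_heartbeat is None:
        return False  # the daemon has never reported in
    return now - last_heartbeat <= max_age


def exit_code(alive: bool) -> int:
    """Map liveness to a healthcheck exit code: 0 is healthy, 1 is unhealthy."""
    return 0 if alive else 1
```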

Workflows that read their previous run. A step can write {previous.<step_name>} in its prompt or command, and The Brain substitutes the same step's output from the last successful run of the same workflow. A daily digest can include yesterday's summary in today's prompt. A diff workflow can compare today's state to yesterday's. A long-running task can build on itself.

LLMStep(
    name="summary",
    prompt=(
        "Yesterday's summary:\n{previous.summary}\n\n"
        "Today's memories:\n{recent}\n\n"
        "Write today's summary."
    ),
)

There's a strict rule: on the very first run, when there is no previous successful run, that step fails with a clear error rather than silently substituting an empty string. This is the same strict-failure shape as M1's intra-run {step_name} placeholder: better to halt loudly than to leak braces into a shell command.
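The strict-failure rule can be sketched in a few lines. This is an illustration under assumptions, not the project's resolver: `resolve_previous` and `MissingPreviousRun` are hypothetical names, and the previous run's outputs are assumed to arrive as a plain step-name to output dictionary (or `None` when no successful run exists).

```python
import re
from typing import Dict, Optional

# Matches {previous.<step_name>}; other placeholders such as {recent} are left alone.
PREVIOUS = re.compile(r"\{previous\.([A-Za-z_][A-Za-z0-9_]*)\}")


class MissingPreviousRun(Exception):
    """Raised instead of substituting an empty string."""


def resolve_previous(template: str, previous_outputs: Optional[Dict[str, str]]) -> str:
    """Replace every {previous.X} with step X's output from the last successful run."""

    def substitute(match: "re.Match[str]") -> str:
        step = match.group(1)
        if previous_outputs is None:
            raise MissingPreviousRun(f"no previous successful run for {{previous.{step}}}")
        if step not in previous_outputs:
            raise MissingPreviousRun(f"previous run has no step named {step!r}")
        return previous_outputs[step]

    return PREVIOUS.sub(substitute, template)
```

A template with no `{previous.…}` placeholder passes through unchanged even on a first run, so only steps that actually depend on history can fail this way.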

An opt-in HTTP endpoint. POST /run accepts a workflow path, runs the workflow, and returns the run's metadata as JSON. Bearer token from an environment variable; without the token in the environment, the service refuses to start. Designed for server-to-server, not browsers — no CORS, no public docs, single token. Opt in by bringing up the api compose profile.

THE_BRAIN_API_TOKEN=your-secret docker compose --profile api up -d

If you want to fire workflows from another machine, this is the surface. If you don't, you ignore the profile and nothing in M1 changes.
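From the calling machine, a request could be assembled roughly as follows, using only the standard library. The JSON field name `path`, the helper `build_run_request`, and the base URL in the usage note are assumptions for illustration; check the repo for the actual body field and port.

```python
import json
import urllib.request


def build_run_request(base_url: str, token: str, workflow_path: str) -> urllib.request.Request:
    """Build (but do not send) a POST /run request with a bearer token."""
    # The body field name "path" is an assumption, not confirmed by the project.
    body = json.dumps({"path": workflow_path}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/run",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it is then a matter of passing the request to `urllib.request.urlopen` and decoding the JSON response, for example with a base URL such as `http://localhost:8000` if that is where the api profile listens in your setup.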

Same workflow file shape as M1. Same three step types. Same persistence model: one row per run, the full step-by-step output as a JSONB array, the same brain history and brain show introspection. Workflows you wrote for M1 don't change; they behave the same whether you start them with brain run or the daemon starts them on a schedule, with the same placeholders and the same output rows. The only difference is who started the run.

How it works under the hood

The scheduler daemon is the new long-running piece. The shape is small on purpose. A single function does one poll cycle — heartbeat row, look up what's due, fire each one sequentially, advance the next_run_at column after each fire. A wrapper around that function loops every 10 seconds. Tests drive the cycle function directly with a frozen clock instead of spawning a real long-running process.
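The "one function per poll cycle, driven by an injectable clock" shape might look like the sketch below. Everything here is illustrative: `Schedule`, `poll_once`, and the callables passed in are hypothetical, an in-memory list stands in for the schedule table, and another list stands in for the heartbeat row.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, List


@dataclass
class Schedule:
    name: str
    next_run_at: datetime


def poll_once(
    schedules: List[Schedule],
    run_workflow: Callable[[str], None],
    next_fire: Callable[[Schedule, datetime], datetime],
    now: Callable[[], datetime],
    heartbeats: List[datetime],
) -> List[str]:
    """One poll cycle: heartbeat, find what is due, fire sequentially, advance."""
    started = now()
    heartbeats.append(started)  # stands in for updating the heartbeat row
    due = [s for s in schedules if s.next_run_at <= started]
    fired = []
    for schedule in due:
        run_workflow(schedule.name)  # blocks until the workflow finishes
        schedule.next_run_at = next_fire(schedule, now())  # advance after each fire
        fired.append(schedule.name)
    return fired
```

Because the clock is just a callable, a test can pass `lambda: some_fixed_datetime` and assert on the result without sleeping. The production wrapper would call `poll_once` in a loop with a 10-second sleep between cycles.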

Cron precision is minute-level, so polling every 10 seconds gives enough resolution. "Sleep exactly until the next scheduled fire" would be neater but isn't worth building: a delay of up to 10 seconds is negligible for the workflows that actually run on schedules.

Two invariants are worth naming because they're easy to get wrong.

Skip, don't catch up. If a workflow takes longer than its cron interval (say a 1-minute cron whose last run took 5 minutes), the daemon does not queue up four backlog fires for the boundaries it missed. It fires once, advances next_run_at to the first cron boundary after the current time, and moves on. A schedule that fell six hours behind fires once and continues on its current cadence. Catching up across a long outage is almost always wrong; it floods the system with stale work the moment it comes back.
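The arithmetic behind "skip, don't catch up" is easiest to see with a fixed interval standing in for a cron expression. That substitution is a simplification made for this sketch: real cron boundaries are not evenly spaced, and the daemon presumably uses a cron library for them. `next_boundary_after` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def next_boundary_after(last_boundary: datetime, interval: timedelta, now: datetime) -> datetime:
    """Return the first boundary strictly after now, skipping any that were missed."""
    if now < last_boundary:
        return last_boundary  # still in the future, nothing was missed
    missed = (now - last_boundary) // interval  # whole intervals already elapsed
    return last_boundary + (missed + 1) * interval


start = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
# A 1-minute cadence whose run finished 5 minutes and 10 seconds late:
late = start + timedelta(minutes=5, seconds=10)
print(next_boundary_after(start, timedelta(minutes=1), late))
```

The call returns a single timestamp, 09:06 in this example, rather than a list of the boundaries that were missed, which is what keeps a backlog from forming.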

Sequential within a poll cycle. No concurrent workflow execution. A long-running workflow blocks the daemon from picking up other due workflows until it finishes. This is by design for v1.0; parallel execution and a real work queue are a v1.1 concern.

The placeholder lookup is a single indexed query. A partial index on workflow_runs (workflow_name, started_at DESC) WHERE status = 'success' lets Postgres find the latest successful run of a workflow straight from the index, without scanning or sorting failed runs. The previous run's output JSONB is decomposed into a step-name → output map at lookup time, which is what {previous.X} resolves against.
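Put together, the lookup could be sketched as below. The table and the indexed columns come from the description above; the index name, the `steps` column name, the `{"name": ..., "output": ...}` element shape, and the `previous_outputs` helper are assumptions made for the example. The `%s` parameter style is the one psycopg uses.

```python
import json
from typing import Dict, Optional

# Partial index as described in the post; the index name is made up.
CREATE_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS idx_runs_last_success
    ON workflow_runs (workflow_name, started_at DESC)
    WHERE status = 'success'
"""

# Latest successful run of one workflow; the `steps` column name is assumed.
LAST_SUCCESS_SQL = """
SELECT steps
  FROM workflow_runs
 WHERE workflow_name = %s
   AND status = 'success'
 ORDER BY started_at DESC
 LIMIT 1
"""


def previous_outputs(steps_json: Optional[str]) -> Optional[Dict[str, str]]:
    """Turn the stored step array into a step-name -> output map."""
    if steps_json is None:
        return None  # no previous successful run
    # Assumed element shape: {"name": "...", "output": "..."}
    return {step["name"]: step["output"] for step in json.loads(steps_json)}
```

The query's `WHERE` clause repeats the index predicate and its `ORDER BY` matches the index order, which is what allows the planner to use the partial index for this lookup.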

The HTTP endpoint is intentionally minimal. One sync endpoint, one body field, one bearer token, no API docs exposed in production. The threat model is single-token-server-to-server — anyone with the token can execute arbitrary server-side Python by pointing the endpoint at any file on the host. The token is the only gate; treat it as a production secret. Path allowlisting and per-caller scoping are a v1.1 concern.
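Since the token is the only gate, the two checks that matter are refusing to start without it and comparing it safely on each request. A minimal sketch follows, assuming the token is read from `THE_BRAIN_API_TOKEN` as in the compose command earlier; `load_token` and `is_authorized` are hypothetical names, and the constant-time comparison is a suggestion rather than a statement about what the service does.

```python
import hmac
import os
from typing import Mapping, Optional


def load_token(environ: Mapping[str, str] = os.environ) -> str:
    """Return the API token, or refuse to start when it is missing or empty."""
    token = environ.get("THE_BRAIN_API_TOKEN", "")
    if not token:
        raise RuntimeError("THE_BRAIN_API_TOKEN is not set; refusing to start")
    return token


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Check an 'Authorization: Bearer <token>' header against the expected token."""
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(presented.encode("utf-8"), expected_token.encode("utf-8"))
```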

What M2 does not do

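Worth naming so the next post isn't an apology. The limits already mentioned above: no parallel execution or work queue, no catch-up for missed fires, and no path allowlisting or per-caller scoping on the HTTP endpoint. Webhook and file-watcher triggers are not part of M2 either.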

What's next

Milestone 3 is the reactive layer — webhook triggers and file-watcher triggers. That's when The Brain stops only firing on the clock and starts firing in response to things that happen.

The full roadmap and milestone progress table live in the repo's README.

Try it

git clone https://github.com/MihaiBuilds/the-brain
cd the-brain
docker compose up -d
docker compose exec brain brain daemon-status
docker compose exec brain brain register examples/hello.py --cron "*/1 * * * *"
docker compose exec brain brain list

That's the whole walkthrough. Wait a minute, run brain history, and you'll see the daemon-fired run sitting in there alongside any brain run invocations from the M1 quickstart — same row shape, same inspection commands, same database.

The repo has the longer version with state-across-runs and the HTTP endpoint walkthrough.

Follow the build

Watch the repo to follow along as each milestone ships.