The Vault and the Poet: an AI that updates my books every day — without ever booking a thing

I built an AI agent that updates my membership administration in the books every morning. The uncomfortable part: a model that guesses, loose on my financial core. Here’s how I made that safe — with a principle from ERP pioneer Jan Baan as my compass.

A digital vault with an AI brain in front of it, half organic, half circuit — The vault and the brain. The truth sits safely behind the door; the thinking happens in front of it.

The idea that kept me up at night

Since this spring our memberships run through Stripe. Every payment has to land back in our bookkeeping in Odoo as proper administration: a customer, a sales invoice, a payment, neatly reconciled. Until now I did that by hand or half-automatically. It could be more elegant: let an AI agent do it every day.

And that is exactly where it pinched. Because what you’re really saying is: “Dear language model, here are the keys to my ledger — go ahead and book.” A model that guesses by nature, that sometimes hallucinates, that has good days and odd days, unsupervised in the one place where a single wrong number bends your annual accounts out of shape. That’s not automation, that’s roulette.

The temptation with these tools is to let the AI itself be the one that acts. The smart answer is the opposite.

What Jan Baan says about it

In mid-June, Jan Baan — the man behind Baan Company, one of the founding fathers of ERP — wrote a sharp analysis on LinkedIn. His point: never make your core system the “engine of action”. Letting generative AI loose on the rigid tables of a ledger he calls “AI on broken systems” — it produces “probabilistic spaghetti” in exactly the core that has to be one hundred percent correct.

His alternative is a strict separation of powers. The core system becomes an Iron Fort: not an action engine, but a heavily secured vault — the place where the truth lives, purely deterministic. The action moves to a layer above it, split across three roles:

Four stages: The Poet, The Lock, The Accountant, The Iron Fort — Baan's separation of powers. The poet thinks, the lock signs, the accountant executes exactly, the vault keeps the truth.

The Poet — the guessing AI. May propose ideas and draft a blueprint. Nothing more.
The Lock — a human looks at that proposal and deliberately signs off on it.
The Accountant — a dumb, exact machine that executes only what was approved, letter for letter, without interpretation of its own.

The one sentence that captures it all, and that should have been pinned above my screen: “The guessing AI never directly touches the live environment. 1 + 1 always stays 2.”

This was written for the enterprise — for SAP-sized giants and Java code. But the principle isn’t big or small. It’s simply right. And so I built it, small and concrete, in the bookkeeping of our foundation.

My translation of the three roles

The mapping was almost too neat:

Claude is the Poet. The model pulls the Stripe transactions, judges them, and smells whether something is off. But it never books a single cent itself.
I am the Lock. I review the proposal — the “dry run” — before any money is booked.
A plain Python script is the Accountant. Dumb, exact, and idempotent: it books exactly what was approved, and if you accidentally run it twice, nothing doubles.
Odoo is the vault. The truth lives there, and only the Accountant may enter.
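What “idempotent” means for the Accountant can be sketched in a few lines. This is a minimal illustration, not the actual script: the in-memory ledger dict stands in for Odoo, and book_payment and the Stripe-style transaction ids are hypothetical names chosen for the example. The assumption is that every booking is keyed on the Stripe transaction it came from, so a second run finds the existing entry and changes nothing.

```python
from decimal import Decimal

# Hypothetical stand-in for the vault: maps a Stripe transaction id to a booking.
ledger: dict[str, dict] = {}


def book_payment(stripe_id: str, amount: Decimal) -> bool:
    """Book a payment at most once. Returns True if a new entry was created."""
    if stripe_id in ledger:
        # Already booked in an earlier run: leave the existing entry untouched.
        return False
    ledger[stripe_id] = {"amount": amount, "state": "draft"}
    return True


# Running the same batch twice leaves exactly one entry per transaction.
batch = [("txn_001", Decimal("25.00")), ("txn_002", Decimal("40.00"))]
first_run = [book_payment(sid, amount) for sid, amount in batch]
second_run = [book_payment(sid, amount) for sid, amount in batch]
```

The second run reports that it created nothing, which is what makes a repeated or half-finished run harmless.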

The crucial difference from “AI that does your bookkeeping” comes down to one distinction: the script is the booking engine, not the model. Claude starts it and watches it. It’s the guard dog, not the hand on the pen.

The Accountant speaks Python

So far I’ve called it “the Accountant” — the dumb, exact machine that executes what was approved. Time to say what that concretely is: a Python script. No AI, no model, no magic. A few hundred lines of code that do exactly one thing, the same way every time, and that I can read line by line.

There’s a nice line running back to Jan Baan here. In his architecture the deterministic executor is an engine that translates everything into Java 21 — the language of the enterprise, robust and proven. In ours, that same role is a Python script. Different language, identical principle: a language model may think, but the execution is plain, predictable, readable code. Determinism isn’t a property of the language; it’s a property of the design.

And here’s the nice coincidence — or maybe no coincidence at all. Because the vault itself, Odoo, is also written in Python. So our Accountant literally speaks the mother tongue of the bookkeeping it writes into. That doesn’t just make it elegant, it makes it buildable: I run a separate Odoo environment on Odoo.sh, where not only the system runs but the full source code is available too. I can see how it works on the inside, and build fitting scripts around it that line up exactly with how Odoo does things.

For someone who comes out of the Exact world, that’s a small relief. I’ve worked for decades with closed bookkeeping systems — solid, but a black box: you get what the vendor gives you. Odoo is from a different generation. Open, modern, in a language I can read, with an environment in which I can build myself. After all, you can only secure something properly if you’re allowed to look inside.

What “safe” literally means here

Talking about security is easy. Let me show it. This is — lightly trimmed — the only function with which an invoice is created in our bookkeeping:

def safe_create_vendor_bill(..., dry_run=True, prod_confirmed=False):

    # 1. By default this changes nothing. It only shows what it WOULD do.
    if dry_run:
        return {"dry_run": True, "would_create": {"state": "draft", ...}}

    # 2. Only when you deliberately say 'book for real': the brake on production.
    guard_production_write("create vendor bill", confirmed=prod_confirmed)

    # 3. Booking always happens as a draft — never directly final.
    bill_id = rpc(session, "account.move", "create", [bill_data])

    # 4. And every booking leaves a trail.
    _audit_log("create_vendor_bill", "account.move", {...})

Four lines of comment, four safeguards — and none of them optional:

The safe state is the default. The function isn’t called create_vendor_bill but safe_create_vendor_bill, and it starts in dry_run. You have to step out of it deliberately to write anything. Forget to, and simply nothing happens — the worst that can go wrong is that there’s no booking.
The brake sits between the decision and the deed. guard_production_write does nothing on the test environment, but refuses on production unless there’s an explicit yes. The AI cannot give that yes itself; it comes from me. This is the Lock, in one line.
Booking is always a draft. A draft you can review and throw away; a final entry you cannot. The Accountant never sets the irreversible stamp itself.
No mutation without a trail. Who, what, on the basis of which proposal — it all lands in a logbook.
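The body of _audit_log is not shown here, so the following is only a sketch of what a logbook helper of that kind could look like. The names audit_entry and append_audit, the JSON-lines format and the field names are all assumptions made for illustration. The point it demonstrates is the append-only habit: each mutation adds one line, and nothing already written is rewritten.

```python
import json
from datetime import datetime, timezone


def audit_entry(action: str, model: str, details: dict, user: str) -> str:
    """Build one logbook line as JSON (hypothetical format)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,        # who
        "action": action,    # what
        "model": model,      # on which kind of record
        "details": details,  # on the basis of which proposal
    }
    return json.dumps(record, sort_keys=True)


def append_audit(path: str, line: str) -> None:
    """Append one line to the logbook; mode "a" never rewrites earlier lines."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

A call such as append_audit("audit.log", audit_entry("create_vendor_bill", "account.move", {"proposal": "p-1"}, user="reviewer")) would add a single line; the file name and proposal id are made up for the example.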

And the brake itself is as sober as the rest:

def guard_production_write(action, confirmed=False):
    if not is_production:  return        # test environment: free
    if confirmed:          return        # you deliberately said 'yes'
    raise RuntimeError(f"Production {action} REFUSED: no confirmation.")

No AI in these lines. No element of chance. A production booking without deliberate confirmation isn’t merely discouraged or logged; it’s refused. That’s the difference between a system that hopes it goes well, and a system in which it can’t go any other way.

How the watchdog spends its day

Every morning the same cycle runs. Not one big magical leap, but five sober steps:

Five steps: Dry Run, Judge, Book, Verify, Trace, with a green/red outcome — The daily cycle. First look, then judge, then only — through the script — book, verify and leave a trail.

Dry run. Pull the plan. Book nothing yet. What would be booked today?
Judge. This is where the Poet does its real work — the work a plain automation can’t do. Are there refunds? An unknown product type? A member that suspiciously resembles an existing contact? An odd amount? On a hard doubt: stop, book nothing, warn me.
Book. Only when the plan is clean does the deterministic script run. Idempotent, so safe.
Verify. And this is my favourite step, because this is where Baan’s sentence lives. The balance of the Stripe account in Odoo must match the real balance at Stripe to the cent. If it doesn’t, something is wrong in the chain. 1 + 1 must be 2 — and if it’s 1.98, the whole thing locks.
Leave a trail. Every booking gets an attachment in Odoo with its origin: which Stripe transaction it came from, which amount, which fee. Every line in the ledger traces back to its source.

Green? Then I hear nothing — as it should be. Yellow or red? Then I get a message. The agent never repairs anything itself; a half-finished run is safe (idempotent), but a guessed repair is not.
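The control flow of the daily cycle can be sketched as a small function. This is an illustration under assumptions, not the real run: daily_run, RunResult and the four callables are hypothetical names, the plan is reduced to a list of dicts, and the outcome is simplified to green or red (the yellow state is left out). What it shows is the ordering: the booking step is only reached when the judging step raises no doubts, and nothing in the function attempts a repair.

```python
from dataclasses import dataclass, field
from typing import Callable

Plan = list[dict]  # hypothetical shape: one dict per proposed booking


@dataclass
class RunResult:
    status: str  # "green" or "red" in this simplified sketch
    booked: int = 0
    warnings: list[str] = field(default_factory=list)


def daily_run(
    dry_run: Callable[[], Plan],
    judge: Callable[[Plan], list[str]],
    book: Callable[[Plan], int],
    verify: Callable[[], bool],
) -> RunResult:
    plan = dry_run()       # 1. what would be booked today?
    doubts = judge(plan)   # 2. look for anything odd in the plan
    if doubts:
        # Hard doubt: stop, book nothing, warn the human.
        return RunResult("red", warnings=doubts)
    booked = book(plan)    # 3. deterministic script, only on a clean plan
    if not verify():       # 4. balances must match
        return RunResult("red", booked=booked, warnings=["balance mismatch"])
    return RunResult("green", booked=booked)  # 5. the trail is left by book()


booked_plans: list[Plan] = []


def fake_book(plan: Plan) -> int:
    """Stand-in for the booking script: records the plan, returns the count."""
    booked_plans.append(plan)
    return len(plan)


def flag_refunds(plan: Plan) -> list[str]:
    """Stand-in for the judging step: flags every refund in the plan."""
    return [f"refund: {p['id']}" for p in plan if p["kind"] == "refund"]


result = daily_run(
    dry_run=lambda: [{"id": "txn_001", "kind": "refund"}],
    judge=flag_refunds,
    book=fake_book,
    verify=lambda: True,
)
```

In this example the plan contains a refund, so the run ends red and fake_book is never called.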

One subtlety I learned along the way: a naive check that alarms on every difference screams all day. Because in a live payment system new payments arrive constantly. The direction of the difference is the signal. Does Stripe have more than Odoo? Then transactions are still waiting — harmless, just book them. Does Odoo have more than Stripe? That’s when something is wrong. Suspicious. Red.
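That direction rule fits in a handful of lines. A minimal sketch, assuming both balances are available as exact decimal amounts; the function name classify_balance and the labels "green", "pending" and "red" are hypothetical. Decimal is used instead of float so that a comparison to the cent stays exact.

```python
from decimal import Decimal


def classify_balance(stripe_balance: Decimal, odoo_balance: Decimal) -> str:
    """Classify the difference between Stripe and Odoo by its direction."""
    difference = stripe_balance - odoo_balance
    if difference == 0:
        return "green"    # balances match to the cent
    if difference > 0:
        return "pending"  # Stripe has more: payments still waiting to be booked
    return "red"          # Odoo has more than Stripe: something is wrong


# Example amounts are made up for illustration.
matching = classify_balance(Decimal("120.00"), Decimal("120.00"))
waiting = classify_balance(Decimal("145.00"), Decimal("120.00"))
suspicious = classify_balance(Decimal("119.98"), Decimal("120.00"))
```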

Where the agent runs — and where it may not go

This daily run happens in Claude Cowork: the variant of Claude that doesn’t live in a chat window but, like a real colleague, carries out tasks on its own on a fixed rhythm. The interesting part is under the hood. The moment a task starts, Cowork spins up its own, sealed-off Linux environment on the machine — a disposable workspace in which the agent may read files, run scripts and do its work, without ever being able to reach the rest of the system.

That isolation has one deliberate, sharp edge: the network of that workspace is locked down. A script that tries to call the outside world directly from inside that shell — Stripe, the bookkeeping — gets blocked. And that’s exactly right. You don’t want an autonomous agent able to reach the open internet from a sandbox.

But it did mean I had to choose, deliberately, where the booking engine runs. The solution fits the whole principle of this story seamlessly: the engine runs not in the sandbox, but on the host machine itself, through a separate, explicit connector. The agent in the sandbox only uses that connector — it gives the start signal and reads the result. The heavy work happens outside the shell, where the network is open, along a path I opened myself and can close again.

Left blocked in the sandbox, right the fix via a connector on the host — The sandbox guards the agent, the connector guards the network. The script runs on the host, where the network is open.

It’s not a detour, it’s the design. Security is never one wall; it’s a series of deliberate choices about where something may run and what it may touch. The sandbox guards the agent, the connector guards the network, and the booking engine remains the only thing that touches the vault.
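One way to picture a connector that only accepts a start signal is as an allowlist of named jobs. The sketch below is an assumption about the shape of such a connector, not a description of how Claude Cowork connectors actually work; ALLOWED_JOBS, connector and the two job functions are hypothetical. The agent can ask for a job by name and read the result, but it cannot pass an arbitrary command.

```python
from typing import Callable


def run_dry_run() -> dict:
    """Hypothetical host-side job: would call the booking script in dry-run mode."""
    return {"job": "dry_run", "ok": True}


def run_booking() -> dict:
    """Hypothetical host-side job: would call the deterministic booking script."""
    return {"job": "book", "ok": True}


# The only things the sandboxed agent may start, by name.
ALLOWED_JOBS: dict[str, Callable[[], dict]] = {
    "dry_run": run_dry_run,
    "book": run_booking,
}


def connector(job_name: str) -> dict:
    """Start a named job on the host and hand back its result."""
    job = ALLOWED_JOBS.get(job_name)
    if job is None:
        # Anything not on the list is refused, not interpreted.
        raise PermissionError(f"Job {job_name!r} is not allowed.")
    return job()


plan_result = connector("dry_run")
```

Closing the path again then amounts to removing an entry from the list.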

First, get the vault itself in order

In more than forty years with bookkeeping systems, ERP and integrations, I’ve never let go of one rule: you write into a ledger with great care. A general ledger is not an ordinary database — it’s the financial truth of an organisation. Nothing may change in it unseen, and nothing irreversible may happen without a human deliberately signing for it. That is exactly what Baan’s 1 + 1 = 2 is about. It’s also the heart of what I call the BAAS principle: the owner stays boss of their own figures, and the system must never put them in a state they can’t get back out of.

With that bar in mind, I built the booking core before letting any agent near it. Because it’s precisely the heaviest actions — booking, paying, reconciling, closing a financial year — that deserve the heaviest protection. In the very place where it matters most that 1 + 1 = 2, the gate has to be shut tightest.

So I built it this way. A mandatory front door around the entire booking core, where every booking, payment and reconciliation first arrives as a proposal, then gets my go, and only then is executed letter for letter. An irreversible step — like locking a financial year — has its own separate ceremony, apart from the daily corrections, with an enforced restore point before it. A loud warning that leaves no doubt whether you’re working in the test or the real environment. And a logbook that records every mutation: who, what, on the basis of which proposal.
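“Executed letter for letter” can be enforced mechanically by tying the approval to the exact content of the proposal. The following is a sketch of one possible technique, a content hash, and not necessarily how the front door described here is built; proposal_digest, execute_if_approved and the proposal fields are hypothetical. The approval is given on a digest of the proposal, and the executor refuses anything whose digest differs.

```python
import hashlib
import json


def proposal_digest(proposal: dict) -> str:
    """Hash a canonical JSON form of the proposal (sorted keys, fixed separators)."""
    canonical = json.dumps(proposal, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def execute_if_approved(proposal: dict, approved_digest: str) -> dict:
    """Execute only the proposal that was approved, unchanged."""
    if proposal_digest(proposal) != approved_digest:
        raise RuntimeError("Proposal differs from what was approved: REFUSED.")
    # Placeholder for the real booking call.
    return {"executed": proposal}


# Amounts are kept as strings so the canonical form is exact.
proposal = {"stripe_id": "txn_001", "amount": "25.00", "action": "book_payment"}
approval = proposal_digest(proposal)  # the human signs this digest
outcome = execute_if_approved(proposal, approval)
```

If a single character of the proposal changes after sign-off, the digest no longer matches and the execution is refused rather than adjusted.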

The heart of it comes down to one shift: from “it goes well because we pay attention” to “it cannot go wrong by design”. Discipline that happens to go well is not a foundation; it’s luck with a straight face. I’ve known that distinction too long to leave it to chance.

Why this matters for small business right now

The beautiful thing about Baan’s argument is that the big players are moving in exactly the same direction — core system as a vault, action in a separate layer, deterministic execution. But where the enterprise casts that in heavy Java foundations, in a small-business ledger you can build it with a handful of scripts, a good connector, and a model that behaves like a watchdog instead of a bookkeeper.

And that’s precisely where it becomes worthwhile. Because for a small business “certainty” isn’t a luxury feature — it’s the entire basis of trust in a system that works with your money. The trail every booking leaves, the fact that a human signs before money moves, the guarantee that the guessing part never touches the vault: that is what makes an AI agent on your administration usable instead of frightening.

My membership administration now runs every day. The agent triggers the booking and never touches the entries itself. The poet thinks, I sign, the accountant executes, the vault keeps the truth.

1 + 1 stays 2. Thank you, Jan.

With thanks to Jan Baan, whose LinkedIn analysis of the ERP as an “Iron Fort” was the compass for this work. The technique behind it — a deterministic booking script, an AI watchdog and a daily run — I’ve captured as a reusable pattern, so the next payment provider follows the same route.