
Ever looked at your Claude Sonnet 4.5 cost chart and wondered what those coloured bars actually mean?
Today I was exploring how far AI tooling could take a schema migration from DB2 to a modern cloud-native database engine. As a first step, I needed to spin up a DB2 instance locally inside a Docker container. Once it was running, I checked my Anthropic console to see how much the session was costing me.
After some research into what each category means, here’s how I interpret them — using a familiar engineering analogy: DB2 running in Docker on a local machine.
Prompt Caching Write ($0.47)
This is the one-time cost of writing your large prompt prefix into Anthropic's cache, billed at a premium over the base input rate.
Think of it like starting a fresh DB2 container — loading binaries, allocating memory, building buffer pools.
It’s setup-heavy but only happens once if you keep the session alive.
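In API terms, the write happens when you mark a large, stable prefix (such as a system prompt full of schema context) as cacheable. A minimal sketch of the request body, assuming the Anthropic Messages API shape — the model id and the schema text here are illustrative placeholders:

```python
# Sketch of a Messages API request body that triggers a prompt-cache write.
# The large, stable system prompt is marked with cache_control, so the first
# call pays the (higher) cache-write rate; later calls can read it back.
LARGE_SCHEMA_CONTEXT = "-- imagine hundreds of lines of DB2 DDL here --\n" * 100

request_body = {
    "model": "claude-sonnet-4-5",  # illustrative model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LARGE_SCHEMA_CONTEXT,
            # Marks this block as cacheable (ephemeral = short-lived cache).
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [
        {"role": "user", "content": "Translate this DDL to a modern engine."}
    ],
}

print(request_body["system"][0]["cache_control"])  # {'type': 'ephemeral'}
```

Only the marked prefix is cached; the short user message underneath it is still billed as ordinary input.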
Prompt Caching Read ($0.04)
This is the read-back from the cache — fast and cheap.
Just like reusing a warm DB2 container where data pages are already in memory.
The more you reuse, the more efficient (and cheaper) it becomes.
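You can tell writes from reads by looking at the `usage` object the API returns: `cache_creation_input_tokens` counts the write, `cache_read_input_tokens` the read. A small sketch classifying two hypothetical usage payloads (the token counts are made up for illustration):

```python
# Hypothetical usage payloads, shaped like the Anthropic Messages API
# `usage` object. The first call writes the cache; a repeat call with the
# same prefix, made while the cache is warm, reads it instead.
first_call = {"input_tokens": 12, "cache_creation_input_tokens": 9500,
              "cache_read_input_tokens": 0, "output_tokens": 300}
second_call = {"input_tokens": 12, "cache_creation_input_tokens": 0,
               "cache_read_input_tokens": 9500, "output_tokens": 280}

def cache_status(usage: dict) -> str:
    """Classify a call as a cache write, a cache read, or uncached."""
    if usage.get("cache_creation_input_tokens", 0) > 0:
        return "write"
    if usage.get("cache_read_input_tokens", 0) > 0:
        return "read"
    return "uncached"

print(cache_status(first_call), cache_status(second_call))  # write read
```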
Output ($0.12)
This is the cost of the model generating responses.
Comparable to DB2 executing SQL and returning result sets — cost grows with complexity and output size.
Input ($0.00)
The uncached input tokens were so few that their cost rounded down to zero.
Like sending a lightweight SQL query or config command.
Engineering Takeaway
The real cost isn’t model computation — it’s context management.
Cached prompts stay reusable for roughly 5 minutes after the last access (each read refreshes the timer), so the wins come from reusing them before they expire, trimming unnecessary context, and keeping sessions "warm."
Exactly the same mindset you’d apply to optimising DB2 containers: avoid cold starts, reuse buffers, and keep resources hot.
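The break-even point is easy to estimate. Assuming a Sonnet-class base input rate of $3 per million tokens, with cache writes billed at roughly 1.25x and cache reads at roughly 0.1x that rate (check current pricing — these multipliers are assumptions), caching a prefix pays for itself from the second call onward:

```python
# Back-of-envelope: when does caching a prompt prefix pay off?
# Rates below are assumed USD per million input tokens for a Sonnet-class
# model; the 1.25x write and 0.1x read multipliers are assumptions too.
BASE = 3.00
WRITE = BASE * 1.25   # 3.75 per MTok to write the cache
READ = BASE * 0.10    # 0.30 per MTok to read it back

def cost_uncached(prefix_mtok: float, calls: int) -> float:
    """Pay the full input rate for the prefix on every call."""
    return prefix_mtok * BASE * calls

def cost_cached(prefix_mtok: float, calls: int) -> float:
    """Pay the write rate once, then the read rate on each reuse."""
    return prefix_mtok * (WRITE + READ * (calls - 1))

prefix = 0.01  # a 10k-token prefix = 0.01 MTok
for calls in (1, 2, 5):
    print(calls, round(cost_uncached(prefix, calls), 4),
          round(cost_cached(prefix, calls), 4))
```

One call is slightly more expensive with caching ($0.0375 vs $0.03 for a 10k-token prefix), but by the fifth call the cached path costs about a third of the uncached one — the same cold-start-vs-warm-reuse trade-off as the DB2 container.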
