Kamil Kwapisz

Tech founder, developer, AI enthusiast

Plan with Opus 4.5, execute with cheap models - AI costs strategy

Plan Smart, Execute Cheap

High-quality models get cheaper every month, yet the best ones available are still quite expensive, sometimes as much as 25 times the price of their smaller counterparts.

If you are running a company with solid revenue and have just let a few senior developers go, you probably do not need to think much about LLM pricing. You will most likely spend less on tokens than on a junior developer's salary.

But if you have a smaller budget, you should consider implementing strategies for smarter AI usage.

The simplest strategy that actually works

The easiest one to apply is:

✨ Plan with expensive models, execute with cheaper ones

It sounds obvious, but many AI users still do not take advantage of it. And this strategy works extremely well.

Why planning matters more than execution

Nowadays, most AI assistants support some form of planning mode. At the beginning, it was mainly used as a prompt fixer or enhancer. Today, these agentic tools can quickly produce very detailed plans, including:

  • Clear overviews
  • Diagrams
  • Implementation steps
  • Explanations
  • TODOs

And this is the most important part of the process.

A great plan that ends with truly atomic tasks is the best possible starting point for execution, no matter whether it is done by humans or by LLMs.

If the overview and tasks are clear enough, even a smaller, “dumber” model will know exactly what to do and will execute it as intended.

Especially now, when even tiny, super-cheap models can make you say “WOW” with how capable they are.

How I apply this in practice

My recommendation is simple:

  • Use the smartest model you have for planning (for example, Opus 4.5)
  • Use any cheap and fast model for execution
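In code, the split can be sketched roughly like this. Everything here is a placeholder: the model names and the `call_llm` stub stand in for whatever provider SDK you actually use, and a real planner prompt would ask for numbered, atomic TODOs that you then parse.

```python
# Hypothetical sketch of the "plan expensive, execute cheap" split.
# `call_llm` is a stand-in for a real API client; model names are placeholders.

PLANNER_MODEL = "opus-4.5"      # smartest model: called once, for the plan
EXECUTOR_MODEL = "cheap-fast"   # cheap model: called for every atomic task

def call_llm(model: str, prompt: str) -> str:
    # Stub standing in for a real chat-completion call to `model`.
    return f"[{model}] response to: {prompt}"

def plan(task: str) -> list[str]:
    # One expensive call produces the plan. Parsing is faked here; a real
    # prompt would request numbered atomic steps and you would split them out.
    call_llm(PLANNER_MODEL, f"Produce atomic implementation steps for: {task}")
    return [f"step 1 of {task}", f"step 2 of {task}"]  # placeholder parsing

def execute(steps: list[str]) -> list[str]:
    # Many cheap calls: each atomic step is simple enough for a small model.
    return [call_llm(EXECUTOR_MODEL, step) for step in steps]

results = execute(plan("add pagination to the API"))
```

The point of the shape is the call ratio: one expensive planning call, then N cheap execution calls, so the expensive model's price only hits you once per task.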

When I am coding in Cursor, I use Composer 1 to build what Opus has planned, because it is super fast.

Even when the plan is extensive and the model needs to update many files, a fast LLM keeps me in the loop all the time. There is no temptation to switch windows and start doing something else while waiting.

That is better for:

  • Work output
  • Focus
  • Brain energy

This applies beyond coding

The same strategy applies not only to AI-assisted coding, but also to production-ready AI systems and agents.

And finally, a few tips:

  • Always have the smartest guy do the planning, and let the rest of the team execute.
  • Planning mode does not mean your prompts can be bad. AI can help you, but it won't read your mind. Describe what you need in a well-structured way, and iterate if the output is insufficient.
  • Sometimes cheap models won't be enough. Don't spend tokens running the same job over and over if it keeps failing. Try switching to a stronger model instead.
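That last tip can be wired in as a simple escalation ladder: try the cheapest model first, and move up when its output fails a quality check, rather than retrying the same failing model. This is a hypothetical sketch; the model names, `fake_llm` stub, and `passes_check` hook are all placeholders for your own stack.

```python
# Hypothetical escalation sketch: cheapest model first, escalate on failure.
# Model names are placeholders; `fake_llm` stands in for a real API call.

MODELS = ["cheap-fast", "mid-tier", "opus-4.5"]  # ordered cheapest first

def fake_llm(model: str, prompt: str) -> str:
    # Stand-in for a real API call to `model`.
    return f"[{model}] output for: {prompt}"

def run_with_escalation(prompt: str, passes_check) -> str:
    for model in MODELS:
        output = fake_llm(model, prompt)
        # `passes_check` is whatever validates the result for you:
        # tests pass, JSON parses, schema validates, a reviewer model approves.
        if passes_check(output):
            return output
    raise RuntimeError("no model produced an acceptable result")
```

The key design choice is that the check decides when to pay more, so easy jobs stay on the cheap model and only the genuinely hard ones reach the expensive tier.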
