The cost seems deceptively low right now because those AI companies are fighting for a monopoly, but in reality the cost is huge – not only in capital, but also in trust, privacy, and environmental terms.
If the concern is inference cost: we do have open-weight models that are getting more powerful, and hardware that can run small-ish models cheaply. I run agents using small local models on my MacBook.
Aider (CLI) and continue.dev (VS Code plugin) can both run against a local (or local-network) Ollama. The qwen-coder models are pretty good and getting better; qwen3-coder is in the ballpark of Sonnet 3.5 for code synthesis, albeit slower on my hardware.
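For reference, pointing Aider at a local Ollama instance looks roughly like this (the model tag and port are assumptions – use `ollama list` to see what you actually have pulled):

```shell
# Assumes Ollama is serving on its default port (11434) and that a
# qwen coder model has already been pulled, e.g.: ollama pull qwen2.5-coder:32b
export OLLAMA_API_BASE=http://127.0.0.1:11434

# Aider addresses Ollama-hosted models with the ollama_chat/ prefix.
aider --model ollama_chat/qwen2.5-coder:32b
```

For a machine elsewhere on your LAN, point OLLAMA_API_BASE at that host instead of 127.0.0.1.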
For comparable quality you need a relatively big model, which only works if you have around 64GB of RAM or more. The latest OpenAI local models (https://openai.com/index/introducing-gpt-oss/), for example, are really good, but you probably want the 120b variant to get results anywhere near their best cloud models, and that requires, I think, 80GB+. If you don't have that much, you can try the DeepSeek models, which are known for being ultra-efficient and runnable on "normal" computers, if you don't mind the politics of using them (and there are many similar models now!), but I haven't tried enough of the others to comment.
On my MacBook M1 Pro I can run the gpt-oss-20b model without issues, and quite fast.
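If you want to try it yourself, Ollama makes this a two-liner (the model tag below is an assumption – check the Ollama library for the exact name):

```shell
# Pull the ~20B open-weight model and give it a one-off prompt.
# On a 16GB machine this may swap; 32GB+ is comfortable.
ollama pull gpt-oss:20b
ollama run gpt-oss:20b "Write a quicksort in Python."
```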
I had pretty mixed experiences with the 20B version of GPT-OSS: sometimes it would just start looping in the thinking block, and for certain questions no sampler parameters seemed to make any difference.
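For anyone wanting to experiment with those sampler parameters, Ollama exposes them in the `options` field of its generate API. A sketch (the parameter values are guesses to try against looping, not known-good settings, and it assumes a local server with the model pulled):

```shell
# repeat_penalty discourages repeated tokens; num_predict caps output
# length so a looping thinking block can't run forever.
curl -s http://127.0.0.1:11434/api/generate -d '{
  "model": "gpt-oss:20b",
  "prompt": "Explain the borrow checker in one paragraph.",
  "stream": false,
  "options": {
    "temperature": 0.7,
    "top_p": 0.9,
    "repeat_penalty": 1.2,
    "num_predict": 2048
  }
}'
```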
That said, Qwen3 and Qwen3 Coder are both pretty nice. Also ERNIE 4.5, if the benchmarks are to be trusted, but I mostly run Ollama instead of vLLM now, so I can't test it at the moment (apparently llama.cpp added support for it recently, though).
The models by Mistral might also be worth a look, and I personally thought the EuroLLM project was nice too, but MoE models feel far more palatable on limited hardware.
Neither seems able to compete directly with Sonnet 4 or Gemini 2.5 Pro; you'd need far better hardware to come close.
Hmm, well. So I need a 64GB MBP to run the AI tools, and another machine (likely running Linux) to run the system under development, since we're going all local. Well, doable.
Not sure why parent is being downvoted here. Even without getting into whether it's possible for technology to be apolitical, many AI companies have explicitly political goals.
For example, OpenAI's charter is "to ensure that artificial general intelligence benefits all of humanity". They go on to list more specific political goals downstream from that: https://openai.com/charter/
I care far more about the noise and air pollution that x.ai is causing in Memphis (ruining lives) than the environmental impact of the industry as a whole.
6% YoY growth in domestic electricity demand is frankly nothing compared to the capacity that developing economies are building out for things other than AI.
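To put that growth rate in perspective, 6% year over year compounds to roughly 1.8x demand over a decade – a quick back-of-the-envelope check:

```shell
# 6% YoY growth compounded over 10 years: 1.06^10
awk 'BEGIN { printf "%.2f\n", 1.06^10 }'
```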