Why I've Built an AI PC (and maybe you should too?)
I've been experimenting with LLMs in my homelab - this post was largely written by me ranting at Gemma 3 27B and letting it organize my thoughts 😄
As a "software engineer" – a title I sometimes feel I haven't earned, given my background is in biology and chemistry – I believe that understanding AI applications is key to keeping your job, especially through a market downturn. Shareholders are obsessed with seeing the "AI button" pressed in every facet of every application. I work for a big green vehicle manufacturer, and even we've partnered with OpenAI (probably to the tune of hundreds of millions of dollars), and so far it has been a colossal waste of time and money for everyone. There have been ZERO hackathon projects, products, internal utilities, dashboards - anything that could possibly produce value as a result - even after the ~year or so we've had the partnership. Despite this, the company is still adding "AI Applications" to its KPIs for engineers. This is the largest driver behind me familiarizing myself with LLMs, as the shareholder virtue signaling does not seem to be waning.
Beyond "professional necessity", there are plenty of things you could ask ChatGPT (there are a few genuinely useful queries! What should I have for dinner tonight?), and I'd rather my data not be exfiltrated to OpenAI or DeepSeek. I know most people don't care about that, and that's okay, but it's important to me.
Why I think self-hosted LLMs are practically useless
LLMs are just fun. Watching hardware generate text feels wild, and that fun has quickly revealed the limitations of LLMs. The hallucination problem is awful – even with ample data, they introduce errors that can derail entire threads. I've seen it firsthand trying to transcribe recipes from photos. Models at 8B and above generally do a surprisingly good job, but even a single-character mistake in something like a recipe is a big deal. I see no possible way for any application to be completely safe from the introduction of errant data through hallucination - not through prompting, fine-tuning, web search, anything. Think about RAG systems for company documentation... you really don't want hallucinations in on-boarding materials.
Ultimately, I see this as a "love of the game" pursuit. If you're considering a home lab, understand that it's about exploration and learning; don't expect to be solving any real problems.
There is one practical application of LLMs in general I've found: writing software. The key is rapid verification (and lots of context...). You can quickly test code with unit tests or simply by building and running it. That quick feedback loop is critical for responsible LLM use. I've built several projects with Cline and Sonnet 3.7, and it's very useful for that kind of thing - none of them are very large or popular, but that's because my ideas aren't very good.
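That "rapid verification" loop can be sketched in a few lines. This is a hypothetical illustration, not my actual tooling: the `slugify` candidate stands in for model-generated code, and the point is simply that the model's output is run as an untrusted subprocess against a test I wrote, so failures can be fed straight back into the next prompt.

```python
import subprocess
import sys
import tempfile

def verify(candidate_src: str, test_src: str) -> tuple[bool, str]:
    """Run model-generated code against a known-good test in a subprocess.

    Treat the model's output as untrusted: don't exec it in-process,
    and capture stderr so the failure text can go back into the prompt.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_src + "\n\n" + test_src)
        path = f.name
    result = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, timeout=30
    )
    return result.returncode == 0, result.stderr

# Hypothetical example: the "model" produced slugify(); the test is mine.
candidate = "def slugify(s):\n    return s.strip().lower().replace(' ', '-')"
test = "assert slugify('  Hello World ') == 'hello-world'"
ok, err = verify(candidate, test)
print(ok)  # True
```

The quick pass/fail signal is the whole game - without it, you're just trusting the model.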
However... running these models locally for anything substantial requires expensive hardware. For meaningful projects, you need to handle 100,000+ tokens, which is beyond the capacity of my current GPU. While Gemma 3 27B offers a 128,000 token context window, it runs too slowly on my 8GB card to be practical. Which brings me to my personal setup...
I use a Lenovo M720q with a PCIe riser bracket sporting an Nvidia T1000 8GB card. It's got an i7-8700, and I've added 64GB of 2600MT/s DDR4. Memory speed limits larger models pretty heavily (Llama 3.3 70B gets about 0.8 tk/s), but they still run and are useful for non-interactive applications, like "take my resume and this job application and write a cover letter".
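That ~0.8 tk/s figure is roughly what back-of-envelope math predicts: once a model spills out of VRAM, decode speed is bounded by system memory bandwidth, since every weight gets read once per generated token. The numbers below are assumptions (dual-channel DDR4 and ~40GB for a 4-bit 70B quant), not measurements:

```python
# Rough ceiling for CPU/RAM-bound decode: bandwidth / bytes read per token.
# Assumed, not measured: dual-channel config, ~40 GB for a 70B 4-bit quant.
mt_per_s = 2600          # DDR4 transfer rate (MT/s)
bytes_per_transfer = 8   # 64-bit channel width
channels = 2             # dual-channel
bandwidth_gb_s = mt_per_s * bytes_per_transfer * channels / 1000  # ~41.6 GB/s

model_gb = 40            # approx. Llama 3.3 70B at ~4.5 bits/weight

ceiling_tks = bandwidth_gb_s / model_gb
print(f"~{ceiling_tks:.1f} tk/s theoretical ceiling")
```

A ceiling of about 1 tk/s, with real-world overhead landing you around 0.8 - so the bottleneck really is the RAM, not the CPU.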

For software development though, I'm outsourcing inference to OpenRouter, which feels like a compromise and does cost money (around $20-25 so far). I'll link the projects I've built at the bottom of the post. I'm hopeful that local inference will become more accessible soon.
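For anyone curious what "outsourcing to OpenRouter" looks like in practice: it exposes an OpenAI-compatible chat completions endpoint, so a request is just a bearer token and a JSON body. This sketch only builds the request rather than sending it; the API key and model slug are placeholders, not recommendations:

```python
import json

# OpenRouter's OpenAI-compatible chat completions endpoint.
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(api_key: str, model: str, prompt: str) -> tuple[dict, bytes]:
    """Assemble headers and a JSON body for a chat completion call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return headers, body

# Placeholder key and model slug - substitute your own.
headers, body = build_request(
    "sk-or-...", "anthropic/claude-3.7-sonnet",
    "Take my resume and this job posting and write a cover letter: ...",
)
# To actually send it: requests.post(API_URL, headers=headers, data=body)
print(json.loads(body)["model"])
```

Tools like Cline handle all of this for you; the point is just that there's no lock-in magic here, only HTTP.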
For now, it's about learning and experimentation. If you're an engineer and have money to burn, give it a try - maybe it'll help you be ready when LLM adoption shows up in your KPIs too.
This is my showcase project for coding with LLMs:
https://github.com/Wgelyjr/kanely