#MeWriting On the front page of this website, I write:
“And yet, your PC or laptop is already capable of performing the delightful tasks that large language models (LLMs) actually perform well.”
Hand Turkey by Grok.
My use of the word “delightful” and of “delight” in general to describe what LLMs do well is often misunderstood. How do I know it’s misunderstood? Because people parrot it back to me as a reason that ChatGPT does things for them it doesn’t actually do.
“I would be delighted if ChatGPT took this information and created a well-researched business plan from it that I could start implementing tomorrow.”
Yeah, that’s not it. So let me tell you why I describe some tasks we give LLMs as “delightful” and why those are the tasks LLMs are good at.
A “delightful” task is one where it doesn’t matter how wrong the answer is. It will still elicit a genuine smile from you. I have used Mmojo Server — download free here — to write stories about Mona, Johnny, and Eeyore hundreds, maybe thousands of times. It’s just a great test case to use as I’m working on Mmojo Complete or getting a release of Mmojo Server out the door. I read fast, so I usually take a moment to read them. They still always bring a smile to my face. When I (rarely) skip reading because of a time crunch, I feel a little guilty. Weird, right? I often send the stories to friends, some specifically to annoy them.
Imagine your kid participated in a “literacy project” at school and came home with an original “hand turkey” picture. You don’t have to fake a smile and a sense of pride despite it being objectively bad. That’s the delight I’m talking about. You would immediately hang that over the screen on your new smart refrigerator and know it was an improvement to your kitchen decor.
I had Grok make that one for me because I didn’t want to steal art directly from another web site. A little side joke for you. Long after arts and crafts were removed from school curricula, hand turkeys became a popular staple of literacy projects. They used to drive a couple of literacy experts I worked with in the 00s bonkers. I think they’re adorable though.
Here’s the point. If you’re “delighted” that an LLM gave you an actionable go to market plan for your business that will result in increased ROI while not harming your MPG, I’m happy for you. But that’s not the “delight” I’m talking about.
If you’d like to start experiencing some of this LLM delight, get Mmojo Server for your PC or laptop. It’s private and free. No sign up. No installation. Zero footprint.
This is a page that used to be linked from the masthead. To clean up the site for products, I’ve moved it here. -Brad
#MeWriting I have 3 core principles for generative artificial intelligence (GenAI) in general, and large language models (LLMs) in particular. These principles are:
Privacy
Dignity
Requisite Chill (was “Intellectual Honesty”)
I recommend that people adhere to these principles when using GenAI tools. My own tools are designed to promote them. Let’s briefly explore these principles.
People routinely upload private and sensitive medical, financial, business, and even customer data to ChatGPT and other cloud LLMs. Once uploaded to a cloud system, it can potentially be accessed by hackers or just inadvertently shared publicly. It might be subject to court orders preserving the data. Once uploaded, it is no longer private. You should not submit private data to public cloud systems.
Worse, the data you upload may not be considered. There is no guarantee that, in generating an answer to a question you pose about your uploaded data, the cloud LLM can actually access that data or is actually using it. There is no audit trail to show you whether it has. It may very well be making up an answer without ever consulting the private data you uploaded.
Even worse, if considered, the data you upload may not be effective in formulating a response. LLMs do not analyze or think. They generate answers one random best-enough token at a time, quite similar to how you might have played the autocomplete “game” on your phone. If your data doesn’t linguistically steer the completion algorithm to an answer, it will be ineffective.
My Mmojo Server LLM server runs on your PC or laptop without sending any data to any other computer.
My Mmojo Knowledge appliance offers an LLM server for use on your private network that does not leak your cues, completions, questions, or answers outside your network.
In April 2025, a 16-year-old named Adam Raine, from my old home town of Rancho Santa Margarita, CA, took his own life after 7 months of using ChatGPT. For several months, the chatbot helped Adam ideate his suicide and appears to have encouraged him to commit the act. Adam’s family has sued OpenAI, the creator of the near-ubiquitous ChatGPT. OpenAI’s response and defense is that Adam bypassed “safety” mechanisms and violated the terms of service. Surviving members of at least 6 other families have filed similar lawsuits against OpenAI at this time.
This is absurd. Chat is an abomination. There is no reason that anyone has to pretend to have a back and forth conversation to get information from an LLM. Chat isn’t just an engaging, even dangerously addictive, mode of interaction. It is a cheap illusion, enabled by stop tokens, that creates an authority that does not exist, subjugating users as a price of admission.
You do not have to cosplay with a fake computer character to access knowledge contained in an LLM. You look like a dork doing it, and if you are particularly vulnerable to the illusion, it can drive you to do horrible harm to yourself and others.
Side note: Minor children should not use any system employing generative algorithms without direct, attentive parental supervision. It does not matter whether these systems have chat or completion user interfaces. Your kids should not use them without *you*, their parent, present and attentive.
Side note: Your kids shouldn’t be subjected to this chatbot garbage in school either.
My Mmojo Complete user interface, part of Mmojo Server and the Mmojo Knowledge Appliance, is a powerful completion style UI that doesn’t require you to role play.
If you understand how the completion algorithm works — generating one best enough random token at a time — you know what it does not do. It does not think. It does not reason. It does not generate “correct” answers. It cannot pursue “truth”. It does not write production quality code. It makes your LinkedIn posts look mid, at best, and only that good because lots of other mids are using AI to write their posts too. It cannot replace a thinking, conscientious human being.
When a “manager” or “leader” proclaims that AI will replace any workers or make workers more efficient, that manager or leader is either stupid or lying. I have the confidence to say that, believe it, and back it up because I know how these systems work. I want regular people to be that confident because they too know how these systems work. Being bullied by smart people is unfortunate. Being bullied by idiots is inexcusable.
Empirical evidence is gathering that optimistic AI boosters have been unable to get these systems to do what they promised the systems would do. We now have plenty of data to support the contention that most of AI hype isn’t just wrong. It’s intellectually dishonest. It’s also tacky, and it just lacks requisite chill.
That said, we need to find and embrace applications where GenAI is appropriate and useful. These include creating prototypes intended to be thrown away, visualization, and scenario building. I describe what GenAI can do as solving the blank page problem. You have a blank page. You need something — anything — to fill it. You describe what you’d like. GenAI fills the blank page with a plausible enough answer. Not necessarily or even likely a correct answer, but one that does the job of filling the blank page better than lorem ipsum boilerplate.
Writing custom, relevant stories for your kids who are learning to read has that requisite chill where LLMs shine. Each new story is a new rep for them. Each new story can feature their favorite characters, pets, and people doing interesting things! There is no “wrong” answer.
I identify my writing on this site with the #MeWriting hashtag. Although I rarely publish generated writing, I identify it as such when I do. I note the source of pictures, and whether they are generated.
Goofy badge images by Grok. I need to make a new picture for the third section. Originally, that section came hard at intellectual dishonesty. I now realize that people who stretch expectations of generative AI beyond reality aren’t intentionally intellectually dishonest. They’re just tacky. No chill.
#MeWriting I seem to be having this discussion a lot with people lately. AI hype is bullshit. AI doom is bullshit. Pardon my high school level French. It might get worse. Click away if that bothers you.
I want to give you a concrete example of each, because the reason I know these two facts to be true is that I use and develop actual so-called “AI”. I spend an inordinate amount of time and effort watching other people use actual so-called “AI”. And when I don’t understand what they are trying to accomplish, I ask them. And I listen.
I hear the craziest, stupidest shit. But I don’t judge. Out loud anyway. I try to figure out how to make it safe for them.
My concrete example for hype is OpenClaw, an open source “AI agent” system that has taken the tech world by storm since the end of January, 2026. It is the dumbest shit I could ever have imagined in an AI ecosystem that does not disappoint. High level, people want to automate spamming their contacts instead of texting or calling them or interacting like human beings. There is nothing impressive about this. Nothing. You’re not cool for wanting to do this. You’re an asshole. However…
This mob is instructive to watch. The early users spent hundreds of dollars a day on tokens for very large cloud models trying to make their agents work. Then, a bunch of them decided they should get Mac Minis to run OpenClaw. Most of that bunch didn’t and still doesn’t know why they want to run it on Mac Minis. That doesn’t solve the cloud token bill problem. Unless… They install a private, local LLM on their brand new Mac Minis. But they don’t know that and didn’t think about it. So they all rush out to buy Mac Minis and make YouTube videos about how they’re all buying Mac Minis because they are reliable and have an ecosystem. I shit you not. 5 to 10 minutes of AI hype about Mac Minis without a hint of what the reason is!
The Mac Mini is an ideal machine for running a local LLM because it has unified memory and GPU cores on the same “System on a Chip” (SoC) as the CPU. This makes it fast and cheap at the expense of not being upgradeable. It also makes the Mac Mini about 1/4 the cost of NVIDIA GPUs for LLM processing power at comparable speed. And you don’t have to spec out parts and install them in your system. THAT IS WHY YOU BUY A MAC MINI TO RUN OPENCLAW. I really can’t emphasize that enough.
AI Hype crowd, you can be assumed to be full of shit because, at ground level, you never know what you’re talking about.
Now the AI Doomers. There is an essay by one appropriately named Matt Shumer. Shumer the doomer. The simulation is totally screwing with us. See how I pulled a word choice punch there? Take a minute and read it if you must. It’s in a link a couple sentences back.
Shumer is, allegedly, a CEO of an AI company and a partner in his own investment firm. So when he told us everything is rigged and we’re all going to lose our jobs to AI, millions of white collar professionals are wringing their hands today and worrying about how they’re going to afford their next kale salad. And the keto people… well, they are even more fu-screwed! Shumer the doomer is the most annoying kind of doomer because his doom relies on AI succeeding as the hype crowd hopes. He is different from the “you stole my PhD thesis” crowd of doomers.
Here’s the reality. AI is not doing anyone’s job better or more efficiently. You hear about vibe coding. It is helping the bottom 50% of coders who previously produced buggy crap produce 5x as much buggy crap or produce the same amount of buggy crap in 1/5 the time. It is not helping good or the best programmers produce more, because good and the best have pride in their work and won’t push or publish crap. That slows down the vibe. A lot. Here’s a link to a short video explaining the math.
“But Brad! I’m a very good programmer, even better than you, and coding agents make me so much more productive!”
To which I respond:
“No you’re not. See above. You’re a shitty programmer and you’re producing more shit than ever now. Next.”
I don’t want to be a dick, but that’s the truth, and the truth matters when we talk about AI replacing everyone’s job. The same calculus will apply in anything we ask LLMs to do. I know this because I know and can show you what the completion algorithm does. Watch here. Oh, and I hate to pull rank on your sorry intellectual ass, but I have a Master of Science in Information and Computer Science with a concentration in Algorithms and Data Structures from the University of California, Irvine (1994). UCI ICS was a top 5 program at the time. We were looking up at Cal Tech. So yeah, I’m a dick for telling you I can understand what the completion algorithm does, and I can and would be happy to help you understand! I’m happy to help you so you don’t fall for the doomer bullshit that AI can take your job! It turns out, 32 years later, that’s why I stayed in school.
But I am also an observer. And I see people being threatened with “replacement by AI” by bosses and companies with their heads stuck clearly up their collective ass. Over a long enough and messy enough time horizon, reality wins out. But that doesn’t mean the battle isn’t going to be difficult and painful. It just means that when we win eventually, we hang the people who caused us unreasonable harm. And maybe I’m not employing a metaphor here. Or maybe I am. Actually I’m not. I’ll bring all the rope we need. The people who do this to us are not our friends. They are awkward negotiators screwing with our lives and livelihoods. They deserve what’s coming. You people stick to the metaphor though.
This might get hundreds of views because it isn’t what anyone wants to hear. That said, if it resonates with you, I make software that lets you run LLMs privately and safely. My Mmojo Server eschews the chat illusion — an abomination in my humble opinion — and lets you interact directly with the completion algorithm, the natural language of an LLM. It’s free to install and use. It’s open source so that others can verify my claims and assure you I’m not full of shit like just about everyone else with a “voice” in AI. Get started here:
#MeWriting I’m happy to announce the immediate availability of Mmojo Server for Debian, Ubuntu, and Raspberry Pi 5. Mmojo Server is a Large Language Model (LLM) server that runs on your PC or laptop. It supports the industry standard OpenAI API, so you can connect AI applications to it. It works with an NVIDIA GPU if you have one available on your computer. It also works with your computer’s CPU, albeit a bit slower.
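Because Mmojo Server speaks the OpenAI API, any HTTP client can talk to it. Here is a minimal Python sketch using only the standard library. The host and port (localhost:8080), the /v1/completions path, and the model name are assumptions for illustration; check your own Mmojo Server configuration for the real values.

```python
# Minimal sketch of calling a local OpenAI-compatible completion endpoint.
# The base URL and model name below are assumptions, not Mmojo defaults.
import json
import urllib.request

def build_completion_request(prompt, base_url="http://localhost:8080/v1"):
    """Build an HTTP request for an OpenAI-style /completions endpoint."""
    payload = {
        "model": "local-model",   # placeholder; some local servers ignore it
        "prompt": prompt,
        "max_tokens": 128,
        "temperature": 0.8,
    }
    return urllib.request.Request(
        f"{base_url}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    req = build_completion_request("Once upon a time, a dog named Mona")
    print(req.full_url)
    # Uncomment once your local server is running:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.loads(resp.read())["choices"][0]["text"])
```

The same request shape works for any application that already targets the OpenAI API; point its base URL at your local server instead of the cloud.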
Debian support is intended for stand-alone installations on a PC or virtual machine running Debian Linux or Ubuntu Linux, or on a Raspberry Pi 5. x86_64 and aarch64 (arm64) builds are offered for download. The linked instructions walk you through setting up Linux, downloading some models, downloading and installing the Mmojo Server software, and making it all run.
The Mmojo Server software incorporates the popular llama.cpp LLM server software, and is fully open source to ensure that your data exchanged with Mmojo Server remains private. If you don’t trust me telling you that, you are welcome to inspect and build the source code! Mmojo Server is compatible with many .gguf models you can find on Hugging Face and other web sites.
Installing Mmojo Server is a fun do-it-yourself adventure! I’ve asked non-technical people to test my instructions with good results. Casual and occasional developers should have no problems.
Your PC or virtual machine should have a recent high-end x86_64 or aarch64 (arm64) CPU, with at least 16 GB RAM and 100 GB of available storage. An NVIDIA GPU with 8 GB or more of VRAM will make Mmojo Server faster. You can build a custom Mmojo Server with Vulkan support if your computer has a Vulkan-capable GPU.
If you need assistance via Zoom call and screen sharing, I offer a one-hour hands-on session, for (US) $100. It can be scheduled during extended west coast business hours. You will be working with me, the guy who made this thing work. Email me if interested.
#MeWriting I’m happy to announce the immediate availability of my OpenClaw and Mmojo Server deployment guide for Windows. OpenClaw is an open source “AI agent” platform intended to automate your common communications tasks. Mmojo Server is a Large Language Model (LLM) server that runs on your PC or laptop. OpenClaw and Mmojo Server can be deployed on your PC or laptop.
On Windows, we run Mmojo Server and OpenClaw in separate Windows Subsystem for Linux (WSL) instances. This allows us to sandbox OpenClaw so that you have to give it permission to access any data on your computer’s storage. The linked instructions walk you through installing both Mmojo Server and OpenClaw, then configuring OpenClaw to use Mmojo Server as its LLM server.
In the strangest twist you will ever see in a product announcement, I’m about to tell you that this system does not work well as an “AI agent” platform to automate your communications. In fact, I have written about how such systems cannot and will not work well for automation. However, I believe that by installing OpenClaw with a local LLM and playing around with it, you will be able to see the problems with the whole approach. As a bonus, you will not waste money on expensive cloud LLMs that purportedly “work better”.
Installing Mmojo Server is a fun do-it-yourself adventure! I’ve asked non-technical people to test my instructions with good results. Casual and occasional developers should have no problems.
Your PC or laptop should have a recent high-end Intel or AMD CPU, with at least 16 GB RAM and 100 GB available storage space. An NVIDIA GPU with 8GB or more of VRAM will make Mmojo Server faster.
If you need assistance via Zoom call and screen sharing, I offer a one-hour hands-on session, for (US) $100. It can be scheduled during extended west coast business hours. You will be working with me, the guy who made this thing work. Email me if interested.
#MeWriting I’m happy to announce the immediate availability of Mmojo Server for Windows. Mmojo Server is a Large Language Model (LLM) server that runs on your PC or laptop. It supports the industry standard OpenAI API, so you can connect AI applications to it. It works with an NVIDIA GPU if you have one available on your computer. It also works with your computer’s CPU, albeit a bit slower.
On Windows, Mmojo Server runs in a Windows Subsystem for Linux (WSL) sandbox. This lets me ship you the fastest builds that are compatible with popular NVIDIA GPUs. It also helps keep your Mmojo Server private to your computer. The linked instructions walk you through setting up WSL, downloading some models, downloading and installing the Mmojo Server software, and making it all run.
The Mmojo Server software incorporates the popular llama.cpp LLM server software, and is fully open source to ensure that your data exchanged with Mmojo Server remains private. If you don’t trust me telling you that, you are welcome to inspect and build the source code! Mmojo Server is compatible with many .gguf models you can find on Hugging Face and other web sites.
Installing Mmojo Server is a fun do-it-yourself adventure! I’ve asked non-technical people to test my instructions with good results. Casual and occasional developers should have no problems.
Your PC or laptop should have a recent high-end Intel or AMD CPU, with at least 16 GB RAM and 100 GB available storage space. An NVIDIA GPU with 8GB or more of VRAM will make Mmojo Server faster.
If you need assistance via Zoom call and screen sharing, I offer a one-hour hands-on session, for (US) $100. It can be scheduled during extended west coast business hours. You will be working with me, the guy who made this thing work. Email me if interested.
#MeWriting Large language models repeatedly perform one easy-to-understand operation. Given a sequence of tokens (words), they predict a next best enough token. They use a large database of “weights” to calculate “next best enough”. They accept a random best enough token to cut down on comparison time versus picking the very best. This willingness to accept “best enough” is what makes the completion algorithm, as it is called, practical. It’s also what makes it interesting. For any answer of even modest length, you will get a different word-by-word answer every time. You will also get different classes of similarly themed answers.
The completion algorithm is a lot like autocomplete on your phone: given three word choices, you pick one that will work. People have made a game of this feature since day one. When you play that game, you sometimes end up with a plausible sentence, though rarely a sensible one. With LLMs — billions of weights (aka “parameters”) and a long context window to evaluate — most sentences and paragraphs, even multi-paragraph answers, seem sensible too. That is the magic of LLMs.
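The “best enough” token selection described above can be illustrated in a few lines of Python. The vocabulary and scores below are made up for illustration; a real model scores tens of thousands of tokens using billions of weights.

```python
# Toy illustration of "best enough" next-token selection: instead of
# always taking the single highest-scoring token (argmax), we sample
# from the probability distribution, so repeated runs diverge.
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pick_next_token(vocab, scores, rng=random):
    """Sample one token, weighted by its probability."""
    probs = softmax(scores)
    return rng.choices(vocab, weights=probs, k=1)[0]

vocab = ["forest", "kitchen", "moon"]
scores = [2.0, 1.5, 0.2]   # higher = more plausible continuation

# Run this a few times: usually "forest", sometimes "kitchen" --
# never a guaranteed single answer, which is why completions vary.
for _ in range(3):
    print(pick_next_token(vocab, scores))
```

Nothing in this loop checks whether the chosen word is true; it only checks whether the word is plausible given the scores.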
The magic ends there. There is no mechanism in LLMs that guarantees that completions, as they are called at ground level, are correct in any factual sense. An obvious question is why AI researchers didn’t design one in. Turns out, they don’t know how to model truth in a manner compatible with the accidental efficiency of the completion algorithm.
The most important reason we have settled on this algorithm is that it is, accidentally, computable efficiently by vector algorithms, packaged as graphics processing units (GPUs). It’s the third computational task they’ve been really good at, following graphics circa 1990 and cryptography circa 2010. We used these same GPUs for gaming and Bitcoin. They did the underlying computational tasks much faster than CPUs could.
The point here is that when we talk about generative AI for text, we are stuck in and with this model of computation. We are stuck with what it is good at. We are stuck with its limitations. People who pretend otherwise are flat out lying to you or, more charitably, creating convincing surface level illusions that crack under close inspection.
“But ChatGPT says that it’s thinking, so it must be thinking!”
I’ve heard this reasoning from otherwise very intelligent people. No, it is not thinking. What it is actually doing in that step is stuffing the context window with possibilities so that completion down the road might pick one and expand on it. That’s the illusion. It is not at all how smart people think.
Do LLMs Solve the Task at Hand?
With this article, I want to give you a sense of what LLMs are good at and not good at. We now understand what they do, exactly. For any task, we can ask:
“If it’s generating one random best enough token at a time, yielding a linguistically plausible completion, does (or can) that solve the task at hand?”
I am a fan of Elon Musk. I would like him to succeed at everything he has decided is important. He is a savant at picking worthy big goals. That said, his claim of a “truth seeking AI”, based on the completion algorithm, is total bullshit. As noted above, there is no mechanism in the completion algorithm to ensure truth. There is no checking that another LLM could perform to evaluate truthiness. Perhaps there is a way to measure — likely by hand — the truthiness of a large sample of outputs, then tune training to optimize for that measurement. In practice, it’s a measurement much closer to 50% (heads or tails) than 100%. That tuned training is both computationally and humanly very expensive.
To put that into context, we can train simple neural networks to drive a car (Tesla “Full Self Driving”) to an error rate around one human intervention per 100ish miles, and an accident / death per mile-driven rate about 1/10 that of the human fleet. But truthiness of language models peaks around (call it) 60% and is easily steered or derailed by strategic human token injection mid completion. While LLMs are based on neural networks, at operational scale for purpose, driving and writing words are very different tasks with very different levels of achievable mastery.
Let’s call Tesla FSD 96% solved, and recognize that chipping away at the remaining 4% will get more and more expensive, perhaps exponentially (in the true mathematical sense) so. We have to this point achieved great utility from that 96%. Such is not the case with truthiness in LLMs, and we are at a point of rapidly diminishing returns at 60%. The problems and the algorithms at our disposal are just different.
Good For and Bad For List
We now have a comparative sense of LLMs’ limitations on their problem domain versus a different, quite successful “AI” application. From that, we can start to characterize applications that might work well with LLMs and applications that most definitely will not.
Story writing works well. It will work best with a few open ended suggestions, rather than numerous and detailed restrictions. The restrictions become the “truth” for the story. While the context window is a powerful force in shaping token production, it can also be contradictory, or have too many instructions for any single instruction to consistently have real force. “Make up a story about my dog Mona, Eeyore from fiction, and Paul Bunyan from fiction saving the forest using the 7zip application” results in delightful, if absurd, stories! These generated stories obviously aren’t true. To work as outputs, they just need to integrate the three characters and the tool. Everything else is gravy. Unexpected twists are welcome in great stories!
Drafting emails or LinkedIn posts works well if instructions aren’t too detailed and specific. In a weird coincidence, those end up being the kinds of emails and posts that get the most engagement. A sender telling a recipient exactly what and how to do something isn’t a friendly communication.
Summarizing works well if the source is coherent and the reader of the summary is familiar with the source. No trust in a good summary is required. The reader knows the material, so the reader can evaluate the summary for consistency with the source. If the reader is not familiar with the source material being summarized, the reader has no basis to evaluate the quality of the summary.
Translation works well with multi-lingual models trained on enough translation material. This is because sentence by sentence translation usually mostly covers full document translation. My own work with bilingual evaluators of translated stories had them consistently rating translated stories as “A” work — very good but not quite perfect with nuance — even with small, private models in the 4B size range.
LLMs suck at automation. Tool calling is great when it’s sequenced correctly. The demos are amazing when they work. The problem is that sequencing by an LLM is still random, not deterministic. The funny thing is that we already know how to automate deterministically: code the exact sequence, and the computer follows the exact sequence. The best we can hope for in having the LLM handle sequencing is that it might find a sequence for something we don’t already know how to sequence. This suggests that LLMs might be good for exploring or prototyping sequences we don’t already know about. The limiting factor is the downside risk of sequencing incorrectly. In processes that are worth automating, these downside risks tend to be quite high. In plain English, mistakes are very costly.
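For contrast, here is the deterministic automation described above, sketched in Python with hypothetical step names. The sequence is fixed in code, so it never varies from run to run, which is exactly what an LLM-driven sequencer cannot promise.

```python
# Deterministic sequencing: the order of steps is written down once,
# in code, and executes identically every time. The step names here
# are hypothetical placeholders, not part of any real system.
def validate(r):
    return r

def transform(r):
    return {**r, "transformed": True}

def store(r):
    return r

def run_pipeline(record, log):
    # No sampling decides what runs next; the tuple fixes the order.
    for step in (validate, transform, store):
        record = step(record)
        log.append(step.__name__)
    return record

log = []
run_pipeline({"id": 1}, log)
print(log)   # always ['validate', 'transform', 'store']
```

An LLM asked to orchestrate the same three steps might usually get the order right, but “usually” is the whole problem when mistakes are costly.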
How about so-called vibe coding? That seems like applied automation and story telling. For prototyping, where we don’t totally know what the system should do, vibe coding to present possibilities would be a great tool, provided we’re willing to consider it a prototype and not try to massage it into a working system.
An unexpected turn that vibe coding has taken is into specification based automated code generation. This is a mistake, because we’re not good at writing detailed specifications, and LLMs are not good at faithfully following all the directions in them in correct proportion.
It is tempting to suggest putting a human in the loop in all of these activities, since none of them work perfectly every iteration. There are two problems with this. The first is it may just be less expensive to have a human do all the work to a higher degree of quality than to have the human evaluate every answer and repair the bad ones. This is probably the case with vibe coding. The other problem is that competent humans may not enjoy such a role that sets their creativity aside.
“Suck it up, buttercup, this is what we’re paying you to do now.”
The problem with this approach to defining work is that it will atrophy the skills that make great reviewers and fixers great at their work, and alienate the reviewers who have options.
Rule of Thumb
Here is a simple rule of thumb. LLMs are great at right-brained tasks where any linguistically plausible answer is a good task answer. They suck at left-brained tasks where a small subset of possible random answers are acceptable. When you work against this rule of thumb, you just make your project expensive or not winnable in the first place. Comfort with this rule of thumb is the “chill” you need to have if you’re going to make good decisions about what this technology should do.
“It shouldn’t be limited this way, and besides, version next will be better.”
Well, you lack the chill to make good decisions, and that lack of chill will absolutely get your ass kicked in this game. Downside risk.
Conclusion: Who Knows?
Let’s conclude with a personal observation from three years of making this argument, tracing the actual task back to the completion algorithm:
16/20 professionals in the AI space have no sense whatsoever of what tasks LLMs are good for and what tasks they are bad for. They haven’t considered that there is a dichotomy, let alone a continuum. They are the people most associated with AI hype and phrases like “it’s just a toddler” and “the next version will be even better”.
3/20 feel that there are some good applications and some bad applications, but have no idea what they are and no inclination to investigate.
1/20 have considered asking what LLMs actually do to try to figure out what they are good at.
A much smaller segment have put together a working theory.
These are my rough estimates based on my very active engagement with these people. They’re usually not bad people. They just aren’t thinking this through.
You have now seen a working theory. You are aware that at least a handful of people are paying attention to this. We’re paying attention because there is an opportunity to kick a lot of ass of a lot of people who just have their heads in the sand. As a thank you for reading this, I hope that you will pull your head out. Of the sand. You know, so you’re not an easy target for a total ass kicking.
My friend Pete A. Turner shared something with me on a private phone call the other day:
“People are making a lot of decisions based on what a robot thinks the next word should be.”
Pete is not a tech guy. He is very right brained. He has a better holistic sense of what is going on here than any tech guy I know.
#MeWriting My favorite YouTube channels are about woodworking and construction. A few of my favorites in the genre are Bourbon Moth Woodworking, ENCurtis, Make Something, Shop Nation, 731 Woodworks, and Stud Pack. I’ve always been handy with tools and ambitious with hobbies. The ongoing appeal of these channels for me is the story telling. They are entertaining!
In November 2021, inspired by the YouTube channels Brad Angove and Texas Toast Guitars (TTG), I participated in a one-week guitar building workshop at TTG. I built and painted this:
The Silly Mo.
It took 3rd place in TTG’s prestigious Great American Guitar Build-Off in June 2022.
The guitar below was built by Brad Angove. I refined his work with days of detailed neck sanding and a perfect set-up for playability.
Not me playing Brad’s axe.
It took 4th place in the same contest.
I’ve had an opportunity to build out a nice garage woodshop with a drill press, jointer, planer, benches, hand tools, and a commercial quality CNC. With those great tools, my pinnacle creations were small cutting boards with clever epoxy decoration. My CNC-assisted guitar designs never fully materialized.
Jack photo carve.
Cheesy cutting board.
Pan Am.
Another thing I’ve done in my now extensive adult lifetime is assemble a metric f@$% tonne (MFT) of IKEA and off-brand flat-pack furniture. In my 20s, it was affordable. In my 30s, it was functional. In my 40s, it was still everywhere.
Flat-pack TV stand.
I’ve been privileged to acquire (and eventually pass on) some nice real wood furniture constructed in classic styles with classic methods. I like nice things. I would love to get back into woodworking and making nice things with modern tools and methods. Classic furniture is nice. Flat-pack, not so much.
Side note on flat pack: If you want to make fun of my version of obsessive compulsive disorder (OCD), show me your badly assembled flat-pack furniture. It hurts my heart that (a) you did that, and (b) you tolerate it. And yeah, I feel compelled to fix your mess. It will cost you a burger and a Dr. Pepper.
Plot twist! This article isn’t about classically constructed furniture versus flat-pack furniture. It’s about software engineering versus vibe coding. Software engineering is about creating classic software from durable raw materials. Vibe coding is about assembling software from pieces that have already been written and gathered by a large language model (LLM) in the cloud.
A competent, accomplished software engineer can itemize reasons to be cautious with vibe coding. A vibe coder can claim he’s doing software engineering for 1/10 the cost. Managers and marketers equating these two activities are not serious about software.
Speaking of clowns… Dario Amodei, CEO of Anthropic, stated for the third or fourth time at Davos this week that AI would be better than all humans at coding in the next 6 to 12 months. Here’s a link. I’m not embedding the video or finding the exact clip because it’s just dumb. If there is ever a Nuremberg style trial for what these clowns have done to our industry, I will gladly bring a (an?) MFT of rope. Smdh.
I’m not trying to start a war. I brought the fine woodworking vs. flat-pack metaphor into the discussion as an analogue to software development versus assembly to propose a framework for peace.
Vibe coders aren’t writing software that requires five decades of academic research and a similar span of commercial practice — with gross missteps along the way — to work reliably. They’re assembling software that might not have to work well. I’ve previously described this as “building prototypes”. I am 100% supportive of that activity and any methods to do it so long as people are willing to call them prototypes and throw them away when real systems need to be built!
We can have a world where there is software built by traditional, time tested methods — and where there is flat-pack vibe code built by AI, allegedly under human guidance. The peace deal is that you, the manager or customer, pick the one you want and choose practitioners who want to make it for you. Traditional software engineers don’t want to assemble flat-pack. Vibe coders can’t write real software. That’s just how things are.
Can’t we all just get along?
I floated a draft of this to my small collection of reliable critics. One response came back quickly: “Tech people don’t want to read your boring woodworking metaphor or hear about what you do when you’re not coding.” Exactly. Thank you for validating my premise!
#MeWriting Americans will soon face a choice: Get off the electric grid or do their inference at home. Let me define and explain.
The All-In Podcast for Friday, January 16, 2025 floated the first option. In order to free up electricity for data centers, homeowners in the United States would install solar panels and batteries over the next decade. The All-In option would cost each detached homeowner about $30K for a company to add solar and battery to the home’s roof. Perhaps we’ll see do-it-yourself or handyman kits appear at retailers like Lowes and Home Depot in the $10K range.
Please watch the full segment — about 17 minutes. Fans of the podcast and the “besties” will appreciate this for what it is: a trial balloon from investors, industry, and government. All four were pitching the approach, ignoring obvious pitfalls:
Multi-family and high-density buildings. Two-thirds of United States households live in detached homes, so the remaining one-third will face the coordination problem of who installs solar and batteries.
Suboptimal rooflines. Call south facing roofs the 100% efficiency baseline. East/west facing roofs operate from 75%-90%. North facing roofs operate from 45%-70%. Solar efficiency was not a consideration in building orientation for most existing homes.
Maintenance and repair. When local equipment or transmission wire needs to be replaced, the replacement cost is spread over many rate-payers. When the panel or battery on your home goes bad, replacement cost is spread over you, or a warranty. There’s likely no routing around the problem to keep service available, unless you’ve invested a multiple of base system cost for local redundancy.
Regulation. Most states do not currently have regulation favorable to single family homeowners buying from and selling to the grid. Many homeowners who would have been able to afford to buy systems outright have ended up in more expensive leasing situations solely for regulatory compliance associated with not being completely off-grid. With telephone deregulation in the early 1980s, we solved this problem by letting customers plug whatever compliant equipment they wanted into the phone network. We are almost 50 years behind solving this basic problem with the electric grid. Few people are discussing it.
What Problem are We Solving?
Let’s stop a moment and appreciate that there is a problem that needs to be solved. The artificial intelligence segment of the tech industry wants to build out inference capacity in new data centers. Inference capacity is the ability to ask “AI” questions and get responses. This is usually in the form of chat or so-called “agentic” workflows. The bigger the models and the more users using them, the more compute (CPU or GPU), memory (RAM and disk), and power needed to provide the service. Let’s leave out diffusion, which is used for images, sounds, and video. Let’s also set aside network bandwidth concerns. For text inference, it’s negligible.
The states have already entered the discussion on resource allocation. Florida, led by its staunchly conservative Governor, Ron DeSantis, is saying no to land use, environmental impacts, and grid prioritization for new AI data centers. Politically, this should surprise everybody and simultaneously, surprise nobody. DeSantis is specifically questioning the need for centralized inference, even pooh-poohing the “don’t let the Chinese beat us” narrative driving data center buildout.
The recently departed Scott Adams was both the creator of the Dilbert comic strip and a popularizer of the persuasion lens. Through that lens, we can see that dramatically boosting capacity or radically reallocating usage of the electric grid is an example of selling past the sale. The real sale is inference capacity scaled beyond imagination. We are not talking about whether that is needed. Spoiler alert: it’s not. We are talking, instead, about how to provide enough power to do it.
Are We Solving the Right Problem?
I told you that inference scaled out by data centers is not needed. For two years now, I have helped my clients and customers use small large language models (LLMs) running on their laptops or inexpensive appliances for chat. I coined a phrase for these models: they feel just as knowledgeable, but less annoyingly loquacious than large, popular cloud models. They’re not quite so fast either. They generate answers a little faster than you can read them rather than spitting out a page of text in an instant.
A common response to my message should be quite flattering to me: “Brad, stick to comedy.” I make this whimsical because it is absurd. When I’ve dug deeply into real people’s embrace of cloud chat, I have found that the illusion of intelligence is very important to them. It’s easy to believe that some giant machine in the cloud is “intelligent”. It is not easy to believe that an appliance computer the size of a deck of cards is “intelligent”. Both provide similarly useful answers for, let’s call it, 19 out of 20 questions they’ll ask. But they want the illusion of intelligence provided by a far-away computer they will never see. That illusion you crave might cost you $30K in installation this next decade and a lifetime of maintenance headaches and worry. See the All-In Podcast trial balloon above.
Inference is not just chat. My software, Mmojo Server, provides an OpenAI compatible application programming interface (API). This makes it possible for developers working on AI applications to use a private, local Mmojo Server rather than a cloud system as the AI backend to their products. One big advantage during the development phase is that they don’t pay a cloud provider for tokens. They pay for availability and capacity of a Mmojo Server. They might pay a cloud provider tens of thousands of dollars during development for what they can run for free on their laptops or package into a fast, local stand-alone server for under $2K. Developers can also eliminate a problem called “drift” — where the model changes — by using a fixed, local LLM instead of the cloud. Mmojo Server has developers from companies you’ve heard of using it for both AI wrapper and agentic application development. It’s not theory or vision. It’s real.
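To make the “OpenAI compatible” point concrete, here is a minimal sketch of what talking to a local endpoint looks like from application code. The base URL, port, and model name below are my assumptions for illustration, not Mmojo Server’s actual defaults; check your server’s configuration for the real values.

```python
import json
from urllib import request

# Assumed local endpoint; substitute your Mmojo Server's actual address.
BASE_URL = "http://localhost:8080/v1"

def build_chat_request(prompt, model="local-model"):
    """Build an OpenAI-style /chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt, model="local-model"):
    """POST the request to the local server and return the reply text."""
    body = json.dumps(build_chat_request(prompt, model)).encode()
    req = request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the request and response shapes match the OpenAI API, an application written this way can switch between a cloud backend and a local one by changing a single URL, which is also what eliminates drift: the model behind the local URL only changes when you change it.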
Alternative Approach
I have a better idea than convincing all United States homeowners to spend $30K on solar and battery. What if, instead, homeowners spend $300 or $3000 on inference at home? The hardware is inexpensive and reliable. The software already exists. The application protocols are well defined and in use. My own system, the Mmojo Knowledge Appliance, is plug and play with zero configuration. Plug it into the wall for power and your router for connectivity. It is instantly available for use by any computer or device on your home network. Should it break, order another one and plug it in, just like any other small appliance in your home.
Mmojo Knowledge Appliance
I’ve built these for paying customers with inexpensive Raspberry Pi devices. If your tastes for inference tend more to race car performance, I can build you one using, for example, a Framework Desktop computer with an AMD Ryzen AI+ CPU/GPU.
Side note: The appliance at the left costs about $4 per month to run full throttle 24/7 on grid electricity priced at California regulated peak consumer rates. A typical heavy user might spend $0.50/month at those billing rates.
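For the curious, here is the back-of-the-envelope arithmetic behind that $4 figure. The 15 W draw and the $0.40/kWh peak rate are my illustrative assumptions, not measured values; plug in your own device’s wattage and your utility’s rate.

```python
# Rough monthly electricity cost of a small always-on inference appliance.
WATTS = 15            # assumed full-load draw of a Raspberry Pi class device
RATE_PER_KWH = 0.40   # assumed California regulated peak consumer rate, $/kWh
HOURS_PER_MONTH = 24 * 30

kwh_per_month = WATTS / 1000 * HOURS_PER_MONTH   # 10.8 kWh
cost_per_month = kwh_per_month * RATE_PER_KWH    # about $4.32

print(f"${cost_per_month:.2f}/month at full throttle")
```

At those assumptions the full-throttle cost lands right around $4/month, and a real user who only queries the appliance occasionally draws far less than full load, hence the $0.50/month estimate.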
Over the past two decades, the biggest reason that software has moved to the cloud is monetization. Tech companies can put a meter on your usage of software and force you to pay. There are secondary “benefits” like no installation or required maintenance and upgrades by technically challenged users. Presented with the costs of going “all-in” on cloud inference, maybe we should reconsider the appropriateness of that model.
I have a working name for this approach: Inference @ Home. If this approach interests you, please message me on LinkedIn or drop me an email. I have several ways you can participate in this mission, ranging from using the Mmojo Server software to sponsoring my work. Let’s talk! -Brad
#MeWriting I stumbled on a LinkedIn post from a connection, Emmanuel Maggiori (link), today. The post is short enough to quote in full here:
There’s a large market for “good enough” work (as opposed to excellent work). This includes for example, writing SEO-driven articles or designing banner images for blogs. High quality and thoroughness don’t matter in those cases. This is the work that will be most affected by AI, as people will use AI instead of hiring humans to do it.
It’s easy for AI enthusiasts to dismiss the uses of AI that Mr. Maggiori or I suggest as “not worth anyone’s attention” or “minor use cases”. They are worth attention and they’re not minor! Imagine the greater AI industry wants to pave over the entire United States. I think that gets the scale of their ambition right. Blank page filling use cases, by comparison, want to pave over California, Oregon, Washington, and Rhode Island for good measure. These use cases are very ambitious, with plenty of good work to go around for anyone who wants to attack them! You don’t have to pretend LLMs can think to keep busy.
Meanwhile, 95% of AI projects are failing. They’re failing because they are too ambitious. They’re pretending that “AI” is intelligent, thus ignoring the “effective” side of the coin.
The AI Super Power works like this:
You know what tasks generative AI is actually good at.
You have the chill to not endorse tasks it’s not good at.
I know plenty of people on the “critic” side who claim that AI isn’t capable of doing the things it’s good at, or who rake me over the coals for wanting to use it for tasks humans will not even do because the tasks are too low value. I can’t help you if you can’t recognize the value of processes you can replicate with 100% success at home.
I have a very good friend of almost 40 years. He will probably end up reading this some time. Two years ago, when I was trying to explain what AI is good for, I came up with a one-word description: “delight”. It’s consistent with what I’ve settled on. My friend then started coming to me with tasks that AI is not good for, claiming they would be delightful to him.
I won’t call that intellectually dishonest or purposely ignoring my point, because I’m kind. I will call it not having the requisite chill. This wasn’t two old friends bantering over beers trying to solve the world’s problems. We were trying to find a good business idea for AI. As of now, we still try. If we ever settled on an idea that didn’t meet the chill requirement, I would waste a lot of time and he would lose a lot of money. Funny enough, it would also be my fault when we failed.
I do not know a lot of people who actually have this super power. Prior to reading and commenting on Mr. Maggiori’s post, then writing this article, I hadn’t framed it this directly. I have felt like I was on Brad Island with a strange set of beliefs that are difficult to share with, let alone inculcate into others. I know that it takes around 6 months with non-technical folks I engage deeply with on AI topics to move them into the vicinity of Brad Island. People’s natural priors of trust in technology and faith in “artificial intelligence” are hard eggs to crack.
If you accept what I claim about this being a super power, I can help you and your company. I don’t just arrive with a vision. I have actual software that will help you understand AI and develop that same super power. Message me on LinkedIn if you’re interested.