Taking A (Random) Walk On The Vibe Side

A picture of a dimly lit room with a keyboard and monitor surrounded by half-empty beer glasses. A pixelated smiley face is grinning at the viewer.

Every weekend, I partake in the eternal battle of keeping the Grumpy Metal Children entertained. While it would be easy to let them roam like feral animals around the neighbourhood, we try to take them somewhere new and exciting in our local region, hoping that this will eventually lead to some kind of desire to see the world. The difficulty here is not herding them into the Grumpy Chariot, but in choosing where to take them. There are a lot of different places nearby - which one should we visit?

I know what you’re thinking, and it’s just what I was thinking too. This calls for a computer program to help me select random places to visit! I did briefly consider a map and a dartboard, but honestly, writing computer programs is way more fun. And besides, writing a web front-end can’t be that hard, right?

*** Mind drifts back a long long time… ***

Baby’s First UI

“What’s that?”

“It’s my new panel for editing the available locations in the system.”

“OK, cool… Er, it’s a bit… spartan?”

“Aha, there’s a reason for that. I figured that since we’ve got quite a few of these, I could write a generic UI to handle editing all the different types of static data in the system.”

“Ah, right… So it’s more like a generic database record editor. It doesn’t really look very nice though.”

“What do you mean? It does the job, that’s all we need, right?”

“I guess. Though how about you work some more on the server-side logic, and we may have a quick go at making this look a bit nicer for end-users.”

“Sure, no problem.”

*** Mind returns to the present ***

Ah yes, that’s right. I suck at UI stuff. But I don’t want a dartboard (cue images of Grumpy Metal Children randomly wandering around with darts sticking out of their heads). What’s a Grumpy Metal Guy to do?

After a brief search for “vibrator coding” led to some very ~~hot~~ interesting results, I refined my online reading to vibe coding instead. All the kids are doing it these days, and on the surface, it sounds great. Tell the computer what you want, wait for a little bit while you make a coffee, and when you come back, it’s all done. Perfect, doesn’t sound too hard. I mean, we all know that AI is really on the up, and that that many people couldn’t all be wrong.

For full disclosure, I’ve been using Cursor for my LLM experiments. I’m reasonably happy with it, although there are a few quirks that mean that I do lose my patience with it (the IDE, not the agent). But it’s working out OK so far. Then again, I’m typing this post up in Zed, and that seems promising too, so we’ll see which one I stick with going forwards.

Also worth pointing out that I have not run any of this text through an LLM for editing. Fuck that. I don’t want any of my writings to not be fully grumpy, and the text that comes out of LLMs all sounds completely homogenised.

A Change Of Focus

It turns out that, surprisingly, AI isn’t perfect. That’s right, I said it. As many people far wiser than I have written about, AI programming agents are sensitive to the prompts and context you give them, and do have a tendency to spit out code that doesn’t actually do what you want it to do. Instead, you need to think of Claude and friends as enthusiastic juniors, who get a lot of stuff right, but still need to be regularly course-corrected.

The prompt I gave it to start things off went something like: We’re going to start a new project. We will discuss requirements until we’re happy that we have everything we need to plan implementation. You should ask all the questions that need to be answered to obtain a full design spec. Assume that I’m an experienced developer who hates Java and likes heavy metal, but with no front-end development experience.

This sums me up I think! It also led to a fruitful discussion with Claude about more precise details of what I wanted, and we eventually settled on a series of requirements that I asked to be written out to a SPEC.md document.

Dramatic Programmae

The implementation though… that was some very long, drawn-out stuff. If I was following the guideline above for brewing a coffee and having a complete running program, I’d have had to brew (and drink, I’m not going to waste it all) around about 60 double espressos before it was all done.

Claude chose to use React for the front-end, Node.js for the backend, and Tailwind CSS for styling. Why did it have both a backend and a frontend for a relatively small project that wasn’t going to have to scale out to thousands of machines? No idea, it didn’t ask, and I didn’t think about it until a long time after it started working.

It carved out a functional-looking web app reasonably quickly, and to be fair, it worked pretty well. There was quite a lot of back and forth zooming in on what the frontend should look like and how the user could interact with the app. It did get fairly confused though at times, while also sounding incredibly confident in how right it was. It was convinced that things were working properly (“Perfect, you now have a fully functional and working world class application!”) even though it returned no results when clicking on one of the buttons. Enthusiastic? Yes. Correct? Not even close.

Authentication Fail

The real fun though was when I asked it to add authentication and users to the app. It responded to my request, added bits of authentication to the various REST endpoints, and then told me to have a go. Unfortunately, it didn’t add it to all of them, and seemed to struggle for a long time to work out why things weren’t working on the app. If prodded enough and provided with enough error messages pasted from the console, it would eventually exclaim that “You’re right, it’s not working properly. It looks like we haven’t added authentication to this endpoint.”

I know, I know. Context lengths and so forth mean that it can’t keep the whole thing in its head when trying to suggest new things. That’s fair enough. It’s just hard sometimes to keep that in mind when you’re “in flow” and trying to churn out new features.
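One way to avoid the “forgot to protect that endpoint” problem is to make authentication the default, so a route has to opt out rather than opt in. The following is a minimal sketch of that idea and not what the app actually does: the `Router` class, the `isPublic` flag, the route paths and the hard-coded token check are all hypothetical, and there is no real web framework involved.

```typescript
// Hypothetical request/response shapes, not tied to any framework.
type Req = { path: string; headers: Record<string, string> };
type Res = { status: number; body: string };
type Handler = (req: Req) => Res;

// Placeholder check (assumption): a real app would verify a session or a
// signed token here, not compare against a hard-coded string.
function isAuthenticated(req: Req): boolean {
  return req.headers["authorization"] === "Bearer let-me-in";
}

class Router {
  private routes = new Map<string, Handler>();

  // Every route is wrapped in the auth check unless it is explicitly
  // registered as public, so a forgotten flag fails closed.
  add(path: string, handler: Handler, isPublic: boolean = false): void {
    const guarded: Handler = isPublic
      ? handler
      : (req) =>
          isAuthenticated(req)
            ? handler(req)
            : { status: 401, body: "Unauthorized" };
    this.routes.set(path, guarded);
  }

  handle(req: Req): Res {
    const handler = this.routes.get(req.path);
    return handler ? handler(req) : { status: 404, body: "Not found" };
  }
}

// Example routes (hypothetical paths).
const router = new Router();
router.add("/api/health", () => ({ status: 200, body: "ok" }), true);
router.add("/api/places", () => ({ status: 200, body: "[]" }));
```

With this shape, adding a new endpoint and forgetting about authentication leaves it locked down instead of wide open, which is the less embarrassing failure mode.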

Remember To Tidy Your Room Claude!

Eventually though, I realised that a lot of the problems that I was stumbling over were due to the separation of the front and back ends. This was engineering overkill, so I asked it to merge the two together.

Claude managed to do this quite happily. But my god, the mess it left behind. It was already a bit untidy, writing up little helper scripts to do various bits of debugging, then leaving them behind in the project folder. With the great merge, several of the main source scripts were no longer required, so were removed. The libraries that those scripts used were not removed from the project dependencies though, and test files and other helper scripts were left lying around like a dirty pair of underwear on the bedroom floor.

Not the end of the world I guess, but I wanted to commit this thing into a public repository on GitHub, so had to tell it to go and tidy up after itself. Several times. First the libraries from the project dependencies. Then, with those libraries removed, the test scripts and test data that went with them. Then the helper scripts. It was all quite manual, and not exactly selling the whole “AI will fix your life” thing.
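The leftover-dependency part of that clean-up can at least be spotted mechanically. Here is a rough sketch, assuming you have already read the dependency names out of `package.json` and loaded the source files as strings; the function name `findUnusedDependencies` is made up for illustration. It only looks at import and require statements, so anything used purely from config files or the command line will be wrongly flagged, and a dedicated tool such as depcheck would be more thorough.

```typescript
// Returns the declared dependencies that none of the given sources import.
// Hypothetical helper: a plain text scan, not a real parser.
function findUnusedDependencies(declared: string[], sources: string[]): string[] {
  const imported = new Set<string>();

  for (const source of sources) {
    // Matches `from "pkg"`, `import "pkg"`, `import("pkg")` and `require("pkg")`.
    const pattern = /(?:from\s+|import\s+|import\(\s*|require\(\s*)["']([^"']+)["']/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) {
      const specifier = match[1];
      // Relative imports point at local files, not packages.
      if (!specifier || specifier.startsWith(".")) continue;
      // "@scope/pkg/sub" -> "@scope/pkg", "pkg/sub" -> "pkg"
      const depth = specifier.startsWith("@") ? 2 : 1;
      imported.add(specifier.split("/").slice(0, depth).join("/"));
    }
  }

  return declared.filter((name) => !imported.has(name));
}
```

Treat the result as a list of suspects to check by hand, not a list of things to delete.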

Slow Down Sunshine!

It also didn’t, by default, implement any form of rate limiting on the various public APIs that I was using (mainly Nominatim, which doesn’t like more than 1 request per second), resulting in my being blacklisted from using them for a day or so. It’s the small things like this that just take the edge off the experience a bit, even though it’s quite impressive that it knew about them in the first place (I didn’t!).

It even suggested waiting less than 1s between calls when I asked it to speed things up a bit, despite me having asked it already to slow things down because of rate limits. Again, I know, I know, context limits, etc. But still a bit frustrating.
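For what it’s worth, keeping to a limit like one request per second doesn’t need much code. The sketch below is one possible approach and not the code from the project: `MinIntervalGate` and `rateLimited` are names invented here, and `searchNominatim` in the usage comment stands in for whatever function actually makes the HTTP call.

```typescript
// Spaces out calls so that consecutive calls start at least
// `minIntervalMs` apart. Hypothetical helper, in-memory and per-process only.
class MinIntervalGate {
  private readonly minIntervalMs: number;
  private nextFreeAt = 0;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Reserves the next free slot and returns how long (in ms) the caller
  // must wait before using it. `now` is a timestamp in milliseconds.
  reserve(now: number): number {
    const startAt = Math.max(now, this.nextFreeAt);
    this.nextFreeAt = startAt + this.minIntervalMs;
    return startAt - now;
  }
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Wraps an async function so every call waits for its slot first.
function rateLimited<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
  minIntervalMs: number,
): (...args: A) => Promise<R> {
  const gate = new MinIntervalGate(minIntervalMs);
  return async (...args: A): Promise<R> => {
    const waitMs = gate.reserve(Date.now());
    if (waitMs > 0) await sleep(waitMs);
    return fn(...args);
  };
}

// Usage (assumption: searchNominatim is your own function that calls the API):
//   const politeSearch = rateLimited(searchNominatim, 1000);
//   await politeSearch("Stonehenge");
```

Because the gate lives in one place, “speeding things up” means changing a single number, which makes it much harder to accidentally drop below the limit somewhere else in the code.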

Verbosity Level: Verbal Diarrhoea

I also thought that I’d get it to write my docs for me. No-one likes writing docs, so this seemed like a simple way to save time and effort.

Holy. Shit.

Claude spewed out a massive README that I’ve left mostly intact in the repository, just to show what it was like. Total overkill for a small project like this one. Full of emojis, and help sections on debugging, setup, running in docker, and so much more.

I mean, documentation is good I guess, but I don’t expect a README for an open source project to be like reading a novella. That’s what the docs.rs link or similar is for. The README should be punchy and to the point, and any interested users can spend a bit more time reading the docs later if they’re actually interested in running the thing.

It also added an INSTALL.md file, which included a bit on how to set up nginx properly if you wanted to use a reverse proxy for it. Claude, if you’re reading this, not every project needs a full breakdown of how to configure nginx. If you’re in the business of using reverse proxies, you’ll know how to set things up properly. This project isn’t breaking new ground on the internet. Keeping things lean and mean is much more meaningful.

So, Does It Work?

Well… yes, it does. But that’s the wrong question to be asking. The real question is, is the code any good? The honest answer is, I have no fucking idea. None at all. It could be the worst TypeScript ever, and I wouldn’t be able to recognise it. The same goes for the choice of libraries - maybe they all contain key-loggers, and are secretly recording all your Bitcoin secrets. I’m moderately certain that neither of these things is true. I’m sure that the TypeScript is, at least, fine, and the same goes for the choice of libraries.

But this is the weird thing about vibe coding an app like this. I’ve got no attachment to it whatsoever. When I think about the project, it just all feels a bit empty. This is in complete contrast to something like The Great Docker Composer Backer-Upper, where I spent a weekend just riffing on the one idea, running it, making changes, fixing bugs, and being invested in it all.

Vibe coding just doesn’t work that way. At best, you’re a Product Owner¹, telling an enthusiastic (if not drunk) junior what they need to change for the next sprint. Do you feel any ownership over the finished product? Sure, I guess. But I really don’t care about how it was put together or what choices were made. And as such, I certainly couldn’t put this up on a public-facing website with any confidence that it wouldn’t be full of errors that might cause us to get hacked. Still, for a personal project intended for a home network, it works, does the job, and looks better than anything I could have done myself. So, trade-offs.

As someone at $COMPANY has said though, the trick is to not let AI replace your job, but treat it as a useful tool to help you work more efficiently - think a good pair programmer who doesn’t dominate the conversation, rather than a complete code creation factory. This has been my experience too. As Geoffrey Huntley wisely wrote, you should build up a standard library of good LLM prompts to help you achieve the outcome you want. I’m still not great at that, but am getting better with practice. And as the models themselves improve, and context windows get larger, they should be able to analyse more of a codebase before making suggestions. Now that I’ve been using LLMs for a short while, I’m a little more hopeful than I used to be that this will be a good thing.

If you’re feeling brave, take a look here - just, you know, don’t put it up on a public-facing web page.

¹ This is by no means an insult to Product Owners. You’re all great people, honestly!