Don't Finish My Sandwiches

29 Mar 2025
Components of a sandwich in an exploded vertical stack, including lettuce and a spanner.

When they first came along, I was sceptical about the merits of using an AI (specifically, a Large Language Model) to help you code. However, the world doesn’t stand still, especially the world of technology, and it’s been a busy two and a half years. Coding assistants have become more sophisticated and widespread, and for a while I’ve been dipping my toe into the water and trying them out myself. As such, I thought it was time for an update.

My assistant of choice is GitHub Copilot, not through any sophisticated evaluation process, but purely through convenience. As a user of both GitHub and VS Code, I’ve reached the point where it’s more effort not to try it out in some capacity. I’ve been doing so on and off for a few months, on small tasks.

One thing that quickly struck me was poor performance in the functionality most people would first associate with coding assistants: autocomplete. As with many things LLM, this produces results that at first appear beguiling, verging on the miraculous, but fall apart on closer inspection.

To illustrate, here are a couple of recent examples from my book log (which I happen to edit in VS Code). First, a result that genuinely impressed me (the final line is the autocomplete suggestion, shown in italics):

- title: "All Systems Red: The Murderbot Diaries"
  author: Martha Wells
  url: https://www.marthawells.com/murderbot1.htm
  start: 2025-01-28
  end: 2025-01-30

- title: "Artificial Condition: The Murderbot Diaries"

Here, Copilot has suggested the title of the next book I’m reading based on the data in the previous entry. Not only has it formatted it correctly, it’s also factually accurate — “Artificial Condition” is indeed the next entry in Martha Wells’s excellent Murderbot series. While it’s possible that someone else keeps a log of the books they’ve read in a YAML file with the same keys and structure as mine, is also making their way through The Murderbot Diaries, and Copilot is just regurgitating an item from its training set, that seems unlikely. More likely, it’s actually working as advertised, and weaving together several different patterns (YAML syntax, properties of books, and the sequence of titles in that particular series) in a complex, multi-dimensional way. This is a great demonstration of how LLMs go way beyond simpler statistical techniques.

It turned out that in this case the prediction, while entirely reasonable, was incorrect, as I wasn’t reading the next Murderbot book immediately (I like to spread these things out). No big deal — even if autocomplete doesn’t get it right every time, it’s still useful, and I can just ignore the suggestion and I’m no worse off. However, when I supplied the title of the next book I was reading, the suggestion was somewhat less helpful:

- title: "Monsters: What Do We Do with Great Art by Bad People?"
  author: "Claire Dederer"
  url: https://www.penguinrandomhouse.com/books/669579/monsters-by-claire-dederer/

If Copilot had done what it appeared to do — found a URL for a site about the book in question — it would have been at least as impressive as the previous example, if not more so. Fortunately, I had sufficient remaining scepticism to check the link, and confirmed that this was not the case. Instead, it produced something that looks like the right URL (this is the format of URLs used by Penguin), and it’s inserted the title and author in the way you’d expect them to appear. However, this is done without any reference to what’s on the other end of that link; the sequence of digits is basically random, and happens to resolve to “The Count of Monte Cristo”.

This is absolutely the way you’d expect an LLM to fail — in their basic form, the way they work is to produce a probable (and thus plausible) continuation of their input, with no reference to external reality. There’s some correlation between this probability and truth, but the correspondence is far from complete. More recent techniques ground the LLM with factual information in various ways, but as this example (by no means unusual) shows, the gap remains.

The above example is perhaps unfair, as it’s not really the kind of coding task that Copilot is targeted at. However, in my experience, using Copilot’s autocomplete while programming is worse. The kinds of errors that turn up are the same, but they have the potential to be both subtler and more widespread in their impact. More damningly, LLM-based autocompletion compares very poorly with what it’s replacing.

I clearly remember code completion coming into the mainstream with the advent of IntelliSense in the late 90s. Those of us who were already used to programming without it derided the idea as wasting more time than it saved, and we had a point. The predictions were often wrong, either in absolute terms or by being non-idiomatic and overly verbose. However, Microsoft and many others didn’t stand still, and the quality of these systems steadily improved. Fast forward twenty-five years, and traditional (non-LLM) code completion is astounding: fast, reliable and idiomatic. Advances in areas like type systems and static analysis (even in dynamic languages like Python and TypeScript) mean that suggestions are not only correct but informative — while it doesn’t obviate the need for documentation or understanding, code completion can serve as a valuable aide-mémoire, avoiding the need to look up minor details and giving you the confidence that your recollection is correct.

Replacing the traditional, structured approach with an LLM completely undermines this benefit. If it offers a list of named parameters for a function I’m calling, I can no longer view that as a reference, as it might have made some of them up. It will, of course, seem plausible, but that makes it more dangerous, not less. After a few days of trying to do actual programming work with Copilot completions, I turned them off and went back to the old kind. I’ve been very happy with that decision.

Does that mean that, for me, Copilot is a bust? Not at all. There’s another mode of interaction that feels like a far better fit to the strengths and limitations of generative AI, and the more I use it the more I’m convinced that it could live up to the hype. I’ve become a big fan of Copilot Chat.

Here, instead of code completion, you interact with Copilot in a separate dialogue in a sidebar. “Chat” is absolutely the right name, as it looks and feels like a text conversation with a human being (and, absurd as it may sound, treating it as such may well yield better results). You can ask it how to perform a task or use an API, and it will produce a lucid, coherent answer with examples. Because it’s running in the context of the IDE, it has access to your own code as context, and this is evident both in the incidentals (variable names and so on) and in the fact that it incorporates relevant information not explicitly stated in the question, such as the libraries you’re already using.

Crucially, this isn’t a one-shot process. You can ask questions about the results, ask for changes or clarifications, and even correct mistakes. This last point is, I think, the one that makes it feel like a better interface to Copilot. Today’s LLM-based systems are prone to hallucinations, and it may well be that this is a fundamental property that will never go away. With autocomplete, these render individual suggestions useless and the system as a whole untrustworthy. In the context of a chat, you still need to be aware of them, but there’s a way forward with that awareness.

The first attempt to exploit a new technology is usually to use it as a drop-in replacement for an existing application. Ben Thompson frequently cites the example of advertising on the Internet, which initially aped print advertising in placing display ads alongside content. In most cases, it’s a poor substitute, and the new technology only really takes off with the advent of a different approach that plays to the strengths of the new medium (in that example, feed and search ads).

Of course, it’s not a given that such an approach will be found; most technological developments turn out to have little or no long-term impact, whatever their boosters may think. For a while, I thought generative AI might fall into that camp, and my experience with code completion backed that theory. However, taking a step back, I’m now coming round to the idea that that was just the technology being misapplied. Chat interfaces, on the other hand, may or may not turn out to be the best way to take advantage of the capabilities of LLMs, but they’re enough to suggest that there’s some there there.


The header image is a combination of photos by Sara Cervera (sandwich) and Matt Artz (spanner) from Unsplash.

The Beginner's Mind

10 Feb 2025

One of the things I’m enjoying about dipping my toe into CAD is the chance to approach something as a beginner. I rarely get — or perhaps make — the opportunity to do something that’s both unfamiliar, and low enough stakes that I can take risks and take my time. I’m making a conscious effort to not bring my preconceptions from adjacent fields, and not rush towards the first thing that will achieve my practical ends (although I freely admit that I sometimes fall short on both counts). There’s a wide world of new concepts and ideas, and it would be a waste to simply give the ones I already have a fresh coat of paint. Once I know the lay of the land, I imagine I’ll be able to put my old experience to work alongside the new, but for now I’m going to go slowly, keep my eyes open and fresh, and learn.

Three Shots

31 Jan 2025

Most keys on most keyboards are labelled in some way. Usually, these labels (“legends”) are just printed on top of the keycap, the bit of the key you actually touch. But because you’re actually touching it, the legend wears off over time. As you move into fancier keyboards, there are various ways to remedy this. The most indestructible are double-shot keycaps, where the legend is actually a separate piece of plastic that goes through the main keycap like the letters in a stick of Blackpool rock. This can let your tasteful RGB lighting shine through, but more importantly means the legend will never wear out.

Producing intricately-enmeshed plastic shapes is an area where 3D printing in general, and multi-material 3D printing in particular, stands out, so since getting my printer I’ve been itching to try making some keycaps of my own. It also offered a perfect opportunity to try out my new 0.2mm hotend (the default is 0.4mm).

Rather than trying a full set right off the bat, I decided to go for something more modest. I occasionally use my first mini-keyboard as a macropad, but recently realised that the otherwise-unused outer columns of my Corne would do just as well. Updating the firmware was straightforward¹, so I just needed half a dozen keycaps with custom symbols.

OpenSCAD coupled with colorscad seemed like the tools for the job, and the comprehensive KeyV2 project on GitHub provided a flexible way to create keycaps in a wide variety of profiles. Importantly, this includes the DSA profile that I was already using, so I started with that.

It was easy enough to add an embedded shape in a contrasting colour, but when I printed it I ran into an in-retrospect-obvious issue; the dish at the top of the key produces some very obvious layer artefacts when sliced:

Rendering of a DSA keycap, showing layer lines

This kind of artefact is unavoidable in FDM printing when making a surface at a shallow angle to the plane of the layers. You can sometimes minimise the impact by positioning your model at an angle, but that often introduces other problems — for example, here the sides of the keys aren’t flat, so I’d need to add supports. A more general approach is post-processing the print. I had a go at sanding my test prints down, but while this produced a decent-feeling keycap, I couldn’t get a visual finish I was happy with. At this point I could have moved on to more advanced finishing techniques like vapour smoothing, but I decided to think laterally.

I decided to forgo the dish, and make the top of the key flat. The keycaps could then be printed face down, which not only avoids the layer artefacts but also picks up a nice texture from the build plate. Moreover, as the legend only needs to go through the top surface rather than the entire height of the key, only the first few layers contain multiple colours. This significantly reduces both time and waste.
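To make that concrete, here is a minimal OpenSCAD sketch of the idea, using plain primitives rather than KeyV2 — so there’s no stem, and the dimensions, module names and the “F13” legend are all invented for illustration. Each color() block becomes a separate object when exported with colorscad, and the legend only reaches a millimetre below the flat top:

// A minimal sketch of the flat-topped, multi-colour legend idea, using plain
// OpenSCAD primitives rather than KeyV2. Dimensions, names and the "F13"
// legend are made up for the example; a real cap would also need a stem.

cap_w = 18;      // keycap footprint (mm)
cap_h = 8;       // overall height of the cap (mm)
legend_d = 1;    // depth of the legend below the flat top (mm)

module legend_2d() { text("F13", size = 5, halign = "center", valign = "center"); }

module legend_3d(extra = 0) {
    // The legend only reaches legend_d into the cap, so when printed face
    // down only the first few layers contain a second colour.
    translate([0, 0, cap_h - legend_d])
        linear_extrude(legend_d + extra)
            legend_2d();
}

// Cap body in black, with the legend volume removed so the colours don't
// overlap; colorscad exports each color() as a separate object for the slicer.
color("black")
    difference() {
        linear_extrude(cap_h, scale = 0.8) square(cap_w, center = true);
        legend_3d(extra = 0.01);  // slight overshoot for a clean boolean cut
    }

// The legend itself, in a contrasting colour. A third colour is just another
// color() block with its own shape.
color("white") legend_3d();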

With this approach in mind, I started putting together the symbols. Looking at my first double-shot design, I had a thought: why stop at two? Putting together three or more different colours would be tricky using conventional production techniques, but not for 3D printing. With nothing stopping me, I went ahead with a red-and-white colour scheme on black keys, and I’m pretty pleased with the result:

Five black keycaps with custom red and white legends

The keycaps look great straight from the printer, and have a nice feel thanks to the texture. The lack of a dish might make them less comfortable for regular typing (or perhaps not — most modern, mass-produced keyboards have flat keys and get away with it), but these are function keys, so it’s not an issue.

The OpenSCAD source file is here: CorneExtras.scad. I won’t hold it up as an example of good style; in particular, there’s more boilerplate and repetition than I’d like. I’m not sure whether this is due to my own lack of experience with the language, or its limitations. Regardless, it should hopefully serve as a useful starting point if you want to have a go at making something similar yourself.

A split Corne keyboard with the above keycaps installed in the outermost columns

  1. Generally speaking, I don’t hard-code key sequences into the firmware, but instead configure the macro keys as high-numbered function keys (F13, F14…) and map them to functions in BetterTouchTool. This allows for more advanced actions, and avoids the need to re-flash the firmware every time I want to reconfigure them.

You Wouldn't Steal a Boat

12 Jan 2025
3D-printed sign reading

A quick, tongue-in-cheek joke about the current absurdity surrounding the licensing of 3DBenchy, the until recently beloved de facto mascot of 3D printing. Available on MakerWorld (CC BY-NC-SA, of course).

Keyboard Tray and FreeCAD

4 Jan 2025
FreeCAD rendering of a monitor arm brace

Since going remote at the start of the pandemic, I’ve had a standing desk setup at home. Rather than an adjustable-height desk, I kept my existing desk and put the monitor on a tall VESA mounting arm. Combined with this, I’ve used a variety of stands to sit on the desk and raise the keyboard and trackball to elbow height. These work fine, but are a bit cumbersome to move out of the way when you want to use the desk at normal height. So, as a little holiday project, I decided to upgrade to an arm-mounted keyboard tray.

My initial plan was to mount this on the same pole as the monitor, but it ran into a snag — the arm I’d found was from a different manufacturer, and was designed for a pole a few millimetres thicker. What I needed was a plastic sleeve, and as it happens I now have a way to make random plastic parts on demand. A few lines of OpenSCAD and twenty minutes of printing, and I had a simple sleeve to adapt the arm to the existing pole. Problem solved. Well, the first problem, anyway.
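For a sense of what “a few lines” means in practice, a sleeve like this is little more than the difference of two cylinders. The sketch below uses placeholder dimensions rather than the real measurements of my setup:

// Illustrative sleeve to pad a pole out to a larger clamp diameter.
// All of these measurements are placeholders, not the actual ones.

pole_d  = 35;    // diameter of the existing pole (mm)
clamp_d = 38;    // diameter the new arm's clamp is designed for (mm)
length  = 60;    // length of the sleeve (mm)
fit     = 0.3;   // clearance so the sleeve slides onto the pole (mm)

$fn = 120;       // enough segments for smooth circles

difference() {
    cylinder(h = length, d = clamp_d);
    // Bore for the pole, extended past both ends so the cut is clean
    translate([0, 0, -1]) cylinder(h = length + 2, d = pole_d + fit);
}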

When I connected everything back up, it became rapidly apparent that the action of typing caused the monitor to wobble quite a bit. This would be annoying at the best of times, and even worse on video calls (as the camera is on top of the monitor). I reconfigured everything so that the keyboard was on a separate pole, but this didn’t solve the issue; both were attached to the desk, and the vibrations from typing still wobbled the screen to an unacceptable degree. Clearly more was needed.

Taking a step back and looking at it, I concluded that the best approach would be to add an additional brace attaching the top of the pole to the wall (the bottom remains fixed to the desk; I didn’t want to ditch the pole entirely as it allows the height of the keyboard to be adjusted). Again, the ability to make custom parts to the exact specifications needed saved the day — I’m starting to see that this will be a complete game changer for DIY jobs around the house.

I could have designed the brace in OpenSCAD, but I took the opportunity to try out FreeCAD. Where OpenSCAD is essentially a programming environment for Constructive Solid Geometry, FreeCAD is a lot closer to a traditional CAD system. Given my background and prior experience, I’d expect the former to be more natural, but to my surprise I found the latter to be a better fit for more complex designs. It’s early days, and I suspect that I’ll end up using both packages in different contexts, but from what I’ve seen so far the extra power and flexibility of FreeCAD is worthwhile even for a beginner like me.

This capability comes at the cost of complexity; my mental model of how FreeCAD actually works is very much a work in progress, and the UI is a mass of toolbars, views and states that I’m still groping around in as if blindfolded. How much of this complexity is inherent to a “real” CAD package, and how much is due to shortcomings specific to FreeCAD, remains to be seen. I might take a look at Fusion 360 as a point of comparison, but its “free for hobbyists” licence makes me wary of investing too much time into it, at least at this stage.

Back to the keyboard tray problem, I got a first version designed pleasingly quickly, but on printing it out realised that I’d based it on the measurements of the narrower pole and so the piece wouldn’t fit. Suitably chastened, I fixed the mistake, and while I was there added a few refinements like countersinking the screw holes and adding fillets. This time, I stopped the print after the first few layers to provide a piece that could physically confirm the sizing. It was spot on, so I printed the whole thing, and installed it.

I’m pleased to report that it worked like a charm — a rock solid keyboard tray arm, and no vibration of the monitor. It was a bit more of a roundabout route than originally intended, but I got there in the end, and learned some things along the way. I’ll call that a win.

This site is maintained by me, Rob Hague. The opinions here are my own, and not those of my employer or anyone else. You can mail me at rob@rho.org.uk, and I'm @robhague@mas.to on Mastodon and robhague on Twitter. The site has a full-text RSS feed if you're so inclined.

Body text is set in Georgia or the nearest equivalent. Headings and other non-body text is set in Cooper Hewitt Light (© 2014 Cooper Hewitt Smithsonian Design Museum, and used under the SIL Open Font License). BBC BASIC code is set in FontStruction “Acorn Mode 1” by “p1.mark” (licensed under a Creative Commons Attribution Share Alike license).

All content © Rob Hague 2002-2025, except where otherwise noted.