On Technique

How might Copilot’s descendants change the craft of programming?

By Mike Loukides

August 9, 2022

A Lion Attacking a Horse, George Stubbs, 1762 (Yale Center for British Art)

In a previous article, I wrote about how models like DALL-E and Imagen disassociate ideas from technique. In the past, if you had a good idea in any field, you could only realize that idea if you had the craftsmanship and technique to back it up. With DALL-E, that’s no longer true. You can say, “Make me a picture of a lion attacking a horse,” and it will happily generate one. Maybe not as good as the one that hangs in an art museum, but you don’t need to know anything about canvas, paints, and brushes, nor do you need to get your clothes covered with paint.

This raises some important questions, though. What is the connection between expertise and ideation? Does technique help you form ideas? (The Victorian artist William Morris is often quoted as saying “You can’t have art without resistance in the materials,” though he may only have been talking about his hatred of typewriters.) And what kinds of user interfaces will be effective for collaborations between humans and computers, where the computers supply the technique and we supply the ideas? Designing the prompts to get DALL-E to do something extraordinary requires a new kind of technique that’s very different from understanding pigments and brushes. What kinds of creativity does that new technique enable? How are these works different from what came before?

As interesting as it is to talk about art, there’s an area where these questions are more immediate. GitHub Copilot (based on a model named Codex, which is derived from GPT-3) generates code in a number of programming languages, based on comments that the user writes. Going in the other direction, GPT-3 has proven to be surprisingly good at explaining code. Copilot users still need to be programmers; they need to know whether the code that Copilot supplies is correct, and they need to know how to test it. The prompts themselves are really a sort of pseudo-code; even if the programmers don’t need to remember details of the language’s syntax or the names of library functions, they still need to think like programmers. But it’s obvious where this is trending. We need to ask ourselves how much “technique” we will ask of future programmers: in the 2030s or 2040s, will people just be able to tell some future Copilot what they want a program to be? More to the point, what sort of higher-order knowledge will future programmers need? Will they be able to focus more on the nature of what they want to accomplish, and less on the syntactic details of writing code?
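To make the "prompt as pseudo-code" point concrete, here is a minimal sketch. The leading comment is the kind of prompt a programmer might write; the function body was written by hand for this illustration and is not actual Copilot output. The function name `most_common_words` and its behavior are assumptions chosen for the example. Whoever (or whatever) writes the body, the programmer still has to check it.

```python
# Prompt-style comment (hypothetical): return the n most common words
# in a text, ignoring case and punctuation.
import re
from collections import Counter


def most_common_words(text: str, n: int) -> list:
    """Return (word, count) pairs for the n most frequent words."""
    # Lowercase the text and keep only runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


if __name__ == "__main__":
    # The programmer's job: decide whether the generated code is correct.
    print(most_common_words("The cat and the hat. The end!", 1))
```

Note how much programming knowledge the prompt already encodes: "ignoring case and punctuation" is a design decision, and checking the result requires knowing what a correct answer looks like.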

It’s easy to imagine a lot of software professionals saying, “Of course you’ll have to know C. Or Java. Or Python. Or Scala.” But I don’t know if that’s true. We’ve been here before. In the 1950s, computers were programmed in machine language. (And before that, with cables and plugs.) It’s hard to imagine now, but the introduction of the first programming languages–Fortran, COBOL, and the like–was met with resistance from programmers who thought you needed to understand the machine. Now almost no one works in machine language or assembler. Machine language is reserved for a few people who need to work on some specialized areas of operating system internals, or who need to write some kinds of embedded systems code.

What would be necessary for another transformation? Tools like Copilot, useful as they may be, are nowhere near ready to take over. What capabilities will they need? At this point, programmers still have to decide whether or not code generated by Copilot is correct. We don’t (generally) have to decide whether the output of a C or Java compiler is correct, nor do we have to worry about whether, given the same source code, the compiler will generate identical output. Copilot doesn’t make that guarantee–and, even if it did, any change to the model (for example, to incorporate new Stack Overflow questions or GitHub repositories) would be very likely to change its output. While we can certainly imagine compiling a program from a series of Copilot prompts, I can’t imagine accepting a program that would be likely to stop working if it was recompiled without changes to the source code. Perhaps the only exception would be a library that could be developed once, then tested, verified, and used without modification–but the development process would have to restart from scratch whenever a bug or a security vulnerability was found. That wouldn’t be acceptable; we’ve never written programs that don’t have bugs, or that never need new features. A key principle behind much modern software development is minimizing the amount of code that has to change to fix bugs or add features.
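The stability requirement can be stated as a simple check: run the same build repeatedly and see whether the output ever differs. The sketch below is only an illustration under stated assumptions. `deterministic_build` is a stand-in for a compiler (it just hashes the source), and `sampled_generation` is a hypothetical stand-in for a generator that samples from several plausible completions; neither models a real compiler or Copilot itself.

```python
import hashlib
import random


def deterministic_build(source: str) -> str:
    """Stand-in for a compiler: the output depends only on the source."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def sampled_generation(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a generator that samples its output."""
    candidates = [
        "return sorted(items)",
        "items.sort(); return items",
        "return list(sorted(items))",
    ]
    return rng.choice(candidates)


def is_reproducible(build, runs: int = 50) -> bool:
    """True if repeated builds of the same input give identical output."""
    outputs = {build() for _ in range(runs)}
    return len(outputs) == 1


if __name__ == "__main__":
    rng = random.Random(0)
    print(is_reproducible(lambda: deterministic_build("print('hi')")))
    print(is_reproducible(lambda: sampled_generation("sort a list", rng)))
```

All three candidate completions may well be correct, but they are not identical, and a build process that cannot promise identical output gives a maintainer nothing fixed to test against.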

It’s easy to think that programming is all about creating new code. It isn’t; one thing that every professional learns quickly is that most of the work goes into maintaining old code. A new generation of programming tools must take that into account, or we’ll be left in a weird situation where a tool like Copilot can be used to write new code, but programmers will still have to understand that code in detail because it can only be maintained by hand. (It is possible–even likely–that we will have AI-based tools that help programmers research software supply chains, discover vulnerabilities, and possibly even suggest fixes.) Writing about AI-generated art, Raphaël Millière says, “No prompt will produce the exact same result twice”; that may be desirable for artwork, but it is destructive for programming. Stability and consistency are requirements for next-generation programming tools; we can’t take a step backwards.

The need for greater stability might drive tools like Copilot from free-form English language prompts to some kind of more formal language. A book about prompt engineering for DALL-E already exists; in a way, that’s trying to reverse-engineer a formal language for generating images. A formal language for prompts is a move back in the direction of traditional programming, though possibly with a difference. Current programming languages are all about describing in great detail, step by step, what you want the computer to do. Over the years, we’ve gradually progressed to higher levels of abstraction. Could building a language model into a compiler facilitate the creation of a simpler language, one in which programmers just described what they wanted to do, and let the machine worry about the implementation, while providing guarantees of stability? Remember that it was possible to build applications with graphical interfaces, and for those applications to communicate across the Internet, before the Web. The Web (and, specifically, HTML) added a new formal language that encapsulated tasks that used to require programming.

Now let’s move up a level or two: from lines of code to functions, modules, libraries, and systems. Everyone I know who has worked with Copilot has said that, while you don’t need to remember the details of the programming libraries you’re using, you have to be even more aware of what you’re trying to accomplish. You have to know what you want to do; you have to have a design in mind. Copilot is good at low-level coding; does a programmer need to be in touch with the craft of low-level coding to think about the high-level design? Up until now that’s certainly been true, but largely out of necessity: you wouldn’t let someone design a large system who hasn’t built smaller systems. It is true (as Dave Thomas and Andy Hunt argued in The Pragmatic Programmer) that knowing different programming languages gives you different tools and approaches for solving problems. Is the craft of software architecture different from the craft of programming?

We don’t really have a good language for describing software design. Attempts like UML have been partially successful at best. UML was both over- and under-specified, too precise and not precise enough; tools that generated source code scaffolding from UML diagrams exist, but aren’t commonly used these days. The scaffolding defined interfaces, classes, and methods that could then be implemented by programmers. While automatically generating the structure of a system sounds like a good idea, in practice it may have made things more difficult: if the high-level specification changed, so did the scaffolding, obsoleting any work that had been put into implementing with the scaffold. This is similar to the compiler’s stability problem, modulated into a different key. Is this an area where AI could help?
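The scaffolding problem is easy to reproduce in miniature. In the sketch below, all class and method names are hypothetical: `OrderRepositoryV1` and `OrderRepositoryV2` stand in for scaffolding generated from two revisions of a design in which a single method was renamed, and `HandWrittenLogic` stands in for the implementation a programmer wrote against the first revision. This is not how any particular UML tool worked, only an illustration of the failure mode.

```python
from abc import ABC, abstractmethod


class OrderRepositoryV1(ABC):
    """Hypothetical scaffold generated from the first design revision."""

    @abstractmethod
    def find_by_id(self, order_id: int) -> dict: ...


class OrderRepositoryV2(ABC):
    """Hypothetical scaffold regenerated after the design renamed a method."""

    @abstractmethod
    def get_order(self, order_id: int) -> dict: ...


class HandWrittenLogic:
    """Implementation work written against the first scaffold."""

    def __init__(self) -> None:
        self._orders = {1: {"item": "book"}}

    def find_by_id(self, order_id: int) -> dict:
        return self._orders[order_id]


class RepoAgainstV1(HandWrittenLogic, OrderRepositoryV1):
    pass


class RepoAgainstV2(HandWrittenLogic, OrderRepositoryV2):
    pass


if __name__ == "__main__":
    print(RepoAgainstV1().find_by_id(1))  # works against the old scaffold
    try:
        RepoAgainstV2()  # get_order was never implemented
    except TypeError as err:
        print("regenerated scaffold broke the implementation:", err)
```

A one-word change in the design invalidates the hand-written class, even though none of its logic is wrong.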

I suspect we still don’t want source code scaffolding, at least as UML envisioned it; that’s bound to change with any significant change in the system’s description. Stability will continue to be a problem. But it might be valuable to have an AI-based design tool that can take a verbal description of a system’s requirements, then generate some kind of design based on a large library of software systems–like Copilot, but at a higher level. Then the problem would be integrating that design with implementations of the design, some of which could be created (or at least suggested) by a system like Copilot. The problem we’re facing is that software development takes place on two levels: high-level design and mid-level programming. Integrating the two is a hard problem that hasn’t been solved convincingly. Can we imagine taking a high-level design, adding our descriptions to it, and going directly from the high-level design with mid-level details to an executable program? That programming environment would need the ability to partition a large project into smaller pieces, so teams of programmers could collaborate. It would need to allow changes to the high-level descriptions, without disrupting work on the objects and methods that implement those descriptions. It would need to be integrated with a version control system that is as effective for the English-language descriptions as it is for lines of code. This wouldn’t be thinkable without guarantees of stability.

It was fashionable for a while to talk about programming as “craft.” I think that fashion has waned, probably for the better; “code as craft” has always seemed a bit precious to me. But the idea of “craft” is still useful: it is important for us to think about how the craft may change, and how fundamental those changes can be. It’s clear that we are a long way from a world where only a few specialists need to know languages like C or Java or Python. But it’s also possible that developments like Copilot give us a glimpse of what the next step might be. Lamenting the state of programming tools, which haven’t changed much since the 1960s, Alan Kay wrote on Quora that “the next significant threshold that programming must achieve is for programs and programming systems to have a much deeper understanding of both what they are trying to do, and what they are actually doing.” A new craft of programming that is focused less on syntactic details, and more on understanding what the systems we are building are trying to accomplish, is the goal we should be aiming for.
