Models like DALL-E dissociate ideation from implementation. Do we care?
There’s a puzzling disconnect in the many articles I read about DALL-E 2, Imagen, and the other increasingly powerful tools I see for generating images from textual descriptions. It’s common to read articles that talk about AI having creativity–but I don’t think that’s the case at all. As with the discussion of sentience, authors are being misled by a very human will to believe. And in being misled, they’re missing out on what’s important.
It’s impressive to see AI-generated pictures of an astronaut riding a horse, or a dog riding a bike in Times Square. But where’s the creativity? Is it in the prompt or in the product? I couldn’t draw a picture of a dog riding a bike; I’m not that good an artist. Given a few pictures of dogs, Times Square, and whatnot, I could probably photoshop my way into something passable, but not very good. (To be clear: these AI systems are not automating Photoshop.) So the AI is doing something that many, perhaps most, humans wouldn’t be able to do. That’s important. Very few humans (if any) can play Go at the level of AlphaGo. We’re getting used to being second-best.
However, a computer replacing a human’s limited Photoshop skills isn’t creativity. It took a human to say “create a picture of a dog riding a bike.” An AI couldn’t do that of its own volition. That request is where the creativity lies. But before writing off the creation of the picture, let’s think more about what that really means. Works of art really have two sources: the idea itself and the technique required to instantiate that idea. You can have all the ideas you want, but if you can’t paint like Rembrandt, you’ll never generate a Dutch master. Throughout history, painters have learned technique by copying the works of masters. What’s interesting about DALL-E, Imagen, and their relatives is that they supply the technique. Using DALL-E or Imagen, I could create a painting of a tarsier eating an anaconda without knowing how to paint.
That distinction strikes me as very important. In the 20th and 21st centuries we’ve become very impatient with technique. We haven’t become impatient with creating good ideas. (Or at least strange ideas.) The “age of mechanical reproduction” seems to have made technique less relevant; after all, we’re heirs of the poet Ezra Pound, who famously said, “Make it new.”
But does that quote mean what we think? Pound’s “Make it new” has been traced back to 18th century China, and from there to the 12th century, something that’s not at all surprising if you’re familiar with Pound’s fascination with Chinese literature. What’s interesting, though, is that Chinese art has always focused on technique to a level that’s almost inconceivable to the European tradition. And “Make it new” has, within it, the acknowledgment that what’s new first has to be made. Creativity and technique don’t come apart that easily.
We can see that in other art forms. Beethoven broke Classical music and put it back together again, but different; he’s the most radical composer in the Western tradition (except for, perhaps, Thelonious Monk). And it’s worth asking how we get from what’s old to what’s new. AI has been used to complete Beethoven’s 10th symphony, for which Beethoven left a number of sketches and notes at the time of his death. The result is pretty good, better than the human attempts I’ve heard at completing the 10th. It sounds Beethoven-like; its flaw is that it goes on and on, repeating Beethoven-like riffs but without the tremendous forward-moving force that you get in Beethoven’s compositions. But completing the 10th isn’t the problem we should be looking at. How did we get Beethoven in the first place? If you trained an AI on the music Beethoven was trained on, would you eventually get the 9th symphony? Or would you get something that sounds a lot like Mozart and Haydn?
I’m betting the latter. The progress of art isn’t unlike the structure of scientific revolutions, and Beethoven indeed took everything that was known, broke it apart, and put it back together differently. Listen to the opening of Beethoven’s 9th symphony: what is happening? Where’s the theme? It sounds like the orchestra is tuning up. When the first theme finally arrives, it’s not the traditional “melody” that pre-Beethoven listeners would have expected, but something that dissolves back into the sound of instruments tuning, then gets reformed and reshaped. Mozart would never do this. Or listen again to Beethoven’s 5th symphony, probably the most familiar piece of orchestral music in the world. That opening duh-duh-duh-DAH–what kind of theme is that? Beethoven builds this movement by taking that four-note fragment, moving it around, changing it, breaking it into even smaller bits and reassembling them. You can’t imagine a witty, urbane, polite composer like Haydn writing music like this.
But I don’t want to worship some notion of Beethoven’s “genius” that privileges creativity over technique. Beethoven could never have gotten beyond Mozart and Haydn (with whom Beethoven studied) without extensive knowledge of the technique of composing; he would have had some good ideas, but he would never have known how to realize them. Conversely, the realization of radical ideas as actual works of art inevitably changes the technique. Beethoven did things that weren’t conceivable to Mozart or Haydn, and they changed the way music was written: those changes made the music of Schubert, Schumann, and Brahms possible, along with the rest of the 19th century.
That brings us back to the question of computers, creativity, and craft. Systems like DALL-E and Imagen break apart the idea and the technique, or the execution of the idea. Does that help us be more creative, or less? I could tell Imagen to “paint a picture of a 15th-century woman with an enigmatic smile,” and after a few thousand tries I might get something like the Mona Lisa. I don’t think that anyone would care, really. But this isn’t creating something new; it’s reproducing something old. If I magically appeared early in the 20th century, along with a computer capable of running Imagen (though only trained on art through 1900), would I be able to tell it to create a Picasso or a Dalí? I have no idea how to do that. Nor do I have any idea what the next step for art is now, in the 21st century, or how I’d ask Imagen to create it. It sure isn’t Bored Apes. And if I could ask Imagen or DALL-E to create a painting from the 22nd century, how would that change the AI’s conception of technique?
At least part of what I lack is the technique, for technique isn’t just mechanical ability; it’s also the ability to think the way great artists do. And that gets us to the big question:
Now that we have abstracted technique away from the artistic process, can we build interfaces between the creators of ideas and the machines of technique in a way that allows the creators to “make it new”? That’s what we really want from creativity: something that didn’t exist, and couldn’t have existed, before.
Can artificial intelligence help us to be creative? That’s the important question, and it’s a question about user interfaces, not about who has the biggest model.