Adventures in machine-generated text: short-burst creativity, and why classical CS has it wrong

I’ve been reading Robin Sloan’s video game development diary with much interest. Partly because of the joy of creation that shines through – the world desperately needs more of that in these COVID-19 times. But also partly because he’s using machine learning to generate descriptions of castles for his video game.

A bit of background: the three people who actually do read this blog of mine (and the significantly larger number who hang out with me on Facebook and Twitter) might know that I’ve been whacking away at a similar problem ever since OpenAI released GPT-2. About a year ago I ended up using the 117M and 345M models to create a machine poet, which would then become the voice of a fictional machine poet in a novel, turning out lovely (if somewhat overfitted) stuff like this:

A LONG CLIMB

In a sharp gale from the wide sky apes are whimpering,
Birds are flying homeward over the clear lake and white sand,
Leaves are dropping down like the spray of a waterfall,
While I watch the long river always rolling on.
I have come three thousand miles away. Sad now with autumn
And with my hundred years of woe, I climb this height alone.
Ill fortune has laid a bitter frost on my temples,
Heart-ache and weariness are a thick dust in my wine.

It worked well for poems, right out of the box. I suspect this is because poetry relies on apophenia and interpretation, and is not as bound to formal structures as, say, a page of prose. I suspect. A large part of why I’m writing the next couple of novels with this poet (OSUN) is the joy of being surprised every now and then by what it chains together, and how, and then spending the evening with a good glass of arrack and a cigarette wondering where that particular signal came from, amidst all the noise. [1]
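For the curious, the fine-tuning itself is not much code. Here’s a minimal sketch of what a run like mine looks like, assuming Max Woolf’s gpt-2-simple library (not necessarily the tooling I used; the corpus filename is a placeholder):

import gpt_2_simple as gpt2

# Fetch the pretrained 345M model (newer releases of the library call it "355M").
gpt2.download_gpt2(model_name="345M")

# Fine-tune on a plain-text poetry corpus. "poetry_corpus.txt" is a placeholder name.
sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="poetry_corpus.txt",
              model_name="345M",
              steps=1000,
              run_name="poet")

# Sample from the fine-tuned model; a prefix nudges it toward a theme.
gpt2.generate(sess, run_name="poet", prefix="A LONG CLIMB", length=200, temperature=0.9)

Most of the work is in the corpus itself – cleaning it, formatting it, deciding which poets get to haunt the weights.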

But why stop at poems? Could it do whole stories? A friend and I briefly kicked around the idea of taking the CMU Movie Summary corpus and training on that.

A fairly trivial task, yes. But of course, 2020 rolled around, and turned out to be a rat’s ass of a year. Facing depression, quarantine, work and general plans-falling-apart, this inquiry started to look like the XKCD graphic:

xkcd: Automation

So I placed my fiction-related GPT-2 inquiries on hold, and went back to examining these models for research into identity, impersonation and so on (which I was being paid to do). Robin’s experiments jarred me back in the nicest possible way. Let’s do a small recap, as much to remind myself as anything else.

YE OLDE YE OLDE

The classical CS approach to generation seemed to be to feed massive documents’ worth of text into a neural network – and then to report back that it had failed at the task of writing a coherent story or CS textbook, or whatever, but had managed to learn punctuation and grammar.

Part of me wants to shake my head. Perhaps with enormous amounts of compute and an ungodly number of layers you could end up with a novel-writer. Perhaps a few more breakthroughs and people like me will be able to grab some code off GitHub, read a paper or two, and boot up our very own Shakespeare 2.0 bot (in fact, I’ve written a story where this future happens, and the market shifts to post-human fiction).

But right now, these models have a problem: attention.

Let’s take the view that language is a way of denoting concepts and the relationships between them[2]. Writing a sentence or paragraph demands a sense of some concepts that make up the content of the text and some concepts that make up its presentation (grammar, and so on). A longer piece of text demands more concepts, and more kinds of them (plot, story beats, characters, themes). Each of these concepts carries its own trail of relationships, each tying back to concepts denoted before. Unless the text is extraordinarily simple or focused on just one topic, the space of concept-relationships rockets upwards in a steep curve.
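A crude back-of-the-envelope illustration, and only that: if every concept in a text could, in principle, relate to every other, the bookkeeping grows quadratically.

# Illustrative only: count potential pairwise relationships among n concepts.
for n in (10, 50, 100, 500):
    pairs = n * (n - 1) // 2
    print(f"{n:>4} concepts -> {pairs:>6} possible pairwise relationships")

# 10 concepts might cover a paragraph; 500 is closer to novel scale,
# and that is already 124,750 relationships before grammar enters the picture.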

Our current crop of models fails long before it gets to the novel mark: OpenAI stretched things to the length of a news article by throwing in enormous amounts of text data, and that seems to be where they are right now.

The most recent research seems to agree. The PlotMachines paper – Rashkin, H., Celikyilmaz, A., Choi, Y., & Gao, J. (2020), “PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking”, arXiv preprint arXiv:2004.14967 – takes a stab at generating stories from outlines, and the authors have arrived at a format that takes a short outline as input and generates a decent plot, staying within news-article length. I don’t think it can be extended to generate the entire story itself, or even a novella version.

Which brings me, in a very roundabout and self-referential way, back to how Robin is using this tech. He’s using it to generate descriptions of castles. In short, elements of worldbuilding. Not the whole story, but pieces of it.

He’s doing it right. It’s incredibly topical. It’s not a novel. The space of concepts to explore and the relationships to maintain have been whittled down to a razor-sharp focus. The texts are short, and the experiment is replicable without spending incredible amounts of time on feature engineering. And it’s given me a nice little reminder that there are ways, if you’re creative, to use this stuff.

WHY ALL THIS COMPLEXITY?

I have to step back a bit and say that we may, in pursuing this line of thought, be going after the proverbial fly with a bazooka. There’s always the possibility that simpler methods could match this level of output for less effort (collecting training data, and so on). Certainly there are far more examples of programs having done so. We’re now in procedural generation turf.

Let me explain. For the Salvage Crew, I pottered around the Internet until I found an excellent planet generator by a chap called Zarkonnen.

It’s fabulous. No machine learning, nothing more complex than some really nice text chains and a visualizer. Beautiful stuff. It’s a tiny little No Man’s Sky. Inspired, I wrote a little galaxy generator in R. It uses distributions to generate objects and chains planets to stars, generating not absolute positions, but a ‘galaxy as a community’.
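The actual generator is in R, but the idea fits in a few lines of anything. Here’s a rough Python sketch of the same approach – the spectral-class frequencies and distributions below are approximations chosen to illustrate, not my real parameters:

import random

# Rough main-sequence frequencies for the Harvard spectral classes (approximate).
STAR_CLASSES  = ["O", "B", "A", "F", "G", "K", "M"]
CLASS_WEIGHTS = [0.00003, 0.13, 0.6, 3.0, 7.6, 12.1, 76.5]

def make_star(name):
    star_class = random.choices(STAR_CLASSES, weights=CLASS_WEIGHTS)[0]
    n_planets = max(0, int(random.gauss(4, 2)))   # planet count drawn from a loose normal
    planets = [{"name": f"{name}-{i + 1}",
                "radius_earths": round(random.lognormvariate(0, 0.9), 2)}
               for i in range(n_planets)]
    return {"name": name, "class": star_class, "planets": planets}

# No coordinates, no absolute positions -- just a community of stars and their broods.
galaxy = [make_star(f"Star-{i:03d}") for i in range(50)]
print(galaxy[0])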


Stuff like this – procedural generation – is the beating heart of many famous efforts in the gaming industry, from Diablo to Minecraft to No Man’s Sky. And who can, of course, forget Dwarf Fortress? Things like these are largely rule-based, deterministic, largely explainable, and allow for finer control over the end result. They hew to Donald Knuth’s definition of an algorithm – among whose criteria is that a human should be able to compute the result in a finite time, given a pen and a piece of paper.

My result here was achieved with far less effort than it would have taken me to get an instance of GPT-2 to write me a galaxy or a planet – a little bit of Wikipedia, a bit of statistics, and zero training data involved.

SO DESU KA?

This sounds like a bit of a bummer, but it’s not, really.

My rambling is, of course, me trying to think. My belief is that this actually means there’s an enormous toolbox open to us in the short-text space. GPT-2? Yes! Markov chains? Also yes! Where the machine learning component really shines may not be the short text, but that middle sprint – the web-article-sized chunk of content that readily available tech like OpenAI’s GPT-2 is so capable of producing.

I don’t see why we can’t use hybrid approaches on some things. For example, imagine a story about a dungeon crawler where the dungeon, the loot and the enemies are generated by deterministic methods, but conversations with NPCs, and little snippets of backstory that you discover in books as you pass through, are written by machine learning. In fact, I hope Robin really gets around to this approach – different systems passing data to one another. I suspect the results will be incredible.
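Here’s a sketch of that kind of handoff, with the language-model call stubbed out (swap in whatever you have handy – a fine-tuned GPT-2 via something like gpt-2-simple, a Markov chain, anything that takes a prompt):

import random

ROOM_TYPES = ["armoury", "crypt", "library", "shrine", "barracks"]
LOOT = ["a rusted sword", "a cracked amulet", "a water-stained map", "nothing but dust"]

def generate_flavour_text(prompt):
    # Stand-in for a real model call, e.g. gpt2.generate(sess, prefix=prompt, ...)
    return f"[model output seeded with: '{prompt}']"

def make_room(rng):
    # Deterministic structure: the rules decide what the room is and what's in it.
    room = rng.choice(ROOM_TYPES)
    loot = rng.choice(LOOT)
    # The machine learning layer only writes the prose on top of that skeleton.
    prompt = f"You enter the {room}. Among the debris you find {loot}."
    return {"type": room, "loot": loot, "description": generate_flavour_text(prompt)}

rng = random.Random(42)   # fixed seed -> the same dungeon every run
dungeon = [make_room(rng) for _ in range(5)]
for r in dungeon:
    print(r["description"])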


As for me, I am significantly lazier. My poet-bot (now with Pablo Neruda in the mix) is fired up, the galaxy generator is ready, and I’ve got a character generator in the works for the second Salvage Crew novel. These systems don’t interact yet, except through me: I am sort of the connective tissue, and therein lies both the weakness and the strength. I’m happy to serve as the middle ground; it involves less effort on my part and is closer to the Kasparov-esque human+machine combo I want to get to.

In the meantime, even as I write this, I’ve got GPT-2 training on – well, for some reason it’s writing about game development right now. Seeing Robin’s experiment in action makes me enthusiastic again. The next step will be to replicate the Rashkin et al. paper and see how it does in practice.

Of course, we’re yet to answer the question: is it worth doing all this? Isn’t it easier to write this stuff by hand?

Of course it is. Just like it was once easier to calculate orbital velocities on paper than on computers.

We’ll get to someplace interesting.

Eventually.


[1] Gwern’s blog is an excellent resource for this stuff: https://www.gwern.net/RNN-metadata#finetuning-the-gpt-2-small-transformer-for-english-poetry-generation

[2] See: Bertrand Russell.
