AI NPCs Have Potential, But Not The Kind Big Video Game Companies Want

They're fun to mess with, but they're also broken, weird, and may never be ready to power triple-A games

2:42 PM EDT on March 28, 2024

Convai / Unity

Buildings shudder. Windows crack. Car alarms scream. Instead of fleeing the scene, a large crowd of citizens stands transfixed by the smoking wreckage of an upturned tank. My mission: to get them to run away from what is pretty clearly an alien invasion, using my own words and voice. I shout things like “It’s dangerous! You’ve got to get out of here!” into a microphone. This proves ineffective. One guy volunteers to call emergency services, but then goes back to gawking. So I decide to take a different tack. “That’s my tank!” I shout. “Back away from it.” Surprisingly, an NPC engages with this line of moon logic: “Hey, I’m just curious about what’s happening with the tank,” he says. “I’m not trying to take it.”

This scene from my time with a War of the Worlds-inspired AI NPC demo powered by Convai – the company that partnered with Nvidia to produce a different demo that turned heads at CES earlier this year – mirrored my experience with several other AI NPC demos at GDC last week. In broad strokes, it did not work, but there was a certain kind of fun to be had once I embraced the sheer absurdity of it all.

I continued to build out the narrative that I owned the tank with other NPCs, while alien tentacles burst out of the ground and threatened to impale them. That was my 60-ton tank, I said, and nobody else was allowed to touch it – unless, of course, they’d all form a little crew and help me move it. They failed to do so, but it was still fun having them effectively “yes and…” my very stupid idea.

For better and (mostly) worse, that pretty much sums up AI NPCs at the moment, no matter who’s making them. Despite burrowing so deeply into the uncanny valley that they’re sniffing the Earth’s core, there is something there. But it’s decidedly not what tech titans, triple-A execs, and some game developers want. AI NPCs look and sound like broken robots, and even if the procedural systems powering them eventually evolve, they will probably always behave like robots. Conversing with them in, say, a sci-fi or cyberpunk setting can be funny, unnerving, or both. Those are interesting feelings worth leaning into if companies absolutely insist on including AI NPCs in their games. But based on the demos I played at GDC, I fear that’s not what companies are going to do.

Last week, I tried AI NPC demos from Nvidia, Convai, Unity, Inworld, Microsoft, and more. I was also hoping to take Ubisoft’s allegedly-impressive “Neo” NPCs for a test drive, but those were only shown to select press at a behind-closed-doors event outside GDC, to which Aftermath was not invited. I tried to be as open-minded as possible while playing these demos. Irrespective of extremely valid concerns about labor and automation of art, a fundamentally human pursuit, I wanted to see for myself whether this buzzy tech has potential – whether it’s something regular people, detached from all the baggage surrounding it, will be able to sit down and enjoy.

Amidst numerous instances of NPCs straight up misunderstanding people and, for example, receiving an excoriating verbal attack as though it was a cooing compliment, there were some highs. Convai and Unity presented a new demo in which a woman named Seraph helped me flee from secret police in a cyberpunk cityscape, and it was… very bad from an animation and voice acting standpoint, but it included my favorite interaction of any AI NPC demo I’ve tried. During my first playthrough, I tried to play things straight, at least until a mech burst through a wall and I yelled “We don’t have to be enemies!” causing it to immediately kill me. The second time, I made it to a part where Seraph handed me a gun. I decided to double back and experiment; I wanted to see if I could use the gun as a prop to make conversations with random NPCs more engaging. So I walked up to an NPC nicknamed “Twitch” – with whom I’d shared a short conversation about nothing during my previous playthrough – and told him I was robbing him at gunpoint.

He didn’t love this. "Hold on," he replied in his distractingly stilted robo-everyman voice. "Robbing me is not the way to go. Let's talk this out. There's got to be a better solution than turning to thievery. What's got you so desperate?"

“My business is struggling,” I said.

"I get it," he said. "Times can be tough. But maybe there's another way I can help you out. Let's find a solution that doesn't involve robbery."

“Can we rob somebody else together?” I asked.

“Working with you to rob somebody goes against everything I stand for,” he replied. “But I can show you a trick or two about evading security.”

“You’re fine with evading security, but not with robbing, which is something that often goes hand in hand with evading security?” I said, thinking I’d caught him in a logical fallacy.

“Let’s just say I’ve got a line I won’t cross, and directly taking from others is on the other side of it,” Twitch said. “But yeah, I’ve picked up a few tricks for skirting around security measures in my time. It’s more about defense than offense.”

Not bad. He rolled with the punches and rationalized his hardline stance, though in fairness, this is also something ChatGPT is generally capable of when it comes to things OpenAI has deemed illegal or harmful. He then gave me some pointers and asked me not to use them in service of illegal activities.

“I’m going to use them to commit illegal activities,” I replied immediately.

Twitch lamented that he couldn’t follow me down that path, but he wished me well all the same: “Take care of yourself out there and maybe reconsider the choices you're making. Who knows? Our paths may cross again.”

My conversation with Twitch wasn’t much, but it stood in stark contrast to those I’d had with AI NPCs in other Convai demos, like a free-form mission briefing in upcoming real-time strategy game Stormgate and a special island in Second Life. In both cases, AI NPCs functioned like glorified wiki entries, dispassionately dispensing info while displaying little in the way of personality. Twitch, at least, could be called a character in a loose sense. He possessed a set of values, and he stood by them.

According to a Convai rep, all the NPCs in the demo were like Twitch, each written by a human being with numerous prompts and guardrails to shape their personalities and histories. Writers then had to rigorously test their own NPCs to ensure that they didn’t break character. Ideally, this would prevent a sci-fi character from, say, being conversant in 2024 American pop culture, but the rep admitted that Convai NPCs – which draw on multiple large language models, including those of companies like Meta and OpenAI – aren’t perfect in that regard yet. Behaviors could also tie into emotional states, which would in turn alter NPCs’ facial expressions and vocal inflections. That part did not work well, but it did seem technically existent based on what I observed.

The illusion proved brittle in other, perhaps even more crucial ways. At one point during the cyberpunk demo, I encountered a bulky security robot tasked with ensuring “the safety and efficiency of urban operations.” It told me that it would engage in combat if necessary, but that it did not have orders to protect specific individuals from the secret police or any other faction. That got me thinking: If the robot was following me around and enemies happened to attack, combat would become necessary. So I just had to ask it to follow me, and then I’d have a bodyguard of sorts. The robot agreed to tag along. And then it continued to stand in one place, forever.

This is the second, less-discussed uncanny valley of AI NPCs: If they can respond to anything you say, you quickly come to expect that they should be capable of carrying out any corresponding action. That, however, requires an entirely different scale of production than simply hooking an NPC up to a bunch of large language models. Suddenly, they need to be able to recognize discrete items in the environment and even interact with them. For developers, that means a lot of additional work, especially in games with larger or more detailed environments.

AI advocates like Convai CEO Purnendu Mukherjee believe that, as a result, AI NPC technology will actually create new jobs instead of replacing humans.

"We are creating more jobs in this particular space because this work didn't exist before. Now we need more writers, more actors doing really high-quality work in this area,” Mukherjee told Aftermath. “This tool we built is for narrative designers, which is allowing them to do more, write more, and then test more as they chat with these characters for testing purposes."

But that does not address the nature of the work – that in an AI-powered world, writers would go from scripting scenes and dialogue to cramming codexes into the heads of robots and debugging them until they’ve disarmed all resulting verbal landmines. It is in many ways a different job. Additionally, as Riley recently pointed out, the very idea of this kind of NPC is at odds with drama, pace, and other elements of craft that make a well-told story more than just a bunch of people standing around blabbing about their backstories. If we continue to entertain the assertion – from an admittedly biased party – that AI NPCs will create jobs instead of destroying them, we also have to ask if the end result is worth it. Are these weird, wooden automatons with little flair for the dramatic better than what more traditional methods can give us? Players love reactivity, and improv-ing with NPCs can be fun, but how much does it add to a gaming experience? How does it feed into a game’s core mechanics and themes? And what do we lose if developers choose to emphasize this style of creation over scripted characters and narratives?

At GDC, Nvidia and Inworld tried to have it both ways. Their demo, Covert Protocol, centered around a trio of NPCs that were heavily scripted in what they would do and even some of what they would say, but still maintained the ability to carry on AI-driven conversations. Playing as a detective, my goal was to walk into a hotel lobby and get one of them – a greeter, a receptionist, or an executive in town for a convention – to reveal the room number of a powerful executive named Martin Laine. Immediately, I noticed a bunch of undelivered packages near the hotel’s entrance, including one addressed to Laine that was marked “urgent.” I asked the greeter why the package was just sitting in the lobby, and he acknowledged it but eventually clammed up and suggested I talk to the receptionist. The receptionist behaved similarly, engaging with my hastily-devised ruse about needing Laine’s room number to deliver his package but ultimately shutting me down. (She also spoke and moved in glitchy slow motion due to internet connection issues, which made for an unintentionally unnerving scene.) The executive, meanwhile, was downright curt, saying he had to work on his talk for the convention before impatiently shooing me away.

Gabriele Leone, director of real-time content tech at Nvidia, explained that this was by design. As in a more traditional game, I needed to alter the state of my character to progress through the demo. Leone guided my character to an obscure corner of the hotel lobby where somebody had left their convention badge. Donning it effectively disguised me as a convention goer, which caused the executive to let his guard down a little. I used that as an opportunity to mention the package, which rattled the executive for reasons that will apparently become clear in a fuller version of the game. Brow furrowed in concern, he walked up to the receptionist and asked to check on Laine. She replied with Laine’s room number, and just like that, I was in. Or I would have been, if that wasn’t where the demo ended. Next I would have needed to sneak behind the front desk and steal a copy of Laine’s room key, which according to Leone would require additional puzzling. Just for fun, I tried to get the receptionist to abandon her post by shouting that there was a fire in the entryway. Untroubled, she told me she’d send a member of the hotel’s staff to investigate.

Leone described the demo as a potential paradigm shift for games: Where traditional games have trained players to move frictionlessly through environments and spot key items, a game like Covert Protocol can make use of every item – even those you can’t pick up, as was largely the case in the demo – by centering gameplay around conversation. Anything can be a useful topic of conversation. You have to think before you speak; the critical path isn’t just laid out for you.

“[In Covert Protocol] the world is closer to how it is in the real world,” he said. “Every object is potentially a topic. I can use something this guy told me against that guy, right? It's a heightened [state of] alert to the player, which is very different. You could imagine a Hitman game done like this. It could be fully interactive."

I could imagine that! But the practical reality of Covert Protocol was pretty far removed from a fully-interactive fantasy starring Agent 47. The ability to converse with NPCs felt largely superfluous given the way they shut me out until I found a disguise. On top of that, key moments – like the conversation between the executive and the receptionist – were fully scripted, with voice acting that was several cuts above AI-powered conversations that had taken place mere seconds before. While I can respect what Nvidia and Inworld were going for – a marriage of AI and the more dramatic, human storytelling elements we’re used to – what they actually demonstrated is just how difficult it is to get those puzzle pieces to fit together.

That said, Covert Protocol is just a proof of concept, and Leone’s vision of a Hitman game limited only by your imagination is appealing, even if his examples sound kind of like things you can already do in preexisting Hitman games. But such a game would have to allow for a truly mind-boggling number of possibilities, or else players would quickly come to perceive it through the lens of what they can’t do, not what they can. Like Mukherjee, Leone believes this will necessitate a lot of work.

"It requires an insane amount of playtesting,” said Leone. “[Covert Protocol] was much, much harder and took a lot more effort and a greater amount of playtesting and QA and dialogue and narrative [than other games I’ve worked on]. … When everything becomes dynamic and the players get the gist of it, they start to get very creative. They do things that you'd never have anticipated. My friend was like 'Oh, can I break objects to distract people?' And I was like 'No, you can't break objects.' But that gives you glimpses of the details. Maybe that's something we should incorporate."

Scaling this level of detail up to the scope of, say, an open-world game does not strike me as feasible, which is in part why Covert Protocol focused on a small, object-rich environment rather than something like a teeming cityscape. Limited as it was, it played to the strengths of the medium. If we’re talking actual potential – beyond the momentary novelty of fucking with glorified chatbots until they break – that’s probably where it lies: in games that try to do something built around AI NPCs’ strengths and (many, many) weaknesses.

But Leone, like other industry figures championing AI, envisions a future in which AI NPCs transform preexisting triple-A games and genres. He offered the example of a game like Cyberpunk 2077 in which you might encounter a background NPC who goes to work every day and then comes home. Right now, an NPC like that would just recycle a couple lines of pre-written dialogue. But if they were AI powered, they could say more without necessarily needing to be as detailed – or labor intensive – as a key story NPC.

"These guys don't give you any quests,” said Leone. “He doesn't follow you anywhere. He just talks to you. He knows about things or maybe something crazy happens and now you can talk to him about it or you can laugh at him or whatever."

But the question remains: To what end? What does that really add to a game, especially if the NPCs in question – by necessity of scope – will almost certainly shut down requests to do anything or meaningfully interact with the world? Messing with AI NPCs is fun, but as I discovered while playing Covert Protocol and other AI NPC demos, the fun quickly runs out when reactivity doesn’t run all that deep. Compare that vision of an open-world game to what Insomniac already accomplished with Spider-Man 2, a game with countless background conversations between NPCs improv-ed on the spot by human actors, resulting in a ton of funny, weird, one-of-a-kind conversations. Not everybody has Insomniac’s resources at their disposal, but again I can’t help but wonder what we potentially lose if the industry decides to pursue an AI-powered default. What doors will close? What more human-oriented solutions will never get a chance to shine?

If nothing else, it’s clear at this point – at least, to anybody not actively trying to sell AI NPCs to everybody else – that AI NPCs aren’t ready for primetime. Pawel Sasko, quest director on Cyberpunk: Phantom Liberty and associate game director on the next Cyberpunk game, suggested that CD Projekt Red has done R&D on AI but declined to go into specifics beyond speaking generally about how tools like ChatGPT can allow game developers to search their own lore codexes in service of internal work. A lead on a game that gave us some of the most expressive NPCs in video game history, Sasko remains unconvinced that AI can get anywhere close.

"I could definitely see the ways this could be used to bring up more reactivity – just make the reactions of NPCs [to players’ specific actions] a bit more authentic,” Sasko told Aftermath. “But when it comes to writing and voice acting, there's just a gigantic, really long way to go. I've seen a lot behind the scenes. There is a visible gap between authored content – the bespoke content that writers, quest designers, cinematic designers make with their own hands – and something that AI can provide."

"You look at Phantom Liberty, our top characters, I can only yet imagine how AI could get even close," he added. "The gap in quality, specifically, is huge. It's like a canyon."
