I’m still mulling over the issue of AI in relation to environmental constraints, but I thought it might be nice at this point to remind ourselves what those of us who see value in looking at video games more intelligently are up against when it comes to their portrayal in the mainstream media.  To that end, check out. . .

The Biggest Threats to America’s Youth

Be sure to check out some of the “discussion” in the comments that follows (that may be lending it more dignity than it deserves, but whatever).  What I found really amusing was that so many people didn’t seem to understand that this was satire.  Which is perhaps in itself a tragic commentary on how accurate this table is.

–Twitchdoctor

I think we’re getting into the ever-present, ever-frustrating topic of corporate influence on game development. Maybe it’s a cop-out, but I can see how “good AI” development can be pricey and thereby unappealing to developers (specifically development firms) driven by profit. As much as I would argue that games are the work of an artist or a group of artists, I do recognize a significant difference between artists in the traditional sense and game developers: money. Even the most famous of traditional artists starved, often surviving only on their love for what they did. Please do correct me if I’m wrong, but I have the impression that there are few if any starving artists in the game development community with enough passion and resources to invest the time and money in developing better AI without knowing beyond a shadow of a doubt that it would make them rich.

However, the above is my belief about the development of a perfect (or close to perfect) single artificial entity, a bot. Because of corporate interests and the easy alternatives that Twitchdoctor pointed out, I don’t think we will see development companies focusing on making the bots in their games ‘think’ rather than simply giving them more health, stronger weapons, better aim, and of course, more grenades. Twitchdoctor’s post (Good AI, Bad AI) does present a powerful alternative to adjusting the bots, though: changing the conditions of the game. As in Twitchdoctor’s example of Thief, the conditions of the game can be changed to raise the difficulty and to substitute for (or at least distract from) imperfect AI.

Good AI, Bad AI

Posted: September 21, 2009 by Twitchdoctor in Game AI

TwinHits’ comment on my previous post perhaps had an unstated question behind it: isn’t it time to stop tearing down other people’s AI tests and come up with something creative?  And the answer would be, yes. . .in a minute.

Just to be clear, I think that both the kind of Turing Test represented by the Loebner prize and the Botprize started out on the right track but are confusing two issues: one of them useful, the other less so.  Turing’s original question was twofold: can a machine think?  And if so, how would we know?  His hypothesis was that if a machine could offer responses that were indistinguishable from those of a human, it could be said to think.  Since then, most Turing Tests, such as the prize contests I have discussed, have taken it as axiomatic that these two things are connected.  However, I don’t think they necessarily are.  Moreover, the emphasis in these tests is always on the AI itself, on its level of intelligence.  Which seems to be the blindingly obvious point of the whole exercise.  But so far the blindingly obvious isn’t yielding very interesting results.

The “intelligence” issue may be interesting from the pure research perspective, although, as I pointed out, anyone really interested in evaluating that aspect would run the entire contest in a less tritely comparative fashion.  The end result is that these prizes now come across as little more than cheerleading exercises for our own human awesomeness: we are so smart, creative, flexible, adaptable, sophisticated, and cunning that no AI can yet fool us.  Even if by some miracle an AI did manage to fool people sufficiently to pass one of these tests, that still wouldn’t be a very useful result for the rest of us.  By and large, people do not interact with machines, much less their overall environment, in the form of a rigidly controlled testing procedure.  Therefore, gamers and game designers should be interested in the other side of this question: not the capabilities of the AI itself, but the ability of the AI to fool us.  In other words, less emphasis on the intelligence, more on the artificial (hence the name of this blog!).

The ability of the AI to fool us into thinking that it possesses some human characteristics is going to be based in part upon the inherent capabilities of the AI, naturally.  But my argument is that it has much more to do, ultimately, with the design of the environment in which the AI is to operate, and the corresponding latitude afforded the player.  Especially for gaming purposes, what counts is not how smart the AI really is but how smart it appears to be.  That perception is heavily shaped by the context of the game in general and that of the player in particular.  It is possible for a well-designed game to “fool” the player into feeling as if they have encountered some smart AI even when the pure technical capabilities of the AI are relatively rudimentary.

At this point, then, it might be useful to start identifying some examples of AI design strategies in games that are either particularly bad or particularly effective.  I’m going to start with a couple of my favorites; you’ll notice that in several of these examples some of the good and bad aspects I’m describing can be attributed to game design in general as much as to the AI design specifically.  But that’s my point.

BAD AI

Dumb Difficulty Substitutes for Smarts: One of my least favorite approaches to creating a challenging game is where the “difficulty” settings simply boost the stats of the AI (and/or correspondingly reduce those of the player).  The enemy doesn’t become smarter; it just becomes physically tougher and more accurate, its weapons deal more damage, and/or there are more of them.  This approach has been around virtually as long as electronic games themselves, and in small doses and in an appropriate environment it can provide a fun challenge.  But it has become the unimaginative default for game designers.  And sometimes it is so badly implemented that it produces unexpectedly hilarious side effects that completely destroy a player’s immersion in the game.
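In code terms, the whole approach boils down to a table of multipliers bolted onto unchanged behavior.  Here’s a minimal sketch (every name and number is my own invention, not any actual engine’s):

    from dataclasses import dataclass

    @dataclass
    class EnemyStats:
        health: float
        accuracy: float      # chance to hit, 0.0 to 1.0
        damage: float
        grenades_per_minute: float

    # "Difficulty" as pure stat inflation: the enemy's decision-making
    # never changes, only the numbers attached to it.
    DIFFICULTY_MULTIPLIERS = {
        "easy":    (0.5, 0.4, 0.5, 0.2),
        "normal":  (1.0, 0.6, 1.0, 0.5),
        "veteran": (2.0, 0.9, 1.5, 4.0),  # hello, grenade rain
    }

    def apply_difficulty(base: EnemyStats, level: str) -> EnemyStats:
        h, a, d, g = DIFFICULTY_MULTIPLIERS[level]
        return EnemyStats(
            health=base.health * h,
            accuracy=min(1.0, base.accuracy * a),
            damage=base.damage * d,
            grenades_per_minute=base.grenades_per_minute * g,
        )

Notice that nothing about the enemy’s tactics appears anywhere in that table.  That is precisely the complaint.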

One example is Call of Duty: World at War.  As with all the other titles in the series, the game features painstaking attention to historical detail as it recreates WWII battlefields in both the Pacific and on the Eastern Front.  The level design, sound design, and weaponry all make for a pretty immersive experience. . .until you crank it up to the highest difficulty level.  The actual quality and tactics of the AI don’t change significantly at any difficulty level.  But at the highest level, the Japanese AI suddenly starts up a passionate love affair with the grenade.

Picture this.  A scenario has you as a marine fighting to clear a Japanese-held island late in the war.  The defenders have been bombed, shelled, and strafed to buggery.  Their supply lines have been cut, and they’ve been reduced to eating rodents and the more substantial examples of the local insect population.  They are so low on ammunition that they frequently resort to Banzai charges.  But crank the difficulty up to max, and suddenly they discover a bottomless supply crate of grenades.  Which they proceed to rain down on you with all the precision and frequency of an automated launcher.  My “sod this for a joke” moment came when I watched no fewer than six grenades land in a neat circle around me, after I had dodged no fewer than ten in the previous two minutes.  This is not “creating a challenge”; it is covering up the limited abilities of your AI, and it is a fundamentally unimaginative game design strategy (see “Change the Player”, below).

The One-Trick Pony: This is where AI entities have a single mode of operation that they deploy every time you meet them.  Sometimes this mode is shared by many entities that are supposed to be functionally distinct.  This again tends to be a game design default, but the example that springs immediately to mind is Doom 3.  Yes, I know id Software is responsible for all this: Wolfenstein 3D and Doom helped to get me hooked on the whole games thing in the first place.  And yes, I know their focus is really on creating great multiplayer combat.  But their core design strategy has never changed: put all your effort into high-end graphics at the expense of narrative and challenging AI.  Doom 3 is no exception.

While the game does a great job riffing on (or ripping off, depending on your point of view) the original Doom, the AI is amongst the least challenging and least interesting out there.  Most AI entities pretty much have one form of attack and one attack only.  After the 300th time that an imp jumped out from the shadows with blinding superhuman speed and then simply stood there ripping at me while I poured lead into it, I recognized the game for what it was.  Not smart.  Not challenging.  Simply a chew toy for the slavering OCD crowd.  Never finished it.  Glad I didn’t pay too much for it (thank you Steam!).
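If you wanted to caricature the one-trick pony in code, it would look something like this (a parody of my own devising, not id’s actual code):

    MELEE_RANGE = 2.0  # metres; a made-up threshold

    class Imp:
        """The entire 'AI' is one distance check: no memory, no cover,
        no retreat, no reaction to being shot."""

        def __init__(self, position: float):
            self.position = position

        def choose_action(self, player_position: float) -> str:
            if abs(self.position - player_position) > MELEE_RANGE:
                return "lunge_at_player"  # the signature superhuman leap
            return "claw_at_player"       # ...then stand there ripping away

Whatever the player does, the answer is the same; by the 300th encounter you can call it from the loading screen.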

GOOD AI

Limit the opportunities for the AI to be stupid: One of the most satisfying game AIs I’ve encountered was that of the original F.E.A.R.  The firefights always felt tense and organic, with the enemy soldiers maneuvering to get better shots on me, panicking when I had killed too many of them, taking cover and refusing to come out, using the odd grenade at a strategically appropriate moment.  But the genius of the game was that the AI was made to appear smarter by being given a very constrained environment in which to operate: most of the combat takes place indoors, in very close quarters, with limited sight lines.  First, for all I know the AI may have been behaving in some pretty questionable ways; while I was cowering behind a piece of furniture they may have been spending all their time running headlong into walls or playing “pull my finger.”  But your limited view also limited the chance that you would actually witness the kind of AI behavior that might crack the immersion for you.  This strategy was enhanced by the fact that so much of the environment could be destroyed; even if you could get a clear line of sight to your enemy, your view was often filled with smoke, plaster dust, and swirling pieces of destroyed furniture.  Second, the tight quarters mean, in effect, fewer opportunities for the AI to screw up.  Not that I want to pretend that issues like pathfinding in a confined space don’t pose a significant challenge.  But this is one game where a very nice balance of AI, environmental factors, and limited player abilities coincides to make the AI appear relatively smart.
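For what it’s worth, the observable surface of those firefights could be sketched as a simple ordered priority list, something like the toy below.  To be clear, this is emphatically not Monolith’s implementation (F.E.A.R.’s actual AI is famously a planner); all the names and thresholds are invented for illustration.

    from dataclasses import dataclass

    @dataclass
    class Soldier:
        under_fire: bool = False
        in_cover: bool = False
        has_grenade: bool = True

    def choose_action(s: Soldier, player_visible: bool, player_dug_in: bool,
                      squad_size: int, squad_casualties: int) -> str:
        # Ordered priorities; the first matching rule wins.
        if squad_casualties * 2 >= squad_size:
            return "hold_cover_and_panic"  # refuse to come out
        if s.under_fire and not s.in_cover:
            return "dive_for_cover"
        if not player_visible:
            return "maneuver_for_shot"     # reposition through the rooms
        if s.has_grenade and player_dug_in:
            return "throw_grenade"         # flush the player out
        return "fire_from_cover"

The close quarters do half the work here: most of these choices are never witnessed, and the ones that are look deliberate.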

Unfortunately, Monolith moved away from this design in subsequent games and resorted to the Dumb Difficulty move (see above); the enemy AI didn’t change in the sequels, the enemies just became more accurate and more resilient: not smarter, just tougher.

Change the Player, not the AI: What are some alternatives to the Dumb Difficulty move?  While that move is unimaginative, it is responding to a couple of real issues: the need to promote replayability and players’ desire to challenge themselves.  An effective response to this dilemma that doesn’t simply involve turning your AI into super-soldiers is a key part of the Thief series of games.  In these games, changing the difficulty level doesn’t change your enemies at all; it changes the nature of the tasks you have to accomplish in each level and the way you must accomplish them.  The amount of loot you need to steal goes up, additional objectives are added, and at the highest difficulty level you are not allowed to kill any NPCs and sometimes you aren’t even allowed to be seen by anyone.  Voila.  The game is now challenging, and you haven’t had to resort to populating it with walking tanks masquerading as humans.  This is a great example of how your perception of the game’s difficulty, and even of the enemies you meet, can be shaped by a change in the way you are forced to relate to your entire environment.
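Sketched as data, the Thief approach looks roughly like this (illustrative numbers and objective names, not the actual game files): difficulty rewrites the mission contract, and the guards’ stats never enter into it.

    from dataclasses import dataclass, field

    @dataclass
    class MissionRules:
        loot_quota: int                  # minimum loot value to finish
        extra_objectives: list = field(default_factory=list)
        kills_allowed: bool = True
        may_be_seen: bool = True

    # Difficulty changes what you must do and how, not whom you fight.
    DIFFICULTY = {
        "normal": MissionRules(loot_quota=500),
        "hard":   MissionRules(loot_quota=1000,
                               extra_objectives=["steal_the_chalice"]),
        "expert": MissionRules(loot_quota=1500,
                               extra_objectives=["steal_the_chalice",
                                                 "leave_no_trace"],
                               kills_allowed=False,
                               may_be_seen=False),
    }

    def mission_complete(rules: MissionRules, loot: int, done: set,
                         kills: int, times_seen: int) -> bool:
        return (loot >= rules.loot_quota
                and all(o in done for o in rules.extra_objectives)
                and (rules.kills_allowed or kills == 0)
                and (rules.may_be_seen or times_seen == 0))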

I’m interested in other ideas people might have for smart or stupid AI and/or design strategies in games.

Twitchdoctor

Will this be on the test? (Part 2)

Posted: September 16, 2009 by Twitchdoctor in Game AI

Gain 5 points for each correct answer.  Lose 3 points for each wrong answer.  Lose 4 points for each question skipped.  Gain 6 points for attempting an answer then giving up in disgust.

In my previous post I argued that while the best-known instance of the Turing Test, the Loebner Prize, is ostensibly set up to evaluate the ability of an AI bot to fool a human in simulated conversation, the parameters of the competition focus more on testing a human being’s ability to differentiate between human and machine.

If we turn to the world of electronic games we find something very similar, albeit with some revealing differences.  In December 2008, Aussie developer 2K Australia sponsored the inaugural Botprize, the “Turing Test for Bots”; a second iteration of the contest has just been played out in Milan.  The contest, held in conjunction with the IEEE Symposium on Computational Intelligence and Games, is designed to test the ability of a bot to pass as a human player of a first-person shooter.  The format is once again the classic Turing model: a judge faces off against a human and a bot in a deathmatch shootout using a modified version of Unreal Tournament 2004.  The test is operating in a different ballpark than the Loebner prize (it’s more of a neighborhood sandlot, really), with its offer of a cash prize of only $7,000 and a trip to 2K’s Canberra studio.  To win the major prize a team needs to fool 80% (typically 4 out of 5) of the judges.  As we might expect from the long, inglorious history of the Loebner prize, no one has come close to grabbing the major award, which leaves everyone fighting it out for the minor money: $2,000 and a trip to the studio for the team whose bot is judged to have the highest average “human-ness” (their word, not mine, I swear).
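The scoring, in other words, reduces to very simple arithmetic.  A toy reconstruction (my guess at the bookkeeping, not the official scoring code):

    def botprize_winners(votes: dict, ratings: dict, num_judges: int = 5):
        # votes:   bot -> number of judges who called it human
        # ratings: bot -> list of per-judge "human-ness" ratings
        threshold = 0.8 * num_judges  # fool 80%, i.e. 4 of 5 judges
        major = [bot for bot, v in votes.items() if v >= threshold]
        minor = max(ratings,
                    key=lambda b: sum(ratings[b]) / len(ratings[b]))
        return major, minor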

To cut a long, but predictable, story short, the bots fail.  Miserably.  In 2008, two of the five bots failed to convince a single judge, and two more convinced only two of the judges.  While complete results have yet to be posted for the 2009 prize, the bots as a whole did a little better, with each fooling at least one of the five judges.  Woohoo.

Now on the face of it this looks like a very simple challenge.  Whichever player kills you and then takes the time to teabag you, that’s the human.  (There’s an idea; let’s replace the Turing Test with the Teabag Test: the winner for the Loebner prize under these rules would be the bot that convincingly spews random homophobic insults at you at the slightest provocation).  But seriously folks. . .

The frenetic pace of an online deathmatch does make picking the bot in each round a daunting task for the casual gamer.  (You can check out a series of short videos from the 2008 contest and try it for yourself.)  However, the judges’ notes indicate that they have a series of behaviors they are looking for: reliance on a single weapon, losing track of a target, or failing to pursue one, for example, can all be telltale signs of a bot.  The Botprize as a whole, however, suffers from the same weaknesses as the Loebner prize.  In every round the judge always knows that one of the avatars they will be facing is nonhuman, which makes it a contest more focused on their skill at differentiating machines from humans (something that is tacitly acknowledged by a “best judge” award).  Although it is entirely possible to run this test with different configurations (two humans, two bots, and the judge always in the dark), there doesn’t appear to be any interest in employing a more methodologically varied test.
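You could parody the judging as a crude tally of known bot tells, something like this (the weights are entirely mine; the real judges kept free-form notes):

    # Made-up weights: judging reduces to spotting known bot "tells"
    # rather than measuring anything resembling intelligence.
    BOT_TELLS = {
        "relied_on_single_weapon": 2,
        "lost_track_of_target":    3,
        "failed_to_pursue_target": 3,
        "teabagged_my_corpse":    -5,  # see above: strong evidence of humanity
    }

    def bot_likeness(observed: set) -> int:
        # Higher score means "probably a bot".
        return sum(w for tell, w in BOT_TELLS.items() if tell in observed)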

However, while this form of traditional Turing test applied to chatterbots produces a completely artificial and constrained conversational context that bears little relationship to real human conversation, the method does, it is true, have some marginal utility when evaluating bot/human performance in the world of multiplayer FPS games.  After all, in the world of online gaming, cheating tools like speed or damage hacks are common enough that most players are likely to have experienced them firsthand or heard of them.  Thus, while trying to figure out whether the entity you are facing is human or not has no relevance to everyday human conversation, wondering about the possibly enhanced or downright artificial nature of the player you are facing in a game is a distinct possibility!

It is also important to note that the AI design task in each of these Turing tests is very different.  In the Loebner prize, designers are faced with the task of “smartening up” their AI to make it capable of holding the kind of relatively sophisticated conversational exchanges that are, somewhat romantically, envisaged to be the stuff of everyday human interaction.  When it comes to FPS games, however, it is relatively easy to design AI characters that are all-powerful super-soldiers.  Many of us have played games with this kind of AI design (usually not for very long).  This is the NPC that can kill you with a single headshot from 500 metres while standing on their head with the weapon clenched firmly between their butt cheeks.  Gamers just love that kind of “smart” AI.  The challenge for the Botprize designers, therefore, is to dumb the AI down, to make it play more like a fallible human.
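The dumbing down itself is usually mundane in practice.  The standard tricks, sketched generically here (not any particular team’s entry), amount to injecting aim error and reaction delay:

    import random

    def humanized_aim(true_bearing: float, stress: float) -> float:
        # A perfect bot aims exactly at true_bearing; a plausible "human"
        # misses by a normally distributed error that grows under stress
        # (being shot at, sprinting, low health).
        return true_bearing + random.gauss(0.0, 1.0 + 4.0 * stress)

    def humanized_reaction_ms() -> float:
        # Humans don't respond in a single game tick; sample a delay.
        return max(120.0, random.gauss(250.0, 60.0))  # ~250 ms on average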

Nevertheless, there remains this reluctance in either of these Turing tests to provide a more methodologically varied test, and it is fair to ask why.  Part of the reason is undoubtedly that the Turing Test has acquired the status of Holy Writ amongst AI geeks.  Despite the fact that there is some debate as to what the parameters actually were when Alan Turing first postulated the idea of testing a machine’s ability to play the imitation game, rewriting the “rules” seems to be regarded as akin to rewriting the Ten Commandments to remove, say, that pesky adultery clause: it would make life a lot easier and more interesting but, you know, it’s just not done!

There is another, more important reason, and it is indicated by a less obvious result of the 2008 Botprize.  Of the human players involved in the contest, two managed to convince only two of the judges that they were in fact human.  Of the five players, only one convinced all five judges that he was human.  These Turing tests are not designed around criteria for meaningfully evaluating AIs; they are instead designed around a set of criteria that supposedly defines what constitutes human behavior, either in a conversational or a gaming context.  What I suspect people are reluctant to acknowledge, however, is that these criteria are, at best, highly romanticized, and at worst, complete BS.  Most human conversational interaction, for example, is completely unlike that imagined by the Loebner prize.  Rather than being focused, intense, and driven by evaluative need, most everyday conversations are trivial, characterized by a high degree of inattention, consist mostly of filler, and have no purpose except to keep channels of communication open.  Most people just don’t have much worth saying, and they spend their time saying it badly but saying it a lot.

Were the Loebner prize and the Botprize to be run in a more methodologically sound fashion, I would hazard a guess that one immediate result would be that the number of “humans” who were determined to be machines would rise dramatically, certainly in the case of the Botprize.  The patently limited parameters in both these Turing tests, in other words, are designed to prevent us from finding out how truly awful we are at attempting to affirm and enforce the criteria that supposedly render humans distinctive.  More disturbingly (or intriguingly, depending on your point of view) it might show how inclined we are already to see one another as species of machine.

Twitchdoctor