Kismet and the Problem of Machine Emotions

Kismet is a robot, "designed for social interactions with humans," made by Cynthia Breazeal at the MIT Artificial Intelligence Lab. In appearance, Kismet is something like a cartoon head, realized in the form of a 3D metal-and-plastic sculpture, and come disconcertingly to life. Kismet has comically large eyebrows arching over big round eyes, extended flappable (but not quite floppy) ears, and a broad metallic mouth with curling lips. By moving its various parts, Kismet is able to make a variety of facial "expressions." These signify different "emotions" and "expressive states," ranging from happiness and calm, to disgust, anger, and fear. But no matter what "mood" Kismet is in, it is almost unbearably cute, like a cross between Barney the Purple Dinosaur and Disney's Country Bear Jamboree.

The cuteness, alas, is largely intentional. It's a matter of form following function. Kismet is meant to inspire a warm, cuddly response, just as a Furby is, and mostly for the same reasons. Kismet is a seemingly animate object that all but cries out for its human "partner" to nurture it. "Specifically," Breazeal explains, "the mode of social interaction is that of a caretaker-infant dyad where a human acts as the caretaker for the robot." The process is illustrated in a QuickTime video clip on Kismet's website. Breazeal first makes faces at the robot, and then waves a toy block, a stuffed cow, and a Slinky in front of its vision sensors. Kismet responds to these gestures with changes in its expression. Breazeal tries to calibrate her actions until the robot signals that it is interested and happy. But when she plays too aggressively, and wears Kismet out, it closes its eyes and goes to sleep.

However much it may seem like the Cadillac of Furbys, or like a Tamagotchi on steroids, Kismet is not a toy. It's actually at the cutting edge of work in artificial intelligence. Kismet is funded, in part, by the Office of Naval Research. It's an offshoot of the Cog Project in Artificial Intelligence at MIT, led by robotics genius Rodney Brooks (best known from Errol Morris' documentary Fast, Cheap & Out of Control). For the past decade, Brooks and his team have been working on Cog, a robot that is meant to develop sensory-motor behavior, and interact with human beings on that basis. Kismet started out as a visual-perception module for Cog, but was then spun out into a separate project.

Brooks' and Breazeal's research is innovative for a number of reasons. Brooks has pioneered the bottom-up approach to artificial intelligence. Instead of trying to create artificial minds that are logically programmed from above (the approach to AI that failed spectacularly in the 1970s and 80s), Brooks seeks to "evolve" machine intelligence in a manner roughly analogous to the processes of biological evolution. This means that the robot's overall behavior is expected to emerge on its own, as a self-organizing process, rather than being imposed and managed from some sort of centralized command center. Brooks and his collaborators start with simple devices, but give these devices the ability to "learn" on their own. Part of the impulse behind Kismet is to see if its behavior can become more complex as a result of its human interactions.

Brooks also insists on the importance of the body for any artificial intelligence research. That is why he and his team build robots, instead of just writing software. A machine, like an organism, can only learn through interactions with an outside environment. Brooks thus builds robots that move around, and gradually develop the ability to avoid bumping into walls and other obstacles. Cog cannot move, but it has "arms" with which it can point and grasp objects. In the case of Kismet, which is motionless, with a "head" but no limbs, the emphasis is on its perceptual system. Much of Kismet's circuitry is devoted to optical sensing, and pattern and facial recognition. And Breazeal is currently adding auditory sensing as well.

Cog and Kismet are amazing feats of engineering, even in their current unfinished states. The most important thing about them is not so much their anthropomorphic shapes, as the fact that they try to take emotion into account. This is something that previous generations of AI researchers notoriously failed to do. These robots are not just logic processors. They respond to the outside world by means of attitudinal states, which might be thought of as a machine simulation (or even a nascent version) of "feelings" in human beings and other mammals. This new interest in emotion has become increasingly important in AI research. Elsewhere at MIT, Rosalind Picard and her Affective Computing Lab are working directly on the question of how computers might be able to sense the emotions of their human users, and respond to them in appropriate ways. This is supposed to lead, eventually, to machines that themselves exhibit seemingly emotional behavior. Fumio Hara and his colleagues at the Science University of Tokyo are working on a robot that displays realistic, human-like facial expressions, in response to the expressions it detects on the faces of its human interlocutors. Hara's robot is gendered female, in contrast to Cog and Kismet, which are genderless. The reasons for this are not clear to me, though I cannot help wondering whether traditional gender stereotypes (of women as more emotional than men, for instance) played a role.

The recognition by AI researchers that mental activity is emotional and embodied, and not merely a matter of logical reasoning, is certainly a welcome step forward. But I continue to wonder about the underlying assumptions that drive these projects. What view of emotion is being put to work here? The theory, I think, goes something like this. The robot, like a biological organism, is assumed to have a fixed number of underlying drives. These drives refer to conditions that the organism tries to fulfill, within the limits of homeostatic equilibrium. Think of hunger, for instance, or the need for social contact. Fulfillment of the conditions of these drives (having a good meal in good company) leads to satisfaction. Either too little or too much stimulation of the drives (starvation on the one hand, or an ancient Roman banquet on the other) leads, instead, to dissatisfaction. These states of satisfaction and dissatisfaction are then expressed in the form of emotions. You are angered by obnoxious dinner companions, or disgusted after eating too much rich food. And these emotions, in turn, provoke behaviors that seek to reestablish equilibrium. You punch out your obnoxious neighbors, or barf up the excess food, and afterwards you feel at peace with the world.
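To make the shape of this theory concrete, here is a minimal sketch of the drive-to-emotion-to-behavior loop as I have just described it. It is only an illustration of the general model, not Kismet's actual software: the class name Drive, the numeric ranges, and the table of emotions and behaviors are all assumptions of mine.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    """A single drive with a homeostatic range (all values are illustrative)."""
    name: str
    level: float  # current amount of stimulation the drive is receiving
    low: float    # below this, the drive is understimulated
    high: float   # above this, the drive is overstimulated

    def state(self) -> str:
        if self.level < self.low:
            return "understimulated"
        if self.level > self.high:
            return "overstimulated"
        return "satisfied"


# Hypothetical table: (drive, state) -> (expressed emotion, corrective behavior).
RESPONSES = {
    ("hunger", "understimulated"): ("distress", "seek food"),
    ("hunger", "overstimulated"): ("disgust", "reject food"),
    ("social", "understimulated"): ("sadness", "seek company"),
    ("social", "overstimulated"): ("anger", "drive others away"),
}


def react(drive: Drive) -> tuple[str, str]:
    """Return the emotion a drive expresses and the behavior meant to restore balance."""
    state = drive.state()
    if state == "satisfied":
        return ("contentment", "carry on")
    return RESPONSES[(drive.name, state)]


# The Roman banquet: far too much stimulation of the hunger drive.
banquet = Drive(name="hunger", level=0.95, low=0.3, high=0.7)
print(react(banquet))
```

Everything in such a scheme is a lookup: the emotion is nothing more than a label on the gap between where a drive is and where it is supposed to be.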

Such an account is pretty much the standard one in cognitive science today. It's an instrumentalized view of "human nature" that ultimately derives from what is known as "rational choice" theory in economics and political science. This is the idea that all human behavior can be explained by the assumption that each individual acts so as to maximize his or her utility. Under this theory, human beings need not actually be rational; they still pursue their goals as if they were rational. Emotion is not denied; but it is relegated to being only a tool that helps the individual to efficiently accomplish his or her goals. Everything is supposed to work out correctly, thanks to an updated version of Adam Smith's "invisible hand." Models derived from economic competition in the marketplace are used by social scientists like Nobel Prize winner Gary Becker to explain all sorts of things, from family dynamics to drug addiction and crime to voting and political activism. But this approach has not just gained a foothold in universities and public-policy think tanks. It has also filtered down into the popular imagination, in much the way that Freudianism did fifty years ago. And it has become ubiquitous in recent computer engineering work, not just in AI and robotics research, but also in interface design and interactive games.

One finds this approach, for instance, in Will Wright's wildly popular "people simulator" game, The Sims. This is the non-competitive game in which the player micromanages the lives of little people on the computer screen, known as Sims. The Sims have a fixed number of "motives" that need to be satisfied (Hunger, Comfort, Hygiene, Bladder, Energy, Fun, Social, and Room). The varying strengths of these drives or motivations depend upon a fixed number of factors (Neat, Outgoing, Active, Playful, and Nice) whose varying strengths make up their personalities. And these factors and motives result, when combined with experience, in a certain number of skills (Cooking, Mechanical, Charisma, Body, Logic, and Creativity), that the Sims are able to profit from. The game is wide-ranging and unpredictable, since the number of possible combinations of these factors is astronomically high. Yet at the same time, it is highly rationalized and instrumental. For the Sims' behavior, however various, is ultimately defined as nothing other than the playing out of these pregiven traits in such a way as to maximize total utility, or satisfaction.
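The logic of such a scheme can be sketched in a few lines. What follows is not the game's actual code, which I have not seen; it is a toy version under my own assumptions, in which the function names, the personality weights, and the three sample actions are all invented for illustration. Only the eight motive names come from the game.

```python
MOTIVES = ["Hunger", "Comfort", "Hygiene", "Bladder",
           "Energy", "Fun", "Social", "Room"]


def utility(levels, weights):
    """Total satisfaction: each motive's level, weighted by how much it matters."""
    return sum(weights[m] * levels[m] for m in levels)


def apply_action(levels, effects):
    """Return new motive levels after an action, clamped to the range 0..1."""
    return {m: min(1.0, max(0.0, v + effects.get(m, 0.0)))
            for m, v in levels.items()}


def best_action(levels, weights, actions):
    """Pick the action whose outcome has the highest total utility."""
    return max(actions,
               key=lambda name: utility(apply_action(levels, actions[name]), weights))


# A hungry character, otherwise middling on every motive.
levels = {m: 0.5 for m in MOTIVES}
levels["Hunger"] = 0.1

# Hypothetical actions and their effects on the motives.
actions = {
    "eat": {"Hunger": 0.6},
    "chat": {"Social": 0.4, "Fun": 0.1},
    "nap": {"Energy": 0.5},
}

# Personality enters only as a set of weights on the motives.
neutral = {m: 1.0 for m in MOTIVES}
outgoing = dict(neutral, Social=2.0)

print(best_action(levels, neutral, actions))
print(best_action(levels, outgoing, actions))
```

In this toy version the two characters differ only in the weight given to Social, and that alone is enough to send one to the refrigerator and the other to the telephone. The variety is real, but it is the variety of a single arithmetic operation applied to different numbers.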

For all its popularity, such a view is sorely deficient as an explanation of the wide variety of human experience. More than a century ago, in a sneering reference to the British utilitarian philosophers like Jeremy Bentham and John Stuart Mill, Nietzsche wrote: "Man does not strive for pleasure; only the Englishman does." In a similar vein, I am inclined to say that human beings do not work to rationally maximize their utility; only economists and computer engineers do. For the "rational choice" view leaves out everything that goes beyond subsistence and reproduction; everything, that is, that makes life interesting. It leaves no room for all those feelings, gestures, and behaviors that are whimsical, impulsive, arbitrary, imaginative, obsessive, beautiful, or otherwise excessive. In short, it leaves out love, poetry, and madness, together with all else that makes for the richness, diversity, and unpredictability of human culture.

That is why, despite the recent changes in AI research, I still do not expect too much from Cog, Kismet, and their kin. As an engineering achievement, Kismet is brilliant. But as an experiment in human/machine interaction, it is really just more of the same. The video of Breazeal playing with Kismet tells me why. The (male) voiceover explanation of how Kismet works has the didactic, omniscient tone of a 1950s instructional film. And as Breazeal waves the stuffed cow in the robot's face, she seems to be playing Mom with the forcibly cheery affect of a family sitcom from the same period. It would seem, then, that Kismet is modeled not so much, as Breazeal claims, on "the way infants learn to communicate with adults," as it is on a much more particular fantasy: that of the earnest, well-adjusted, white, middle-class, suburban American child. For all the recent talk of "how we became posthuman" (to cite the title of a recent book by Katherine Hayles), it would seem that we are fated to get back from computers only what we put into them in the first place.
