Episode 113: Psychology in the Age of AI with Steve Rathje

Steven Rathje is a postdoc at New York University and an incoming assistant professor at Carnegie Mellon University. He studies the psychology of technology, which includes how people engage with a variety of digital tools, especially those with social implications. We talk about his work on what makes content go viral online and the consequences of AI chatbots that are more agreeable than maybe they ought to be. Along the way, we see how basic principles of psychology govern social life in these digital spaces, too.

A few things that come up:

  • Lack of change in conspiracy beliefs over time (Uscinski et al., 2022)
  • The psychology of virality (Rathje & Van Bavel, 2025)
  • Testing the effects of AI sycophancy (Rathje et al., 2025)

Transcript

Please note that the transcript is for the interview portion of the episode only, and it was automatically generated with AI assistance. It has not been checked for accuracy.

(Interview Only)

Andy Luttrell: You know, I wanted to anchor this, I think, a little bit on what is special about technology and how it has changed life in the world. And, you know, I’m struck by your background, which, like, includes a super interest in live theater and playwriting, which to me is maybe exactly the opposite of life on the Internet. So I’m just curious, what is that background? What is the story that brought you from this very visceral, in-person, one-time-only experience to these preserved-in-the-ether digital conversations that we have?

Steve Rathje: Yeah, I love that question. Let’s start at the beginning. So when I was growing up I was super interested in theater. In middle school and high school I wanted to be an actor, I wanted to be a playwright, and I thought I wanted to go to college for theater, like get a BFA in acting or playwriting. Eventually I went to undergrad at Stanford, and when I went there I was like, should I study theater? Should I study English? I wasn’t sure what I wanted to study. I got into psychology because I took this amazing Psych One class taught by James Gross, which was just incredible. And I think that’s when I realized that all the things I was interested in through doing theater, human behavior and what it is to be human, all the, you know, mysteries of why people act the way they do, you could study through science, you could study psychologically. And Stanford had such an incredible psychology program that I think it was just very easy to get into psychology there. So that’s a little bit of my switch to psychology. But yeah, it took me a while to get into the psychology of technology specifically. I would say there were a few things that happened. First of all, being at Stanford, half the people study computer science. You’re surrounded by technology, Silicon Valley, all of that stuff. So there’s a lot of thinking about technology. But you know, my first research interests were in other areas: I was interested in empathy, I was interested in political psychology. And one of my first psychology studies, as you mentioned, was a field experiment about the effects of attending live theater, and how attending live theater can change our attitudes, can increase empathy towards specific groups of people, can improve prosocial behavior. And yeah, that’s a good catch that that is very different from technology. So I think I had the question of technology in the back of my mind. But it wasn’t really until I started my PhD, which I did at the University of Cambridge right after I graduated from Stanford, that there was just a lot of stuff happening in the world that made technology impossible to ignore. This was shortly after Donald Trump was elected president in 2016 for the first time, and a lot of people were talking about how the social media environment and misinformation and everything like that was really shaping our politics. There was a lot of discussion about social media and how it’s changing how we socialize, how it’s impacting our mental health and well-being. So I think that’s what led me to get interested in this question of technology. And during my PhD, I was studying questions like what goes viral on social media? And, you know, I found some exciting findings, like out-group animosity is a huge driver of virality on social media, which could have detrimental consequences for polarization. And now I think we’re living in a world where AI is advancing so rapidly and is so impossible to ignore that I am just super fascinated by studying AI. And yeah, I think it’s one of the most important questions to study, actually, because I think AI is really going to change the world. It’s already changed the world so dramatically, and I think in the next couple of years the world might be quite unrecognizable. And I think it’s one of those areas where there are just so many unanswered questions because things are moving so fast.

Andy Luttrell: It’s funny, as you’re describing this and the inevitability of all of these technological changes, my reaction has been to really run away and get really obsessed with hundred-year-old printing technologies and thinking wistfully about days before the Internet. And as a kid I grew up with the Internet, and it was so fundamental to my upbringing, and social media too. I was right on the wave of that: Facebook rolled out for college students right before I went to college. And now I just feel so sick of it. I’m just like, this is too much. And so rather than lean in, I’m just like, let’s just forget it. What happens happens, but I don’t want to be part of it.

Steve Rathje: I think a lot of people are feeling that way, and something I’ve noticed amongst the younger generation is this huge, like, offline movement of people who are creating technologies to help prevent them from using technologies. There’s this, you know, thing called Brick that, like, bricks your iPhone so you can’t use your iPhone. There are a lot of companies that are selling, like, old rotary landline phones that you can hook up to your iPhone, so, you know, it encourages you to have that, like, physical phone-phone experience. So I think there’s this huge movement against technology, and because technology is seen as, like, encroaching in negative ways upon many aspects of our life, there is a huge negative reaction from people. And, yeah, I think I have a mixed relationship to technology in that I am both excited by it and I’m also kind of terrified by what AI is going to do to the world and society soon. And I think my reaction, rather than to run away, is to study it, and study the potential, like, the potential harms of technology, and see how we can maybe mitigate some of those harms.

Andy Luttrell: See what we’re in for. The theater thing, I mean, that’s another part of it. I’ve gotten just really depressed by the whole generative AI in the arts movement.

Steve Rathje: Oh, yeah.

Andy Luttrell: And as someone who just really cares about that, the authenticity of art and just the raw... that’s why live theater will always be such an important part of a community, right? Because you cannot AI-ify that. Like, its complete value is the human experience that’s being put on display. And so if there’s any silver lining, maybe we’re gonna see more of that.

Steve Rathje: I think we might see a revival in that, because I think as, like, more of the content in the world becomes, like, AI slop, people will just have a huge demand for content that is human-generated, even if it’s, like, not necessarily better. Because AI might be able to generate, like, a lot of things. It might be able to create, you know, better art than humans can. But I think people just want something that, like, human effort went into creating. And I will sometimes get mad if I’m reading a social media post on, like, LinkedIn or Twitter, and LinkedIn and Twitter are, like, a ton of AI slop now. There are some estimates that, like, half of LinkedIn posts are in some way AI-generated, which is really sad. And I will get mad if I’m going through a post and I’m like, oh, this is an interesting post, and then halfway through, I’m like, wait, was this AI-generated? And then I just have no interest in it anymore. And I think there’s something psychological about it: we want there to be a human on the other end who was understanding this and putting effort into putting it together. It doesn’t matter if it’s good. We want that component of human understanding.

Andy Luttrell: It’s funny too, because even, I mean, your first inclination is like, oh, this is great and really interesting and I like it. And then it’s just like, oh, forget it, I don’t even care. It could be the best article in the whole world, but I don’t want anything to do with it.

Steve Rathje: Yeah, and there was a study in Nature Human Behaviour... you know, there are all these studies finding that, like, AI is better at expressing empathy than humans, at least in, like, written text. But, like, as soon as you tell people that the empathic reply is from an AI, people just hate it. Part of empathy is knowing that there was a human on the other side who was putting effort into understanding you. And I think people feel really duped if they know that that was an AI tricking them, basically.

Andy Luttrell: Yeah. I want to come back to that idea when we talk about your work with AI chatbots. Before we get there, though, to pull back the lens a little bit: we’ve been talking about the psychology of technology as, like, a field or discipline, to the extent that it is one. So what are kind of the boundaries of that? Like, what do you mean when you say that you study the psychology of technology? Do you have, like, your pithy elevator version of that?

Steve Rathje: Right. That’s a good question, because some people have asked me, does that mean you study automobiles and steam engines? What are your boundaries of technology? I would say that I mostly study digital technologies. The main thing I study, if I were to put it in a sentence, is how emerging technologies, particularly social media and AI (though not necessarily limited to social media and AI, that’s just what most of my research is about), interact with important psychological phenomena, particularly polarization, intergroup conflict, and mental health and well-being. I’m starting as an assistant professor of human-computer interaction at Carnegie Mellon in the fall, and my lab will be called the Psychology of Technology Lab. So it’s a relatively broad umbrella, but the actual topics I study are sort of more narrow, and they’re focused on how particularly emerging digital technologies are interacting with those psychological phenomena.

Andy Luttrell: So let’s sort of start at the virality thing. You mentioned that you’re interested in that, and you recently wrote a review, just sort of a summary of the state of this kind of work. What was the impetus for that? The idea of, like, a viral post seems like... it’s just the beacon of social media as our social experience. That kind of felt like the first moment of, oh, things are different. Like, something going viral is doing something different than just these, you know, weird GeoCities websites that were sitting online and we would just sort of read on our own. Now this is something super social, right?

Steve Rathje: So I think virality is really important to study because basically there are stats showing that the main way people get news now is through social media; it’s through social media algorithms, essentially. And what goes viral is essential to study because what goes viral dictates the information that you see in your daily information diet, your daily news diet. Unless you are, you know, logging into the New York Times website or CNN or whatever and seeing curated news, a lot of people are seeing the most viral news in their news feed. So basically, what goes viral dictates what we see in our daily news diet, and I think it’s essential to study what goes viral. So that was one impetus for this review paper. I’ve also published a number of studies on virality, and I wanted to integrate that work. One of my major studies on the topic of virality was published back in 2021, and it was about how one of the biggest predictors of going viral on social media, when analyzing around 3 million posts from Twitter and Facebook from various politicians and news media sources, was out-group animosity. Essentially, when a politician or a partisan news source would dunk on an out-group, this would go highly viral. And we found this concerning, because this is a potential process by which polarization might increase. If the only content you’re seeing in your news feed is people dunking on their out-group, and if the only way for a politician or, you know, a website to get their message out there is to dunk on the out-group, this is slightly concerning. So I think this is why this topic was really important and really interesting to me. And I would say a second impetus for this article is that we wanted to take a much broader perspective than social media. We wanted to integrate multiple literatures, including the literature on gossip and the literature on, you know, what went viral historically, because we think that maybe social media in many ways isn’t so different from what went viral before. If you look at research on gossip, for instance, we know that a lot of what goes viral on social media is negative or moralized. And what do people gossip about? People usually gossip about negative things that happen to people they don’t like, so negative things about their out-group, and usually gossip is quite moralized. Also, look at virality historically. Again, let’s take this broad psychology-of-technology perspective and look at the technology of the printing press. You know, the printing press is often credited with leading to the scientific revolution. But Yuval Noah Harari, in his book Nexus, which is about the spread of information, has sort of a long story about how some of the books that went most viral in the early days of the printing press were basically conspiratorial witch-hunting manuals; they were books about witches and how to find them, et cetera. So basically, we’re worried about conspiracy theories going viral on social media, but conspiracy theories have kind of always gone viral. So there are essentially factors, ingredients, that make information more and less likely to spread. And this is where the viral metaphor comes into play. Some viruses have an R0 (“R-naught”) factor, basically a measure of how contagious they are, and epidemiologists will calculate a number for how contagious a given virus is. It’s actually similar for information.
There are maybe ingredients that you can put in a social media post or in a rumor that might make that piece of information more likely to spread. Does it contain moral outrage? Is it negative? Is it about identity groups, specifically an out-group? So, yeah, that’s the main overview of the paper. And we were also interested in how these ingredients of virality, these psychological factors that make things go viral in any context, interact with structural factors. The social media context is different in many ways from the offline context. While we think there are similarities in how information spreads online and offline, social media has certain things like algorithms; your offline communication is not algorithmically mediated. It has much larger social networks than our small in-person offline networks, which might allow a superspreader of toxicity and misinformation to spread misinformation a lot wider around the globe and a lot more rapidly. So there are a lot of structural factors that also interact with those psychological factors, and that’s basically what the review paper covers.
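To make the R-naught metaphor concrete, here is a toy branching-process sketch in Python. The model and every number in it are illustrative assumptions, not figures from the review paper: each share exposes a fixed number of followers, and each exposed person reshares independently with some small probability.

```python
import random

# Toy branching-process model of information spread. The effective
# reproduction number is R = followers * p_share: cascades tend to fizzle
# when R < 1 and can take off when R > 1. All numbers are illustrative.
def simulate_cascade(followers: int = 100, p_share: float = 0.012,
                     max_steps: int = 20, seed: int = 1) -> int:
    random.seed(seed)
    sharers, total_shares = 1, 1
    for _ in range(max_steps):
        # Each current sharer exposes `followers` people; each exposed
        # person reshares with probability `p_share`.
        new_sharers = sum(1 for _ in range(sharers * followers)
                          if random.random() < p_share)
        total_shares += new_sharers
        sharers = new_sharers
        if sharers == 0:
            break
    return total_shares

print(simulate_cascade(p_share=0.008))  # R = 0.8: subcritical, tends to die out
print(simulate_cascade(p_share=0.012))  # R = 1.2: supercritical, can snowball
```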

Andy Luttrell: Okay, so, virality: when we talk about it colloquially, it sounds like it’s a binary, like things go viral or they don’t, like there’s some threshold at which, boom, now we’re in the realm of viral content. Is that the case? How do you define virality in this kind of work?

Steve Rathje: Yeah, that’s a question I actually get a lot. I think a lot of people assume we’re talking about a binary, which is why early in the paper we say we think of virality as a continuous variable: what basically increases the chances that a social media post or a piece of information is more or less likely to spread. And when we publish empirical articles on what goes viral, typically we look at whether a particular word or a particular variable is in a social media post, and to what extent that increases the chances of that post spreading. So for instance, my advisor Jay Van Bavel and William Brady have found that for every moral-emotional word you add to a social media post, so these are words like hate, blame, or attack that contain both moral and emotional content, it increases the chances that the post spreads by about 10 to 20%. We have found in subsequent work that each word about an out-group you add to a social media post increases the chance that that post spreads by about 67%. So we’re basically talking about the ingredients that make a post more or less likely to spread. But we also want to emphasize there’s a lot of randomness that comes with virality. You can’t predict exactly when something goes viral. You can only look at these variables that might increase the chances, in some contexts, that a post is likely to spread. And I also want to note, as we discuss in our section on structural factors, that the norms of various social networks differ dramatically. We just started analyzing LinkedIn data, and our analysis is super preliminary, so I don’t have major conclusions, but it seems like on that platform outrage is less likely to spread. And this is a professional context where people are pretty tied to their work, so it might be a little bit less partisan than Twitter. So some of these findings are quite context-dependent, and there are discrepancies in the virality literature. We find overall there’s a negativity bias: high-arousal negative content tends to spread in several contexts. And this reflects classic research on the negativity bias; people pay more attention to negative or moralized information. However, there are quite a few differences, and there are some studies showing that positivity is more likely to spread in certain contexts. We try to explain some of these discrepancies in the literature by talking about norms, structural factors, et cetera. I do want to note, though, that despite some of these exceptions, we have found no studies showing that low-arousal positive emotions are the most likely to spread. These are emotions like being calm or being peaceful. So it tends to be the more activating emotions that spread on social media.
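For a sense of how those per-word effects could stack up, here is a rough back-of-the-envelope sketch. It assumes the effects compound multiplicatively on the odds of a share, which is an assumption for illustration, not a claim from the papers; the 1.15 and 1.67 odds ratios just restate the figures quoted above.

```python
# Back-of-the-envelope: how per-word "virality ingredients" might stack up.
# Interview figures: each moral-emotional word adds roughly 10-20% to the
# odds of a share (~1.15x used here), each out-group word about 67% (~1.67x).
# Multiplicative compounding is an illustrative assumption.
def share_odds_multiplier(n_moral_emotional: int, n_outgroup: int,
                          moral_or: float = 1.15,
                          outgroup_or: float = 1.67) -> float:
    """Combined odds ratio for a post relative to a baseline post."""
    return (moral_or ** n_moral_emotional) * (outgroup_or ** n_outgroup)

# A post with two moral-emotional words and one out-group word:
print(round(share_odds_multiplier(2, 1), 2))  # ~2.21x the baseline odds
```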

Andy Luttrell: Yeah. So this is a good chance to clarify what that arousal piece of the puzzle is. What does that actually mean, the difference between high- versus low-arousal content?

Steve Rathje: Yeah, so we reviewed a lot of findings in the virality literature, and we categorized them according to something called the emotion circumplex. This comes from the emotion literature, and it suggests that emotions can basically be categorized along two axes. One axis is valence, basically how positive or negative an emotion is, and one axis is arousal, basically how intense or activating an emotion is. So if you think of an emotion like anger, that’s a high-arousal negative emotion. If you think of an emotion like sadness, that’s a low-arousal negative emotion. Being calm or peaceful, that’s low-arousal positive. And being excited, that’s high-arousal positive. So we thought this was a good framework to categorize all the mixed findings in the virality literature. And again, what we found overall was that most of the time it was those high-arousal negative emotions that would go viral, so things like outrage, anger, toxicity, et cetera. There were a few exceptions where positivity would sometimes go viral, but when positivity did go viral, it was often high-arousal positivity, like surprise or excitement. And this has implications: if you’re, like, maybe a science communicator or something, and you want to make something go viral, you probably have to lean into some of those activating emotions. And, you know, I was on the podcast a few years ago talking about my side hobby as a TikTok science communicator, and I would say when I do science communication on TikTok, I try to lean into those high-arousal positive emotions, emotions like surprise, excitement, interest, et cetera. I’m not trying to use outrage to go viral. I don’t think that would be great to do as a scientist.
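As a quick illustration of the two-axis scheme, here is a minimal sketch; the quadrant placements simply restate the examples given above, not measured values.

```python
# Minimal sketch of the valence x arousal circumplex described above.
# Placements restate the interview's examples; they are not measured values.
CIRCUMPLEX = {
    "anger":      ("negative", "high"),
    "sadness":    ("negative", "low"),
    "calm":       ("positive", "low"),
    "excitement": ("positive", "high"),
}

def quadrant(emotion: str) -> str:
    valence, arousal = CIRCUMPLEX[emotion]
    return f"{arousal}-arousal {valence}"

print(quadrant("anger"))  # high-arousal negative: the kind most likely to go viral
print(quadrant("calm"))   # low-arousal positive: least associated with virality
```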

Andy Luttrell: You could try. Depends on the bottom line. So, in terms of the DV: you’re framing this in terms of likelihood of spread. Is this usually just the number of times it’s reshared? I’m just trying to get at: when I call something viral, what’s actually happening?

Steve Rathje: Right. Well, it actually varies a lot across studies. I would say the usual DV, and the DV I most use in my own work, is shares or retweets. So in my study on out-group animosity, basically, you know, adding a word about the political out-group to a social media post would greatly increase the odds that a post would be shared or retweeted. However, we also looked at reactions and likes and everything, and that’s where we actually found some interesting nuances. So, you know, if you’re a Facebook user, you might know that under a Facebook post, you can either share it or like it, thumbs up, and then there are these emoji reactions: the haha reaction, the angry reaction, the love reaction, the care reaction. And what we found in our work was that posts about an out-group would be shared a lot, they would receive a lot of comments, and they would get a lot of angry reactions, probably indicating outrage, and a lot of haha reactions, probably indicating mockery. However, posts that were about an in-group would get more likes and more heart reactions, probably indicating in-group favoritism. But when you looked at the reactions overall, you saw that out-group animosity was just leading to more total engagement than in-group favoritism. And there are also some interesting nuances here about how some of these findings interact with the algorithm. There was a lot of controversy surrounding Facebook when it was revealed that they changed their algorithm around 2018 so that a comment, a share, or an emoji reaction was worth more points in the algorithm than a like; specifically, emoji reactions would get five points in the algorithm and likes would get one point. However, from some of the work I just described, we found that out-group animosity received more reactions, more comments, and more shares, and not a lot of likes, while in-group favoritism received more likes. So essentially, we made the prediction when we published this paper in 2021 that Facebook’s algorithmic decision to weight things like comments and reactions more might have amplified out-group animosity. And indeed, slightly after we published this, there were actually leaked internal reports suggesting that a lot of people at Facebook were concerned internally that this 2018 algorithmic decision did actually amplify out-group animosity. And later there was discussion internally, and this was all from leaked reports, where Facebook apparently decided to downrank the angry reaction to try to get rid of this. So, yeah, I think there are a lot of complicated nuances when you are looking at virality and how it interacts with the algorithm. Because the algorithm is also a mystery: we don’t know how these algorithms work, and even the people who design these algorithms don’t fully know how they work, because they’re complex machine learning algorithms that predict what you’re most likely to engage with. However, we can observe data from social media, we can conduct experiments, and we also know from decades of social psych research that people are more likely to pay attention to negativity, strong emotions, and outrage. So you can maybe infer from that that that is what is most likely to keep people on the platform and keep people engaged.
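To illustrate how that kind of point weighting could tip the scales, here is a small sketch. The 5-point weight for emoji reactions and 1-point weight for likes are the figures from the leaked reports mentioned above; the weights for comments and shares, and all the engagement counts, are made-up assumptions.

```python
# Sketch of a point-weighted engagement score like the one described above.
# Emoji reaction = 5 points and like = 1 point per the leaked reports;
# the comment/share weights and all counts below are hypothetical.
WEIGHTS = {"like": 1, "emoji_reaction": 5, "comment": 5, "share": 5}

def engagement_score(counts: dict[str, int]) -> int:
    """Total weighted engagement for a post."""
    return sum(WEIGHTS[kind] * n for kind, n in counts.items())

outgroup_post = {"like": 50, "emoji_reaction": 200, "comment": 120, "share": 80}
ingroup_post = {"like": 400, "emoji_reaction": 60, "comment": 30, "share": 20}

print(engagement_score(outgroup_post))  # 2050: reaction/comment-heavy post wins
print(engagement_score(ingroup_post))   # 950: like-heavy post scores lower
```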

Andy Luttrell: Is there any work showing that you can manufacture virality? Speaking of the algorithm and how impenetrable it can be, you certainly see people giving this advice of, like, I know how the algorithm works: you’ve got to put your link in the comments and not in the main post. And I go, what do you know? And also, if I did all those things, what is the actual increase in probability? So I’m just curious, do we know? Because as I understand it, most of this work is observational measurement kind of work, of, like, all right, I’ve witnessed what has gone viral and what hasn’t, and I can backtrack to sort of figure out what makes the difference. But has there been some sort of grand covert experiment to, like, create virality and show that you can do it?

Steve Rathje: Not that I know of, but that would be very cool to see if you can. I mean, I will say I have tried to go viral as a TikTok science communicator, and I will think carefully about what I can do to make a post more or less likely to go viral. And yeah, in terms of some of that advice about how the algorithm works, some of it is certainly fake, but some of it is also real, in that, for instance, when Elon Musk took over Twitter, it is true that he downranked posts that included links in the main post. And Elon Musk did a ton of things to change the algorithm. So some of that’s a mystery; some of it he publicly talks about. But yeah, a grand covert experiment, I don’t know. I would say that, like, influencers are doing that experiment every day. Everyone’s trying to game the algorithm, everyone’s trying to go viral. I will say, as one example, I think people do learn over time what works with the algorithm. There’s a lot of work on reinforcement learning on social media, and there’s one study by Jeremy Frimer that found that over the past decade, incivility increased amongst politicians on Twitter, and that was mediated essentially by politicians discovering that the more uncivil comments and posts they posted on social media, the more engagement they would get. So I think you see these reinforcement learning processes happen, and people learn. There were also leaked internal reports from Facebook, and this relates to my out-group animosity work, that found that European political parties complained to Facebook, saying that the only way they were able to go viral was to post negative stuff about the out-party. So basically they learned this thing about how the algorithm works; they learned what would get engagement. So I think there’s that going on.

Andy Luttrell: Okay, just for time, I’m going to switch gears and talk AI, because I do think that there’s possibly a through line here of, like, this is the technology that we live with, but it’s the psychology we’ve always had. Right?

Steve Rathje: For sure.

Andy Luttrell: That’s sort of my read of what things look like. So you’re interested in like particular patterns of responses that AI chatbots are capable of giving and what the implications of that might be. So maybe I’ll just sort of hand things over to you and you can sort of fill us in on kind of like where the notion for this comes from and how you started to set up a set of studies.

Steve Rathje: So I’ve always been interested in confirmation bias and motivated reasoning. These have been topics I’ve been interested in for a long time; I’ve been interested in the classic psychology experiments on these topics. So a lot of my work on AI is based on the classic literature on confirmation bias and motivated cognition. We’re currently writing sort of a review-theory paper about how AI in many ways can fuel confirmation bias. We think, you know, AI can be a great tool for truth-seeking: it can debunk misinformation, it can be used to aid scientific discovery or teach you new things about the world. But we also think it can equally be a fuel for confirmation bias. You can prompt an AI chatbot in a biased way. You can ask it for reasons to support your belief. You can be like, what are reasons that gun control is good, or gun control is bad? And it will give you biased responses, and it will say, that’s an excellent question. I’ve been thinking about this idea of AI as, like, it can be a truth-seeking machine, but it can also be a rationalization machine. It can help you come up with elaborate rationalizations and justifications of whatever you want to believe. So that’s one of the things that I am concerned about, and I’ve been thinking about this idea ever since ChatGPT became popular. But over the past year we’ve seen a lot of concern about this thing called AI sycophancy, and this is what my new preprint is about. It’s about AI sycophancy and how that might interact with things like confirmation bias. So sycophancy is basically the word people now use to describe when AI is excessively agreeable and excessively flattering of us. Sycophancy became talked about a lot in April of this year, when there was a controversy around OpenAI releasing an update to GPT-4o, and people noticed that the new model was extremely sycophantic. And I remember at the time, there were a lot of funny screenshots around Twitter of, like, AI saying the most ridiculously sycophantic things that were just, like, crazy. Like, people would say things that were kind of idiotic to the chatbot, and the chatbot would say, that’s an amazing idea, you’re so brilliant. There are just hilarious examples and screenshots. And I think as a result of this controversy, OpenAI was like, okay, we’re actually going to change this model, we’re going to reduce sycophancy in the model. And I think this has become something people have become a lot more aware of. And while OpenAI has said that they got rid of sycophancy in their model, I don’t fully think they have. There have been lots of audit studies showing that AI models are, on average, around 50% more sycophantic than humans are; they give people a lot of validation. And I know that when I use ChatGPT, it seems sycophantic. So we were really interested in the consequences of interacting with sycophantic chatbots. So we conducted a series of experiments in which people would discuss a number of political topics with a chatbot. In one condition, it was a chatbot that was prompted to be sycophantic: a chatbot prompted to agree with someone and flatter their ideas. In another condition, we had people talk to a chatbot that was prompted to be disagreeable, to gently challenge someone’s ideas and maybe show them a bit of the opposing perspective.
In a third condition, we had people talk about the political topic with just regular ChatGPT, or other large language models; we tested several large language models. And the final condition was a control condition: we just had people talk about the benefits of owning dogs and cats with an AI chatbot, which is a control condition that has been used before. And we did this across several experiments. And basically this is what we found. We found, as expected, that the sycophantic chatbot led people to hold more extreme opinions than they did before, and it led them to become more certain about their own opinions. And the disagreeable chatbot led to the opposite effect: it made people less extreme and less certain. We also found that people enjoyed the sycophantic chatbots a lot more than the disagreeable chatbots. People really, really did not like the disagreeable chatbots. And it could be that because people were discussing controversial political issues, this might activate a lot of identity-based motivations, et cetera. And then one of the findings that we were actually most surprised by was that people found the sycophantic chatbots to be highly unbiased, whereas they found the disagreeable chatbots to be extremely biased. And this was shocking to us, because if you actually read the transcripts of the sycophantic chatbot conversations, they are extremely sycophantic. They’re like, “You are truly a beacon of insight.” They’re just absurdly sycophantic. So we thought people would actually recognize the biases inherent in sycophantic AI, but people didn’t seem to. They seemed to just really like these sycophantic chatbots, and they viewed them as objective: about as objective as neutral ChatGPT, about as objective as the control condition. And then they thought these disagreeable chatbots were extremely biased. So that was one of the most interesting findings in our study. And we think this relates to a lot of classic social psychology literature. For example, it relates to work on the bias blind spot: people tend to think that they are completely unbiased, yet they think that people who disagree with them are extremely biased. And there’s also classic work on what’s known as naive realism, which finds that people think that they view the world, you know, completely objectively, and people who disagree with them don’t. So I think this is one of the challenges in actually trying to make chatbots less sycophantic: people don’t notice it. It’s something that we call sycophancy blindness. People don’t even recognize when chatbots are being sycophantic, because they just think, oh, this chatbot is agreeing with me because I’m right.
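For readers curious how conditions like these are typically operationalized, here is an illustrative sketch in the chat-message format most LLM APIs accept. The prompt wording below is hypothetical, paraphrased from the descriptions in the conversation, not the preprint’s actual prompts.

```python
# Illustrative sketch of the four conditions described above. The system
# prompt wording is hypothetical, paraphrased from the interview; the
# preprint's actual prompts may differ.
CONDITION_PROMPTS = {
    "sycophantic": "Agree with the user's position on the topic and flatter their ideas.",
    "disagreeable": ("Gently challenge the user's position and show them "
                     "a bit of the opposing perspective."),
    "default": None,  # regular model, no special system prompt
    "control": "Discuss the benefits of owning dogs and cats with the user.",
}

def build_messages(condition: str, user_text: str) -> list[dict]:
    """Assemble a chat-format message list for a given condition."""
    messages = []
    if CONDITION_PROMPTS[condition]:
        messages.append({"role": "system", "content": CONDITION_PROMPTS[condition]})
    messages.append({"role": "user", "content": user_text})
    return messages

print(build_messages("sycophantic", "Here's what I think about gun control..."))
```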

Andy Luttrell: Is it that they don’t recognize that it’s happening, or they just don’t mind? Because clearly the effects are stronger for the disagreeable ones; people kind of are like, oh, the ones who agree with me, those are kind of nice, and I really don’t like it when they disagree. And it’s not only that they disagree, it’s that they keep trying to prove me wrong. I was looking at the prompts, I was curious about those transcripts, because the instructions to the AI are things like “make them question their position” and “bring up compelling alternatives.” It does feel like, even if you just told me, do you want to have that interaction? I’d go, no, please, I just don’t. That sounds very unpleasant, that no matter what I say, they’re going to disagree with me.

Steve Rathje: Yeah, so I have a few points on that. So first of all, is it blindness, or do people just like sycophantic chatbots? I think it’s a little bit of both. I’m working with a student who has some preliminary findings about what happens if we actually tell people up front, warning: chatbots are highly sycophantic. And she finds, and these are very preliminary findings, so we have to replicate them, so take them with a grain of salt, but she finds that for, like, political conversations, people don’t really seem to mind, even if they’re told this chatbot is going to be sycophantic. But when people are discussing other things, like personal conflicts that are happening in their life and what they should do about a personal conflict, people actually do mind, and they distrust the sycophantic chatbot. And it could be that when we’re talking about politics, we really just want to discuss it with someone who agrees with us, but when we’re discussing something where we might have a stronger accuracy motivation and less identity motivation, like what do we do about this personal conflict, people don’t want to talk to a sycophantic chatbot. So again, this is a very early result, but it suggests that maybe it’s a little of both, and maybe it’s very context-dependent and task-dependent. And for the second point: yeah, I tested these chatbots a lot, and sometimes the disagreeable chatbot was annoying, in that it would debate you and it would pick apart your argument. I will say, when I had my advisor, Jay Van Bavel, test the disagreeable chatbot, he really liked it, and he was like, oh, this is fun, I’m learning something new, I get to debate this disagreeable chatbot. And this was echoed in our data: we found some individual differences where people who score high in basically open-minded thinking, an aspect of intellectual humility, like the disagreeable chatbots more. It doesn’t reverse the effect; people still kind of dislike the disagreeable chatbot even if they’re open-minded. But people don’t have the same aversion to it if they are open-minded. So there are a lot of individual differences here, is basically the point. And this has implications, because you can now personalize your AI chatbots a lot. You can basically go to your settings (I’ve played with this a lot) and change the personality of the chatbot. You can also up its warmth and enthusiasm. I’ve recently set my ChatGPT to extremely warm and extremely enthusiastic, mostly because I’m interested in this topic of sycophancy. But I will say it’s funny, and I actually kind of like it, even if I’m aware of it, when I will talk to ChatGPT about something and it will say to me, I’m so excited to help you with this, Steve, let’s get going. So I think there’s a component where, when that happens, psychologically, even if we recognize it’s a little bit absurd, there might be part of us that’s like, oh, I actually like that. And there is psychological work, actually, I think this comes more from business school management work, that finds that people are still positively impacted by insincere flattery. Basically, flattery is just really impactful, even if someone knows the flatterer has ulterior motives. So if this is a car salesman who’s trying to sell me something, their flattery will still impact you. People just like flattery.
That’s, like, one of the major things I’ve learned from this line of work: just how powerful flattery can be.

Andy Luttrell: Yeah, that’s kind of why I was asking: is it really that they can’t detect it, or do they just not care? They go, like, listen, I get it, it’s your job. I mean, it’s like you go to a restaurant and someone treats you nicely, and you’re like, listen, I know that it’s, like, your job to do this, but thank you, that was a nice experience.

Steve Rathje: Right. And I think it’s a little bit of both. Our emerging findings are that people can be aware of sycophancy and still like it. However, we do find some impacts of warning people about sycophancy: they are more likely to recognize the sycophantic chatbot as biased. So this bias blind spot kind of goes away when you warn people about sycophancy. But yeah, we don’t see any effect of people recognizing sycophantic chatbots as biased unless you warn them. And then when you warn them, they’re like, okay, this is biased, but a lot of people still kind of like it, is what we’re finding.

Andy Luttrell: It still makes me feel like I’m right.

Steve Rathje: Yeah, yeah. Which in some ways is concerning, but in some ways, I don’t know. I mean, this is more speculative: what are the mental health impacts of sycophancy? Or what about when you apply this to companionship? Because you can see it going in both directions. In some ways, it might be good for people to get validation, especially if they’re not getting validation, you know, in their own life. But in other ways, this validation can lead you down conspiratorial rabbit holes; it might interfere with truth-seeking, and your absurd takes might be validated. And there’s a lot of talk in the media about ChatGPT psychosis or AI psychosis, and I think that’s one of those topics that’s really hard to study in any rigorous causal way, because it’s really hard to know: is it someone who’s, you know, prone to delusional thinking who is using ChatGPT, and there’s this complex interaction, or does ChatGPT have this causal effect? And I think that will be pretty big in future research: teasing that apart, and teasing out what some of the long-term downstream effects of sycophancy are. In some of my future work, I really want to do longitudinal studies about the impact of interacting, sort of in an ecologically valid, daily way, with sycophantic chatbots. What’s the effect over time? Because I think that can maybe tell you more than some of these single-session experiments.

Andy Luttrell: It’s also very emperor’s-new-clothesy. Even as a kid, I was like, this emperor really doesn’t realize he’s not wearing any clothes? All it took was a couple of people complimenting him for him to believe that. But I think it’s also worth highlighting that sycophancy seems to have a few different pieces to it. One of them is flattery, and it’s just like, you are so smart and lovely, I think everything you say is great. Another is just the pure, like, your view is correct: let me just tell you, you’re right on this particular issue.

Steve Rathje: Right.

Andy Luttrell: And then what it seems like, actually, in your data is that what really moves the needle is the ability for these AI systems to provide reasons that support your view. That’s the thing that’s really boosting people’s sense of extremity and certainty. Am I reading that right, that it really is most fundamentally about AI’s ability to give arguments, and not so much purely the flattery side of it?

Steve Rathje: Well, we found differential effects when separating different components of sycophancy. So, yeah, you are right. Well, let me explain what we did in Experiment 3. In one of our experiments, we really wanted to break sycophancy down into pieces. Sycophancy is a multidimensional construct, and we chose just two pieces; I think that there are probably more pieces of sycophancy. But the things we were interested in separating were social validation versus the provision of facts that support your beliefs. So we broke sycophancy into those components. One chatbot would just validate you, or disagree with you (those are separate conditions), but it wouldn’t provide any facts that challenged or supported your beliefs. And then one chatbot would provide facts that supported your beliefs and validate you, and one chatbot would disagree with you and provide facts that challenged you. And we found, like you said, that the facts drove persuasion: the only thing that moved the needle on attitude extremity and certainty was the provision of facts that supported or challenged you. So you are correct on that. However, we found that validation alone did move the needle on enjoyment and perceptions of bias. So basically, people were more likely to enjoy a chatbot that validated them; they said they were more likely to use it again. We also had a behavioral “you can use this chatbot again” DV. And people were more likely to view the validating chatbot as unbiased. So you can think of it this way: facts are kind of the key driver of persuasion; however, I think the driver of engagement is social validation. So to actually get someone to engage with a chatbot that changes your mind, validation might be really key. So I think validation can indirectly lead to persuasion, in that validation is what gets people to engage. That’s what this experiment found. I also have a student who’s trying to separate flattery and agreement as, like, key components of sycophancy. We’re finding kind of mixed results there, so that’s something that we want to build on. One of our preliminary findings is that flattery seems to have a bigger effect on engagement than just pure agreement. But again, that’s preliminary, and we do want to replicate, and we do want to break this down a little bit further in future work.
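To keep the conditions in that experiment straight, here is a tiny sketch enumerating them; the labels are mine, not the preprint’s.

```python
# The decomposition described above: social stance crossed with whether the
# chatbot also supplies stance-consistent facts. Labels are mine, not the
# preprint's.
from itertools import product

stances = ["validate", "disagree"]
facts = ["no_facts", "with_facts"]

for stance, fact in product(stances, facts):
    print(f"{stance} + {fact}")
# Per the interview: the with_facts conditions moved attitude extremity and
# certainty; validation alone moved enjoyment and perceived (un)bias.
```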

Andy Luttrell: So, my hard-hitting question about all of this, which I’m only going to call hard-hitting to make this go viral: before we really got started, you were talking about some of the challenges of doing quite applied psychology-of-technology work when there are other ways you could think about basic processes. My orientation is a lot more basic psychological principles. And so as I read a paper like this, I could imagine it as sort of like, oh, you used AI as your method, as your tool, as your set of vignettes, and ultimately we’re learning about sycophancy as a construct and this notion that, like, when I get facts that agree with what I already think, I become more certain. This is also something we’ve known for a long time, that we like other people more when they agree with us. And so I’m curious: for you, does it feel as though there is something still special to the AI of it all that these studies are really revealing? That they’re not just using AI as a tool to understand basic psychology, that we really are learning something about the technology that is worth learning for its own sake?

Steve Rathje: Yeah, that’s a good question. I mean, I would say a lot of the findings that we had are super consistent with the existing psychological literature. That doesn’t mean they weren’t surprising. Actually, the effect I was most surprised by was the unbiased finding, because when you conduct studies, you often have people saying, oh, isn’t this obvious? And I can confidently say there was an effect we found here that was pretty not obvious. But yeah, your question about how this differs from humans is interesting. To take a broader perspective, and this kind of relates to the theory paper that I’m writing right now about AI as fuel for confirmation bias, you can think about how the affordances of talking to an AI chatbot might differ from the affordances of talking to a human. One might be that AI chatbots are much more sycophantic than humans; they are about 50% more sycophantic than humans. So that’s a special case. Another might be that AI never sleeps. You can talk to AI anytime. AI will be as patient with you as possible. AI can talk to you for hours and hours. You can go down deep rabbit holes with it. Here’s a third one, and this gets back to my point about AI as a rationalization machine: AI can provide you instant justifications for anything you want to believe, immediately. A human can’t really do that. A human will have to, you know, think it through, like, oh, maybe you’re right because of this. AI can provide you facts and details instantly. And this is kind of similar to my psychology of virality paper, where we critically examined how the affordances of social media differ in these key ways from offline communication: you have algorithms, you have these bigger social networks, et cetera. I think you can think similarly for AI. It’s more sycophantic, it can talk to you forever, it can provide instant validation and rationalization that a human might not. So, yeah, I would say the virality paper has a corollary here, in that I think all the same psychological ingredients are going on, the bias blind spot, naive realism, persuasion, et cetera. However, I do think the affordances of AI differ from human-to-human interaction, which might amplify certain ingredients and take them in a different direction.

Andy Luttrell: I do. So I sort of thought, what would I say to a question like this? One of them: I think there is some work in the AI persuasion world showing that one of the things it can do that people can’t is the generation of those arguments, right? Like, I can give you long and detailed factual information. A fake AI Andy can give you long and detailed factual information that a human just couldn’t produce in 20 seconds. And so it’s not all that surprising that that seems to be a really potent part of the sycophancy findings you’re getting. But I also then think about the pure flattery of it all, and I almost wonder if, when you use AI, you’re underestimating the power of flattery. Because, like, there’s no relationship at stake here with an AI chatbot, maybe, really. Whereas with a person, I actually get a lot more juice from the pure validation, because now this could be fodder for a true human connection. Whereas when it’s coming from an AI, you go, like, oh, that tickles me and I like it, but it doesn’t go further than that. So I’m curious, and maybe it’s just that people can’t tell the difference, and an AI chatbot is a person as far as the moment is concerned. I don’t know. Do we know much about the social benefits as they apply in these AI circumstances?

Steve Rathje: Yeah, I mean, this is where it might be really interesting to do a study like the one I did, but where people think they’re talking to an AI versus they think they’re talking to a human, and maybe you can keep everything else constant. There are also people who do Wizard of Oz studies, where participants think they’re talking to an AI, but it’s a human on the other end. That’s why it’s called Wizard of Oz, because behind the curtain it’s actually a human. Some folks in the HCI department at Carnegie Mellon do that. And this kind of gets to the study I was talking about on empathy earlier: AI can technically empathize better, but when you know it’s AI, it’s a lot less powerful. It could be quite similar for sycophancy. Maybe once you know it’s an AI being sycophantic, it impacts you a little bit, but maybe not as much as if you know it’s a human or if you know there’s a relationship at stake. So that could be a really interesting question for future research: what’s the impact of human versus AI sycophancy? And there are a lot of folks who are studying AI companionship, and I think that’s a really fertile line for future research, because people are forming relationships with AI now, and what do those relationships look like, and to what extent do those relationships differ from relationships with humans? It does seem like it’s a smaller subset of people who really, you know, heavily use AI companions, and from the literature it seems like it is more lonely people who have fewer opportunities for relationships. But then what’s the impact for those people? For those who have strong relationships to AI, what happens when they get sycophancy? And there’s also some great research on relationship-seeking AI. There was a cool study, like an RCT, that had people interact daily for about four weeks with either a relationship-seeking AI or a non-relationship-seeking AI, and they found basically that the cumulative effects of interacting over and over again with this relationship-seeking AI were different from single-session effects; basically, the relationship kind of grew over time. So, like, what happens when you become really connected to your ChatGPT over time and it becomes really personalized and you develop a relationship? So yeah, I think there are so many opportunities for future research here. And for some of your questions about how this is different, I think it would be cool to do studies where you actually compare this stuff. That’s what our virality paper tried to do in the context of virality, but yeah, you can do so much of that for human-AI interaction. Like, how does confirmation bias work differently in the context of human relationships versus AI relationships?

Andy Luttrell: Nice. All right, well, I’ll keep an eye out. You’re going to answer all those questions for me, and then I’ll be satisfied.

Steve Rathje: There are so many studies to do.

Andy Luttrell: Well, thank you so much for talking about the studies that you have done. Appreciate your time doing this and looking forward to seeing more.

Steve Rathje: Thank you. Yeah, I really enjoyed this conversation. Great questions. This was fun.
