Sandra Matz is a computational social scientist at Columbia Business School. She uses big data to understand people and what motivates them to act. And she has a new book out! It’s Mindmasters: The Data-Driven Science of Predicting and Changing Human Behavior, and it’s an enjoyable, easy-to-read introduction to what your online data say about who you are and how communicators can use those insights to serve up compelling content–for better or worse.
At the top of the show, I also mention a big new academic book I edited with Richard Petty and Jake Teeny: The Handbook of Personalized Persuasion: Theory and Application.
Transcript
Andy Luttrell: When it comes to persuading people, it can feel like a quest for just the right message that will change people’s minds. What’s the best advertisement? The best argument for your candidate? The best way to get people to make healthy choices? But this search for the holy grail of persuasive communication? Maybe the wrong approach.
There are a lot of people out there. What if the thing that gets through to you is different from what gets through to me? If you look back to the old treatises on Rhetoric, you can find thinkers appreciating this.
Back around 100 CE, the Roman educator Marcus Quintilianus–I think that's how you say it–wrote that an orator must consider, “to whom and in whose presence he is going to speak, for it is more allowable to say or do some things than others in addressing certain persons or before certain audiences.”
And early on in a formal science of persuasion, scholars like Carl Hovland wrote about how different persuasion variables should work differently for different people.
Over time, researchers tested this notion explicitly in a lot of ways. And to me, one of the key examples gets to a common question in communication. Should you speak to the head or to the heart? Some experts will urge you to stir your audience’s emotions. Others will insist that you’ve got to have the facts on your side if you’re going to get anywhere.
But maybe the answer is somewhere in between. There’s been a bunch of research showing that the answer to our question, emotional versus rational persuasion strategy, depends on who you’re trying to persuade. Some people are driven more by emotion, or they at least build their opinions on an emotional foundation. For them, emotional messages are more compelling.
But there are plenty of other people who prioritize careful reasoning. And for them, rational messages can be more effective. And so if you pick one strategy for everyone, you can see how you’d be losing out, at least some of the time. And it’s not just emotional versus rational messages that depend on the audience.
Persuasion scientists have found that the impact of all sorts of communication choices depend on who’s going to receive the message. What moral values does your message lean on? Do you make a moral pitch at all? Who delivers the message? How do you frame the message? Do you focus on the benefits of adopting a behavior or the dangers that come with not doing so? Do you present your point abstractly or concretely? And on and on and on.
Those questions depend on who’s going to receive the message. And wouldn’t you know it, there’s a new book that explores this. It is a big, fussy academic tome that pulls together chapters written by a bunch of social scientists and experts in their particular field.
But I’m one of the editors of this big, fussy academic book. It’s called The Handbook of Personalized Persuasion: Theory and Application, which I helped edit alongside past Opinion Science guests Rich Petty and Jacob Teeny. I don’t actually expect you to shell out the cash for it yourself, but I am proud of it, gosh darn it!
And, I don’t know, maybe tip off your university library to the fact that this exists. But, okay, so Personalized Persuasion is an idea with legs, I would argue. But practically, there’s a clear challenge. Like, sure, maybe I’ve got an emotional version of my message and a rational version ready to go. But how do I know which one to give you? Or even, like, there’s evidence that messages are more effective when they’re tailored to a person’s personality traits.
But how the heck do I know what your personality traits are if I’ve never met you? Well, it turns out that you’re probably leaving clues right now in the digital world that can paint a shockingly detailed portrait of who you are and communicators can take advantage of it.
You’re listening to Opinion Science, the show about our opinions, where they come from, and how we talk about them. I’m Andy Luttrell, and today, I’m excited to share my conversation with Sandra Matz. She’s an associate professor at Columbia Business School, and she churns through the piles of data we leave behind on the internet.
She wants to know what can be learned about us from those data, and what it means for the promise of targeted messaging. In fact, she contributed to a chapter on Targeted Messaging on social media for the Handbook of Personalized Persuasion that I just told you about.
But even more exciting is that Sandra has a new book of her own on the market that’s an accessible and captivating introduction to the traces we leave behind online. It’s called Mindmasters: The Data-Driven Science of Predicting and Changing Human Behavior. I really enjoyed reading it, and it was great to have this excuse to meet Sandra and give her the chance to share this really interesting research that she’s been up to. So here we go. My chat with Sandra Matz.
Andy Luttrell: In the book, there’s a nice framing device of your origin story as a human being on planet earth, but I’m also interested in your origin story as a behavioral scientist. So, what is the link between this upbringing that you had and charting out a course for yourself, like studying people in a digital world?
Sandra Matz: Yeah, it’s funny because in a way I didn’t even plan to, first of all, embark on an academic career, let alone end up professing somewhere in the U.S. And I think for me, at some point it just became incredibly interesting to combine these two worlds of psychology, which is my home discipline that I grew up in, and then computer science, which was suddenly this new methodology that allowed us to study human behavior in everyday life.
I think for me, that was the point where I was like, no, this is something that I love doing. And it suddenly felt like it’s living up to the promise of psychology to be a behavioral science. And I think where it’s coming from is that I was always intrigued by questions like, why do people behave the way that they do? What makes me different from someone else? Why do I behave in that way and someone else makes totally different choices?
Then trying to explain this, not by having people answer questions in a questionnaire, but by looking at the traces that they leave–that for me was the pivotal moment where I felt it’s the right time to embark on this research. And it probably tells us something about psychology that we didn’t know before.
Andy Luttrell: Did you have like independent psychology and computer science interests that you just suddenly went like, Oh, these can do the same thing, actually?
Sandra Matz: Well, so, I started with psychology, but I loved math in school.
So, one of the reasons why I went into psychology is because everybody told me, well, the hard part is statistics. And I’m like, that sounds amazing. I think I was one of the very few people in the cohort who was actually excited about that, the stats part. And then, I did an exchange year at the University of Cambridge, where I bumped almost randomly into this group of people around Michal Kosinski, who was one of the leading figures in that space and had just started collecting this data from Facebook, where they had people self-report personality and other psychological traits.
Everything from their profile, from likes to posts to pictures to just general information about people. And they were starting to combine computational methods with psychology. And so that was the point where I felt it’s not just interesting in terms of the topic and the questions that we can ask and answer, but also probably something that I would love doing.
If I like statistics, then probably computer science and modeling is right up my alley. So, it was like serendipitous encounter with people at the psychometric center at Cambridge. And then I decided to pursue that as a career.
Andy Luttrell: I like that the story runs in that direction, because oftentimes you’ll see computer science work tackle social questions and you go, like… it’s very superficial. There’s this wealth of history behind how we understand these things that’s just kind of this big gaping hole in some of this computer science work…
Sandra Matz: It is very different objectives.
Andy Luttrell: Whereas your work is very much in that world. Yeah. But it draws on, like, theory.
Sandra Matz: Even though I think it’s changed a little bit, because when you look at the early computer science work in that space, it was, can we improve the accuracy of our personality predictions by 1% area under the curve? And it’s like, well, it’s an objective, right? But it doesn’t really lead us to understanding psychology better, or understanding why people behave a certain way. So, I feel like that’s changed as more computer scientists started working with psychologists and vice versa. It’s almost like this cross-fertilization where psychologists come up with interesting questions.
And here’s something that we don’t know, here’s something that would be really interesting to explore and look into. And then the computer scientists are like, oh, but how about we explore it from this angle, and that makes it a lot more interesting, because we can start studying at scale with different data points. So, I think over the last 10 years, that’s one of the developments that I’ve seen, which has made it a lot more interesting, I think, for both sides to work with one another.
Andy Luttrell: So, the whole premise of all this is that, and you listed some of these is that in the course of going about our lives, we’re leaving little records of ourselves, all over the place, particularly like digital records of who we are. And that those are piling up to paint a pretty compelling portrait of like who we are as a person, right? It’s not just like just a couple random little bits of knowledge about you and like, oh, you happen to go here. But like I’ve actually learned something deeper about this.
And so I’m curious, just so that we’re all on the same level when we start this conversation: what are those kinds of records that we’re leaving? You mentioned a few social media ones, but can you give, like, a sampler plate of things we might not realize we’re leaving behind us as we traverse the internet or the world?
Sandra Matz: I think one of the interesting things is that this is actually coming from an offline world. So, I was inspired by the work of Sam Gosling, who looked at this in the context of people’s physical spaces.
So, it all started by having strangers snoop around your office and snoop around your bedroom and kind of try to make these inferences about who’s the person living in these spaces without ever having met them. And then I think what they distinguished, which is a distinction that was really helpful for us as well in the digital space is there’s identity claims and those are these very explicit signals of who you want to be and how you want to be seen by the world.
If you put up a poster in your bedroom or your office or you kind of have a bumper sticker on your car, that’s probably you intentionally signaling to the outside world. Here’s the person who I want to be. Now there’s also all of these other traces that they call behavioral residue.
Those are all of the traces that you don’t necessarily intentionally create, but you just leave by existing. So this is if you came to my office, you’d see that there’s paper everywhere. It’s like, it’s certainly not super organized. If you look at my phone and we’re going to come to digital traces in a second, it’s constantly running out of battery.
So, there’s like all of these traces that we leave that we don’t intentionally put out there, but they still tell a lot about, who the person is that’s creating these traces. So fast forward, I think what we’ve done over the last 10 years or so is to move from physical spaces to digital spaces.
And as you said, the same way that we can break it down into identity claims and behavioral residue in an offline context, the same applies to digital spaces. So, we oftentimes latch onto these very explicit identity claims when we think of the traces that we generate. Most of the time it’s something like a personal blog or social media. It’s you telling the world, here’s what my life looks like, and here’s the person that I am/want to be. And we can maybe talk about this later.
But there’s all of these other traces that are first of all, not as often covered by the media when they talk about data that we generate and data that’s potentially intrusive ranging anywhere from your Google searches, your browsing histories, your credit card spending, and the data that gets captured by your smartphone. So if you have your smartphone in your pocket, which we typically do almost 24×7, and that means that your GPS sensor is capturing your location 24×7.
And you’re not putting these cues out there intentionally. You’re not saying, well, I’m now going to walk from here to Starbucks, and then, I’m going to go home, walk across campus so that someone else who’s looking at these traces, thinks that I’m person X. This is just like, we interact with technology and we leave these traces.
So, it’s like a pretty broad spectrum of cues that we can study as psychologists.
Andy Luttrell: This is reminding me of these TikTok accounts that I watch occasionally, where someone will be like, what’s my birthday? And this person will go, all right, you have left very little trace on social media, but I can still work out, based on this account or that account, your username, when you posted this thing, who I think your friend is, that this is your birth date and year. And I could figure that out. You never said what that was online, ever, but you left all these clues just by existing, so that I could get this very specific and unique piece of information. Which is probably a nice encapsulation of what you mean.
You’re not intending to leave this identifiable information behind, but if you know where to look, you can figure it out.
Sandra Matz: It is still out there. And I also appreciate that you admit to watching these TikTok videos. It’s usually like a friend of mine mentioned that they watched something.
But you’re right. It’s all of these traces that you leave, and some of them you don’t even create yourself, right? So you mentioned that someone else might post something like happy birthday somewhere and they tag you, or they post a picture and they don’t even have to tag you, because we have facial recognition and the algorithms are pretty good at doing it themselves. So, I think this notion that, well, if only you didn’t use social media, you’d be protected in terms of visibility to the outside world–I think that’s just a myth.
Andy Luttrell: I also, as you were describing it, drew this comparison to like all of this as a hyperspeed digital Sherlock Holmes. Just kind of looking around and going, Oh, this is a little off. This is unique. This is something that wouldn’t have happened otherwise. And being able to piece together a story from all of these little things that if you’re not looking, you don’t know that they’re there, but these stories are just sitting there waiting to be discovered.
Sandra Matz: And it’s Sherlock Holmes on steroids. Because essentially, it’s Sherlock Holmes that sees the entire world, Sherlock Holmes has done like a gazillion cases and knows exactly how some of these cues are related to outcomes and so, yeah, I like that analogy.
Andy Luttrell: So beyond birthdays, what are the sorts of things that people are interested in pulling together, right? Because it’s one thing to say like, Oh, you once liked a picture of cookies. And so I’m going to shove my cookie product at you. Really? Your point is that you can go way deeper than that. It’s not just the surface level one to one like, Oh, you said this. And so I’m going to show you this. You can actually learn something more deep about a person.
Sandra Matz: And it goes all the way from social demographics, right? You can imagine that it’s very easy to predict gender. It’s relatively easy to predict age, ethnicity, political ideology. And then it goes to these deeper psychological traits that sometimes we don’t necessarily want to put out there: everything from sexual orientation to, sometimes, income, personality, values, IQ. So, I think what we’ve learned over the last 10 or 15 years is that pretty much everything that you’re trying to predict can be predicted with some degree of accuracy.
There’s still some variation in what’s easier to predict, right? If you take personality, there are some traits that are more intrapsychic, so they’re more focused on what’s happening inside our own worlds. And then there are ones that are much more social, that we express to the world, like extraversion. It’s about how we interact with the external world, and those are easier to predict.
But generally speaking, pretty much everything that you can think of in terms of, here’s a defining characteristic of a person that might then also be driving behavior, if you think about applications, you can predict from pretty much any data source that you can get your hands on.
Andy Luttrell: And just to be clear, how do we know this? It’s one thing for me as an ad exec to be like, oh, people who do this are extroverts, or, men do this more than women, but it’s a different thing to say, no, we actually can establish what the diagnostic predictors of these things are.
Sandra Matz: So that all comes down at the very high level to machine learning and AI, right? Essentially, what machines are really good at is to take a lot of examples. So, if you think of can we predict whether you’re extroverted or introverted from your posts on social media, what AI and machines can do is they can just take data from thousands, millions of people who have some kind of ground truth.
This is something that we always need: at some point, we need to sample people’s self-reports, or other reports, of their extraversion, so we have something that a computer can predict. Then we map it against all of the traces–everything that you say on social media–and then, like Sherlock Holmes, the model essentially learns to associate certain words or phrases or topics with extroversion. And that’s what they’re extremely good at, right?
They can take the data from millions of people and come up with these cues that we might have actually missed as humans. My favorite example, which also nicely shows that there’s something we can actually learn as psychologists just by looking at data in a more bottom-up, data-driven way that’s not necessarily coming from theory, is the use of first-person pronouns: I, me, myself–references to the self.
And I still remember, at one of the big psychology conferences, Jamie Pennebaker, who’s one of the leading figures in the world of natural language processing in the context of psychology, was giving this talk, and he asked the audience, well, what do you think the use of first-person pronouns is associated with? And I remember all of the psychologists at the table were like, it has to be narcissism. It’s probably narcissism, because people talking about themselves–surely that’s just the self-focused, center-of-attention thing.
Turns out, it’s actually a sign of emotional distress. So, when you talk a lot about yourself, that signals that you might be suffering from anxiety or depression–kind of, like, emotional vulnerability, if you want. And on some level, knowing this now, it makes sense, because when you’re really feeling bad and sad and down, you’re not thinking about solving the world’s problems, right?
You’re not thinking about how to fix climate change or how to change the nuclear war protocol. What you think about is yourself and how you get better. And I think that it’s a nice example of how, when machines get access to these vast amounts of data, they just kind of, bottom up, without any theory, look at, okay, here’s the use of certain words or phrases or topics, and here’s an outcome that it’s related to. We can actually learn something that we didn’t know before. So that’s what’s happening oftentimes behind the scenes of these predictive models.
Now, the interesting part in the last couple of years is that models like OpenAI’s ChatGPT–the GPT models, the generative language models–have never been trained for a specific purpose. So, there was never a ground truth that said, turn someone’s text into predictions of extroversion. But just because they’ve read the entire internet and they know what extroversion at its core is all about, they can make similar predictions.
So, they can take social media input, for example, and translate it into anything you ask them for. Back in the day, we were limited by the ground truth data that we had. If we didn’t have people’s moral foundations captured by self-reports, then we couldn’t train a model. Now, ChatGPT doesn’t need that, because it has, again, read the entire internet. You can ask it about any psychological characteristic that you can imagine or dream of, and it will probably spit out a reasonably accurate prediction based on the input data you give it.
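To make concrete the ground-truth approach Sandra describes–label texts with a self-reported trait, then learn which words go with which label–here's a minimal, illustrative sketch. All of the example texts and the two-category setup are invented for illustration; real models use millions of examples and far richer features.

```python
from collections import Counter
import math

# Toy "ground truth": texts paired with a (hypothetical) self-reported
# label, 1 = extraverted, 0 = introverted. All examples are invented.
labeled = [
    ("had an amazing party with friends tonight", 1),
    ("great night out dancing with everyone", 1),
    ("love meeting new people at events", 1),
    ("quiet evening reading a good book alone", 0),
    ("stayed home with tea and a movie", 0),
    ("enjoying some peaceful time by myself", 0),
]

def word_counts(texts):
    c = Counter()
    for t in texts:
        c.update(t.split())
    return c

extra = word_counts(t for t, y in labeled if y == 1)
intro = word_counts(t for t, y in labeled if y == 0)
vocab = set(extra) | set(intro)

def log_odds(word):
    # Smoothed log-odds: how strongly a word signals the extraverted label.
    pe = (extra[word] + 1) / (sum(extra.values()) + len(vocab))
    pi = (intro[word] + 1) / (sum(intro.values()) + len(vocab))
    return math.log(pe / pi)

def score(text):
    # Positive -> leans extraverted; negative -> leans introverted.
    return sum(log_odds(w) for w in text.split() if w in vocab)

print(score("party with friends"))     # positive, given this toy data
print(score("reading alone at home"))  # negative, given this toy data
```

A real pipeline would swap the smoothed log-odds for a regularized classifier validated against held-out self-reports, but the core idea is the one from the conversation: ground-truth labels plus behavioral traces yield word-level cues, some of which (like first-person pronouns) humans would not have guessed.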
Andy Luttrell: In cases like that, though, do we lose the granularity of what it’s seizing on to make that prediction? Like, in the early work, it’s, oh, there’s a correlation between liking parties and being extroverted.
And so now I’ve got a concrete cue, whereas these models, I think, are just like, I get this extroverted vibe from this person. And that might be an accurate vibe, which is practically very useful, but as a psychologist, sometimes you’re like, but what? What did you see?
Sandra Matz: And the interesting part is that you can ask the models, right? So, I’ve done some recent work on using ChatGPT to predict the outcome of speed dates. You can ask it, here’s a transcript of two people on their first date; now, do you think they’re going to exchange numbers in the end? It’s certainly not perfect, but it’s actually at the level of human judges. And then you can ask it, okay, what made you say that? Sometimes, if you ask it to be very specific, it actually comes up with certain passages of the transcript, or, if not, it can come up with these general explanations that then, as a researcher, you can take and say, well, let’s, bottom up, build this framework of what makes ChatGPT predict that two people are romantically attracted to each other. So, there are ways in which you can probe it.
Andy Luttrell: It would be great if these models were subject to the same lack of introspective awareness that we are. That’s a hallmark of psychology, that people make decisions and can’t explain why. And I wouldn’t be surprised if the same was happening here. I mean, I don’t know, but I presume that there’s not a built-in, oh, you should store some representation of how you came to this.
Sandra Matz: And also, like, humans make up stories, right? We need narratives and we need to make sense of the world. So, I think the same is true for language models.
Now, what you can do is, at the end–like in the speed dating paper, for example–we let it come up with explanations, and then we have other people rate the transcripts based on some of these explanations, and we see if it maps onto the prediction. So, when it says, well, I think they had a lot of shared interests, you have the transcript externally rated on shared interests and see if that corresponds to the predictions that it makes.
You do see a pretty strong overlap. So, at least the predictions that it makes seem to be somewhat related to the explanations it gives. But you’re absolutely right, I am sure that it would come up with something even if it couldn’t figure it out.
Andy Luttrell: Right. Your example of the personal pronouns is a good one of, like, surprising connections. Have there been cases–and this is putting you on the spot–where there’s a cue that you’d go, obviously this is a predictor of X, and it just doesn’t pull it off? Does anything come to mind?
Sandra Matz: And it doesn’t work.
Andy Luttrell: It’s like predictor that doesn’t pan out.
Sandra Matz: I guess we usually do it the other way. We usually take a look at what the model is using, instead of, here’s something that I would have expected to work.
Let me think… we rarely go the other way. There’s nothing that comes to mind, because we don’t usually do it in that direction. It must be something in the speed dating context, because there we looked at it, but I can’t recall off the top of my head. It’s a good question, though: essentially, here’s something that we would assume is true, and the model somehow doesn’t pick up on it.
Andy Luttrell: Yeah. I think where the question is coming from is that I think about practitioners deploying personalized messages to audiences, and they’re often leaning on some intuition as to, what do I need to say to reach this audience? And I guess in these cases, you’re just using the model to generate these things. But if you’re like, I want to reach a certain audience who has these characteristics, I suppose you just train a model to predict that characteristic and then… anyhow, that’s really where it’s coming from.
Sandra Matz: It would be interesting. I mean, one of the things that I’ve been thinking about, actually, in the context of the speed dating is that there are certain cues that large language models are using that we might have overlooked, but there are probably also cues that humans are using that large language models are overlooking. So, I think the ideal case is a back and forth. Humans tell the model, here’s something that you’re not using; why are you not using it?
And then there’s something that the model could tell us. And then there’s obviously the question of which ones are actually accurate, and which ones are just our own biases and stereotypes. So, at the end of the day, if we’re really thinking about optimizing prediction and messaging–if you want to take it to the applied context–ideally, we just look at what actually lands in the end, as opposed to, here’s my human intuition, here’s the intuition of the LLM, and then work our way back from the outcomes that we’re interested in.
Andy Luttrell: If you’ll let me, I also want to get philosophical about this. As I was reading in the book about accuracy rates–there are some wild accuracy rates–the claim kind of seems to be that as long as you have a lot of inputs, you’re going to have an ever more accurate prediction of someone’s personality, right? And so my question was, is there an upper limit? Is there a point at which these inputs just won’t fully predict the outcome? But it also made me wonder, what is personality? Is what we’re trying to predict real? Or are the predictors the actual personality? Like, are these digital traces the gold standard of personality, and it’s our questionnaires that are trying to predict that? And so I’m just curious–you’ve spent a lot of time stewing in these kinds of data–what resonates with you in that?
Sandra Matz: Well, I mean, I could probably talk about this for hours. So first of all, there’s definitely an upper bound, because questionnaires have upper bounds. If you think about the retest reliability of a questionnaire–if I give you the test today, and I give you the test again in three months, it’s not going to be exactly the same.
In that sense, we’re always limited by the fact that we humans don’t always think of ourselves in the same way, which I’m also going to come back to in a second, because I think this is actually a feature of personality, not necessarily a bug. There’s also a case–and I haven’t looked at it empirically, but I’m interested in it from an almost ethical point of view–where I do think there’s an argument to be made that more data is not always better, because more data almost swamps the model with too much information.
It kind of crowds out the signal and just adds too much noise, and this is true for predictions. I was also thinking about it in the context of what you mentioned earlier, creating messages that resonate. One of the things that I’m currently trying to study is: technically speaking, if I give a large language model access to all of your data and ask it, come up with some advice for how you should spend your weekend in New York, it has access to everything, without any collapsing of the data or getting rid of any of the interesting inputs and data traces.
So technically speaking, given all of the data, that should lead to the best outcome. Now, rather than giving it everything, I could also take your data, say, here’s his big five profile, and then ask, given this big five profile, what should he be doing on the weekend? My question–and again, I don’t know the answer, because we’re only looking at it right now–is that it could be that personality, in a way, focuses the attention of the model the same way that it focuses the attention of human creators.
But if I give it access to everything, the model might just latch onto one thing that you did, like, two years ago and then just go down this rabbit hole. It doesn’t really know how to integrate it, and it feels almost piecemeal: yeah, maybe he was interested in sports here, and then maybe he was also interested in something here, but it doesn’t really come out as a consistent narrative.
And I think that, like humans… I think I’d have a hard time with that, right? If I give you access to all of your data, where do you even start? I think the same would probably be true for large language models. They’ll just pick something, and then maybe it’s very customized to one or two data points, but it might not capture the entire picture of you.
In that sense, losing some of the accuracy and some of the nuance by collapsing it to personality, as opposed to just the wild west of your behavioral traces, might actually make it better by focusing attention. Again, I don’t know, and I’ll come back to you once we have some results.
But the broader question of, what is personality? I think that’s such an interesting one, because even the personality traits themselves were developed by going through all of the adjectives in the English dictionary. So, it was very much bottom up, saying, how do we talk about people? What comes to mind when we think about describing other people?
And then let’s try to condense that and turn it into something we can make sense of, in these five dimensions. So from that point of view, I think it actually started in a very similar spot to us just taking whole swaths of behavioral data. Now, there’s a challenge you sometimes run into, and we have papers where we do something similar to what you suggested.
So, instead of saying we’re going to take a self-reported big five personality trait and try to map that onto the data, we just let the data speak for itself, right? We build these bottom-up models, and they become hard to interpret.
Psychologists know this from factor analysis, even when you start with questionnaire items. At the end of the day, you see, oh, here are these 10 questions that cluster together, and they all seem to load on the same dimension, and you still have to find a name for that dimension and describe it. You can imagine, if you have millions of data points loading onto one dimension that you have to make sense of, it’s not a trivial task, to say the least. So then the question again is: if we go bottom up into these dimensions, do we gain or do we lose?
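To make that interpretability problem concrete, here’s a toy sketch of the bottom-up approach Sandra describes: reduce a user-by-pages “likes” matrix to latent dimensions, then stare at the top-loading items and try to name each dimension yourself. The data, the page names, and the choice of SVD are purely illustrative, not from any actual study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy user-by-page "likes" matrix (1 = user liked the page).
# In real work this would be millions of sparse behavioral traces.
pages = ["party_planning", "nightlife", "poetry", "quiet_cafes", "gym", "chess"]
likes = rng.integers(0, 2, size=(200, len(pages))).astype(float)

# Bottom-up dimension reduction via SVD, the same family of tools
# as the factor analysis used to derive the Big Five from adjectives.
centered = likes - likes.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

# Each row of Vt is a latent dimension; the analyst still has to
# inspect the top-loading items and invent a label for it.
for dim in range(2):
    top = np.argsort(-np.abs(Vt[dim]))[:3]
    print(f"dimension {dim}: " + ", ".join(pages[i] for i in top))
```

With six pages the labeling step is easy; with millions of traces per dimension, it is exactly the “not a trivial task” Sandra mentions.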
And the final thing I’ll say, because for me this is interesting both in prediction and in persuasion, is that the way we think about personality as psychologists has shifted a little over the last two decades. In the beginning, we thought of it very much as a point estimate. So, it’s like, well, you score higher than 80% of the population on extroversion. And that’s a relatively static view of personality.
It was always this tendency to act and behave and think and feel consistently across different situations, as opposed to the social psychologists, who were all about: no, it’s the situation, forget about dispositions, it’s all determined by the situation.
Now, where we’ve moved to is essentially this idea that you yourself are a distribution of personality states. So, you have a certain mean and a certain tendency, right? If you’re more extroverted, then most of the time you’re probably going to feel and behave in a relatively extroverted way. But it depends on the situation that you’re in. I was interested, for example, in the impact of the places that you visit.
So, if you’re in a social spot, like a bar or a coffee shop, that’s lively, where there are other people and a lot of social stuff going on, you probably move up a little from your average extroversion score, because you get pulled in and you also feel a little more social.
If you sit at home or in the library, where there’s absolutely nothing happening, you probably get pulled down a little bit from your average extroversion.
And for me, this conceptualization of personality as a distribution of states is nice because it allows us to be a bit more dynamic without being hypocritical. It acknowledges that we’re not always the same, that we can be a little more extroverted or a little less extroverted. And it’s not that we’re acting out of character; depending on the situation, it’s an appropriate response for us to have.
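The distribution-of-states idea can be simulated in a few lines: a stable baseline mean plus situational pulls, with every number invented purely for the sketch.

```python
import random

random.seed(1)

# A trait as a distribution of momentary states, not a point estimate:
# a stable baseline plus situational pulls (all numbers are made up).
BASELINE_EXTRAVERSION = 0.7  # stable tendency on a 0-1 scale
SITUATION_PULL = {"bar": +0.15, "coffee_shop": +0.10, "home": -0.10, "library": -0.15}

def momentary_state(situation, noise_sd=0.05):
    """Sample one momentary extraversion state in a given situation."""
    pull = SITUATION_PULL.get(situation, 0.0)
    state = BASELINE_EXTRAVERSION + pull + random.gauss(0, noise_sd)
    return min(1.0, max(0.0, state))  # clamp to the scale

# Over many moments, states in social places sit above the baseline and
# states at home sit below it, while the mean stays recognizably "you".
bar_states = [momentary_state("bar") for _ in range(1000)]
home_states = [momentary_state("home") for _ in range(1000)]
print(sum(bar_states) / len(bar_states), sum(home_states) / len(home_states))
```

Neither the bar states nor the home states are “out of character”; they are just different regions of the same personal distribution.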
And it also means that if you’re trying to understand someone’s personality in the moment, let’s say you have an extroverted product or you want to target extroverts, you could imagine that it really matters where on that distribution they are. And I don’t even know in which direction it goes.
It’s been on my list forever, because it could be that if you have someone who’s generally extroverted, you should target them in the moment they feel even more extroverted, because their self-concept is activated and they’re like, oh my God, I’m the most extroverted person I can be.
Now would be a good time. It could also go the other way: it could be that the extrovert is currently in an introverted situation, and they’re lacking or craving this additional sense of extroversion, this experience of extroversion. So now might be a really good time to come in and say, hey, I’m going to fill that gap by advertising this product to you.
The more we take these nuances and dynamic versions of personality into account, the more we learn about the individual, but we might also become a lot more precise in crafting messages and customizing offerings.
Andy Luttrell: It seems to depend, too, on where we’re getting the base-level extroversion score from, right? If we’re getting it from digital traces, is there any sense that we can bound it by context? Like, we’re getting traces of your personality in a certain context, and I guess the assumption is that across a million data points, we average across all these situations.
But is there a way we can add a context tag and weight those inputs so that they’re more or less relevant to a given situation?
Sandra Matz: Or even just take the situation itself, right? So, I think you’re absolutely right that what we do is, in a way, the same as with questionnaires, right?
Questionnaires also ask you to what extent you are the life of the party, or to what extent you make a mess. So in your mind, you average across situations to come up with that answer. And the same is true for these predictive models of personality traits: we just take your history, throw it all in, and come up with an estimate. However, because we now interact with technology that’s much more dynamic, we can also add context in the moment, right? If I can take your social media and predict that you’re generally extroverted, I can now also tap into your phone. It could be GPS.
And I know that you’re currently in this coffee shop, and I can map it against Google, and I know that this is a very busy time and they have live music playing. I could tap into your microphone to pick up the ambient sound, or understand based on Bluetooth whether there are a lot of other people around.
So, there are many ways in which I can use these more momentary traces that we generate to add context, and I think that gives us a more dynamic read of the person in the specific moment.
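One hedged sketch of what that momentary read could look like: a baseline trait estimate nudged by weighted context signals. The signal names and the weights are hypothetical, invented only to show the shape of Andy’s “context tag” idea, not taken from any real system.

```python
def momentary_read(trait_estimate, signals, weights):
    """Adjust a baseline trait score using weighted momentary context signals."""
    adjustment = sum(weights[name] * value for name, value in signals.items())
    return max(0.0, min(1.0, trait_estimate + adjustment))  # keep on 0-1 scale

# Illustrative weights: how much each live signal is allowed to move the score.
weights = {"venue_is_social": 0.10, "ambient_noise": 0.05, "crowd_density": 0.05}

# Baseline extraversion of 0.6, currently in a busy, noisy coffee shop:
signals = {"venue_is_social": 1.0, "ambient_noise": 0.8, "crowd_density": 0.9}
print(momentary_read(0.6, signals, weights))
```

The same baseline with all signals at zero just returns the questionnaire-style average; the context signals are what make the read momentary.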
Andy Luttrell: I also want to make sure that we highlight an important implication of all the things you’re talking about, which is the importance of holding on to theory when we’re in the world of big data, right?
Particularly when you were saying, if you just let these models run rampant on any data with no leash, it’s just kind of a mess. But if you impose this structure that we built through a bunch of very careful developmental work, then that wild west of data makes a little more sense, right?
And I think, to my earlier derisive comment about computer scientists producing heartless research, that’s the danger, right? If you’re just going to celebrate big data for its potential, I think people might be tempted to go, well, that’s the answer, why are we doing all of these silly theory-building exercises? But this is all good reason to say, well, no, those theoretical developments really guide how we’re using these data. And the proof is in the pudding, right? Because it improves the output that we get.
Sandra Matz: So I always think of it as: how long are these models valid for? It’s true, we find relationships in the data, and I’ll just give you an example. One of the first papers published on predicting psychological traits from Facebook likes, the pages that you follow on Facebook, found that one of the strongest predictors of IQ was liking curly fries.
Doesn’t make any sense, right? Unless maybe we’re just missing something, but it doesn’t have any of what we would call face validity. You don’t look at it and go, oh yeah, there’s a lot of theoretical depth there. And the problem with some of these correlations that we see is that they’re only predictive in the moment, right?
So, if you’re using the model right now, you meet a person on the street, you know nothing about them, you still take these relationships because they offer you something; otherwise they wouldn’t have come up in the model. Over time, they might completely decay. It could be that it was a running gag, a joke within a small college community, whatever. We don’t know.
It’s also possible that the moment you publish it, people start liking curly fries. All of these are ways in which the relationships that don’t make theoretical sense are probably the ones that are not going to be relevant in the future. So, even if we’re thinking about improving the performance of models in the long run, there’s totally an argument to be made that either we have to keep updating them, that’s one option, the computer scientist approach: if we can get new data, that relationship is going to disappear and the model will relearn. And if that’s not possible, then having this theory behind it is actually incredibly valuable.
Andy Luttrell: We’ve been dancing around it too. I want to make sure we get to messaging. People probably find this interesting for a million reasons, just the inherent interest that something like myself is evident in all of these digital traces. But where it comes into where I do spend a lot of time thinking is persuasive messaging.
You talk about the Cambridge Analytica event, let’s call it, and also how people have talked about it, right? This notion goes back forever, like the fear of subliminal messaging, all these fears around tracking and targeting. There’s a real sense that the reason we should be terrified is that we’re going to be vulnerable to these very targeted interventions on how we see the world.
And you do a nice job of throwing a little bit of cold water on that, being like: it’s probably not nearly as catastrophic an intervention as it’s made out to be, but it’s also not pure fiction either. So I wonder if you could talk a little about the work you’ve done translating these digital trace artifacts into the potential for targeted messaging.
Sandra Matz: Yeah, no, totally. And I like that you picked up on this. I was so frustrated with the Cambridge Analytica coverage, even as I was happy that people suddenly cared, right? We’d been doing this research for a while, and we’d always tried to get people to care about it.
It took a bit of a scandal to get people shaken up. But as you said, it’s not a brainwashing machine that changes who you are at the core and flips your identity entirely; there’s also something to it that is not zero. And I think about what people can relate to when they wonder what’s possible and what’s not, and I’m going to give you examples after.
But think about how it works in an offline context. In an offline context, it comes so naturally to us to adjust to who’s on the other side that we don’t even think about it anymore. Like kids. Kids figure out really quickly: how do I ask mom to get the candy, as opposed to how do I ask dad?
And it’s the same for us, right? We have certain topics that we talk about with certain friends. You don’t talk to a three-year-old the same way that you talk to your boss or your spouse.
So, any type of communication that we engage in as humans is personalized, and we all have a sense that there are certain ways I can get someone to do what I want by talking in a certain way, by persuading them in a certain way. Whether we always get it right is another question, because we’re very self-centered and have only one view of the world. But the same is true for algorithms.
Once you understand who’s on the other side, once you understand their preferences, their motivations, their hopes and fears, it’s not surprising that you can potentially push them in a direction you want them to go. Some of the studies we’ve done are now almost 10 years old, and we started in the consumer space. One of the early studies was with a beauty retailer that was essentially trying to get women to click on a Facebook ad, go to a website, and then buy something from the online store. They were agnostic about which products the women bought. They just wanted to get them to click on the ad and go to the website.
And what we did there is we said: beauty products don’t have, in and of themselves, a certain personality appeal. There are certain products where you’d say, well, skydiving is probably not the best product for people who are extremely neurotic, because they’re probably not going to enjoy jumping out of that airplane.
But beauty products can really be relevant to anyone. So how do we bring out the motivation for different personality types? For extroverts, for example, it would be: can you use it to become the center of attention? So all of the creatives that we came up with had something going on, like a woman on a dance floor.
There were saturated colors; you could see a lot of social stimulation and excitement. The text would say something like, “dance like no one’s watching, but they totally are.” So, playing with the need of extroverts to be the center of attention.
The introverted ones would always be just one woman, a quiet setting, and the copy would say something like, “beauty doesn’t have to shout.” So, really playing into: how do you use it to make the most of the quiet me-time that you have?
And then we put those ads on Facebook. I’m not going to get into detail, but through some of these behavioral traces, we were able to define audiences, based on their interests, that were more extroverted and more introverted.
Then, what you saw is what you would expect in terms of matching effects: when extroverts received the extroverted messages and introverts the introverted ones, the purchase rates were about 50% higher. That was actually a surprisingly large effect, given that it was only one ad on Facebook.
Once they clicked on the ad, the website looked the same for everybody. And the targeting that we used was also extremely crude. It’s not one of these great models that takes all of your traces and turns them into a detailed profile. It was very, very crude.
For me, the interesting part is that all of this can now be automated with generative AI. One of the reasons why only the ad was customized was that we couldn’t ask the creative team of the beauty retailer to come up with a thousand different versions for all of the different parts of the website. But generative AI can. You don’t even need a full description; you could say, come up with an ad that’s customized to someone who’s extroverted and open-minded. A few generations ago, ChatGPT didn’t actually do a good job of combining those traits; it would just have a sentence on extroversion and a sentence on openness. The one we’re on right now actually does a good job combining them into a bigger picture.
And then you can take it a step further and say, okay, now that you’ve given me this ad, which actually reads beautifully, can you also come up with a prompt for DALL-E, its image-generation sister model, that I can use to generate an image that goes with the ad? Also beautiful. So, you can customize this entire journey. And because LLMs can now do everything from predicting your psychology to generating the content that matches, you could create an entire website for someone landing on that page, if you had access to the data, for better or worse. There are certainly opportunities for persuasion. Also, opportunities to lose our shared reality entirely.
Andy Luttrell: That’s the fear I jumped to, too. And I was also thinking about generational differences in this. I think you’ve seen a shift where younger folks are more like, yeah, I want a customized experience, wouldn’t I want this to be just for me? Whereas other generations grew up on true mass media, where we all got the same newspaper on Sunday morning and we all watched the same news report at six o’clock the night before. It just feels bizarre to think: I won’t have had the same experience as the person I sit next to at work, and I won’t have learned about this stuff in the same way as this person.
Sandra Matz: Or have something to talk about. For me, it’s like, wait, there was this James Bond trailer that was generated by AI, and it’s just amazing: written by AI, produced by AI. It looks like a perfect trailer, and it just never happened. And I always give this example: if AI can do this now, you could have your own James Bond movie based on everything that I know about you. Here’s the action that you like, here’s what you want the characters to look like, your favorite actors. Your James Bond movie could look totally different from the James Bond movie that I see.
Maybe that’s interesting, but it plays into your point: we’d have nothing to talk about. There’s something about collective experiences that makes life beautiful and makes social interactions valuable. And the more we personalize, the fewer of those we have.
Andy Luttrell: Yeah, it’s a great point. What does it mean to like something if you can’t share it? The first thing you’d want to do is go, oh, I’ve got to show this to my friend, they’re going to love this movie. And then: oh, wait. That was the point, wasn’t it? The point was to have something that would connect you to other people.
So, I have a question about this. It’s going to get a little technical, a question about the persuasion study. I include that paper in my attitudes seminar every other year when I teach it. And every time I go, wait a second, how did you do this? And I just can’t find it in the supplement. It’s probably a trivial answer, which is: how were you able to trace purchase outcomes to the targeted audience? How did you draw that line between the audience, the ad, and the purchase outcome?
Sandra Matz: Yeah. So in this case, we teamed up with a beauty retailer, so the ads were run through the beauty retailer, but with a separate audience for each of the four different conditions. We had one campaign that was extroverted audience, extroverted messages; one campaign doing the crossover, extroverted audience, introverted messages; then introverted audience, extroverted messages, and so on. And for each of the campaigns, you essentially see the clicks and conversion rates.
Andy Luttrell: But in terms of them then spending money at the website, was that through different online stores?
Sandra Matz: Oh, so this was one beauty retailer. In this case, all of the data came from the beauty retailer. And what Facebook does is give you a pixel, right?
It’s essentially almost like an identifier of the ad. You have to embed something in the website that says, every time this pixel gets hit, tell me that this purchase came from this campaign. So internally, the beauty retailer can link it. They see on their end: here’s person X, who came from campaign number eight. And because they can track person X’s purchases, that’s how it gets accumulated.
So, you’re absolutely right. You don’t see the purchases in your Facebook campaign. What your Facebook campaign shows you is clicks and conversions. The purchases you have to get from the retailer, because they can track it: they see, in this case, the session ID plus the campaign that the person came from, and that gets written into a database.
Andy Luttrell: Okay. That is about as straightforward as I thought it was. But every time, I mean, this is the part that students really seize on. They’re almost more concerned that the research is possible, right? How are you able to trace what I bought to the ad I saw to the person I am? It’s one thing to say, oh, proof of concept, it’s possible. But these things are actually so traceable that it’s relatively trivial to test the question of whether this happens.
Sandra Matz: Yeah, so, depending on the action that you want, you would just embed different pixels. Essentially, you can think of it almost like a trap. There are traps on the website. And if you get to the checkout for whatever headphones you’re trying to sell, then after the checkout, you fall into the trap of the little Facebook ad-campaign tracker, and that gets sent to Facebook.
That’s how I think of it. Essentially, like it’s a capture of your campaign ID that gets passed on in the URL.
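A minimal sketch of the mechanism Sandra describes: the campaign ID rides along in the URL, and a small handler on the post-checkout “thank you” page records which campaign a purchase came from. The function and field names here are illustrative; this is not Facebook’s actual pixel API.

```python
from urllib.parse import urlparse, parse_qs

# The retailer's internal log linking purchases to ad campaigns.
purchase_log = []

def handle_pixel_hit(url, session_id):
    """Record that this session's purchase came from a given campaign.

    The campaign ID was appended to the landing URL when the ad was
    clicked, so it is still present by the time the checkout page fires.
    """
    params = parse_qs(urlparse(url).query)
    campaign = params.get("campaign", ["unknown"])[0]
    purchase_log.append({"session": session_id, "campaign": campaign})
    return campaign

# The ad click lands the user on a URL carrying the campaign ID, so the
# retailer can later count purchases per campaign (and per condition).
handle_pixel_hit("https://shop.example/checkout/thanks?campaign=8", "sess-42")
print(purchase_log)
```

Aggregating `purchase_log` by campaign is all it takes to compare, say, the matched and mismatched conditions of the beauty-retailer study.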
Andy Luttrell: So, speaking of concern, and to come back to the doomsday version of this: how concerned should we be? You’ve spent a lot of time with this. Has it changed how you interface with messages online or your data online? Are you concerned about the scale of these kinds of interventions? What’s your take?
Sandra Matz: So, I’m concerned about a couple of things. One is that, first of all, we’re losing this collective excitement, this collective culture, that we talked about, in the sense that if we don’t see the same things and everything is personalized, we have nothing to talk about.
And there’s also no real opportunity for collective oversight. Propaganda is nothing new, right? Persuasion has been around forever. We talked about mass media. But right now, I don’t really see that oversight anymore. At least back in the day, we had an opportunity to talk about it and say, hey, did you see this nut case?
He was just posting this stuff, isn’t that just crazy? Now, I don’t even know what you see, and you don’t know what I see, so we don’t talk about this stuff as much anymore. That’s one concern that I have.
The other one is that we’re just going to become so boring. If everything is fully personalized, then the better these models get, the more it’s all about: here’s something that you’ve done in the past. Yes, maybe translating it into personality still makes it a little broader and opens it up to exploring around that. But if it’s all about trying to maximize fit with existing preferences, it’s going to get narrower and narrower, and I’m really worried that it’s just going to turn us into boring individuals who are somewhat one-dimensional.
I think there’s a New York Times article by Kashmir Hill where she outsourced her decision making for a week to LLMs and different kinds of generative AI. And I think she summarizes it nicely: it did a pretty good job, but it turned her into, in her words, a “basic bitch.” Everything was somewhat average and somewhat customized, but not fully; you lose some of this individuality.
And I think that’s the second part that I’m a little worried about. Now, coming to my own behavior: I think that’s one of the reasons why I’ve become a lot more pessimistic over the years about what we can do with user control. First of all, you have to keep up with everything that happens in the technology space, and that’s a full-time job. If you really wanted to protect yourself, you’d have to manage terms and conditions properly every time you download an app, go through all of the permissions, and, for all of the Europeans, go through the cookie stuff. It’s a full-time job, and we have better stuff to do. We’d much rather write another paper, spend some time with family, you name it, than go through all of the legalese in terms and conditions. So, I think there are better ways we can protect people in that space than just putting the burden on users.
Andy Luttrell: I wonder too, so maybe as a more optimistic note to end on: I have to imagine that you’re privy to how these kinds of targeted campaigns are being used in the wild, by people who are leveraging this sort of thing for their own outcomes. So I wonder, are there any that strike you as a particularly interesting or novel or hopeful way in which people are taking advantage of these digital traces? Something that makes you go: oh, this is better news, I would like more of that?
Sandra Matz: So many examples. I think the crux is that it always happens with user input. It’s: I have a goal, and you help me accomplish it. Whether that’s saving more, which is one of the campaigns that we did. Similar to the beauty retailer, we tailored messages based on personality to help people accomplish their saving goals, and you saw a similar effect, around a 60% increase in the number of people who managed to do that.
Mental health is another one, and it’s not just tracking. Usually, the way it works right now is that you have to get into this valley of depression and then try to fight your way out of it by getting support, which oftentimes doesn’t happen. It would be much better to say: I can passively track, using your data, GPS for example, that you’re not leaving the house as much anymore, there’s much less physical activity, you’re not taking as many calls as you used to, maybe the valence of your social media posts is going down and you’re using more first-person pronouns.
Maybe it’s nothing. Maybe you’re just on vacation and having a chill time. But it could be an early warning system. Then, if you take it to the next level and ask what we do about this, there’s obviously huge potential in helping you get better in a way that doesn’t require you to see a therapist in the flesh once a week for 500 bucks a session. And in most parts of the world, you don’t even have access to anyone. So I think there’s this huge potential to say: once we understand what you want and the kind of help you’re trying to get, we can cater to your needs, preferences, and so on.
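As a toy illustration of such an early-warning system: compare a week of passive signals against a personal baseline and count how many moved in the “withdrawal” direction. The features and thresholds are invented purely for the sketch; a real system would need careful clinical validation.

```python
def warning_score(baseline, current):
    """Count how many passive signals moved in the withdrawal direction."""
    flags = 0
    if current["places_visited"] < 0.5 * baseline["places_visited"]:
        flags += 1  # leaving the house much less
    if current["calls_made"] < 0.5 * baseline["calls_made"]:
        flags += 1  # far fewer calls than usual
    if current["post_valence"] < baseline["post_valence"] - 0.3:
        flags += 1  # social media posts turning more negative
    if current["first_person_pronoun_rate"] > 1.5 * baseline["first_person_pronoun_rate"]:
        flags += 1  # more first-person pronouns, a known linguistic marker
    return flags

# A personal baseline and one unusually withdrawn week (made-up numbers).
baseline = {"places_visited": 10, "calls_made": 8,
            "post_valence": 0.2, "first_person_pronoun_rate": 0.04}
quiet_week = {"places_visited": 3, "calls_made": 2,
              "post_valence": -0.2, "first_person_pronoun_rate": 0.07}

# Several flags could trigger a gentle check-in; it could also just be a vacation.
print(warning_score(baseline, quiet_week))
```

The point of the sketch is the shape of the idea, not the thresholds: the comparison is always against your own baseline, which is what makes it personalized rather than one-size-fits-all.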
Andy Luttrell: That is great. I’ll keep my fingers crossed for more of that. In the meantime, thank you so much for talking about this stuff. I really enjoyed the book and appreciated the chance to talk about it.
Sandra Matz: This was fun. Thank you.
Andy Luttrell: That’ll do it for this episode of Opinion Science. Thank you so much to Sandra Matz for taking the time to talk about her work. Her new book again is Mindmasters: The Data-Driven Science of Predicting and Changing Human Behavior. I definitely recommend it. You can check out the episode webpage for links to that book and to Sandra’s website where you can learn more about what she does.
And, of course, don’t you dare walk away without subscribing to this show, Opinion Science, wherever you get your podcasts. Plenty more good stuff is coming that you don’t want to miss. Also, hop over to OpinionSciencePodcast.com for all the past episodes, ways to support the show, and a bunch of other stuff.
And are you a straggler who hasn’t left a review yet on Apple Podcasts or some other podcast purveyor? That would be a digital trace you could be proud of. I don’t know, I’m just saying. Okay, that’s it for me. Happy March. Thanks for being here, and I’ll see you back next month for more opinion science. Bye-bye.