Brown CS Blog

Brown CS PhD Student Zainab Iftikhar Asks How Machines Understand Empathy

    [Screenshot: an excerpt from a text-based peer support conversation]

    The above conversation is a snapshot of an exchange between a peer trained in support techniques and a person seeking support. Current computational models rated the peer’s response as 1 out of 6 in empathy; how would you rate it?

    Do a machine and a human understand empathy in the same way? Brown CS doctoral student Zainab Iftikhar and her collaborators suggest not entirely. 

    The emergence of ChatGPT has stirred up discussions around the essence of humanity in the world of artificial intelligence. Traditionally, humans have been recognized for their critical thinking, compassion, and empathy. That changed when large language models joined the club: computational psychotherapy has shifted toward using AI to develop models that write empathetically, blurring the lines between human and machine understanding of empathy.

    So, do these models share the same understanding of empathy as we do? A study conducted by researchers at Brown University, soon to be published at the 2024 ACM CHI Conference on Human Factors in Computing Systems, sheds light on the disparity between human and machine interpretations of written empathy. While users often perceive high levels of empathy in their interactions, computational models tend to score these same interactions lower in empathy. The disparity arises because machine models prioritize flawless sentence structure over the human nuances that underpin genuine empathy and connection. The study highlights the challenge of navigating the intangible aspects of human qualities in the digital realm, particularly in the context of mental health support through AI. The authors offer insights into why human qualities are hard to reduce to a single metric and how the digital mental health industry can develop human-centered metrics instead.

    "We're likely going to see an increasing human-AI interaction for mental health support," says Zainab, a doctoral student in Computer Science at Brown and one of the lead authors of the study. Iftikhar foresees an increasing reliance on human-AI interactions for mental health assistance since users, in light of the acute shortage of mental health, have turned to alternate solutions. She emphasizes the importance of designing these interactions with the users' perspectives at the forefront, highlighting the need to align algorithms with the experiences of those directly impacted by them.

    "How will we design to support this interaction, and what can that interaction look like? We will have to sanction these algorithms with users directly affected by them."

    Driven by this goal, Iftikhar, her co-author Sara Syed, a former undergraduate student, and Jeff Huang, an associate professor of Computer Science at Brown, are studying online peer counseling conversations based on Cognitive Behavioral Therapy (CBT). By analyzing exchanges from a text-based peer support service called Cheeseburger Therapy, the researchers uncovered factors that influence empathy in text-based mental health support conversations and assessed whether state-of-the-art machine learning models and humans share an understanding of empathy.

    Each conversation on Cheeseburger Therapy takes place between a peer trained in CBT skills and an individual seeking support. While Cheeseburger does not provide any diagnostic services, it offers a platform for users to apply CBT techniques to help others navigate their challenges. Peer supporters, referred to as helpers, employ empathetic strategies like active listening, validation, and cognitive restructuring (helping replace unhelpful thoughts with empowering ones) to assist users in reframing their perspectives. At the end of each session, users reflect on the interaction, sharing feedback on what went well and areas for improvement. They also assess whether they perceived their helper as empathetic. 

    The study found that, strikingly, despite 85% of users reporting that their helpers were empathetic, state-of-the-art algorithms assigned these interactions low empathy scores, an average of 1.6 out of a possible 6. This discrepancy can be attributed to the models analyzing the interactions in isolation, without considering the conversational context, and to a bias toward prioritizing flawless sentence structure over nuanced human connection.

    Insisting on the maximum quantitative empathy these algorithms reward may not always align with the natural flow of a session, since it requires prolonged responses that can disrupt the organic exchange of dialogue. Throughout the conversations, helpers engaged in small talk, shared personal anecdotes, discussed Wi-Fi problems, or diverged into unrelated topics to ease into the conversation. While these utterances may not directly correlate with an algorithmic understanding of empathy, they contribute to essential components of a successful peer support session, such as building a therapeutic alliance, enhancing social presence, and fostering deeper connections.

    “The practice of assigning a singular score to a support provider's responses could introduce bias, particularly for non-native English speakers.”

    This evaluative approach diverts attention from crucial empathy-building strategies, like establishing an early connection, creating safe spaces for exploration, and validating emotions through shared experiences, that users' feedback repeatedly highlighted. While quantifying empathy may offer benefits, such as providing real-time feedback to support providers or suggesting modified responses to enhance the expression of empathy, such an evaluation method places an undue burden on support providers.

    "When we talk about scalable mental health support through AI, the common inclination is to emphasize expanding datasets and refining model accuracy," remarks Iftikhar. "However, since mental health support hinges on subtlety and personalization, focusing on generalized approaches can miss out some important things that resonate with people seeking support."

    The researchers describe a complex dance between human and machine understandings of empathy in text-based interactions, raising important questions about how to ground AI-driven mental health support in human-centered metrics. As technology continues to shape our interactions and support systems, understanding the gap between computational models and human emotions is critical to designing effective and empathetic digital solutions.