Unparsed 2024
The team speaks about usability testing at this highlight of the CxD calendar
6/19/2024 · 2 min read
Spencer and Ufonia's Nikoletta Ventoura (with Adam in absentia) spoke at this year's Unparsed conference in London, discussing what Conversation Analysis can bring to usability testing in Conversational AI development.
The talk draws on data from the Dora corpus to show how usability testers behave differently from the actual end users for whom the system is being developed. This risks producing a conversational agent that is less able to cope with end users' expectations and behaviours. In turn, users have to adjust what they want to achieve in the interaction, limiting their input to the kinds of behaviour that testers produce.
In the presentation, we argued that one way to address this would be to study data from equivalent human-human interactions. This would flag to conversation designers a range of ways that people behave in these types of interaction that are difficult to predict or imagine, and could feed into the design process to augment the usability testing.
Nikoletta also discussed how to curate a targeted cohort of testers, as a way to improve the quality of feedback from the testing process.
Nikoletta received a Prompty Award (One to Watch), and Dora received a Prompty Award for Best Conversational User Interface.
Abstract
Highly regulated industries, like healthcare, require evaluation of technology before it can be deployed for use by the wider public. This leads tech companies to recruit user testers to aid in the development of their product. While this is a more sophisticated and accurate process than, say, the Wizard of Oz method, the question remains: do user testers engage and interact with systems in the same way that actual users do?
Conversation Analysis has proved to be an effective research tool for demonstrating the differences between role-play interactions (such as those used in police training, GP training, and mystery shopper interactions) and their high-stakes ‘real’ equivalents. It has shown how interactants behave differently, in both subtle and not-so-subtle ways, when the stakes are different. For example, role-playing speakers may overuse certain phrases or exaggerate particular actions if they are being assessed for a professional certification. However, little research has examined how such differences play out in the testing of conversational user interfaces.
In this presentation, we use Conversation Analysis to compare and contrast the use of Dora – Ufonia’s automated clinical assistant – by user testers and real NHS patients. Based on our observations, we provide suggestions for how developers should approach the use of user testers in product development.