Detecting the Undetectable: Human Judgments and the Challenge of Synthetic Voices
Abstract
Synthetic voices generated using artificial intelligence (AI) are becoming increasingly indistinguishable from human voices, raising important concerns about trust, deception, and detection in digital communication. This preliminary work surveys current research on how humans perceive and detect synthetic voices. We reviewed 13 papers from databases including ACM, IEEE, Springer, and MDPI, and identified five main types of perceptual cues that users rely on to detect voice synthesis: Intuition/Gut Feeling, Liveliness, Emotions, Linguistic Features, and Acoustic and Environmental Features. Our findings highlight the need for further empirical user studies to better understand how individuals perceive and assess the risks posed by synthetic voices. Such research can inform both educational and regulatory strategies aimed at increasing awareness and mitigating the potential harms of synthetic voice technologies.