Novel Python Research Tool to Test Voice Transcription

Python-based research tool for conducting an HCI experiment on speech-to-text transcription accuracy under various levels of ambient noise.

Programming Languages: Python

Project Overview

This research tool was developed to automate an HCI study examining the impact of ambient background noise and speech rate on voice assistant transcription accuracy. The tool systematically presents transcripts for participants to read aloud while playing background noise at different levels, then records and transcribes their speech for analysis.

Challenges

Ensuring precise timing and synchronization of transcript presentation with background noise playback was a key challenge. Additionally, automating speech capture and transcription while logging performance metrics, such as word error rate (WER) and words per minute (WPM), required careful implementation of a robust data processing pipeline.
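As an illustration of the two metrics named above, here is a minimal sketch of how WER and WPM are commonly computed. It is not the project's actual code: the function names, the lowercase-and-split tokenization, and the choice to count words in the reference transcript are assumptions made for this example. WER is taken as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: both strings are tokenized by lowercasing and splitting
    on whitespace; a real pipeline might also strip punctuation.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Speech rate: number of words scaled to a one-minute window."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return len(transcript.split()) * 60.0 / duration_seconds
```

For example, comparing the reference "the quick brown fox" with the transcription "the quick fox" gives one deletion over four reference words, so a WER of 0.25. Note that WER can exceed 1.0 when the transcription contains many inserted words.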

Solution

The tool was built using Python with a multi-threaded architecture, ensuring smooth execution of simultaneous tasks, including transcript display, audio playback, and voice recording. It leverages Google’s speech-to-text API for transcription and calculates key performance metrics like WER and WPM. The experiment results are logged in a structured CSV format for further statistical analysis.
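The sketch below shows one way the simultaneous tasks and the CSV logging described above could be arranged using only the standard library. It is an assumption-laden illustration, not the tool's implementation: `run_trial`, `log_result`, the task names, and the CSV column names are all hypothetical, and the display, playback, and recording tasks are left as caller-supplied callables. A `threading.Barrier` releases all workers together so their start times stay close.

```python
import csv
import threading
import time

# Hypothetical column layout; the real tool's CSV schema is not given in the source.
FIELDS = ["participant", "noise_level", "wer", "wpm"]


def run_trial(tasks, timeout=5.0):
    """Run each named task in its own thread, released together by a barrier.

    `tasks` maps a task name to a zero-argument callable (for example
    transcript display, noise playback, voice recording). Returns the
    monotonic start time recorded for each task.
    """
    barrier = threading.Barrier(len(tasks))
    start_times = {}
    lock = threading.Lock()

    def worker(name, task):
        # Block until every worker is ready, then start together.
        barrier.wait(timeout=timeout)
        started = time.monotonic()
        with lock:
            start_times[name] = started
        task()

    threads = [
        threading.Thread(target=worker, args=(name, task))
        for name, task in tasks.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return start_times


def log_result(stream, row, write_header=False):
    """Append one trial's metrics to an open text stream in CSV form."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerow(row)
```

In use, a file opened with `open("results.csv", "a", newline="")` could be passed as `stream`, with `write_header=True` only for the first row. The barrier narrows the gap between task start times but does not guarantee sample-accurate synchronization; the recorded start times make any residual offset measurable.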

Results

The tool successfully facilitated the HCI research study, automating data collection and analysis while improving the efficiency and consistency of the experimental process. The collected data provided insights into the effects of background noise on transcription accuracy, contributing to a broader understanding of voice assistant usability.
