Case Study: Enhancing Accessibility for Large-Scale Studies
At K-Optional Software, we build custom tech solutions that researchers can apply to large-scale studies. Over the years, we’ve designed and built bespoke platforms with features such as:
- “Smart surveys” that support hundreds of question pathways.
- Data export formats that conform to niche specifications.
- Offline functionality that protects Personally Identifiable Information (PII).
However, our latest challenge introduced a unique constraint: accommodating subjects with visual impairments (VI).
The Challenge: Accessibility for Visual Impairment (VI)
Think of the last time you used an app. Your sight likely drove the experience: where to click, whether a field is selected, and what page you are on all hinge on a visual model of the screen.
And even if you can depend on users’ sense of sight, providing a pleasant e-survey is no small task. This type of app must accept copious data points, convey the subject’s progress, and ask follow-up questions based on answers.
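To make that branching concrete, here is a minimal sketch of how answer-driven follow-up questions can be modeled. The question text, field names, and wildcard convention are all illustrative for this post, not our production schema.

```typescript
// Minimal sketch of a branching survey. Each answer maps to the id of
// the next question; `null` ends the pathway. All names are illustrative.
interface Question {
  id: string;
  prompt: string;
  next: Record<string, string | null>; // answer value -> next question id
}

const survey: Record<string, Question> = {
  q1: {
    id: "q1",
    prompt: "Do you take daily medication?",
    next: { yes: "q2", no: "q3" },
  },
  q2: { id: "q2", prompt: "Which medications?", next: { "*": "q3" } },
  q3: { id: "q3", prompt: "How many hours do you sleep?", next: { "*": null } },
};

// Resolve the follow-up for a given answer, falling back to the "*" wildcard.
function nextQuestionId(q: Question, answer: string): string | null {
  return q.next[answer] !== undefined ? q.next[answer] : q.next["*"] ?? null;
}
```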
We sought to preserve the rich interaction of a survey without relying on that key interface. Speech and sound afford a means of two-way contact with a user: we can give context by talking to a user and absorb context by listening to one. But as the old adage goes, a picture is worth a thousand words; we lose detail when we convert sight to sound.
Our Assessment: Leveraging WAI-ARIA for Accessibility
The WAI-ARIA spec empowers web apps to support a screen reader. Screen readers are built into phones and desktops; they read aloud the text on a page and emulate a cursor. We sought to leave intact these functions, which VI cohorts deftly use every day.
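As a rough illustration of what that wiring looks like, the sketch below labels a survey question with standard WAI-ARIA attributes. The element ids, roles, and copy are invented for this post, not lifted from the study.

```typescript
// Sketch: exposing a survey question to screen readers via WAI-ARIA.
// Ids and text are illustrative.
function renderQuestion(
  container: HTMLElement,
  prompt: string,
  progress: string
): void {
  const group = document.createElement("div");
  group.setAttribute("role", "group");
  group.setAttribute("aria-labelledby", "question-prompt");

  const heading = document.createElement("h2");
  heading.id = "question-prompt";
  heading.textContent = prompt;

  // A polite live region: screen readers announce progress updates
  // without interrupting whatever they are currently reading.
  const status = document.createElement("p");
  status.setAttribute("role", "status");
  status.setAttribute("aria-live", "polite");
  status.textContent = progress; // e.g. "Question 3 of 12"

  group.append(heading, status);
  container.replaceChildren(group);
}
```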
However, WAI-ARIA apps still rely on a full keyboard, which may not be suitable for a tablet-based survey. To bridge this gap, we designed three accessibility modes:
Audio Mode
In Audio Mode, the software uses TensorFlow.js to recognize user speech without sending any data off device. For free-response questions, we transcribe speech on device via a WebAssembly port of Whisper, a state-of-the-art speech recognition model.
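The study’s exact model isn’t something we can publish, but the publicly available @tensorflow-models/speech-commands package shows the shape of the approach: the FFT and inference run entirely in the browser, so audio never leaves the device. The vocabulary and threshold below are assumptions for illustration. (Free-response transcription with a Whisper WebAssembly build follows the same on-device pattern, with a different API.)

```typescript
import "@tensorflow/tfjs"; // peer dependency of the speech-commands model
import * as speechCommands from "@tensorflow-models/speech-commands";

// Listen for short spoken answers ("yes", "no", digits, ...) on device.
async function listenForAnswers(
  onAnswer: (word: string) => void
): Promise<void> {
  // BROWSER_FFT computes the spectrogram and runs the model in-browser,
  // so no audio is sent off device.
  const recognizer = speechCommands.create("BROWSER_FFT");
  await recognizer.ensureModelLoaded();

  const labels = recognizer.wordLabels();
  recognizer.listen(
    async (result) => {
      // Pick the highest-scoring label above the threshold.
      const scores = result.scores as Float32Array;
      let best = 0;
      for (let i = 1; i < scores.length; i++) {
        if (scores[i] > scores[best]) best = i;
      }
      onAnswer(labels[best]);
    },
    { probabilityThreshold: 0.9, overlapFactor: 0.5 } // illustrative tuning
  );
}
```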
Single Button Mode
Single Button Mode simplifies interaction by allowing users to press, hold, or “mash” a single key for answering, skipping, or accessing a menu.
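Here is a sketch of how those three gestures might be disambiguated from raw key events. The thresholds are illustrative guesses, not the tuned values from the study, and a production version would also debounce single presses before committing them.

```typescript
// Sketch: classifying one physical key into press / hold / mash.
type Gesture = "press" | "hold" | "mash";

const HOLD_MS = 600;  // held longer than this counts as a hold
const MASH_MS = 350;  // max gap between presses in a mash
const MASH_COUNT = 3; // presses needed to register a mash

function watchSingleButton(onGesture: (g: Gesture) => void): void {
  let downAt = 0;
  let lastUp = 0;
  let streak = 0;

  window.addEventListener("keydown", (e) => {
    if (!e.repeat) downAt = performance.now(); // ignore key auto-repeat
  });

  window.addEventListener("keyup", () => {
    const now = performance.now();
    if (now - downAt >= HOLD_MS) {
      streak = 0;
      onGesture("hold"); // e.g. skip the question
      return;
    }
    // Count consecutive quick presses toward a mash.
    streak = now - lastUp <= MASH_MS ? streak + 1 : 1;
    lastUp = now;
    if (streak >= MASH_COUNT) {
      streak = 0;
      onGesture("mash"); // e.g. open the menu
    } else {
      onGesture("press"); // e.g. confirm the current answer
    }
  });
}
```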
Pure Screen Reader Mode
This mode ensures compatibility with traditional screen reader functionality.
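Beyond the WAI-ARIA markup shown earlier, the main engineering work here is keeping keyboard focus predictable. One common pattern, sketched under our own assumptions rather than from documented study code:

```typescript
// Sketch: the "roving tabindex" pattern, so a screen reader user can
// Tab to the answer group and arrow through options. Illustrative only;
// the answers' container element would also carry role="radiogroup".
function showAnswers(prompt: HTMLElement, answers: HTMLElement[]): void {
  prompt.tabIndex = -1; // focusable by script, skipped in Tab order
  prompt.focus();       // focusing the prompt makes the reader announce it

  answers.forEach((el, i) => {
    el.setAttribute("role", "radio");
    el.setAttribute("aria-checked", "false");
    el.tabIndex = i === 0 ? 0 : -1; // only one answer is Tab-reachable
  });
}
```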
Results and Ongoing Progress
This study is ongoing, and we don’t expect results for some time. So far we’ve tracked these benchmarks internally:
- Over 95% accuracy for spoken answers on the first attempt.
- Near-perfect accuracy once users confirm each answer.
- A 70% reduction in completion speed compared to a visual survey.
- Support for a wide variety of devices, including phones, PCs, and Macs.
- Complete anonymity: all speech encoding occurs on device, while end-to-end encrypted results transmit to a single workstation (see the sketch below).
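To illustrate that last point, here is a sketch of on-device encryption using the standard Web Crypto API. Key distribution is simplified: in a true end-to-end setup, the AES key would be wrapped with the workstation’s public key, a step omitted here.

```typescript
// Sketch: encrypting answers on device before upload (Web Crypto API).
// Shows only the symmetric step; key wrapping for the workstation's
// public key is omitted.
async function encryptAnswers(
  answers: unknown,
  key: CryptoKey
): Promise<{ iv: Uint8Array; ciphertext: ArrayBuffer }> {
  const iv = crypto.getRandomValues(new Uint8Array(12)); // fresh nonce per message
  const plaintext = new TextEncoder().encode(JSON.stringify(answers));
  const ciphertext = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv },
    key,
    plaintext
  );
  return { iv, ciphertext };
}
```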
As we continue this study, we anticipate further refinements and improvements in accessibility for users with visual impairments, ultimately enhancing the inclusivity of our technological solutions for large-scale research studies.