SPEAK: An AI-Based Assistive Video Communication System for Speech and Sign Language Translation
Abstract
Effective communication between the hearing and deaf/hard-of-hearing (DHH)
communities remains difficult, especially in digital video conferencing.
Widely used platforms such as Zoom and Google Meet lack native, real-time
bidirectional translation, so users frequently have to rely on costly human
interpreters or invasive hardware sensors. To close this modality gap,
this paper presents SPEAK (Sign Processing Enhanced Audio
Kommunicator), a novel sensor-less browser-based platform. By
translating spoken language to text captions for DHH users
and sign language to text/speech for hearing users, SPEAK
enables smooth, two-way communication.
For visual recognition, the system’s architecture uses the Detection
Transformer (DETR) model with a ResNet-50 backbone. In contrast to
conventional CNN-based detectors that rely on region proposals, DETR
formulates detection as a direct set prediction problem using a bipartite
matching loss and self-attention mechanisms, which enhances robustness
against complex backgrounds and removes the need for intricate, handcrafted
anchors. The audio pipeline incorporates OpenAI’s Whisper model for
high-fidelity Automatic Speech Recognition (ASR) and Microsoft’s SpeechT5
for natural Text-to-Speech (TTS) synthesis, and uses Voice Activity
Detection (VAD) to save bandwidth. To guarantee synchronization between
video frames and translation outputs, all modules are coordinated
within a low-latency WebRTC environment using a Flask-React
framework. SPEAK is validated as a scalable, affordable solution
for inclusive digital interaction after experimental evaluation on
a custom dataset in various lighting conditions shows a sign
detection accuracy of 92
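The bipartite matching idea behind DETR's set-prediction loss, mentioned above, can be illustrated with a small sketch. This is not the SPEAK implementation: it is a minimal, standard-library illustration that assumes a simplified matching cost (negative class probability plus L1 box distance, with the GIoU term that DETR also uses left out) and brute-force search in place of the Hungarian algorithm. The sign labels, boxes, and function names are hypothetical.

```python
from itertools import permutations


def box_l1(a, b):
    """L1 distance between two boxes given as (cx, cy, w, h) in [0, 1]."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_cost(pred, target):
    """Cost of assigning one prediction to one ground-truth object.

    pred   = (class_probs: dict[str, float], box)
    target = (label: str, box)
    Simplified assumption: -p(label) + L1 box distance (no GIoU term).
    """
    probs, pred_box = pred
    label, target_box = target
    return -probs.get(label, 0.0) + box_l1(pred_box, target_box)


def bipartite_match(preds, targets):
    """Return the one-to-one assignment with minimum total cost.

    Brute force over permutations, which is only practical for a handful
    of objects. Assumes len(preds) >= len(targets).
    """
    best_cost, best_assign = float("inf"), ()
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[p], targets[t]) for t, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best_assign = cost, perm
    # best_assign[t] is the index of the prediction matched to target t.
    return list(best_assign), best_cost


# Hypothetical example: three predicted slots, two ground-truth signs.
preds = [
    ({"hello": 0.9, "thanks": 0.1}, (0.5, 0.5, 0.2, 0.2)),
    ({"hello": 0.2, "thanks": 0.7}, (0.3, 0.6, 0.2, 0.3)),
    ({"hello": 0.1, "thanks": 0.1}, (0.8, 0.2, 0.1, 0.1)),
]
targets = [
    ("thanks", (0.3, 0.6, 0.2, 0.3)),
    ("hello", (0.5, 0.5, 0.2, 0.2)),
]

assignment, total = bipartite_match(preds, targets)
print(assignment)  # → [1, 0]
```

Each ground-truth object is matched to exactly one prediction, and the unmatched third slot would be trained toward a "no object" class. Because the matching is one-to-one, duplicate detections are penalized during training rather than filtered afterwards, which is why no anchors or region proposals are needed.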
Keywords:
Sign Language Recognition, DETR, WebRTC, OpenAI Whisper, Assistive Technology, Deep Learning
License
Copyright (c) 2026 International Journal on Emerging Research Areas

This work is licensed under a Creative Commons Attribution 4.0 International License.
All published work in this journal is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
