Lip Reading and Reconstruction using ML
Abstract
Lip reading is the technique of understanding speech by visually interpreting lip movements. Although it is most often used by people who are deaf or hard of hearing, most people with normal hearing also process some speech information from the sight of the moving mouth. In addition, visual cues from the lips can improve the intelligibility of conversation in noisy environments. This paper proposes a model that examines the impact of cross-modal self-supervision for speech reconstruction (video to audio) by taking advantage of the natural co-occurrence of audio and visual streams in videos. The model uses an autoregressive encoder-decoder architecture with attention to map sequences of silent facial movements directly to mel-scale spectrograms for speech reconstruction, and it requires no human annotation.
Keywords:
lip reading, self-supervised pre-training, speech recognition, speech reconstruction
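The pipeline named in the abstract (attention over encoded video frames, autoregressive decoding, mel-scale output) can be illustrated with a minimal, untrained sketch. Everything below is an assumption for illustration and not the authors' implementation: the function names `hz_to_mel`, `attend`, and `decode` are hypothetical, the encoder states are given as plain lists, the decoder update is a fixed average rather than a learned layer, and the mel conversion uses the common HTK-style formula.

```python
import math


def hz_to_mel(f_hz):
    """HTK-style mel scale (assumed variant): 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Dot-product attention: returns weights over encoder states and their weighted sum."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


def decode(encoder_states, n_frames):
    """Autoregressive loop: each output frame is conditioned on the previous frame."""
    dim = len(encoder_states[0])
    prev = [0.0] * dim  # all-zero start frame
    frames = []
    for _ in range(n_frames):
        _, context = attend(prev, encoder_states)
        # Hypothetical untrained update; a real decoder would apply learned layers here.
        prev = [0.5 * p + 0.5 * c for p, c in zip(prev, context)]
        frames.append(prev)
    return frames
```

For simplicity the toy decoder assumes the encoder states and the output frames share the same dimension; in a trained model the output frames would be mel-spectrogram frames produced by learned projections, with the audio track of the same video serving as the training target in place of human labels.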
License
Copyright (c) 2023 International Journal on Emerging Research Areas

This work is licensed under a Creative Commons Attribution 4.0 International License.
All published work in this journal is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
