Lip Reading and Reconstruction using ML
Abstract
Lip reading is the comprehension of speech through visual interpretation of lip movements. Although lip reading is most often used by people who are deaf or hard of hearing, most people with normal hearing also process some speech information from the sight of the moving mouth. In addition, the visual cues used in lip reading can improve the intelligibility of conversation in noisy environments. This paper proposes a model that examines the impact of cross-modal self-supervision on video-to-audio speech reconstruction by taking advantage of the natural co-occurrence of audio and visual streams in videos. The model uses an autoregressive encoder-decoder architecture with attention to map sequences of silent facial movements directly to mel-scale spectrograms for speech reconstruction, and it requires no human annotation.
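As a rough illustration of the ideas named in the abstract, the following minimal sketch shows a mel-scale conversion, a dot-product attention step, and an autoregressive decoding loop. It is not the authors' model. The function names (`hz_to_mel`, `attend`, `decode`) are hypothetical, the mel formula is the HTK convention (one of several in use), and the averaging step is a stand-in for the learned decoder that a real system would train.

```python
import math

def hz_to_mel(freq_hz):
    """Convert Hz to mels using the HTK convention (an assumption; others exist)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    """Dot-product attention: weight each encoded video frame by its
    similarity to the query, then return the weighted average."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return context, weights

def decode(encoder_states, num_frames):
    """Autoregressive loop: each output frame is computed from the previous
    output frame and an attention context over the encoder states."""
    dim = len(encoder_states[0])
    prev = [0.0] * dim  # all-zero start frame
    frames = []
    for _ in range(num_frames):
        context, _ = attend(prev, encoder_states)
        # Hypothetical stand-in for a learned decoder cell: a fixed average.
        prev = [0.5 * p + 0.5 * c for p, c in zip(prev, context)]
        frames.append(prev)
    return frames

if __name__ == "__main__":
    # Two toy "video frame" encodings; a real encoder would produce these.
    states = [[1.0, 0.0], [0.0, 1.0]]
    for frame in decode(states, num_frames=3):
        print(frame)
```

In a trained system the output frames would be projected to the number of mel bins, and the audio track of the same video would serve as the training target, which is why no human annotation is needed.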
Keywords:
lip reading, self-supervised pre-training, speech recognition, speech reconstruction
License
Copyright (c) 2023 International Journal on Emerging Research Areas

This work is licensed under a Creative Commons Attribution 4.0 International License.
All published work in this journal is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
