
Zero Shot Low Light Image Enhancement using Vision Language Models and Semantic Diffusion

Authors

  • Kashinath Remeshkumar

    Sree Buddha College Of Engineering Pattoor
    Author
  • Abhijith R R Abhijith

    Sree Buddha College Of Engineering Pattoor
    Author
  • Dan Philip Bobby

    Sree Buddha College Of Engineering Pattoor
    Author
  • Kevin Varghese Theveril

    Author
  • Hema H H Hema

    Author

Abstract

Capturing clear images in low-light conditions remains a significant challenge in surveillance, mobile photography, and diagnostic imaging. Traditional enhancement methods either require extensive paired datasets or risk introducing visual artifacts. This paper presents a zero-shot low-light image enhancement framework that combines a vision-language model (CLIP) with a latent diffusion model (Stable Diffusion) to enhance images without task-specific training. CLIP extracts semantic embeddings that guide the enhancement process, while the diffusion model performs iterative denoising to restore brightness and detail. By constraining enhancement through semantic similarity, our method preserves scene content while improving visibility. The system achieves competitive PSNR (15.556 dB) and SSIM (0.729) scores on standard benchmarks without requiring paired training data, indicating practical applicability to real-world deployments, including embedded and mobile platforms.
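
The two quantities the abstract leans on, semantic similarity between embeddings and PSNR between images, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine_similarity` and `psnr` are hypothetical, embeddings and images are represented as plain Python lists for illustration, and real CLIP embeddings would come from the model's image encoder rather than being typed in by hand. Cosine similarity is assumed here as the similarity measure, since it is the usual choice for CLIP embeddings; the abstract does not state which measure the paper uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two flat lists of pixel values."""
    if len(reference) != len(enhanced):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical embeddings of the input and the enhanced image.
# A value near 1.0 suggests the scene content was preserved.
similarity = cosine_similarity([0.2, 0.5, 0.1], [0.25, 0.45, 0.1])

# Hypothetical 8-bit pixel values for a reference and an enhanced image.
score_db = psnr([100, 100, 100, 100], [110, 90, 110, 90])
```

Under these assumptions, a similarity threshold could be used to reject enhancement results that drift from the original scene, while PSNR is computed only at evaluation time against a well-lit reference image.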

Keywords:

low-light enhancement, zero-shot learning, diffusion models, vision-language models

Published

29-05-2026

Section

Articles

How to Cite

[1]
K. Remeshkumar, A. R. R Abhijith, D. Philip Bobby, K. Varghese Theveril, and H. H. H Hema, “Zero Shot Low Light Image Enhancement using Vision Language Models and Semantic Diffusion”, IJERA, vol. 6, no. 1, pp. 77–83, May 2026, Accessed: May 29, 2026. [Online]. Available: https://ijera.in/index.php/IJERA/article/view/380
