Text to Image Conversion using Stable Diffusion

Ashy Correya; Amrutha N

doi:10.54105/ijdm.A1639.04010524

Published: May 30, 2024

DOI: https://doi.org/10.54105/ijdm.A1639.04010524

Keywords:

Text-to-Image Conversion, Stable Diffusion, Latent Diffusion Model, Fine-Tuning, LAION-5B Dataset

Ashy Correya

Department of Computer Science, St. Albert’s College, Kochi (Kerala), India.

https://orcid.org/0009-0008-8438-9055

Amrutha N

Department of Computer Science, St. Albert’s College, Kochi (Kerala), India.

Abstract

In this paper, we introduce a technique for translating textual descriptions into visually compelling images using Stable Diffusion, with particular emphasis on the latent diffusion model (LDM). Our approach departs from conventional Generative Adversarial Network (GAN) methods such as AttnGAN, offering improved accuracy and diversity in the generated images. Through extensive experimentation and comparative analysis, we validate the efficacy of our method. We fine-tune the Stable Diffusion model on the LAION-5B dataset, which results in superior performance on text-to-image conversion tasks. Our findings show substantial gains in accuracy and point to the promise of Stable Diffusion-based approaches across a range of applications. By adopting these techniques, we overcome some of the limitations of previous methods and achieve higher fidelity in image generation while maintaining diverse outputs. Our method captures the intricate details and nuances specified in textual descriptions, giving a more faithful translation from text to image. The significance of our work extends beyond technical improvements. By pushing the boundaries of image synthesis, we contribute to the evolution of artificial intelligence and open new possibilities for creative expression and content generation. Our approach not only enhances the capabilities of AI systems but also democratizes image creation, allowing users to translate their ideas into images with little effort. Through our research, we aim to inspire further exploration and innovation in text-to-image conversion. The success of Stable Diffusion-based methods underscores their potential to change domains such as computer vision, graphic design, and multimedia content creation. As we continue to refine and optimize these techniques, we anticipate further progress in intelligent image synthesis and interpretation.
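The abstract does not describe the model's internals, so the following is only a general illustration of the forward (noising) process that a latent diffusion model is trained to invert. It is a minimal sketch under stated assumptions, not the authors' implementation: the linear noise schedule, its endpoints, the 1000-step horizon, the four-element stand-in latent, and all function names are hypothetical choices made for this example.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for s up to t."""
    products, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        products.append(running)
    return products


def add_noise(latent, t, alpha_bars, rng):
    """Draw z_t from N(sqrt(alpha_bar_t) * z_0, (1 - alpha_bar_t) * I)."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    noisy = [signal * z + sigma * e for z, e in zip(latent, noise)]
    return noisy, noise


rng = random.Random(0)                 # fixed seed for reproducibility
betas = linear_beta_schedule(1000)     # hypothetical 1000-step schedule
alpha_bars = cumulative_alphas(betas)
z0 = [1.0, -0.5, 0.25, 2.0]            # stand-in for an encoded image latent
z_mid, eps_mid = add_noise(z0, 499, alpha_bars, rng)  # partially noised
z_end, eps_end = add_noise(z0, 999, alpha_bars, rng)  # almost pure noise
```

In a latent diffusion model the vector `z0` would come from an image encoder, and a denoising network conditioned on the text prompt would be trained to predict the noise (`eps_mid` here) from the noised latent and the step index; generation then runs this process in reverse, starting from random noise.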

How to Cite

[1]

Ashy Correya and Amrutha N, “Text to Image Conversion using Stable Diffusion”, IJDM, vol. 4, no. 1, pp. 17–20, May 2024, doi: 10.54105/ijdm.A1639.04010524.

Issue

Vol. 4 No. 1 (2024): Volume-4 Issue-1, May 2024

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
