Generating Visuals for Non-Words Using Phonetic and Phonological Similarity Models
DOI: https://doi.org/10.47392/IRJAEH.2025.0305

Keywords: Text-to-Image Generation, Nonword Processing, Phonetic Conversion, Phonological Analysis, Deep Learning, Computational Creativity, Auditory-to-Visual Mapping

Abstract
Generating images from textual descriptions is a complex and captivating area of artificial intelligence, blending computational creativity with linguistic analysis. This study explores the relationship between phonetic and phonological structures and their visual representations, extending text-to-image generation to nonwords, i.e., pronounceable constructs that are not attested words in a given language. By analyzing acoustic properties such as intonation, rhythm, and stress patterns alongside linguistic features, the system establishes a robust mapping between auditory inputs and visual outputs. A phonetic conversion and phonological similarity module transforms nonwords into meaningful embeddings, which an interpolation module and an image synthesis network then render as images. The model generates images that align with the phonetic character of both real and imagined language constructs, extending text-to-image synthesis beyond conventional semantics. This approach offers applications in language learning, digital art, and AI-driven creativity, enhancing contextual relevance in text-based visual generation.
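To make the pipeline described above concrete, the following is a minimal, self-contained sketch of one plausible realization: a toy grapheme-to-phoneme mapping stands in for the phonetic conversion module, normalized edit distance over phoneme strings stands in for phonological similarity, a similarity-weighted mixture of lexicon embeddings stands in for the interpolation module, and a random linear decoder stands in for the image synthesis network. Every component, name, and parameter here is an illustrative assumption, not the authors' implementation.

```python
# Hypothetical sketch of a nonword-to-image pipeline; all components are stand-ins.
import numpy as np

# --- 1. Phonetic conversion: toy grapheme-to-phoneme rules (assumption) ---
G2P = {"ch": "tʃ", "sh": "ʃ", "th": "θ", "a": "æ", "e": "ɛ",
       "i": "ɪ", "o": "ɒ", "u": "ʌ"}

def to_phonemes(word: str) -> str:
    out, i = [], 0
    while i < len(word):
        if word[i:i + 2] in G2P:                  # match digraphs first
            out.append(G2P[word[i:i + 2]]); i += 2
        else:
            out.append(G2P.get(word[i], word[i])); i += 1
    return "".join(out)

# --- 2. Phonological similarity: normalized edit distance over phonemes ---
def edit_distance(a: str, b: str) -> int:
    d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    d[:, 0] = np.arange(len(a) + 1)
    d[0, :] = np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1,
                          d[i - 1, j - 1] + (a[i - 1] != b[j - 1]))
    return int(d[-1, -1])

def similarity(a: str, b: str) -> float:
    return 1.0 - edit_distance(a, b) / max(len(a), len(b), 1)

# --- 3. Interpolation: similarity-weighted mixture of known-word embeddings ---
rng = np.random.default_rng(0)
lexicon = {w: rng.normal(size=64) for w in ["cat", "hat", "ship", "chip"]}  # stand-in vectors

def nonword_embedding(nonword: str) -> np.ndarray:
    p = to_phonemes(nonword)
    weights = np.array([similarity(p, to_phonemes(w)) for w in lexicon])
    weights /= weights.sum()
    vectors = np.stack(list(lexicon.values()))
    return weights @ vectors                      # interpolated embedding

# --- 4. Image synthesis stub: a real system would condition a generator on it ---
def synthesize_image(embedding: np.ndarray) -> np.ndarray:
    W = rng.normal(size=(embedding.size, 32 * 32))   # placeholder decoder weights
    return (embedding @ W).reshape(32, 32)

image = synthesize_image(nonword_embedding("shap"))  # "shap": a pronounceable nonword
print(image.shape)  # (32, 32)
```

In a full system, the placeholder decoder would be replaced by a trained conditional image generator, and the stand-in lexicon embeddings by embeddings learned jointly with the synthesis network.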
License
Copyright (c) 2025 International Research Journal on Advanced Engineering Hub (IRJAEH)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.