Talklingo: A Smart Solution for Multilingual Communication

Authors

  • Ms. R. R. Owhal, Assistant Professor, Artificial Intelligence and Data Science Department, All India Shri Shivaji Memorial Society’s Institute of Information Technology, Pune, Maharashtra, India.
  • Pauravi Vinchurkar, Artificial Intelligence and Data Science Department, All India Shri Shivaji Memorial Society’s Institute of Information Technology, Pune, Maharashtra, India.
  • Harsh Raut, Artificial Intelligence and Data Science Department, All India Shri Shivaji Memorial Society’s Institute of Information Technology, Pune, Maharashtra, India.
  • Abhijeet Ravatale, Artificial Intelligence and Data Science Department, All India Shri Shivaji Memorial Society’s Institute of Information Technology, Pune, Maharashtra, India.
  • Swapnil Pokale, Artificial Intelligence and Data Science Department, All India Shri Shivaji Memorial Society’s Institute of Information Technology, Pune, Maharashtra, India.

DOI:

https://doi.org/10.47392/IRJAEH.2025.0064

Keywords:

RAG (Retrieval-Augmented Generation), LLMs (Large Language Models), ASR (Automatic Speech Recognition), MT (Machine Translation), TTS (Text-to-Speech synthesis)

Abstract

Language barriers hinder access to essential services such as education and healthcare and impede global collaboration, creating a pressing need for efficient multilingual communication tools. Traditional text-based translators are useful but handle natural, spontaneous speech poorly, which makes them inadequate for live conversations. To address this gap, TalkLingo provides a speech-to-speech translation system that integrates Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS) synthesis. The system uses Whisper ASR for speech recognition, mT5 with Retrieval-Augmented Generation (RAG) for translation, and Edge TTS for voice output. Incorporating RAG into the MT stage lets the system draw on a knowledge base at translation time, which improves translation accuracy and context-awareness. Large language models (LLMs) are used to improve translation quality and speed, with the goal of real-time performance even in fast-paced dialogue while balancing speed against accuracy. In testing, TalkLingo outperformed traditional systems in accuracy and fluency across diverse languages, accents, and challenging environments. Its RAG-based architecture provides a significant advantage over conventional models, particularly for low-resource languages and complex linguistic nuances. Applications range from helping travelers and professionals communicate across languages to supporting individuals with speech impairments. By enabling natural, real-time conversation across languages, TalkLingo aims to foster inclusivity and global connectivity.
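
The three-stage flow described in the abstract (speech recognition, retrieval-augmented translation, speech synthesis) can be illustrated with a minimal sketch. This is not the authors' implementation: every name below (`retrieve`, `build_prompt`, `SpeechTranslator`) is hypothetical, the retrieval step uses simple word overlap as a stand-in for whatever retriever the paper uses, and the ASR, MT, and TTS stages are passed in as plain callables where a real system would wrap Whisper, mT5, and Edge TTS.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

# Hypothetical translation memory entry: (source sentence, target sentence).
Pair = Tuple[str, str]


def tokenize(text: str) -> Set[str]:
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return set(text.lower().split())


def retrieve(query: str, memory: List[Pair], k: int = 2) -> List[Pair]:
    """Return up to k memory pairs ranked by Jaccard word overlap with the query.

    Assumption: word overlap stands in for a real dense or sparse retriever.
    Pairs with no overlap at all are dropped.
    """
    q = tokenize(query)

    def score(pair: Pair) -> float:
        s = tokenize(pair[0])
        union = q | s
        return len(q & s) / len(union) if union else 0.0

    ranked = sorted(memory, key=score, reverse=True)
    return [p for p in ranked[:k] if score(p) > 0.0]


def build_prompt(text: str, examples: List[Pair], target_lang: str) -> str:
    """Prepend retrieved examples to the input so the MT model sees context."""
    lines = [f"translate to {target_lang}:"]
    for src, tgt in examples:
        lines.append(f"example: {src} => {tgt}")
    lines.append(f"input: {text}")
    return "\n".join(lines)


@dataclass
class SpeechTranslator:
    """Chains ASR -> retrieval -> MT -> TTS. All stages are injected callables."""
    asr: Callable[[bytes], str]        # audio bytes -> source-language text
    translate: Callable[[str], str]    # augmented prompt -> target-language text
    tts: Callable[[str], bytes]        # target-language text -> audio bytes
    memory: List[Pair]
    target_lang: str = "German"

    def run(self, audio: bytes) -> bytes:
        text = self.asr(audio)
        examples = retrieve(text, self.memory)
        prompt = build_prompt(text, examples, self.target_lang)
        return self.tts(self.translate(prompt))


if __name__ == "__main__":
    # Stub stages so the sketch runs without any model downloads.
    demo_memory = [("where is the station", "wo ist der Bahnhof"),
                   ("thank you", "danke")]
    pipeline = SpeechTranslator(
        asr=lambda audio: audio.decode("utf-8"),
        translate=lambda prompt: prompt,   # echo the prompt instead of translating
        tts=lambda text: text.encode("utf-8"),
        memory=demo_memory,
    )
    print(pipeline.run(b"where is the hospital").decode("utf-8"))
```

Keeping each stage behind a plain callable means a stage can be swapped (for example, a different ASR model) without touching the retrieval or prompt-building logic, which is one plausible way to structure a system of this kind.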

Published

2025-03-15

How to Cite

Talklingo: A Smart Solution for Multilingual Communication. (2025). International Research Journal on Advanced Engineering Hub (IRJAEH), 3(03), 465-472. https://doi.org/10.47392/IRJAEH.2025.0064
