Lightweight Transformer-Based Cyberbullying Detection for English, Malayalam, and Manglish Social Media Texts

Authors

  • Clive Lawrence Xavier, Scholar, Department of Computer Science, Sacred Heart College, Kochi, India.
  • C Vishnu Mohan, Assistant Professor, Department of Computer Science, Sacred Heart College, Kochi, India.

DOI:

https://doi.org/10.47392/IRJAEH.2025.0661

Keywords:

Cyberbullying detection, NLP, Malayalam, Manglish, IndicBERT v

Abstract

Detecting cyberbullying in multilingual and informal online environments remains a significant challenge, particularly for low-resource languages such as Malayalam and for mixed-script variants like Manglish. This work proposes a lightweight, deployment-oriented framework for cyberbullying detection that aims to classify English, Malayalam, and Manglish social media text with high accuracy and efficiency. The design centers on a single multilingual transformer encoder—IndicBERT v2—adapted using Low-Rank Adaptation (LoRA) to reduce computational requirements while preserving strong representational capacity. A combined dataset, envisioned to include publicly available internet text, existing corpora, and generative AI–augmented samples, is planned to support broad linguistic and contextual coverage. Minimal preprocessing and native SentencePiece tokenization are incorporated into the design to retain natural text characteristics across diverse languages. The proposed system is intended to output binary bullying predictions alongside interpretable indicators such as class-level confidence scores and attention-based token importance. Further optimizations, including INT8 quantization and ONNX/TFLite export, are outlined to facilitate efficient real-time use on resource-constrained devices. Overall, this work presents a scalable and practical model design for cyberbullying detection in linguistically diverse and code-mixed social media environments.
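
The efficiency argument for LoRA can be illustrated with a short, self-contained sketch. It is not the authors' implementation: the hidden size of 768, the rank of 8, and the scaling factor alpha are illustrative assumptions, and the helper names (lora_param_counts, lora_forward) are hypothetical. The idea shown is that the pretrained weight matrix W stays frozen while only two small matrices A and B are trained, so the adapted layer computes W x + (alpha / r) B A x.

```python
import random

def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in                # fine-tuning W directly
    lora = rank * (d_in + d_out)       # A is rank x d_in, B is d_out x rank
    return full, lora

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha):
    """Frozen base projection plus the scaled low-rank update."""
    rank = len(A)
    base = matvec(W, x)                # frozen pretrained path
    delta = matvec(B, matvec(A, x))    # trainable low-rank path
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]

# Assumed sizes: a 768 x 768 projection (typical of BERT-base-sized
# encoders) adapted with rank 8. Neither value comes from the abstract.
full, lora = lora_param_counts(768, 768, 8)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.2%}")

# Tiny numeric check: with B initialised to zero, the adapted layer
# reproduces the frozen layer exactly, which is the usual LoRA starting point.
random.seed(0)
d, r = 4, 2
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
A = [[random.gauss(0, 1) for _ in range(d)] for _ in range(r)]
B = [[0.0] * r for _ in range(d)]
x = [1.0, -2.0, 0.5, 3.0]
y_base = matvec(W, x)
y_lora = lora_forward(W, A, B, x, alpha=16)
```

Under these assumed sizes, the low-rank update trains roughly 2% of the parameters that full fine-tuning of the same matrix would require, which is the kind of saving the abstract relies on for deployment on resource-constrained devices.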

Published

2025-12-26

How to Cite

Lightweight Transformer-Based Cyberbullying Detection for English, Malayalam, and Manglish Social Media Texts. (2025). International Research Journal on Advanced Engineering Hub (IRJAEH), 3(12), 4499-4505. https://doi.org/10.47392/IRJAEH.2025.0661
