Harnessing Ensemble Techniques for Sentiment Analysis and Toxic Comment Classification

Authors

  • Prof. Pramod Patil, Assistant Professor, Computer Engineering, GCOERC, Nashik, Maharashtra, India
  • Ankur Ahire, UG, Computer Engineering, GCOERC, Nashik, Maharashtra, India
  • Shubham Daware, UG, Computer Engineering, GCOERC, Nashik, Maharashtra, India
  • Sudarshan Gunjal, UG, Computer Engineering, GCOERC, Nashik, Maharashtra, India
  • Mohit Warke, UG, Computer Engineering, GCOERC, Nashik, Maharashtra, India

DOI:

https://doi.org/10.47392/IRJAEH.2025.0119

Keywords:

Machine learning, Natural language processing, Multi-label classification, FastText, Ensemble learning, Sentiment analysis, Toxic comment classification

Abstract

The rapid growth of user-generated content on digital platforms has raised concerns over toxic comments, which can disrupt online interactions. Sentiment analysis and toxic comment classification are central to moderating such content, but traditional models often struggle with class imbalance, contextual ambiguity, and linguistic complexity, which leads to inaccurate predictions. Machine learning and deep learning models have been widely applied, yet a single model frequently fails to generalize across diverse comment structures and sentiments. This research introduces FusionBoost, an ensemble learning approach that combines Logistic Regression (LR) and XGBoost to exploit their complementary strengths. Comments are preprocessed through tokenization and stopword removal and are then represented with FastText embeddings. Experimental results indicate that FusionBoost outperforms the individual classifiers, significantly reducing false negatives in toxicity detection and improving sentiment classification accuracy. The study underscores the effectiveness of ensemble learning in addressing contextual challenges and enhancing model interpretability. Future research may explore transformer-based architectures such as BERT to further refine classification performance. This work contributes to the development of more robust and interpretable natural language processing (NLP) models, supporting safer and more meaningful digital interactions.
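
The abstract does not state how FusionBoost combines the outputs of its two models. As a rough illustration only, the sketch below assumes weighted soft voting: per-label probabilities from an LR model and an XGBoost model are averaged and then thresholded. The label names, the probability values, the 0.5 weight, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical per-label probabilities for one comment, as they might come
# from a trained LR model and a trained XGBoost model (values are made up).
lr_out = {"toxic": 0.40, "insult": 0.70, "threat": 0.05}
xgb_out = {"toxic": 0.80, "insult": 0.50, "threat": 0.10}


def soft_vote(lr_probs: Dict[str, float],
              xgb_probs: Dict[str, float],
              lr_weight: float = 0.5) -> Dict[str, float]:
    """Weighted average of per-label probabilities from two models.

    This is an assumed fusion rule, not the one documented in the paper.
    """
    if not 0.0 <= lr_weight <= 1.0:
        raise ValueError("lr_weight must lie in [0, 1]")
    if lr_probs.keys() != xgb_probs.keys():
        raise ValueError("both models must score the same labels")
    return {
        label: lr_weight * lr_probs[label] + (1.0 - lr_weight) * xgb_probs[label]
        for label in lr_probs
    }


def predict_labels(probs: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Multi-label decision: keep every label whose probability meets the threshold."""
    return sorted(label for label, p in probs.items() if p >= threshold)


fused = soft_vote(lr_out, xgb_out)
flagged = predict_labels(fused)
print(flagged)  # ['insult', 'toxic']
```

In this made-up case LR alone would score "toxic" at 0.40 and miss it at a 0.5 threshold, while the averaged score clears the threshold. That is one way an ensemble can reduce false negatives, though the paper's actual mechanism may differ.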

Published

2025-03-28

How to Cite

Harnessing Ensemble Techniques for Sentiment Analysis and Toxic Comment Classification. (2025). International Research Journal on Advanced Engineering Hub (IRJAEH), 3(03), 841-849. https://doi.org/10.47392/IRJAEH.2025.0119
