Wav2Vec-based Audio Data Augmentation for Low-Resource Speech Recognition
DOI:
https://doi.org/10.47392/IRJAEH.2026.0014
Keywords:
Audio Data Augmentation, Low-Resource Speech Recognition, Wav2Vec, Automatic Speech Recognition, Audio Processing
Abstract
Audio Data Augmentation (ADA) transforms small datasets into much larger ones. ADA can be applied to several types of data, namely images (Mel spectrograms), audio, and text, depending on the application, such as gender identification, speech recognition, or text summarization. ADA plays a vital role in developing Automatic Speech Recognition (ASR) systems when the datasets available for the target language are small, that is, when the language is low-resource. This article focuses on the following ADA techniques: adding noise, shifting pitch, increasing or decreasing speed, and adding reverberation to the audio signals. The proposed method comprises preprocessing, data augmentation, audio transcription using pre-trained, self-supervised Wav2Vec models, and finally post-processing to remove induced tags from the transcribed text. Transcription is performed after augmentation so that the speech quality of the augmented audio can be evaluated using Word Error Rate (WER). The proposed Audio Data Augmentation for Low-Resource Speech Recognition (ADA-LRSR) framework, integrated with Wav2Vec (Vakyansh), achieved an overall WER of 0.5231, which was more promising than that of the other Wav2Vec variants (Base and Large). The approach was evaluated on 39 manually recorded, preprocessed audio files, which yielded 312 audio files after augmentation. In addition, the ADA-LRSR framework identified the addition of noise and reverberation as the augmentation techniques that best preserve speech quality.
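To make the evaluation metric and two of the augmentation techniques named in the abstract concrete, the following is a minimal, standard-library-only sketch. It is not the authors' implementation: the function names (`word_error_rate`, `add_noise`, `change_speed`) are hypothetical, the noise is assumed to be white Gaussian noise scaled to a target signal-to-noise ratio, and the speed change is assumed to be plain linear-interpolation resampling, which also shifts pitch as a side effect. WER is computed in the usual way, as the word-level edit distance between reference and hypothesis divided by the number of reference words.

```python
import math
import random


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def add_noise(samples, snr_db, seed=0):
    """Add white Gaussian noise scaled to a target SNR in dB (assumed noise model)."""
    rng = random.Random(seed)
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def change_speed(samples, rate):
    """Resample by linear interpolation; rate > 1 shortens (speeds up) the clip.

    This naive approach changes pitch along with speed.
    """
    n_out = int(len(samples) / rate)
    out = []
    for k in range(n_out):
        pos = k * rate
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append((1 - frac) * samples[i] + frac * nxt)
    return out
```

Under these assumptions, a transcription that matches its reference exactly gives `word_error_rate(ref, hyp) == 0.0`, and one substitution plus one deletion against a six-word reference gives 2/6. An overall WER such as the reported 0.5231 would correspond to roughly one word-level error for every two reference words.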
License
Copyright (c) 2026 International Research Journal on Advanced Engineering Hub (IRJAEH)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.