AI-Driven Data Quality Assurance in Multi-Cloud Data Warehousing Environments
DOI:
https://doi.org/10.47392/IRJAEH.2025.0462
Keywords:
Multi-cloud data warehousing, Natural language processing (NLP), AI-assisted data quality framework
Abstract
Multi-cloud data warehousing has emerged as a critical enabler for organizations seeking enhanced agility, scalability, and resilience in today’s rapidly evolving, data-driven, cloud-native environments. However, spreading data across several cloud platforms makes inconsistencies, latency, duplication, and governance imbalances harder to detect and manage. This study aims to maintain data quality across clouds by developing an AI-driven data quality assurance (DQA) framework. The framework employs a machine learning model that identifies, categorizes, and corrects data quality issues in cloud-based systems. Specifically, it implements a supervised learning model that relies on datasets from industry-specific cloud repositories to monitor data anomalies and data integrity violations. In addition, natural language processing (NLP) is used to analyze metadata and data lineage, improving traceability. Executed on AWS Redshift and Google BigQuery, the framework proves effective in terms of scalability, precision, and operational performance. The results indicate a 30% increase in anomaly detection accuracy and a 45% reduction in overall processing time. Compared with prior models, the framework manages data quality more proactively by adapting to evolving data patterns. Overall, the proposed AI-powered DQA solution considerably enhances data trustworthiness in multi-cloud data warehousing environments.
License
Copyright (c) 2025 International Research Journal on Advanced Engineering Hub (IRJAEH)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.