A Comprehensive Dataset of Surface Water Quality Spanning 1940-2023 for Empirical and ML Adopted Research
Md Rajaul Karim, MM Mahbubul Syeed, Ashifur Rahman, Khondkar Ayaz Rabbani, Kaniz Fatema, Razib Hayat Khan, Md Shakhawat Hossain, Mohammad Faisal Uddin
Scientific Data (Nature) , Vol. 12 (1)

Abstract
Assessment and monitoring of surface water quality are essential for food security, public health, and ecosystem protection. Although water quality monitoring is well established, little effort has been made to offer a comprehensive and harmonized dataset for surface water at the global scale. This study presents a comprehensive surface water quality dataset that preserves the spatio-temporal variability, integrity, consistency, and depth of the data to facilitate empirical and data-driven evaluation, prediction, and forecasting. The dataset is assembled from a range of sources, including regional and global water quality databases, water management organizations, and individual research projects from five countries: the USA, Canada, Ireland, England, and China. The resulting dataset consists of 2.82 million measurements of eight water quality parameters spanning 1940 to 2023. This dataset can support meta-analysis of water quality models and can facilitate machine learning (ML) based, data- and model-driven investigation of the spatial and temporal drivers and patterns of surface water quality at a cross-regional to global scale.
Citation
Md Rajaul Karim, MM Mahbubul Syeed, Ashifur Rahman, Khondkar Ayaz Rabbani, Kaniz Fatema, Razib Hayat Khan, Md Shakhawat Hossain, Mohammad Faisal Uddin. "A Comprehensive Dataset of Surface Water Quality Spanning 1940-2023 for Empirical and ML Adopted Research." Scientific Data (Nature) 12.1 (2025).
BibTeX
@article{pub15_2025,
  title={A Comprehensive Dataset of Surface Water Quality Spanning 1940-2023 for Empirical and ML Adopted Research},
  author={Md Rajaul Karim and MM Mahbubul Syeed and Ashifur Rahman and Khondkar Ayaz Rabbani and Kaniz Fatema and Razib Hayat Khan and Md Shakhawat Hossain and Mohammad Faisal Uddin},
  journal={Scientific Data (Nature)},
  volume={12},
  number={1},
  year={2025},
  doi={10.1038/s41597-025-04715-4}
}
Publication Details
Year: 2025
Journal: Scientific Data (Nature)
Volume: 12
Issue: 1