International Core Journal of Engineering 2020-26 | Page 109

2019 International Conference on Artificial Intelligence and Advanced Manufacturing (AIAM)

An Air Quality Grade Forecasting Approach Based on Ensemble Learning

Weike Liu, Hang Zhang, Qingbao Liu
Science and Technology on Information Systems Engineering Laboratory
National University of Defense Technology
Changsha, China

978-1-7281-4691-1/19/$31.00 ©2019 IEEE
DOI 10.1109/AIAM48774.2019.00024

Abstract—This paper proposes an air quality grade forecasting method based on ensemble learning. First, training datasets are built from air quality data and related meteorological data crawled from an air quality data website. Next, the ensemble learning algorithm Leveraging Bagging learns from the training dataset to generate an initial air quality grade forecasting model, which is then used to make predictions on the prediction dataset. The experiments test the learning algorithm at both the city scale and the station scale. Experimental results show that the proposed method predicts well and retains good forecasting ability on the real forecast dataset.

Keywords—ensemble learning, air quality grade forecasting, MOA

I. INTRODUCTION

Traditional air pollution modeling includes Gaussian models of varying complexity, Lagrangian models, chemical transport models and so on [1]. Although these models draw on advances in atmospheric science and computer science, they depend heavily on meteorological data updated in real time and on a detailed inventory of emission sources, which significantly limits their use. Therefore, statistical models based on machine learning algorithms have been used to predict air quality grade. Bougoudis et al. [2] aimed to identify the conditions of high pollution and used a more generalized hybrid model based on unsupervised clustering. The hybrid model integrates artificial neural networks (ANN), random forests (RF) and fuzzy logic to predict multi-index pollutants in Athens. Zhao et al. [3] proposed a Deep Recurrent Neural Network (DRNN) to predict daily air quality grade. Huang and Guo [4] used a hybrid model of a neural network and Long Short-Term Memory (LSTM) to predict the concentration of PM2.5. Liu et al. [5] obtained a reliable air quality prediction model with Support Vector Machines (SVM) using monitoring data from three cities: Beijing, Tianjin and Shijiazhuang. Vong et al. [6] also used SVM to predict air quality (NO2, SO2, O3, SPM) from pollutant and meteorological data in Macao, China.

This paper proposes an air quality grade forecasting method based on an ensemble learning algorithm. First, a web crawler collects public air quality data from a website. Then, the data is preprocessed to generate training datasets and prediction datasets. The ensemble learning algorithm Leveraging Bagging [7] learns from the training dataset to produce the forecasting model, which makes predictions on the prediction dataset. The experiments also compare the accuracy of the prediction models generated by Leveraging Bagging with those of other learning algorithms.

The rest of this paper is organized as follows. Section 2 describes the impact factors for urban air quality prediction, followed by an introduction to data acquisition and data preprocessing. Section 3 introduces the ensemble learning methods and the process of the air quality grade forecasting method. Section 4 shows the experimental results, and Section 5 concludes the paper.

II. AIR QUALITY IMPACT FACTORS AND DATA ACQUISITION

A. Air quality impact factors

Most existing research on air quality impact factors is based on the air quality data of a single city or province. The findings indicate that the relationship between air quality and meteorological factors usually differs between regions of the same country. This relationship is determined by regional economic development, which influences fuel consumption, contaminant types, contaminant emission amounts, pollution treatment technology, urban green area and other factors. These factors affect the adsorption and degradation of atmospheric pollutants within the city. In addition, regional topography affects the diffusion conditions of atmospheric pollutants through horizontal and vertical atmospheric exchange. Therefore, the level of economic development of the city, the pollution emissions of major industrial facilities, and the topography and climate of the region where the city is located all influence local air quality impact factors.

The Air Pollution Index (API) combines several air contaminant concentrations into a single index, which measures air quality and is suitable for describing short-term city air quality status and trends. Air pollutants include soot, total suspended particulate matter, respirable particulate matter (PM10), sulfur dioxide (SO2), nitrogen dioxide (NO2), ozone (O3), carbon monoxide (CO), volatile organic compounds, and so on. However, with rapid economic and social development, the API can no longer meet the requirements of current air pollution monitoring. For example, fine particulate matter (PM2.5), a main pollutant that now appears frequently, is not included in the API. Therefore, since 2012, the Air Quality Index (AQI) has gradually replaced the API as the current air quality assessment indicator. Among