International Core Journal of Engineering 2020-26 | Page 109
2019 International Conference on Artificial Intelligence and Advanced Manufacturing (AIAM)
An Air Quality Grade Forecasting Approach Based
on Ensemble Learning
Weike Liu, Hang Zhang, Qingbao Liu
Science and Technology on Information Systems Engineering Laboratory
National University of Defense Technology
Changsha, China
The ensemble learning algorithm Leveraging Bagging [7] is then
trained on the training dataset to obtain the forecasting model,
which makes predictions on the prediction dataset. The experiments
also compare the accuracy of the prediction models generated by
Leveraging Bagging with that of other learning algorithms. The rest
of this paper is organized as follows: Section 2 describes the
impact factors for urban air quality prediction, followed by an
introduction to data acquisition and data preprocessing. Section 3
introduces the ensemble learning methods and the process of the air
quality grade forecasting method. Section 4 presents the
experimental results, and Section 5 concludes the paper.
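As a rough illustration of the Leveraging Bagging step mentioned above, the following is a minimal sketch and not the implementation used in this paper (the keywords indicate that the experiments rely on MOA). It assumes the usual description of Leveraging Bagging by Bifet et al.: every incoming instance is shown to each ensemble member with a weight drawn from a Poisson distribution whose mean is larger than 1 (6 is the commonly cited default), and predictions are combined by majority vote. The ADWIN change detection that the full algorithm attaches to each member is omitted. The base learner `CentroidLearner`, the class `LeveragingBaggingSketch`, the two features (PM2.5 concentration and wind speed) and all data values are hypothetical and serve only as a toy example.

```python
import math
import random
from collections import Counter


def poisson(rng, lam):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class CentroidLearner:
    """Hypothetical base learner: weighted nearest-centroid classifier."""

    def __init__(self):
        self.sums = {}     # label -> weighted sum of each feature
        self.weights = {}  # label -> total weight seen for that label

    def learn(self, x, y, weight):
        if weight == 0:
            return  # a zero draw means this learner skips the instance
        sums = self.sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            sums[i] += weight * value
        self.weights[y] = self.weights.get(y, 0) + weight

    def predict(self, x):
        best_label, best_dist = None, math.inf
        for label, sums in self.sums.items():
            total = self.weights[label]
            dist = sum((v - s / total) ** 2 for v, s in zip(x, sums))
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label  # None if nothing has been learned yet


class LeveragingBaggingSketch:
    """Online bagging with Poisson(lam) weights, lam > 1 (ADWIN omitted)."""

    def __init__(self, n_learners=10, lam=6.0, seed=0):
        self.rng = random.Random(seed)
        self.lam = lam
        self.learners = [CentroidLearner() for _ in range(n_learners)]

    def learn_one(self, x, y):
        # Each member sees the instance with its own random weight.
        for learner in self.learners:
            learner.learn(x, y, poisson(self.rng, self.lam))

    def predict_one(self, x):
        votes = Counter(
            p for p in (m.predict(x) for m in self.learners) if p is not None
        )
        return votes.most_common(1)[0][0] if votes else None


# Toy stream of (PM2.5 in ug/m3, wind speed in m/s) -> grade; made-up values.
stream = [
    ([10.0, 5.0], "good"), ([150.0, 1.0], "polluted"),
    ([20.0, 4.0], "good"), ([180.0, 0.5], "polluted"),
    ([15.0, 6.0], "good"), ([170.0, 0.8], "polluted"),
]
model = LeveragingBaggingSketch(n_learners=10, lam=6.0, seed=42)
for features, grade in stream:
    model.learn_one(features, grade)
print(model.predict_one([12.0, 4.0]), model.predict_one([160.0, 0.5]))
```

Drawing the weights with a mean above 1 makes each member see most instances several times but with different multiplicities, which increases the diversity among the members compared with plain online bagging, where the mean is 1.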
Abstract—This paper proposes an air quality grade
forecasting method based on ensemble learning. First, training
datasets are built from air quality data and related
meteorological data crawled from an air quality data website.
Next, the ensemble learning algorithm Leveraging Bagging is
trained on the training dataset to generate an initial air quality
grade forecasting model, which is then used to make predictions on
the prediction dataset. The experiments test the learning
algorithm at both the city scale and the station scale.
Experimental results show that the proposed method predicts well
and retains good forecasting ability on the real forecast dataset.
II. AIR QUALITY IMPACT FACTORS AND DATA ACQUISITION
Keywords—ensemble learning, air quality grade forecasting,
MOA
A. Air quality impact factors
Most existing research on air quality impact factors is based on
the air quality data of a single city or province. These studies
indicate that the relationship between air quality and
meteorological factors usually differs between regions of the
same country. This relationship is shaped by regional economic
development, which influences fuel consumption, contaminant types,
contaminant emission volumes, pollution treatment technology,
urban green area and other factors. These factors affect the
adsorption and degradation of atmospheric pollutants within the
city. In addition, regional topography affects the diffusion
conditions of atmospheric pollutants through horizontal and
vertical atmospheric exchange. Therefore, a city's level of
economic development, the pollution emissions of its major
industrial facilities, and the topography and climate of the
region where it is located all affect the local air quality
impact factors.
I. INTRODUCTION
Traditional air pollution modeling includes Gaussian models of
varying complexity, Lagrangian models, chemical transport models
and so on [1]. Although these models benefit from advances in
atmospheric science and computer science, they depend heavily on
meteorological data updated in real time and on a detailed
inventory of emission sources, which significantly limits their
use. Therefore, statistical models based on machine learning
algorithms have been used to predict air quality grade. Bougoudis
et al. [2] aimed at finding the conditions of high pollution and
used a more generalized hybrid model based on unsupervised
clustering. The hybrid model integrates artificial neural networks
(ANN), random forests (RF) and fuzzy logic to predict the
multi-index pollutants in Athens. Zhao et al. [3] proposed a Deep
Recurrent Neural Network (DRNN) to predict daily air quality
grade. Huang and Guo [4] used a hybrid model of a neural network
and Long Short-Term Memory (LSTM) to predict the concentration of
PM2.5. Liu et al. [5] obtained a reliable air quality prediction
model using Support Vector Machines (SVM) with monitoring data
from three cities: Beijing, Tianjin and Shijiazhuang. Vong et al.
[6] also used SVM to predict air quality (NO2, SO2, O3, SPM) from
pollutant and meteorological data in Macao, China.
The Air Pollution Index (API) combines several air contaminant
concentrations into a single index, which measures air quality
and is suitable for describing short-term urban air quality
status and trends. Air pollutants include soot, total suspended
particulate matter, respirable Particulate Matter (PM10), Sulfur
Dioxide (SO2), Nitrogen Dioxide (NO2), Ozone (O3), Carbon
Monoxide (CO), volatile organic compounds, and so on. However,
with the rapid development of the economy and society, API can no
longer meet the requirements of current air pollution monitoring.
For example, fine Particulate Matter (PM2.5), a main pollutant
that now appears frequently, is not included in the API.
Therefore, since 2012, the Air Quality Index (AQI) has gradually
replaced API as the current air quality assessment indicator. Among
This paper proposes an air quality grade forecasting method based
on an ensemble learning algorithm. First, a web crawler is used to
collect public air quality data from a website. Then, the data are
preprocessed to generate training datasets and prediction datasets.
978-1-7281-4691-1/19/$31.00 ©2019 IEEE
DOI 10.1109/AIAM48774.2019.00024