Estimation of Concentration of Air Pollutants in Shazand Thermal Power Plant with Support Vector Machine Model Based on Selection of Effective Input Variables with Partial Mutual Information (PMI) Algorithm of Distribution of Air Pollutants



1 Department of Environment, Islamic Azad University, North Tehran Branch, Iran

2 Department of Environment, Ahvaz Branch, Islamic Azad University, Ahvaz, Iran

3 Department of Water Resources Engineering, Ahvaz Branch, Islamic Azad University, Ahvaz, Iran



Because pollutant gas concentrations in power plants are difficult to estimate, this study aimed to estimate the concentrations of air pollutants in a thermal power plant using a support vector machine (SVM) model.
The concentrations of environmental pollutants at different distances from the chimney of the Shazand thermal power plant, Iran, were estimated using SVM. The effective input variables for the SVM model were selected using the Partial Mutual Information (PMI) algorithm. The modeling used a weekly time step over the period from December 2018 to December 2019.
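As a rough illustration of mutual-information-based input screening, the sketch below ranks two candidate inputs by a simple histogram estimate of their mutual information with a target series. It is not the authors' implementation: the data are synthetic, the variable names (`stack_concentration`, `unrelated_series`, `downwind`), the bin count, and the sample size of 52 are assumptions made for illustration, and the sketch omits the conditioning on already-selected inputs and the stopping rule that a full PMI procedure would include.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys, bins=5):
    """Histogram (plug-in) estimate of I(X;Y) in nats."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against a constant series
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # sum over occupied cells of p(i,j) * log(p(i,j) / (p(i) * p(j)))
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# Synthetic, hypothetical data: 52 samples standing in for a weekly series.
random.seed(0)
n_samples = 52
stack_concentration = [random.uniform(50.0, 150.0) for _ in range(n_samples)]
unrelated_series = [random.uniform(0.0, 10.0) for _ in range(n_samples)]
# Assumed relationship for the demo only: the downwind value tracks the stack value.
downwind = [0.6 * s + random.gauss(0.0, 3.0) for s in stack_concentration]

candidates = {
    "stack_concentration": stack_concentration,
    "unrelated_series": unrelated_series,
}
scores = {name: mutual_information(x, downwind) for name, x in candidates.items()}
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for name, score in ranking:
    print(f"{name}: {score:.3f} nats")
```

With short series the histogram estimate is biased upward, so in practice a score is usually compared against a significance threshold rather than read as an absolute value.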
The PMI algorithm showed that the effective input variable for estimating the concentration of each pollutant, namely carbon monoxide (CO), carbon dioxide (CO2), sulfur dioxide (SO2), and nitrogen oxides (NOX), at different distances is the concentration of the same gas at the power plant chimney. Among the air pollutants, the maximum concentration was that of CO2 (2811.63 µg/m3), occurring at a distance of 5 km from the power plant chimney, and the lowest concentration was that of CO (5.5 µg/m3), occurring at a distance of 20 km from the power plant chimney.
The polynomial kernel function was the best-performing kernel function of the SVM model for estimating SO2 and NOX concentrations at different distances, and also for estimating CO2 and CO concentrations.
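For reference, a polynomial kernel and the resulting estimate can be written as below, assuming the standard support vector regression formulation. The symbols are assumptions introduced here, not values reported in the abstract: gamma is a scale, r an offset, d the polynomial degree, alpha_i and alpha_i^* the dual coefficients, and b the bias.

```latex
K(\mathbf{x}_i, \mathbf{x}_j) = \left( \gamma \, \mathbf{x}_i^{\top} \mathbf{x}_j + r \right)^{d}

f(\mathbf{x}) = \sum_{i=1}^{n} \left( \alpha_i - \alpha_i^{*} \right) K(\mathbf{x}_i, \mathbf{x}) + b
```

Here the training inputs x_i would be the selected input variables (for example, the concentration of the same gas at the chimney) and f(x) the estimated concentration at a given distance.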
The SVM model estimates the pollutant concentrations with good accuracy, and selecting its effective input variables with the PMI algorithm increases that accuracy.


  • Receive Date: 11 February 2021
  • Revise Date: 12 March 2021
  • Accept Date: 26 March 2021
  • First Publish Date: 28 March 2021