Abstract
Accurate, real-time detection of the nitrogen oxide (NOx) concentration at the inlet of a denitrification reactor plays a key role in controlling NOx emissions from thermal power plants. However, the traditional continuous emission monitoring system (CEMS) often reports the NOx concentration with a time delay. In the present work, a data-driven method based on random forest (RF) is proposed to address this issue. First, a heuristic method based on the maximum information coefficient (MIC) is proposed for extracting variables that are beneficial for modeling. To tune the MIC threshold, an RF regression model is constructed and the threshold is adjusted iteratively. The variable importance index of RF is then used to evaluate the remaining variables, and redundant variables are deleted. Second, an improved RF regression algorithm is used to establish a NOx emission prediction model, and an updating strategy is proposed to ensure that the model can be maintained in a timely and effective manner when applied online. Finally, the proposed method is tested using a real-world industrial dataset. The results show that the proposed method achieves higher prediction accuracy (root-mean-squared error (RMSE) = 2.90 mg/m³, mean absolute percentage error (MAPE) = 1.41%, mean absolute error (MAE) = 2.01 mg/m³, and R² = 0.96 on the industrial dataset) and greater robustness than traditional models.
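
For readers who want to reproduce the four reported accuracy measures, the following is a minimal sketch of their standard textbook definitions. It is not taken from the paper: the function name `regression_metrics` is hypothetical, and the sample values in the usage note are purely illustrative and unrelated to the industrial dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in percent), and R^2 for paired sequences.

    Hypothetical helper using the standard definitions of each metric;
    the paper may compute them with a library instead.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Root-mean-squared error and mean absolute error (same unit as y).
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined when a true value is zero; this sketch assumes
    # strictly non-zero targets (reasonable for concentrations in mg/m^3).
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # Coefficient of determination: 1 - residual SS / total SS.
    # Assumes y_true is not constant, otherwise ss_tot is zero.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}
```

Calling `regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])` returns a dictionary with the four values. RMSE and MAE carry the unit of the target (mg/m³ here), while MAPE is a percentage and R² is dimensionless.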