Background:
Data mining (DM) is an approach used to extract valuable information from environmental processes. This research describes a DM approach used to extract information from influent and effluent wastewater characteristic data of a waste stabilization pond (WSP) in Birjand, a city in Eastern Iran.
Methods:
Multiple regression (MR) and neural network (NN) models were examined using influent characteristics (pH, biochemical oxygen demand [BOD5], temperature, chemical oxygen demand [COD], total suspended solids [TSS], total dissolved solids [TDS], electrical conductivity [EC] and turbidity) as the regression input vectors. The models were fitted to these input attributes to predict effluent BOD5 (BODout) and COD (CODout). Model performance was estimated by 10-fold external cross-validation. An internal 5-fold cross-validation was also used on the training data set for the NN model. The models were compared using regression error characteristic (REC) plots and other statistical measures such as relative absolute error (RAE). Sensitivity analysis was also applied to extract useful knowledge from the NN model.
Results:
NN models (with RAE = 78.71% ± 1.16 for BODout and 83.67% ± 1.35 for CODout) and MR models (with RAE = 84.40% ± 1.07 for BODout and 88.07% ± 0.80 for CODout) performed differently, and the former were better (P < 0.05) at predicting both effluent BOD5 and COD. For the prediction of CODout, the NN model with hidden layer size (H) = 4 and decay factor = 0.75 ± 0.03 gave the best predictive results. For BODout, H and the decay factor were found to be 4 and 0.73 ± 0.03, respectively. TDS was found to be the most descriptive influent wastewater characteristic for predicting WSP performance. The REC plots confirmed the superior performance of the NN model for both BOD and COD effluent prediction.
Conclusion:
Modeling the performance of WSP systems using NN models along with sensitivity analysis can offer a better understanding of the most significant parameters for predicting system performance. The findings of this study could lay the foundation for prospective work on characterizing WSP operations and optimizing their performance using statistical approaches.
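Illustrative note on the error measures:
The abstract reports RAE and REC plots without stating their formulas. The sketch below assumes the commonly used definitions: RAE is the total absolute error divided by the total absolute error of a predictor that always returns the observed mean, expressed in percent, and an REC curve gives the fraction of samples whose absolute error falls within each tolerance. The function names and the sample values are hypothetical and are not taken from the study.

```python
from statistics import mean


def relative_absolute_error(y_true, y_pred):
    """RAE in percent, relative to a predictor that always returns the mean."""
    baseline = mean(y_true)
    model_error = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    baseline_error = sum(abs(t - baseline) for t in y_true)
    return 100.0 * model_error / baseline_error


def rec_curve(y_true, y_pred, tolerances):
    """Fraction of samples whose absolute error is within each tolerance."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return [sum(e <= tol for e in errors) / len(errors) for tol in tolerances]


# Hypothetical effluent BOD5 values (mg/L), for illustration only.
observed = [30.0, 40.0, 50.0, 60.0]
predicted = [35.0, 38.0, 55.0, 52.0]

rae = relative_absolute_error(observed, predicted)
accuracy = rec_curve(observed, predicted, tolerances=[0, 2, 5, 10])
print(rae, accuracy)
```

Under this definition, an RAE below 100% means the model outperforms the mean-only predictor, and a lower value is better. This is the sense in which the NN values reported above compare favorably with the MR values.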