Ensemble Methods for Model Transparency
Ensemble of explainable AI methods for network security - model transparency, decision justification, and interpretable security analytics.
Journal of Computer and Knowledge Engineering, Vol. 7, No. 1, 2024. (1-8)
Ferdowsi University of Mashhad; Information and Communication Technology Association of Iran
https://cke.um.ac.ir

ENIXMA: ENsemble of EXplainable Methods for Detecting Network Attacks
Research Article
Seyed Mojtaba Abtahi, Hossein Rahmani, Milad Allahgholi, Sajjad Alizadeh
DOI: 10.22067/cke.2024.82986.1084

Abstract: The Internet has become an integral societal component, with its accessibility being imperative. However, malicious actors strive to disrupt internet services and exploit service providers. Countering such challenges necessitates robust methods for identifying network attacks. Yet, prevailing approaches often grapple with compromised precision and limited interpretability. In this paper, we introduce a pioneering solution named ENIXMA, which harnesses a fusion of machine learning classifiers to enhance attack identification. We validate ENIXMA using the CICDDoS2019 dataset. Our approach achieves a remarkable 90% increase in attack detection precision on the balanced CICDDoS2019 dataset, signifying a substantial advancement compared to antecedent methodologies that registered a mere 3% precision gain. We employ diverse preprocessing and normalization techniques, including z-score, to refine the data. To surmount interpretability challenges, ENIXMA employs SHAP, LIME, and decision tree methods to pinpoint pivotal features in attack detection. Additionally, we scrutinize pivotal scenarios within the decision tree. Notably, ENIXMA not only attains elevated precision and interpretability but also showcases expedited performance in contrast to prior techniques.

Keywords: Network anomaly detection, Machine learning, Intrusion detection system, Ensemble learning, Interpretability.
Hossein Rahmani et. al.: ENIXMA: ENsemble of EXplainable …
Figure 1. A general overview of the categorization of tasks performed in the field of anomaly detection in network data
labeled as attack and non-attack cases, which is very time-consuming and prone to unintended errors. In the semi-supervised approach, the training dataset does not need to be fully labeled. Although this reduces the complexity of labeling, it increases the ambiguity of the model providing network or system traffic. Unsupervised approaches do not require labels; these systems cluster similar patterns and behaviors.

Chuanlang and colleagues [9] use recurrent neural networks to detect anomalies, with both forward and backward propagation in their methodology. Their experiments are performed on the NSL-KDD dataset [30], and the binary classification task separates normal traffic from attacks. They expand the 41 original features to 122, so the RNN-IDS model has 122 input nodes and 2 output nodes in the binary classification experiments. The model is trained for 100 epochs. They ran the experiments with 20, 60, 80, 120, and 240 hidden nodes and with learning rates of 0.01, 0.1, and 0.5; the highest accuracy was obtained with 80 hidden nodes and a learning rate of 0.1.

Abhijit Das and colleagues [10] used a combined approach based on three models: Balanced Bagging, XGBoost, and RF-HDDT. The parameters of Balanced Bagging and XGBoost are tuned for imbalanced data, and the Hellinger criterion complements the Random Forest to overcome the limitations of the default distance criterion. They propose two new algorithms to address class overlap in the dataset and apply them during training. These two algorithms help improve performance on the test dataset by influencing the final decision made by the three base classifiers as part of the ensemble classification, which uses a majority vote combiner.
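As a minimal sketch of a majority vote combiner over three base classifiers, the following Python snippet combines per-model predictions sample by sample. The prediction lists and the tie-breaking rule are hypothetical and chosen only for illustration; the class-overlap algorithms proposed in [10] are not modeled here.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most base classifiers for one sample.

    Ties are broken in favor of the label that appears first in `predictions`
    (an arbitrary convention assumed for this sketch).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def ensemble_predict(per_model_predictions):
    """Combine one prediction list per model into one label per sample."""
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


# Hypothetical outputs of three base classifiers on four network flows.
balanced_bagging = ["attack", "normal", "attack", "normal"]
xgboost = ["attack", "attack", "normal", "normal"]
rf_hddt = ["normal", "attack", "attack", "normal"]

combined = ensemble_predict([balanced_bagging, xgboost, rf_hddt])
print(combined)
```

With three voters and two classes a tie cannot occur, so the tie-breaking rule only matters if the number of models or classes changes.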
Their proposed scheme performs noticeably better than previously reported schemes in both binary and multi-class classification, which suggests that the combined approach handles both kinds of anomaly detection problem effectively and offers an advantage over traditional methods.

Hua-Wu and colleagues [14] first examine the architecture of DDoS and ascertain the details of its stages. They then study the procedures of DDoS attacks and select variables based on these characteristics. Finally, they use the k-nearest neighbor method to classify the network status at each stage of a DDoS attack. As shown in Figure 2, the k-nearest neighbor algorithm is first trained on 9 selected features; data is then collected online and preprocessed, and in the final stage it is classified into three classes: normal traffic, attack traffic, and pre-attack traffic. In their experiment, the algorithm reaches an accuracy of 91% on the 2000 DARPA dataset. This result suggests that the k-nearest neighbors algorithm, with selected features that are representative of DDoS attacks, is effective in identifying such attacks and could be a robust method for real-world network security applications.

Figure 2. General overview of the classification process using the k-nearest neighbors algorithm [14]

Lazarovich and colleagues [15] focused on comparing unsupervised anomaly detection techniques. They pursued various schemes for detecting outlier data for self-anomaly detection in their work. Most anomaly detection algorithms need a completely normal set of data to train the model and implicitly assume that anomalies can be identified as patterns that have not been previously observed.
Because an outlier can distort the measurements and the modeling, different schemes for extracting such data need to be considered to understand which one works effectively [16]. Their research emphasizes the importance of proper data handling, especially of outliers, when training anomaly detection models. By comparing various schemes for handling outlier data, they sought to determine the most effective methods for improving the model's ability to identify anomalies accurately. The results of such studies can contribute significantly to unsupervised learning algorithms for anomaly detection, particularly in the context of network security and DDoS attack identification.

Bolodurina and colleagues [32] investigate how to improve the accuracy of network attack classification on the unbalanced CICDDoS2019 data using class sampling algorithms such as ROS, SMOTE, and ADASYN. The results of their computational experiments show the effectiveness of data balancing algorithms in identifying network attacks. Additionally, the ADASYN adaptive synthetic sampling method improves the accuracy of attack-type classification by up to 84% compared to other algorithms.

2.2. Interpretability
Among the significant methods in interpretability are LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations), which have been extensively used in various works [17]–[20]. These
methods provide non-intuitive, local, and model-independent interpretability.

In one study, Rizi and colleagues [18] used the LIME method to examine the prediction performance of an LSTM (Long Short-Term Memory) model. They focused on extracting the significant features in samples with incorrect predictions and, by altering the effect of negative parameters, ultimately improved the accuracy of their model. In a similar study, Singgata and colleagues [19] investigated the important features in the output of an XGBoost model. They implemented their findings on a dataset of user logs. However, a fundamental limitation shared by all of these articles is that the interpretability itself is not evaluated.

Marcilio and colleagues [22] have used interpretability for feature selection. They used the SHAP method and selected the important features based on the score that this method assigns to each feature. After feature selection, they carried out the classification task. In this study, the results of SHAP are compared with other existing feature selection methods such as ANOVA (Analysis of Variance). The results indicate that SHAP performs better in feature selection than the other methods.
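One common way to turn per-sample attribution scores into a feature ranking is to average their absolute values over the samples and keep the top-scoring features. The sketch below illustrates that idea under stated assumptions: the feature names and the attribution matrix are hypothetical, and the exact scoring rule used in [22] may differ.

```python
def rank_features_by_attribution(attributions, feature_names):
    """Rank features by mean absolute attribution across samples.

    `attributions[i][j]` is the attribution (for example, a SHAP value)
    of feature j for sample i.
    """
    n_samples = len(attributions)
    scores = []
    for j, name in enumerate(feature_names):
        mean_abs = sum(abs(row[j]) for row in attributions) / n_samples
        scores.append((name, mean_abs))
    # Highest mean absolute attribution first.
    return sorted(scores, key=lambda item: item[1], reverse=True)


def select_top_k(attributions, feature_names, k):
    """Return the names of the k highest-ranked features."""
    ranked = rank_features_by_attribution(attributions, feature_names)
    return [name for name, _ in ranked[:k]]


# Hypothetical feature names and attribution values for three samples.
feature_names = ["packet_length", "flow_duration", "packets_per_second"]
attributions = [
    [0.40, -0.05, 0.10],
    [-0.30, 0.02, 0.25],
    [0.50, 0.01, -0.15],
]

selected = select_top_k(attributions, feature_names, k=2)
print(selected)
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, so a feature that pushes predictions strongly in both directions still ranks as important.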
Figure 4. The distribution of labels after balancing the dataset.

3.2. Preprocessing
First, we remove missing values from the dataset, and then we drop data that is missing or takes a single value across the whole dataset. After that, because 'inf' values impede the correct training of the model, we replace them with the maximum value of that feature. In the next step, we use the z-score to identify and remove outliers from the dataset. Finally, to better train the model, we normalize the data.

3.3. Balancing the Dataset
Because the data is imbalanced, with a high ratio of attack-labeled data to non-attack-labeled data, we sampled the existing data, which gave the label distribution shown in Figure 4. We balanced the dataset by undersampling, using the cluster centroids method to select the samples to be removed [26]. Because there are few samples with the UDPLag label, this label remains less frequent than the others even after balancing.

3.4. Interpretability Module
In this section, we use three algorithms, SHAP, LIME, and decision tree, to make attack detection more interpretable and to identify the important features for attack detection.

SHAP Algorithm. The SHAP algorithm is an interpretation method used to analyze the influence of features on the predictions of a model. It is built upon cooperative game theory and the Shapley value [17]. For each instance of the data, the impact of each feature on the model's output prediction is calculated. To compute this influence, the model's output is first evaluated for each combination (coalition) of features. Then, using these values, the impact of each feature on the model's prediction, that is, its Shapley value, is calculated [22].
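As a rough illustration of the Shapley computation described above, the following sketch computes exact Shapley values for a toy three-feature model by enumerating every coalition. The toy model, the input x, the all-zero baseline, and the convention of replacing absent features with baseline values are assumptions made for this example, not details from the paper. Practical SHAP implementations rely on approximations, because exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function over `n_features` features.

    `value_fn(coalition)` takes a frozenset of feature indices and returns
    the model output when only those features are present.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight of a coalition of this size: |S|! (n-|S|-1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical instance and baseline (assumed values, for illustration only).
x = [3.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]


def model(v):
    """Toy model: one additive term plus one interaction term."""
    return 2.0 * v[0] + v[1] * v[2]


def value_fn(coalition):
    """Model output with features outside the coalition set to the baseline."""
    v = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
    return model(v)


phi = shapley_values(value_fn, len(x))
print(phi)
```

The values sum to the difference between the model output at x and at the baseline, which is the efficiency property that makes Shapley values an additive explanation of a single prediction.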
In our method, we first train a random forest on the balanced dataset, and then we extract the important features using the SHAP algorithm.

LIME Algorithm. The LIME algorithm is a model interpretability algorithm used to understand the behavior of machine learning models. It is particularly intended for models that operate in a non-linear manner. LIME is an acronym for Local Interpretable Model-agnostic Explanations; it explains each individual input by approximating the model's decision around that input with a local linear function, which indicates the features that are important for the decision [20]. In our method, we first train a random forest on the balanced dataset, then we separate the false positive data and give it to the LIME algorithm, and finally we isolate the important and influential features.

Decision Tree. Extracting rules from a decision tree can be very beneficial and vital in many machine learning models. Here are some explanations about the importance of extracting rules from a decision tree:
more data can be processed easily and more trust can be placed in the trained model.
In scenario 3, we observe that the high-volume features are the reason for detecting UDP attacks. From a qualitative perspective, we were able to identify scenarios and important features for detecting types of attacks. This helps network experts to better identify and detect various attacks. Furthermore, consistent with the article by Wei Gao et al. [33], packet volume is a crucial feature for detecting UDP attacks, and the length of transmitted packets is a crucial feature for detecting SYN attacks.
10.12962/j23378557.v7i1.a8989.
[21] I. Sharafaldin, A. H. Lashkari, S. Hakak, and A. A. Ghorbani, “Developing realistic distributed denial of service (DDoS) attack dataset and taxonomy,” in Proc. 53rd International Carnahan Conference on Security Technology, Chennai, India, 2019.
[22] W. E. Marcilio and D. M. Eler, “From explanations to feature selection: Assessing SHAP values as feature selection mechanism,” in Proceedings - 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images, SIBGRAPI 2020, 2020, pp. 340–347. doi: 10.1109/SIBGRAPI51738.2020.00053.
[23] J. Mirkovic, G. Prier, and P. Reiher, “Attacking DDoS at the source,” in 10th IEEE International Conference on Network Protocols, 2002. Proceedings, IEEE, 2002.
[24] J. Mirkovic, G. Prier, and P. Reiher, “Source-end DDoS defense,” in Second IEEE International Symposium on Network Computing and Applications, 2003. NCA 2003., pp. 171–178. doi: 10.1109/NCA.2003.1201153.
[25] S. I. Ao and International Association of Engineers, International MultiConference of Engineers and Computer Scientists: IMECS 2009: 18-20 March, 2009, Regal Kowloon Hotel, Kowloon, Hong Kong. Newswood Ltd., 2009.
[26] X. Liang and T. Znati, “On the performance of intelligent techniques for intensive and stealthy DDos detection,” Computer Networks, vol. 164, Dec. 2019, doi: 10.1016/j.comnet.2019.106906.
[27] X. Wu et al., “Top 10 algorithms in data mining,” Knowledge and Information Systems, vol. 14, no. 1, Jan. 2008, doi: 10.1007/s10115-007-0114-2.
[28] D. Hu, P. Hong, and Y. Chen, “FADM: DDoS Flooding Attack Detection and Mitigation System in Software-Defined Networking,” Dec. 2017. doi: 10.1109/GLOCOM.2017.8254023.
[29] Z. Xie, W. Dong, J. Liu, H. Liu, and D. Li, “Tahoe,” in Proceedings of the Sixteenth European Conference on Computer Systems, Apr. 2021, pp. 426–440. doi: 10.1145/3447786.3456251.
[30] B. Charbuty and A. Abdulazeez, “Classification Based on Decision Tree Algorithm for Machine Learning,” Journal of Applied Science and Technology Trends, vol. 2, no. 01, pp. 20–28, Mar. 2021, doi: 10.38094/jastt20165.
[31] S. K. Murthy, “Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey,” Data Mining and Knowledge Discovery, vol. 2, no. 4, 1998, doi: 10.1023/A:1009744630224.
[32] H. Kousar, M. M. Mulla, P. Shettar, and D. G. Narayan, “Detection of DDoS Attacks in Software Defined Network using Decision Tree,” in 2021 10th IEEE International Conference on Communication Systems and Network Technologies (CSNT), Jun. 2021, pp. 783–788. doi: 10.1109/CSNT51715.2021.9509634.
[33] W. Gao and T. H. Morris, “On cyber attacks and signature based intrusion detection for modbus based industrial control systems,” Journal of Digital Forensics, Security and Law, vol. 9, no. 1, p. 3, 2014.
Source: Phoenix Technical Documentation Library
Category: Network Security
Original: Peer-reviewed research paper / Official guideline
License: CC BY 4.0 (unless otherwise noted)
Suggested Citation:
ENIXMA: Explainable AI for Network Security. Phoenix Technical Documentation Library, Avondale.AI. Accessed May 2026. https://avondale.ai/technical/