Attack Pattern Recognition & Performance
Implementation strategies for ML-powered intrusion detection - attack pattern recognition, performance metrics, and production deployment considerations.
International Journal of Electrical and Computer Engineering (IJECE)
Vol. 14, No. 5, October 2024, pp. 5894~5905
ISSN: 2088-8708, DOI: 10.11591/ijece.v14i5.pp5894-5905
Journal homepage: http://ijece.iaescore.com

Fortifying network security: machine learning-powered intrusion detection systems and classifier performance analysis

Arar Al Tawil1, Lara Al-Shboul2, Laiali Almazaydeh3, Mohammad Alshinwan1,4
1Faculty of Information Technology, Applied Science Private University, Amman, Jordan
2King Abdullah II School of Information Technology, The University of Jordan, Amman, Jordan
3College of Information Technology, Al-Hussein Bin Talal University, Ma'an, Jordan
4MEU Research Unit, Middle East University, Amman, Jordan

Article Info
Article history: Received Feb 21, 2024; Revised Jun 18, 2024; Accepted Jul 1, 2024

ABSTRACT
Intrusion detection systems (IDS) protect networks from threats by actively monitoring network activity to identify and prevent malicious actions. This study investigates the application of machine learning methods to strengthen IDS, with a focus on the CICIDS 2017 dataset. The dataset was refined through preprocessing methods such as feature normalization, class imbalance management, feature reduction, and feature selection to ensure its quality and lay the foundation for robust models. Three classifiers were evaluated: support vector machine (SVM), extreme gradient boosting (XGBoost), and naive Bayes. SVM achieved accuracy, precision, recall, and F1-score values of 0.984389, 0.984479, 0.984375, and 0.984304, respectively. XGBoost scored 1.0 on all four metrics. Naive Bayes recorded accuracy, precision, recall, and F1-score values of 0.877392, 0.907171, 0.877007, and 0.876986, respectively.
The results of this study emphasize the critical importance of preprocessing in improving the effectiveness of machine learning-based IDS. They further demonstrate the potential of particular classifiers to detect and prevent network intrusions efficiently, thereby contributing to cybersecurity measures.

Keywords: Class imbalance handling; Classification; Feature selection; Intrusion detection systems; Preprocessing

This is an open access article under the CC BY-SA license.

Corresponding Author: Arar Al Tawil, Faculty of Information Technology, Applied Science Private University, Amman, Jordan. Email: ar_altawil@asu.edu.jo
detection methodologies [2], [3]. The fundamental classification is predicated on signature detection, which compares distinctive patterns in network traffic (e.g., byte sequences) with an established repository of recognized attack signatures. In contrast, the anomaly-based detection approach assesses the current state of a network against a predetermined baseline, which enables it to identify both known and novel threats. Dimensionality reduction is also a prevalent technique in machine learning, particularly when the feature space has many dimensions.

Machine learning is analogous to instructing computers to improve their task performance without being explicitly told each step. It is about developing programs that use data to improve their behavior. The learning process begins with observing data and learning from it to identify patterns and generate more precise predictions. The primary objective is for computers to acquire knowledge autonomously and adjust their behavior without continuous human supervision [4].

Preprocessing can significantly affect how well a supervised machine learning algorithm generalizes to unseen data. One of the most formidable challenges in inductive machine learning is detecting and eliminating noisy instances. These instances deviate substantially from the norm and frequently have many missing or irrelevant attribute values; such aberrant instances are often called outliers.
In addition, when working with a huge dataset is impractical, it is typical to select a representative sample from it while also addressing the problem of missing data [5]. Our study employed a comprehensive preprocessing strategy to improve data quality and maximize the efficiency of our machine learning models.

Data normalization rescales the numerical features of a dataset to a standard range, typically from 0 to 1. This prevents any one feature from exerting an excessive influence on machine learning models by ensuring that all features contribute on a comparable scale [6].

Feature selection by correlation identifies and retains the most pertinent features in a dataset. The primary objective is to decrease the dimensionality of the data while preserving the attributes most strongly associated with the target variable, which streamlines modeling [7].

Missing data handling addresses instances or attributes that contain null or incomplete values. Conventional approaches are imputation and exclusion: imputation uses statistical techniques to fill in missing values, while exclusion removes instances containing missing data from the analysis [5].

Class imbalance strategies alleviate the problem that arises when one class is significantly underrepresented relative to the others in a dataset. These methods aim to rebalance the class distribution so that machine learning models predict all classes accurately, including those with fewer instances, and do not favor the majority class. Methods include oversampling, undersampling, and applying suitable evaluation metrics [8].
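The two simplest ideas above, rescaling a feature to the range 0 to 1 and choosing between imputation and exclusion for missing values, can be illustrated with a minimal sketch. It uses only the Python standard library; the function names and the sample values are hypothetical and are not taken from the study.

```python
from statistics import mean


def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: map every value to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def impute_mean(values):
    """Imputation: replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def exclude_missing(rows):
    """Exclusion: drop every row that contains at least one None."""
    return [row for row in rows if all(v is not None for v in row)]


# Hypothetical values for a single numeric feature with one gap.
durations = [10, 20, None, 40]
filled = impute_mean(durations)   # the gap becomes the mean of 10, 20, 40
scaled = min_max_scale(filled)    # smallest value maps to 0.0, largest to 1.0
print(filled, scaled)
```

Imputation keeps every row at the cost of inventing a value, while exclusion keeps only fully observed rows; which one is preferable depends on how much data is missing.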
Implementing these preprocessing procedures was critical in empowering our machine learning models to generate precise and resilient forecasts, even when confronted with intricate and practical datasets. This paper tackles the critical issue of improving IDS to ensure that they can accurately identify and mitigate network intrusions, which is essential for maintaining a robust cybersecurity system. The proposed solution entails using machine learning techniques, with a particular emphasis on preprocessing methods such as feature normalization, class imbalance management, feature reduction, and feature selection, to enhance the quality of the data and construct robust models. The study assesses the efficacy of three classifiers: Naive Bayes, extreme gradient boosting (XGBoost), and support vector machine (SVM). The results suggest that SVM obtained high accuracy, precision, recall, and F1-score, whereas XGBoost exhibited extraordinary performance with flawless scores across all metrics. Although Naive Bayes was less effective than the other two, it still demonstrated significant precision and accuracy. This research expands upon previous research by utilizing rigorous preprocessing techniques and assessing the efficacy of various classifiers on the CICIDS 2017 dataset. The results emphasize the superior performance of XGBoost and the critical role of data preparation in enhancing the effectiveness of IDS. Following this, the remaining sections are structured as follows: an examination of the literature about intrusion detection systems and machine learning algorithms is presented in section 2. The methodology utilized in this study is delineated in section 3, encompassing the selection of datasets, preprocessing procedures, and experimental configuration. The evaluation and implementation of multiple machine learning classifiers for intrusion detection are described in section 4. 
The results and analysis of the experiments are detailed in section 5, emphasizing performance metrics, including accuracy, false positives, and detection rate (DR). In conclusion, the paper is summarized in section 6, which also analyzes the main findings' implications and proposes potential directions for future research.
(RMSE) and minimum false positives. The random tree classifier had the lowest mean accuracy rate and the smallest receiver operating characteristic (ROC) value. Meanwhile, the multi-layer perceptron (MLP) and naive Bayes classifiers showed similar average accuracy rates. The Bayes network algorithm performed exceptionally well at recognizing normal packets. The decision table algorithm did not achieve the highest accuracy, but it exhibited the lowest rate of false negatives and efficient model construction. Ultimately, rule-based classifiers such as the decision table provide a favorable balance: they achieve satisfactory accuracy and inspire greater confidence, mainly because they have the lowest false negative rates when used for intrusion detection.

Agarwal et al. [14] examined three machine learning classification algorithms: naïve Bayes (NB), SVM, and k-nearest neighbor (KNN). The main objective was to determine their efficacy in improving accuracy and reducing processing time on the UNSW-NB15 dataset, and thereby to identify the most appropriate algorithm for learning the complexities of suspicious network activity. A comparative study of feature sets guided the selection of the most suitable algorithm for training the IDS, which was then used to predict and analyze future intrusion behavior. During the testing phase, performance measures such as accuracy, recall, and F1-score were computed. Additionally, confusion matrices were created and compared to determine the best validation and support achieved.
The results show that SVM outperformed the other algorithms, achieving an accuracy of 0.977. This highlights the suitability of SVM for the study's model and its capacity to handle the dataset and improve intrusion detection.

Emanet et al. [15] focus on developing an IDS that prioritizes accuracy through strategic feature selection and ensemble learning. Using the CIC-CSE-IDS2018 dataset, the study proceeds in two phases. The first phase refines the dataset by carefully selecting features and uses ensemble learning methods to enhance IDS performance by combining the capabilities of several classifiers. Implementing ensemble learning afterward results in a resilient model that improves attack detection and substantially decreases detection time. The suggested ensemble model achieves an accuracy of 98.82% by using under-sampling and feature selection techniques, with a 73% decrease in intrusion detection time and a 3% improvement in accuracy. Spearman's correlation analysis, recursive feature elimination (RFE), and chi-square tests are used to determine the essential features that enhance the efficiency of the IDS. A comparison of classifiers, such as extra trees, decision trees, and logistic regression, demonstrates reasonable accuracy rates while considering actual implementation time. The significance of this research is its proposal of an ensemble learning model that surpasses individual classifiers, which affirms the model's potential impact on future intrusion detection systems and computer security across various domains.
Fitni and Ramli [16] address the growing concerns about data security in organizational information systems. They emphasize the need for more robust defensive mechanisms against sophisticated attacks that may bypass standard security technologies such as firewalls and antivirus software. The study aims to overcome the constraints of existing IDSs with an ensemble learning technique that combines logistic regression, decision trees, and gradient boosting. Using the CSE-CIC-IDS2018 dataset and Spearman's rank correlation coefficient, the research selects 23 essential features from a pool of 80, considerably sharpening the model's focus. The experimental results show a final accuracy of 98.8%, precision and recall of 98.8% and 97.1%, respectively, and an F1-score of 97.9%. These results highlight the effectiveness of ensemble learning in strengthening IDS capabilities and enhancing network security.

Al Tawil and Sabri [17] introduce a feature selection algorithm for IDS based on the moth flame optimization (MFO) algorithm. The objective is to reduce training time and improve the precision of the model by selecting pertinent features. The algorithm was evaluated on the CIC-2017 dataset, where it reduced the number of features from 78 to 4 and obtained a high detection rate (100%) and accuracy (99.9%) with a lower false alarm rate.

Table 1 summarizes the machine learning algorithms applied to intrusion detection systems in the relevant literature.
Each row in the table represents a distinct study and lists the algorithms used, the datasets, the performance metrics assessed, and the key findings. This comparison illuminates the efficacy of various methodologies in detecting and classifying intrusions and offers perspectives for improving cybersecurity protocols.
Table 1. Machine learning approaches in intrusion detection

Ref. | Algorithms used | Dataset | Performance metrics | Key findings
[9] | Bayesian network, NB, DT, random decision forest, random tree, decision table, ANN | KDD'99 cup | Precision, Recall, F1-score, Accuracy | RF classifier achieved the highest accuracy of 0.94.
[10] | ID3-BA (ID3 classifier + bees algorithm) | KDD Cup99 | FAR, DR, AR | ID3-BA model achieved a DR of 91.02%, AR of 92.002%, and FAR of 3.917%.
[2] | FGLCC-CFA (Filter: FGLCC, Wrapper: CFA) | KDD CUP99 | Accuracy, DR, False Positives, Fitness Function | FGLCC-CFA algorithm achieved a DR of 95.23%, AR of 95.03%, and false positives rate of 1.65%.
[11] | NN, RF, SVM | KDD Cup 99 | Accuracy | SVM algorithm achieved the highest accuracy score of 0.94.
[12] | SSO | KDDCUP 99 | Accuracy | SSO achieved an accuracy rate of 93.3% and reduced the number of features from 41 to 6.
[13] | Decision table, RF, random tree, MLP, NB, Bayes network
Table 2. Dataset

Label | Number of instances
Benign | 2,273,097
DoS Hulk | 231,073
PortScan | 158,930
DDoS | 128,027
DoS GoldenEye | 10,293
FTP-Patator | 7,938
SSH-Patator | 5,897
DoS slowloris | 5,796
DoS Slowhttptest | 5,499
Bot | 1,966
Web Attack Brute Force | 1,507
Web Attack XSS | 652
Infiltration | 36
Web Attack SQL Injection | 21
Heartbleed | 11
XSS: Cross-site scripting

3.2. Preprocessing methods
Several preprocessing techniques were applied to enhance the quality of the dataset and prepare it for analysis and modeling. These methods address class imbalance, handle missing values, and standardize the feature set, converting the dataset into a format better suited to modeling. They provide the groundwork for more precise and dependable outcomes in the subsequent phases of the study.

3.2.1. Class reduction
To simplify the dataset and improve computational efficiency, classes with more than 10,000 instances were retained while others were reduced or excluded. This keeps the dataset manageable and uses computational resources effectively. By focusing on labels with a substantial number of instances, the analysis prioritizes the most prevalent classes.

3.2.2. Feature selection using correlation
Feature selection based on correlation [19] was employed to identify and retain the most relevant features while eliminating redundant or highly correlated ones. A correlation threshold of 0.6 was applied, selecting the best 40 features from the dataset.
This step optimizes the feature set by focusing on the attributes that have the most significant impact on the analysis, while also ensuring that the selected features are not highly correlated with each other. Equation (1) gives the correlation coefficient, where x_i and y_i are paired observations of two variables and \bar{x} and \bar{y} are their means.

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} \tag{1}$$

3.2.3. Class imbalance
To address class imbalance [20], the RandomUnderSampler [21] technique was applied. This method reduces the number of instances in the overrepresented classes, balancing the class distribution and preventing biased model training. A more equitable representation of each class helps the models predict outcomes accurately for all classes, including those with fewer instances.

3.2.4. Missing data
Missing data may introduce noise and errors into the analysis. Missing values were imputed with SimpleImputer [22] using the 'mean' strategy, which substitutes each missing value with the average value of the corresponding feature and so preserves the integrity of the data.

3.2.5. Normalization
To ensure that all features are measured on the same scale, the StandardScaler technique [23] was used. This approach standardizes each feature to a mean of 0 and a standard deviation of 1.
Standardization facilitates the attainment of a consistent and comprehensible dataset, which is especially crucial for specific machine-learning algorithms.
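The preprocessing steps of section 3.2 can be sketched in plain Python to make the arithmetic concrete. This is an illustrative sketch and not the authors' code: the study names RandomUnderSampler and StandardScaler, whereas the functions below are simplified stand-ins written with the standard library only. In particular, `drop_correlated` encodes one assumed reading of the 0.6 correlation threshold (a feature is kept only if its absolute correlation with every already-kept feature is at most the threshold); the paper does not spell out the exact rule.

```python
import math
import random
from collections import defaultdict
from statistics import mean, pstdev

# Instance counts per label, as listed in Table 2.
LABEL_COUNTS = {
    "Benign": 2273097, "DoS Hulk": 231073, "PortScan": 158930,
    "DDoS": 128027, "DoS GoldenEye": 10293, "FTP-Patator": 7938,
    "SSH-Patator": 5897, "DoS slowloris": 5796, "DoS Slowhttptest": 5499,
    "Bot": 1966, "Web Attack Brute Force": 1507, "Web Attack XSS": 652,
    "Infiltration": 36, "Web Attack SQL Injection": 21, "Heartbleed": 11,
}


def retained_classes(counts, min_instances=10_000):
    """Class reduction: keep labels with more than min_instances rows."""
    return [label for label, n in counts.items() if n > min_instances]


def pearson(x, y):
    """Pearson correlation coefficient r between two equal-length columns."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def drop_correlated(features, threshold=0.6):
    """Assumed rule: keep a feature only if |r| <= threshold with all kept ones."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def standardize(column):
    """Z-score scaling: mean 0 and population standard deviation 1."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma if sigma else 0.0 for v in column]


print(retained_classes(LABEL_COUNTS))
```

With the counts from Table 2, the 10,000-instance rule retains five labels: Benign, DoS Hulk, PortScan, DDoS, and DoS GoldenEye.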
examples to the total number of occurrences in the dataset. A greater level of accuracy signifies a larger proportion of correct predictions. Equation (2) gives the formula for accuracy, where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives.

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{2}$$

5.3.2. Precision
Precision [24] is a quantitative measure of the model's capacity to predict positive outcomes correctly. It is calculated by dividing the number of true positive predictions by the total number of positive predictions generated by the model. Higher precision indicates that a positive prediction is more likely to be correct. Equation (3) gives the formula for precision.

$$\text{Precision} = \frac{TP}{TP + FP} \tag{3}$$

5.3.3. Recall
Recall [24], sometimes called sensitivity or true positive rate, quantifies the model's capacity to detect all positive cases. It is the proportion of true positive predictions relative to the dataset's overall number of positive cases. A higher recall signifies that the model captures more of the positive cases. Equation (4) gives the formula for recall.

$$\text{Recall} = \frac{TP}{TP + FN} \tag{4}$$

5.3.4. F1-score
The F1-score [25] is the harmonic mean of precision and recall, that is, the reciprocal of the arithmetic mean of their reciprocals. It balances precision and recall by incorporating both false positives and false negatives into a unified score. The F1-score is particularly valuable when there is a disparity between the number of positive and negative instances in the dataset. Equation (5) gives the formula for the F1-score.

$$F1\text{-score} = \frac{2 \times (\text{Recall} \times \text{Precision})}{\text{Recall} + \text{Precision}} \tag{5}$$

5.4. Results
The classifiers' performance was evaluated using several measures: model accuracy, model precision, model recall, and model F1-score.
These metrics provide a comprehensive assessment of how well each classifier performs in terms of both prediction accuracy and the balance between precision and recall. The findings, as shown in Table 3, highlight the comparative effectiveness of the SVM, XGBoost, and naive Bayes classifiers.

Table 3. Result

Model | Model accuracy | Model precision | Model recall | Model F1-score
SVM | 0.984389 | 0.984479 | 0.984375 | 0.984304
XGBoost | 1 | 1 | 1 | 1
Naive Bayes | 0.877392 | 0.907171 | 0.877007 | 0.876986

Figure 2 illustrates the model accuracy scores for SVM, XGBoost, and naive Bayes. It provides a comparative view of the accuracy achieved by each model in correctly predicting instances across the dataset. Figure 3 displays the precision scores for SVM, XGBoost, and naive Bayes. Precision quantifies the proportion of true positive predictions out of all the positive predictions the model made. Figure 4 displays the recall scores for SVM, XGBoost, and naive Bayes, illustrating the models' capacity to identify all positive cases. Recall, also called the true positive rate, is the proportion of correctly predicted positive cases relative to the total number of positive instances in the dataset. Figure 5 displays the F1-scores for SVM, XGBoost, and naive Bayes, a composite measure that considers both precision and recall. The F1-score is particularly valuable when there is a disparity between the number of positive and negative instances in the dataset.
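The four metrics reported in Table 3 follow directly from confusion-matrix counts, which a short sketch can make concrete. The code below is a minimal binary-classification illustration using only the Python standard library; the labels are made up for the example, and the study's multi-class scores were presumably averaged across classes in a way this sketch does not reproduce.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = recall + precision
    f1 = 2 * (recall * precision) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = attack, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
counts = confusion_counts(y_true, y_pred)
scores = classification_metrics(*counts)
print(counts, scores)
```

In this toy example there are three true positives, one false positive, one false negative, and three true negatives, so all four metrics equal 0.75.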
Figure 2. Model accuracy across different models
Figure 3. Model precision across different models
Figure 4. Model recall across different models
Figure 5. Model F1-score across different models

6. CONCLUSION
This investigation tackles the critical issue of improving IDS to accurately identify and mitigate network intrusions, a prerequisite for a robust cybersecurity system. To enhance the quality of the data, the study applies machine learning techniques and rigorous preprocessing strategies to the CICIDS 2017 dataset: class reduction, feature selection, class imbalance management, missing data treatment, and feature normalization. XGBoost exhibited the strongest performance, attaining scores of 1.0 in the F1-score, precision, recall, and accuracy metrics (Table 3). SVM also performed well, with scores of about 0.984 on every metric. Naive Bayes performed worse than SVM and XGBoost but still achieved an accuracy of about 0.877. These results emphasize the critical role of preprocessing techniques in improving the efficiency of IDS through machine learning, and they underscore the potential of specific classifiers, notably XGBoost, to accurately identify and mitigate network intrusions. This research extends the existing body of work by employing rigorous preprocessing techniques and evaluating several classifiers on the CICIDS 2017 dataset. While the results are promising, further research using a broader range of classifiers and more diverse datasets could provide a deeper understanding of the potential of machine learning to improve IDS capabilities and network security.

REFERENCES
[1] H.-J. Liao, C.-H. Richard Lin, Y.-C. Lin, and K.-Y.
Tung, "Intrusion detection system: a comprehensive review," Journal of Network and Computer Applications, vol. 36, no. 1, pp. 16-24, Jan. 2013, doi: 10.1016/j.jnca.2012.09.004.
[2] S. Mohammadi, H. Mirvaziri, M. Ghazizadeh-Ahsaee, and H. Karimipour, "Cyber intrusion detection by combined feature selection algorithm," Journal of Information Security and Applications, vol. 44, pp. 80-88, Feb. 2019, doi: 10.1016/j.jisa.2018.11.007.
[3] K. Rajasekaran and K. Nirmala, "Classification and importance of intrusion detection system," International Journal of Computer Science and Information Security, vol. 10, no. 8, pp. 44-47, 2020.
[4] N. Burkart and M. F. Huber, "A survey on the explainability of supervised machine learning," Journal of Artificial Intelligence Research, vol. 70, pp. 245-317, Jan. 2021, doi: 10.1613/jair.1.12228.
[5] D. Kotsiantis, S. B. Kanellopoulos and P. E. Pintelas, "Data preprocessing for supervised leaning," International journal of computer science, vol. 1, no. 2, pp. 111-117, 2006.
[6] P. J. M. Ali, R. H. Faraj, E. Koya, P. J. M. Ali, and R. H. Faraj, "Data normalization and standardization: a technical report," Mach Learn Tech Rep, vol. 1, no. 1, pp. 1-6, 2014.
[7] U. M. Khaire and R. Dhanalakshmi, "Stability of feature selection algorithm: a review," Journal of King Saud University - Computer and Information Sciences, vol. 34, no. 4, pp. 1060-1073, Apr. 2022, doi: 10.1016/j.jksuci.2019.06.012.
[8] H. Ali, M. N. Mohd Salleh, R. Saedudin, K. Hussain, and M. F. Mushtaq, "Imbalance class problems in data mining: a review," Indonesian Journal of Electrical Engineering and Computer Science, vol. 14, no. 3, pp. 1552-1563, Jun. 2019, doi: 10.11591/ijeecs.v14.i3.pp1552-1563.
[9] H. Alqahtani, I. H. Sarker, A. Kalim, S. M. Minhaz Hossain, S. Ikhlaq, and S. Hossain, "Cyber intrusion detection using machine learning classification techniques," in Communications in Computer and Information Science, vol. 1235, Springer Singapore, 2020, pp. 121-131.
[10] A. S. Eesa, Z. Orman, and A. M. A. Brifcani, "A new feature selection model based on ID3 and bees algorithm for intrusion detection system," Turkish Journal of Electrical Engineering & Computer Sciences, vol. 23, pp. 615-622, 2015, doi: 10.3906/elk-1302-53.
[11] G. S. Sajja, M. Mustafa, R. Ponnusamy, and S. Abdufattokhov, "Machine learning algorithms in intrusion detection and classification," Annals of the Romanian Society for Cell Biology, vol. 25, no. 6, pp. 12211-12219, 2021.
[12] Y. Y. Chung and N. Wahid, "A hybrid network intrusion detection system using simplified swarm optimization (SSO)," Applied Soft Computing, vol. 12, no. 9, pp. 3014-3022, Sep. 2012, doi: 10.1016/j.asoc.2012.04.020.
[13] M. Almseidin, M. Alzubi, S. Kovacs, and M. Alkasassbeh, "Evaluation of machine learning algorithms for intrusion detection system," Sep. 2017, doi: 10.1109/sisy.2017.8080566.
[14] A. Agarwal, P. Sharma, M. Alshehri, A. A. Mohamed, and O. Alfarraj, "Classification model for accuracy and intrusion detection using machine learning approach," PeerJ Computer Science, vol. 7, Apr. 2021, doi: 10.7717/peerj-cs.437.
[15] S. Emanet, G. Karatas Baydogmus, and O. Demir, "An ensemble learning based IDS using voting rule: VEL-IDS," PeerJ Computer Science, vol. 9, Sep. 2023, doi: 10.7717/peerj-cs.1553.
[16] Q. R. S. Fitni and K. Ramli, "Implementation of ensemble learning and feature selection for performance improvements in anomaly-based intrusion detection systems," in 2020 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT), Jul. 2020, pp. 118-124, doi: 10.1109/IAICT50021.2020.9172014.
[17] A. Al Tawil and K. E. Sabri, "A feature selection algorithm for intrusion detection system based on moth flame optimization," in 2021 International Conference on Information Technology (ICIT), Jul. 2021, pp.
377-381, doi: 10.1109/ICIT52682.2021.9491690.
[18] UNB, "Intrusion detection evaluation dataset (CIC-IDS2017)," University of New Brunswick, 2023, https://www.unb.ca/cic/datasets/ids-2017.html (accessed Feb. 21, 2024).
[19] R. Duangsoithong and T. Windeatt, "Correlation-Based and Causal Feature Selection Analysis for Ensemble Classifiers," in Artificial Neural Networks in Pattern Recognition (ANNPR); Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2010, pp. 25-36.
[20] M. Buda, A. Maki, and M. A. Mazurowski, "A systematic study of the class imbalance problem in convolutional neural networks," Neural Networks, vol. 106, pp. 249-259, Oct. 2018, doi: 10.1016/j.neunet.2018.07.011.
[21] G. Lemaître, F. Nogueira, and C. K. Aridas, "Imbalanced-learn: a python toolbox to tackle the curse of imbalanced datasets in machine learning," Journal of machine learning research, vol. 18, no. 17, pp. 1-5, 2017.
[22] M. Kuhn and K. Johnson, Feature engineering and selection: a practical approach for predictive models. Chapman and Hall/CRC, 2019.
[23] H. T. Nguyen, A. H. Cao, and P. H. D. Bui, "Electrocardiogram-based heart disease classification with machine learning techniques," 2023, pp. 689-701.
[24] L. Almazaydeh, R. Alsalameen, and K. Elleithy, "Herbal leaf recognition using mask-region convolutional neural network (MASK R-CNN)," Journal of Theoretical and Applied Information Technology, vol. 100, no. 11, pp. 3664-3671, 2022.
[25] L. Almazaydeh, M. Abuhelaleh, A. Al Tawil, and K. Elleithy, "Clinical text classification with word representation features and machine learning algorithms," International Journal of Online and Biomedical Engineering (iJOE), vol. 19, no. 04, pp. 65-76, Apr. 2023, doi: 10.3991/ijoe.v19i04.36099.

BIOGRAPHIES OF AUTHORS
Arar Al Tawil earned his BSc in computer science from Al-Hussein Bin Talal University, Jordan, in 2018, followed by an MSc from Jordan University in 2021.
He currently serves as a lecturer and developer specializing in virtual reality and game design. He holds a prominent position as a lecturer in the esteemed Faculty of Information Technology at Applied Science Private University, Amman. His professional pursuits are deeply rooted in virtual reality, augmented reality environments, and the intricate intersection of machine learning and data analysis. His dedication to these fields is reflected in his ongoing research endeavors, where he continually explores new dimensions of technology. Moreover, he remains at the forefront of innovation and is deeply interested in cutting-edge domains such as deep learning and natural language processing (NLP). This commitment to staying abreast of the latest advancements underscores his dedication to pushing the boundaries of technology and contributing significantly to its ever-evolving landscape. He can be contacted at email: ar_altawil@asu.edu.jo. Lara Al-Shboul a dedicated researcher and educator, holds an MSc from the University of Jordan in addition to her current pursuit of a Ph.D. in computer science at the same institution. Driven by a fervent passion for advancing knowledge within her field, Lara focuses her research endeavors on artificial intelligence, striving to construct more precise models. Alongside her doctoral studies, Lara excels as an instructor at the University, imparting her expertise with excellence, particularly in the realm of biology. She can be contacted at email: lar9220473@ju.edu.jo.
Laiali Almazaydeh received her doctorate degree in computer science and engineering from the University of Bridgeport, USA, in 2013, specializing in human-computer interaction. She is currently a full professor and the dean of the College of Computer Information Technology, The American University in the Emirates, UAE. She has published more than seventy research papers in international journals and conference proceedings. Her research interests include human-computer interaction, pattern recognition, and computer security. She received best paper awards at three conferences: ASEE 2012, ASEE 2013, and ICUMT 2016. Recently she was awarded two postdoctoral scholarships, from the European Union Commission and the Jordanian-American Fulbright Commission. She can be contacted at email: laiali.almazaydeh@ahu.edu.jo.
Mohammad Alshinwan received the Ph.D. degree from the School of Computer Engineering, Inje University, Gimhae, Republic of Korea, in 2017. He was an assistant professor with the Department of Computer and Information Sciences, Amman Arab University, Jordan. He is currently an associate professor with Applied Science Private University, Jordan. His research interests include computer networks, mobile networks, information security, artificial intelligence, and optimization methods. He can be contacted at email: m_shinwan@asu.edu.jo.
Source: Phoenix Technical Documentation Library
Category: Network Security
Original: Peer-reviewed research paper / Official guideline
License: CC BY 4.0 (unless otherwise noted)
Suggested Citation:
ML-Powered Intrusion Detection Systems. Phoenix Technical Documentation Library, Avondale.AI. Accessed May 2026. https://avondale.ai/technical/