Scientific and technical journal

«Automation and Informatization of the fuel and energy complex»

ISSN 0132-2222

Classification of corporate traffic using machine learning algorithms

UDC: 004.89
DOI: 10.33285/2782-604X-2023-7(600)-22-29

Authors:

UYMIN ANTON G.1

1 National University of Oil and Gas "Gubkin University", Moscow, Russia

Keywords: algorithm, classification, traffic, data, system, construction

Abstract:

The article examines the application of several machine learning algorithms to the classification of corporate traffic according to the 8CoS model. The aim of the work is to identify a classifier that recognizes traffic with a probability of at least 0.96 while placing minimal load on the hardware, provided that no more than 2000 data instances are processed. These criteria reflect the need for high classification accuracy (at least 96%) combined with economical use of hardware resources. The limit of 2000 data instances was chosen on the basis of estimated resource constraints and real-world processing time. The results of the C4.5 Decision Tree, Random Forest, Support Vector Machine (SVM), and K-Nearest Neighbors (KNN) algorithms are analyzed. Based on the results obtained, the C4.5 Decision Tree algorithm is identified as the most suitable for the task.
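The selection criterion stated in the abstract (accuracy of at least 0.96, minimal hardware load, no more than 2000 instances) can be illustrated with a minimal Python sketch. This is not the author's procedure. The `Measurement` record, the `pick_classifier` function, and the use of CPU seconds as the load metric are assumptions introduced here for illustration, and the numbers are placeholders, not results from the article.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MIN_ACCURACY = 0.96    # accuracy threshold stated in the abstract
MAX_INSTANCES = 2000   # data-volume limit stated in the abstract


@dataclass(frozen=True)
class Measurement:
    """One evaluation run of a candidate classifier (hypothetical record)."""
    name: str           # label of the candidate classifier
    accuracy: float     # fraction of correctly classified instances, 0..1
    cpu_seconds: float  # assumed load metric; the abstract does not name one
    instances: int      # number of data instances processed in the run


def pick_classifier(runs: Iterable[Measurement]) -> Optional[Measurement]:
    """Return the lowest-load run that satisfies both constraints, or None."""
    eligible = [
        r for r in runs
        if r.accuracy >= MIN_ACCURACY and r.instances <= MAX_INSTANCES
    ]
    # Among eligible runs, minimal hardware load wins.
    return min(eligible, key=lambda r: r.cpu_seconds, default=None)


# Placeholder values for illustration only; NOT results from the article.
runs = [
    Measurement("candidate_a", accuracy=0.97, cpu_seconds=0.8, instances=2000),
    Measurement("candidate_b", accuracy=0.99, cpu_seconds=3.5, instances=2000),
    Measurement("candidate_c", accuracy=0.93, cpu_seconds=0.2, instances=2000),
]

best = pick_classifier(runs)
print(best.name if best else "no candidate meets the constraints")
```

In this sketch the accuracy threshold acts as a hard filter and the load metric as the tie-breaker, so a more accurate but more expensive candidate loses to a cheaper one that still clears 0.96.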
