Determining the reliability of data obtained from process facility sensors
UDC: 004.04
DOI: -
Authors:
KRAVCHENKO D.A. 1
1 Gazprom dobycha Urengoy, Novy Urengoy, Russia
Keywords: reliable data, element with a special limit state, statistical and machine learning methods, Local Outlier Factor and Silhouette Score methods, Gradient Boosting Regression Tree method, F1-score method
Abstract:
The reliability of data obtained from process facility sensors can be determined by various methods. The Local Outlier Factor (LOF) method is a statistical method based on analyzing the local density of data elements: it distinguishes reliable from unreliable data by their relative density. A key procedure of the method is averaging the distances from a data element to each of its k nearest neighbors, where k specifies how many neighbors are taken into account. A correct choice of k is required to assess whether a data element belongs to the region of reliable or unreliable data. To automate the detection of reliable and unreliable data, a modification of the LOF method is proposed that adds a search for the optimal value of k. The optimal k is calculated using the Silhouette Score, Gradient Boosting Regression Tree, and F1-score methods. For this purpose, a model based on numerical methods is developed that calculates the optimal k values for LOF. Each stage of the study is analyzed, and corresponding recommendations for choosing k are given. The advantages and disadvantages of the modified LOF method are shown.
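Illustration (not part of the published abstract): the idea of scoring data elements with LOF and then searching for k can be sketched as follows. This is a minimal sketch under stated assumptions, not the author's model. It uses the standard LOF definition with exactly k neighbors per point, assumes all points are distinct, and selects k by F1-score alone against hypothetical labels. The Silhouette Score and Gradient Boosting Regression Tree stages are omitted. The function names, the sample readings, and the threshold of 2.0 are all assumptions made for the example.

```python
import math

def knn(points, i, k):
    """Return (distance, index) pairs for the k nearest neighbours of points[i]."""
    dists = sorted((math.dist(points[i], q), j)
                   for j, q in enumerate(points) if j != i)
    return dists[:k]

def lof_scores(points, k):
    """LOF score of every point; values well above 1 suggest an outlier.

    Simplification: ties at the k-th distance are cut to exactly k neighbours,
    and points are assumed distinct (no zero distances).
    """
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean neighbour density divided by the point's own density.
    return [sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
            for i in range(n)]

def f1(true, pred):
    """F1-score for boolean labels, where True marks unreliable data."""
    tp = sum(t and p for t, p in zip(true, pred))
    fp = sum((not t) and p for t, p in zip(true, pred))
    fn = sum(t and (not p) for t, p in zip(true, pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def best_k(points, labels, candidates, threshold=2.0):
    """Return (k, f1) for the smallest candidate k with the highest F1-score."""
    scored = []
    for k in candidates:
        pred = [s > threshold for s in lof_scores(points, k)]
        scored.append((f1(labels, pred), k))
    top = max(f for f, _ in scored)
    return min(k for f, k in scored if f == top), top

# Hypothetical one-dimensional sensor readings; the last one is a deliberate outlier.
readings = [(10.0,), (10.1,), (10.2,), (9.9,), (10.3,), (9.8,), (10.05,), (15.0,)]
labels = [False] * 7 + [True]  # hypothetical ground truth: True = unreliable

k, score = best_k(readings, labels, range(2, 6))
print("chosen k:", k, "F1:", score)
```

In practice labeled data of this kind is rarely available for every sensor, which is one reason an automated search for k, such as the one the article proposes, is of interest.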