AUTOMATED DATA COLLECTION OF KEY FINANCIAL INDICATORS OF IT ENTERPRISES IN THE REGION
https://doi.org/10.34822/1999-7604-2022-3-39-45
Abstract
The article presents new software for automated data collection. By parsing open websites, the software quickly obtains a set of the most significant financial and economic indicators of firms in a particular industry in the region and writes them to an Excel document, which provides the basis for a comprehensive financial and economic analysis of that industry. A list of the required firms, published at export-base.ru, is a prerequisite for data collection. Python was chosen as the main tool for writing the program, together with the key libraries requests, selenium, beautifulsoup4, and csv. The software was tested by quickly collecting complex data on IT companies in Rostov Oblast in order to carry out a financial and economic analysis of the region's IT industry.
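The parse-and-export step described in the abstract can be illustrated with a minimal sketch. It is not the authors' program: the HTML fragment, the indicator names, and the helper names (TableParser, extract_rows, rows_to_csv) are hypothetical, and the standard-library html.parser module stands in for beautifulsoup4 so that the example runs without third-party packages. Fetching pages with requests or selenium is left out.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page fragment; real reporting sites use their own markup.
SAMPLE_HTML = """
<table id="indicators">
  <tr><th>Indicator</th><th>Value</th></tr>
  <tr><td>Revenue</td><td>1200</td></tr>
  <tr><td>Net profit</td><td>150</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row currently being read, if any
        self._cell = None   # text chunks of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize rows to CSV text; a real run would write to a file instead."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_rows(SAMPLE_HTML)), end="")
```

A CSV file produced this way opens directly in Excel, which is one plausible route to the Excel document mentioned in the abstract; the abstract itself does not say how that document is generated.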
About the Authors
S. O. Kramarov
Russian Federation
Doctor of Sciences (Physics and Mathematics), Professor
E-mail: maoovo@yandex.ru
V. A. Ovsyannikov
Russian Federation
Master’s Degree Student
E-mail: viktordee41@gmail.com
L. V. Sakharova
Russian Federation
Doctor of Sciences (Physics and Mathematics), Associate Professor
E-mail: L_Sakharova@mail.ru
R. S. Usatyi
Russian Federation
Postgraduate
E-mail: rs.usaty@gmail.com
G. V. Lukyanova
Russian Federation
Candidate of Sciences (Engineering), Associate Professor
E-mail: lukyanova.g@yandex.ru
For citations:
Kramarov S.O., Ovsyannikov V.A., Sakharova L.V., Usatyi R.S., Lukyanova G.V. AUTOMATED DATA COLLECTION OF KEY FINANCIAL INDICATORS OF IT ENTERPRISES IN THE REGION. Proceedings in Cybernetics. 2022;(3 (47)):39-45. (In Russ.) https://doi.org/10.34822/1999-7604-2022-3-39-45