Platform for generating medical datasets for machine learning in public health

This paper is a preprint and has not been certified by peer review.

Authors

Anna Andreychenko, Viktoriia Korzhuk, Stanislav Kondratenko, Polina Cheraneva

Abstract

Interoperability between medical data sources and related population data sources remains difficult. This hinders the generation of high-quality datasets at the city, regional and national levels. Moreover, while large medical centers can collect datasets internally because they have their own IT departments, collecting raw medical data from multiple organizations is a more complicated process. Under these circumstances, the most appropriate option is to develop digital products based on a microservice architecture. This approach makes it possible to ensure the multimodality of the system, the flexibility of the interface, and a systems approach internally, in which interconnected elements behave as a whole and exhibit behavior different from what they show when working independently. These conditions, in turn, make it possible to maximize the number and representativeness of the resulting datasets. This paper presents a concept of a platform for the sustainable generation of high-quality, reliable multimodal medical datasets. The platform collects data from different external sources, harmonizes it using a dedicated service, anonymizes the harmonized data, and labels the processed data. The proposed system is intended as a promising way to improve the quality of medical data for machine learning.
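
The four stages named in the abstract (collection, harmonization, anonymization, labeling) can be pictured as a chain of independent steps. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the `Record` type, the field aliases, the salted-hash pseudonymization, and the blood-pressure labeling rule are all hypothetical, and each function stands in for what would be a separate microservice in the proposed platform.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical mappings, chosen for illustration only.
FIELD_ALIASES = {"sys_bp": "systolic_bp", "SBP": "systolic_bp"}
DIRECT_IDENTIFIERS = {"patient_id", "name"}


@dataclass
class Record:
    source: str
    patient_id: str
    payload: dict
    labels: list = field(default_factory=list)


def collect(sources):
    """Gather raw rows from every external source into one list of records."""
    return [
        Record(name, row["patient_id"], dict(row))
        for name, rows in sources.items()
        for row in rows
    ]


def harmonize(record):
    """Rename source-specific field names to a shared schema."""
    payload = {FIELD_ALIASES.get(k, k): v for k, v in record.payload.items()}
    return Record(record.source, record.patient_id, payload, record.labels)


def anonymize(record, salt):
    """Replace the patient ID with a salted hash and drop direct identifiers.

    Assumption: this is simple pseudonymization, a stand-in for whatever
    anonymization method the platform actually uses.
    """
    digest = hashlib.sha256((salt + record.patient_id).encode()).hexdigest()
    payload = {k: v for k, v in record.payload.items()
               if k not in DIRECT_IDENTIFIERS}
    return Record(record.source, digest[:16], payload, record.labels)


def label(record):
    """Attach a label using an illustrative rule (threshold is hypothetical)."""
    labels = list(record.labels)
    if record.payload.get("systolic_bp", 0) >= 140:
        labels.append("hypertension_suspected")
    return Record(record.source, record.patient_id, record.payload, labels)


def run_pipeline(sources, salt="demo-salt"):
    """Run collect -> harmonize -> anonymize -> label over all sources."""
    return [label(anonymize(harmonize(r), salt)) for r in collect(sources)]


if __name__ == "__main__":
    demo_sources = {
        "clinic_a": [{"patient_id": "p1", "name": "X", "sys_bp": 150}],
        "clinic_b": [{"patient_id": "p1", "SBP": 120}],
    }
    for rec in run_pipeline(demo_sources):
        print(rec)
```

Keeping each stage a pure function from record to record mirrors the microservice idea: any stage can be replaced or scaled without touching the others, and the same patient receives the same pseudonym across sources as long as the salt is shared.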
