The challenge of structuring data in the healthcare system
Structured data follows a schema and is grouped for a specific purpose, which makes it easier to extract and use. Unstructured data, by contrast, can only be organized and categorized with great effort. Examples include doctors' notes, diagnostic reports and voice notes.
Only 25% of the data in the healthcare system is available in a digitally structured form. In addition, this data is often isolated because of special data protection measures. This prevents the kind of linking of individual data that is common in other industries and that makes companies such as Facebook or Google successful. Introducing an electronic health record alone cannot solve this problem: on the one hand, such records are often inconsistent themselves; on the other hand, they again represent a closed system, while health data can be recorded and used far more extensively in the 21st century.
Standards in data processing are crucial for interaction within the healthcare system, for example when putting research results into practice. Accordingly, in its 2021 position paper, the German Federal Association of Medical Technology called for the implementation of standards in data acquisition as an essential building block for high-performance digital healthcare in the future.
Such standards are needed for a large number of applications in the healthcare system, such as AI systems that depend on large sets of uniform data.
The good news: the standards already exist. SNOMED (Systematized Nomenclature of Medicine) is a globally standardized directory of medical terms such as symptoms, findings, diagnoses and procedures. In addition, LOINC (Logical Observation Identifiers Names and Codes) ensures systematic documentation of laboratory tests. The nomenclature is updated regularly and can be downloaded free of charge from the website. What is lacking, therefore, is not the standards but consistent data collection based on these terminologies.
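To make the difference concrete, here is a minimal sketch of the same laboratory facts once as free text and once as coded records. The `Observation` structure, the patient identifiers and the values are hypothetical and chosen only for illustration; the LOINC codes shown should be checked against the official LOINC release before any real use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One lab result coded at the source (hypothetical structure)."""
    patient_id: str
    loinc_code: str  # e.g. "718-7" = Hemoglobin [Mass/volume] in Blood
    value: float
    unit: str


# Unstructured: the facts are buried in free text and hard to query.
note = "Pt. Hb 13.2 today, glucose slightly elevated at 110."

# Structured: each fact is a separate record with a standard code.
observations = [
    Observation("p-001", "718-7", 13.2, "g/dL"),    # hemoglobin
    Observation("p-001", "2345-7", 110.0, "mg/dL"),  # glucose
    Observation("p-002", "718-7", 11.4, "g/dL"),    # hemoglobin
]


def values_for(code: str, records: list[Observation]) -> list[float]:
    """Return all values recorded under the given LOINC code."""
    return [r.value for r in records if r.loinc_code == code]


hemoglobin_values = values_for("718-7", observations)
```

With coded records, a question such as "all hemoglobin values across patients" becomes a one-line filter, whereas the free-text note would first have to be interpreted.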
NLP (natural language processing), i.e. the capture and machine processing of natural language, offers the advantage of easy data acquisition at the point of origin. To record spoken and written language, however, the complexity and ambiguity of entire textual contexts must be captured. This demanding processing is error-prone, so information and context can be lost. This can have fatal consequences, especially for critical data in the healthcare system, at a time when medical errors are already one of the leading causes of death.
A far better option is to code the data directly at the source. The closer the coding according to standards such as SNOMED and LOINC takes place to the point where the data is generated, the better the quality and timeliness of the data. In a medical scenario, this means that the coding must be done directly by the doctor. Additional effort for doctors typically finds little support in the healthcare system, and this concept has therefore often caused annoyance in the past. This is not specific to the healthcare industry. I have had similar experiences in customer relationship management (CRM) and visit documentation, although there were usually fewer external standards such as SNOMED. So how can doctors be motivated to document correctly?
As described, the added value of standardized digital documentation is huge and well known. However, since the potential of standardized documentation lies mainly in the analysis of the data, the direct effects of data acquisition are hardly tangible. To motivate employees, the benefits must be immediately visible in everyday work. Where this cannot be achieved, gamification can help to increase motivation. It has long been used on the patient side, for example in smartwatches that track movement or apps that motivate people to take medication regularly, and there is great potential on the care side as well. For example, a 2019 study showed that gamification can lead general practitioners to monitor their patients’ blood pressure more often. In my experience, changes to documentation in CRM are usually initiated by the sales manager or come about because the information system in use requires the data. In other words, a change to existing documentation habits usually only happens in response to an external requirement.
medicalvalues provides AI-based decision support for laboratory data, helping doctors to request and evaluate samples on the basis of current research results. At the same time, we would like to make uniform medical documentation the standard. The manual standardization process is extremely time-consuming, but it can be automated with the help of AI. For this purpose, medicalvalues provides a program that simplifies uniform data acquisition. It takes non-standardized data as input and “LOINCifies” it, i.e. after the process the data carries the corresponding LOINC code. With the mapper application, data from all terminology systems can be transformed into LOINC. The whole process is faster, more efficient and safer because human errors such as typing mistakes can be avoided. More information here: Standardization and Harmonization
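The general idea of mapping local test names to LOINC can be illustrated with a deliberately simple sketch. It is not the medicalvalues mapper and does not use AI: it assumes a tiny hand-written lookup table (`LOINC_BY_NAME`, hypothetical) and falls back to fuzzy string matching from the Python standard library. In practice a LOINC code also depends on specimen, unit and measurement property, so a name alone is not sufficient, and unmatched entries should go to a human reviewer.

```python
import difflib
import re
from typing import Optional

# Tiny illustrative table (assumption); a real mapper would load the
# official LOINC release and consider specimen, unit and property as well.
LOINC_BY_NAME = {
    "hemoglobin": "718-7",   # Hemoglobin [Mass/volume] in Blood
    "glucose": "2345-7",     # Glucose [Mass/volume] in Serum or Plasma
    "creatinine": "2160-0",  # Creatinine [Mass/volume] in Serum or Plasma
    "sodium": "2951-2",      # Sodium [Moles/volume] in Serum or Plasma
    "potassium": "2823-3",   # Potassium [Moles/volume] in Serum or Plasma
}


def normalize(local_name: str) -> str:
    """Lowercase the name and keep only letters, digits and single spaces."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", local_name.lower())
    return " ".join(cleaned.split())


def loincify(local_name: str, cutoff: float = 0.8) -> Optional[str]:
    """Return a LOINC code for a local test name, or None if unsure."""
    key = normalize(local_name)
    if key in LOINC_BY_NAME:
        return LOINC_BY_NAME[key]
    # Fall back to the closest known name, which tolerates small typos.
    close = difflib.get_close_matches(key, list(LOINC_BY_NAME), n=1, cutoff=cutoff)
    return LOINC_BY_NAME[close[0]] if close else None
```

Returning `None` instead of a best guess keeps the process semi-automated: clear cases are coded automatically, while uncertain ones are left for manual review.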
Try out the (semi-)automated LOINC mapping on the medicalvalues platform for free here: LOINC Mapper
Capurro, Daniel, Meliha Yetisgen, Erik Eaton, Robert Black, and Peter Tarczy-Hornoch. “Availability of Structured and Unstructured Clinical Data for Comparative Effectiveness Research and Quality Improvement: A Multi-Site Assessment.” EGEMs (Generating Evidence & Methods to Improve Patient Outcomes) 2, no. 1 (July 11, 2014): 11. https://doi.org/10.13063/2327-9214.1079.
https://pubmed.ncbi.nlm.nih.gov/
Gentry, Sarah Victoria, Andrea Gauthier, Beatrice L’Estrade Ehrstrom, David Wortley, Anneliese Lilienthal, Lorainne Tudor Car, Shoko Dauwels-Okutsu, et al. “Serious Gaming and Gamification Education in Health Professions: Systematic Review.” Journal of Medical Internet Research 21, no. 3 (March 28, 2019): e12994. https://doi.org/10.2196/12994.