OMOP: the basics
New to the OMOP CDM? We’d recommend you pair this book with The Book of OHDSI.
Real World Data
All our studies are done with what is called Real World Data (RWD). This refers to data collected from various sources outside of traditional clinical trials, and it encompasses information about the health status, treatment, and outcomes of patients in real-world settings.
For example, the source we use most often is Electronic Health Records (EHRs): data collected by healthcare providers during routine clinical care, including patient demographics, diagnoses, treatments, and outcomes.
Other sources of RWD include claims and billing data (from health insurance claims), registries (databases that collect information on patients with specific diseases or conditions), and pharmacy data (prescription medications), among others.
Uses of Real World Data
RWD can be used in many different ways:
- Regulatory Decision Making: Regulatory agencies, like the FDA, use RWD to support approval of new treatments, label expansions, and post-market surveillance.
- Clinical Decision Support: RWD helps healthcare providers make more informed decisions about patient care by providing insights into treatment effectiveness and safety in broader populations.
- Health Economics and Outcomes Research (HEOR): RWD is used to assess the cost-effectiveness and value of medical interventions.
- Epidemiology: RWD aids in understanding the prevalence, incidence, and burden of diseases in the population.
- Comparative Effectiveness Research: Researchers use RWD to compare the effectiveness of different treatments in real-world settings.
- Pharmacovigilance: RWD is crucial for monitoring the safety of medications post-approval to identify and mitigate adverse effects.
Challenges with Real World Data
When working with RWD we have to be aware of some challenges and limitations:
- Data Quality: Inconsistencies, missing information, and variations in data collection methods can affect the reliability of RWD.
- Data Integration: Combining data from diverse sources can be complex due to differences in formats, standards, and terminologies.
- Privacy and Security: Ensuring patient confidentiality and data security is paramount when handling RWD.
- Bias and Confounding: Real-world studies may be subject to biases and confounders that can impact the validity of findings.
Real World Evidence
We call the evidence generated from RWD Real World Evidence. This is our end goal: to generate reliable Real World Evidence in a transparent and fast way.
Why a Common Data Model?
As we have just seen, RWD can come from many different sources, and it is usually not collected for research purposes. This leads to diverse structures and coding systems, and it can make reproducing a study in different databases quite a nightmare. This is why common data models started gaining popularity.
Using a common data model (CDM) is crucial for standardising and harmonising data from disparate sources, ensuring consistency and interoperability. A CDM facilitates the integration and analysis of data from various healthcare systems by providing a unified structure and standardised terminologies. This standardisation enables researchers and healthcare professionals to perform meaningful comparisons, aggregate data efficiently, and derive robust, generalisable insights. Additionally, a CDM enhances data quality and reliability, reduces the potential for errors, and supports regulatory compliance and collaborative research efforts. Ultimately, adopting a common data model accelerates the translation of real-world data into actionable knowledge, improving patient outcomes and advancing medical research.
OMOP basics
The Observational Medical Outcomes Partnership (OMOP) Common Data Model is a common data model for organising healthcare data from various sources. It is one of the most popular and fastest-growing CDMs worldwide, with the healthcare data of more than 800 million patients transformed into this format.
The OMOP CDM is a person-centric relational data model. Patient data are spread across various tables related to different clinical domains: for example, the condition occurrence table contains diagnoses, while the drug exposure table contains drug prescriptions. These clinical tables are all linked back to the person table, which contains a unique identifier for each individual along with some key demographic data such as their date of birth. Meanwhile, records in the observation period table define the period of calendar time over which an individual is followed up.
In this figure you can see the different tables that exist in the OMOP CDM and how they are related:
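To make the person-centric layout more concrete, here is a minimal sketch in Python using an in-memory SQLite database. It is only an illustration under stated assumptions: the tables carry a small subset of the real OMOP columns, the patients and records are made up, the concept IDs are illustrative values, and the helper names `build_toy_cdm` and `records_in_observation` are hypothetical. It shows how clinical records link back to the person table through `person_id`, and how the observation period table can be used to keep only records that fall within follow-up.

```python
import sqlite3


def build_toy_cdm():
    """Create an in-memory database with a tiny, hand-made subset of OMOP tables."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            gender_concept_id INTEGER,
            year_of_birth INTEGER
        );
        CREATE TABLE observation_period (
            observation_period_id INTEGER PRIMARY KEY,
            person_id INTEGER REFERENCES person (person_id),
            observation_period_start_date TEXT,
            observation_period_end_date TEXT
        );
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER REFERENCES person (person_id),
            condition_concept_id INTEGER,
            condition_start_date TEXT
        );
        CREATE TABLE drug_exposure (
            drug_exposure_id INTEGER PRIMARY KEY,
            person_id INTEGER REFERENCES person (person_id),
            drug_concept_id INTEGER,
            drug_exposure_start_date TEXT
        );

        -- Made-up patients and records; concept IDs are illustrative only.
        INSERT INTO person VALUES (1, 8532, 1960), (2, 8507, 1985);
        INSERT INTO observation_period VALUES
            (1, 1, '2010-01-01', '2020-12-31'),
            (2, 2, '2018-01-01', '2019-12-31');
        INSERT INTO condition_occurrence VALUES
            (1, 1, 201826, '2015-03-10'),
            (2, 1, 201826, '2009-05-01');  -- before follow-up starts
        INSERT INTO drug_exposure VALUES
            (1, 1, 1503297, '2015-03-12'),
            (2, 2, 1503297, '2021-02-01');  -- after follow-up ends
        """
    )
    return con


def records_in_observation(con):
    """Return condition and drug records that fall inside an observation period."""
    query = """
        SELECT p.person_id AS person_id,
               'condition' AS domain,
               co.condition_concept_id AS concept_id,
               co.condition_start_date AS start_date
        FROM person AS p
        JOIN observation_period AS op ON op.person_id = p.person_id
        JOIN condition_occurrence AS co ON co.person_id = p.person_id
        WHERE co.condition_start_date BETWEEN op.observation_period_start_date
                                          AND op.observation_period_end_date
        UNION ALL
        SELECT p.person_id, 'drug', de.drug_concept_id, de.drug_exposure_start_date
        FROM person AS p
        JOIN observation_period AS op ON op.person_id = p.person_id
        JOIN drug_exposure AS de ON de.person_id = p.person_id
        WHERE de.drug_exposure_start_date BETWEEN op.observation_period_start_date
                                              AND op.observation_period_end_date
        ORDER BY person_id, start_date
    """
    return con.execute(query).fetchall()


if __name__ == "__main__":
    for row in records_in_observation(build_toy_cdm()):
        print(row)
```

Dates are stored as ISO 8601 strings so that plain string comparison orders them correctly in SQLite. A real OMOP instance would typically use proper date columns and a different database engine.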
OHDSI
The OHDSI (Observational Health Data Sciences and Informatics) community is a global, multi-stakeholder, interdisciplinary collaborative that aims to improve health by empowering the community to collaboratively generate evidence that promotes better health decisions and better care.
The primary mission is to create a research community that produces high-quality, reproducible, and reliable evidence about health and healthcare.
A key component of OHDSI’s infrastructure is the OMOP (Observational Medical Outcomes Partnership) Common Data Model.
OHDSI emphasises open science principles, making their tools, methods, and research findings freely available to the public through their GitHub: https://github.com/ohdsi.
Annual symposiums and other events foster collaboration and the sharing of ideas within the community.
You can find out more about the OHDSI community on their website: https://ohdsi.org.
EHDEN Academy
The EHDEN Academy is an educational initiative with many resources on the basics of OMOP: how the OMOP CDM is structured, its vocabularies, and so on.
In particular we would recommend these courses:
- …
Vocabularies
This is a brief introduction to a very complicated topic; please refer to the links provided to learn more and get a more in-depth view.
Every record in an RWD database gets coded into a numeric identifier (code). There are many different medical classifications, each of which is a different vocabulary with its own pros and cons, and each database will generally come with a different vocabulary. The OMOP CDM designates some vocabularies as standard, and these are the ones commonly used. In general, the standard vocabulary will differ from the source one (the one your data were originally coded in).
We call mapping the process of converting source data (the data in its original format), which can come in many different formats, to the OMOP CDM. In our team, Antonella and Teen are in charge of the mapping process.
Athena contains the latest version of the vocabularies and is used to track their changes. You can use Athena to search for OMOP concepts and see how they are related.
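As a rough illustration of what mapping a source code to a standard concept looks like, here is a minimal sketch in Python with an in-memory SQLite database. It assumes a cut-down version of the OMOP `concept` and `concept_relationship` tables and follows the 'Maps to' relationship from a source code to its standard concept. The rows are typed by hand for illustration (the concept IDs are made up and are not an extract from Athena), and the helper names `build_toy_vocabulary` and `map_to_standard` are hypothetical.

```python
import sqlite3


def build_toy_vocabulary():
    """Create an in-memory database with a tiny, hand-made vocabulary."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE concept (
            concept_id INTEGER PRIMARY KEY,
            concept_name TEXT,
            vocabulary_id TEXT,
            concept_code TEXT,
            standard_concept TEXT  -- 'S' for standard concepts, NULL otherwise
        );
        CREATE TABLE concept_relationship (
            concept_id_1 INTEGER,
            concept_id_2 INTEGER,
            relationship_id TEXT
        );

        -- Made-up concept IDs; one source (ICD10CM) and one standard (SNOMED) concept.
        INSERT INTO concept VALUES
            (900001, 'Type 2 diabetes mellitus without complications',
             'ICD10CM', 'E11.9', NULL),
            (900002, 'Type 2 diabetes mellitus', 'SNOMED', '44054006', 'S');
        INSERT INTO concept_relationship VALUES
            (900001, 900002, 'Maps to'),
            (900002, 900002, 'Maps to');  -- standard concepts map to themselves
        """
    )
    return con


def map_to_standard(con, vocabulary_id, concept_code):
    """Return the standard concepts that a given source code maps to."""
    query = """
        SELECT target.concept_id, target.concept_name, target.vocabulary_id
        FROM concept AS source
        JOIN concept_relationship AS cr
          ON cr.concept_id_1 = source.concept_id
         AND cr.relationship_id = 'Maps to'
        JOIN concept AS target
          ON target.concept_id = cr.concept_id_2
         AND target.standard_concept = 'S'
        WHERE source.vocabulary_id = ?
          AND source.concept_code = ?
    """
    return con.execute(query, (vocabulary_id, concept_code)).fetchall()


if __name__ == "__main__":
    vocab = build_toy_vocabulary()
    print(map_to_standard(vocab, "ICD10CM", "E11.9"))
```

A source code with no 'Maps to' relationship returns an empty list here, which is the kind of unmapped code a mapping process has to deal with.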
Final remark
This was a very general introduction to the OMOP CDM, as there are many resources out there that can help you familiarise yourself with OMOP. We recommend taking a look at The Book of OHDSI and the recommended courses from the EHDEN Academy, but the best way to learn about OMOP is to do a study with one of our OMOP instances: learning by doing :).