OmopConstructor

An R Package for Customising Observation Periods in OMOP CDM Analyses

OmopConstructor

You can install OmopConstructor from GitHub:

library(pak)
pkg_install("ohdsi/OmopConstructor")

It has been submitted to CRAN and should be available there soon!

The package documentation and vignettes can be found on our website: https://ohdsi.github.io/OmopConstructor/

Motivation

Let’s get started

For this presentation we will use the GiBleed dataset from omock.

library(omock)
cdm <- mockCdmFromDataset(datasetName = "GiBleed", source = "duckdb")
cdm
── # OMOP CDM reference (duckdb) of GiBleed ────────────────────────────────────────────────────────────────────────────
• omop tables: care_site, cdm_source, concept, concept_ancestor, concept_class, concept_relationship, concept_synonym,
condition_era, condition_occurrence, cost, death, device_exposure, domain, dose_era, drug_era, drug_exposure,
drug_strength, fact_relationship, location, measurement, metadata, note, note_nlp, observation, observation_period,
payer_plan_period, person, procedure_occurrence, provider, relationship, source_to_concept_map, specimen, visit_detail,
visit_occurrence, vocabulary
• cohort tables: -
• achilles tables: -
• other tables: -

The cdm reference object

If you are not familiar with the cdm_reference object, you can take a look at its formal definition: cdm_reference.

We would usually use the CDMConnector package to create a cdm_reference from our database.

Characterisation of the observation period

cdm$observation_period
# Source:   table<observation_period> [?? x 5]
# Database: DuckDB 1.4.0 [unknown@Linux 6.11.0-1018-azure:R 4.4.1//tmp/RtmpANlfTn/file2bc89560610.duckdb]
   observation_period_id person_id observation_period_start_date observation_period_end_date period_type_concept_id
                   <int>     <int> <date>                        <date>                                       <int>
 1                     6         6 1963-12-31                    2007-02-06                                44814724
 2                    13        13 2009-04-26                    2019-04-14                                44814724
 3                    27        27 2002-01-30                    2018-11-21                                44814724
 4                    16        16 1971-10-14                    2017-11-02                                44814724
 5                    55        55 2009-05-30                    2019-03-23                                44814724
 6                    60        60 1990-11-21                    2019-01-23                                44814724
 7                    42        42 1909-11-03                    2019-03-13                                44814724
 8                    33        33 1986-05-12                    2018-09-10                                44814724
 9                    18        18 1965-11-17                    2018-11-07                                44814724
10                    25        25 2007-03-18                    2019-04-07                                44814724
# ℹ more rows

Characterisation of the observation period

We will use OmopSketch to characterise the current observation period.

library(OmopSketch)
result <- summariseInObservation(cdm$observation_period, interval = "years", sex = TRUE)
plotInObservation(result, colour = "sex")

Characterisation of the observation period

result <- summariseObservationPeriod(cdm$observation_period)
tableObservationPeriod(result)
Observation period ordinal  Variable name       Estimate name       GiBleed (CDM name)
all                         Number records      N                   2,694
all                         Number subjects     N                   2,694
all                         Records per person  mean (sd)           1.00 (0.00)
all                         Records per person  median [Q25 - Q75]  1 [1 - 1]
all                         Duration in days    mean (sd)           21,602.60 (5,460.69)
all                         Duration in days    median [Q25 - Q75]  20,872 [17,495 - 24,702]
1st                         Number subjects     N                   2,694
1st                         Duration in days    mean (sd)           21,602.60 (5,460.69)
1st                         Duration in days    median [Q25 - Q75]  20,872 [17,495 - 24,702]
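To make the summary above more concrete, here is a minimal base R sketch of how the records per person and the duration in days can be derived from an observation period table. It is not the OmopSketch implementation: the three-row table is toy data invented for illustration, and counting both the start and the end date in the duration is an assumption.

```r
# Toy observation period table (invented data, NOT taken from GiBleed).
op <- data.frame(
  person_id = c(1L, 2L, 3L),
  observation_period_start_date = as.Date(c("2000-01-01", "2005-06-15", "2010-03-01")),
  observation_period_end_date = as.Date(c("2000-12-31", "2006-06-14", "2010-03-01"))
)

# Duration in days, counting both endpoints (assumption).
duration_days <- as.numeric(
  op$observation_period_end_date - op$observation_period_start_date
) + 1

# Number of observation period records for each person.
records_per_person <- as.numeric(table(op$person_id))

# The same kind of estimates shown in the summary table.
summary_stats <- c(
  n_records = nrow(op),
  n_subjects = length(unique(op$person_id)),
  mean_records_per_person = mean(records_per_person),
  mean_duration = mean(duration_days),
  median_duration = median(duration_days)
)
print(summary_stats)
```

With real data the same calculation would run on the collected observation_period table rather than on a hand-written data frame.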

Build observation period

library(OmopConstructor)
cdm <- buildObservationPeriod(cdm = cdm)
ℹ Using censor date as 2019-05-25 from source_release_date.
cdm$observation_period
# Source:   table<results.test_observation_period> [?? x 5]
# Database: DuckDB 1.4.0 [unknown@Linux 6.11.0-1018-azure:R 4.4.1//tmp/RtmpANlfTn/file2bc89560610.duckdb]
   observation_period_id person_id observation_period_start_date observation_period_end_date period_type_concept_id
                   <int>     <int> <date>                        <date>                                       <int>
 1                     1         1 1953-02-06                    2019-05-25                                   32817
 2                     2         2 1920-07-01                    2019-05-25                                   32817
 3                     3         3 1923-12-22                    2019-05-25                                   32817
 4                     4         5 1968-10-30                    2019-05-25                                   32817
 5                     5         6 1964-04-07                    2019-05-25                                   32817
 6                     6         7 1968-12-06                    2019-05-25                                   32817
 7                     7         9 1978-10-26                    2019-05-25                                   32817
 8                     8        11 1955-01-11                    2019-05-25                                   32817
 9                     9        12 1963-05-08                    2019-05-25                                   32817
10                    10        16 1972-01-19                    2019-05-25                                   32817
# ℹ more rows

Characterise observation period

First to last

library(dplyr) # provides mutate(), used below to label each result
cdm <- buildObservationPeriod(
  cdm = cdm,
  collapseDays = Inf,
  persistenceDays = 0
)
ℹ Using censor date as 2019-05-25 from source_release_date.
result2 <- summariseInObservation(cdm$observation_period, interval = "years", output = "person-days") |>
  mutate(cdm_name = "First to last")

Only ongoing visit

cdm <- buildObservationPeriod(
  cdm = cdm,
  collapseDays = 0,
  persistenceDays = 0,
  recordsFrom = "visit_occurrence"
)
ℹ Using censor date as 2019-05-25 from source_release_date.
ℹ `persistenceDays` (0) can not be equal to `collapseDays` (0) as back to back observation periods are not allowed,
  setting `collapseDays = 1`.
result3 <- summariseInObservation(cdm$observation_period, interval = "years", output = "person-days") |>
  mutate(cdm_name = "Inpatient")
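The roles of collapseDays and persistenceDays, and the adjustment reported in the message above, can be illustrated with a small base R sketch. This is not the OmopConstructor implementation: build_periods() and its argument names are hypothetical, the merging rule (two records are merged when the next start is at most collapse_days after the latest end seen so far) is inferred from the message, the check that collapse_days is not smaller than persistence_days is a simplification of mine, and censoring is ignored.

```r
# Hypothetical helper for illustration only, NOT the package's code.
# records: data.frame with columns person_id, start and end (Date).
build_periods <- function(records, collapse_days, persistence_days) {
  # Simplifications of this sketch: finite persistence, collapse >= persistence.
  stopifnot(is.finite(persistence_days), collapse_days >= persistence_days)
  # Equal values would leave back-to-back periods, so bump collapse_days by one
  # (this mirrors the message printed by buildObservationPeriod()).
  if (is.finite(collapse_days) && collapse_days == persistence_days) {
    collapse_days <- collapse_days + 1
  }
  pieces <- lapply(split(records, records$person_id), function(x) {
    x <- x[order(x$start, x$end), ]
    # Latest end date seen so far, so nested records are handled correctly.
    run_end <- cummax(as.numeric(x$end))
    # Days between each record start and the latest previous end.
    gap <- as.numeric(x$start)[-1] - run_end[-nrow(x)]
    # A new period starts whenever the gap exceeds collapse_days.
    group <- cumsum(c(TRUE, gap > collapse_days))
    data.frame(
      person_id = x$person_id[1],
      start = as.Date(as.numeric(tapply(as.numeric(x$start), group, min)),
                      origin = "1970-01-01"),
      # persistence_days extends the end of every resulting period.
      end = as.Date(as.numeric(tapply(as.numeric(x$end), group, max)),
                    origin = "1970-01-01") + persistence_days
    )
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Toy records: two back-to-back records and a third one much later.
records <- data.frame(
  person_id = c(1L, 1L, 1L),
  start = as.Date(c("2000-01-01", "2000-01-11", "2001-06-01")),
  end = as.Date(c("2000-01-10", "2000-01-20", "2001-06-05"))
)

# "First to last": everything collapses into a single period.
first_to_last <- build_periods(records, collapse_days = Inf, persistence_days = 0)
# "Ongoing": only the back-to-back records are merged.
ongoing <- build_periods(records, collapse_days = 0, persistence_days = 0)
print(first_to_last)
print(ongoing)
```

In this toy example the first call yields one period spanning all three records, while the second keeps the late record as a separate period because the two requested zeros are turned into a collapse of one day.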

Ongoing record

cdm <- buildObservationPeriod(
  cdm = cdm,
  collapseDays = 0,
  persistenceDays = 0,
  recordsFrom = c("visit_occurrence", "drug_exposure", "condition_occurrence", "procedure_occurrence")
)
ℹ Using censor date as 2019-05-25 from source_release_date.
ℹ `persistenceDays` (0) can not be equal to `collapseDays` (0) as back to back observation periods are not allowed,
  setting `collapseDays = 1`.
result4 <- summariseInObservation(cdm$observation_period, interval = "years", output = "person-days") |>
  mutate(cdm_name = "Ongoing record")

Collapse by 365

cdm <- buildObservationPeriod(
  cdm = cdm,
  collapseDays = 365,
  persistenceDays = 0,
  recordsFrom = c("visit_occurrence", "drug_exposure", "condition_occurrence", "procedure_occurrence")
)
ℹ Using censor date as 2019-05-25 from source_release_date.
result5 <- summariseInObservation(cdm$observation_period, interval = "years", output = "person-days") |>
  mutate(cdm_name = "Collapse 365")

Collapse by 365 and persistence

cdm <- buildObservationPeriod(
  cdm = cdm,
  collapseDays = 365,
  persistenceDays = 365,
  recordsFrom = c("visit_occurrence", "drug_exposure", "condition_occurrence", "procedure_occurrence")
)
ℹ Using censor date as 2019-05-25 from source_release_date.
ℹ `persistenceDays` (365) can not be equal to `collapseDays` (365) as back to back observation periods are not allowed,
  setting `collapseDays = 366`.
result6 <- summariseInObservation(cdm$observation_period, interval = "years", output = "person-days") |>
  mutate(cdm_name = "Collapse 365 + persistence")

Reliability and extraction date

cdm <- buildObservationPeriod(
  cdm = cdm,
  collapseDays = 730,
  persistenceDays = 730,
  dateRange = c("1990-01-01", "2009-12-31"),
  recordsFrom = c("visit_occurrence", "drug_exposure", "condition_occurrence", "procedure_occurrence")
)
ℹ `persistenceDays` (730) can not be equal to `collapseDays` (730) as back to back observation periods are not allowed,
  setting `collapseDays = 731`.
result7 <- summariseInObservation(cdm$observation_period, interval = "years", output = "person-days") |>
  mutate(cdm_name = "Collapse 730 + persistence")
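One plausible reading of dateRange is that observation time outside the range is discarded. The base R sketch below illustrates that reading only: clip_periods() is a hypothetical helper, the periods are toy data, and the package may instead apply the restriction to the underlying records before building the periods.

```r
# Hypothetical helper for illustration only, NOT the package's code.
# periods: data.frame with columns person_id, start and end (Date).
clip_periods <- function(periods, date_range) {
  date_range <- as.Date(date_range)
  # Move starts and ends inside the allowed range.
  start <- pmax(periods$start, date_range[1])
  end <- pmin(periods$end, date_range[2])
  # Periods lying entirely outside the range end up with start > end.
  keep <- start <= end
  data.frame(
    person_id = periods$person_id[keep],
    start = start[keep],
    end = end[keep]
  )
}

# Toy periods (invented data).
periods <- data.frame(
  person_id = c(1L, 2L, 3L),
  start = as.Date(c("1985-03-01", "2005-01-01", "2012-01-01")),
  end = as.Date(c("1995-06-30", "2019-05-25", "2015-12-31"))
)

clipped <- clip_periods(periods, c("1990-01-01", "2009-12-31"))
print(clipped)
```

Here the first period is cut at the lower bound, the second at the upper bound, and the third is dropped because it falls entirely after the range.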

Combine and compare

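# result0 and result1 are not created in the code shown above; they are assumed
# to hold the summariseInObservation() output for the original and the default
# observation periods, labelled in the same way as result2 to result7.
result <- bind(result0, result1, result2, result3, result4, result5, result6, result7)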
plotInObservation(result = result, colour = "cdm_name")

Future steps

  • buildDrugEra()

  • buildConditionEra()

  • cdmSample()

OmopConstructor

👉 Package website: https://ohdsi.github.io/OmopConstructor/
👉 CRAN link: soon
👉 Manual: soon

📧 marti.catalasabate@ndorms.ox.ac.uk