Published

June 26, 2025

Set up

This section explains how to set up your local environment for the practical sessions.

Issues

If you have any issues setting up your environment, please contact:

Set up your laptop

Follow these instructions to set up your environment:

Install R

  • https://cran.r-project.org/bin/windows/base/ (at least version 4.2)

Install RStudio

  • https://posit.co/download/rstudio-desktop/

Install Rtools

  • https://cran.r-project.org/bin/windows/Rtools/

After these steps, open RStudio and install the following R packages. You can install a package from the R console by typing: install.packages("PackageName")

  • DBI
  • duckdb
  • here
  • usethis
  • dplyr
  • dbplyr
  • CDMConnector
  • PatientProfiles
  • IncidencePrevalence
  • CohortConstructor
  • DrugUtilisation
  • OmopSketch
  • visOmopResults
  • CohortCharacteristics

To install all of them at once, run:

install.packages(c("DBI", "duckdb", "here", "usethis", "dplyr", "dbplyr", 
                   "CDMConnector", "PatientProfiles", "IncidencePrevalence", 
                   "CohortConstructor", "DrugUtilisation", "OmopSketch", 
                   "visOmopResults", "CohortCharacteristics"))
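
If the installation was interrupted, it can help to see what is still missing before running the check below. The following is a minimal sketch using only base R; the variable names (`required`, `not_installed`, `r_ok`) are illustrative and not part of the course material.

```r
# Packages required for the practical sessions (same list as above).
required <- c("DBI", "duckdb", "here", "usethis", "dplyr", "dbplyr",
              "CDMConnector", "PatientProfiles", "IncidencePrevalence",
              "CohortConstructor", "DrugUtilisation", "OmopSketch",
              "visOmopResults", "CohortCharacteristics")

# The instructions ask for R version 4.2 or later.
r_ok <- getRversion() >= "4.2.0"
message("R version ", as.character(getRversion()),
        if (r_ok) " is recent enough." else " is older than 4.2; please update R.")

# Compare the required packages with those found in the current library paths.
not_installed <- setdiff(required, rownames(installed.packages()))

if (length(not_installed) == 0) {
  message("All required packages are installed.")
} else {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
  # Uncomment to install only the missing ones:
  # install.packages(not_installed)
}
```
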

Check code works

Run the following block of code and check that it produces the same output without errors (temporary paths such as /tmp/RtmpKgcutL will differ on your machine):

library(DBI)
library(duckdb)
library(here)
here() starts at /home/runner/work/RealWorldEvidenceSummerSchool2025/RealWorldEvidenceSummerSchool2025
library(usethis)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(dbplyr)

Attaching package: 'dbplyr'
The following objects are masked from 'package:dplyr':

    ident, sql
library(CDMConnector)
library(PatientProfiles)
library(IncidencePrevalence)
library(CohortConstructor)
library(DrugUtilisation)
library(OmopSketch)
library(visOmopResults)
library(CohortCharacteristics)

requireEunomia(datasetName = "GiBleed")
ℹ `EUNOMIA_DATA_FOLDER` set to: '/tmp/RtmpKgcutL'.

Download completed!
db <- dbConnect(duckdb(), dbdir = eunomiaDir())
Creating CDM database /tmp/RtmpKgcutL/GiBleed_5.3.zip
cdm <- cdmFromCon(con = db, cdmSchema = "main", writeSchema = "main")
cdm$my_cohort <- conceptCohort(
  cdm = cdm,
  name = "my_cohort",
  conceptSet = list('chronic_sinusitis' = 257012L)
)
ℹ Subsetting table condition_occurrence using 1 concept with domain: condition.
ℹ Combining tables.
ℹ Creating cohort attributes.
ℹ Applying cohort requirements.
ℹ Merging overlapping records.
✔ Cohort my_cohort created.
settings(cdm$my_cohort)
# A tibble: 1 × 4
  cohort_definition_id cohort_name       cdm_version vocabulary_version
                 <int> <chr>             <chr>       <chr>             
1                    1 chronic_sinusitis 5.3         v5.0 18-JAN-19    

Store data permanently

Note that this code downloads the GiBleed data set every time it runs. GiBleed is a small data set of only 6 MB, but other data sets can be ~1 GB, and downloading them every time is inefficient. To store the data permanently, set up an environment variable pointing to the folder where data sets will be stored.

To set up an environment variable use:

usethis::edit_r_environ()

Add the following line to the .Renviron file that opens, replacing the path with your own folder:

EUNOMIA_DATA_FOLDER="path/to/data/folder"

Restart R. From then on, every time you call requireEunomia() or downloadEunomiaData(), the function checks whether the data set is already in that folder, so you don’t have to download the same data twice.
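
If you prefer not to edit .Renviron, the variable can also be set for the current R session only. The sketch below uses base R; the folder name `eunomia_data` in your home directory is a hypothetical choice for illustration, and it assumes the Eunomia functions read `EUNOMIA_DATA_FOLDER` when they are called.

```r
# Hypothetical location, for illustration only; pick any folder you can write to.
data_folder <- file.path(path.expand("~"), "eunomia_data")

# Create the folder if it does not exist yet (no warning if it already does).
dir.create(data_folder, recursive = TRUE, showWarnings = FALSE)

# Set the variable for this session only; it is lost when R restarts.
Sys.setenv(EUNOMIA_DATA_FOLDER = data_folder)
```

Unlike the .Renviron entry, this setting has to be repeated at the start of every session.
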

To check that the path was saved correctly, run:

Sys.getenv("EUNOMIA_DATA_FOLDER")
[1] "path/to/data/folder"
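
An empty string here means the variable was not picked up, and a path that does not exist on disk may also cause problems. As a rough sketch, a small helper can report both cases; `check_data_folder()` is a hypothetical function written for this example, not part of any package.

```r
# Hypothetical helper (not part of any package): reports whether the folder
# configured in EUNOMIA_DATA_FOLDER is set and exists on disk.
check_data_folder <- function(path = Sys.getenv("EUNOMIA_DATA_FOLDER")) {
  if (!nzchar(path)) {
    message("EUNOMIA_DATA_FOLDER is not set; restart R after editing .Renviron.")
    return(invisible("unset"))
  }
  if (!dir.exists(path)) {
    message("Folder does not exist yet: ", path)
    return(invisible("missing"))
  }
  message("Data sets will be stored in: ", path)
  invisible("ok")
}

check_data_folder()
```
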