4  Set up your environment

To build an R package you need R installed and an IDE, preferably RStudio or Positron.

You will also need the devtools and usethis packages, which provide the helper functions used in the steps of this chapter.

# use base R
install.packages(c("usethis", "devtools"))

# or pak (recommended); pak itself must be installed first
# install.packages("pak")
pak::pkg_install(c("usethis", "devtools"))

To work with the Tidy R OMOP ecosystem, you will also need to install the omopgenerics package, along with other ecosystem packages that may be useful.

You can see the full list of Tidy R packages on our website.
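Before continuing, it can help to confirm that the packages named above are actually installed. The following is a minimal sketch using only base R; the helper name `missingPackages` is hypothetical and not part of usethis, devtools, or any other package.

```r
# Hypothetical helper (not from any package): return the names of packages
# that cannot be found in the current library paths.
missingPackages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    # system.file() returns "" when the package is not installed
    function(p) nzchar(system.file(package = p)),
    logical(1)
  )
  pkgs[!installed]
}

needed <- c("usethis", "devtools", "omopgenerics")
toInstall <- missingPackages(needed)

if (length(toInstall) > 0) {
  message("Still to install: ", paste(toInstall, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Anything reported as missing can then be installed with either of the commands shown earlier.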

4.1 Create an empty package

To create a new empty package, use usethis::create_package() with the path to the folder where you want the package to live. The folder name becomes the package name.

library(usethis)

create_package(path = "path/to/MyPackage")

This will:

  • Create the folder MyPackage/ with the standard package structure (R/, DESCRIPTION, NAMESPACE).
  • Open the new project in a fresh RStudio or Positron window.

Tip: Choosing a package name

Package names must start with a letter, must not end with a dot, and may contain only ASCII letters, numbers, and dots; underscores and hyphens are not allowed. Ideally a name is also short, distinctive, and easy to remember. Use CamelCase to make multi-word names readable (e.g. CohortConstructor, PatientProfiles). Dots are permitted but best avoided, as are all-lowercase names that are hard to distinguish.

You can check whether a name is available on CRAN with available::available("MyPackageName").
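The syntactic rules for a name can also be checked locally before looking it up on CRAN. The sketch below encodes the rules R enforces (ASCII letters, digits, and dots; starts with a letter; does not end with a dot; at least two characters) as a regular expression. The helper `isValidPackageName` is hypothetical, and it checks syntax only, not availability.

```r
# Hypothetical helper: TRUE when `name` is a syntactically valid package name.
isValidPackageName <- function(name) {
  # first character: a letter; middle: letters, digits or dots;
  # last character: a letter or digit (so the name cannot end with a dot)
  grepl("^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$", name)
}

isValidPackageName(c("CohortConstructor", "my_package", "2fast", "trailing."))
```

Only the first name passes: the others contain an underscore, start with a digit, or end with a dot.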

4.1.1 The package structure

After create_package() you will have:

MyPackage/
├── R/                  # Your R source files go here
├── DESCRIPTION         # Package metadata (name, version, dependencies, …)
├── NAMESPACE           # Exported functions; generated by roxygen2 when you run devtools::document()
└── MyPackage.Rproj     # RStudio project file

You will add more files as you go: tests/, vignettes/, man/, and NEWS.md among others. Most of these are created for you by usethis helpers described later in this chapter.
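DESCRIPTION is a plain-text file of `Field: value` pairs, so it can be inspected with base R. The sketch below writes an illustrative DESCRIPTION to a temporary file and reads it back with `read.dcf()`. The field values are placeholders chosen for illustration, not output copied from usethis.

```r
# Illustrative DESCRIPTION content; the values are placeholders.
descLines <- c(
  "Package: MyPackage",
  "Title: What the Package Does (One Line, Title Case)",
  "Version: 0.0.0.9000",
  "Imports: omopgenerics",
  "Suggests: testthat"
)

descFile <- tempfile("DESCRIPTION")
writeLines(descLines, descFile)

# read.dcf() returns a character matrix with one column per field
desc <- read.dcf(descFile)
desc[1, "Package"]
desc[1, "Version"]
```

In a real package you would point `read.dcf()` at the DESCRIPTION file in the package root instead of a temporary file.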

4.2 Add a license

Adding a license is essential — it tells users how they can use, modify, and redistribute your package, and protects you legally. In the Tidy R OMOP ecosystem we release under the Apache 2.0 license.

use_apache_license()

This adds the LICENSE.md file and populates the License field in DESCRIPTION.

usethis supports all common open-source licenses. The right choice usually comes down to how permissive you want to be:

  • More permissive (anyone can use your code, even in closed-source projects):
    • MIT: simple and widely understood.
    • Apache 2.0: like MIT but adds explicit patent protection. This is our default.
  • Copyleft (anyone who distributes changes must also open-source them):
    • GPL v3
    • AGPL v3: extends GPL to cover use over a network.
    • LGPL v3: a weaker copyleft, suitable for libraries.
  • Creative Commons (appropriate for data packages rather than code):
    • CC0: dedicated to the public domain.
    • CC-BY: free to share and adapt, with attribution.

# Run only one of these; each call sets the License field in DESCRIPTION
use_mit_license()
use_apache_license()
use_gpl_license(version = 3)
use_agpl_license(version = 3)
use_lgpl_license(version = 3)
use_cc0_license()
use_ccby_license()

4.3 Create a first function

All R source files live in the R/ folder. Each file can contain one or more functions — the typical convention is one file per function or per closely related group of functions, named to match.

Create a new R file with:

use_r("myFunction")

This creates R/myFunction.R and opens it. Write your function there:

myFunction <- function(cohort, overlap = FALSE) {
  # your code here
  cohort
}

Tip: Add a roxygen skeleton immediately

Before writing the body, add a roxygen2 documentation skeleton with Code → Insert Roxygen Skeleton in RStudio (or Ctrl+Alt+Shift+R). Fill in the @title, @description, @param, and @return tags as you write the function. Doing this from the start is much easier than adding it later. Documentation is covered fully in ?sec-functions.
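As a rough illustration, a filled-in skeleton for the function above might look like the following. The title, description, and parameter wording are placeholders invented for this sketch (in particular, the meaning given to `overlap` is an assumption); only the tag layout is the point.

```r
#' Return a cohort unchanged (placeholder title)
#'
#' @description
#' Placeholder description; replace it with what the function does.
#'
#' @param cohort A cohort table.
#' @param overlap Placeholder wording: whether overlapping records are allowed.
#'
#' @return The input cohort.
#' @export
myFunction <- function(cohort, overlap = FALSE) {
  # your code here
  cohort
}
```

The `#'` lines are ordinary comments to R itself; roxygen2 turns them into help files and NAMESPACE entries when you run `devtools::document()`.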

Once you have written a function, load the whole package into your session with:

devtools::load_all()  # Ctrl+Shift+L in RStudio

load_all() simulates what happens when the package is installed and loaded — it makes all your functions available without having to restart R. Use it constantly during development.

4.4 Check the package

devtools::check() runs R CMD check, the same set of checks that CRAN runs on submitted packages and that you should run regularly yourself. It checks documentation, dependencies, examples, and tests.

devtools::check()  # Ctrl+Shift+E in RStudio

A clean check produces output ending with:

── R CMD check results ─────────────────────────────────────────────────
Duration: 30s

0 errors ✔ | 0 warnings ✔ | 0 notes ✔

Aim to keep your package at 0 errors, 0 warnings, 0 notes at all times — not just before a release. It is much easier to fix one new note immediately than to diagnose five notes that accumulated over months.

Common things to fix early:

  • A missing @export tag on a function that should be public.
  • An undocumented argument (@param entry missing).
  • A package used in code but not listed in DESCRIPTION under Imports or Suggests.
  • An example that fails because it requires a real database.
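To enforce the zero-issue rule in a script, you can count the issues in the object that `devtools::check()` returns. The sketch below assumes that object exposes character vectors named `errors`, `warnings`, and `notes`, as rcmdcheck results do. The helper `isCleanCheck` is hypothetical, and it is demonstrated on a hand-built list so that it runs without checking a real package.

```r
# Hypothetical helper: TRUE when a check result has no errors, warnings or notes.
isCleanCheck <- function(res) {
  length(res$errors) == 0 &&
    length(res$warnings) == 0 &&
    length(res$notes) == 0
}

# Stand-in for a check result, built by hand for illustration
fakeResult <- list(
  errors = character(),
  warnings = character(),
  notes = "checking R code for possible problems ... NOTE"
)

isCleanCheck(fakeResult)

# In a package project you would instead run:
# res <- devtools::check()
# isCleanCheck(res)
```

With the single note in `fakeResult`, the helper returns `FALSE`.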

4.5 Create a simple test

Tests live in tests/testthat/. Set up the testing infrastructure with:

use_testthat()

Then create a test file for your function:

use_test("myFunction")

This creates tests/testthat/test-myFunction.R. A minimal test looks like:

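test_that("myFunction returns a cohort table", {
  # mock cdm from the PatientProfiles package; no real database needed
  cdm <- PatientProfiles::mockPatientProfiles()
  result <- myFunction(cdm$cohort1)
  expect_s3_class(result, "cohort_table")
  # disconnect the mock cdm once the test is done
  CDMConnector::cdmDisconnect(cdm)
})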

Run the tests for the file you are currently editing:

devtools::test_active_file()

Or run all tests in the package:

devtools::test()  # Ctrl+Shift+T in RStudio

Testing with OMOP data is covered in detail in ?sec-test-omop. For now, the important habit is: every function you write should have at least one test file created for it before you move on.

4.6 Create a README

The README is the first thing people see when they visit your package on GitHub. Create one with:

use_readme_rmd()

This creates README.Rmd, which is an R Markdown file that can contain executable code chunks. After editing it, render it to produce README.md:

devtools::build_readme()

Always commit both README.Rmd and README.md. GitHub displays README.md automatically on the repository front page.

A good README for an ecosystem package should include: a one-paragraph description, an installation code block, a minimal usage example using a mock* function (no real database needed), and links to the documentation website. The full README conventions are covered in ?sec-readme.

4.7 Create a documentation website

A pkgdown website gives your package a searchable, browsable reference site generated automatically from your roxygen2 documentation and vignettes. Set it up with:

use_pkgdown()

Build the site locally to preview it:

pkgdown::build_site()

This creates a docs/ folder. Open docs/index.html in your browser to see it.

The ecosystem uses a shared pkgdown theme for visual consistency. Configuration and automated deployment via GitHub Actions are covered in ?sec-automatic-website.

4.8 Set up GitHub

Once your package is working locally, push it to GitHub so it can benefit from continuous integration: automated checks, test coverage reporting, and website deployment on every push.

The fifth part of this book covers GitHub and GitHub Actions in full.

4.9 Using the package template

Although it is instructive to follow these steps manually the first time, the ecosystem provides a GitHub template repository that does all of this setup for you. Use it to create a new repository with the full scaffold already in place — license, GitHub Actions, pkgdown configuration, and lintr — by clicking the link below:

👉 Create a new package from the template

The source for the template is at https://github.com/oxford-pharmacoepi/EmptyPackageTemplate. You can inspect it to see exactly what files are included.