Giving users informative error messages is one of the most important things you can do to make a package pleasant to use. A cryptic error from deep inside a dplyr pipeline — or worse, a silent wrong result — is much harder to debug than a clear message at the top of the function that says exactly what was wrong with the input.
The rule of thumb is: validate at the boundary, trust inside. Check all inputs at the very start of each exported function, before any work is done. Once those checks pass, do not re-check the same things in internal helpers.
That said, do not overdo it. Avoid checking things that are guaranteed by earlier checks, and avoid checks that cost more than the function itself. In general, check that each argument has the expected type, that it has the expected length, and that any logical constraints between arguments are satisfied.
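The "validate at the boundary, trust inside" rule can be sketched as follows. This is a minimal illustration, not code from any package: daysToYears() and convertDays() are hypothetical names, and the checks use the omopgenerics assert* helpers described later in this chapter.

```r
# Hypothetical exported function: every input is checked here, once.
daysToYears <- function(days, daysPerYear = 365.25) {
  omopgenerics::assertNumeric(days, min = 0)
  omopgenerics::assertNumeric(daysPerYear, min = 1, length = 1)
  convertDays(days, daysPerYear)
}

# Hypothetical internal helper: it trusts its caller and does not
# repeat the checks above.
convertDays <- function(days, daysPerYear) {
  days / daysPerYear
}
```

Calling daysToYears("a") fails immediately at the boundary, while convertDays() stays free of defensive code because it is only ever reached with inputs that already passed the checks.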
8.1 The two kinds of validation helpers in omopgenerics
omopgenerics provides two families of validation helpers that together cover almost everything you need.
validate* functions are for arguments that appear consistently across many packages — cdm, cohort, cohortId, conceptSet, window, ageGroup, strata, name, nameStyle, and result. These functions do more than just check: they also coerce and normalise the input to a canonical form. You must assign their return value back to the variable, because the cleaned-up version is what the rest of your function should use.
assert* functions are for simple type and constraint checks. They do not transform the input — they either pass silently or throw an error. Use these for arguments that do not have a dedicated validate* function.
8.2 validate* functions
The following validate* functions are available in omopgenerics. Import them with @importFrom omopgenerics validateCdmArgument (and so on) rather than calling them with omopgenerics::.
| Function | What it validates |
|---|---|
| validateCohortIdArgument(cohortId, cohort) | Cohort IDs: resolves NULL to all IDs, checks IDs exist in cohort |
| validateConceptSetArgument(conceptSet) | Concept set: accepts codelist, codelist_with_details, or concept_set_expression |
| validateWindowArgument(window) | Window: accepts vector or named list, always returns named list |
| validateAgeGroupArgument(ageGroup) | Age groups: accepts several input forms, always returns named list of named intervals |
| validateStrataArgument(strata, cohort) | Strata: checks that all columns named in strata exist in the cohort |
| validateNameArgument(name, cdm) | Table name: checks it is a valid identifier, optionally checks it doesn’t already exist |
| validateNameStyle(nameStyle, ...) | Name style template: checks it is a valid glue template |
| validateResultArgument(result) | Summarised result: checks class and required columns |
8.2.1 Normalisation is the point
The most important property of validate* functions is that they accept several reasonable input forms and always return a single canonical form. This means the rest of your function only has to handle one form, which keeps the logic clean.
The ageGroup argument illustrates this well. Users can supply it in several ways:
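As a minimal sketch, two input forms the validator is assumed to accept are an unnamed list of intervals and a list of labelled intervals. The labels children and adults are illustrative only, and the exact names in the output depend on the installed omopgenerics version, so the sketch inspects the structure with str() rather than stating it.

```r
library(omopgenerics)

# A single age group column given as an unnamed list of intervals
ag1 <- validateAgeGroupArgument(ageGroup = list(c(0, 19), c(20, Inf)))

# The same intervals with explicit (hypothetical) labels
ag2 <- validateAgeGroupArgument(
  ageGroup = list(children = c(0, 19), adults = c(20, Inf))
)

# Inspect the normalised structure of each result
str(ag1)
str(ag2)
```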
Whatever you pass in, the output is always a named list of named intervals. The function also throws clear errors if the input is malformed:
```r
# Overlapping age groups when only one column is allowed
validateAgeGroupArgument(
  ageGroup = list(
    group1 = list(c(0, 19), c(20, Inf)),
    group2 = list(c(0, Inf))
  ),
  multipleAgeGroup = FALSE
)
```

```
Error in `purrr::map()`:
ℹ In index: 1.
Caused by error:
! Elements of `ageGroup` argument must be greater or equal to "0".
```

```r
# NULL not allowed
validateAgeGroupArgument(ageGroup = NULL, null = FALSE)
```

```
Error:
! `ageGroup` argument can not be NULL.
```
The same principle applies to validateWindowArgument(), which accepts either a plain c(0, Inf) vector or a named list of windows, and always returns a named list.
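A short sketch of that normalisation, assuming omopgenerics is installed. The window names short and long are hypothetical, and the names generated for an unnamed window are not shown because they depend on the package's naming rules.

```r
library(omopgenerics)

# A single window supplied as a plain numeric vector
w1 <- validateWindowArgument(window = c(0, Inf))

# Several windows supplied as a named list
w2 <- validateWindowArgument(
  window = list(short = c(0, 30), long = c(31, 365))
)

# Both results are lists of windows, so downstream code can
# iterate over them in exactly the same way
for (nm in names(w2)) {
  message(nm, ": ", w2[[nm]][1], " to ", w2[[nm]][2])
}
```

Because the vector form is wrapped into a list, code after the validator never needs a special case for "just one window".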
8.2.2 The {{ syntax for cohortId
validateCohortIdArgument() supports tidyselect semantics, so users can pass things like starts_with("fracture") to select cohort IDs by name. To make this work, you must wrap the argument in double curly braces {{ when you pass it to the validator:
Without {{, tidyselect expressions will not be evaluated correctly.
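The forwarding pattern described above can be sketched like this. myCohortFunction() is a hypothetical wrapper, and the cohort it receives is assumed to be a cohort table that already exists in a CDM reference.

```r
# Hypothetical exported function that forwards `cohortId` to the validator
myCohortFunction <- function(cohort, cohortId = NULL) {
  cohort <- omopgenerics::validateCohortArgument(cohort = cohort)
  # Embrace the argument so tidyselect expressions reach the validator intact
  cohortId <- omopgenerics::validateCohortIdArgument(
    cohortId = {{ cohortId }},
    cohort = cohort
  )
  cohortId
}
```

With a cohort table at hand, a call such as myCohortFunction(cdm$my_cohort, cohortId = starts_with("fracture")) would then select cohorts by name; cdm$my_cohort is a placeholder for whatever cohort table the user has.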
8.3 assert* functions
Use assert* functions for arguments that do not have a dedicated validate* function. Unlike validate*, these functions return the input invisibly and are used purely for their side effect of throwing an error.
| Function | Checks |
|---|---|
| assertCharacter(x, ...) | x is a character vector |
| assertNumeric(x, ...) | x is numeric |
| assertLogical(x, ...) | x is logical |
| assertDate(x, ...) | x is a Date |
| assertList(x, ...) | x is a list |
| assertClass(x, class, ...) | x has the specified class(es) |
| assertChoice(x, choices, ...) | x is one of the allowed values |
| assertTable(x, ...) | x is a table (data frame or tbl_sql) |
| assertTrue(expr, ...) | The expression evaluates to TRUE |
All of these accept common modifiers as arguments:
| Modifier | Type | Meaning |
|---|---|---|
| length | integer or NULL | Required length; NULL skips the check |
| null | logical | Whether NULL is a valid input (default FALSE) |
| na | logical | Whether NA values are allowed (default FALSE) |
| named | logical | Whether elements must be named |
| unique | logical | Whether elements must be unique |
| min | numeric | Minimum value (for assertNumeric) |
| max | numeric | Maximum value (for assertNumeric) |
| integerish | logical | Whether numeric must be whole numbers (for assertNumeric) |
| minNchar | integer | Minimum number of characters per element (for assertCharacter) |
| call | call | Passed to the cli error message for better call context |
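The modifiers combine freely in a single call. The sketch below uses hypothetical values (days, startDate, tableName) purely to show how a few of them read in practice; it assumes omopgenerics is installed.

```r
library(omopgenerics)

# Hypothetical inputs, used only to illustrate the modifiers
days <- 30L
startDate <- NULL
tableName <- "my_cohort"

# A single non-negative whole number
assertNumeric(days, integerish = TRUE, min = 0, length = 1)

# A single Date, or NULL
assertDate(startDate, length = 1, null = TRUE)

# A single string that must be one of the allowed values
assertCharacter(tableName, length = 1)
assertChoice(tableName, choices = c("my_cohort", "other_cohort"), length = 1)

# A failing check raises an error, captured here with tryCatch()
res <- tryCatch(
  assertNumeric(-5, min = 0),
  error = function(e) conditionMessage(e)
)
```

The passing calls produce no output, which is why they can be written as plain statements at the top of a function.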
8.4 Putting it together: a validation block
Here is the recommended pattern. Validation comes first, before any computation. validate* results are always reassigned. assert* calls are written as statements.
```r
myFunction <- function(cohort, cohortId = NULL, window = c(0, Inf), overlap = FALSE) {
  # validate and normalise
  cohort <- omopgenerics::validateCohortArgument(cohort = cohort)
  cohortId <- omopgenerics::validateCohortIdArgument(cohortId = {{cohortId}}, cohort = cohort)
  window <- omopgenerics::validateWindowArgument(window = window)

  # simple type checks
  omopgenerics::assertLogical(overlap, length = 1)

  # business logic checks
  if (overlap && any(sapply(window, function(w) diff(w) > 365))) {
    cli::cli_abort(c(
      "x" = "Windows longer than 365 days are not supported when {.var overlap} is {.val TRUE}."
    ))
  }

  # ... function body
}
```
A second example with a CDM reference:
```r
myFunction2 <- function(cdm, conceptSet, days = 180L, startDate = NULL, overlap = TRUE) {
  # validate and normalise
  cdm <- omopgenerics::validateCdmArgument(cdm = cdm)
  conceptSet <- omopgenerics::validateConceptSetArgument(conceptSet = conceptSet)

  # simple type checks
  omopgenerics::assertNumeric(days, integerish = TRUE, min = 0, length = 1)
  omopgenerics::assertDate(startDate, length = 1, null = TRUE)
  omopgenerics::assertLogical(overlap, length = 1)

  # business logic checks
  if (overlap && days > 365) {
    cli::cli_abort(c(
      "x" = "{.var days} cannot be greater than 365 when {.var overlap} is {.val TRUE}."
    ))
  }

  # ... function body
}
```
8.5 Writing good error messages with cli
Custom errors and warnings should use the cli package, which is already a dependency of omopgenerics. Do not use stop() or warning().
```r
# Error
cli::cli_abort(c(
  "x" = "Argument {.var days} must be a positive integer.",
  "i" = "You supplied {.val {days}}."
))

# Warning
cli::cli_warn(c(
  "!" = "Cohort {.val {name}} already exists in the CDM and will be overwritten."
))

# Informational message
cli::cli_inform(c(
  "i" = "Computing {length(cohortId)} cohort{?s}."
))
```
A few conventions for error messages:
Use {.var argument_name} to refer to an argument by name.
Use {.val {value}} to show the actual value that was supplied.
Use {.fn function_name} to refer to a function.
Use {.cls class_name} to refer to a class.
Use named elements "x" for errors, "!" for warnings, "i" for hints or additional information.
Write error messages in the present tense from the user’s perspective: “must be”, “cannot be”, “is not” rather than “expected”, “got”.
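The conventions above can be combined in one message. The helper below is a hypothetical illustration of the {.cls} and {.fn} markup in particular; in real code assertTable() would normally cover this check.

```r
# Hypothetical helper that rejects anything that is not a data frame
checkIsTable <- function(x) {
  if (!is.data.frame(x)) {
    cli::cli_abort(c(
      "x" = "{.var x} must be a {.cls data.frame}.",
      "i" = "It is an object of class {.cls {class(x)}}; see {.fn as.data.frame}."
    ))
  }
  invisible(x)
}
```

The message states what the argument must be and what it currently is, both in the present tense, and points to a function the user can call to fix the input.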
8.6 What to validate and what to skip
Validate arguments that users are likely to pass in many ways or get wrong:
Arguments with a standard validate* function — always validate these.
Type-sensitive arguments: anything the rest of the function passes directly to dplyr or database operations.
Arguments with constraints relative to each other (e.g. days must not exceed 365 when overlap is TRUE).
Do not validate:
Internal function arguments that are never user-facing.
Things already guaranteed by a validate* call (e.g. do not re-check that cohortId is numeric after calling validateCohortIdArgument).
Constants you define yourself inside the function.
8.7 Summary
Validating arguments is one of the most impactful things you can do for users — it turns opaque database errors into clear messages that point directly to the problem. omopgenerics makes this straightforward: use validate* for the standard OMOP arguments (and always reassign the result), use assert* for simple type checks, and use cli for any custom messages. A complete validation block for a typical function takes four or five lines and requires no custom logic.