TestGenerator takes as input an Excel file in which each sheet represents a table in the OMOP-CDM. The following example (testPatientsRSV.xlsx) represents a population of 10 patients, some of them with RSV.
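As a rough sketch, one sheet of the workbook can be previewed with the readxl package. The location of the file (an `extdata` folder inside the installed package) and the sheet name `person` are assumptions made for illustration; adjust them to wherever your copy of the workbook lives.

```r
library(readxl)

# Assumption: the example workbook is installed under extdata/ in the package.
filePath <- system.file("extdata", "testPatientsRSV.xlsx", package = "TestGenerator")
stopifnot(nzchar(filePath))  # system.file() returns "" when the file is not found

# Read the sheet that holds the person table and print it as a tibble.
person <- read_excel(filePath, sheet = "person")
person
```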
#> # A tibble: 10 × 5
#>    person_id gender_concept_id year_of_birth race_concept_id
#>        <dbl>             <dbl>         <dbl>           <dbl>
#>  1         1              8532          1980               0
#>  2         2              8507          1980               0
#>  3         3              8532          1965               0
#>  4         4              8532          2010               0
#>  5         5              8532          1936               0
#>  6         6              8532          1970               0
#>  7         7              8532          1988               0
#>  8         8              8507          1998               0
#>  9         9              8507          1990               0
#> 10        10              8532          1945               0
#> # ℹ 1 more variable: ethnicity_concept_id <dbl>
The user can include only the tables that are relevant to the analysis.
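To check which tables a workbook defines, the sheet names can be listed with `readxl::excel_sheets()`. This is only a sketch; the file path below assumes the example workbook is installed under `extdata` in the package.

```r
library(readxl)

# Assumption: the example workbook is installed under extdata/ in the package.
filePath <- system.file("extdata", "testPatientsRSV.xlsx", package = "TestGenerator")
stopifnot(nzchar(filePath))

# Each sheet name corresponds to one OMOP-CDM table included in the test data.
sheets <- excel_sheets(filePath)
sheets
```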
#> [1] "person" "observation_period" "condition_occurrence"
#> [4] "visit_occurrence" "visit_detail" "death"
TestGenerator::readPatients() converts the file into JSON format and saves it in the project. The sample data is then pushed to a blank CDM with patientsCDM().
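The two steps might be chained roughly as follows. The argument names (`filePath`, `testName`, `outputPath`, `pathJson`), the `extdata` location of the workbook, and the use of a temporary output folder are assumptions for illustration; check `?readPatients` and `?patientsCDM` for the signatures in your installed version.

```r
library(TestGenerator)

# Assumption: the example workbook is installed under extdata/ in the package.
filePath <- system.file("extdata", "testPatientsRSV.xlsx", package = "TestGenerator")
stopifnot(nzchar(filePath))

# Assumption: a temporary folder stands in for the project folder here.
outputPath <- file.path(tempdir(), "testCases")
dir.create(outputPath, showWarnings = FALSE)

# Convert the workbook into a JSON test definition named "test".
readPatients(filePath = filePath, testName = "test", outputPath = outputPath)

# Push the JSON definition into a blank CDM and keep the returned reference.
cdm <- patientsCDM(pathJson = outputPath, testName = "test")
cdm
```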
#> Unit Test Definition created successfully: test
#> Patients pushed to blank CDM successfully
#> # OMOP CDM reference (tbl_duckdb_connection)
#>
#> Tables: person, observation_period, visit_occurrence, visit_detail, condition_occurrence, drug_exposure, procedure_occurrence, device_exposure, measurement, observation, death, note, note_nlp, specimen, fact_relationship, location, care_site, provider, payer_plan_period, cost, drug_era, dose_era, condition_era, metadata, cdm_source, concept, vocabulary, domain, concept_class, concept_relationship, relationship, concept_synonym, concept_ancestor, source_to_concept_map, drug_strength, cohort_definition, attribute_definition
patientsCDM() returns a CDM reference object that can now be used in unit tests.
#> # Source: table<person> [?? x 18]
#> # Database: DuckDB v0.9.1 [cbarboza@Windows 10 x64:R 4.3.1/C:\Users\cbarboza\AppData\Local\Temp\Rtmp6Jtt0B\file92c81df13da1.duckdb]
#>    person_id gender_concept_id year_of_birth month_of_birth day_of_birth
#>        <int>             <int>         <int>          <int>        <int>
#>  1         1              8532          1980             NA           NA
#>  2         2              8507          1980             NA           NA
#>  3         3              8532          1965             NA           NA
#>  4         4              8532          2010             NA           NA
#>  5         5              8532          1936             NA           NA
#>  6         6              8532          1970             NA           NA
#>  7         7              8532          1988             NA           NA
#>  8         8              8507          1998             NA           NA
#>  9         9              8507          1990             NA           NA
#> 10        10              8532          1945             NA           NA
#> # ℹ more rows
#> # ℹ 13 more variables: birth_datetime <dttm>, race_concept_id <int>,
#> # ethnicity_concept_id <int>, location_id <int>, provider_id <int>,
#> # care_site_id <int>, person_source_value <chr>, gender_source_value <chr>,
#> # gender_source_concept_id <int>, race_source_value <chr>,
#> # race_source_concept_id <int>, ethnicity_source_value <chr>,
#> # ethnicity_source_concept_id <int>
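A unit test built on the CDM reference could look roughly like the sketch below, using testthat. The expectations are illustrative only, and the `readPatients()` / `patientsCDM()` argument names, the `extdata` path, and the temporary output folder are assumptions rather than confirmed API details.

```r
library(testthat)
library(TestGenerator)

test_that("person table from the test definition is available", {
  # Assumption: the example workbook is installed under extdata/ in the package.
  filePath <- system.file("extdata", "testPatientsRSV.xlsx", package = "TestGenerator")
  skip_if(!nzchar(filePath), "example workbook not found")

  # Assumption: a temporary folder stands in for the project folder.
  outputPath <- file.path(tempdir(), "testCasesUnit")
  dir.create(outputPath, showWarnings = FALSE)

  readPatients(filePath = filePath, testName = "test", outputPath = outputPath)
  cdm <- patientsCDM(pathJson = outputPath, testName = "test")

  # Bring the lazy database table into memory before inspecting it.
  person <- dplyr::collect(cdm$person)

  # Illustrative checks: the table has rows and the core columns are present.
  expect_gt(nrow(person), 0)
  expect_true(all(c("person_id", "gender_concept_id", "year_of_birth") %in% colnames(person)))
})
```

Real tests would typically run the study code against `cdm` and compare the result with values worked out by hand from the Excel sheets.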