library(dplyr)
library(knitr)
library(stringr)
library(tidyr)
library(glue)
library(purrr)
library(psHarmonize)
#> psHarmonize ver. 0.3.5
#>
#> psHarmonize evaluates and runs code entered in the harmonization sheet.
#> Make sure to only use harmonization sheets from authors you trust.
The harmonization sheet is the input for the harmonization function. It also serves as the set of instructions for recoding (if any) of your variables.
There are two types of modification you can choose from:

- Recode category
- Function
Recode category lets you change the values of a categorical variable. For example, if a variable is coded as 1 and 0 and you want the harmonized variable to read “Yes” and “No”, you can use the recode category option.
Cohort A has an education variable with the values of 1-5. They correspond to values such as “No education”, “Completed grade school”, etc.
If we want to harmonize the coded values 1-5, we can use the `recode category` option. In this example we will harmonize the coded values to “No education/grade school”, “High school”, and “College”.
The table below shows how these values relate to each other.
Coded values | Coded value meaning | Harmonized values |
---|---|---|
1 | No education | No education/grade school |
2 | Completed grade school | No education/grade school |
3 | Jr-High School | High school |
4 | Completed High School | High school |
5 | Some college | College |
In the harmonization sheet, enter `recode category` in the `code_type` column. Then type the code pairings in the `code1` column, joining each coded value to its harmonized value with an `=` sign (`1 = Yes`, for example) and separating the pairs with semicolons.
harmonization_sheet_example %>%
  filter(study == 'Cohort A' & item == 'education') %>%
  select(code_type, code1) %>%
  kable()
code_type | code1 |
---|---|
recode category | 1 = No education/grade school; 2 = No education/grade school; 3 = High school; 4 = High school; 5 = College; 6 = Graduate/Professional |
The harmonization function will then code every value that is a 1 as “No education/grade school”, 2 as “No education/grade school”, etc.
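To make the pairing syntax concrete, here is a minimal base-R sketch of how a `code1` string could be turned into a lookup table and applied to raw values. It is only an illustration and not psHarmonize's internal implementation; `parse_recode()` and `education_raw` are hypothetical names, and the input values are made up.

```r
# Hypothetical helper: turn "1 = A; 2 = B" into a named lookup vector.
parse_recode <- function(code1) {
  pairs <- trimws(strsplit(code1, ";", fixed = TRUE)[[1]])
  pairs <- pairs[nzchar(pairs)]
  # Everything before the first "=" is the coded value, the rest is the label.
  keys <- trimws(sub("=.*$", "", pairs))
  values <- trimws(sub("^[^=]*=", "", pairs))
  setNames(values, keys)
}

code1 <- paste(
  "1 = No education/grade school; 2 = No education/grade school;",
  "3 = High school; 4 = High school; 5 = College"
)
lookup <- parse_recode(code1)

education_raw <- c(1, 3, 5, 2, NA)  # made-up source values
# Look each raw value up by its character form; missing values stay NA.
education_harmonized <- unname(lookup[as.character(education_raw)])
education_harmonized
```

Values that have no pairing in `code1` come back as `NA` in this sketch; how the package itself treats unmatched values is not shown here.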
If you want to convert a continuous variable (lbs to kg for example), you can use the `function` option. To do this, enter `function` into `code_type`. In `code1`, enter a function with `x` as your input variable.
For example, Cohort A has weight variables stored in pounds. If we want to convert these values to kilograms, we can enter the following into `code1`: `x / 2.205`.
harmonization_sheet_example %>%
  filter(study == 'Cohort A' & item == 'weight') %>%
  select(code_type, code1) %>%
  kable()
code_type | code1 |
---|---|
function | x / 2.205 |
function | x / 2.205 |
function | x / 2.205 |
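The idea behind the `function` option can be sketched in base R: the text in `code1` is parsed as an R expression and evaluated with `x` bound to the source values. This is an illustration under that assumption, not the package's actual code; `apply_code1()` and `weight_lbs` are hypothetical names, and the weights are made up.

```r
# Hypothetical helper: evaluate a code1 expression with x bound to the source values.
apply_code1 <- function(code1, x) {
  eval(parse(text = code1), envir = list(x = x), enclos = baseenv())
}

weight_lbs <- c(150, 200, NA)  # made-up weights in pounds
weight_kg <- apply_code1("x / 2.205", weight_lbs)
round(weight_kg, 1)
```

Because the text of `code1` is executed as code, the package's startup advice applies: only use harmonization sheets from authors you trust.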
If you have a function that requires multiple variables as input, you can list multiple `source_item` variables separated by a semicolon:

`var_1; var_2`
You can then refer to the variables in `code1` as `x1`, `x2`, etc. If you wanted to add the two variables together, you could write the following in `code1`:

`x1 + x2`
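As a rough sketch of how multiple inputs could be wired up, the base-R snippet below splits a `source_item` string on semicolons, binds the columns to `x1`, `x2`, and so on in the order listed, and evaluates the `code1` expression. The helper `apply_multi()` and the data frame `df` are hypothetical, and this is not necessarily how psHarmonize does it internally.

```r
# Hypothetical helper: bind each source item to x1, x2, ... and evaluate code1.
apply_multi <- function(code1, data, source_item) {
  vars <- trimws(strsplit(source_item, ";", fixed = TRUE)[[1]])
  inputs <- setNames(
    lapply(vars, function(v) data[[v]]),
    paste0("x", seq_along(vars))
  )
  eval(parse(text = code1), envir = inputs, enclos = baseenv())
}

df <- data.frame(var_1 = c(1, 2, 3), var_2 = c(10, 20, 30))  # made-up data
total <- apply_multi("x1 + x2", df, "var_1; var_2")
total
```

In this sketch the order of names in `source_item` decides which column becomes `x1` and which becomes `x2`.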
If you need to refer to a variable you previously made, you can use “previous_dataset” in `source_dataset`. Make sure to refer to the variables by their new names (the names in `item`), and use “ID” as the `id_var`.
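For illustration, the two rows below sketch what this might look like in a sheet. Only columns named in this document are shown, and a real harmonization sheet may need more. The dataset name `cohort_a`, the source column `weight_lbs`, the ID column `cohort_a_id`, and the derived item `weight_rounded` are all made up for this example.

```r
# Illustrative, partial sheet rows; every name and value here is hypothetical.
sheet_rows <- data.frame(
  study          = c("Cohort A", "Cohort A"),
  item           = c("weight", "weight_rounded"),
  source_dataset = c("cohort_a", "previous_dataset"),
  source_item    = c("weight_lbs", "weight"),  # row 2 uses the new name from row 1
  id_var         = c("cohort_a_id", "ID"),     # "ID" when reading from previous_dataset
  code_type      = c("function", "function"),
  code1          = c("x / 2.205", "round(x, 1)"),
  stringsAsFactors = FALSE
)
sheet_rows
```

The second row reads the harmonized `weight` created by the first row, so its `source_item` is the `item` name from that row and not the original source column.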