The zlog package offers functions to transform laboratory measurements into standardised \(z\)- or \(z(log)\)-values as suggested in Hoffmann et al. (2017). The transformation requires the lower and upper reference limits. If these are not known, they can be estimated from a given sample.
Hoffmann et al. (2017) define \(z\) as follows:
\[z = (x - (limits_1 + limits_2 )/2) * 3.92/(limits_2 - limits_1)\]
Consequently the \(z(log)\) is defined as:
\[zlog = (\log(x) - (\log(limits_1) + \log(limits_2))/2) * 3.92/(\log(limits_2) - \log(limits_1))\]
Where \(x\) is the measured laboratory value and \(limits_1\) and \(limits_2\) are the lower and upper reference limit, respectively.
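The two formulas can be written down directly in base R. The following is a minimal sketch, not the package implementation; the helper names z_manual and zlog_manual are hypothetical. It uses the rounded constant 3.92 exactly as in the formulas, so the last digits may differ slightly from results computed with a more precise constant. By construction the reference limits themselves map to -1.96 and +1.96.

```r
# Hypothetical helpers that mirror the formulas above; not the package code.
z_manual <- function(x, limits) {
  (x - (limits[1] + limits[2]) / 2) * 3.92 / (limits[2] - limits[1])
}

zlog_manual <- function(x, limits) {
  # zlog is the z formula applied on the log scale
  z_manual(log(x), log(limits))
}

limits <- c(35, 52)

# the reference limits map to the bounds of the central 95 % range
z_manual(limits, limits)
zlog_manual(limits, limits)
```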
Example data and reference limits are taken from Hoffmann et al. (2017), Table 2.
library("zlog")
albumin <- c(42, 34, 38, 43, 50, 42, 27, 31, 24)
z(albumin, limits = c(35, 52))
## [1] -0.345876 -2.190548 -1.268212 -0.115292 1.498796 -0.345876 -3.804636
## [8] -2.882300 -4.496388
zlog(albumin, limits = c(35, 52))
## [1] -0.15472223 -2.24698167 -1.14569028 0.07826303 1.57162335 -0.15472223
## [7] -4.52949253 -3.16160843 -5.69571148
izlog(zlog(albumin, limits = c(35, 52)), limits = c(35, 52))
## [1] 42 34 38 43 50 42 27 31 24
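The round trip shown above works because the \(z(log)\) formula can be solved for \(x\). A minimal base-R sketch of that inversion follows; the helper name izlog_manual is hypothetical and the code is an illustration of the algebra, not the package source.

```r
# Hypothetical helper; solves the zlog formula for x.
izlog_manual <- function(z, limits) {
  l <- log(limits)
  exp(z * (l[2] - l[1]) / 3.92 + (l[1] + l[2]) / 2)
}

limits <- c(35, 52)
x <- c(42, 34, 38)

# forward transformation, written out as in the formula
zl <- (log(x) - mean(log(limits))) * 3.92 / diff(log(limits))

# the inverse recovers x up to floating-point error
izlog_manual(zl, limits)
```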
Hoffmann et al. (2017) suggested a colour gradient to visualise laboratory measurements for the user.
It can be used to highlight the values in a table:
Category | albumin | zlog(albumin) | bilirubin | zlog(bilirubin) |
---|---|---|---|---|
blood donor | 42 | -0.15 | 11 | 0.88 |
blood donor | 34 | -2.25 | 9 | 0.55 |
blood donor | 38 | -1.15 | 2 | -1.96 |
hepatitis without cirrhosis | 43 | 0.08 | 5 | -0.43 |
hepatitis without cirrhosis | 50 | 1.57 | 22 | 2.04 |
hepatitis without cirrhosis | 42 | -0.15 | 42 | 3.12 |
hepatitis with cirrhosis | 27 | -4.53 | 37 | 2.90 |
hepatitis with cirrhosis | 31 | -3.16 | 200 | 5.72 |
hepatitis with cirrhosis | 24 | -5.70 | 20 | 1.88 |
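To illustrate the idea of colouring cells by their \(z(log)\)-value, the sketch below maps values to a simple blue-white-red ramp with base R. This is an assumption made for illustration only: it is not the gradient published by Hoffmann et al. (2017), and the helper zlog_colour and its max_abs parameter are hypothetical.

```r
# Illustrative only: a simple blue-white-red ramp, NOT the gradient
# published by Hoffmann et al. (2017).
zlog_colour <- function(z, max_abs = 5) {
  ramp <- colorRamp(c("blue", "white", "red"))
  # clamp to [-max_abs, max_abs] and rescale to [0, 1]
  pos <- (pmin(pmax(z, -max_abs), max_abs) + max_abs) / (2 * max_abs)
  rgb(ramp(pos), maxColorValue = 255)
}

# strongly decreased, unremarkable and strongly increased values
zlog_colour(c(-5.70, 0, 5.72))
```

The returned hex codes could then be passed to any table renderer that accepts per-cell background colours.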
The reference_limits function calculates the lower and upper reference limits as the 2.5 % and 97.5 % quantiles of the sample (or as quantiles for user-given probabilities):
reference_limits(albumin)
## lower upper
## 24.6 48.6
reference_limits(albumin, probs = c(0.05, 0.95))
## lower upper
## 25.2 47.2
exp(reference_limits(log(albumin)))
## lower upper
## 24.57207 48.51429
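The printed limits are consistent with plain empirical quantiles. As a hedged cross-check, assuming the default interpolation of base R's quantile function, the same numbers can be reproduced without the package:

```r
albumin <- c(42, 34, 38, 43, 50, 42, 27, 31, 24)

# base R quantiles with the default interpolation (type = 7)
limits <- quantile(albumin, probs = c(0.025, 0.975), names = FALSE)
limits

# the same quantiles on the log scale, transformed back
log_limits <- exp(quantile(log(albumin), probs = c(0.025, 0.975), names = FALSE))
log_limits
```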
Most laboratories use their own age- and sex-specific reference limits. The lookup_limits function can be used to find the correct reference limits.
# toy example
reference <- data.frame(
  param = c("albumin", rep("bilirubin", 4)),
  age = c(0, 1, 2, 3, 7), # days
  sex = "both",
  units = c("g/l", rep("µmol/l", 4)), # ignored
  lower = c(35, rep(NA, 4)), # no real reference values
  upper = c(52, 5, 8, 13, 18) # no real reference values
)
knitr::kable(reference)
param | age | sex | units | lower | upper |
---|---|---|---|---|---|
albumin | 0 | both | g/l | 35 | 52 |
bilirubin | 1 | both | µmol/l | NA | 5 |
bilirubin | 2 | both | µmol/l | NA | 8 |
bilirubin | 3 | both | µmol/l | NA | 13 |
bilirubin | 7 | both | µmol/l | NA | 18 |
# lookup albumin reference values for 18 year old woman
lookup_limits(
age = 18 * 365.25,
sex = "female",
table = reference[reference$param == "albumin",]
)
## lower upper
## albumin 35 52
# lookup albumin and bilirubin values for 18 year old woman
lookup_limits(
age = 18 * 365.25,
sex = "female",
table = reference
)
## lower upper
## albumin 35 52
## bilirubin NA 18
# lookup bilirubin reference values for infants
lookup_limits(
age = 0:8,
sex = rep(c("female", "male"), 5:4),
table = reference[reference$param == "bilirubin",]
)
## lower upper
## bilirubin NA NA
## bilirubin NA 5
## bilirubin NA 8
## bilirubin NA 13
## bilirubin NA 13
## bilirubin NA 13
## bilirubin NA 13
## bilirubin NA 18
## bilirubin NA 18
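The infant example suggests that each age in the table marks the start of an interval: a query age is matched to the row with the largest table age that does not exceed it, and ages below the first entry yield NA. The base-R sketch below reproduces only this age matching for the bilirubin upper limits; the helper lookup_upper is hypothetical, ignores sex, and is inferred from the output above rather than taken from the package.

```r
# Toy table as in the example above (upper limits for bilirubin, age in days)
tbl_age <- c(1, 2, 3, 7)
tbl_upper <- c(5, 8, 13, 18)

# Hypothetical helper: pick the row with the largest table age <= query age
lookup_upper <- function(age) {
  idx <- findInterval(age, tbl_age)
  idx[idx == 0] <- NA # younger than the first entry: no limit available
  tbl_upper[idx]
}

lookup_upper(0:8)
```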
Sometimes reference limits are not specified. That is often the case for biomarkers that are related to infection or cancer. Using zero as the lower boundary results in skewed distributions (Hoffmann et al. 2017, fig. 7). Haeckel et al. (2015) suggested setting the lower reference limit to 15 % of the upper one.
# use default fractions
set_missing_limits(reference)
## param age sex units lower upper
## 1 albumin 0 both g/l 35.00 52
## 2 bilirubin 1 both µmol/l 0.75 5
## 3 bilirubin 2 both µmol/l 1.20 8
## 4 bilirubin 3 both µmol/l 1.95 13
## 5 bilirubin 7 both µmol/l 2.70 18
# set fractions manually
set_missing_limits(reference, fraction = c(0.2, 5))
## param age sex units lower upper
## 1 albumin 0 both g/l 35.0 52
## 2 bilirubin 1 both µmol/l 1.0 5
## 3 bilirubin 2 both µmol/l 1.6 8
## 4 bilirubin 3 both µmol/l 2.6 13
## 5 bilirubin 7 both µmol/l 3.6 18
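The rule for missing lower limits amounts to a single multiplication. A minimal base-R sketch of it, using the toy limits from the reference table and the 15 % fraction of Haeckel et al. (2015), could look as follows. It covers only missing lower limits and is not the package code.

```r
lower <- c(35, NA, NA, NA, NA)
upper <- c(52, 5, 8, 13, 18)

# fill only the missing lower limits with 15 % of the upper limit
missing_lower <- is.na(lower)
lower[missing_lower] <- 0.15 * upper[missing_lower]
lower
```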
If laboratory measurements are missing, they can be imputed using “normal” values from the reference table. Using the "logmean" (default) or "mean" reference value will result in a \(zlog\)- or \(z\)-value of zero, respectively.
x <- data.frame(
  age = c(40, 50),
  sex = c("female", "male"),
  albumin = c(42, NA)
)
x
## age sex albumin
## 1 40 female 42
## 2 50 male NA
z_df(impute_df(x, reference, method = "mean"), reference)
## age sex albumin
## 1 40 female -0.345876
## 2 50 male 0.000000
zlog_df(impute_df(x, reference), reference)
## age sex albumin
## 1 40 female -0.1547222
## 2 50 male 0.0000000
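Why the imputed values end up at zero follows directly from the formulas: the arithmetic mean of the limits is the centre of the \(z\) scale, and the mean of the log limits (back-transformed, that is the geometric mean of the limits) is the centre of the \(z(log)\) scale. The sketch below checks this for the albumin limits in base R; it illustrates the arithmetic and makes no claim about how the package computes the imputation internally.

```r
limits <- c(35, 52)

mean_value <- mean(limits)              # 43.5
logmean_value <- exp(mean(log(limits))) # geometric mean of the limits

# plugging the imputed values into the formulas gives (numerically) zero
z_val <- (mean_value - mean(limits)) * 3.92 / diff(limits)
zlog_val <- (log(logmean_value) - mean(log(limits))) * 3.92 / diff(log(limits))
c(z = z_val, zlog = zlog_val)
```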
PBC Example
For demonstration we choose the pbc dataset from the survival package and exclude all non-laboratory measurements except age and sex:
library("survival")
data("pbc")
labs <- c(
  "bili", "chol", "albumin", "copper", "alk.phos", "ast", "trig",
  "platelet", "protime"
)
pbc <- pbc[, c("age", "sex", labs)]
knitr::kable(head(pbc), digits = 1)
age | sex | bili | chol | albumin | copper | alk.phos | ast | trig | platelet | protime |
---|---|---|---|---|---|---|---|---|---|---|
58.8 | f | 14.5 | 261 | 2.6 | 156 | 1718.0 | 137.9 | 172 | 190 | 12.2 |
56.4 | f | 1.1 | 302 | 4.1 | 54 | 7394.8 | 113.5 | 88 | 221 | 10.6 |
70.1 | m | 1.4 | 176 | 3.5 | 210 | 516.0 | 96.1 | 55 | 151 | 12.0 |
54.7 | f | 1.8 | 244 | 2.5 | 64 | 6121.8 | 60.6 | 92 | 183 | 10.3 |
38.1 | f | 3.4 | 279 | 3.5 | 143 | 671.0 | 113.2 | 72 | 136 | 10.9 |
66.3 | f | 0.8 | 248 | 4.0 | 50 | 944.0 | 93.0 | 63 | NA | 11.0 |
Next we estimate all reference limits from the data. We want to use sex-specific values for copper and aspartate aminotransferase ("ast").
## replicate copper and ast 2 times, use the others just once
param <- rep(labs, ifelse(labs %in% c("copper", "ast"), 2, 1))
sex <- rep_len("both", length(param))

## replace sex == both with female and male for copper and ast
sex[param %in% c("copper", "ast")] <- c("f", "m")

## create data.frame, we ignore age-specific values for now and set age to zero
## (means applicable for all ages)
reference <- data.frame(
  param = param, age = 0, sex = sex, lower = NA, upper = NA
)

## estimate reference limits from sample data
for (i in seq_len(nrow(reference))) {
  reference[i, c("lower", "upper")] <-
    if (reference$sex[i] == "both")
      reference_limits(pbc[reference$param[i]])
    else
      reference_limits(pbc[pbc$sex == reference$sex[i], reference$param[i]])
}
knitr::kable(reference)
param | age | sex | lower | upper |
---|---|---|---|---|
bili | 0 | both | 0.4000 | 17.3150 |
chol | 0 | both | 174.0750 | 1086.2250 |
albumin | 0 | both | 2.5400 | 4.2200 |
copper | 0 | f | 12.8250 | 269.2750 |
copper | 0 | m | 23.5000 | 388.0000 |
alk.phos | 0 | both | 504.7500 | 9261.7400 |
ast | 0 | f | 49.6000 | 249.7438 |
ast | 0 | m | 55.4775 | 208.3350 |
trig | 0 | both | 52.0250 | 279.8000 |
platelet | 0 | both | 95.0000 | 470.4000 |
protime | 0 | both | 9.5000 | 13.1625 |
The pbc dataset contains a few missing values. We impute them with the corresponding "logmean" reference value, the default of impute_df (in this example the reference limits are estimated from the sample itself, whereas in real life they would come from, e.g., a healthy subpopulation).
pbc[c(6, 14),]
## age sex bili chol albumin copper alk.phos ast trig platelet protime
## 6 66.25873 f 0.8 248 3.98 50 944 93 63 NA 11
## 14 56.22177 m 0.8 NA 2.27 43 728 71 NA 156 11
pbc <- impute_df(pbc, reference)
pbc[c(6, 14),]
## age sex bili chol albumin copper alk.phos ast trig platelet
## 6 66.25873 f 0.8 248.0000 3.98 50 944 93 63.0000 211.3954
## 14 56.22177 m 0.8 434.8386 2.27 43 728 71 120.6507 156.0000
## protime
## 6 11
## 14 11
Subsequently we can convert the laboratory measurements into \(z(log)\)-values using the zlog_df function, which applies zlog to every numeric column in a data.frame (except the "age" column):
pbc <- zlog_df(pbc, reference)
age | sex | bili | chol | albumin | copper | alk.phos | ast | trig | platelet | protime |
---|---|---|---|---|---|---|---|---|---|---|
58.8 | f | 1.8 | -1.1 | -1.8 | 1.3 | -0.3 | 0.5 | 0.8 | -0.3 | 1.0 |
56.4 | f | -0.9 | -0.8 | 1.8 | -0.1 | 1.7 | 0.0 | -0.7 | 0.1 | -0.6 |
70.1 | m | -0.7 | -1.9 | 0.5 | 1.1 | -1.9 | -0.3 | -1.8 | -0.8 | 0.8 |
54.7 | f | -0.4 | -1.2 | -2.0 | 0.1 | 1.4 | -1.5 | -0.6 | -0.4 | -1.0 |
38.1 | f | 0.3 | -1.0 | 0.6 | 1.1 | -1.6 | 0.0 | -1.2 | -1.1 | -0.3 |
66.3 | f | -1.2 | -1.2 | 1.5 | -0.2 | -1.1 | -0.4 | -1.5 | 0.0 | -0.2 |
55.5 | f | -1.0 | -0.6 | 1.7 | -0.2 | -1.3 | -1.5 | 1.3 | -0.1 | -1.7 |
53.1 | f | -2.3 | -0.9 | 1.5 | -0.2 | 1.0 | -3.3 | 1.0 | 1.4 | -0.2 |
42.5 | f | 0.2 | 0.5 | -0.5 | 0.4 | 0.1 | 0.6 | -0.7 | 0.4 | -0.2 |
70.6 | f | 1.6 | -1.7 | -1.4 | 1.1 | -1.2 | 0.7 | 0.4 | 0.9 | 0.3 |
53.7 | f | -0.7 | -1.1 | 1.8 | -0.3 | -0.9 | -0.8 | -1.0 | 0.5 | 0.8 |
59.1 | f | 0.3 | -1.3 | 0.6 | 0.6 | -1.7 | -0.7 | -0.6 | -2.7 | 2.4 |
45.7 | f | -1.4 | -0.9 | 1.3 | -0.5 | -0.8 | -0.6 | 0.2 | 0.4 | -0.6 |
56.2 | m | -1.2 | 0.0 | -2.8 | -1.1 | -1.5 | -1.2 | 0.0 | -0.7 | -0.2 |
64.6 | f | -1.2 | -1.4 | 1.3 | 1.4 | 1.9 | 0.3 | -0.5 | 0.8 | -0.2 |
40.4 | f | -1.4 | -1.6 | 0.9 | -1.0 | -1.5 | -1.0 | -1.7 | -0.2 | -0.4 |
52.2 | f | 0.0 | -1.0 | -0.3 | 1.3 | -0.5 | 0.1 | 0.1 | 0.1 | -0.8 |
53.9 | f | 1.5 | -1.9 | -1.2 | 3.0 | -1.1 | 2.2 | 1.2 | 0.7 | 1.2 |
49.6 | f | -1.4 | -1.3 | 0.6 | -0.5 | -0.2 | -0.4 | 0.0 | -0.0 | -0.2 |
60.0 | f | 0.7 | -0.3 | 0.5 | 1.1 | -0.2 | 0.2 | 0.3 | 1.0 | 1.8 |
64.2 | m | -1.5 | -1.2 | 1.2 | -1.2 | -1.3 | -1.5 | -0.9 | 1.1 | 0.2 |
56.3 | f | 0.3 | -1.0 | 0.8 | 2.7 | -0.6 | 0.2 | -1.8 | -0.5 | 0.4 |
56.0 | f | 2.0 | -0.2 | -0.8 | 2.9 | 1.4 | 1.7 | 1.1 | 0.0 | 0.5 |
44.5 | m | -0.2 | 0.1 | 1.5 | 0.4 | 1.3 | 2.1 | 1.5 | -2.7 | -1.5 |
45.1 | f | -1.4 | -0.8 | 1.7 | -0.5 | -1.6 | -0.1 | -1.4 | 1.0 | 0.1 |
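Because the formulas map the lower and upper reference limits to about -1.96 and +1.96, values outside that range lie outside the reference interval. As a small sketch, the first five albumin values from the table above (rounded as printed) can be flagged in base R:

```r
# rounded zlog values copied from the albumin column of the table above
zl <- c(-1.8, 1.8, 0.5, -2.0, 0.6)

# TRUE marks values outside the reference interval
outside <- abs(zl) > 1.96
outside
```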
sessionInfo()
## R version 4.2.2 (2022-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux 11 (bullseye)
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=de_DE.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8
## [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] survival_3.2-7 zlog_1.0.2
##
## loaded via a namespace (and not attached):
## [1] bslib_0.4.2 compiler_4.2.2 jquerylib_0.1.4 highr_0.10
## [5] tools_4.2.2 digest_0.6.31 lattice_0.20-41 jsonlite_1.8.4
## [9] evaluate_0.19 lifecycle_1.0.3 viridisLite_0.4.1 rlang_1.0.6
## [13] Matrix_1.5-3 cli_3.5.0 rstudioapi_0.14 yaml_2.3.6
## [17] xfun_0.36 fastmap_1.1.0 kableExtra_1.3.4 httr_1.4.4
## [21] stringr_1.5.0 knitr_1.41 xml2_1.3.3 vctrs_0.5.1
## [25] sass_0.4.4 systemfonts_1.0.4 grid_4.2.2 webshot_0.5.4
## [29] svglite_2.1.0 glue_1.6.2 R6_2.5.1 rmarkdown_2.19
## [33] magrittr_2.0.3 scales_1.2.1 htmltools_0.5.4 splines_4.2.2
## [37] rvest_1.0.3 colorspace_2.0-3 stringi_1.7.8 munsell_0.5.0
## [41] cachem_1.0.6
Haeckel, Rainer, Werner Wosniok, Eberhard Gurr, and Burkhard Peil. 2015. “Permissible Limits for Uncertainty of Measurement in Laboratory Medicine.” Clinical Chemistry and Laboratory Medicine 53 (8): 1161–71. https://doi.org/10.1515/cclm-2014-0874.
Hoffmann, Georg, Frank Klawonn, Ralf Lichtinghagen, and Matthias Orth. 2017. “The Zlog-Value as Basis for the Standardization of Laboratory Results.” LaboratoriumsMedizin 41 (1): 23–32. https://doi.org/10.1515/labmed-2016-0087.