README

Risk-related information — like the prevalence of conditions, the sensitivity and specificity of diagnostic tests, or the effectiveness of interventions or treatments — can be expressed in terms of frequencies or probabilities. By providing a toolbox of corresponding metrics and representations, riskyr computes, translates, and visualizes risk-related information in a variety of ways. Adopting multiple complementary perspectives provides insights into the interplay between key parameters and renders teaching and training programs on risk literacy more transparent (see Neth et al., 2021, for details).

Motivation

The goals of riskyr are less of a computational and more of a representational nature: We express risk-related information in multiple formats, facilitate the translation between them, and provide a variety of attractive visualizations that emphasize different aspects of risk-related scenarios (see Neth et al., 2021, for theoretical details). Whereas people find it difficult to understand and compute information expressed in terms of probabilities, the same information is easier to understand and compute when expressed in terms of frequencies (e.g., Gigerenzer, 2002, 2014; Gigerenzer & Hoffrage, 1995). But rather than just expressing probabilities in terms of frequencies, riskyr allows translating between formats and illustrates the relationships between different representations in a variety of ways. Switching between and interacting with different representations fosters transparency and boosts human understanding of risk-related information.²

The basic assumptions and aspirations driving the current development of riskyr include:

Based on these assumptions and goals, riskyr provides a range of computational and representational tools. Importantly, the objects and functions in the riskyr toolbox are not isolated, but complement, explain, and support each other. All functions and visualizations can be used separately or explored interactively, providing immediate feedback on the effect of changes in parameter values. A variety of customization options allows users to explore and design representations of risk-related information that fit different tasks and meet their personal needs and goals.

Getting riskyr

Installation

Available resources

Quick start guide

Defining a scenario

Probabilities provided

The first challenge in solving such problems is in understanding the information that is being provided. The problem description provides three essential probabilities:

Understanding the questions asked

The second challenge here lies in understanding the questions that are being asked — and in realizing that their answers are not simply the decision’s sensitivity or specificity values. Instead, we are asked to provide two conditional probabilities:

Translating into frequencies

One of the best tricks in risk literacy education is to translate probabilistic information into frequencies.⁴ To do this, we imagine a representative sample of N = 1000 individuals. Rather than asking about the probabilities for Mr. and Ms. Smith, we could re-frame the questions as:
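The translation into frequencies can be sketched in base R, without any riskyr functions. The prevalence (4%), the specificity (95%), and the sample size (N = 1000) follow the values mentioned in this README; the sensitivity of 80% is an assumed value that is only used for illustration. The variable names are illustrative as well.

```r
# Values from the text: N = 1000, prevalence 4%, specificity 95%.
# ASSUMPTION: sens = 0.80 is a hypothetical value chosen for illustration.
N    <- 1000
prev <- 0.04
sens <- 0.80  # assumed
spec <- 0.95

# Step 1: Split the population by condition:
cond_true  <- round(N * prev)   # individuals with the condition
cond_false <- N - cond_true     # individuals without the condition

# Step 2: Split each subgroup by decision:
hi <- round(cond_true * sens)   # hits (true positives)
mi <- cond_true - hi            # misses (false negatives)
cr <- round(cond_false * spec)  # correct rejections (true negatives)
fa <- cond_false - cr           # false alarms (false positives)

# Step 3: Conditional probabilities are simple ratios of frequencies:
PPV <- hi / (hi + fa)  # P(condition | positive decision)
NPV <- cr / (cr + mi)  # P(no condition | negative decision)

c(hi = hi, mi = mi, fa = fa, cr = cr)
round(c(PPV = PPV, NPV = NPV), 3)
```

Once the four frequencies are known, each conditional probability reduces to counting the individuals in one subgroup and dividing by the size of the group that contains them.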

Using riskyr

Creating a scenario from probabilities

We define a new riskyr scenario (called hustosis) by using the riskyr() function and entering the information provided by our problem as its arguments:

By providing the argument N = 1000 we define the scenario for a target population of 1000 people. If we leave this parameter unspecified (or NA), the riskyr() function will automatically pick a suitable value of N.
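A minimal sketch of such a call might look as follows. It assumes that the riskyr package is installed. The values of prev and spec follow this README, whereas sens = .80 is an assumed value for illustration; the object name hustosis_auto is hypothetical.

```r
library(riskyr)  # assumes the package is installed

# prev and spec follow the values mentioned in this README.
# ASSUMPTION: sens = .80 is a hypothetical value for illustration.
hustosis <- riskyr(prev = .04, sens = .80, spec = .95, N = 1000)

class(hustosis)  # scenarios are objects of class "riskyr"
hustosis$N       # the population size provided as N

# Leaving N unspecified lets riskyr() pick a suitable population size:
hustosis_auto <- riskyr(prev = .04, sens = .80, spec = .95)
hustosis_auto$N
```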

Summary

To obtain a quick overview of key parameter values, we ask for the summary of hustosis:

The summary distinguishes between probabilities, frequencies, and accuracy information. In Probabilities we find the answer to both of our questions that take into account all the information provided above:

If you find these answers surprising, you are an ideal candidate for additional insights into the realm of risk literacy. A key feature of riskyr is that it allows analyzing and viewing a scenario from a variety of different perspectives. To get you started immediately, we only illustrate some introductory commands here and focus on different types of visualizations. (Call riskyr.guide() for various vignettes that provide more detailed information.)

Creating a scenario from frequencies

Rather than defining our hustosis scenario by providing 3 essential probabilities (prev, sens, and spec), we could define the same scenario by providing 4 essential frequencies (hi, mi, fa, and cr) as follows:

As we took the values of these frequencies from the summary of hustosis, the hustosis_2 scenario should contain exactly the same information as hustosis:
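The equivalence of both definitions can be checked with a short sketch. It assumes that the riskyr package is installed. The four frequencies used here are not quoted from a summary output: they follow from prev = .04 and spec = .95 (as in this README) combined with an assumed sens = .80 and N = 1000.

```r
library(riskyr)  # assumes the package is installed

# ASSUMPTION: sens = .80 is hypothetical; the 4 frequencies below
# are derived from prev = .04, sens = .80, spec = .95, and N = 1000.
hustosis   <- riskyr(prev = .04, sens = .80, spec = .95, N = 1000)
hustosis_2 <- riskyr(hi = 32, mi = 8, fa = 48, cr = 912)

# Compare the essential parameters of both scenarios:
all.equal(hustosis$N,    hustosis_2$N)
all.equal(hustosis$prev, hustosis_2$prev)
all.equal(hustosis$sens, hustosis_2$sens)
all.equal(hustosis$spec, hustosis_2$spec)
```

Comparing individual elements (rather than entire objects) keeps the check independent of any text labels that may differ between the two scenario definitions.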

Visualizations

Various visualizations of riskyr scenarios can be created by a range of plotting functions.
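As a rough orientation, the following sketch calls the generic plot() function with different type arguments, one for each kind of visualization covered in the subsections of this guide. It assumes that the riskyr package is installed and uses an assumed sensitivity of 80% (the values of prev and spec follow this README).

```r
library(riskyr)  # assumes the package is installed

# ASSUMPTION: sens = .80 is hypothetical; prev and spec follow this README.
hustosis <- riskyr(prev = .04, sens = .80, spec = .95, N = 1000)

plot(hustosis)                            # default: prism plot
plot(hustosis, type = "tree", by = "dc")  # tree diagram, split by decision
plot(hustosis, type = "fnet")             # frequency net
plot(hustosis, type = "icons")            # icon array
plot(hustosis, type = "area")             # area (or mosaic) plot
plot(hustosis, type = "tab")              # 2-by-2 table plot
plot(hustosis, type = "bar")              # bar plot
plot(hustosis, type = "curve")            # curves as a function of prevalence
plot(hustosis, type = "plane")            # plane as a function of sens and spec
```

Each type is handled by a dedicated plotting function (e.g., plot_icons or plot_area), whose help page documents further options.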

Prism plot

The default type of plot used in riskyr is a prism plot (or network diagram) that shows key frequencies of a scenario as nodes and key probabilities as edges linking the nodes:

Tree diagram

A tree diagram is the upper half of a prism plot, which can be obtained by plotting a scenario with 1 of 3 perspectives:

This particular tree splits the population of N = 1000 individuals into two subgroups by decision (by = "dc") and contains the answer to the second (frequency) version of our questions:

Of course, the frequencies underlying these ratios were already contained in the hustosis summary above. But the representation in the form of a tree diagram makes it easier to understand the decomposition of the population into subgroups and to see which frequencies are required to answer a particular question.

Frequency net

A new type of visualization combines elements from 2x2 tables with those of tree or double tree diagrams. The frequency net (Binder et al., 2020) is similar to a 2x2 table insofar as its perspectives (shown by arranging marginal frequencies in a vertical vs. horizontal fashion) do not suggest an order or dependency (in contrast to trees or mosaic plots). Additionally, the frequency net allows showing 3 kinds of (marginal, conditional, and joint) probabilities:

Icon array

An icon array shows the classification result for each of N = 1000 individuals in our population:

While this particular icon array is highly regular (as both the icons and classification types are ordered), riskyr provides many different versions of this type of graph. This allows viewing the probability of diagnostic outcomes as either frequency, area, or density (see ?plot_icons for details and examples).

Area plot

An area plot (or mosaic plot) offers a way of expressing classification results as the relationship between areas. Here, the entire population is represented as a square and the probability of its subgroups as the size of rectangles (see ?plot_area for details and examples):

Table plot

When not scaling the size of rectangles by their relative frequencies or probabilities, we can plot basic scenario information as a 2-by-2 confusion (or contingency) table (see ?plot_tab for details and examples):

Bar plot

A bar plot allows comparing relative frequencies as the heights of bars (see ?plot_bar for details and examples):

Curves

By adopting a functional perspective, we can ask how the values of some probabilities (e.g., the predictive values PPV and NPV) change as a function of another (e.g., the condition’s prevalence prev, see ?plot_curve for details and examples):
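The functional perspective can also be sketched in base R by applying Bayes' theorem to a range of prevalence values. This is not how plot_curve is implemented, but a plain illustration of the underlying relationship. The specificity of 95% follows this README; the sensitivity of 80% is an assumed value, and the function names ppv and npv are illustrative.

```r
# ASSUMPTION: sens = 0.80 is hypothetical; spec = 0.95 follows this README.
sens <- 0.80
spec <- 0.95

# Predictive values by Bayes' theorem, vectorized over prevalence:
ppv <- function(prev) (prev * sens) / (prev * sens + (1 - prev) * (1 - spec))
npv <- function(prev) ((1 - prev) * spec) / ((1 - prev) * spec + prev * (1 - sens))

prev <- seq(0.01, 0.99, by = 0.01)

# Draw both curves over the range of prevalence values (base graphics):
plot(prev, ppv(prev), type = "l", ylim = c(0, 1),
     xlab = "Prevalence (prev)", ylab = "Probability")
lines(prev, npv(prev), lty = 2)
abline(v = 0.04, col = "grey")  # prevalence of the hustosis scenario
legend("bottom", legend = c("PPV", "NPV"), lty = c(1, 2))

ppv(0.04)  # PPV at a prevalence of 4%
npv(0.04)  # NPV at a prevalence of 4%
```

The PPV increases with prevalence, whereas the NPV decreases, which is why the same test can be informative in one population and misleading in another.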

Planes

When parameter values systematically depend on two other parameters, we can plot this as a plane in a 3D cube. The following graph plots the PPV as a function of the sensitivity (sens) and specificity (spec) of our test for a given prevalence (prev, see ?plot_plane for details and examples):

The L-shape of this plane reveals a real problem with our current test: Given a prevalence of 4% for hustosis in our target population, the PPV remains very low for the majority of the possible range of sensitivity and specificity values. To achieve a high PPV, the key requirement for our test is an extremely high specificity. Although our current specificity value of 95% (spec = .95) may sound pretty good, it is still not high enough to yield a PPV beyond 40%.
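The shape of this plane can be explored numerically in base R. The following sketch evaluates the PPV on a grid of sensitivity and specificity values for a prevalence of 4%. It illustrates the argument above rather than the implementation of plot_plane; the function name ppv, the grid resolution, and the sensitivity of 80% in the last line are assumptions.

```r
prev <- 0.04  # prevalence from the text

# PPV by Bayes' theorem for given sensitivity and specificity values:
ppv <- function(sens, spec) {
  (prev * sens) / (prev * sens + (1 - prev) * (1 - spec))
}

# Evaluate the PPV on a grid of sens and spec values:
sens <- seq(0.01, 1, by = 0.01)
spec <- seq(0.01, 1, by = 0.01)
ppv_grid <- outer(sens, spec, ppv)

# Draw the plane as a 3D surface (base graphics):
persp(sens, spec, ppv_grid, theta = -45, phi = 20, zlab = "PPV")

mean(ppv_grid < 0.5)  # share of grid cells with a PPV below 50%

# Even a perfect sensitivity leaves the PPV below 50% at spec = .95:
ppv(sens = 1, spec = 0.95)

# ASSUMPTION: sens = .80 is hypothetical. The PPV only gets high
# when the specificity approaches 1:
ppv(sens = 0.80, spec = c(0.95, 0.99, 0.999))
```

With a low prevalence, the false alarms among the many unaffected individuals dominate the positive decisions, so improving the sensitivity helps far less than improving the specificity.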

Using existing scenarios

As defining your own scenarios can be cumbersome and the literature is full of risk-related problems (often referred to as “Bayesian reasoning”), riskyr provides a set of — currently 24 — pre-defined scenarios (stored in a list scenarios). Here, we provide an example that shows how you can select and explore them.

Selecting a scenario

Let us assume you want to learn more about the controversy surrounding screening procedures for prostate cancer (known as PSA screening). Scenario 10 in our collection of scenarios is from an article on this topic (Arkes & Gaissmaier, 2012). To select a particular scenario, simply assign it to an R object. For instance, we can assign Scenario 10 to s10:
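A minimal sketch of this selection, assuming that the riskyr package is installed and that the elements of the scenarios list are named n1, n2, and so on:

```r
library(riskyr)  # assumes the package is installed

s10 <- scenarios$n10  # assign pre-defined Scenario 10 to s10

class(s10)     # scenarios are objects of class "riskyr"
s10$scen_lbl   # text label of the scenario
summary(s10)   # overview of key parameter values
```

Because s10 is a copy of the list element, exploring or modifying it leaves the original scenarios list untouched.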

Scenario summary

Our selected scenario object s10 is a list with 30 elements, which describe it in both text and numeric variables. The following commands provide an overview of s10 in text form:

Generating some riskyr plots allows a quick visual exploration of the scenario. We only illustrate some selected plots and options here, and trust that you will play with and explore the rest for yourself.

Prism plots

A tree diagram is a prism plot that views the population from only one perspective, but provides a quick overview. In the following plot, the boxes are depicted as squares with area sizes that are scaled by relative frequencies (using the area = "sq" argument):

The prism plot (or network diagram) combines 2 tree diagrams to simultaneously provide two perspectives on a population (see Wassner et al., 2004). riskyr provides several variants of prism plots. To avoid redundancy with the previous tree diagram, the following version splits the population by accuracy and by decision (see the by = "acdc" argument). In addition, the frequencies are represented as horizontal rectangles (area = "hr") so that their relative widths reflect the number of people in the corresponding subgroup:
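Both variants can be sketched with the generic plot() function. The calls assume that the riskyr package is installed and only use the arguments named in the text (type, by, and area); the plots in the original guide may have used additional options.

```r
library(riskyr)  # assumes the package is installed

s10 <- scenarios$n10  # pre-defined Scenario 10

# Tree diagram with boxes as squares scaled by relative frequencies:
plot(s10, type = "tree", area = "sq")

# Prism plot by accuracy and by decision, with horizontal rectangles:
plot(s10, type = "prism", by = "acdc", area = "hr")
```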

Frequency net

Just like the 2x2 table, area plot, and prism plot, the frequency net allows selecting two out of three perspectives. Additionally, the shape and size of the frequency boxes can be adjusted by using the area = "sq" option. The following example shows a frequency net by condition and accuracy (by = "cdac") without the joint probabilities, with custom settings for labels, links, and colors:

Icon array

Area plot

Table plot

Curves

The following curves show the values of several conditional probabilities as a function of prevalence:

Adding the argument what = "all" also shows the proportion of positive decisions (ppod) and the decision’s overall accuracy (accu) as a function of the prevalence (prev). Would you have predicted their shape without seeing this graph?
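For readers who want to check their prediction, both measures can be written out in base R. The sketch below uses hypothetical values for sensitivity and specificity (not those of Scenario 10) and is not the code used by plot_curve.

```r
# ASSUMPTION: hypothetical values for illustration (not those of Scenario 10).
sens <- 0.80
spec <- 0.95

prev <- seq(0, 1, by = 0.05)

# Both measures are weighted averages with the weights prev and (1 - prev):
ppod <- prev * sens + (1 - prev) * (1 - spec)  # proportion of positive decisions
accu <- prev * sens + (1 - prev) * spec        # overall accuracy

plot(prev, accu, type = "l", ylim = c(0, 1),
     xlab = "Prevalence (prev)", ylab = "Probability")
lines(prev, ppod, lty = 2)
legend("bottomright", legend = c("accu", "ppod"), lty = c(1, 2))

# Constant increments (up to floating-point error) indicate straight lines:
range(diff(ppod))
range(diff(accu))
```

Unlike the curved predictive values, ppod and accu are linear in prev: ppod runs from 1 - spec (at prev = 0) to sens (at prev = 1), and accu runs from spec to sens.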

Planes

The following surface shows the negative predictive value (NPV) as a function of sensitivity and specificity (for a given prevalence):

Hopefully, this brief overview managed to whet your appetite for visual exploration. If so, call riskyr.guide() for viewing the package vignettes and obtaining additional information.

About

The riskyr package is open source software written in R and released under the GPL 2 | GPL 3 licenses.

The theoretical background of riskyr is illuminated further in the following article:

Resources

Contact

Reference

References

Type                           Version               URL
A. riskyr (R package)          Release version       https://CRAN.R-project.org/package=riskyr
                               Development version   https://github.com/hneth/riskyr/
B. riskyrApp (R Shiny code)    Online version        https://riskyr.org/
                               Development version   https://github.com/hneth/riskyrApp/
C. Online documentation        Release version       https://hneth.github.io/riskyr/
                               Development version   https://hneth.github.io/riskyr/dev/

¹ Simon, H.A. (1996). The Sciences of the Artificial (3rd ed.). The MIT Press, Cambridge, MA. (p. 132).
² To clarify our notion of “risk” in this context, we need to distinguish it from its everyday usage as anything implying a chance of danger or harm. In basic research on judgment and decision making and the more applied fields of risk perception and risk communication, the term risk typically refers to decisions or events for which the options and their consequences are known and probabilities for all possible outcomes can be provided. For our present purposes, the notion of risk-related information refers to any scenario in which some events of interest are determined by probabilities. While it is important that quantitative (estimates of) probabilities are provided, their origin, reliability, and validity are not questioned here. Thus, the probabilities provided can be based on clinical intuition, on recordings of extensive experience, or on statistical simulation models (e.g., repeatedly casting dice and counting the frequencies of outcomes). This notion of risk is typically contrasted with the much wider notion of uncertainty in which options or probabilities are unknown or cannot be quantified. (See Gigerenzer and Gaissmaier, 2011, or Neth and Gigerenzer, 2015, on this conceptual distinction and corresponding decision strategies.)
³ See Gigerenzer (2002, 2014), Gigerenzer and Hoffrage (1995), Gigerenzer et al. (2007), and Hoffrage et al. (2015) for scientific background information and similar problems. See Sedlmeier and Gigerenzer (2001) and Kurzenhäuser and Hoffrage (2002) for related training programs (with remarkable results), and Micallef et al. (2012) and Khan et al. (2015) for (rather sceptical and somewhat sobering) studies on the potential benefits of static representations for solving Bayesian problems.
⁴ See Gigerenzer and Hoffrage (1995) and Hoffrage et al. (2000, 2002) on the concept of natural frequencies.

riskyr 0.5.0

A toolbox for rendering risk literacy more transparent
