| Type: | Package | 
| Title: | A Comprehensive Collection of Crime-Related Datasets | 
| Version: | 0.1.0 | 
| Maintainer: | Renzo Caceres Rossi <arenzocaceresrossi@gmail.com> | 
| Description: | A comprehensive collection of datasets exclusively focused on crimes, criminal activities, and related topics. This package serves as a valuable resource for researchers, analysts, and students interested in crime analysis, criminology, social and economic studies related to criminal behavior. Datasets span global and local contexts, with a mix of tabular and spatial data. | 
| License: | GPL-3 | 
| URL: | https://github.com/lightbluetitan/crimedatasets | 
| BugReports: | https://github.com/lightbluetitan/crimedatasets/issues | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| Suggests: | ggplot2, dplyr, knitr, rmarkdown, sf, testthat (≥ 3.0.0) | 
| Config/testthat/edition: | 3 | 
| RoxygenNote: | 7.3.2 | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | no | 
| Packaged: | 2024-11-29 19:56:52 UTC; renzocrossi | 
| Author: | Renzo Caceres Rossi [aut, cre] | 
| Depends: | R (≥ 3.5.0) | 
| Repository: | CRAN | 
| Date/Publication: | 2024-12-02 12:50:37 UTC | 
crimedatasets: A Comprehensive Collection of Crime-Related Datasets
Description
A comprehensive collection of datasets exclusively focused on crimes, criminal activities, and related topics. This package serves as a valuable resource for researchers, analysts, and students interested in crime analysis, criminology, and socio-economic studies related to criminal behavior.
Author(s)
Maintainer: Renzo Cáceres Rossi arenzocaceresrossi@gmail.com
See Also
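Useful links:
- https://github.com/lightbluetitan/crimedatasets
- Report bugs at https://github.com/lightbluetitan/crimedatasets/issues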
Crime Records of Abilene, Texas, USA
Description
This dataset contains information on reported crimes in Abilene, Texas, including the type of crime, year of the incident, and the number of reported cases. It provides a snapshot of crime patterns in the city for the years 1992 and 1999.
Usage
data(Abilene_tbl_df)
Format
A tibble with 16 observations and 3 variables:
- crimetype
 Type of crime (character).
- year
 Year of the reported crime (factor).
- number
 Number of reported crimes (integer).
Details
The dataset name has been changed to 'Abilene_tbl_df' to avoid confusion with other data sets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble in R. The original content has not been modified in any way.
Source
Uniform Crime Reports, U.S. Department of Justice.
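The suffix convention described in the Details can also be read off programmatically. The following is a minimal sketch, assuming the four suffixes that appear in this manual (tbl_df, df, list, table); the helper name `dataset_suffix` is hypothetical and not part of the package.

```r
# Hypothetical helper (not exported by crimedatasets): extract the
# type suffix from a dataset name that follows the naming convention.
dataset_suffix <- function(name) {
  sub("^.*?_(tbl_df|df|list|table)$", "\\1", name, perl = TRUE)
}

# Names taken from this manual.
dataset_names <- c("Abilene_tbl_df", "Boston_df", "USATerror_data_df",
                   "Ndrangheta_list", "TerrorismGlobal_table")

suffixes <- dataset_suffix(dataset_names)
suffixes
```

A name that does not end in one of the four suffixes is returned unchanged, so the result should be checked before it is used to branch on object type.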
Convictions Reported by U.S. Attorney's Offices
Description
This dataset contains the number of convictions reported by U.S. attorneys' offices and the number of office staff, both expressed per 1 million population, along with the name of each district.
Usage
data(Attorney_tbl_df)
Format
A tibble with 88 observations and 3 variables:
- staff
 Number of U.S. attorneys' office staff per 1 million population (integer).
- convict
 Number of convictions reported by U.S. attorneys' offices per 1 million population (integer).
- district
 Name of the district (character). Possible values include major U.S. cities such as Albuquerque, Atlanta, Boston, Chicago, Houston, Miami, San Francisco, and others.
Details
The dataset name has been changed to 'Attorney_tbl_df' to avoid confusion with other data sets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble in R. The original content has not been modified in any way.
Source
Data from U.S. Attorney's Office Reports.
Boston Housing Data
Description
This dataset contains housing values and factors that may influence them for 506 census tracts in the Boston area. The variables include crime rates, proximity to highways, and measures of local environmental quality.
Usage
data(Boston_df)
Format
A data frame with 506 observations and 14 variables:
- crim
 Per capita crime rate by town (numeric).
- zn
 Proportion of residential land zoned for lots over 25,000 sq.ft. (numeric).
- indus
 Proportion of non-retail business acres per town (numeric).
- chas
 Charles River dummy variable (= 1 if tract bounds river; 0 otherwise) (integer).
- nox
 Nitrogen oxides concentration (parts per 10 million) (numeric).
- rm
 Average number of rooms per dwelling (numeric).
- age
 Proportion of owner-occupied units built prior to 1940 (numeric).
- dis
 Weighted mean of distances to five Boston employment centres (numeric).
- rad
 Index of accessibility to radial highways (integer).
- tax
 Full-value property-tax rate per $10,000 (numeric).
- ptratio
 Pupil-teacher ratio by town (numeric).
- black
 1000(Bk - 0.63)^2 where Bk is the proportion of Black population by town (numeric).
- lstat
 Lower status of the population (percent) (numeric).
- medv
 Median value of owner-occupied homes in $1000s (numeric).
Details
The dataset name has been changed to 'Boston_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a traditional data frame in R. The original content has not been modified in any way.
Source
This dataset was obtained from the Boston dataset, which is part of the MASS package, with slight modifications.
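The `black` variable is a quadratic transform of the proportion Bk, so it cannot always be mapped back to a single proportion. The sketch below only illustrates the documented formula 1000(Bk - 0.63)^2; the function names are hypothetical and are not part of the package.

```r
# Forward transform, as documented for the `black` column.
black_index <- function(bk) 1000 * (bk - 0.63)^2

# Inverse transform: returns every proportion in [0, 1] that would
# produce the given `black` value (one or two candidates).
black_to_bk <- function(black) {
  root <- sqrt(black / 1000)
  candidates <- c(lower = 0.63 - root, upper = 0.63 + root)
  candidates[candidates >= 0 & candidates <= 1]
}

black_index(0.63)   # smallest possible value of the scale
black_index(0)      # largest value reachable with a proportion in [0, 1]
black_to_bk(100)    # two candidate proportions
black_to_bk(300)    # only the lower root lies in [0, 1]
```

Because values up to 136.9 (the transform of Bk = 1) have two valid roots, the original proportion cannot be recovered from `black` alone in that range.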
Cybersecurity Breaches Reported to US Health Department
Description
This dataset contains records of cybersecurity breaches reported to the US Department of Health and Human Services (HHS). Since October 2009, organizations in the United States that store data on human health are required to report incidents compromising the confidentiality of 500 or more patients or human subjects (45 C.F.R. 164.408). These reports are publicly available and provide detailed information about the affected entities, breach types, and impacted individuals.
Usage
data(CyberSecurityBreaches_df)
Format
A data frame with 1,151 observations and 9 variables:
- Name.of.Covered.Entity
 Name of the covered entity involved in the breach (character).
- State
 US state where the entity is located (factor with 52 levels).
- Covered.Entity.Type
 Type of the covered entity (factor with 4 levels).
- Individuals.Affected
 Number of individuals affected by the breach (integer).
- Breach.Submission.Date
 Date the breach was reported (Date).
- Type.of.Breach
 Type of breach (factor with 29 levels).
- Location.of.Breached.Information
 Location of the breached information (factor with 47 levels).
- Business.Associate.Present
 Indicates whether a business associate was involved (logical).
- Web.Description
 Description of the breach provided online (character).
Details
The dataset name has been changed to 'CyberSecurityBreaches_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a traditional data frame in R. The original content has not been modified in any way.
Source
Cybersecurity breach data downloaded from the Office for Civil Rights of the US Department of Health and Human Services (HHS) on 2015-02-26.
Death Penalty and Race in Georgia
Description
This dataset contains data collected by lawyers on convicted Black murderers in the state of Georgia. The goal was to examine whether convicted Black murderers whose victim was white were more likely to receive the death penalty compared to those whose victim was Black, accounting for the level of aggravation of the crime.
Usage
data(DeathPenaltyRace_df)
Format
A data frame with 12 observations and 4 variables:
- Aggravation
 Level of aggravation of the murder (integer). Categories range from 1 (least serious) to 6 (most serious).
- Victim
 Race of the victim (factor with 2 levels: "White" and "Black").
- Death
 Number of cases where the death penalty was given (integer).
- NoDeath
 Number of cases where the death penalty was not given (integer).
Details
The dataset name has been changed to 'DeathPenaltyRace_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a traditional data frame in R. The original content has not been modified in any way.
Source
Data collected on death penalty cases in Georgia.
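Since each row holds two counts, the quantity of interest is usually the share of cases that ended in a death sentence. A minimal sketch, assuming only the documented column names `Death` and `NoDeath`; the helper `death_share` is hypothetical and the counts in `toy` are placeholders, not the package's data.

```r
# Hypothetical helper: add the share of death sentences to each row.
death_share <- function(df) {
  transform(df, share = Death / (Death + NoDeath))
}

# Placeholder counts with the documented layout (NOT the real data).
toy <- data.frame(
  Aggravation = c(1L, 1L),
  Victim = factor(c("White", "Black"), levels = c("White", "Black")),
  Death = c(2L, 1L),
  NoDeath = c(8L, 9L)
)

death_share(toy)
```

With the package installed, the same helper could be applied as `death_share(DeathPenaltyRace_df)` after `data(DeathPenaltyRace_df)`.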
US Casualties: Drunk Driving, Suicide, Terrorism
Description
This dataset contains data on fatalities and casualties in the U.S. for drunk-driving incidents, suicides, and acts of terrorism. The dataset spans the years 1970 to 2018 and provides insights into the impact of these causes of death and injury over time.
Usage
data(DrunkDST_tbl_df)
Format
A tibble with 49 observations and 5 variables:
- year
 Year of the observation (numeric).
- nkill
 Number of people killed in acts of terrorism (numeric).
- terrtotal
 Total number of casualties (injuries and fatalities) caused by terrorism (numeric).
- suicides
 Number of suicides (numeric).
- ddfat
 Number of fatalities caused by drunk-driving incidents (numeric).
Details
The dataset name has been changed to 'DrunkDST_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is stored as a tibble. The original content has not been modified in any way.
Source
Data on casualties and fatalities from drunk-driving, suicide, and terrorism in the U.S., 1970–2018.
FBI Criminal Background Check System
Description
This dataset contains data from the FBI's National Instant Criminal Background Check System (NICS) on firearm background checks across U.S. states. It includes monthly counts of checks by category (permits, handguns, long guns, pawn and redemption transactions, rentals, private sales, and others), together with state population and gun sales figures.
Usage
data(FBICriminal_tbl_df)
Format
A tibble with 11,648 observations and 35 variables:
- state
 State where the data was recorded (character).
- year
 Year of the observation (integer).
- month
 Month of the observation (character).
- month.num
 Numeric representation of the month (integer).
- population
 Population of the state (integer).
- guns_per_thousand
 Number of guns per 1,000 people (numeric).
- guns_sold
 Total guns sold (integer).
- multiplier
 Adjustments for sales data (numeric).
- instore_purchases
 Number of in-store purchases (integer).
- permit
 Number of gun permits issued (integer).
- permit_recheck
 Flag for permit recheck status (character).
- handgun
 Number of handguns sold (integer).
- longgun
 Number of long guns sold (integer).
- other
 Number of other types of firearms sold (integer).
- multiple
 Number of multiple gun purchases (integer).
- multiple_corrected
 Corrected count of multiple purchases (integer).
- admin
 Administrative checks conducted (integer).
- prepawn_handgun
 Number of prepawned handguns (integer).
- prepawn_longgun
 Number of prepawned long guns (integer).
- prepawn_other
 Number of prepawned other firearms (integer).
- redemption_handgun
 Number of redeemed handguns (integer).
- redemption_longgun
 Number of redeemed long guns (integer).
- redemption_other
 Number of redeemed other firearms (integer).
- returned_handgun
 Number of returned handguns (integer).
- returned_longgun
 Number of returned long guns (integer).
- returned_other
 Number of returned other firearms (integer).
- rental_handgun
 Number of handguns rented (integer).
- rental_longgun
 Number of long guns rented (integer).
- private_handgun
 Number of privately sold handguns (integer).
- private_longgun
 Number of privately sold long guns (integer).
- private_other
 Number of privately sold other firearms (integer).
- privatereturn_handgun
 Number of privately returned handguns (integer).
- privatereturn_longgun
 Number of privately returned long guns (integer).
- privatereturn_other
 Number of privately returned other firearms (integer).
- totals
 Total checks conducted (integer).
Details
The dataset name has been changed to 'FBICriminal_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is stored as a tibble, which is a modern form of a data frame in R. The original content has not been modified in any way.
Source
FBI's National Instant Criminal Background Check System (NICS).
Drunk Driving Laws and Traffic Deaths
Description
This dataset contains data on traffic fatalities and laws related to drunk driving across U.S. states. It includes information on beer taxes, minimum legal drinking age (MLDA), and other socioeconomic factors observed between 1982 and 1988.
Usage
data(Fatality_df)
Format
A data frame with 336 observations and 10 variables:
- state
 State identifier (integer).
- year
 Year of the observation (integer).
- mrall
 Motor vehicle fatality rate per 100,000 population (numeric).
- beertax
 Beer tax in dollars per gallon (numeric).
- mlda
 Minimum legal drinking age (MLDA) (numeric).
- jaild
 Indicator for a mandatory jail sentence for drunk driving (factor: Yes/No).
- comserd
 Indicator for mandatory community service for drunk driving (factor: Yes/No).
- vmiles
 Vehicle miles traveled in billions (numeric).
- unrate
 Unemployment rate (numeric).
- perinc
 Per capita income in dollars (numeric).
Details
The dataset name has been changed to 'Fatality_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is stored as a traditional data frame in R. The original content has not been modified in any way.
Source
Panel data on drunk driving laws and traffic deaths in the U.S. for 48 states, 1982–1988.
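The 336 observations correspond to 48 states observed in each of the 7 years from 1982 to 1988. The sketch below shows one way to check that a panel is balanced; the helper `is_balanced_panel` is hypothetical, and `toy` is a placeholder grid with the documented `state` and `year` columns rather than the package's data.

```r
# Hypothetical helper: TRUE when every unit/time cell occurs exactly once.
is_balanced_panel <- function(df, unit = "state", time = "year") {
  cells <- table(df[[unit]], df[[time]])
  all(cells == 1)
}

# Placeholder panel: 48 state identifiers crossed with 7 years.
toy <- expand.grid(state = 1:48, year = 1982:1988)

nrow(toy)                       # 48 * 7 rows
is_balanced_panel(toy)          # every state appears once per year
is_balanced_panel(toy[-1, ])    # dropping a row breaks the balance
```

With the package installed, `is_balanced_panel(Fatality_df)` would run the same check on the real data.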
Gallup Marijuana Possession Poll (1980)
Description
This dataset contains the results of a Gallup poll conducted in 1980 regarding public opinion on whether possession of marijuana should be considered a criminal offense. The dataset includes demographic information and the corresponding opinions of the respondents.
Usage
data(Gallup_tbl_df)
Format
A tibble with 1,200 observations and 2 variables:
- demographics
 Demographic category of the respondent (factor with 12 levels).
- opinion
 Respondent's opinion on marijuana possession as a criminal offense (factor with 3 levels).
Details
The dataset name has been changed to 'Gallup_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is stored as a tibble in R. The original content has not been modified in any way.
Source
Results of a Gallup poll conducted in 1980.
Canadian Crime Rates Time Series (1931–1968)
Description
This dataset, known as the Hartnagel dataset, contains an annual time series of crime rates and related socio-economic data in Canada from 1931 to 1968. It includes variables such as total fertility rates, labor force participation rates, and crime statistics disaggregated by gender. Note that some data points are missing.
Usage
data(Hartnagel_df)
Format
A data frame with 38 observations and 8 variables:
- year
 Year of observation (integer).
- tfr
 Total fertility rate per 1,000 women (integer).
- partic
 Labor force participation rate per 1,000 people (integer).
- degrees
 Number of university degrees conferred per 1,000 people (numeric).
- fconvict
 Convictions of females per 100,000 people (numeric).
- ftheft
 Thefts by females per 100,000 people (numeric).
- mconvict
 Convictions of males per 100,000 people (numeric).
- mtheft
 Thefts by males per 100,000 people (numeric).
Details
The dataset name has been changed to 'Hartnagel_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a traditional data frame in R. The original content has not been modified in any way.
The data is an annual time-series from 1931 to 1968. Some observations contain missing data.
Source
Hartnagel dataset, providing insights into Canadian crime rates and socio-economic factors.
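Because some observations contain missing data, it can help to list the affected years before fitting a time-series model. A minimal sketch, assuming only the documented `year` column; the helper `years_with_missing` is hypothetical, and the values and NA positions in `toy` are placeholders, not the package's data.

```r
# Hypothetical helper: years in which at least one variable is NA.
years_with_missing <- function(df, year_col = "year") {
  df[[year_col]][!complete.cases(df)]
}

# Placeholder series covering the documented period (NOT the real data).
toy <- data.frame(year = 1931:1968, ftheft = 1, mtheft = 1)
toy$ftheft[c(1, 5)] <- NA

nrow(toy)                 # one row per year, 1931 through 1968
years_with_missing(toy)   # the years whose rows contain an NA
```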
Type of Drug Offense by Race
Description
This dataset provides information on the type of drug offenses categorized by race. It contains records that can be used to analyze racial patterns in drug-related offenses. The data is sourced from a comparative study of federal and state prison inmates.
Usage
data(Inmate_tbl_df)
Format
A tibble with 28,047 observations and 2 variables:
- race
 Race of the individual (factor with 3 levels).
- drug
 Type of drug offense (factor with 4 levels).
Details
The dataset name has been changed to 'Inmate_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is stored as a tibble. The original content has not been modified in any way.
This dataset provides insights into racial disparities and trends in drug offenses.
Source
C. Wolf Harlow (1994), *Comparing Federal and State Prison Inmates*, NCJ-145864, U.S. Department of Justice, Bureau of Justice Statistics.
Interim Dane Data with New Criminal Activity (NCA)
Description
This dataset contains pre-treatment covariates, a binary treatment (Z), an ordinal decision (D), and an outcome variable (Y). It is used to study new criminal activity (NCA).
Usage
data(NCAdata_tbl_df)
Format
A tibble with 1,891 observations and 19 variables:
- Sex
 Numeric variable representing the individual's sex.
- White
 Numeric variable indicating whether the individual is White.
- SexWhite
 Numeric interaction term between Sex and White.
- Age
 Numeric variable indicating the individual's age.
- PendingChargeAtTimeOfOffense
 Numeric variable indicating if there was a pending charge at the time of offense.
- NCorNonViolentMisdemeanorCharge
 Numeric variable indicating a non-violent misdemeanor charge.
- ViolentMisdemeanorCharge
 Numeric variable indicating a violent misdemeanor charge.
- ViolentFelonyCharge
 Numeric variable indicating a violent felony charge.
- NonViolentFelonyCharge
 Numeric variable indicating a non-violent felony charge.
- PriorMisdemeanorConviction
 Numeric variable indicating prior misdemeanor convictions.
- PriorFelonyConviction
 Numeric variable indicating prior felony convictions.
- PriorViolentConviction
 Numeric variable indicating prior violent convictions.
- PriorSentenceToIncarceration
 Numeric variable indicating prior sentences to incarceration.
- PriorFTAInPastTwoYears
 Numeric variable indicating prior failures to appear (FTA) in the past two years.
- PriorFTAOlderThanTwoYears
 Numeric variable indicating prior failures to appear (FTA) older than two years.
- Staff_ReleaseRecommendation
 Numeric variable indicating the staff release recommendation.
- Z
 Binary treatment variable.
- D
 Ordinal decision variable.
- Y
 Outcome variable measuring new criminal activity (NCA).
Details
The dataset name has been changed to 'NCAdata_tbl_df' to avoid confusion with other data sets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble in R. The original content has not been modified in any way.
Source
Interim Dane data with new criminal activity (NCA) as an outcome.
Interim Dane Data with New Violent Criminal Activity (NVCA)
Description
This dataset contains information related to new violent criminal activity (NVCA) as an outcome. It includes pre-treatment covariates, a binary treatment variable (Z), an ordinal decision variable (D), and an outcome variable (Y). The dataset provides a rich set of variables that can be used to analyze factors influencing violent crime recidivism, with a focus on various demographic and criminal history attributes.
Usage
data(NVCAdata_tbl_df)
Format
A tibble with 1,891 observations and 19 variables:
- Sex
 Sex of the individual (numeric).
- White
 Indicates if the individual is White (numeric).
- SexWhite
 Indicates if the individual is both White and male (numeric).
- Age
 Age of the individual (numeric).
- PendingChargeAtTimeOfOffense
 Pending charge at the time of offense (numeric).
- NCorNonViolentMisdemeanorCharge
 Non-violent misdemeanor charge (numeric).
- ViolentMisdemeanorCharge
 Violent misdemeanor charge (numeric).
- ViolentFelonyCharge
 Violent felony charge (numeric).
- NonViolentFelonyCharge
 Non-violent felony charge (numeric).
- PriorMisdemeanorConviction
 Prior misdemeanor conviction (numeric).
- PriorFelonyConviction
 Prior felony conviction (numeric).
- PriorViolentConviction
 Prior violent conviction (numeric).
- PriorSentenceToIncarceration
 Prior sentence to incarceration (numeric).
- PriorFTAInPastTwoYears
 Prior failure to appear in the past two years (numeric).
- PriorFTAOlderThanTwoYears
 Prior failure to appear older than two years (numeric).
- Staff_ReleaseRecommendation
 Staff release recommendation (numeric).
- Z
 Binary treatment variable (numeric).
- D
 Ordinal decision variable (numeric).
- Y
 Outcome variable indicating new violent criminal activity (numeric).
Details
The dataset name has been changed to 'NVCAdata_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble in R. The original content has not been modified in any way.
Source
Interim Dane data with new violent criminal activity (NVCA) as an outcome.
Ndrangheta Mafia Covert Network Dataset
Description
This dataset contains a network of co-attendance occurrences of suspected members of the Ndrangheta criminal organization at summits held between 2007 and 2009. These summits were meetings aimed at making important decisions, resolving internal issues, and establishing roles and powers.
Usage
data(Ndrangheta_list)
Format
A list with 2 elements:
- X
 A numeric matrix of dimensions 146 x 146 representing the co-attendance occurrences between members of the Ndrangheta organization at summits. The matrix includes member pairs and their respective co-attendance frequency.
- node_meta
 A data frame with 146 observations and 3 variables:
  - Role
   Character vector indicating the role of each member in the organization.
  - Locale
   Character vector indicating the geographic locale of each member.
  - Id
   Integer vector representing a unique identifier for each member.
Details
The dataset name has been changed to 'Ndrangheta_list' to avoid confusion with other data sets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'list' indicates that the dataset is a list object in R. The original content has not been modified in any way.
Source
Ndrangheta mafia covert network dataset, containing data from summits between 2007 and 2009.
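A list holding a co-attendance matrix and a node table can be summarised per member with row sums. The sketch below uses a 3 x 3 placeholder with the documented element names `X` and `node_meta`; it assumes the matrix is symmetric and that rows follow the order of `node_meta`, which should be verified on the real object. The role and locale labels are invented for illustration.

```r
# Placeholder object with the documented structure (NOT the real data).
toy <- list(
  X = matrix(c(0, 2, 1,
               2, 0, 0,
               1, 0, 0), nrow = 3, byrow = TRUE),
  node_meta = data.frame(
    Role = c("role_a", "role_b", "role_c"),
    Locale = c("locale_x", "locale_x", "locale_y"),
    Id = 1:3
  )
)

co <- toy$X
stopifnot(isSymmetric(co))   # assumption: undirected co-attendance
diag(co) <- 0                # ignore any self-pairs

strength <- rowSums(co)      # total co-attendances per member
partners <- rowSums(co > 0)  # number of distinct co-attendees

cbind(toy$node_meta, strength, partners)
```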
Nigeria Terrorism Data
Description
This dataset contains information on terrorist attacks by Fulani Extremists and Boko Haram in Nigeria, starting from the year 2014. The attack data is sourced from the Global Terrorism Database, while other variables are derived from the UCDP PRIO-Grid data. The dataset includes geographic coordinates, population data, and information about mountainous areas relevant to the attacks.
Usage
data(NigeriaTerrorism_df)
Format
A data frame with 312 observations and 6 variables:
- att.ful
 Number of attacks by Fulani Extremists (numeric).
- att.bok
 Number of attacks by Boko Haram (numeric).
- xcoord
 X-coordinate of the attack location (numeric).
- ycoord
 Y-coordinate of the attack location (numeric).
- pop
 Population of the area (numeric).
- mtns
 Indicator of whether the location is mountainous (numeric).
Details
The dataset name has been changed to 'NigeriaTerrorism_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a traditional data frame in R. The original content has not been modified in any way.
Source
Global Terrorism Database and UCDP PRIO-Grid data.
Sentences of 41 Prisoners Convicted of a Homicide Offense
Description
This dataset contains information on the length of sentences served by 41 prisoners convicted of a homicide offense. The data was taken from a report by the U.S. Department of Justice, Bureau of Justice Statistics, which provides insight into the sentencing and time served for violent crimes, specifically homicides. The dataset includes the number of months each prisoner served in prison.
Usage
data(Sentence_tbl_df)
Format
A tibble with 41 observations and 1 variable:
- months
 The number of months served in prison by each prisoner (integer).
Details
The dataset name has been changed to 'Sentence_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Source
U.S. Department of Justice, Bureau of Justice Statistics, Prison Sentences and Time Served for Violence, NCJ-153858, April 1995.
Suicide Rates in Germany
Description
This dataset contains suicide frequencies in West Germany, classified by age, sex, and method of suicide. The data are taken from Heuer (1979).
Usage
data(Suicide_Germany_df)
Format
A data frame with 306 observations and 6 variables:
- Freq
 Numeric variable representing the frequency of suicides.
- sex
 Factor indicating the sex of the individual (2 levels: 'Male', 'Female').
- method
 Factor indicating the method of suicide (9 levels).
- age
 Numeric variable representing the age of the individual.
- age.group
 Factor indicating the age group (5 levels).
- method2
 Factor indicating a secondary categorization of the suicide method (8 levels).
Details
The dataset name has been changed to 'Suicide_Germany_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Heuer, 1979. Suicide Rates in West Germany.
Global Terrorism Database (GTD) Yearly Summaries
Description
This dataset contains yearly summaries of global terrorism incidents from 1970 onward. The data includes information on over 209,000 incidents of terrorism, with details on the country, year, and other relevant variables related to each incident.
Usage
data(TerrorismGlobal_table)
Format
A table with 10,200 rows and 50 columns:
- country_txt
 Character vector representing the country where the terrorist incident occurred.
- iyear
 Character vector representing the year the incident took place.
Details
The dataset name has been changed to 'TerrorismGlobal_table' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'table' indicates that the dataset is represented as a table in R. The original content has not been modified in any way.
Source
Global Terrorism Database (GTD), 1970–2020.
Terrorism Incidents in the USA (1968-1974)
Description
This dataset provides a summary of terrorism incidents recorded in the United States during the period from January 1968 to April 1974. It is part of a larger chronology of international terrorism incidents compiled by Jenkins and Johnson (1975).
Usage
data(USATerror_data_df)
Format
A data frame with 6 observations and 2 variables:
- Incidents
 Number of recorded terrorism incidents (integer).
- fre
 Frequency of incidents (numeric).
Details
The dataset name has been changed to 'USATerror_data_df' to align with the naming conventions of the crimedatasets package. The suffix 'df' indicates that the dataset is a data frame in R. The original content has not been modified in any way.
Source
Jenkins, B. M., & Johnson, W. (1975). Chronology of International Terrorism (1968-1974). Extracted from: Li, X. H., Huang, Y. Y., & Zhao, X. Y. (2011). *The Kumaraswamy Binomial Distribution*. Chinese Journal of Applied Probability and Statistics, 27(5), 511-521.
Violent Crime Rates by US State
Description
This dataset contains statistics on violent crime rates in each of the 50 US states for the year 1973. The data includes arrests per 100,000 residents for assault, murder, and rape, as well as the percentage of the population living in urban areas.
Usage
data(USArrests_df)
Format
A data frame with 50 observations and 4 variables:
- Murder
 Murder arrests per 100,000 residents (numeric).
- Assault
 Assault arrests per 100,000 residents (integer).
- UrbanPop
 Percentage of the population living in urban areas (integer).
- Rape
 Rape arrests per 100,000 residents (numeric).
Details
The dataset name has been changed to 'USArrests_df' to maintain consistency with the naming conventions of the crimedatasets package. The suffix 'df' indicates that the dataset is stored as a data frame in R. The original content has not been modified in any way.
Source
1973 crime data, originally included in the USArrests dataset from R.
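Since the source is the `USArrests` dataset shipped with R, the example below runs on `datasets::USArrests` so that it works without the package installed; it assumes the package copy has the same four columns. The combined `violent` column is an illustrative sum, not a variable of the dataset.

```r
# Base R copy of the data; the package copy is assumed to match it.
arrests <- datasets::USArrests

# Illustrative total: arrests per 100,000 across the three offences.
arrests$violent <- arrests$Murder + arrests$Assault + arrests$Rape

# States ranked from highest to lowest combined arrest rate.
ranked <- arrests[order(arrests$violent, decreasing = TRUE), ]
head(ranked, 5)
```

`UrbanPop` is a percentage rather than an arrest rate, so it is left out of the sum.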
Lawyers' Ratings of State Judges in the US Superior Court
Description
This dataset contains ratings of U.S. state judges in the Superior Court as evaluated by lawyers. The ratings are based on various attributes of the judges, including integrity, diligence, and legal knowledge.
Usage
data(USJudgeRatings_df)
Format
A data frame with 43 rows and 12 variables:
- CONT
 Rating for judicial control over the court proceedings (numeric).
- INTG
 Rating for integrity (numeric).
- DMNR
 Rating for demeanor (numeric).
- DILG
 Rating for diligence (numeric).
- CFMG
 Rating for case management (numeric).
- DECI
 Rating for decision-making ability (numeric).
- PREP
 Rating for preparation (numeric).
- FAMI
 Rating for familiarity with the law (numeric).
- ORAL
 Rating for oral communication skills (numeric).
- WRIT
 Rating for written communication skills (numeric).
- PHYS
 Rating for physical appearance (numeric).
- RTEN
 Rating for worthiness of retention (numeric).
Details
The dataset name has been changed to 'USJudgeRatings_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Lawyers' ratings of U.S. state judges in the Superior Court.
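A short sketch of how the ratings might be explored. It assumes that base R's USJudgeRatings, which has the same 43 rows and 12 columns, can stand in when crimedatasets is not installed. The names ratings, by_rten, and cor_with_rten are illustrative.

```r
# Use the packaged copy when available, otherwise base R's USJudgeRatings
ratings <- if (requireNamespace("crimedatasets", quietly = TRUE)) {
  crimedatasets::USJudgeRatings_df
} else {
  datasets::USJudgeRatings
}

# Judges ordered by the RTEN column, highest first
by_rten <- ratings[order(ratings$RTEN, decreasing = TRUE), ]
print(head(by_rten[, c("INTG", "DILG", "RTEN")], 5))

# Correlation of every column with RTEN, strongest first
cor_with_rten <- sort(cor(ratings)[, "RTEN"], decreasing = TRUE)
print(round(cor_with_rten, 2))
```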
The Effect of Punishment Regimes on Crime Rates
Description
This dataset contains aggregate data on crime rates and socioeconomic indicators for 47 states in the USA for 1960. It explores the effect of punishment regimes on crime rates, with variables scaled to convenient numbers.
Usage
data(UScrime_df)
Format
A data frame with 47 observations and 16 variables:
- M
 Number of males aged 14–24 per 100,000 (integer).
- So
 Indicator for Southern states (1 = South, 0 = non-South) (integer).
- Ed
 Mean years of schooling (integer).
- Po1
 Police expenditure in 1960 per capita (integer).
- Po2
 Police expenditure in 1959 per capita (integer).
- LF
 Labor force participation rate per 100,000 (integer).
- M.F
 Ratio of males to females (integer).
- Pop
 Population size per 100,000 (integer).
- NW
 Percent non-white population (integer).
- U1
 Unemployment rate of urban males aged 14–24 (integer).
- U2
 Unemployment rate of urban males aged 35–39 (integer).
- GDP
 Gross domestic product per capita (integer).
- Ineq
 Income inequality indicator (integer).
- Prob
 Probability of imprisonment (numeric).
- Time
 Average time served in state prisons (in months) (numeric).
- y
 Crime rate: number of offenses per 100,000 population (integer).
Details
The dataset name has been changed to 'UScrime_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a traditional data frame in R. The original content has not been modified in any way.
Source
Aggregate data on crime and punishment regimes in the USA, 1960.
US Crime Rates (1960–2019)
Description
This dataset contains national data on the number of crimes committed in the United States between 1960 and 2019. It provides annual statistics on total crimes, violent crimes, property crimes, and their subcategories.
Usage
data(UScrimerates_tbl_df)
Format
A tibble with 60 rows and 12 variables:
- year
 Year of the recorded data (numeric).
- population
 Total US population (numeric).
- total
 Total number of crimes (numeric).
- violent
 Total number of violent crimes (numeric).
- property
 Total number of property crimes (numeric).
- murder
 Number of murders (numeric).
- forcible_rape
 Number of reported cases of forcible rape (numeric).
- robbery
 Number of robberies (numeric).
- aggravated_assault
 Number of aggravated assaults (numeric).
- burglary
 Number of burglaries (numeric).
- larceny_theft
 Number of larceny-theft crimes (numeric).
- vehicle_theft
 Number of motor vehicle thefts (numeric).
Details
The dataset name has been changed to 'UScrimerates_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is stored as a tibble. The original content has not been modified in any way.
Source
National crime data for the United States (1960–2019).
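Because the columns hold raw counts alongside the population, per-capita rates have to be derived. The sketch below shows one way to do that on a small hypothetical data frame that only mimics the documented column names. The numbers are made up and are not taken from UScrimerates_tbl_df.

```r
# Hypothetical rows using the documented column names; values are invented
crime <- data.frame(
  year       = c(2000, 2001),
  population = c(1000000, 2000000),
  violent    = c(5000, 8000),
  property   = c(30000, 50000)
)

# Convert counts to rates per 100,000 residents
crime$violent_rate  <- crime$violent  / crime$population * 1e5
crime$property_rate <- crime$property / crime$population * 1e5

print(crime[, c("year", "violent_rate", "property_rate")])
```

The same two assignments should apply unchanged to the packaged tibble, since it documents population, violent, and property as numeric columns.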
US Incarcerations 1925 Onward
Description
This dataset contains counts of prisoners under the jurisdiction of state and federal correctional authorities in the United States from 1925 onward. The data excludes jail inmates and focuses on individuals in state and federal incarceration facilities.
Usage
data(USincarcerations_df)
Format
A data frame with 95 rows and 7 variables:
- year
 Year of the recorded data (numeric).
- stateFedIncarcerees
 Number of prisoners under state and federal jurisdiction (numeric).
- stateFedIncarcerationRate
 Incarceration rate per 100,000 population for state and federal facilities (numeric).
- stateFedMales
 Number of male prisoners in state and federal facilities (numeric).
- stateFedMaleRate
 Male incarceration rate per 100,000 male population (numeric).
- stateFedFemales
 Number of female prisoners in state and federal facilities (numeric).
- stateFedFemaleRate
 Female incarceration rate per 100,000 female population (numeric).
Details
The dataset name has been changed to 'USincarcerations_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
US incarceration data (1925 onward).
Crime Records of Camden Borough, UK
Description
This dataset contains information on reported crimes in Camden, including spatial coordinates, dates of the incidents, and crime types. It provides a detailed view of crime patterns within the region.
Usage
data(camden_crimes_df)
Format
A data frame with 4,578 observations and 4 variables:
- x
 X-coordinate (numeric).
- y
 Y-coordinate (numeric).
- date
 Date of the reported crime (Date).
- type
 Type of crime (character).
Details
The dataset name has been changed to 'camden_crimes_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a traditional data frame in R. The original content has not been modified in any way.
Source
Data comprising 'Theft' and 'Criminal Damage' records of Camden Borough of London, UK, 2021. (Source: https://data.police.uk/data/)
China's Corruption Investigations
Description
This dataset contains information on nearly 20,000 officials who were investigated during Xi Jinping's anti-corruption campaign. It provides data on the province, prefecture, and county where the investigations occurred, as well as unique identifiers for each administrative level.
Usage
data(corruption_tbl_df)
Format
A tibble with 10 observations and 6 variables:
- province
 2-digit province number (numeric).
- prefecture
 Prefecture name in Chinese (character).
- county
 County name in Chinese (character).
- province_id
 6-digit province identifier (numeric).
- prefecture_id
 6-digit prefecture identifier (numeric).
- county_id
 6-digit county identifier (numeric).
Details
The dataset name has been changed to 'corruption_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble in R. The original content has not been modified in any way.
Source
Data from China's anti-corruption campaign investigations.
Criminal Offenders Screened in Florida
Description
This dataset contains information on criminal offenders who were screened in Florida during 2013-2014.
Usage
data(crimOffenders_df)
Format
A data frame with 5,855 observations and 16 variables:
- age
 Age of the offender (numeric).
- juv_fel_count
 Number of juvenile felonies committed (numeric).
- decile_score
 COMPAS score decile (numeric).
- juv_misd_count
 Number of juvenile misdemeanors committed (numeric).
- juv_other_count
 Number of other juvenile convictions (numeric).
- v_decile_score
 Predicted decile score of the offender (numeric).
- priors_count
 Number of prior crimes committed (numeric).
- sex
 Gender of the offender (factor with levels 'Female' and 'Male').
- two_year_recid
 Recidivism within two years (factor with levels 'Yes' and 'No').
- race
 Race of the offender (factor with levels 'White', 'Black', 'Hispanic', 'Asian', 'Other', 'Native').
- c_jail_in
 Date of entry into jail (normalized between 0 and 1, numeric).
- c_jail_out
 Date of release from jail (normalized between 0 and 1, numeric).
- c_offense_date
 Date the offense was committed (numeric).
- screening_date
 Date the offender was screened (numeric).
- in_custody
 Date the offender was placed in custody (normalized between 0 and 1, numeric).
- out_custody
 Date the offender was released from custody (normalized between 0 and 1, numeric).
Details
The dataset name has been changed to 'crimOffenders_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a traditional data frame in R. The original content has not been modified in any way.
Source
Data collected from criminal offenders screened in Florida during 2013-2014.
US Crime Rates & High School Dropout
Description
This dataset relates violent crime rates to the percentage of the population without a high school degree across U.S. states. Each row pairs a state's violent crime rate with its share of residents who lack a high school degree.
Usage
data(crimeHSdegree_tbl_df)
Format
A tibble with 51 observations and 3 variables:
- state
 State name (character).
- nodegree
 Percent of the population without a high school degree (numeric).
- crime
 Violent crimes per 100,000 population (numeric).
Details
The dataset name has been changed to 'crimeHSdegree_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble in R. The original content has not been modified in any way.
Source
U.S. Crime Data and Education Statistics.
Annual Crime Dataset of US Counties
Description
This dataset contains annual crime-related statistics for US counties, including violent crime rates, murder rates, and socio-economic indicators such as poverty, education, and unemployment. It provides a comprehensive overview of crime and its potential correlates across the United States.
Usage
data(crimestatewide_tbl_df)
Format
A tibble with 51 observations and 9 variables:
- State
 State name (character).
- violent crime rate
 Violent crime rate per 100,000 people (numeric).
- murder rate
 Murder rate per 100,000 people (numeric).
- poverty
 Poverty rate as a percentage (numeric).
- high school
 Percentage of high school graduates (numeric).
- college
 Percentage of college graduates (numeric).
- single parent
 Percentage of single-parent households (numeric).
- unemployed
 Unemployment rate as a percentage (numeric).
- metropolitan
 Percentage of the population living in metropolitan areas (numeric).
Details
The dataset name has been changed to 'crimestatewide_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is stored as a tibble, a modern and more readable alternative to traditional data frames in R. The original content has not been modified in any way.
Source
Annual crime data of US counties.
Student's 3000 Criminals Data
Description
Data of 3000 male criminals over 20 years old undergoing their sentences in the chief prisons of England and Wales.
Usage
data(crimtab_table)
Format
A table with 42 rows and 22 columns:
- Var1
 Midpoint of the finger-length interval, in centimetres (the 42 row categories).
- Var2
 Body height, in centimetres (the 22 column categories).
- Freq
 Number of criminals falling in each combination of finger length and body height.
Details
The dataset name has been changed to 'crimtab_table' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'table' indicates that the dataset is stored as a contingency table, rather than a traditional data frame. The original content has not been modified in any way.
Source
Public crime data.
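The Format section mixes a 42 x 22 table with the Var1, Var2, and Freq columns that appear once the table is converted to long form. The sketch below shows that conversion. It assumes crimtab_table matches base R's crimtab, which carries the same title and description, and falls back to that object when crimedatasets is not installed. The names tab, long, and nonzero are illustrative.

```r
# Use the packaged table when available, otherwise base R's crimtab
tab <- if (requireNamespace("crimedatasets", quietly = TRUE)) {
  crimedatasets::crimtab_table
} else {
  datasets::crimtab
}

# The contingency table itself: 42 rows by 22 columns
print(dim(tab))

# Long form: one row per cell, with columns Var1, Var2, and Freq
long <- as.data.frame(tab, stringsAsFactors = FALSE)
print(head(long))

# The cell counts add up to the number of criminals in the study
total <- sum(long$Freq)
print(total)

# Keep only the combinations that were actually observed
nonzero <- long[long$Freq > 0, ]
```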
Fraudulent Automobile Insurance Claims
Description
This dataset contains information on 127 automobile insurance claims arising from accidents in Massachusetts, USA, in 1989. Each claim was classified as either fraudulent or legitimate by consensus among four independent claims adjusters who thoroughly examined each case file.
Usage
data(fraudulent_df)
Format
A data frame with 42 observations and 12 variables:
- r1
 Numeric score or rating 1 (numeric).
- r2
 Numeric score or rating 2 (numeric).
- AC1
 Indicator for a specific automobile claim condition (factor with 2 levels).
- AC9
 Indicator for a second specific automobile claim condition (factor with 2 levels).
- AC16
 Indicator for a third specific automobile claim condition (factor with 2 levels).
- CL7
 Claim-level indicator for condition 7 (factor with 2 levels).
- CL11
 Claim-level indicator for condition 11 (factor with 2 levels).
- IJ2
 Insurance adjuster’s information indicator for condition 2 (factor with 2 levels).
- IJ3
 Insurance adjuster’s information indicator for condition 3 (factor with 2 levels).
- IJ4
 Insurance adjuster’s information indicator for condition 4 (factor with 2 levels).
- IJ6
 Insurance adjuster’s information indicator for condition 6 (factor with 2 levels).
- IJ12
 Insurance adjuster’s information indicator for condition 12 (factor with 2 levels).
Details
The dataset name has been changed to 'fraudulent_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a traditional data frame in R. The original content has not been modified in any way.
Source
Fraudulent automobile insurance claims data from Massachusetts, 1989.
Crime Records of Georgia State, USA
Description
This dataset contains information on reported crimes across Georgia State, including spatial coordinates, dates of incidents, and crime types. It provides valuable insights into crime patterns within the region.
Usage
data(georgia_sf)
Format
An sf object (spatial data frame) with 10,523 observations and 5 variables:
- geometry
 Spatial geometry of each crime record (sf object).
- date
 Date of the reported crime (Date).
- type
 Type of crime (character).
- city
 City where the crime occurred (character).
- county
 County where the crime occurred (character).
Details
The dataset name has been changed to 'georgia_sf' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'sf' indicates that the dataset is a spatial data frame in R. The original content has not been modified in any way.
Source
Public crime data for Georgia State.
US Hate Crimes & Socioeconomic Factors
Description
This dataset contains data on hate crimes across the United States and associated socioeconomic factors. It provides insights into potential relationships between income inequality, socioeconomic characteristics, and the frequency of hate crimes.
Usage
data(hate_crimes_tbl_df)
Format
A tibble with 51 observations and 13 variables:
- state
 Full name of the state (character).
- state_abbrev
 Abbreviation of the state (character).
- median_house_inc
 Median household income (integer).
- share_unemp_seas
 Share of unemployed people (seasonally adjusted) (numeric).
- share_pop_metro
 Share of the population living in metropolitan areas (numeric).
- share_pop_hs
 Share of the population with at least a high school education (numeric).
- share_non_citizen
 Share of the population who are non-citizens (numeric).
- share_white_poverty
 Share of the white population living in poverty (numeric).
- gini_index
 Gini index of income inequality (numeric).
- share_non_white
 Share of the population who are non-white (numeric).
- share_vote_trump
 Share of votes for Donald Trump in the 2016 presidential election (numeric).
- hate_crimes_per_100k_splc
 Hate crimes per 100,000 people as reported by the SPLC (numeric).
- avg_hatecrimes_per_100k_fbi
 Average hate crimes per 100,000 people as reported by the FBI (numeric).
Details
The dataset name has been changed to 'hate_crimes_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble, a modern version of data frames in R. The original content has not been modified in any way.
Source
The raw data behind the story "Higher Rates Of Hate Crimes Are Tied To Income Inequality" by FiveThirtyEight.
Homicides in Nine US Cities (2015)
Description
This dataset contains detailed records of homicides that occurred in nine large US cities during the year 2015. The data includes geographic locations, offense codes, and additional metadata, making it valuable for analyzing patterns and trends in urban crime.
Usage
data(homicides15_tbl_df)
Format
A tibble with 1,922 observations and 15 variables:
- uid
 Unique identifier for the record (integer).
- city_name
 Name of the city where the homicide occurred (character).
- offense_code
 Offense code for the homicide (character).
- offense_type
 Type of offense (character).
- date_single
 Date and time of the homicide (POSIXct).
- address
 Address where the homicide occurred (character).
- longitude
 Longitude of the location (numeric).
- latitude
 Latitude of the location (numeric).
- location_type
 Type of location (character).
- location_category
 Category of location (character).
- fips_state
 FIPS code for the state (integer).
- fips_county
 FIPS code for the county (character).
- tract
 Census tract identifier (character).
- block_group
 Census block group identifier (integer).
- block
 Census block identifier (integer).
Details
The dataset name has been changed to 'homicides15_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is stored as a tibble, offering better printing and subsetting capabilities in R. The original content has not been modified in any way.
This dataset provides insights into homicides in urban areas, offering geographic and temporal information for each case.
Source
Crime Open Database, 2015.
Murders in New Zealand (2004 - 2019)
Description
This dataset contains information about recorded murder cases in New Zealand between 2004 and 2019. It includes details on the sex, age, and cause of death of the victims, as well as the identity of the alleged killer, the date of the crime, and the region where the crime occurred. The dataset is in the form of a simple features (sf) object, with geographic data represented as points.
Usage
data(nz_murders_sf)
Format
An sf data frame with 967 observations and 12 variables:
- sex
 Sex of the victim (character).
- age
 Age of the victim (integer).
- date
 Date of the murder (character).
- year
 Year the murder occurred (integer).
- cause
 Cause of death (character).
- killer
 Name of the alleged killer (character).
- name
 Name of the victim (character).
- full_date
 Full date and time of the murder (POSIXct).
- month
 Month of the murder (ordered factor with 12 levels).
- cause_cat
 Category of the cause of death (character).
- region
 Region where the murder occurred (character).
- geometry
 Geographic coordinates (sf POINT) representing the location of the murder (list of 967).
Details
The dataset name has been changed to 'nz_murders_sf' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix '_sf' indicates that the dataset is an sf object in R, used for storing and handling spatial data. The original content has not been modified in any way.
Source
Recorded murder data for New Zealand (2004 - 2019).
Fatal Police Shootings Data
Description
This dataset contains records of every fatal police shooting by an on-duty officer since January 1, 2015. It includes information about the shooting incidents, the characteristics of the individuals involved, and details such as mental illness signs, body camera usage, and more. This dataset is valuable for analyzing trends and patterns in fatal police shootings in the United States.
Usage
data(police_shootings_tbl_df)
Format
A tibble with 6,421 observations and 12 variables:
- date
 Date of the shooting (Date).
- manner_of_death
 How the individual died (character).
- armed
 Indicates if the individual was armed (character).
- age
 Age of the individual (numeric).
- gender
 Gender of the individual (character).
- race
 Race of the individual (character).
- city
 City where the shooting occurred (character).
- state
 State where the shooting occurred (character).
- signs_of_mental_illness
 Whether the individual showed signs of mental illness (logical).
- threat_level
 Perceived threat level of the individual (character).
- flee
 Whether the individual was fleeing (character).
- body_camera
 Whether the officer was wearing a body camera (logical).
Details
The dataset name has been changed to 'police_shootings_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble, which is a modern version of a data frame in R. The original content has not been modified in any way.
Source
Washington Post Fatal Police Shootings database.
Rearrests of Juvenile Felons
Description
This dataset contains information on rearrests of juvenile felons based on the type of court in which they were tried. The data originates from a sample of juveniles convicted of a felony in Florida in 1987, with matched pairs formed using criteria such as age and the number of previous offenses. The dataset provides counts of rearrests for juveniles, categorized by adult and juvenile courts. This data is useful for analyzing rearrest rates and judicial outcomes for juveniles convicted of felonies.
Usage
data(rearrests_table)
Format
A table with 2 rows and 2 columns:
- Adult court
 Number of rearrests (numeric) and no rearrests (numeric) in adult court.
- Juvenile court
 Number of rearrests (numeric) and no rearrests (numeric) in juvenile court.
Details
The dataset name has been changed to 'rearrests_table' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'table' indicates that the dataset is a contingency table in R, representing the counts of rearrests by court type. The original content has not been modified in any way.
Source
Agresti, 1996. Data on rearrests of juvenile felons in Florida, 1987.
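Since the description says the juveniles were sampled as matched pairs, a paired test such as McNemar's is one natural way to analyze a 2 x 2 table of this kind. The sketch below runs stats::mcnemar.test on hypothetical counts. The numbers and the dimnames are invented for illustration and are not the values stored in rearrests_table.

```r
# Hypothetical matched-pairs counts; NOT the contents of rearrests_table.
# Rows: outcome for the pair member tried in adult court.
# Columns: outcome for the pair member tried in juvenile court.
matched <- matrix(
  c(10, 30,
    15, 45),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    adult    = c("Rearrest", "No rearrest"),
    juvenile = c("Rearrest", "No rearrest")
  )
)

# McNemar's test uses only the discordant pairs (the off-diagonal cells)
discordant <- c(matched[1, 2], matched[2, 1])
print(discordant)

# Continuity-corrected test, which is the default for mcnemar.test
result <- mcnemar.test(matched)
print(result)
```

With crimedatasets installed, passing rearrests_table in place of matched should work the same way, provided the table is laid out with one court type on each dimension.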
Florida State Prison Sentencing Counts by County, 1905-1910
Description
This dataset contains information about state prison sentencing counts by county in Florida for the years 1905-1910. The data includes various aggregated statistics such as the population of white and Black residents, the number of sentences, and other demographic and agricultural factors at the county level. The dataset also includes geographic information in the form of simple features (sf) representing county boundaries from the year 1910. The population data for each county has been interpolated linearly between the decennial censuses of 1900 and 1910.
Usage
data(sentencing_sf)
Format
A simple features (sf) object with 47 observations and 9 variables:
- name
 Name of the county (character).
- wpop
 White population (numeric).
- bpop
 Black population (numeric).
- sents
 Number of sentences in the county (numeric).
- plantation_belt
 Indicator of plantation belt counties (numeric).
- pct_ag_1910
 Percentage of agricultural land in 1910 (numeric).
- expected_sents
 Expected number of sentences based on population (numeric).
- sir_raw
 Index of racial disparities in sentencing (numeric).
- geometry
 Geometry column containing the spatial boundaries of the counties (list of simple features).
Details
The dataset name has been changed to 'sentencing_sf' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'sf' indicates that the dataset is a spatial object, using the Simple Features format. The original content has not been modified in any way.
Source
Data compiled from historical census and sentencing records of Florida, 1905-1910.
Serial Killers of the UK (1828 - 2015)
Description
This dataset contains information about serial killers in the UK, including their names, number of kills, years active, and the population at the time they were active. It provides a historical view of some of the most infamous serial killers in the United Kingdom.
Usage
data(uk_serial_df)
Format
A data frame with 62 observations and 8 variables:
- number_of_kills
 Total number of murders committed by the serial killer (integer).
- years
 The years during which the serial killer was active (factor).
- name
 Name of the serial killer (character).
- aka
 Known aliases of the serial killer (character).
- year_start
 The first year the serial killer was active (integer).
- year_end
 The last year the serial killer was active (integer).
- date_of_first_kill
 The date when the serial killer committed their first murder (factor).
- population_million
 Population in millions at the time the serial killer was active (numeric).
Details
The dataset name has been changed to 'uk_serial_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
https://www.murderuk.com/
NYC Vehicle Thefts (2014-2017)
Description
This dataset contains detailed records of motor vehicle thefts in New York City from 2014 to 2017. The dataset includes spatial coordinates, timestamps, and additional contextual information about each theft. It provides valuable insights into patterns and trends of vehicle thefts in NYC.
Usage
data(vehiclethefts_tbl_df)
Format
A tibble with 35,746 rows and 9 variables:
- uid
 Unique identifier for each record (integer).
- date_single
 Single date of the incident (character).
- date_start
 Start date of the incident (character).
- date_end
 End date of the incident (character).
- longitude
 Longitude of the theft location (numeric).
- latitude
 Latitude of the theft location (numeric).
- location_type
 Type of location where the theft occurred (character).
- location_category
 Category of the location (character).
- census_block
 Census block of the theft location (character).
Details
The dataset name has been changed to 'vehiclethefts_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is stored as a tibble in R. The original content has not been modified in any way.
Source
Crime Open Database: Motor Vehicle Theft Records.
Annual Female Murder Rate in the USA (1950-2004)
Description
This dataset contains the annual female murder rate per 100,000 standard population in the United States from 1950 to 2004. The series spans 55 years and supports analysis of trends and patterns in female homicides.
Usage
data(wmurders_ts)
Format
A time series object with 55 observations and 1 variable:
- wmurders_ts
 Numeric vector representing the annual female murder rate per 100,000 population in the USA.
Details
The dataset name has been changed to 'wmurders_ts' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the crimedatasets package and assists users in identifying its specific characteristics. The suffix 'ts' indicates that the dataset is a time series object in R. The original content has not been modified in any way.
Source
U.S. crime statistics and historical records.
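A brief sketch of working with an annual time series shaped like this one. The values below are a made-up placeholder sequence, not the real murder rates. Only the start year, length, and annual frequency follow the description above.

```r
# Placeholder annual series: 55 invented values starting in 1950
rate <- ts(seq(2, 4, length.out = 55), start = 1950, frequency = 1)

# The series ends in 2004, matching the documented 1950-2004 span
print(end(rate))

# Extract a sub-period by calendar year
nineties <- window(rate, start = 1990, end = 1999)
print(nineties)

# Year-over-year change in the rate
yearly_change <- diff(rate)
print(head(yearly_change))
```

With crimedatasets installed, the same window() and diff() calls should apply directly to wmurders_ts, since it is documented as a time series object.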