Calculate the percentage of positive cells for specific subpopulations
Source: R/getPerc.R
Expects input data in the same format as the output of getGatedDat, with indicator
columns that follow a specific naming convention (see below).
Usage
getPerc(
  intens_dat,
  num_marker,
  denom_marker,
  expand_num = FALSE,
  expand_denom = FALSE,
  keep_indicators = TRUE
)
Arguments
- intens_dat
dataframe of gated data with indicator columns per marker of interest (specified in
num_marker and denom_marker). Indicator columns follow the naming convention <marker>_pos, one per marker, with values of 0 for negative-expressing and 1 for positive-expressing cells
- num_marker
string for the marker(s) to specify the numerator for subpopulations of interest.
See the expand_num argument and the examples for how to specify.
- denom_marker
string for the marker(s) to specify the denominator for subpopulations of interest.
See the expand_denom argument and the examples for how to specify.
- expand_num
logical, only accepts TRUE or FALSE, with default of FALSE
if expand_num=TRUE, currently only considers up to pairs of markers specified in num_marker in the numerator of subpopulation calculations (e.g., CD4+ & CD8- of CD3+)
if expand_num=FALSE, only considers each marker specified in num_marker individually in the numerator of subpopulation calculations (e.g., CD4+ of CD3+)
- expand_denom
logical, only accepts TRUE or FALSE, with default of FALSE
if expand_denom=TRUE, currently considers up to 1 marker from num_marker together with the unique combinations of denom_marker to generate the list of subpopulations
e.g., if denom_marker=c("CD8"), num_marker=c("LAG3", "KI67"), and expand_denom=TRUE, the subpopulations will include:
1. LAG3+ of CD8+, LAG3- of CD8+, LAG3+ of CD8-, LAG3- of CD8-,
2. KI67+ of CD8+, KI67- of CD8+, KI67+ of CD8-, KI67- of CD8-,
3. KI67+ of (LAG3+ & CD8+), KI67- of (LAG3+ & CD8+), KI67+ of (LAG3+ & CD8-), KI67- of (LAG3+ & CD8-)...etc.,
4. LAG3+ of (KI67+ & CD8+), LAG3- of (KI67+ & CD8+), LAG3+ of (KI67+ & CD8-), LAG3- of (KI67+ & CD8-)...etc.,
if expand_denom=FALSE, only generates the list of subpopulations based on unique combinations of denom_marker (e.g., denom_marker=c("CD4") and expand_denom=FALSE only considers subpopulations with denominator CD4+ and CD4-, whereas denom_marker=c("CD4", "CD8") and expand_denom=FALSE will consider subpopulations with denominators (CD4- & CD8-), (CD4+ & CD8-), (CD4- & CD8+) and (CD4+ & CD8+))
- keep_indicators
logical, only accepts TRUE or FALSE, with default of TRUE
if keep_indicators=TRUE, will return indicator columns of 0/1 to specify which markers are considered in the numerator and denominator of each subpopulation.
The naming convention is <marker>_POS for the numerator columns and <marker>_POS_D for the denominator columns.
For both sets of columns, 0 indicates that the negative cells are considered, 1 indicates that the positive cells are considered, and NA_real_ indicates that the marker is not in consideration for the subpopulation.
This is useful for matching to percentage data that may use a different naming convention, where exact string matches for the same subpopulation are not guaranteed.
Note that order also matters when matching strings: "CD4+ & CD8- of CD3+" is different from "CD8- & CD4+ of CD3+".
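As a rough illustration of why the indicator columns help, the sketch below joins two tables on the 0/1 indicators instead of on the subpopulation strings. Both data frames are made up for this example: perc_dat only mimics the shape of a getPerc result with keep_indicators=TRUE, and ext_dat, its label column and its ext_perc values are hypothetical.

```r
# Hypothetical getPerc-style result with indicator columns kept
perc_dat <- data.frame(
  subpopulation = c("CD4_NEG_OF_CD3_POS", "CD4_POS_OF_CD3_POS"),
  perc = c(84, 16),
  CD4_POS = c(0, 1),
  CD3_POS_D = c(1, 1)
)

# Hypothetical external table that names the same subpopulations differently
ext_dat <- data.frame(
  label = c("CD4+ of CD3+", "CD4- of CD3+"),
  ext_perc = c(15.5, 84.5),
  CD4_POS = c(1, 0),
  CD3_POS_D = c(1, 1)
)

# Join on the indicator columns rather than on the name strings
matched <- merge(perc_dat, ext_dat, by = c("CD4_POS", "CD3_POS_D"))
matched[, c("subpopulation", "label", "perc", "ext_perc")]
```

Because the join keys are the indicators, differences in spelling or marker order between the two naming schemes do not affect the match.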
Value
tibble containing the percentage of cells, with one row per subpopulation named in the
subpopulation column. n_num is the number of cells that satisfy the numerator conditions and
n_denom is the number of cells that satisfy the denominator conditions. perc is n_num divided by n_denom, reported on a 0 to 100 scale in the example output (3 of 50 cells gives 6), unless n_denom = 0, in which case perc = NA_real_
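The perc rule can be sketched in a few lines of R. The helper name calc_perc is hypothetical and not part of the package, and the factor of 100 is an assumption inferred from the example output rather than from the package source.

```r
# Hypothetical helper (not part of the package): percentage with a guard
# that returns NA_real_ when the denominator count is zero
calc_perc <- function(n_num, n_denom) {
  ifelse(n_denom == 0, NA_real_, 100 * n_num / n_denom)
}

# Two populated denominators and one empty denominator
res <- calc_perc(c(3, 47, 0), c(50, 50, 0))
res
```

The first two inputs reuse the counts from the CD3- rows of the example output; the third shows the empty-denominator case.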
Details
The subpopulations are defined as (num marker(s)) out of (denom marker(s)), where num denotes the numerator and denom denotes the denominator (these shorthands are used in the function arguments).
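To make the "num out of denom" definition concrete, the sketch below enumerates the denominator combinations for two markers and counts cells by hand. It uses only base R on a made-up toy data frame and is not the package's implementation; the column names simply follow the naming conventions described above.

```r
# Toy indicator data (made up for illustration), one row per cell
toy <- data.frame(
  CD4_pos  = c(1, 1, 0, 0, 1, 0),
  CD8_pos  = c(0, 1, 1, 0, 0, 1),
  LAG3_pos = c(1, 0, 1, 0, 1, 1)
)

# All unique combinations of the denominator markers CD4 and CD8
denoms <- expand.grid(CD4_POS_D = c(0, 1), CD8_POS_D = c(0, 1))

# Denominator: cells that satisfy each CD4/CD8 combination
denoms$n_denom <- mapply(
  function(cd4, cd8) sum(toy$CD4_pos == cd4 & toy$CD8_pos == cd8),
  denoms$CD4_POS_D, denoms$CD8_POS_D
)

# Numerator: LAG3+ cells within each denominator combination
denoms$n_num <- mapply(
  function(cd4, cd8) {
    sum(toy$CD4_pos == cd4 & toy$CD8_pos == cd8 & toy$LAG3_pos == 1)
  },
  denoms$CD4_POS_D, denoms$CD8_POS_D
)
denoms
```

Each row of denoms corresponds to one denominator, in the order (CD4- & CD8-), (CD4+ & CD8-), (CD4- & CD8+), (CD4+ & CD8+), with the LAG3+ count as the numerator.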
Examples
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
# Create a fake dataset
set.seed(100)
intens_dat <- tibble::tibble(
  CD3_pos=rep(c(0, 1), each=50),
  CD4=rnorm(100, 100, 10),
  CD8=rnorm(100, 100, 10)
)
# Run getDensityGates to obtain the gates
gates <- getDensityGates(intens_dat, marker="CD4", subset_col="CD3_pos", bin_n=40)
# Tag on the 0/1 on intens_dat
intens_dat_2 <- getGatedDat(intens_dat, cutoffs=gates, subset_col="CD3_pos")
# Get percentage for CD4 based on gating
getPerc(intens_dat_2, num_marker=c("CD4"), denom_marker="CD3")
#> # A tibble: 4 × 6
#>   subpopulation      n_num n_denom  perc CD4_POS CD3_POS_D
#>   <chr>              <int>   <int> <dbl>   <dbl>     <dbl>
#> 1 CD4_NEG_OF_CD3_NEG     3      50     6       0         0
#> 2 CD4_POS_OF_CD3_NEG    47      50    94       1         0
#> 3 CD4_NEG_OF_CD3_POS    42      50    84       0         1
#> 4 CD4_POS_OF_CD3_POS     8      50    16       1         1