About cell frequency, population name

I’ve noticed that the cell_type_name values differ between the two tables. The differences are not particularly dramatic, but it would be prudent to keep the names consistent across both.

setdiff(unique(pbmc_cell_frequency$cell_type_name), unique(pbmc_cell_frequency_2022$cell_type_name))
[1] "CD3CD19" "CD19+ CD3-" "MB doublets (CD19+CD14+)"
[4] "ASCs (Plasmablasts)" "B:NK doublets (CD19+CD3-CD14-CD56+)" "T cells (CD19-CD3+CD14-)"
[7] "CD56+ CD3high T cells" "T:M doublets (CD19-CD3+CD14+)" "CD19-CD3-"
[10] "CD3-CD19-CD56-HLA-DR+CD14-CD16- cells" "Lineage - cells (CD3-CD19-CD56-HLA-DR-)" "CD33HLADR"
[13] "Tregs" "mDC"
setdiff(unique(pbmc_cell_frequency_2022$cell_type_name), unique(pbmc_cell_frequency$cell_type_name))
[1] "Antibody secreting B cells (ASCs)" "BNK doublets" "CD19+CD3-" "CD3-CD19-"
[5] "CD3-CD19-CD56-HLA-DR-CD14-CD16- cells" "CD56+CD3high T cells" "Lineage- cells (CD3-CD19-CD56-HLA-DR-)" "MB doublets"
[9] "Round" "TB doublets" "T cells (CD19-CD3-CD14-)" "TM doublets"
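
Some of these mismatches look like spacing differences only (for example "CD19+ CD3-" vs "CD19+CD3-"). A minimal sketch of one way to separate those from genuinely different names is below. It uses small toy vectors copied from the lists above rather than the real tables, and normalize_name is a hypothetical helper, not part of any existing package.

```r
# Toy vectors copied from a few of the names listed above. With the real data,
# these would be unique(pbmc_cell_frequency$cell_type_name) and
# unique(pbmc_cell_frequency_2022$cell_type_name).
names_train <- c("CD19+ CD3-", "CD56+ CD3high T cells", "Tregs", "mDC")
names_2022  <- c("CD19+CD3-", "CD56+CD3high T cells", "Round")

# Hypothetical helper: remove all whitespace so that spacing-only
# variants of the same name compare as equal.
normalize_name <- function(x) gsub("\\s+", "", x)

# TRUE where a training name has a 2022 counterpart after normalization.
has_match <- normalize_name(names_train) %in% normalize_name(names_2022)

# Names with no counterpart even after normalization (genuine differences).
real_diff <- names_train[!has_match]

# Names that match after normalization but not verbatim (spacing-only differences).
spacing_only <- setdiff(names_train[has_match], names_2022)

print(real_diff)
print(spacing_only)
```

This only catches whitespace variants. Differences such as "MB doublets (CD19+CD14+)" vs "MB doublets" would still need an explicit mapping agreed with the data maintainers.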


@Joe, excellent observation. I’ve checked and confirmed that the cell_type_name values in the 2022 dataset were named inconsistently relative to the training datasets, particularly the 2021 dataset. We have since updated these names so that they are consistent. As a result, the plasma_cell_frequency database table and its associated API have changed, and the related data file (2022LD_pbmc_cell_frequency) in the prediction_dataset has been updated as well.