Missing subject_id 98 and 113 from TSV files for 2022 dataset

The 2022 TSV files here (the ones linked from the prediction page) are missing titer data for subject_id 98 and 113.

These are a female wP subject and a male aP subject, respectively. Oddly, the TSV submission template file includes subject_id 98 but not 113.

Pulling the full datasets from the API endpoints shows the same gap for subject_id 113, but subject_id 98 is present in the titer data there.

Across the full datasets (2020_dataset, 2021_dataset and 2022_dataset), titer data is missing for subject_id 2, 8, 82, 87, 88 and 113:

library(jsonlite)
library(dplyr)

subject <- read_json("https://www.cmi-pb.org/api/subject", simplifyVector = TRUE) 
specimen <- read_json("https://www.cmi-pb.org/api/specimen", simplifyVector = TRUE) 
titer <- read_json("https://www.cmi-pb.org/api/plasma_ab_titer", simplifyVector = TRUE) 

# Join these tables to annotate which specimen
#   and subject each titer measurement relates to:
meta <- inner_join(specimen, subject)
abdata <- inner_join(titer, meta)

# Check which subject_ids are missing from the joined titer data:
ori.subject.ids <- names(table(subject$subject_id))
got.subject.ids <- names(table(abdata$subject_id))

ori.subject.ids[ !ori.subject.ids %in% got.subject.ids]

## [1] "2"   "8"   "82"  "87"  "88"  "113"
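
The same check can also be written as a small base-R helper that works on the tables directly, which keeps the IDs in their original type instead of converting them to character via `names(table(...))`. This is only a sketch: the helper name `subjects_without_titer` and the toy tables are made up for illustration, and it assumes that `specimen_id` links titer to specimen and that `subject_id` links specimen to subject.

```r
# Hypothetical helper (not part of the CMI-PB API): return the subject_ids that
# appear in `subject` but have no titer rows once titers are linked to subjects
# through the specimen table.
subjects_without_titer <- function(subject, specimen, titer) {
  # Keep only titer rows whose specimen_id is known, and pick up subject_id
  linked <- merge(titer, specimen, by = "specimen_id")
  # Subjects that never show up in the linked titer rows
  sort(setdiff(subject$subject_id, linked$subject_id))
}

# Toy tables for illustration only (not real CMI-PB data)
toy_subject  <- data.frame(subject_id = c(1, 2, 3))
toy_specimen <- data.frame(specimen_id = c(10, 11, 12),
                           subject_id  = c(1, 2, 3))
toy_titer    <- data.frame(specimen_id = c(10, 12),
                           MFI         = c(1.5, 2.0))

missing_ids <- subjects_without_titer(toy_subject, toy_specimen, toy_titer)
print(missing_ids)
```

With the real tables, the call would be `subjects_without_titer(subject, specimen, titer)`, using the objects loaded above.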

Thanks!

Thanks, @barry_test, for pointing out these problems.

  1. We don’t have subject_id 113 in the data. We do have subject_id 98 for all three baseline values (specimen_ids 740, 741 and 742). Oddly, data for these three specimens was missing from the Ab titer data file uploaded to the website, as students pointed out. We have replaced that file on the website with the correct one, and we have also updated the data tracking post here so that the file replacement is properly documented. These changes should also be reflected in the corresponding API endpoint.

  2. I can confirm that we don’t have Ab titer data for the following subjects: 2, 8, 82, 87, 88 and 113.
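
To confirm that a given copy of the titer table includes the three baseline specimens mentioned above, a check along these lines could be used. It is a rough sketch rather than an official tool: the helper name `specimens_missing_titer` and the toy table are invented for illustration, and it assumes the titer table carries a `specimen_id` column.

```r
# Hypothetical helper: return the expected specimen_ids that have no rows
# in the titer table.
specimens_missing_titer <- function(titer, specimen_ids) {
  setdiff(specimen_ids, unique(titer$specimen_id))
}

# Toy titer table for illustration only: specimen 741 is deliberately absent
toy_titer <- data.frame(specimen_id = c(740, 742),
                        MFI         = c(1.2, 0.8))

still_missing <- specimens_missing_titer(toy_titer, c(740, 741, 742))
print(still_missing)
```

An empty result for the real titer table would indicate that all three specimens are present.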

Feel free to let us know if there are other issues.

Best,
Pramod