As the 2nd challenge progresses, contestants may notice inconsistencies or other issues in the dataset, so the challenge datasets may be modified over time. This page organizes and tracks all changes to the datasets. Older (legacy) versions of the dataset are stored in the legacy repository, and the updated (current) datasets are available here.
…/legacy/…/2023-10-05
- Datasets are made available to 2nd challenge contestants via API and direct download.
- This version of the dataset can be found in the legacy repository here.
…/legacy/…/2023-12-04
- A few contestants reported issues when accessing the data files. The identified issues include:
  - **Inconsistencies in the actual dates relative to the boost.** A more detailed discussion on this can be found here.
  - **The names of cell populations in the prediction dataset differed from those in the training dataset.** A more detailed discussion on this can be found here.
…/legacy/…/2023-12-21
- Our student contestants reported that antibody titer data for subject_id 98 was missing from the “2022BD_plasma_ab_titer.tsv” file. We confirmed that the data was missing, fixed the issue, and replaced the old data files with a corrected file that includes the antibody titer data for subject_id 98 (specimen_ids 740, 741, 742).
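Contestants who downloaded the file before this fix may want to confirm that their local copy contains the restored rows. The following is a minimal sketch of such a check, not part of the official challenge code. It assumes the file is tab-separated with a header row and has a column named `specimen_id`; the function name `missing_specimen_ids` and the column name are illustrative assumptions and should be adjusted to match the actual file.

```python
import csv
from typing import Iterable, Set, TextIO

# Specimen IDs listed in this changelog entry for subject_id 98.
EXPECTED_SPECIMEN_IDS = {740, 741, 742}


def missing_specimen_ids(
    handle: TextIO,
    expected_ids: Iterable[int] = EXPECTED_SPECIMEN_IDS,
    id_column: str = "specimen_id",  # assumed column name, not confirmed by this page
) -> Set[int]:
    """Return the expected specimen IDs that have no row in the TSV."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames is None or id_column not in reader.fieldnames:
        raise KeyError(f"column {id_column!r} not found in header")

    seen = set()
    for row in reader:
        value = (row[id_column] or "").strip()
        if value.isdigit():  # skip blank or non-numeric IDs
            seen.add(int(value))
    return set(expected_ids) - seen
```

For a local copy, `missing_specimen_ids(open("2022BD_plasma_ab_titer.tsv", newline=""))` should return an empty set if the corrected file is in place; any IDs it returns are still absent, which suggests the file predates this update.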
…/current/… Current and final dataset version (updated on Jan 05, 2024)
In response to a suggestion from one of the contestants, we have processed the prediction dataset in a manner similar to the training dataset to make prediction easier. We provide both the processed data and the relevant code. This is the current version of the challenge dataset and is accessible here. You can still access the old data file here.