Duplicated actual days for some subjects' specimens need clarification

Hi, I found duplicated actual days for some individual subjects, as shown below. Can you clarify them? For example, subject_id = 2 has two specimens associated with actual day 1. The affected subject IDs are 2, 48, 49, 54, and 55, all from the 2020 dataset. No issue was found in the 2021 dataset.

row  specimen_id  subject_id  actual_day_relative_to_boost  planned_day_relative_to_boost  specimen_type  visit  dataset
6    11           2           1                             0                              Blood          1      2020_dataset
7    12           2           1                             1                              Blood          2      2020_dataset
8    13           2           3                             3                              Blood          3      2020_dataset
9    14           2           7                             7                              Blood          4      2020_dataset
10   15           2           14                            14                             Blood          5      2020_dataset

row  specimen_id  subject_id  actual_day_relative_to_boost  planned_day_relative_to_boost  specimen_type  visit  dataset
232  369          48          -63                           0                              Blood          1      2020_dataset
233  370          48          1                             1                              Blood          2      2020_dataset
234  371          48          7                             3                              Blood          3      2020_dataset
235  372          48          7                             7                              Blood          4      2020_dataset
236  373          48          14                            14                             Blood          5      2020_dataset

row  specimen_id  subject_id  actual_day_relative_to_boost  planned_day_relative_to_boost  specimen_type  visit  dataset
237  376          49          -56                           0                              Blood          1      2020_dataset
238  377          49          1                             1                              Blood          2      2020_dataset
239  378          49          7                             3                              Blood          3      2020_dataset
240  379          49          7                             7                              Blood          4      2020_dataset
241  380          49          14                            14                             Blood          5      2020_dataset

row  specimen_id  subject_id  actual_day_relative_to_boost  planned_day_relative_to_boost  specimen_type  visit  dataset
262  412          54          -36                           0                              Blood          1      2020_dataset
263  413          54          1                             1                              Blood          2      2020_dataset
264  414          54          7                             3                              Blood          3      2020_dataset
265  415          54          7                             7                              Blood          4      2020_dataset
266  416          54          14                            14                             Blood          5      2020_dataset

row  specimen_id  subject_id  actual_day_relative_to_boost  planned_day_relative_to_boost  specimen_type  visit  dataset
267  419          55          -8                            0                              Blood          1      2020_dataset
268  420          55          1                             1                              Blood          2      2020_dataset
269  421          55          7                             3                              Blood          3      2020_dataset
270  422          55          7                             7                              Blood          4      2020_dataset
271  423          55          14                            14                             Blood          5      2020_dataset
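
A check along these lines reproduces the finding. This is only a minimal sketch in plain Python, not the exact code I ran: the `rows` list is hand-copied from the tables above for subjects 2 and 48, and `find_duplicate_days` is a hypothetical helper name.

```python
from collections import defaultdict

# (specimen_id, subject_id, actual_day_relative_to_boost), hand-copied from
# the tables above for subjects 2 and 48 only (illustrative subset).
rows = [
    (11, 2, 1), (12, 2, 1), (13, 2, 3), (14, 2, 7), (15, 2, 14),
    (369, 48, -63), (370, 48, 1), (371, 48, 7), (372, 48, 7), (373, 48, 14),
]


def find_duplicate_days(rows):
    """Return {(subject_id, actual_day): [specimen_id, ...]} for every
    subject/day pair that is shared by more than one specimen."""
    groups = defaultdict(list)
    for specimen_id, subject_id, actual_day in rows:
        groups[(subject_id, actual_day)].append(specimen_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


duplicates = find_duplicate_days(rows)
for (subject_id, day), specimen_ids in sorted(duplicates.items()):
    print(f"subject {subject_id}: day {day} -> specimens {specimen_ids}")
```

Grouping on the (subject_id, actual day) pair rather than on the day alone keeps specimens from different subjects from being flagged against each other.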

Thanks, @Joe, for raising this query. We are double-checking the subject and specimen information with our clinical core once more. I will post an update here shortly.

Update 2023-10-12

@Joe We found some typos in the original data tables, and they have now been corrected. The corrected entries are listed below, in the same column order as the tables above:

  • 6 11 2 0 0 Blood 1 2020_dataset
  • 234 371 48 3 3 Blood 3 2020_dataset
  • 239 378 49 3 3 Blood 3 2020_dataset
  • 264 414 54 3 3 Blood 3 2020_dataset
  • 269 421 55 3 3 Blood 3 2020_dataset
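
To confirm the fix on your side, something like the following sketch may help. It is an illustration only: `corrections` maps each specimen_id to the corrected actual day from the list above, `original` holds the subject 55 rows as first posted, and the helper names `apply_corrections` and `has_duplicate_days` are hypothetical.

```python
from collections import Counter

# specimen_id -> corrected actual_day_relative_to_boost, from the list above.
corrections = {11: 0, 371: 3, 378: 3, 414: 3, 421: 3}

# (specimen_id, subject_id, actual_day) for subject 55 as originally posted.
original = [(419, 55, -8), (420, 55, 1), (421, 55, 7), (422, 55, 7), (423, 55, 14)]


def apply_corrections(rows, corrections):
    """Replace the actual day of any specimen listed in `corrections`."""
    return [(spec, subj, corrections.get(spec, day)) for spec, subj, day in rows]


def has_duplicate_days(rows):
    """True if any subject has two or more specimens on the same actual day."""
    counts = Counter((subj, day) for _, subj, day in rows)
    return any(n > 1 for n in counts.values())


corrected = apply_corrections(original, corrections)
print(has_duplicate_days(original))   # True: specimens 421 and 422 share day 7
print(has_duplicate_days(corrected))  # False: specimen 421 is now on day 3
```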

We also noted a few suggestions that led to changes in the datasets. I will provide a comprehensive description of all the modified datasets in a separate post.