How should we handle samples that were not taken at the expected dates?

Connie · August 13, 2020, 3:51pm

Questions from Joaquin:
I was wondering if I could get more information about the samples. It seems that some subjects/samples did not make the exact appointment times/intervals (preboost, 1, 3, 7, 14). For example, in the CyTOF dataset I can find samples that were taken at days 8, 15, and 18; for RNA-seq I can find samples at days 2, 6, 8, 11, 13, 15, 16, and 18; for the antibody titer data, samples were taken on many different days; lastly, I took a glance at the OLINK data and it looks like there is only preboost data?

How should we handle samples that were not taken at the expected dates? Should we somehow assign them to the nearest date (this would be odd for day 2)? Has Riccardo already done this in some way, and could we just add another column to the table labeled associated_timepoint or something along those lines? Also, how long will it be before all OLINK data is uploaded?

Connie · August 13, 2020, 4:17pm

Reply from Ferran:

I assume that Joaquin is using the actual days post-boost instead of the planned ones, and he has a point. So I think this question should be answered by someone who is more knowledgeable about how to analyze data with days that are not exactly the ones we planned experimentally. I am aware that in the new cohort (TdaP Short 4) we are recruiting with restricted “time windows”, but I am not sure how the previous cohort was recruited. Regarding the specific mentions of the CyTOF and RNA-seq data, this is quite concerning if he is referring to a single donor with those timepoints, and if so that donor should probably be excluded. If instead these are examples of days across all donors, I assume they have to decide how to treat them? Should we give some input here?

Connie · August 13, 2020, 4:18pm

Reply from Ricardo:

I just think that they should use the approximate time points that we already used for the paper. Otherwise, the analysis could be underpowered.
Yes, this might reflect the biological kinetics of the vaccination response less accurately. But taking the whole approach as a learning curve, which I think is the main goal of this grant, mistakes are being corrected, and the new recruitment already collects blood in a narrower window, so predictions ought to become more accurate over time.

Connie · August 13, 2020, 4:18pm

Reply from Pramod:

Creating an ‘associated time-points’ column is straightforward using the information in ‘days relative to boost’ and ‘visit’. It can be done quickly with a script. However, I feel that having two columns, ‘associated time-points’ and ‘days relative to boost’, would again be confusing. Please advise.

I believe all existing Olink data has already been uploaded and can be accessed at:

https://staging.cmi-pb.org/db/olink_prot_exp
https://staging.cmi-pb.org/db/olink_prot_info
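
For illustration only, here is a minimal sketch of how an ‘associated time-points’ value could be derived from ‘days relative to boost’. It is not the script used for the database. The function name, the use of day 0 to stand for the preboost visit, the nearest-day rule, the tie handling, and the optional `max_deviation` cutoff are all assumptions made for this example; the planned days (preboost, 1, 3, 7, 14) come from the question above.

```python
# Planned days relative to boost. Using 0 for the preboost visit is an
# assumption of this sketch, not something stated in the thread.
PLANNED_DAYS = (0, 1, 3, 7, 14)


def associate_timepoint(actual_day, planned_days=PLANNED_DAYS, max_deviation=None):
    """Map an actual day relative to boost to the nearest planned day.

    Returns a tuple (planned_day, deviation, ambiguous):
      planned_day  nearest planned day, or None if the sample lies more than
                   max_deviation days from every planned day
      deviation    actual_day - planned_day (None if planned_day is None)
      ambiguous    True if two planned days are equally close (e.g. day 2)
    """
    # Assumption: every sample taken on or before the boost day counts as
    # preboost, however early it was drawn.
    if actual_day <= 0:
        return planned_days[0], actual_day - planned_days[0], False

    distances = [abs(actual_day - p) for p in planned_days]
    best = min(distances)
    if max_deviation is not None and best > max_deviation:
        return None, None, False

    candidates = [p for p, d in zip(planned_days, distances) if d == best]
    nearest = candidates[0]  # a tie resolves to the earlier planned day
    return nearest, actual_day - nearest, len(candidates) > 1


if __name__ == "__main__":
    # Days reported in the question for the RNA-seq samples.
    for day in (2, 6, 8, 11, 13, 15, 16, 18):
        print(day, associate_timepoint(day, max_deviation=3))
```

Returning the deviation and an ambiguity flag alongside the planned day keeps the actual day recoverable, so analysts can decide for themselves whether to keep, reassign, or exclude a sample such as day 2 or day 18.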

Connie · August 13, 2020, 4:19pm

Reply from Bjoern:

We need to distinguish between ‘planned visit dates’ and ‘actual visit dates’. I wonder if that would have made the entire analysis of the previous paper better / more consistent. And yes, we need to tighten up the recruitment time courses.