Completed on 6 Jul 2017 by James P. B. Lloyd. Sourced from http://www.biorxiv.org/content/early/2017/06/26/156190.
Author response is in blue.
First off, really great work. It is wonderful to see the relative importance of transcription and translation characterized in a developmental context such as this. I guess it should not be a surprise that transcription plays a larger role than translation, but translation still plays a major role, as demonstrated here. It was nice to see approaches that allow for gene-level and transcript-level evaluation of translation.
A major finding of this work was linking 3’ UTR length to translational control in the developmental context of neurogenesis. But it was unclear to me what exactly was meant by 3’ UTR length change and how this was determined. I believe that you are using a static reference transcriptome annotation (Ensembl GRCh38 v84 transcripts). In Fig 6A, the analysis compares transcript isoforms from within the same gene to each other and finds that longer 3’ UTRs were enriched in the light polysomes in older neurons. Is this correct? How comprehensively does the transcriptome capture the diversity of 3’ UTR ends, and do the annotated ends reflect the biology of the neurons? By limiting yourself to these reference transcripts and predicted termination sites, you might be limiting your ability to find an effect. This phenomenon might be more prevalent than you can assess with this transcriptome.
“Equal volumes of samples from the 2-4 ribosomes and the 5-8+ ribosomes fractions were then pooled” – Could this introduce any artifacts into the normalization, simply adding equal volumes of the samples? Did you account for the amount of RNA that was in each sample relative to the others to ensure that you were not over-estimating the amount of (for example) poly5 fraction relative to poly7 fraction?
Fig 2A – it seems strange that in the plot, the largest group, which I think is the no-change group (black), appears to have the fewest dots. I assume this is because they are all stacked on top of each other in the middle. Perhaps another colour might help show how dense this group is?
Fig 2A – it might be worth highlighting in the figure label that the changes are significant changes.
Fig 4A – In MeCP2, there are two differences in the transcripts, as you point out: a cassette exon and a longer 3’ UTR. Do you believe this transcript model to be accurate, given that the two events are too distant from each other for short reads to link? Is it clear from your mapping which event (exon, 3’ UTR or both) is giving the signal for change in abundance across the fractions?
Did you check for contamination of feeder MEFs in the hESC RNA-seq data? Is this a concern? Would it only be a problem for the hESCs and not the other cell types?
Thanks for your comments James and my apologies for taking so long to reply! I appreciate your input and these are all helpful. Some thoughts below.
Re: your major comment on transcriptome annotations, this is indeed a serious issue in general. We elected to use Ensembl GRCh38 v84 as this is a standard annotation set, facilitating evaluation by researchers and comparison between datasets. However, you’re absolutely correct that there are expressed regions in the data that are not found in the annotated transcriptome. This applies not only to the 3’ UTR extensions but also to alternative transcription start sites and other RNA processing events. I suspect you’re correct that what we report here is a lower bound on the effect of 3’ UTRs in brain. In general, and this is also true re: your later comment about correlated transcript processing events in MeCP2, I hope this issue will be partly remedied by the increased use of long read sequencing of RNA.
Re: normalization of samples, we elected to add equal volumes of samples together to mirror the volume of each polysome peak. Peaks were collected into individual tubes, which therefore contain all the RNAs in a peak. Combining equal volumes preserves the original polysome profile, since some peaks will have lower concentration of RNA (smaller peaks in the polysome) and some higher concentration (larger peaks in the polysome).
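To make the arithmetic behind this concrete, here is a minimal sketch. It assumes that every peak was collected into the same total volume, so that concentration tracks peak size; the fraction names and concentrations are made up for illustration and are not values from the study. It compares pooling equal volumes with pooling equal RNA mass.

```python
# Hypothetical RNA concentrations (ng/uL) for four polysome fractions.
# These numbers are illustrative only; they are not from the study.
EXAMPLE_CONC = {"poly5": 40.0, "poly6": 30.0, "poly7": 20.0, "poly8plus": 10.0}


def equal_volume_shares(conc, aliquot_ul=10.0):
    """Share of pooled RNA contributed by each fraction when the same
    volume is taken from every tube."""
    masses = {name: c * aliquot_ul for name, c in conc.items()}
    total = sum(masses.values())
    return {name: mass / total for name, mass in masses.items()}


def equal_mass_shares(conc):
    """Share of pooled RNA contributed by each fraction when the same
    RNA mass is taken from every tube (concentration is ignored)."""
    n = len(conc)
    return {name: 1.0 / n for name in conc}


if __name__ == "__main__":
    print("equal volume:", equal_volume_shares(EXAMPLE_CONC))
    print("equal mass:  ", equal_mass_shares(EXAMPLE_CONC))
```

Under that assumption, equal-volume pooling gives each fraction a share proportional to its concentration, and so to its peak size, whereas equal-mass pooling would flatten the profile so that every fraction contributes the same amount.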
Re: Figure 2A, you’re correct that the black dots are all stacked on top of each other. A density plot would reflect this better, but we went with the scatter plot as the focus of this panel is on significantly changed genes and not unchanged genes. We’ve added a label indicating these “n” values are for genes with p < 0.01 between conditions.
Re: Figure 4A, I mentioned this above but it’s challenging to detect long range correlated transcript processing events using short read data. Long read sequencing would help considerably here.
Re: feeder MEFs, thanks much for bringing this up. Only the hESCs are grown on irradiated MEFs – the differentiated cell types do not have feeders. As such, any effect from MEFs would only influence the hESC data. MEF contamination was minimized in three ways. First, we used collagenase to dissociate hESC colonies, which does not dissociate most MEFs. Second, we harvested hESCs by sedimentation at normal gravity, which enriches for hESCs versus MEFs as the ES cells grow in colonies while MEFs grow as single cells. Lastly, we examined the fraction of reads that map uniquely onto a concatenated human and mouse metagenome, which has been used previously to analyze species-specific alignment of reads. Reads that mapped uniquely onto one genome were treated as having originated from cells of that species, human or mouse in this case. In most cases the percent of reads that uniquely mapped onto mouse was less than 1% and in all cases it was less than 2%. Reads that aligned uniquely to mouse were discarded. We will update the methods section to reflect this and apologize for its accidental omission.
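The read-assignment step can be sketched roughly as follows. This is an illustration and not the pipeline we ran: it assumes contigs in the concatenated reference carry a species prefix (the hypothetical "hs_" and "mm_" here), that alignments have already been parsed into a mapping from read ID to the contigs it aligned to, and that the mouse percentage is taken over uniquely mapped reads only.

```python
from collections import Counter

# Hypothetical contig-name prefixes in the concatenated human+mouse reference.
HUMAN_PREFIX = "hs_"
MOUSE_PREFIX = "mm_"


def classify_read(contigs):
    """Assign a read to a species from the contigs it aligned to.

    Only reads with exactly one alignment are assigned to a species;
    everything else is labelled 'unmapped' or 'multi'.
    """
    if len(contigs) == 0:
        return "unmapped"
    if len(contigs) > 1:
        return "multi"
    contig = contigs[0]
    if contig.startswith(HUMAN_PREFIX):
        return "human"
    if contig.startswith(MOUSE_PREFIX):
        return "mouse"
    return "other"


def summarize(alignments):
    """Count reads per class and compute percent mouse among unique reads."""
    counts = Counter(classify_read(c) for c in alignments.values())
    unique = counts["human"] + counts["mouse"]
    pct_mouse = 100.0 * counts["mouse"] / unique if unique else 0.0
    return counts, pct_mouse


def keep_human(alignments):
    """Return IDs of reads that map uniquely to human; the rest are dropped."""
    return [rid for rid, c in alignments.items() if classify_read(c) == "human"]


if __name__ == "__main__":
    toy = {
        "r1": ["hs_chr1"],
        "r2": ["hs_chr2"],
        "r3": ["mm_chr1"],
        "r4": ["hs_chr1", "mm_chr1"],
        "r5": [],
    }
    counts, pct = summarize(toy)
    print(dict(counts), round(pct, 2), keep_human(toy))
```

Note that this sketch keeps only uniquely human reads, which also drops multi-mapped reads; the response above states only that reads aligning uniquely to mouse were discarded.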