Completed on 28 Jun 2017 by Palle Villesen. Sourced from http://biorxiv.org/content/early/2017/06/18/146340.
Dear authors - interesting work!
What about overfitting/data dredging in your work? You state: "The reported result of assessment is based on the average f-measure for the 10-folds for testing dataset."
When you go from genes to isoforms you also increase the number of predictor variables, which makes overfitting more possible (though not necessarily more likely).
I couldn't find the variance of these f-measures across the CV folds - a very high variance is normally a signature of overfitting.
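To make the point concrete, here is a minimal sketch (synthetic stand-in data, not the paper's dataset or model) of reporting the per-fold spread alongside the mean:

```python
# Sketch: report the spread of per-fold f-measures from 10-fold CV, not just
# the average. A large standard deviation across folds is a warning sign.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for an expression matrix (shapes are assumptions).
X, y = make_classification(n_samples=200, n_features=50, random_state=0)

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=10, scoring="f1")
print(f"F1 mean={scores.mean():.3f}  sd={scores.std():.3f}")
```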
For a full analysis I would suggest splitting your dataset into a training set (MCC or F estimated by CV on this set) and a validation set (MCC or F estimated by fitting the final model to the full training set and evaluating on this held-out set). This is very close to what is done in kaggle competitions etc., where you measure your performance yourself (internal performance) but also need to predict on new data (external performance). If these two measures are very different, the chosen model is not good.
Check "Comparison of RNA-seq and microarray-based models for clinical endpoint prediction". The problem is that when you use CV to compare and select the best model, you may end up with the model that accidentally fits your dataset best under CV (data dredging). So basically you would like to see a nice correlation between training performance (internal performance) and validation performance (external performance) - and only use internal performance to rank models/parameters.
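The model-selection part of that argument could be sketched like this (candidate models, data and hyperparameter grid are all assumptions for illustration): rank candidates by internal CV only, and use the hold-out scores solely to check that internal and external performance track each other:

```python
# Sketch: select among candidate models by internal CV score alone;
# the hold-out (external) scores are only used as a sanity check.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=400, n_features=50, random_state=1)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25,
                                            stratify=y, random_state=1)

results = []
for C in [0.01, 0.1, 1.0, 10.0]:       # candidate models (assumed grid)
    m = LogisticRegression(C=C, max_iter=1000)
    internal = cross_val_score(m, X_tr, y_tr, cv=10, scoring="f1").mean()
    external = f1_score(y_val, m.fit(X_tr, y_tr).predict(X_val))
    results.append((C, internal, external))

# Rank by internal performance only.
best_C, best_internal, best_external = max(results, key=lambda r: r[1])
print(f"best C={best_C}  internal={best_internal:.3f}  "
      f"external={best_external:.3f}")
```

If the internal and external columns of `results` do not correlate, CV-based model selection on this dataset is likely dredging noise.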