Completed on 31 Jul 2017 by Peter Rupprecht. Sourced from http://www.biorxiv.org/content/early/2017/06/27/156786.
Thanks for sharing this paper online on bioRxiv. Here, I would like to focus on one aspect of this paper: 'benchmarking without ground truth electrophysiology'.
The paper proposes an algorithm for benchmarking deconvolution methods based on stimulus repetitions, without requiring any electrophysiological ground truth. However, I have the impression that the algorithm needs to be described and analyzed in more detail for future users to understand its potential and its limitations.
Here are some more detailed comments and questions:
1) It is not clear how many stimulus repetitions were used for this analysis. Were there two per stimulus? What was the nature of these stimuli, and did they evoke roughly similarly strong responses?
2) How well do the approximations in 'Stimulus-related variance from two repeats' hold true for real data? Can this be quantified and shown in figures?
3) Out of curiosity, what are typical values of the ratio Var_k(s) / [0.5*(Var_k(r1) + Var_k(r2))] ?
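To make point 3 concrete, here is a minimal synthetic sketch of how such a ratio behaves. I am assuming here (possibly incorrectly with respect to the paper's exact derivation) that the stimulus-related variance Var_k(s) is estimated as the covariance between the two repeats, which is a standard two-repeat estimator when the noise is independent across repeats. All parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: two repeats r1, r2 of the same stimulus set,
# each a (n_stimuli x n_bins) response matrix. The shared stimulus-
# driven signal s is corrupted by independent noise on each repeat.
n_stim, n_bins = 20, 50
s = rng.normal(size=(n_stim, n_bins))          # stimulus-driven signal
r1 = s + 0.5 * rng.normal(size=s.shape)        # repeat 1
r2 = s + 0.5 * rng.normal(size=s.shape)        # repeat 2

# For each stimulus k, the covariance between repeats estimates the
# stimulus-related variance Var_k(s), because the noise terms are
# independent and average out of the cross-term.
var_s = np.array([np.cov(r1[k], r2[k])[0, 1] for k in range(n_stim)])
var_mean = 0.5 * (r1.var(axis=1, ddof=1) + r2.var(axis=1, ddof=1))

ratio = var_s / var_mean
# With signal variance 1 and noise variance 0.25 per repeat, the mean
# ratio should land near 1 / 1.25 = 0.8 for these invented parameters.
print(ratio.mean())
```

The point of the sketch is only that the ratio directly reflects the per-stimulus signal-to-total-variance fraction, which is why typical empirical values would be informative.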
Other points that are not fully clear to me from reading the paper:
4) If performance is tested by measuring correlations between stimulus repetitions, wouldn't a non-deconvolved raw dF/F perform better than it should? In other words, how does the benchmark algorithm penalize algorithms that do not deconvolve at all or not sufficiently?
5) Related to the previous point, the ensemble of stimulus responses "are the binned responses to N stimuli". What is the temporal binning used for the analysis in the paper, and how does this affect point 4) ?
Maybe synthetic data could help to address the latter questions, although I'm not completely sure about it.
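As a first step in that direction, here is a crude synthetic sketch of point 4: spike counts from two repeats of the same rate, smoothed with an exponential kernel as a stand-in for an undeconvolved calcium trace. Everything here (rate shape, kernel, time constant) is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical check: does an undeconvolved, kernel-smoothed trace
# correlate better across repeats than the underlying spike counts?
n_bins = 10000
rate = 0.2 + 0.2 * np.sin(np.linspace(0, 20 * np.pi, n_bins)) ** 2
spikes1 = rng.poisson(rate)   # repeat 1
spikes2 = rng.poisson(rate)   # repeat 2

# Exponential "calcium" kernel mimicking an indicator decay (~tau bins)
tau = 20
kernel = np.exp(-np.arange(5 * tau) / tau)
dff1 = np.convolve(spikes1, kernel)[:n_bins]  # raw-fluorescence proxy
dff2 = np.convolve(spikes2, kernel)[:n_bins]

corr_spikes = np.corrcoef(spikes1, spikes2)[0, 1]
corr_dff = np.corrcoef(dff1, dff2)[0, 1]
# The smoothed (non-deconvolved) traces typically correlate higher,
# because the kernel averages out independent spiking noise.
print(corr_spikes, corr_dff)
```

If this toy behavior carries over to the real benchmark, a trivially non-deconvolving algorithm could score well, which is exactly the concern raised in points 4) and 5).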