BMC Bioinformatics. 2015 Jan 16;16:3. doi: 10.1186/s12859-014-0429-4.

Single-molecule dataset (SMD): a generalized storage format for raw and processed single-molecule data.

Author information

1 Department of Chemical Engineering, Stanford University, Stanford, CA, 94305, USA. max.greenfeld@gmail.com.
2 Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA. max.greenfeld@gmail.com.
3 Department of Statistics, Columbia University, New York, NY, 10027, USA. janwillem.vandemeent@gmail.com.
4 Department of Physics, Stanford University, Stanford, CA, 94305, USA. dmitrip@stanford.edu.
5 Department of Applied Physics, Stanford University, Stanford, CA, 94305, USA. hmabuchi@stanford.edu.
6 Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY, 10027, USA. chris.wiggins@columbia.edu.
7 Department of Chemistry, Columbia University, New York, NY, 10027, USA. rlg2118@columbia.edu.
8 Department of Chemical Engineering, Stanford University, Stanford, CA, 94305, USA. herschla@stanford.edu.
9 Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA. herschla@stanford.edu.
10 Department of Biochemistry, B400, Stanford University, Stanford, CA, 94305, USA. herschla@stanford.edu.

Abstract

BACKGROUND:

Single-molecule techniques have emerged as incisive approaches for addressing a wide range of questions arising in contemporary biological research [Trends Biochem Sci 38:30-37, 2013; Nat Rev Genet 14:9-22, 2013; Curr Opin Struct Biol 28C:112-121, 2014; Annu Rev Biophys 43:19-39, 2014]. The analysis and interpretation of raw single-molecule data benefit greatly from the ongoing development of sophisticated statistical analysis tools that enable accurate inference at the low signal-to-noise ratios frequently associated with these measurements. While a number of groups have released analysis toolkits as open-source software [J Phys Chem B 114:5386-5403, 2010; Biophys J 79:1915-1927, 2000; Biophys J 91:1941-1951, 2006; Biophys J 79:1928-1944, 2000; Biophys J 86:4015-4029, 2004; Biophys J 97:3196-3205, 2009; PLoS One 7:e30024, 2012; BMC Bioinformatics 11(8):S2, 2010; Biophys J 106:1327-1337, 2014; Proc Int Conf Mach Learn 28:361-369, 2013], it remains difficult to compare analyses of experiments performed in different labs due to a lack of standardization.

RESULTS:

Here we propose a standardized single-molecule dataset (SMD) file format. SMD is designed to accommodate a wide variety of computer programming languages, single-molecule techniques, and analysis strategies. To facilitate adoption, we have made two existing single-molecule data analysis packages compatible with this format.

CONCLUSION:

Adoption of a common, standard data file format for sharing raw single-molecule data and analysis outcomes is a critical step for the emerging and powerful single-molecule field, which will benefit both sophisticated users and non-specialists by allowing standardized, transparent, and reproducible analysis practices.

PMID: 25591752
PMCID: PMC4384321
DOI: 10.1186/s12859-014-0429-4
[Indexed for MEDLINE]
Free PMC Article
