Format

Send to

Choose Destination
J Biol Rhythms. 2014 Aug;29(4):231-42. doi: 10.1177/0748730414537788.

Evaluation of five methods for genome-wide circadian gene identification.

Author information

1
CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.
2
Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.
3
Department of Statistics, Texas A&M University, College Station, Texas, USA.
4
CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China zhangzhang@big.ac.cn.

Abstract

Identification of circadian-regulated genes based on temporal transcriptome data is important for studying the regulation mechanism of the circadian system. However, various computational methods adopting different strategies for the identification of cycling transcripts usually yield inconsistent results even for the same dataset, making it challenging to choose the optimal method for a specific circadian study. To address this challenge, we evaluate 5 popular methods, including ARSER (ARS), COSOPT (COS), Fisher's G test (FIS), HAYSTACK (HAY), and JTK_CYCLE (JTK), based on both simulated and empirical datasets. Our results show that increasing the number of total samples (through improving sampling frequency or lengthening the sampling time window) is beneficial for computational methods to accurately identify circadian transcripts and measure circadian phase. For a given number of total samples, higher sampling frequency is more important for HAY and JTK, and the longer sampling time window is more crucial for ARS and COS, as testified on simulated and empirical datasets from which circadian signals are computationally identified. In addition, the preference of higher sampling frequency or the longer sampling time window is also obvious for JTK, ARS, and COS in estimating circadian phases of simulated periodic profiles. Our results also indicate that attention should be paid to the significance threshold that is used for each method in selecting circadian genes, especially when analyzing the same empirical dataset with 2 or more methods. To summarize, for any study involving genome-wide identification of circadian genes from transcriptome data, our evaluation results provide suggestions for the selection of an optimal method based on specific goal and experimental design.

KEYWORDS:

ARSER; COSOPT; Fisher’s G test; HAYSTACK; JTK_CYCLE; circadian gene; circadian rhythms; comparison

PMID:
25238853
DOI:
10.1177/0748730414537788
[Indexed for MEDLINE]

Supplemental Content

Full text links

Icon for Atypon
Loading ...
Support Center