J Clin Epidemiol. 2008 Dec;61(12):1222-6. doi: 10.1016/j.jclinepi.2007.12.008. Epub 2008 Jul 10.

R and S-PLUS produced different classification trees for predicting patient mortality.

Author information

Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada. peter.austin@ices.on.ca

Abstract

OBJECTIVE:

There is a growing interest in using classification and regression trees in biomedical research. R and S-PLUS are two statistical programming languages that share a similar syntax and functionality. Both R and S-PLUS allow users to fit classification and regression trees. The objective was to compare classification trees grown using R with those grown using S-PLUS.

STUDY DESIGN AND SETTING:

Using data on 9,484 patients hospitalized with an acute myocardial infarction, we compared the classification trees for predicting mortality that were grown using R and S-PLUS. We also used repeated split-sample derivation to determine the predictive accuracy of classification trees grown using R and S-PLUS.
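The repeated split-sample procedure described above can be illustrated with a minimal sketch. This is not the authors' code and does not use R or S-PLUS: it assumes a 50/50 derivation/validation split, a 0.5 probability cutoff for accuracy, a one-split tree (a stump) standing in for a full classification tree, and purely synthetic data in which mortality risk rises with age. The function names (`make_synthetic`, `fit_stump`, `auc`, `repeated_split_sample`) are hypothetical.

```python
import random
from statistics import mean


def make_synthetic(n=400, seed=1):
    """Synthetic (age, died) pairs; NOT the study data. Risk rises with age."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        age = rng.randint(40, 90)
        p_death = 0.05 + 0.6 * (age - 40) / 50
        data.append((age, int(rng.random() < p_death)))
    return data


def fit_stump(rows):
    """Fit a one-split tree: choose the threshold with the lowest weighted Gini impurity."""
    best = None
    thresholds = sorted({x for x, _ in rows})[:-1]  # keep both sides non-empty
    for t in thresholds:
        left = [y for x, y in rows if x <= t]
        right = [y for x, y in rows if x > t]
        gini = sum(len(s) * 2 * mean(s) * (1 - mean(s)) for s in (left, right))
        if best is None or gini < best[0]:
            best = (gini, t, mean(left), mean(right))
    _, t, p_left, p_right = best
    # The fitted model returns the observed death rate of the matching leaf.
    return lambda x: p_left if x <= t else p_right


def auc(scores, labels):
    """Discrimination: probability that a death scores higher than a survivor (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_split_sample(data, n_splits=100, seed=0):
    """Average validation AUC and accuracy over repeated random 50/50 splits."""
    rng = random.Random(seed)
    aucs, accs = [], []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        derivation, validation = shuffled[:half], shuffled[half:]
        predict = fit_stump(derivation)  # fit on the derivation half only
        scores = [predict(x) for x, _ in validation]
        labels = [y for _, y in validation]
        aucs.append(auc(scores, labels))
        # Accuracy: classify as death when the predicted probability is at least 0.5.
        accs.append(mean(int((s >= 0.5) == (y == 1)) for s, y in zip(scores, labels)))
    return mean(aucs), mean(accs)


if __name__ == "__main__":
    mean_auc, mean_acc = repeated_split_sample(make_synthetic())
    print(f"mean validation AUC: {mean_auc:.3f}, mean accuracy: {mean_acc:.3f}")
```

Comparing two tree-growing implementations in this framework would mean fitting both on the same derivation half in each repetition and comparing their averaged validation metrics, so that differences reflect the software rather than the particular split.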

RESULTS:

The classification tree grown using R was substantially more parsimonious than the one grown using S-PLUS. The pruned classification tree grown using R was equal to a classification tree that was obtained by removing six subtrees from the pruned classification tree grown using S-PLUS. Repeated split-sample validation was then used to demonstrate that classification trees constructed using S-PLUS had greater discrimination and accuracy compared to classification trees grown using R.

CONCLUSIONS:

R can produce different classification trees than S-PLUS using the same data.

PMID: 18619801
DOI: 10.1016/j.jclinepi.2007.12.008
[Indexed for MEDLINE]
