Description of statistical methods used to model repeated measures of prostate-specific antigen

Andrew J Simpkin; Leila Rooshenas; Julia Wade; Jenny L Donovan; J Athene Lane; Richard M Martin; Chris Metcalfe; Peter C Albertsen; Freddie C Hamdy; Lars Holmberg; David E Neal; Kate Tilling


Simpkin AJ, Rooshenas L, Wade J, et al. Development, validation and evaluation of an instrument for active monitoring of men with clinically localised prostate cancer: systematic review, cohort studies and qualitative study. Southampton (UK): NIHR Journals Library; 2015 Jul. (Health Services and Delivery Research, No. 3.30.)


Appendix 4 Description of statistical methods used to model repeated measures of prostate-specific antigen

Linear mixed models

A linear mixed model (LMM)³⁷,¹²³ for the observed responses Y_ij, i = 1, . . ., n, j = 1, . . ., n_i, where i indexes individuals and j indexes measurements within individuals, can be written as:

Y_{i j} = W_{i} β + X_{i j} γ + Z_{i} u_{i} + ε_{i j},

(9)

where β = (β₀, . . ., β_p) is the (p + 1)-dimensional vector of fixed effects and γ = (γ₁, . . ., γ_q) is the q-dimensional vector of time effects. The random effects u_i ∼ N(0, Σ_u) are assumed here to have an unstructured r × r covariance matrix Σ_u. W_i, X_i and Z_i are the design matrices for the fixed, time and random covariates specified in the model, respectively. The ε_ij ∼ N(0, Σ_ε) are the residuals from the model. This model assumes that the relationship between outcome and covariate is linear. This may not always be the case, and so we discuss two methods for accommodating non-linear relationships within the mixed-models framework.
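One consequence of the model above is that measurements on the same individual are correlated through the shared random effects. The following is a minimal sketch of that implied covariance, Z_i Σ_u Z_i^T plus the residual variance on the diagonal. It assumes, purely for illustration, a random intercept and a random slope (so r = 2 and each row of Z_i is (1, t_ij)) and independent residuals with a common variance; the function name and the numerical values are hypothetical and are not taken from the report.

```python
def marginal_covariance(times, sigma_u, resid_var):
    """Covariance of one individual's responses: Z Sigma_u Z^T + resid_var * I.

    Assumption (illustrative only): random intercept and random slope, so
    each row of Z is (1, t_ij), and residuals are independent with a
    common variance resid_var.
    """
    z = [[1.0, t] for t in times]
    n = len(times)
    cov = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            # z_a^T Sigma_u z_b: covariance induced by the shared random effects
            shared = sum(z[a][k] * sigma_u[k][l] * z[b][l]
                         for k in range(2) for l in range(2))
            # residual variance only contributes on the diagonal
            cov[a][b] = shared + (resid_var if a == b else 0.0)
    return cov


# Hypothetical values: Var(u0) = 1, Var(u1) = 2, Cov(u0, u1) = 0.5
sigma_u = [[1.0, 0.5], [0.5, 2.0]]
V = marginal_covariance([0.0, 1.0], sigma_u, resid_var=0.25)
for row in V:
    print(row)
```

With a random slope, the variance of the response changes with time, which is why the diagonal entries of V differ between the two measurement occasions.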

Fractional polynomials

Royston and Altman¹²⁴ suggest a simple set S = (–2, –1, –0.5, 0, 0.5, 1, 2, 3) of powers for transforming a covariate in a multiple regression setting (the power 0 is taken as log X). Fractional polynomials (FPs) are used to model a response which is non-linear in some covariates. FP1 denotes a first-degree fractional polynomial, in which the transformation of a single covariate X is captured in a single term (i.e. X^q, q ∈ S). A second-degree fractional polynomial (FP2) transforms a covariate X as X^q = X^(q₁, q₂), where:

X^{q} = X^{(q_{1}, q_{2})} = \begin{cases} (X^{q_{1}}, X^{q_{2}}), & q_{1} \neq q_{2} \\ (X^{q_{1}}, X^{q_{1}} \log X), & q_{1} = q_{2} \end{cases}

(10)
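The transformation in equation (10) can be sketched in a few lines. The function names below are hypothetical; the sketch assumes a strictly positive covariate, since the logarithm and the negative and fractional powers are otherwise undefined.

```python
import math

POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)  # the set S from the text


def power_term(x, q):
    """X^q, with the convention that the power 0 means log X."""
    return math.log(x) if q == 0 else x ** q


def fp_terms(x, powers):
    """Return the FP1 term (one power) or the FP2 pair (two powers) for x > 0."""
    if x <= 0:
        raise ValueError("fractional polynomial terms need a positive covariate")
    if len(powers) == 1:
        return [power_term(x, powers[0])]
    q1, q2 = powers
    first = power_term(x, q1)
    if q1 != q2:
        return [first, power_term(x, q2)]
    # repeated power: the second term is the first multiplied by log X
    return [first, first * math.log(x)]


print(fp_terms(2.0, (1, 3)))       # distinct powers: X and X^3
print(fp_terms(4.0, (0.5, 0.5)))   # repeated power: sqrt(X) and sqrt(X) log X
```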

The best FP degree and power(s) are found through first fitting all models for each degree and choosing the model from each degree (i.e. FP1, FP2) with the lowest deviance. Once the ‘best’ model of each degree has been selected, a closed form algorithm¹²⁵ is used to find the best model. This algorithm provides a straightforward hypothesis test of the linearity assumption, which should be checked before continuing with a linear model.
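As a rough illustration of the first stage of this search, the sketch below fits every FP1 power from S by ordinary least squares to made-up, noise-free data and keeps the power with the smallest residual sum of squares. This is a simplification under stated assumptions: it uses a single-level regression rather than a mixed model, it relies on the fact that, for Gaussian models with the same number of terms, the lowest residual sum of squares corresponds to the lowest deviance, and it does not implement the subsequent comparison between degrees. All function names and data are hypothetical.

```python
import itertools
import math

POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)  # the set S from the text


def transform(x, q):
    """X^q, with the power 0 taken as log X."""
    return math.log(x) if q == 0 else x ** q


def rss_simple_regression(z, y):
    """Residual sum of squares from regressing y on an intercept and z."""
    n = len(z)
    zbar, ybar = sum(z) / n, sum(y) / n
    szz = sum((zi - zbar) ** 2 for zi in z)
    szy = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    slope = szy / szz
    intercept = ybar - slope * zbar
    return sum((yi - intercept - slope * zi) ** 2 for zi, yi in zip(z, y))


def best_fp1_power(x, y):
    """Fit all eight FP1 models and return the power with the lowest RSS."""
    return min(POWERS,
               key=lambda q: rss_simple_regression([transform(xi, q) for xi in x], y))


# Made-up data generated exactly from a square-root relationship
x = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
y = [1.0 + 2.0 * math.sqrt(xi) for xi in x]
best = best_fp1_power(x, y)

# FP2 candidates: all unordered pairs of powers, repeats allowed
fp2_candidates = list(itertools.combinations_with_replacement(POWERS, 2))
print(best, len(fp2_candidates))
```

The same loop extends to FP2 by fitting each of the candidate pairs with a two-term regression.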

Fractional polynomials were developed primarily for standard regression analysis but are easily extended to a multilevel framework.⁹⁸,¹²⁶ The model is written as in equation (9), where the time coefficient γ = (γ₁, . . ., γ_m) is a vector of length m, corresponding to the degree of fractional polynomial used. For example, an FP2 mixed model, taking powers 1 and 3 from S, with no fixed covariates W, would be:

Y_{i j} = β_{0} + X_{i j} γ_{1} + X_{i j}^{3} γ_{2} + u_{0 i} + X_{i j} u_{1 i} + X_{i j}^{3} u_{2 i} + ε_{i j}.

(11)

By incorporating a random intercept and random effects for all terms involving time (X_ij), the individuals may vary about the typical intercept and polynomial trend (linear and cubic trend here).

Regression splines

Splines are piecewise polynomial functions that allow the response to be modelled differently in separate intervals of the time covariate X. This is done by introducing knots or breakpoints to partition the range of X. In statistical analyses, spline functions may offer the flexibility required to describe accurately non-linear patterns which may exist in the relationship between variables. A linear spline basis for a single covariate leads to the so-called broken stick model, where the fit is K + 1 line segments joined end to end at the K knots. Only the linear spline model is considered here, but higher degree spline bases may be used to capture a conceptually smoother process.

Regression spline mixed models (RSMMs) are simply LMMs with a reparameterisation of the time covariates.¹²⁷–¹²⁹ The model can be written as in equation (9) with time coefficients γ = (γ₁, . . ., γ_s), where s is the sum of the degree of the spline basis and the number of knots used. For example, taking one knot at time τ for all individuals, an RSMM with no fixed covariates W is:

Y_{i j} = β_{0} + X_{i j} γ_{1} + {(X_{i j} - τ)}_{+} γ_{2} + u_{0 i} + X_{i j} u_{1 i} + {(X_{i j} - τ)}_{+} u_{2 i} + ε_{i j},

(12)

where a₊ = a if a > 0 and 0 otherwise. This model allows each individual to have their own intercept (β₀ + u_0i), their own slope (γ₁ + u_1i) and their own adjustment to this slope after the knot (γ₂ + u_2i). Regression splines also offer a framework for checking the assumption of linearity of the response in the predictor. If any of the estimated coefficients on the knot terms (γ₂ here) has a confidence interval (CI) that does not contain 0, this gives evidence against the linearity assumption.
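The broken stick in equation (12) can be evaluated directly. The sketch below, with hypothetical function names and made-up parameter values, computes the fitted value for one individual and shows how the slope changes at the knot.

```python
def pos(a):
    """Truncation function a_+ : a if a > 0, otherwise 0."""
    return a if a > 0 else 0.0


def trajectory(x, beta0, gamma1, gamma2, tau, u=(0.0, 0.0, 0.0)):
    """Fitted value from equation (12), without the residual term.

    u = (u0, u1, u2) holds one individual's random effects; the default
    of zeros gives the typical (population-average) broken stick.
    """
    u0, u1, u2 = u
    return (beta0 + u0) + (gamma1 + u1) * x + (gamma2 + u2) * pos(x - tau)


# Made-up values: knot at time 2, slope 0.5 before it and 0.5 + 0.3 after it
params = dict(beta0=1.0, gamma1=0.5, gamma2=0.3, tau=2.0)
typical_before = trajectory(1.0, **params)
typical_after = trajectory(4.0, **params)

# An individual with a higher intercept, steeper slope and smaller adjustment
individual_after = trajectory(4.0, u=(0.2, 0.1, -0.1), **params)
print(typical_before, typical_after, individual_after)
```

Before the knot the truncated term is zero, so only the intercept and the first slope contribute; after the knot the slope is the sum of the two slope coefficients.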

Functional principal components analysis

Functional data analysis¹³⁰ offers an extension of non-parametric smoothing to repeated measures data. These methods allow for flexible curves to be fitted to each member of a group and are very useful when forgoing any assumptions about the shape of these curves. However, for a large number of individuals with irregular measurement times, several functional data analysis methods become inefficient. Yao et al.¹³¹ propose a version of functional principal components analysis (FPCA) whereby sparse and irregular longitudinal data (such as those in our example) can be modelled.

The process of FPCA is illustrated in Figure 14 (constructed using the PACE example in MATLAB). First, the hierarchical structure of the data is ignored and a smooth curve is fitted to the pooled data. This estimate of the mean is then used to build up a matrix of covariances, which represent the deviations from the mean for each pair of time points. For example, two residuals on the same side of the fitted mean curve would contribute positively to the covariance; two on opposite sides would contribute negatively.

In order to construct the required two-dimensional curves, this three-dimensional covariance matrix (or surface) is decomposed (or summarised) into a linear combination of orthogonal eigenfunctions weighted by their eigenvalues. These eigenfunctions, known as functional principal components (FPCs), act as a basis on which individual trajectories can be constructed. In terms of mixed models, the FPCs can be seen as patterns of within-subject variance left over after the mean fit. The first FPC summarises the main pattern of variation from the mean. The second FPC, which is orthogonal to the first, explains the next main pattern of variation from the mean, and so on. Thus, the data are now summarised in terms of the mean pattern and functions of the variation from the mean.

The final step is to construct individual curves using these summaries of the complex data. The individuals' curves are found by multiplying the FPCs by scaling factors (FPC scores), which quantify the extent to which the individual's trajectory correlates with the corresponding FPC (or pattern of variation). For instance, if an individual follows the mean pattern exactly, their FPC scores would be zero. Adding the individually scaled FPCs to the mean fit results in a smoothed curve for each member of the cohort.
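The four steps can be sketched on a toy example. This is a deliberately simplified illustration and not the PACE method: it assumes every individual is measured without error on the same regular time grid, so no smoothing is needed and scores can be computed by direct projection, whereas the sparse, irregular setting described above requires smoothing and a different score estimate. Only the first FPC is extracted, using power iteration, and all names and data are hypothetical.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def mean_curve(curves):
    """Step (a): pointwise mean of the pooled curves (no smoothing here)."""
    n = len(curves)
    return [sum(c[j] for c in curves) / n for j in range(len(curves[0]))]


def covariance_matrix(curves, mean):
    """Step (b): covariance of the deviations from the mean for each pair of times."""
    n, m = len(curves), len(mean)
    resid = [[c[j] - mean[j] for j in range(m)] for c in curves]
    return [[sum(r[a] * r[b] for r in resid) / n for b in range(m)]
            for a in range(m)]


def leading_component(cov, iterations=200):
    """Step (c): first FPC as the unit-norm leading eigenvector (power iteration).

    Assumes cov is not the zero matrix and that the starting vector is not
    orthogonal to the leading eigenvector.
    """
    m = len(cov)
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [dot(row, v) for row in cov]
        norm = math.sqrt(dot(w, w))
        v = [wi / norm for wi in w]
    return v


def fit_curve(curve, mean, phi):
    """Step (d): FPC score by projection, then mean + score * FPC."""
    score = dot([c - mu for c, mu in zip(curve, mean)], phi)
    return score, [mu + score * p for mu, p in zip(mean, phi)]


# Toy data: each curve is the mean 1 + t plus an individual multiple of t
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
weights = [-2.0, -1.0, 0.0, 1.0, 2.0]
curves = [[1.0 + t + w * t for t in grid] for w in weights]

mean = mean_curve(curves)
phi = leading_component(covariance_matrix(curves, mean))
fits = [fit_curve(c, mean, phi) for c in curves]
scores = [s for s, _ in fits]
print([round(s, 3) for s in scores])
```

In this toy example the third individual follows the mean exactly, so their score is zero, and because the data contain a single pattern of variation, one FPC reproduces every curve.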

FIGURE 14

Fitting process of functional PCA. (a) Smooth all data; (b) smooth covariance surface; (c) find FPCs from smoothed surface; and (d) obtain individual curves using mean fit and FPCs.

Copyright © Queen’s Printer and Controller of HMSO 2015. This work was produced by Simpkin et al. under the terms of a commissioning contract issued by the Secretary of State for Health. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.

Included under terms of UK Non-commercial Government License.

Bookshelf ID: NBK305590
