Maximum Likelihood Estimation of Marginal Pair-wise Associations with Multiple Source Predictors

1 Department of Mathematics, Colby College, 5838 Mayflower Hill, Waterville, ME 04901, USA
2 Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA
3 Division of General Medicine, Brigham and Women’s Hospital, 1620 Tremont Street, Boston, MA 02120, USA
4 Department of Mathematics, Smith College, Northampton, MA 01063, USA

Summary

Researchers interested in the association of a predictor with an outcome will often collect information about that predictor from more than one source. Standard multiple regression methods allow estimation of the effect of each predictor on the outcome while controlling for the remaining predictors. The resulting regression coefficient for each predictor has an interpretation that is conditional on all other predictors. In settings in which interest is in comparison of the marginal pairwise relationships between each predictor and the outcome separately (e.g., studies in psychiatry with multiple informants or comparison of the predictive values of diagnostic tests), standard regression methods are not appropriate. Instead, the generalized estimating equations (GEE) approach can be used to simultaneously estimate, and make comparisons among, the separate pairwise marginal associations. In this paper, we consider maximum likelihood (ML) estimation of these marginal relationships when the outcome is binary. ML enjoys benefits over GEE methods in that it is asymptotically efficient, can accommodate missing data that are ignorable, and allows likelihood-based inferences about the pairwise marginal relationships. We also explore the asymptotic relative efficiency of ML and GEE methods in this setting.
Keywords: Log-linear models, mixed parameter transform, multiple informants, multivariate logistic transform

1 Introduction

Statistical data analysis often involves analyzing data obtained from several covariates, or predictors, and how these covariates relate to an outcome. Standard multiple regression models typically produce estimates of coefficients for the covariates that have conditional interpretations. That is, the interpretation of each regression coefficient can only be made by holding the values of all other covariates fixed. This is the standard, and preferred, method of analysis when the partial pairwise relationships between the outcome and each predictor, conditional on the remaining predictors, are of interest.

While each covariate may provide some unique information about changes in the outcome, there are many settings where the covariates provide overlapping information. This is common in psychiatry where reports on a subject’s psychopathology are gathered from multiple informants. Multiple informant data can be particularly useful in studies of child psychopathology since gathering information from parents, teachers, and mental health workers may give a more accurate assessment of the underlying state of the child (Achenbach et al., 1987). In this setting, it may be of interest to estimate and compare regression parameters that are specific to each informant (or source). The use of multiple sources is also common in the field of diagnostic testing, where more than one type of test may be available to detect the presence of a risk factor, symptom, or disease (Leisenring et al., 2000). In the latter setting, it will be of interest to determine the predictive value of each test unconditional on the results of the other tests under consideration. This will be the analytic method of choice when the goal is to choose a single diagnostic or screening instrument from a group of candidate instruments.
A similar type of example arises in studies of obesity. In this setting the goal may be to determine the best marginal predictor of obesity in adulthood (see, for example, Whitaker et al., 1997). Thus interest is not in prediction of obesity in adulthood as a function of many childhood risk factors, but rather in determining its best single predictor early in life. We explore the latter two applications in more detail in the following paragraphs and return to a multiple informant application in Section 4. To better illustrate this non-standard application of regression in the diagnostic testing setting, consider the following study reported in Leisenring et al. (2000). This study was designed to compare two binary screening measures for coronary artery disease (CAD). The true presence or absence of CAD for each subject was known from the results of an angiogram—the gold standard. While the angiogram is the gold standard, the procedure is invasive and expensive, making screening procedures valuable tools for assessing the need to administer more rigorous testing. The two screening measures under consideration were presence or absence of chest pain (CP), and the passing or failing of an exercise stress test (EST). Note that if these data were jointly analyzed in a standard regression modelling framework, then CAD status would be regarded as the outcome, with CP and EST being the predictors in, for example, a multiple logistic regression model. The results would provide probabilities or risk of CAD conditional on values for both CP and EST. However, when the goal is to choose a single screening measure, the regression parameter relating CP (or EST) to CAD has a somewhat unappealing interpretation due to conditioning on EST (or CP), and vice versa. Here, the question of interest is which of the two screening measures is more strongly associated with the presence of CAD unconditional of the value of the screening measure not under consideration. 
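To make the distinction between conditional and unconditional associations concrete, the following sketch computes both kinds of odds ratio from a 2 × 2 × 2 table. The cell counts are hypothetical (they are not the data of Leisenring et al.), and the function names are illustrative only.

```python
# Hypothetical counts keyed by (y, x1, x2): y = CAD, x1 = CP, x2 = EST,
# with 1 = present/positive. The numbers are invented for illustration.
counts = {
    (0, 0, 0): 200, (0, 0, 1): 40, (0, 1, 0): 60, (0, 1, 1): 30,
    (1, 0, 0): 30, (1, 0, 1): 50, (1, 1, 0): 40, (1, 1, 1): 150,
}


def odds_ratio(n00, n01, n10, n11):
    """Odds ratio of a 2x2 table whose cells are indexed (y, x)."""
    return (n00 * n11) / (n01 * n10)


def marginal_or_y_x1(n):
    """Y-X1 odds ratio after summing over X2 (unconditional on X2)."""
    def cell(y, x1):
        return n[(y, x1, 0)] + n[(y, x1, 1)]
    return odds_ratio(cell(0, 0), cell(0, 1), cell(1, 0), cell(1, 1))


def conditional_or_y_x1(n, x2):
    """Y-X1 odds ratio within the stratum X2 = x2."""
    return odds_ratio(n[(0, 0, x2)], n[(0, 1, x2)],
                      n[(1, 0, x2)], n[(1, 1, x2)])


marginal = marginal_or_y_x1(counts)
conditional = [conditional_or_y_x1(counts, x2) for x2 in (0, 1)]
print(marginal, conditional)
```

With these made-up counts the marginal odds ratio is larger than either stratum-specific odds ratio, which illustrates why a coefficient from a model containing both screening measures does not answer the question of which single measure is more strongly associated with the outcome.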
While this comparison could be made in terms of the misclassification rates, Pepe et al. (1999) recommend obtaining and comparing estimates of the association (expressed in terms of an odds ratio) of each of the screening measures with the outcome (unconditional on the results of the other screening measure). Interestingly, when the marginal probabilities of the two screening measures are very similar, the apparent misclassification rate can be shown to be a monotonic function of the odds ratio between the screening measure and CAD status. Note that interest is not in prediction of CAD as a function of CP and EST, but rather in comparing the unconditional associations of each of the predictors with CAD. While the screening measures used in this particular example may be relatively inexpensive to obtain, in general it would not be desirable to gather information from more than one diagnostic test or screen. Thus, the goal is to select the screening measure with the strongest (unconditional) association with the outcome. The comparison of the regression coefficients from the two separate analyses required to assess this is a non-standard problem because the estimates are correlated, and there is no widely accepted method for jointly obtaining and comparing estimates of such marginal associations.

Another example where information is obtained from multiple sources with the goal of relating each to the outcome marginally occurs in obesity research. A study conducted by Whitaker et al. (1997) sought to determine the best childhood predictors of adult obesity. Data were collected retrospectively about each subject from three sources: a measure of obesity obtained on the subject during her or his childhood, a measure of obesity on the subject’s mother, and a measure of obesity on the subject’s father. These three predictors were measured at five occasions during the subject’s childhood.
It is important to reiterate that interest was not in prediction of obesity in adulthood as a function of these three childhood predictors, but was rather to determine which of the three was the most strongly associated with obesity in adulthood. As noted by Pepe et al. (1999), such marginal associations are generally of primary interest in clinical applications due to their interpretability. In this example, obtaining such associations allows one to assess whether a child’s risk of obesity is most strongly influenced by her or his own obesity, or by his or her parents’ obesity status. Standard regression analyses that include all sources do not allow a direct assessment of this, however, since the associations obtained are conditional on the other sources. Much attention has recently been given to the statistical analysis of data arising from multiple sources (Horton and Fitzmaurice, 2004). One method for jointly obtaining estimates of the separate marginal associations relating each predictor to the response was simultaneously developed by Horton et al. (1999) and Pepe et al. (1999). Both used a non-standard application of generalized estimating equations (GEEs), under a so-called “working independence” assumption. In this approach, the univariate outcome is duplicated as many times as there are numbers of predictors, producing a cluster of identical responses for each subject. A separate marginal model is specified for each predictor but all regression parameters are jointly estimated. Note that the empirical (or “sandwich”) variance estimator (Huber, 1967) must then be used to obtain valid standard errors of these coefficients. While this GEE method produces valid estimates of the regression parameters, it may potentially lose efficiency in certain circumstances. Although in general, the efficiency losses associated with GEE methods are often not substantial (Liang and Zeger, 1986), the potential loss of efficiency has not been investigated in this setting. 
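The working independence approach described above can be sketched for the simplest case of two binary predictors with a separate intercept and slope for each source. This is a minimal illustration, not the authors' implementation: it uses hypothetical counts, and it relies on the fact that with a single binary predictor per marginal model the independence estimating equations have a closed-form solution, so no iteration is needed. General GEE software would instead solve the equations numerically on the stacked (duplicated-outcome) data.

```python
import math

# Hypothetical counts of subjects by (y, x1, x2); invented for illustration only.
counts = {
    (0, 0, 0): 200, (0, 0, 1): 40, (0, 1, 0): 60, (0, 1, 1): 30,
    (1, 0, 0): 30, (1, 0, 1): 50, (1, 1, 0): 40, (1, 1, 1): 150,
}


def slope_and_influence(counts, k):
    """Fit logit P(Y=1 | X_k) = a + b * X_k and return b together with each
    cell's per-subject contribution to the empirical (sandwich) variance of b."""
    # Collapse over the other predictor to get the 2x2 table of (y, x_k).
    n = {(y, x): 0 for y in (0, 1) for x in (0, 1)}
    for cell, c in counts.items():
        n[(cell[0], cell[k])] += c
    # With one binary predictor the fitted P(Y=1 | X_k = x) is the observed proportion.
    mu = {x: n[(1, x)] / (n[(0, x)] + n[(1, x)]) for x in (0, 1)}
    b = math.log(mu[1] / (1 - mu[1])) - math.log(mu[0] / (1 - mu[0]))
    # "Bread" pieces: sum of mu * (1 - mu) over subjects with X_k = x.
    s = {x: (n[(0, x)] + n[(1, x)]) * mu[x] * (1 - mu[x]) for x in (0, 1)}
    # Slope row of bread^{-1} times the score, for one subject in each cell.
    infl = {cell: (1 if cell[k] == 1 else -1) * (cell[0] - mu[cell[k]]) / s[cell[k]]
            for cell in counts}
    return b, infl


b1, infl1 = slope_and_influence(counts, 1)
b2, infl2 = slope_and_influence(counts, 2)

# The two marginal models share the same outcome, so the cross-products are not zero.
var_b1 = sum(c * infl1[cell] ** 2 for cell, c in counts.items())
var_b2 = sum(c * infl2[cell] ** 2 for cell, c in counts.items())
cov_b12 = sum(c * infl1[cell] * infl2[cell] for cell, c in counts.items())

# Wald statistic for H0: the two marginal log odds ratios are equal.
var_diff = var_b1 + var_b2 - 2 * cov_b12
z = (b1 - b2) / math.sqrt(var_diff)
print(b1, b2, z)
```

The covariance term is the reason the empirical variance estimator is needed: treating the two estimated log odds ratios as independent would misstate the variance of their difference.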
In this paper, we present a method for obtaining estimates of the separate pairwise associations with a binary outcome via maximum likelihood (ML) estimation. We compare the asymptotic relative efficiency of the working independence GEE-based method of Horton et al. (1999) and Pepe et al. (1999) to ML, particularly when the marginal associations can be assumed to be of similar magnitude.

The joint models for the marginal associations considered in this paper are a subset of those proposed by Glonek (1996) and later extended and generalized by Bergsma and Rudas (2002) as well as Rudas and Bergsma (2004). In particular, we consider a multinomial model for the binary response and binary predictors whose parameters are a mixture of marginal parameters that describe the marginal logits and pairwise log odds ratios among the response-predictor pairs separately, and conditional parameters for all other pairwise and higher-order associations. Thus, the model described here is a hybrid of log-linear models (e.g., Bishop et al., 1975) and the multivariate logistic models described by Glonek and McCullagh (1995) and Bergsma and Rudas (2002). A similar type of “mixed parameter” model was developed by Fitzmaurice and Laird (1993) for longitudinal binary outcomes. Such models are desirable due to the marginal interpretation of the parameters of primary interest. In addition, ML estimation of these parameters is robust to misspecification of the conditional pairwise and higher-order associations when the data are complete (Fitzmaurice and Laird, 1993; Glonek, 1996). That is, even if the model for the complementary set of conditional parameters for all other pairwise and higher-order associations is misspecified, the marginal parameter estimates remain consistent provided the model for these marginal associations has been correctly specified. The model and estimation method are described in Section 2.
We consider the efficiency losses associated with using working independence GEE-based methods over ML in Section 3. In Section 4, the proposed method is applied to multiple informant data from a psychiatric study in which there are two predictors (two informants) providing information on psychiatric distress, and it is of interest to examine and compare their marginal relationships with mortality. We conclude in Section 5 with a discussion of the potential benefits and drawbacks of the likelihood-based method described in this paper.

2 A Mixed Parameter Model for the Marginal Pair-wise Associations

Consider a binary response, $Y_i$, obtained from $i = 1, 2, \ldots, N$ subjects. Associated with each response is a set of $K$ binary predictors, $X_{i1}, \ldots, X_{iK}$. The resulting data can be summarized in a $2^{K+1}$ contingency table for the response and the $K$ binary predictors. For each subject a set of joint probabilities, π, describes the probability of belonging to any given cell of the contingency table. We denote the vector of all first-, second-, and higher-order products among

Standard log-linear models for the multinomial probabilities, π, are based on a transformation from π to a set of conditional parameters, ω, through the expression,
The marginal parameters of interest could be estimated by fitting a multivariate logistic model of the type considered by McCullagh and Nelder (1989). Such models are based on a transformation of π to a set of marginal parameters, ψ, described by,
Fitzmaurice and Laird (1993) noted this conundrum and developed a class of models that includes a mixture of log-linear and multivariate logistic parameters in such a way as to realize advantageous features of both. In their formulation, where interest is in the marginal means, the first-order logits were marginal while all two- and higher-order parameters were conditional. We, however, consider an alternative mapping, Given our mapping, we define a vector of marginal parameters, ψ, for the first-order logits and the marginal pairwise outcome-predictor associations defined by the subset of A given by,
ML Estimation

ML estimation of the parameters in the model given by (3) can be implemented using a variety of techniques. However, given this formulation, parameter estimates can be readily obtained using a similar algorithm to the one proposed by Glonek and McCullagh (1995). In their paper, the model parameters consisted entirely of marginal parameters as given by (2) and a Fisher scoring procedure was implemented for estimation. Note that the transformation η2 → π is not straightforward. Glonek and McCullagh (1995) thus used a Newton-Raphson procedure to iteratively obtain the joint probabilities, π, from η. Now consider the linear model relating the mixture of marginal (ψ) and conditional (ω) parameters given in (3) to a known design matrix, Z,
From (3) and (4), we have
Note that each update of the Fisher scoring algorithm requires updating π. However, there is no analytical expression for the transformation η → π (or, equivalently, β → π) since η is a vector of both marginal and conditional parameters. Glonek and McCullagh (1995) and Glonek (1996) used a Newton-Raphson scheme to obtain π from ψ. We, however, follow the approach of Fitzmaurice and Laird (1993) and use an iterative proportional fitting (IPF) algorithm (Deming and Stephan, 1940) to obtain the estimated joint probabilities, $\hat{\pi}$. To implement the IPF, we specify a $2^{K+1}$ “start table.”
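The general idea of iterative proportional fitting can be sketched for the two-predictor case as follows. This is a pure-Python stand-in for illustration, not the authors' R code: the value of the conditional log odds ratio (`omega`), the target margins, and the function name `ipf` are all assumptions. The start table carries a chosen conditional X1-X2 association, and the table is rescaled until its (Y, X1) and (Y, X2) margins match the targets.

```python
import math


def ipf(start, targets, tol=1e-12, max_iter=1000):
    """Rescale `start` (a dict keyed by (y, x1, x2)) until its two-way margins
    match `targets`, which maps a pair of axes to the desired margin."""
    table = dict(start)
    for _ in range(max_iter):
        worst = 0.0
        for axes, target in targets.items():
            # Current margin over the requested pair of axes.
            margin = {}
            for cell, p in table.items():
                key = tuple(cell[a] for a in axes)
                margin[key] = margin.get(key, 0.0) + p
            # Largest discrepancy seen before this adjustment is applied.
            worst = max(worst, max(abs(margin[key] - target[key]) for key in target))
            # Proportional adjustment of every cell within each margin slice.
            for cell in table:
                key = tuple(cell[a] for a in axes)
                table[cell] *= target[key] / margin[key]
        if worst < tol:
            break
    return table


omega = 1.0  # hypothetical conditional X1-X2 log odds ratio given Y
start = {(y, x1, x2): math.exp(omega * x1 * x2)
         for y in (0, 1) for x1 in (0, 1) for x2 in (0, 1)}

# Hypothetical target margins; both imply P(Y = 1) = 0.4, so they are compatible.
targets = {
    (0, 1): {(0, 0): 0.40, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.30},
    (0, 2): {(0, 0): 0.35, (0, 1): 0.25, (1, 0): 0.15, (1, 1): 0.25},
}

pi = ipf(start, targets)
for y in (0, 1):
    cond_or = pi[(y, 0, 0)] * pi[(y, 1, 1)] / (pi[(y, 0, 1)] * pi[(y, 1, 0)])
    print(y, cond_or)
```

Each adjustment multiplies cells by a factor that depends only on (y, x1) or only on (y, x2), so the conditional X1-X2 odds ratio placed in the start table is left unchanged while the margins are matched. This is the property that makes IPF convenient for models that mix marginal and conditional parameters.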
A key advantage of the mixed parameter model proposed here is that even if the model for ω has been misspecified, the ML estimate of the marginal parameters remains consistent, provided the model for the marginal associations has been correctly specified and the data are complete (Fitzmaurice and Laird, 1993; Glonek, 1996).

The ML estimation routine described here is not implemented in widely available commercial software. Thus, we implemented the iterative routine in R utilizing the IPF algorithm that is included as part of the loglin() function. In the next section, we investigate the asymptotic relative efficiency of the independence GEE as compared to ML. We illustrate the proposed method in Section 4 using data collected from two predictors (two informants) of the psychiatric status of 953 subjects from the Stirling County study.

3 Asymptotic Relative Efficiency of Independence GEE Estimator Relative to ML

In this section the asymptotic efficiency of the working independence GEE estimator of Horton et al. (1999) and Pepe et al. (1999) is compared to the ML estimator described in Section 2. The minimal asymptotic relative efficiencies (ARE) of GEE estimation with a working independence assumption, relative to ML estimation, were obtained over a wide grid of values for ψ and ω. A full description of the independence GEE method can be found in Appendix A. We define the asymptotic relative efficiency of the GEE estimator relative to the ML estimator by,

First, we consider the case where there are two predictors. The values for the first-order marginal parameters (e.g.,

In the first scenario, where the two coefficients for the outcome-predictor marginal log odds ratios are unique, the ARE = 1, and asymptotically, there is no efficiency loss associated with independence GEE estimation when compared to ML estimation. This occurs because the GEE and ML estimators for the marginal outcome-predictor associations are equivalent when the multinomial model is saturated (see Appendix A).
Thus, the ARE = 1 when there are no shared parameters, regardless of the strength of the marginal pairwise associations or the values of the conditional parameters considered. Table 1 shows the minimum ARE across a wide range of values for ψ and ω for various strengths of the outcome-predictor log odds ratio.
We next consider the case when there are three predictors. Given that the set of conditional parameters, ω, is 8-dimensional, we place some simplifying constraints on the values investigated. Specifically, all second-order predictor-predictor associations were constrained to be equal ($\omega_{X_1X_2} = \omega_{X_1X_3} = \omega_{X_2X_3}$), as well as all third-order associations among the outcome and pairs of predictors ($\omega_{YX_1X_2} = \omega_{YX_1X_3} = \omega_{YX_2X_3}$). The remaining three-way ($\omega_{X_1X_2X_3}$) and four-way ($\omega_{YX_1X_2X_3}$) associations were allowed to vary independently. While these conditional parameters are not required to be constrained, and such equality constraints are undesirable in practical applications, we do so here solely to reduce the dimensionality of the study of ARE in the three-predictor case. The ARE was calculated for a grid of values for the marginal logits ranging from -3 to 3, and values of the conditional parameters ranging from -5 to 5, again at steps of 1. The range of the conditional parameters was more restricted than in the 2-predictor case due to extremely small joint probabilities encountered. The values for the $\psi_{YX}$ associations again ranged from 0 to 4.

As described in the two-predictor case, the GEE and ML estimators for the marginal outcome-predictor associations are equivalent when the model is saturated. This is reflected by the ARE being equal to 1 when the coefficients for the marginal outcome-predictor associations are unique. The ARE also equals 1 when ω = 0 (even for a shared $\psi_{YX}$ association). However, when a common parameter for all three marginal outcome-predictor associations was considered
We see that, in general, the efficiency loss of the independence GEE increases as the magnitude of the marginal outcome-predictor association (ψYX) increases. In the case of two predictors this loss is modest (approximately 10 percent or less). However, in the case of three predictors, we see losses approaching 25 percent, even for reasonably small values of the predictor-predictor association. Thus, these results indicate that the efficiency losses associated with using the working independence GEE method of Horton et al. (1999) and Pepe et al. (1999) relative to ML are dependent on the number of predictors in the model. We see that there may be some value to using ML estimation when there are three predictors, but perhaps not when there are only two. 4 Example In this section, we utilize the multiple predictor model described in Section 2 to analyze data arising from the Stirling County depression study. The data come from a large cohort study consisting of 953 subjects in Eastern Canada. More information about the Stirling County study can be found in Leighton (1959), Murphy (1980), and Murphy et al. (1985). The data were obtained prospectively from 1952 to 1968, with one outcome of interest being mortality during this 16-year period. Information on psychiatric distress was obtained from two predictors, and it is of interest to relate each of these predictors to the outcome, marginally. One predictor was a self-report measure called the DPAX (depression-anxiety scale) and was processed via computer algorithm (see Murphy et al., 1985, for more information about the DPAX). This measure is designed to detect the presence of anxiety and/or depression through a self-report questionnaire. The second predictor (denoted GP) contained information about the presence of psychiatric distress as determined by a general physician. Each physician diagnosis was validated by a psychiatrist. 
Note that while the DPAX detects the presence of depression and/or anxiety, the GP data indicates any general mental disturbance deemed relevant by the physician. These data are summarized in Table 3.
Note that since we do not have covariates in addition to the two binary predictors ($X_1$ for DPAX; $X_2$ for GP), we can calculate the joint probabilities directly. However, it is of interest to determine whether $X_1$ and $X_2$ (DPAX and GP) have marginal pair-wise associations with the outcome (mortality) that are significantly different. Standard regression techniques would not readily allow such a comparison since the coefficients from a regression analysis including both $X_1$ and $X_2$ (DPAX and GP) have conditional interpretations. In order to assess this, we first fit the saturated model,
We next fit a reduced model in which we assume a shared association parameter, with $\beta_4 = \beta_5$ (describing the marginal log odds ratio between the outcome and each of the two predictors separately). The ML estimates of β and their model-based standard errors are shown in Table 4 for the saturated and reduced models. In order to test the hypothesis that $\beta_4 = \beta_5$, we performed a likelihood ratio test (LRT) of $H_0$: $\beta_4 = \beta_5$ and found that the model with a shared parameter for the marginal pairwise associations between mortality and DPAX, and mortality and GP, is tenable (LRT = 0.03; p > 0.85). We caution the reader that such a comparison is meaningful only when the predictors have the same underlying scales of measurement (as they do in this example).
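The reported p-value can be checked with a short calculation. The sketch below assumes the usual chi-square reference distribution with 1 degree of freedom (one equality constraint separates the reduced model from the saturated one); the helper name `chi2_sf_1df` is illustrative. It uses the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


lrt = 0.03  # likelihood ratio statistic reported in the text
p_value = chi2_sf_1df(lrt)
print(round(p_value, 3))
```

The resulting tail probability lies above 0.85, in agreement with the value quoted for the test of the shared association parameter.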
We note that the independence GEE (using empirical standard errors) and ML methods both result in estimates and standard errors for the regression coefficients that are nearly identical. This is not surprising given the ARE results for the 2-predictor case reported in Section 3. We also used the joint probabilities estimated from the Stirling County data to determine the corresponding ARE. We obtained an ARE of approximately 1 which agrees very closely with the comparison of the standard errors from the GEE and ML analyses. Thus, in this example, the DPAX and GP predictors have similar relationships to mortality. This indicates that receiving a positive DPAX or GP assessment is associated with 1.76 times the odds of mortality relative to someone without a positive assessment. Thus having a diagnosis of psychiatric distress is associated with mortality irrespective of source. Note that even though the predictors are exchangeable, we obtain substantially smaller standard errors (0.158 versus 0.212 and 0.206) by using all the available data to estimate the common association. In this example, we report model-based standard errors for the ML estimator, however they did not differ substantively from those obtained via the empirical variance estimator since the model for the remaining conditional associations is saturated. 5 Discussion While obtaining information from multiple predictors is common in many diverse applications, methods for simultaneously obtaining and comparing the marginal pairwise associations between the outcome and each predictor have seen relatively little work. The approach developed by Horton et al. (1999) and Pepe et al. (1999) provides a method for obtaining and comparing estimates of such marginal relationships using a GEE strategy in which the relationship between the outcome and each predictor can be modeled separately (but estimated simultaneously). 
This is advantageous in that no distributional assumptions about Y need to be made in order to obtain valid estimates of β (Liang and Zeger, 1986); the GEE approach also does not require assumptions about the joint distribution of $X_1, \ldots, X_K$. This method is also advantageous in that it can be easily implemented using widely available commercial software (e.g., PROC GENMOD in SAS, or xtgee in Stata).

In this paper, we have developed a likelihood-based framework in which estimates of the marginal pairwise associations can be obtained and compared. The model presented here is similar to models considered by Fitzmaurice and Laird (1993) and Glonek (1996). While such models require specification of the joint distribution for ML estimation, only the model for the marginal parameters needs to be correctly specified for consistent estimates of the pairwise marginal associations with the outcome to be obtained.

ML estimation is generally advantageous relative to GEE-based estimation in that it is asymptotically efficient. However, we have shown that there is no loss in efficiency associated with independence GEE methods for estimation of β when there are no shared parameters among the marginal outcome-predictor associations. That is, when the model is saturated, the GEE and ML estimators for the marginal outcome-predictor associations are equivalent (see Appendix A). While the ARE results indicate there is some loss of efficiency when there is a shared parameter among these associations, this loss may be modest relative to the increased computational difficulty in implementing the ML estimation routine. We note that we have not investigated the efficiency loss when there are more than three predictors. It may be the case that this loss is greater; however, the computational difficulty in implementing the ML estimation procedure would also be greater.
One would thus need to weigh the relative advantage of the efficiency gain when using ML estimation against its added computational burden. The lack of substantial efficiency gain is consistent with previous work. For example, Fitzmaurice (1995) investigated the ARE of GEE versus ML estimation for the analysis of multivariate binary data. He showed that there is some efficiency loss associated with GEE versus ML methods using a working independence assumption for covariates that vary within a subject, but very little loss in efficiency when there are no covariates that vary within a subject. While he considered multiple outcomes for any set of predictors, recall that an appealing feature of logistic regression models is that the odds ratio can be estimated from a prospective or retrospective study design. One implication of this result when X is binary is that if X is treated as the response in the logistic regression, and Y is treated as the predictor, the estimate of the pairwise log odds ratio remains the same. Considering this fact, the ARE results reported in Fitzmaurice (1995) have implications in our setting. If we were to regard X1, . . . , XK as multiple responses, and Y as a predictor, then the ARE results from Fitzmaurice (1995) suggest that the independence GEE is almost efficient when compared to ML. That is, Y is a between-subject “predictor” in the logistic regression model for the multiple correlated “responses”, X1, . . . , XK. Aside from efficiency, ML estimation is advantageous over GEE methods since it can incorporate missing data that are either missing completely at random (MCAR; Little and Rubin, 1987) or missing at random (MAR; Little and Rubin, 1987), but at the cost of requiring assumptions about the joint distribution of the outcomes. Given that missing data are common in survey data, this method will allow consistent estimates of the separate pairwise associations to be obtained provided missingness is ignorable. 
However, incorporating missing data is not straightforward since only the multivariate logistic component (ψ) of the model is reproducible (i.e., it will require the use of the EM algorithm). Also, by utilizing ML estimation, it is possible to conduct likelihood-based inference. This is advantageous in the discrete data setting since likelihood ratio tests (LRT) are purported to have better finite-sample properties than Wald tests when the outcome is binary (Hauck and Donner, 1977). ML estimation also has the advantage of allowing modelling of the joint distribution in a variety of ways. In contrast, by focusing only on the marginal probability of the outcome, GEEs do not allow modelling of the joint distribution at all.

While the method presented in this paper enjoys the benefits of any ML-based estimation procedure, for discrete multivariate data it becomes cumbersome as the number of predictors increases. That is, as the number of second- and higher-order nuisance parameters (which number $2^{K+1} - 2K - 1$) proliferates it becomes computationally demanding. When the data are complete, our ARE results suggest that using ML estimation over the GEE method of Horton et al. (1999) and Pepe et al. (1999) may yield modest gains from an efficiency perspective alone. Investigating the asymptotic relative efficiency of GEE-based methods compared to the ML methods described in this paper when there are missing data would be a valuable extension to this work. While the complete data results show modest losses, greater losses may result when there are missing data. Also, extensions of this method when there are additional covariates in the model would be of interest in many practical settings.

Appendix A: Equivalency of Independence GEE and ML Estimators in the Saturated Model

Let $Y_i$ be the binary response from subject $i$, with $X_{i1}$ and $X_{i2}$ being two binary predictors. These three binary variables define a 2 × 2 × 2 contingency table.
The number of subjects falling into a given cell is denoted by $n_{uvw}$, where we order the subscripts denoting presence (1) or absence (0) of the outcome, first predictor, and second predictor, respectively. Thus, the number of subjects satisfying the condition $Y_i = X_{i1} = X_{i2} = 0$ is given by $n_{000}$. We use the “+” notation to denote summation over an index. For example, the count of subjects satisfying $Y_i = X_{i1} = 0$, but having no restriction on $X_{i2}$, is given by $n_{00+}$. We abbreviate the total count of subjects, $n_{+++}$, by $N$ as in Section 2. Given that the 8 cell counts follow a multinomial distribution, the ML estimate of the joint probability for a given cell of the 2 × 2 × 2 contingency table is given by the count in that cell, divided by the total number of subjects (e.g., $\hat{\pi}_{000} = n_{000}/N$).
Now consider the GEE estimation method of Horton et al. (1999) and Pepe et al. (1999) for the two predictor case. Although we have a univariate response from each subject, Horton et al. (1999) and Pepe et al. (1999) derive the estimating equations by considering a 2 × 1 vector of responses, $\mathbf{Y}_i = (Y_i, Y_i)'$. This vector is then related to the two predictors via a multivariate logistic regression model,
While we have illustrated this for the $Y$–$X_1$ marginal log odds ratio, we may permute the ordering of the subscripts and obtain the same results for the $Y$–$X_2$ marginal log odds ratio. Similarly, if we had more than 2 predictors, we would marginalize over K − 1 predictors to obtain a 2 × 2 table similar to that given by Table 5. The results remain the same, although the summation is taken over the K − 1 predictors not under consideration. This indicates that, when the model is saturated, the ML estimator,

Appendix B: Construction of C and L for 2 predictors

In the case of 2 predictors, we order the probabilities as follows, with the subscripts denoting presence (1) or absence (0) of the outcome, first predictor, and second predictor, respectively. Thus, the vector of joint probabilities is ordered in lexicographical order as,

The matrix, L, takes these probabilities and appropriately marginalizes over them for the ψ parameters. Since the ω parameters are conditional, the lower portion of L is the identity matrix. Note that the first row of L sums over all the cells to constrain the parameter ϕ to be equal to 1. The matrix, C, then takes the appropriate contrasts of log(Lπ). This matrix is block diagonal, with the upper left block corresponding to β and the lower right block corresponding to ω. These matrices are as follows for the 2 predictor model,

References

[1] Achenbach TM, McConaughy SH, Howell CT. Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin. 1987;101(2):213–232.
[2] Agresti A. Categorical Data Analysis. John Wiley and Sons; New York: 1990.
[3] Aitchison J, Silvey SD. Maximum-likelihood estimation of parameters subject to constraints. Annals of Mathematical Statistics. 1957;29:813–828.
[4] Bergsma WP, Rapcsák T. An exact penalty method for smooth equality constrained optimization with application to maximum likelihood estimation. Technical report, EURANDOM, 1 2005.
[5] Bergsma WP, Rudas T. Marginal models for categorical data. Annals of Statistics. 2002;30:140–159.
[6] Bishop YMM, Fienberg SE, Holland PW. Discrete Multivariate Analysis: Theory and Practice. MIT Press; Cambridge, MA: 1975.
[7] Deming WE, Stephan FF. On a least squares adjustment of a sampled frequency table when the expected marginal totals are known. Annals of Mathematical Statistics. 1940;11:427–444.
[8] Fitzmaurice GM. A caveat concerning independence estimating equations with multivariate binary data. Biometrics. 1995;51(1):309–317.
[9] Fitzmaurice GM, Laird NM. A likelihood-based method for analysing longitudinal binary responses. Biometrika. 1993;80(1):141–151.
[10] Glonek GFV. A class of regression models for multivariate categorical responses. Biometrika. 1996;83(1):15–28.
[11] Glonek GFV, McCullagh P. Multivariate logistic models. Journal of the Royal Statistical Society, Series B. 1995;57(3):533–546.
[12] Grizzle JE, Starmer CF, Koch GG. Analysis of categorical data by linear models. Biometrics. 1969;25:489–504.
[13] Hauck WW, Donner A. Wald’s test as applied to hypotheses in logit analysis. Journal of the American Statistical Association. 1977;77:851–853.
[14] Horton NJ, Fitzmaurice GM. Tutorial in biostatistics: Regression analysis of multiple source data and multiple informant data from complex survey samples. Statistics in Medicine. 2004;23:2911–2933.
[15] Horton NJ, Laird NM, Murphy JM, Monson RR, Sobol AM, Leighton AH. Multiple informants: Mortality associated with psychiatric disorders in the Stirling County Study. American Journal of Epidemiology. 2001;154(7):649–656.
[16] Horton NJ, Laird NM, Zahner GEP. Use of multiple informant data as a predictor in psychiatric epidemiology. International Journal of Methods in Psychiatric Research. 1999;8(1):6–18.
[17] Huber PJ. The behaviour of maximum likelihood estimators under non-standard conditions. In: LeCam LM, Neyman J, editors. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; University of California Press. 1967. pp. 221–233.
[18] Lang JB, Agresti A. Simultaneously modeling joint and marginal distributions of multivariate categorical responses. Journal of the American Statistical Association. 1994;89(426):625–632.
[19] Leighton AH. My Name is Legion: The Stirling County Study of Psychiatric Disorder and Sociocultural Environment. Volume I. Basic Books Inc.; New York: 1959.
[20] Leisenring W, Alonzo T, Pepe MS. Comparisons of predictive values of binary medical diagnostic tests for paired designs. Biometrics. 2000;56:345–351.
[21] Liang K-Y, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73(1):13–22.
[22] Little RJA, Rubin DB. Statistical Analysis with Missing Data. Wiley; New York: 1987.
[23] McCullagh P, Nelder JA. Generalized Linear Models. Second edition. Chapman and Hall; New York: 1989.
[24] Murphy JM. Continuities in community-based psychiatric epidemiology. Archives of General Psychiatry. 1980;37:1215–1223.
[25] Murphy JM, Neff RK, Sobol AM, Rice JX, Olivier DC. Computer diagnosis of depression and anxiety: the Stirling County Study. Psychological Medicine. 1985;15:99–112.
[26] Pepe MS, Whitaker RC, Seidel K. Estimating and comparing univariate associations with application to the prediction of adult obesity. Statistics in Medicine. 1999;18:163–173.
[27] Rudas T, Bergsma W. On applications of marginal models for categorical data. Metron. 2004;17(1):1–25.
[28] Whitaker RC, Wright JA, Pepe MS, Seidel KD, Dietz WH. Predicting obesity in young adulthood from childhood and parental obesity. The New England Journal of Medicine. 1997;337(13):869–873.