Comparison of multinomial and binomial proportion methods for analysis of multinomial count data

J Anim Sci. 2010 Oct;88(10):3452-63. doi: 10.2527/jas.2010-2868. Epub 2010 Jul 2.

Abstract

Simulation methods were used to generate 1,000 experiments, each with 3 treatments and 10 experimental units/treatment, in completely randomized (CRD) and randomized complete block designs. Data were counts in 3 ordered or 4 nominal categories from multinomial distributions. For the 3-category analyses, category probabilities were 0.6, 0.3, and 0.1, respectively, for 2 of the treatments, and 0.5, 0.35, and 0.15 for the third treatment. In the 4-category analysis (CRD only), probabilities were 0.3, 0.3, 0.2, and 0.2 for treatments 1 and 2 vs. 0.4, 0.4, 0.1, and 0.1 for treatment 3. The 3-category data were analyzed with generalized linear mixed models as an ordered multinomial distribution with a cumulative logit link or by regrouping the data (e.g., counts in 1 category/sum of counts in all categories), followed by analysis of single categories as binomial proportions. Similarly, the 4-category data were analyzed as a nominal multinomial distribution with a generalized logit (glogit) link or by grouping data as binomial proportions. For the 3-category CRD analyses, empirically determined type I error rates based on pairwise comparisons (F- and Wald χ² tests) did not differ between multinomial and individual binomial category analyses with 10 (P = 0.38 to 0.60) or 50 (P = 0.19 to 0.67) sampling units/experimental unit. When analyzed as binomial proportions, power estimates varied among categories, with analysis of the category with the greatest counts yielding power similar to the multinomial analysis. Agreement between methods (percentage of experiments with the same results for the overall test for treatment effects) varied considerably among categories analyzed and sampling unit scenarios for the 3-category CRD analyses. Power (F-test) was 24.3, 49.1, 66.9, 83.5, 86.8, and 99.7% for 10, 20, 30, 40, 50, and 100 sampling units/experimental unit for the 3-category multinomial CRD analyses.
Results with randomized complete block design simulations were similar to those with the CRD; however, increasing the size of the random block effect decreased the power of the F-test for the treatment effect. Power of the binomial approach with 4-category nominal data (CRD with 50 sampling units/experimental unit) depended on the probability of the category used, but the type I error rate for individual binomial proportions did not differ (P > 0.43) from the multinomial rate. Overall, analyzing a single binomial category from the multinomial distribution did not affect the type I error rate; however, analyzing multinomial data as a series of binomial proportions increased the experiment-wise type I error rate, and power varied among categories. Within the ordered category probabilities we modeled, power decreased as the number of sampling units per experimental unit decreased. Thus, for variables with probabilities similar to those modeled, power to detect treatment differences in count data from research settings with a small number of animals per pen would be limited.
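
The data-generation and regrouping steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes NumPy, covers only the 3-category CRD case, and the function names (simulate_crd, to_binomial) and the seed are hypothetical. It draws multinomial counts for one simulated experiment and regroups one category as a binomial proportion (counts in 1 category/sum of counts in all categories); the mixed-model fitting itself is not shown.

```python
import numpy as np

# Category probabilities for the 3-category case, as stated in the abstract:
# treatments 1 and 2 share one set, treatment 3 differs.
PROBS_3CAT = [
    (0.6, 0.3, 0.1),
    (0.6, 0.3, 0.1),
    (0.5, 0.35, 0.15),
]


def simulate_crd(n_sampling_units, probs_by_treatment, n_units=10, rng=None):
    """Simulate one CRD experiment (hypothetical helper).

    Returns an integer array of shape (treatments, units, categories),
    where each experimental unit holds multinomial counts that sum to
    n_sampling_units.
    """
    rng = np.random.default_rng() if rng is None else rng
    return np.stack(
        [rng.multinomial(n_sampling_units, p, size=n_units)
         for p in probs_by_treatment]
    )


def to_binomial(counts, category):
    """Regroup multinomial counts as (successes, totals) for one category."""
    successes = counts[..., category]
    totals = counts.sum(axis=-1)
    return successes, totals


# One simulated experiment with 50 sampling units per experimental unit.
rng = np.random.default_rng(2010)  # arbitrary seed, for reproducibility only
counts = simulate_crd(50, PROBS_3CAT, rng=rng)
successes, totals = to_binomial(counts, category=0)
proportions = successes / totals  # per-unit proportion in the first category
```

Repeating the call to simulate_crd 1,000 times and fitting each data set with a multinomial or binomial model would mirror the simulation structure the abstract describes; setting all three treatments to the same probabilities would correspond to a type I error scenario.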

MeSH terms

  • Animals
  • Cattle
  • Linear Models
  • Models, Statistical*
  • Monte Carlo Method
  • Random Allocation
  • Research / statistics & numerical data
  • Research Design
  • Veterinary Medicine / methods
  • Veterinary Medicine / statistics & numerical data