Data Algorithms by Mahmoud Parsian

Chapter 20. Cochran-Armitage Test for Trend

The Cochran-Armitage test for trend (CATT) is used in analyzing germline data. For example, variants in a VCF (variant call format) file generated by DNA sequencing can be labeled as germline data. The CATT is a statistical method for directing chi-squared tests toward narrow alternatives. If R is a set of response variables and E is a set of experimental variables, then the CATT is sensitive to a linear relationship between the response variables in R and the experimental variables in E, which is how it detects trends. The CATT can be expressed another way: if B is a binary outcome of some events {PASSED, FAILED} and C is a set of ordered categories {C1, ..., Cn}, then the CATT can be used to test for a linear trend in the proportions of B across the levels of C. To apply the CATT, we build a contingency table with two rows, one for each outcome value {PASSED, FAILED}, and n columns, one for each category {C1, ..., Cn}. The contingency table for the CATT is explained in the next sections.
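To make the 2 x n contingency table concrete, here is a minimal Python sketch of the standard textbook form of the Cochran-Armitage statistic. It is not the implementation developed in this chapter. The function name cochran_armitage, the default equally spaced weights 0, 1, ..., n-1, and the counts in the usage example are all assumptions made up for illustration.

```python
import math


def cochran_armitage(passed, failed, weights=None):
    """Return (z, two_sided_p) for a 2 x n table of PASSED/FAILED counts.

    passed[i] and failed[i] are the counts in ordered category C(i+1).
    weights are the category scores; equally spaced scores 0..n-1 are
    an assumption of this sketch, not something the text prescribes.
    """
    n_cols = len(passed)
    if len(failed) != n_cols:
        raise ValueError("both rows must have the same number of columns")
    if weights is None:
        weights = list(range(n_cols))

    col_totals = [p + f for p, f in zip(passed, failed)]
    r1, r2 = sum(passed), sum(failed)   # row totals
    n = r1 + r2                         # grand total

    # Trend statistic: T = sum_i t_i * (N_1i * R_2 - N_2i * R_1)
    t = sum(w * (p * r2 - f * r1)
            for w, p, f in zip(weights, passed, failed))

    # Variance of T under the null hypothesis of no trend:
    # (R_1 * R_2 / N) * (N * sum(t_i^2 * C_i) - (sum(t_i * C_i))^2)
    s1 = sum(w * c for w, c in zip(weights, col_totals))
    s2 = sum(w * w * c for w, c in zip(weights, col_totals))
    var = r1 * r2 * (n * s2 - s1 * s1) / n
    if var <= 0:
        raise ValueError("variance is zero; the table is degenerate")

    z = t / math.sqrt(var)
    # Two-sided p-value from the standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: the PASSED proportion rises across C1, C2, C3.
z, p = cochran_armitage(passed=[5, 10, 15], failed=[15, 10, 5])
print(z, p)
```

For these made-up counts the statistic is z = sqrt(10), about 3.16, so the normal approximation would reject the hypothesis of no trend at the usual 5% level. A table whose PASSED proportion is the same in every column gives z = 0.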

According to Wikipedia:

The Cochran-Armitage test for trend, named for William Cochran and Peter Armitage, is used in categorical data analysis when the aim is to assess for the presence of an association between a variable with two categories and a variable with k categories. It modifies the Pearson chi-squared test to incorporate a suspected ordering in the effects of the k categories of the second variable. For example, doses of a treatment can be ordered as “low,” “medium,” and “high,” and we may suspect that the treatment benefit cannot become smaller as the dose ...
