Chi-squared or G goodness of fit with or without the Yates correction.
To arrive at this conclusion, follow the steps outlined in Appendix B:
B.1 What type of investigation am I designing?
This is an experiment: you are starting out with a question (hypothesis) (go to B.2).
B.2 Which type of hypotheses am I testing?
There are three types of hypotheses to choose between. If you are not sure which type of hypotheses you will be testing, read the information in B.2.1 to B.2.3 before deciding. For more information about hypotheses and hypothesis testing, read Chapter 4.
In this example the student developed a number of genetic proposals against which she wished to test the data. Each genetic proposal therefore provided an expected set of values. This student therefore wished to test the first type of hypotheses.
B.2.1 Does the data match an expected ratio?
The choice of statistical test is determined by the number of variables and number of categories for each variable.
In this example there is one treatment variable (coat colour) and there is a genetic reason (a priori expectation) for expecting certain ratios. The stud book listed a number of different crosses and, not surprisingly, the ratio of coat colours in the offspring varied. When these were grouped into similar crosses it was clear that some crosses produced more than two coat colours in the offspring, some produced only two and some produced only one. From the table we can see that for most of this data a chi-squared or G goodness of fit test, with or without a Yates correction, appears to be the most appropriate group of tests. (For crosses where there is only one expected coat colour, any deviation from this automatically indicates that the 'expected ratio' is not correct. You do not need statistics to tell you this.)
| Experimental design | Test |
| --- | --- |
| You have one variable and you have an a priori reason for expecting certain outcomes from your investigation. The variable has more than two categories. | Chi-squared goodness of fit test (5.1) or G goodness of fit test (5.5.1) |
| You have one variable and you have an a priori reason for expecting certain outcomes from your investigation. The variable has only two categories. | Chi-squared goodness of fit test with Yates correction (5.4.1) or G goodness of fit test (5.5.5) |
| You have one variable with two or more categories and you do not have an a priori expectation. You have more than two samples in your data set and wish to know if the samples are similar or different from each other. | Chi-squared test for heterogeneity (5.2) |
| You do not have an a priori expectation. You have two variables. At least one of these variables has more than two categories. You wish to test for an association between the variables. | Chi-squared test for association (5.3) or G test for association (5.5.2) |
| You do not have an a priori expectation. You have two variables. Both variables have only two categories. You wish to test for an association between the variables. | Chi-squared test for association with Yates correction (5.4.2) or G test for association (5.5.2) |
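The decision table above can also be read as a simple lookup. The Python sketch below is one hypothetical way to encode it; the function name `choose_test` and its parameters are illustrative assumptions and are not part of the appendix. It only mirrors the rows of the table and returns an empty list for designs the table does not cover.

```python
def choose_test(n_variables, a_priori, max_categories, n_samples=1):
    """Return the test(s) suggested by the decision table.

    Hypothetical helper for illustration only.
    n_variables    -- number of variables in the design (1 or 2)
    a_priori       -- True if there is an a priori expectation
    max_categories -- largest number of categories in any one variable
    n_samples      -- number of samples in the data set
    """
    if n_variables == 1 and a_priori:
        if max_categories > 2:
            return ["Chi-squared goodness of fit test",
                    "G goodness of fit test"]
        if max_categories == 2:
            return ["Chi-squared goodness of fit test with Yates correction",
                    "G goodness of fit test"]
    if (n_variables == 1 and not a_priori
            and max_categories >= 2 and n_samples > 2):
        return ["Chi-squared test for heterogeneity"]
    if n_variables == 2 and not a_priori:
        if max_categories > 2:
            return ["Chi-squared test for association",
                    "G test for association"]
        if max_categories == 2:
            return ["Chi-squared test for association with Yates correction",
                    "G test for association"]
    return []  # design not covered by the table


# The coat colour example: one variable, a genetic (a priori) expectation.
three_colours = choose_test(n_variables=1, a_priori=True, max_categories=3)
two_colours = choose_test(n_variables=1, a_priori=True, max_categories=2)
```

For a cross with three expected coat colours this returns the plain goodness of fit tests; for a cross with two expected coat colours it returns the Yates-corrected chi-squared test alongside the G test.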
To use a chi-squared or G goodness of fit test you:
1. Wish to compare your observed values to those predicted by an a priori expectation.
2. Have one treatment variable.
3. Have only one sample.
4. Have data that falls into more than two categories.
5. Have data that is counts or frequencies and is not percentages or proportions.
6. Have observations that are independent.
7. Have expected values that are more than 5.
In this example most of these criteria are likely to be met. Criterion 7 cannot be confirmed until the data are collected. Here, stud records were pooled across a number of similar genetic crosses to ensure, where possible, that criterion 7 was met. Where criterion 7 was not met, the hypotheses could not be tested.
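As a rough illustration of how the expected values, the check on criterion 7 and the test statistics fit together, here is a minimal Python sketch. The counts and the 3:1 ratio are invented for illustration and are not taken from the stud book. The function names are hypothetical, and the Yates correction is implemented in the usual way, by subtracting 0.5 from each absolute difference (clamped at zero, which is an assumption of this sketch).

```python
import math


def expected_counts(observed, ratio):
    """Scale an expected ratio (e.g. 3:1) to the observed total."""
    total = sum(observed)
    return [total * r / sum(ratio) for r in ratio]


def expected_values_large_enough(expected, minimum=5):
    """Criterion 7: every expected value must be more than 5."""
    return all(e > minimum for e in expected)


def chi_squared_gof(observed, expected, yates=False):
    """Chi-squared goodness of fit statistic, optionally Yates-corrected."""
    stat = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            diff = max(diff - 0.5, 0.0)  # continuity correction
        stat += diff ** 2 / e
    return stat


def g_gof(observed, expected):
    """G goodness of fit statistic: 2 * sum(O * ln(O / E))."""
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)


# Hypothetical data: 100 offspring in two coat colour classes,
# tested against an assumed 3:1 genetic proposal.
observed = [70, 30]
expected = expected_counts(observed, ratio=[3, 1])   # 75 and 25

criterion_7_met = expected_values_large_enough(expected)
chi2 = chi_squared_gof(observed, expected, yates=True)  # two categories
g = g_gof(observed, expected)

print(criterion_7_met, round(chi2, 2), round(g, 2))
```

With two categories there is one degree of freedom, so each statistic would be compared with the chi-squared critical value of 3.841 at p = 0.05. In this invented example both statistics fall below that value, so the observed counts would not differ significantly from the 3:1 proposal.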