Exercise 2
This exercise uses data from a real undergraduate research project. The aim of the exercise is to enable you to integrate many of the topics we covered in chapters 1 to 5, as well as the summary included in appendix b: Which statistical test should I choose?
Example W5.2. Attitudes to over the counter genetic tests.
An undergraduate carrying out her third year honours research project investigated students attitudes to 'over the counter' genetic tests. She asked a group of Biology students and a group of Education (non Biology) students a series of closed questions. The first question made it clear whether the participant knew what was meant by 'genetic testing' (Table W5.4). Having then provided some additional information, the second question asked whether the participants agreed with the idea of over the counter genetic testing (Table W5.5).
Table W5.4: The number of Biology and Education (non-Biology) students who understood the term 'genetic testing'
Yes, the term was understood. |
No, the term was not understood. |
|
|---|---|---|
|
Biology students |
16 |
4 |
Education (non Biology) students |
8 |
12 |
Table W5.5: The number of students that agreed with the idea of offering over the counter genetic tests
Agreed |
Disagreed |
|
|---|---|---|
|
Biology students |
6 |
14 |
Education (non Biology) students |
2 |
18 |
1 |
2 |
3 |
How to calculate Q W2.3a in Excel
Step 1: Enter the data into the spreadsheet using appropriate row and column headings.

Calculate the row totals: use the formula '=b3+c3' in cell d3, and drag it down into cell d4.

Calculate the column totals: use the formula '=b3+b4' in cell b5, and drag it across into cells c5 and d5.

Step 2: Calculate the expected values.
In this case, the expected values are given by (column total/grand total) x row total.
Use two new rows (rows 7 and 8), and in cell b7 type the formula '=b$5/$d$5*$d3'. In this formula, 'b$5' always refers to a column total in row 5; '$d$5' always refers to the grand total in cell d5; and' $d3' always refers to a row total in column d.

Drag this formula across and then down to populate the results space.

Step 3: Calculate the values of (obs-exp)2/exp - but remember to use the Yates' correction: (|obs-exp| - 0.5)2/exp, where the vertical bars mean 'take the absolute value of'. The 'absolute value' of a number (sometimes called its modulus) is the number, but ignoring any minus sign. In Excel, this is done using the 'abs' function.
Use rows 10 and 11, and in cell b10 type in the formula '=(abs(b3-b7)-0.5)^2/b7'. Click on the green tick, or press 'return'.

Drag this across into cell c10, then drag the row down to fill the result space.

Add these together by typing the formula '=sum(b10:c11)' into cell b13. Click on the green tick, or press 'return'.

The value of chi-squaredcalculated is 5.104167.
How to calculate Q W2.3a in SPSS
Step 1: Set up the variables
When SPSS starts, select the 'Type in data' option. Then choose 'variable view' from the tabs at the bottom left.

You will see a screen something like this:

Each row represents a variable for the analysis. We need one variable for the subject studied, and another to indicate whether the student understood 'genetic testing'. Enter 'subject' in the 'Name' field in row 1, and 'underst' in the 'Name' field in row 2.

Both these are categoric variables, so we need to use value labels. Click in the 'values' cell in row 1, and then click in the grey area that appears at the right-hand side of the cell. You will get a dialogue box. Enter a value of 1, and a value label of 'biology', then click on 'add' to register this pair with the system.

Repeat with the value-label pair of 2 and education, then click on 'OK'.
Since our values are integers, we should reduce the 'decimals' property to zero. Click in the 'decimals' cell of row 1, and use the 'up-and-down' arrows that appear to change the 'decimals' property to zero.

Repeat the process for the variable 'underst', using value labels of 'yes' and 'no', and also reducing the number of decimal places to zero.

Change to data view by clicking on the tab at bottom left.
Step 2: Enter the data.
Check that value labels are enabled by going to 'view' and, if necessary, ensuring that there is a tick next to 'value labels'.

Click in the first cell of column 1, and a drop-down menu will become accessible from an arrow at the right-hand side of the cell. Use this to enter the word 'biology'.

Repeat the process until the first 20 cells in the column have been filled. (It may be easier to do some copying and pasting.)

Repeat for the next 20 cells, but insert the word 'education'. Then go to the second column, and insert 'yes' and 'no' in the appropriate locations.

Go to 'Analyze', 'Descriptive Statistics', 'Crosstabs'. In the following dialogue box, click on 'subject' to highlight it, the click on the appropriate arrow to transfer it to the 'row(s) window.

Repeat the process to transfer 'underst' to the 'column(s)' window.
Click on 'Statistics', and make sure that 'Chi-square' is selected.

Click on 'continue' and then on 'OK', and the results will appear in a separate window.
Crosstabs

The final table gives the results of the chi-squared test. The uncorrected value of chi-squared is 6.667, but with the Yates' correction it is 5.104.
How to calculate Q W2.3a in Minitab
Step 1: Enter the observed data into the worksheet window of Minitab.

Calculate the totals for the rows. Go to 'Calc', 'Calculator', enter 'total' in the 'Store result in variable' window, and type '=c2+c3' in the expression window.

Click on 'OK'.

Now for the column totals. Click in the Session (upper) window in Minitab, go to 'Editor' and select 'Enable Commands'.

In column 1 cell 4, write 'total'.
At the 'MTB >' prompt, enter the command 'let c2(4) = sum(c2)'.

Repeat for columns 3 and 4, using the commands 'let c3(4) = sum(c3)' and 'let c4(4) = sum (c4)'.

Step 2: Calculate the expected values. Go to 'Calc', 'Calculator'; enter 'Bio (e)' in 'Store result in variable', and type 'c2(4)/c4(4)*c4' in the expression window.

Click on 'OK'.

Repeat the process for the Education students, using the variable 'Edu (e)' and the formula 'c3(4)/c4(4)*c4'.

Step 3: Calculate the values of chi-squared using the Yates' correction. This involves calculating the absolute value of the difference between the observed and expected values using the 'absolute' operator in Minitab.
Go to 'Calc', 'Calculator', and insert 'chi-sq (Bio)' in the 'Store results in variable' window. Type '(absolute(c2-c5)-0.5)**2/c5' in the expression window.

Click on 'OK'.

Repeat for the Education students, using the variable name 'chi-sq. (Edu)' and the formula '(absolute(c3-c6)-0.5)**2/c6'.

Add the individual values of chi-squared together. The easiest way to do this is to add the column totals for columns 7 and 8, but both of them have a meaningless entry in row 6. Delete this by highlighting it and pressing 'delete'. Now go to 'Calc', Calculator'. In the 'Store result as variable' window, type 'chi-sq. total', and in the expression window type 'sum(c7)+sum(c8)'.

Click on 'OK'.

Therefore, the value of chi-squaredcalculated is 5.10417.
4 |
5 |
6 |
Q W2.3dTherefore, do you reject the null hypothesis? [If you would like to save a record of your answer, please type it into this Word document] |
7 |
8 |
9 |
10 |
There are also instructions on how perform this calculation using the following software packages:
How to calculate Q W2.4c in Excel
Step 1: Enter the data into the spreadsheet using appropriate row and column headings.

Calculate the row totals: use the formula '=b3+c3' in cell d3, and drag it down into cell d4.

Calculate the column totals: use the formula '=b3+b4' in cell b5, and drag it across into cells c5 and d5.

Step 2: Calculate o ln(o) for all these values, where 'o' is an observed value, and 'ln' means 'the natural logarithm of'. This is done by the Excel function 'ln'. The easiest way to do this is to create another table of identical dimensions to the one we already have, and populate it with the numbers we need.

Now calculate the values of o ln(o). We can do this by typing a formula into one cell and dragging it into all the others, because our grids are the same shape and size. In cell b9, type '=b3*ln(b3)', the click on the green tick or press 'return'.

Now drag this cell across the rest of the table. (You may have to drag across a row, and then drag the row down.)

Following the outline in the book, we next add together all the values of o ln(o) for the individual measurements, and place them in a convenient (labelled) cell, say b13. Type in the formula '=sum(b9:c10)', then click on the green tick or press 'return'.

o ln(o) for the grand total is stored in cell d11, so we simply note that it is there: we will need it soon. The next thing to do is to add the values of o ln(o) for the rows and columns together, and put them somewhere convenient (b14). Type the formula '=sum(b11:c11)+sum(d9:d10)' into cell b14, then click on the green tick or press 'return'.

Taking values of o ln(o), we now need to find G = 2 x(measurements + grand total - rows & columns). Using cell b15, type '=2*(b13+d11-b14)', then click on the green tick or press 'return'.

Step 3: To find the Williams' correction, first work out 1/each row total and add these values together. Multiply by the grand total. Subtract 1. To do this, use a formula in a convenient cell, say, e14. The formula is '=(1/d3+1/d4)*d5-1'.

Do the same thing for the columns. The formula is '=(1/b5+1/c5)*d5-1'.

Multiply these two together: the formula is '=e14*e15'.

Now we calculate 6n(rows-1)(columns-1), where n is the total number of observations, and 'rows' and 'columns' are the numbers of rows and columns. Use the formula '=6*d5*(2-1)*(2-1)'.

The Williams correction factor is W = 1 + (rows x cols)/(6n(r-1)(c-1). Use the formula '=1+e16/e17'.

Gcalculated = G/W. The formula is '=b15/e18'.

The value of G is 6.64.
How to calculate Q W2.4c in Excel
SPSS does not do G tests directly.
How to calculate Q W2.4c in Minitab
Step 1: Enter the observed data into the worksheet window of Minitab.

Step 2: Calculate the totals for the rows. Go to 'Calc', 'Calculator', enter 'total' in the 'Store result in variable' window, and type '=c2+c3' in the expression window.

Click on 'OK'.

Step 3: Now for the column totals. Click in the Session (upper) window in Minitab, go to 'Editor' and select 'Enable Commands'.

In column 1 cell 4, write 'total'.
At the 'MTB >' prompt, enter the command 'let c2(4) = sum(c2)'.

Repeat for columns 3 and 4, using the commands 'let c3(4) = sum(c3)' and 'let c4(4) = sum (c4)'.

Step 4: Find the values of O ln (O), where 'ln' means 'the natural logarithm of'.
Go to 'Calc', 'Calculator', and enter 'O ln O (yes)' in the 'Store result in variable' window. In the expression window, type 'c2*loge(c2)', where 'loge' means 'logarithm to base e', or natural logarithm.

Click on 'OK'.

Repeat the process for the 'no' students, remembering that their data is in column 3.

Next, calculate O ln O for the grand total (from cell 4 of column 4). Go to 'Calc', 'Calculator', enter 'k1' in the 'Store result as variable' window, and in the expression window type 'c4(4)*loge(c4(4))'.

Click on 'OK'. There should be no change to the main screen.
Go to 'Project manager' (at bottom left) and open the 'constants' folder. Right-click on the word 'unnamed' next to 'k1', select 'rename' and name it 'o ln(o) (total)'.

(Note the typing error in the image above)
Minimise project manager.
Now calculate o ln(o) for the column totals. Click in the upper (Session) window, go to 'Editor' and select 'Enable commands'.

In column 1 cell 6, type 'o ln(o)'.
At the 'MTB >' prompt, type 'let c2(6) = c2(4)*loge(c2(4))', and press 'return'.

Repeat for column 3.

Add up the o ln(o) for the columns, and store it in a variable. Go to 'Calc', 'Calculator', type 'k2' in the 'Store results in variable' window, and put 'c2(6)+c3(6)' in the expression window.

Click on 'OK', and open the project manager. Rename K2 as 'o ln(o) (cols)'.

Minimise project manager.
Calculate o ln(o) for the total of each row. Go to 'Calc', 'Calculator', enter 'o ln(o) (row total)' in the 'Store result in variable' window, and type in the expression 'c4*loge(c4)'.

Click on 'OK'.

Next, we add the two row totals together, and store them as a constant. Go to 'Calc', Calculator', and type 'k3' in the 'store result in variable' window. Type 'c7(1) + c7(2)' in the expression window.

Click on 'OK', go to project manager, and rename k3 as 'o ln(o) (rows)'.

Finally, we find the sum of all the o ln(o) for the individual measurements, and store that in another constant. This is quickest by summing columns, but there are meaningless numbers in row 4 of columns 5 to 7. Remove these by highlighting them and pressing 'delete'.
Go to 'Calc', 'Calculator', put k4 in the 'store result in variable' window, and type 'sum(c5) + sum(c6)' in the expression window.

Click on 'OK'.
Open the project manager, and rename k4 as 'o ln(o) (individuals)'.

G = 2 x [o ln(o) (total) + o ln(o) (individuals) - o ln(o) (rows) - o ln(o) (columns)]
Go to 'Calc', 'Calculator', put k5 in the 'Store result in variable' window, and type '2*(k1+k4-k3-k2)' in the expression window.

Click on 'OK'.
Go to project manager, and rename k5 as 'G'.

Step 5: Now we need to apply the Williams correction.
First, we find 1/row total for each row, add them all together, multiply by the grand total, and subtract 1. This is slightly messy, but go to 'Calc', 'Calculator', put k6 in the 'store result in variable' window, and type '(1/c4(1)+1/c4(2))*c4(4)-1' in the expression window.

Click on 'OK'. Got to [project manager, and rename k6 as 'W (rows)'.

Repeat for the column totals, using k7 and the formula '(1/c2(4)+1/c3(4))*c4(4)-1'.

W is 1 + (W (rows) x W (cols))/(6 x total x (cols-1) x (rows-1). We have a 2 x 2 results table, an the total number of observations is in c4(4).
Go to 'Calc', 'Calculator', put k8 in the 'Store result in variable' window, and type '1+k6*k7/(6*c4(4)*(2-1)*2-1))'.

Click on 'OK'. Go to project manager, and rename k8 as W.

Gcalculated = G/W. This is k5/k8: put the result in G(calculated).

Click on 'OK'.

The value of Gcalculated is therefore 6.64148.
11 |
12 |
13 |
Q W2.4fTherefore, do you reject the null hypothesis? [If you would like to save a record of your answer, please type it into this Word document] |
14 |
15 |
16 |