Exercise 1
This exercise uses data from a real undergraduate research project. The aim of the question is to test your understanding of the topics covered in 5.3 and 5.6.
Example W5.1: The movement of European house dust mites (Dermatophagoides pteronyssinus) on different types of carpet
A final year honours student was investigating the distribution of house dust mites in relation to different types of carpet. As part of her investigation she wanted to see if dust mites moved at the same rate through these various carpets. She obtained four different carpet tiles: short pile wool, deep pile wool, short pile synthetic and deep pile synthetic. The carpet tiles were inoculated with dust mites and after 4 weeks examined to see how far the mites had travelled (Table W5.1).
Table W5.1: The distance travelled by European house dust mites on four types of carpets after four weeks.
| Carpet type | Number of dust mites between 0 - 4.9 cm | Number of dust mites between 5 - 9.9 cm | Number of dust mites between 10 - 14.9 cm |
|---|---|---|---|
| Short pile wool | 1280 | 11 | 0 |
| Deep pile wool | 840 | 34 | 0 |
| Short pile synthetic | 877 | 33 | 5 |
| Deep pile synthetic | 995 | 9 | 0 |
1 |
Q W1.1 The student wishes to test the hypothesis that there is no association between the type of carpet and the distance travelled, and she is planning to use the r x c chi-squared test for association (see Chapter 5, 5.3). Do the data fit the criteria for using this test? [If you would like to save a record of your answer, please type it into this Word document] |
2 |
3 |
4 |
There are also instructions on how to perform this calculation using the following software packages:
How to calculate Q W1.3b in Excel
Step 1: Put the data into the spreadsheet using appropriate row and column headings.

(The '(o)' means 'observed', to distinguish them from the expected values to be calculated later.)
Calculate the totals for the rows and columns. Start with the rows: in cell d3, type the formula '=sum(b3:c3)', and click on the green tick or press 'return'.

Click on cell d3 to highlight it, then hover the cursor over the bottom-right corner of the cell. It should change from an open horizontal-vertical cross into an addition sign. When it does, hold down the left mouse button, move the cursor down into cell d6, and release the button. The formula will have been copied, and the total for the other rows calculated.

Now for the columns: start in cell b7, and enter the formula '=sum(b3:b6)'. Click on the green tick or press 'return', then drag this across into cells c7 and d7.
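The same totals can be cross-checked outside the spreadsheet. The following is a minimal Python sketch, not part of the original instructions. It assumes, as the Minitab instructions for this question do, that the 5 - 9.9 cm and 10 - 14.9 cm columns of Table W5.1 have been combined into a single 'far' (5.0-14.9 cm) category; the variable names are illustrative only.

```python
# Observed counts from Table W5.1 as (near, far) pairs.
# Assumption: the 5-9.9 cm and 10-14.9 cm columns are merged into "far".
observed = {
    "Short pile wool": (1280, 11 + 0),
    "Deep pile wool": (840, 34 + 0),
    "Short pile synthetic": (877, 33 + 5),
    "Deep pile synthetic": (995, 9 + 0),
}

# Row totals (column d in the spreadsheet).
row_totals = {carpet: near + far for carpet, (near, far) in observed.items()}

# Column totals and grand total (row 7 in the spreadsheet).
near_total = sum(near for near, _ in observed.values())
far_total = sum(far for _, far in observed.values())
grand_total = near_total + far_total

for carpet, total in row_totals.items():
    print(carpet, total)
print("Column totals:", near_total, far_total, "Grand total:", grand_total)
```

If the spreadsheet has been set up correctly, cells d3 to d6 and row 7 should show the same numbers as this sketch prints.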

Step 2: Calculate the expected values.
Create four new rows for the expected values, using '(e)' to indicate expected values. (In this example, rows 9 to 12 will be used.)
You could create a formula to go in each cell, but it is quicker to create one formula that can be dragged across the results space. Go to cell b9 and enter the formula '=b$7/$d$7*$d3'. The 'b$7' will always refer to a column total in row 7; the '$d$7' will always refer to the grand total in cell d7; and the '$d3' will always refer to a row total in column d. Click on the green tick or press 'return', then drag the formula across into cell c9. You can now drag the formula down to cover the whole of the results space.
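The spreadsheet formula is the usual rule: expected value = column total / grand total x row total. As a hedged illustration (a Python sketch, not part of the original instructions), the same rule can be applied to every cell at once. The 'near'/'far' layout assumes the two longer distance classes of Table W5.1 have been combined, and the names used here are illustrative.

```python
# Observed counts, one row per carpet type: [near (0-4.9 cm), far (5.0-14.9 cm)].
# Assumption: the 5-9.9 cm and 10-14.9 cm columns have been merged into "far".
observed = [
    [1280, 11],  # Short pile wool
    [840, 34],   # Deep pile wool
    [877, 38],   # Short pile synthetic
    [995, 9],    # Deep pile synthetic
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Same rule as the spreadsheet formula '=b$7/$d$7*$d3'.
expected = [
    [col_total / grand_total * row_total for col_total in col_totals]
    for row_total in row_totals
]

for row in expected:
    print([round(value, 2) for value in row])
```

A useful check is that each row of expected values adds up to the corresponding observed row total.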

Step 3: Calculate chi-squared for this test for association. First, calculate the values of (obs - exp)²/exp for every combination of carpet type and distance category. Use rows 14 to 17, and enter the formula '=(b3-b9)^2/b9' into cell b14. Click on the green tick, or press 'return'.

Because the observed and expected values are in identically arranged arrays, this formula can be dragged across and down to cell c17 to calculate all the other values of (obs - exp)²/exp. (You will have to do the drag across first, then drag the whole row down.)

Next, add them all up to find the value of chi-squared. They are in a nice rectangular array, so a single 'sum' function will do the job. We shall use cell b19, and the formula '=sum(b14:c17)'.

Therefore, the value of chi-squared is 45.537953.
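The whole spreadsheet calculation can be reproduced in a few lines as a cross-check. This is a minimal Python sketch, not part of the original instructions; it assumes the 'near'/'far' layout in which the 5 - 9.9 cm and 10 - 14.9 cm classes of Table W5.1 are combined, and the variable names are illustrative.

```python
# Observed counts, one row per carpet type: [near (0-4.9 cm), far (5.0-14.9 cm)].
# Assumption: the 5-9.9 cm and 10-14.9 cm columns have been merged into "far".
observed = [
    [1280, 11],  # Short pile wool
    [840, 34],   # Deep pile wool
    [877, 38],   # Short pile synthetic
    [995, 9],    # Deep pile synthetic
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (obs - exp)^2 / exp over all eight cells.
chi_squared = 0.0
for obs_row, row_total in zip(observed, row_totals):
    for obs, col_total in zip(obs_row, col_totals):
        exp = col_total / grand_total * row_total
        chi_squared += (obs - exp) ** 2 / exp

print(round(chi_squared, 4))
```

The printed value should agree with the figure in cell b19 to the number of decimal places shown.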
How to calculate Q W1.3b in SPSS
It seems that the only way to do this using SPSS is to enter each dust mite on a separate row - SPSS does not work well with contingency tables. Each mite would have a categorical variable for carpet type, and another for distance moved (perhaps 'near' and 'far'). The problem here is that we have over 4000 dust mites (4084 in total), so our results table would need over 4000 rows. This would involve an awful lot of typing (or copying and pasting - another feature that SPSS doesn't seem to support very well), so it is probably easier to use another package for this type of analysis.
If you really want to work through this example using SPSS, follow the instructions in the Statistics Software section of the Online Resource Centre for SPSS Box 5.4.
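To give a sense of what the one-row-per-mite layout involves, the following Python sketch expands the contingency table into that long format and then tallies it back again. It is an illustration only and does not use SPSS; the 'near' and 'far' labels follow the suggestion above, with 'far' assumed to cover 5.0-14.9 cm, and the variable names are hypothetical.

```python
from collections import Counter

# Contingency table as (near, far) counts per carpet type.
# Assumption: "far" combines the 5-9.9 cm and 10-14.9 cm columns of Table W5.1.
table = {
    "Short pile wool": (1280, 11),
    "Deep pile wool": (840, 34),
    "Short pile synthetic": (877, 38),
    "Deep pile synthetic": (995, 9),
}

# One (carpet type, distance) record per mite.
records = []
for carpet, (near, far) in table.items():
    records.extend([(carpet, "near")] * near)
    records.extend([(carpet, "far")] * far)

print(len(records))  # number of rows the data sheet would need

# Tallying the records recovers the original cell counts.
counts = Counter(records)
print(counts[("Deep pile wool", "far")])
```

Generating the rows programmatically like this, and then importing them, would avoid typing each record by hand.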
How to calculate Q W1.3b in Minitab
Step 1: Enter the data into the worksheet section of the Minitab screen. The letters 'o' in brackets at the end of the column names indicate observed values.

Step 2: Calculate the totals for the rows. Go to 'Calc', 'Calculator'; type 'c2 + c3' in the expression box, and type 'total (o)' in the 'Store result in variable' box.

Now click on 'OK'.

Step 3: Calculate the totals for the columns. In c1(6), enter 'total' (leave row 5 blank to avoid confusion between data and totals).
Click in the 'Session' (top) window, go to 'Editor' and select 'Enable Commands'.

At the 'MTB >' prompt, type 'let c2(6) = sum(c2)'. This will add all the numbers in column 2, and place the result in cell 6 in column 2.

Repeat the process for columns 3 and 4.

Step 4: Calculate the expected values. Go to 'Calc', 'Calculator' and enter '0.0-4.9 cm (e)' into the 'Store result in variable' box. Enter 'c2(6)/c4(6)*c4' into the expression box.

Now click on 'OK'.

Repeat the process for the expected values for 5.0-14.9 cm, using the column heading '5.0-14.9 cm (e)' and the expression 'c3(6)/c4(6)*c4'.

Step 5: Calculate the test value of chi-squared. This is the sum of terms of the form '(observed - expected)²/(expected)', and in this case we have eight of them. The easiest way to do this is to calculate the eight parts separately, in two columns, and then add them all up.
Go to 'Calc', 'Calculator' and enter 'chi-sq. (near)' ('near' meaning 0.0-4.9 cm) into the 'Store result in variable' box, and type '(c2-c5)**2/c5' into the expression window.

Now click on 'OK'.

Repeat the process for the far (5.0-14.9 cm) data, using the column heading 'chi-sq. (far)' and the expression '(c3-c6)**2/c6'.

The total chi-squared is found by adding all these together. Go to 'Calc', 'Calculator' and type in the expression 'sum(c7)+sum(c8)'. Place this in a variable called simply 'chi-sq.'. (Actually, we only need the first four values in these columns, but the values in row 6 are both zero.)

Click on 'OK'.

Therefore, the calculated value of chi-squared is 45.5380.
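The two-column approach used in Minitab also shows how much each distance category contributes to the total. The following Python sketch mirrors the 'chi-sq. (near)' and 'chi-sq. (far)' columns; it is an illustration rather than Minitab output, and the variable names are assumptions.

```python
# Observed counts per carpet type, in the order of Table W5.1.
near_obs = [1280, 840, 877, 995]  # 0.0-4.9 cm
far_obs = [11, 34, 38, 9]         # 5.0-14.9 cm (two longer classes combined)

row_totals = [n + f for n, f in zip(near_obs, far_obs)]
grand_total = sum(row_totals)

# Expected values: column total / grand total * row total.
near_exp = [sum(near_obs) / grand_total * t for t in row_totals]
far_exp = [sum(far_obs) / grand_total * t for t in row_totals]

# One (observed - expected)^2 / expected term per carpet type, per column.
chi_sq_near = [(o - e) ** 2 / e for o, e in zip(near_obs, near_exp)]
chi_sq_far = [(o - e) ** 2 / e for o, e in zip(far_obs, far_exp)]

total_chi_sq = sum(chi_sq_near) + sum(chi_sq_far)
print(round(sum(chi_sq_near), 3), round(sum(chi_sq_far), 3), round(total_chi_sq, 4))
```

Almost all of the total comes from the 'far' column, where the expected counts are small, so modest differences in the number of mites travelling 5 cm or more have a large effect on the test statistic.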
5 |
6 |
7 |
Q W1.3e Therefore, do you reject the null hypothesis? [If you would like to save a record of your answer, please type it into this Word document] |
8 |