8.4 SIGNIFICANCE TESTING ON WEIGHTED TABLES
When doing significance testing with weighted data, it is recommended that you create the effective base row, even when the percentage base is the System Weighted Total or the System Weighted Any Response row. The effective base row is needed to verify any tests on weighted data.
The effective base is an estimation of how the weighting is affecting the test. It is the actual base number that is used when determining whether two samples are significantly different. The effective base will never be higher than the original unweighted base and will usually be slightly less. As the variance in the weights increases, the effective base decreases in order to compensate for the likely change in the percentages that will occur. Without this correction, some weighting factor could always be applied, which would make any item significantly greater than any other. Since the effective base is so integral to the test, it is recommended that it be printed on the table so that it can be determined how the weighting might be affecting the significance testing.
There are two different ways to create the effective base row. You can use either the $[BASE] or $[EFFECTIVE_N] keywords. The $[BASE] keyword creates two different rows, both of which are needed for the significance tests: the weighted total (which is needed to properly calculate all the percentages) and the effective base (which is used as the base for the significance test). The $[EFFECTIVE_N] keyword can be used to create the effective base when the percentage base is either the System Total or Any Response rows. In this case you do not need to again specify the percentage base because the system has already calculated it.
In the example below, the $[BASE] keyword is used to create the effective base. A new STUB_PREFACE is defined because the SET UNWEIGHTED_TOP option is used to also produce an unweighted total row.
NOTE: The following set of commands defines a standard front end for the next set of examples
>PURGE_SAME
>PRINT_FILE STAT4
~INPUT DATA
~SET DROP_LOCAL_EDIT,BEGIN_TABLE_NAME=T401
~DEFINE
TABLE_SET= { BAN1:
EDIT=: COLUMN_WIDTH=7,STUB_WIDTH=30,-COLUMN_TNA,STATISTICS_DECIMALS=2,
-PERCENT_SIGN,DO_STATISTICS=.95,RUNNING_LINES=1 }
BANNER=:
| GENDER AGE ADVERTISING AWARENESS
| <=========> <=================> <=========================>
| TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
| ----- ---- ------ ----- ----- ----- ------ ------ ------ ------
}
COLUMN=: TOTAL WITH [5^1/2] WITH [6^1//3] WITH [7^1//4]
}
Example:
STUB= STUB_TOP_UNWGT:
[-VERTICAL_PERCENT] UNWEIGHTED TOTAL
[SUPPRESS] UNWEIGHTED NO ANSWER
[SUPPRESS] WEIGHTED TOTAL
[SUPPRESS] WEIGHTED NO ANSWER }
TABLE_SET= { TAB401:
WEIGHT=: SELECT_VALUE([6^1//3/X],VALUES(1.021,.880,1.130,1))
SET UNWEIGHTED_TOP
STATISTICS=: I=BC,I=DEF,GHIJ;
STUB_PREFACE= STUB_TOP_UNWGT
HEADER=: WEIGHTED TABLE WITH STATISTICAL TESTING
(USING BASE ROW) }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
[BASE_ROW,VERTICAL_PERCENT=*] WEIGHTED TOTAL (% BASE)
[-VERTICAL_PERCENT] EFFECTIVE BASE (STAT BASE)
NET GOOD
| VERY GOOD
| GOOD
FAIR
NET POOR
| POOR
| VERY POOR
DON’T KNOW/REFUSED
[STAT,LINE=0] MEAN
[STAT,LINE=0] STD DEVIATION
[STAT,LINE=0] STD ERROR }
ROW=: $[BASE] TOTAL $[] [11^4,5/5/4/3/1,2/2/1/X] $[MEAN,STD,SE] [11]
STORE_TABLES=*
}
Here is the table that is printed:
WEIGHTED TABLE WITH STATISTICAL TESTING (USING BASE ROW)
TABLE 401
RATING OF SERVICE
BASE= TOTAL SAMPLE
GENDER AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
----- ---- ------ ----- ----- ----- ------ ------ ------ ------
UNWEIGHTED TOTAL 400 196 204 125 145 113 91 108 107 176
WEIGHTED TOTAL (% BASE) 400 194 206 128 128 128 91 108 106 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (G) (H) (I) (J)
EFFECTIVE BASE (STAT BASE) 396 194 202 125 145 113 90 107 106 174
NET GOOD 200 120C 80 58 55 84DE 43 55 54 106G
50.1 62.0 38.9 45.6 43.4 65.5 47.5 51.3 51.0 60.6
VERY GOOD 102 69C 33 36 33 31 19 32 30 60G
25.4 35.7 15.8 28.0 26.2 23.9 21.1 29.5 27.8 33.9
GOOD 99 51 48 22 22 53DE 24 24 25 47
24.6 26.3 23.1 17.6 17.2 41.6 26.3 21.8 23.2 26.7
FAIR 89 42 47 29F 39F 16 19 20 23 35
22.3 21.8 22.8 22.4 30.3 12.4 20.7 18.3 21.7 19.9
NET POOR 83 17 65B 30 26 23 19 27J 22 24
20.7 8.9 31.7 23.2 20.7 17.7 20.4 24.9 20.8 13.4
POOR 39 8 30B 12 15 11 8 13 13 12
9.6 4.2 14.7 9.6 11.7 8.8 8.8 11.8 12.0 7.0
VERY POOR 44 9 35B 17 11 11 11 14J 9 11
11.0 4.7 17.0 13.6 9.0 8.8 11.6 13.1 8.8 6.4
DON’T KNOW/REFUSED 28 14 14 11 7 6 10 6 7 11
7.0 7.3 6.7 8.8 5.5 4.4 11.4 5.5 6.5 6.1
MEAN 3.47 3.91C 3.07 3.40 3.42 3.66 3.41 3.45 3.53 3.79GHI
STD DEVIATION 1.31 1.12 1.35 1.41 1.28 1.22 1.31 1.40 1.30 1.20
STD ERROR 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09
--------------------------------
(sig=.05) (all_pairs) columns tested BC, DEF, GHIJ
In the preceding table, the unweighted total, weighted total, and effective base are all printed. Compare the three numbers and notice that the effective base is at most a little less than the unweighted total. This is because all the weights were close together (near 1.00), so the weighting did not substantially change the percentages on the table. Compare these percentages with those that were printed on Table 101. If the weights had a greater variance (for example, respondents were assigned weights between 5 and .2), the effective base would have been much less than the unweighted total.
To see how the effective base works, look at the TOTAL column in the above table. Notice that the weighted and unweighted totals are both 400, because weights were chosen to weight the sample back to its original size. Also notice that the effective base is only 396, which is due to the minor variation in the weights that were applied to this table. The formula for the effective base is as follows:
EB= WEIGHTED TOTAL SQUARED DIVIDED BY THE SUM OF THE SQUARE OF EACH WEIGHT
Reproduce the number 396 from above by plugging in all the appropriate numbers from TABLE 401.
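EB= (WT)**2 / SUM( Fn*(Wn**2))
where WT is the weighted total, Wn is each distinct weight, and Fn is the number of respondents who received that weight.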
EB= ((400)**2) / ((125*(1.021**2)) + (145*(.880**2)) + (113*(1.13**2)) + (17*(1**2)))
EB= 160000 / ( 130.30 + 112.29 + 144.29 + 17)
EB= 160000 / 403.88
EB= 396.16
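The arithmetic above can be checked with a few lines of Python. This is only a sketch of the formula as stated in this section, not Mentor's own code; the function name effective_base and the (count, weight) layout are illustrative choices. Note that the worked example rounds the weighted total to 400, while the exact weighted total is slightly lower, so the two results differ in the first decimal place but both round to 396.

```python
def effective_base(weight_counts):
    """Effective base = (weighted total)**2 / sum of squared weights.

    weight_counts is a list of (respondent_count, weight) pairs,
    one pair per distinct weight (an illustrative layout).
    """
    weighted_total = sum(n * w for n, w in weight_counts)
    sum_of_squares = sum(n * w * w for n, w in weight_counts)
    return weighted_total ** 2 / sum_of_squares

# Unweighted counts and weights taken from Table 401 (AGE groups plus
# the 17 respondents who received a weight of 1).
groups = [(125, 1.021), (145, 0.880), (113, 1.130), (17, 1.0)]

sum_of_squares = sum(n * w * w for n, w in groups)
eb_exact = effective_base(groups)          # uses the exact weighted total
eb_rounded = 400 ** 2 / sum_of_squares     # uses the rounded total of 400

print(round(eb_exact), round(eb_rounded))
```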
An important characteristic of the effective base is demonstrated in the 31-50 AGE column, where the unweighted total and the effective base are both 145 while the weighted total is only 128. Since the weighting on this table was based on AGE and everyone in that column was weighted by the same factor of 0.880, the weighted total drops to 128. However, since there is no variance in the weighting, the effective base remains unchanged. Furthermore, the percentages in that column are exactly the same as those in Table 101.
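The following Python sketch illustrates the two behaviors described in this section: a column in which every respondent has the same weight keeps its full base, while widely varying weights shrink it. The formula is the one given above; the 50/50 split between weights of 5 and .2 is a made-up illustration, not data from any table in this chapter.

```python
def effective_base(weights):
    """Effective base from a list holding one weight per respondent."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

# 31-50 AGE column of Table 401: 145 respondents, all weighted by 0.880.
uniform = [0.880] * 145
uniform_eb = effective_base(uniform)

# Hypothetical column: 100 respondents, half weighted 5 and half weighted .2.
spread = [5.0] * 50 + [0.2] * 50
spread_eb = effective_base(spread)

print(round(uniform_eb, 2))   # stays at the unweighted base of 145
print(round(spread_eb, 2))    # falls to roughly 54 of 100 respondents
```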
Exactly the same table could be produced by using the $[EFFECTIVE_N] keyword instead of the $[BASE] keyword and a different STUB_PREFACE.
STUB= STUB_TOP_WGT:
[-VERTICAL_PERCENT] UNWEIGHTED TOTAL
[SUPPRESS] UNWEIGHTED NO ANSWER
[BASE_ROW] WEIGHTED TOTAL
[SUPPRESS] WEIGHTED NO ANSWER }
TABLE_SET= { TAB402:
STUB_PREFACE= STUB_TOP_WGT
HEADER=: WEIGHTED TABLE WITH STATISTICAL TESTING (USING EFFECTIVE_N) }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
[-VERTICAL_PERCENT] EFFECTIVE BASE (STAT BASE)
NET GOOD
| VERY GOOD
| GOOD
FAIR
NET POOR
| POOR
| VERY POOR
DON’T KNOW/REFUSED
[STAT,LINE=0] MEAN
[STAT,LINE=0] STD DEVIATION
[STAT,LINE=0] STD ERROR }
ROW=: $[EFFECTIVE_N] TOTAL $[] [11^4,5/5/4/3/1,2/2/1/X] $[MEAN,STD,SE] [11]
STORE_TABLES=* }
The printed table would be basically the same as Table 401.
8.4.1 Weighted Tables with Different Weights
When performing significance testing in conjunction with applying different weights to different columns, use the SET option MULTIPLE_WEIGHT_STATISTICS. This option allows significance testing on similarly weighted columns when the table has columns with varying weights. However, it does not allow a given respondent to have a different weight in the same test. You can test independent columns with different weights, but dependent columns must have the same weights applied to them. For instance, MULTIPLE_WEIGHT_STATISTICS allows significance testing on a table where both a weighted and unweighted total column have been created, but it does not allow the unweighted total to be tested against any of the weighted columns. If this statement is not used, then the program will print an error message if a STATISTICS statement is used in conjunction with a COLUMN_SHORT_WEIGHT or COLUMN_WEIGHT table element.
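The rule in the preceding paragraph (independent columns may carry different weights, dependent columns must share one weight) can be expressed as a small check. The Python below is a hypothetical sketch, not a Mentor facility: the function name, the (letters, independent) tuples, and the weight labels are all assumptions made for illustration.

```python
def untestable_groups(groups, column_weight):
    """Return the test groups that break the same-weight rule.

    groups: list of (column_letters, independent) tuples, where
            independent is True for tests marked I=.
    column_weight: maps a column letter to a label for its weight
                   (None meaning unweighted).
    """
    bad = []
    for letters, independent in groups:
        distinct = {column_weight[c] for c in letters}
        # Dependent columns must all share a single weight.
        if not independent and len(distinct) > 1:
            bad.append(letters)
    return bad

# Column A is an unweighted total; B through K share one weight.
column_weight = {"A": None}
column_weight.update({c: "age_weight" for c in "BCDEFGHIJK"})

ok_result = untestable_groups(
    [("CD", True), ("EFG", True), ("HIJK", False)], column_weight)
bad_result = untestable_groups([("AB", False)], column_weight)

print(ok_result)    # []
print(bad_result)   # ['AB']
```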
The example below shows how to produce an unweighted total column and still do significance testing on the rest of the table. Notice in the STATISTICS statement that all the letters are one later in the alphabet than in previous statements because an additional category has been added to the column variable.
TABLE_SET= { TAB403:
HEADER=: TABLE WITH STATISTICAL TESTING AND DIFFERENT WEIGHTS APPLIED
TO DIFFERENT COLUMNS OF THE TABLE }
SET MULTIPLE_WEIGHT_STATISTICS
COLUMN_SHORT_WEIGHT=: TOTAL WITH &
SELECT_VALUE([6^1//3/X],VALUES(1.021,.880,1.130,1))
STATISTICS=: I=CD,I=EFG,HIJK;
BANNER=:
| SEX AGE ADVERTISING AWARENESS
| UNWGHT WGHT <==========> <===================> <============================>
| TOTAL TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
| ------ ----- ---- ------ ----- ----- ----- ------ ------ ------ ------}
COLUMN=: TOTAL WITH TOTAL WITH [5^1/2] WITH [6^1//3] WITH [7^1//4]
TITLE= TAB402
TITLE_4= TAB402
STUB= TAB402
ROW= TAB402
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH STATISTICAL TESTING AND DIFFERENT WEIGHTS APPLIED TO DIFFERENT COLUMNS OF THE TABLE
TABLE 403
RATING OF SERVICE
BASE= TOTAL SAMPLE
SEX AGE ADVERTISING AWARENESS
UNWGHT WGHT <==========> <===================> <============================>
TOTAL TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
------ ----- ---- ------ ----- ----- ----- ------ ------ ------ ------
UNWEIGHTED TOTAL 400 400 196 204 125 145 113 91 108 107 176
WEIGHTED TOTAL 400 400 194 206 128 128 128 91 108 106 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % % %
(C) (D) (E) (F) (G) (H) (I) (J) (K)
EFFECTIVE BASE (STAT BASE) 400 396 194 202 125 145 113 90 107 106 174
NET GOOD 197 200 120D 80 58 55 84EF 43 55 54 106H
49.2 50.1 62.0 38.9 45.6 43.4 65.5 47.5 51.3 51.0 60.6
VERY GOOD 102 102 69D 33 36 33 31 19 32 30 60H
25.5 25.4 35.7 15.8 28.0 26.2 23.9 21.1 29.5 27.8 33.9
GOOD 95 99 51 48 22 22 53EF 24 24 25 47
23.8 24.6 26.3 23.1 17.6 17.2 41.6 26.3 21.8 23.2 26.7
FAIR 92 89 42 47 29G 39G 16 19 20 23 35
23.0 22.3 21.8 22.8 22.4 30.3 12.4 20.7 18.3 21.7 19.9
NET POOR 83 83 17 65C 30 26 23 19 27K 22 24
20.8 20.7 8.9 31.7 23.2 20.7 17.7 20.4 24.9 20.8 13.4
POOR 39 39 8 30C 12 15 11 8 13 13 12
9.8 9.6 4.2 14.7 9.6 11.7 8.8 8.8 11.8 12.0 7.0
VERY POOR 44 44 9 35C 17 11 11 11 14K 9 11
11.0 11.0 4.7 17.0 13.6 9.0 8.8 11.6 13.1 8.8 6.4
DON’T KNOW/REFUSED 28 28 14 14 11 7 6 10 6 7 11
7.0 7.0 7.3 6.7 8.8 5.5 4.4 11.4 5.5 6.5 6.1
MEAN 3.46 3.47 3.91D 3.07 3.40 3.42 3.66 3.41 3.45 3.53 3.79HIJ
STD DEVIATION 1.31 1.31 1.12 1.35 1.41 1.28 1.22 1.31 1.40 1.30 1.20
STD ERROR 0.07 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09
--------------------------------
(sig=.05) (all_pairs) columns tested CD, EFG, HIJK
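When a column is added at the front of the banner, every letter in the STATISTICS statement moves one position later, as it did between Table 401 and Table 403. The Python helper below is a hypothetical convenience for making that shift, not part of Mentor; it assumes the statement consists only of comma-separated letter groups with an optional prefix such as I= and a closing semicolon, and it does not guard against shifting past Z.

```python
def shift_statistics_letters(statement, offset=1):
    """Shift every column letter in a STATISTICS statement by offset."""
    shifted_groups = []
    for group in statement.rstrip(";").split(","):
        # Split off an optional prefix such as "I=" and keep it unchanged.
        prefix, sep, letters = group.rpartition("=")
        moved = "".join(chr(ord(ch) + offset) for ch in letters)
        shifted_groups.append(prefix + sep + moved)
    return ",".join(shifted_groups) + ";"

shifted = shift_statistics_letters("I=BC,I=DEF,GHIJ;")
print(shifted)   # I=CD,I=EFG,HIJK;
```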
8.5 PRINT PHASE STATISTICAL TESTING
Print phase statistical testing is calculated from the numbers that are printed on the table. This approach has several advantages, but it also has several drawbacks.
The advantages:
- Table processing time will be much faster.
- Tables can be loaded into table manipulation, altered, and still be tested.
- A different type of test can be performed on each row in the table, including a Kruskal-Wallis test on the COLUMN_MEAN.
- The columns being tested can be changed without rereading any data.
The limitations:
- The columns must be independent or inclusive.
- The data cannot be weighted.
- Means created during the data reading phase with $[MEAN] will not be tested. Only Means created using the EDIT option COLUMN_MEAN can be tested.
- If testing errors are made, such as testing dependent or weighted columns or failing to mark inclusive tests as inclusive, no error messages will be generated and possibly incorrect statistical markings will be printed on the table.
NOTE: Both print phase and table building phase significance testing can be performed on the same table, although it is recommended that each test different sets of columns.
8.5.1 EDIT Options
To produce print phase significance testing, you need to use the EDIT options DO_PRINTER_STATISTICS and DO_STATISTICS_TESTS instead of the STATISTICS statement that is used for table building phase tests. The DO_PRINTER_STATISTICS option lets the program know you are going to be doing print phase testing and the DO_STATISTICS_TESTS specifies the columns to be tested.
Like the STATISTICS statement, the DO_STATISTICS_TESTS option uses letters to designate which columns are being tested and a comma to separate multiple tests. Unlike the STATISTICS statement, all tests must be independent (there is no I= option). Tests may be inclusive, however, and must be marked as such by using T=. T values may be printed as with the STATISTICS statement, but the PRINTABLE_T option cannot be used to error check the tests. See 8.7 PRINTING THE ACTUAL T AND SIGNIFICANCE VALUES for more information about printing T values for print phase based tests and an example of using the T= option.
To test the first three banner points against each other, the EDIT statement would look something like:
EDIT= EDIT1: DO_PRINTER_STATISTICS,DO_STATISTICS_TESTS=ABC }
The other EDIT options, such as DO_STATISTICS= to set the confidence level, and NEWMAN_KEULS_TEST to perform a Newman Keuls test, work the same way with the same defaults.
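To see which column comparisons a DO_STATISTICS_TESTS value implies, the letter groups can be expanded into pairs. The Python below is a sketch under two assumptions: that every pair within a group is compared (as with an all possible pairs test), and that the value contains only plain letter groups. It does not handle the T= marker for inclusive tests, and the function name is illustrative.

```python
from itertools import combinations

def column_pairs(tests):
    """Expand a value such as 'BC,DEF' into zero-based column index pairs."""
    pairs = []
    for group in tests.split(","):
        # Column A is index 0, B is index 1, and so on.
        columns = [ord(ch) - ord("A") for ch in group.strip()]
        pairs.extend(combinations(columns, 2))
    return pairs

pairs = column_pairs("BC,DEF")
print(pairs)   # [(1, 2), (3, 4), (3, 5), (4, 5)]
```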
NOTE: The following set of commands defines a standard front end for the next set of examples
>PURGESAME
>PRINT_FILE STAT5
~INPUT DATA
~SET DROP_LOCAL_EDIT,BEGIN_TABLE_NAME=T501
~DEFINE
STUB= STUBTOP1:
[BASE_ROW] TOTAL
[SUPPRESS] NO ANSWER }
TABLE_SET= { BAN1:
EDIT=: COLUMN_WIDTH=7,STUB_WIDTH=30,-COLUMN_TNA,STATISTICS_DECIMALS=2,
-PERCENT_SIGN,RUNNING_LINES=1 }
STUB_PREFACE= STUBTOP1
BANNER=:
| SEX AGE ADVERTISING AWARENESS
| <=========> <=================> <=========================>
| TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
| ----- ---- ------ ----- ----- ----- ------ ------ ------ ------}
COLUMN=: TOTAL WITH [5^1/2] WITH [6^1//3] WITH [7^1//4]
}
Example:
TABLE_SET= { TAB501:
LOCAL_EDIT=: DO_PRINTER_STATISTICS,DO_STATISTICS_TESTS=BC,DEF ALL_POSSIBLE_PAIRS_TEST DO_STATISTICS=.95
COLUMN_STATISTICS_VALUES=VALUES(,5,4,3,,2,1) COLUMN_MEAN,COLUMN_STD COLUMN_SE }
HEADER=: TABLE WITH STATISTICAL TESTING DONE DURING THE PRINT PHASE }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
NET GOOD
| VERY GOOD
| GOOD
FAIR
NET POOR
| POOR
| VERY POOR
DON’T KNOW/REFUSED
[PRINT_ROW=MEAN,LINE=0] MEAN
[PRINT_ROW=STD,LINE=0] STD DEV
[PRINT_ROW=SE,LINE=0] STD ERR }
ROW=: [11^4,5/5/4/3/1,2/2/1/X]
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH STATISTICAL TESTING DONE ON THE PRINTED NUMBERS
TABLE 501
RATING OF SERVICE
BASE= TOTAL SAMPLE
SEX AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
----- ---- ------ ----- ----- ----- ------ ------ ------ ------
TOTAL 400 196 204 125 145 113 91 108 107 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F)
NET GOOD 197 120C 77 57 63 74DE 42 55 54 105
49.2 61.2 37.7 45.6 43.4 65.5 46.2 50.9 50.5 59.7
VERY GOOD 102 70C 32 35 38 27 19 32 30 60
25.5 35.7 15.7 28.0 26.2 23.9 20.9 29.6 28.0 34.1
GOOD 95 50 45 22 25 47DE 23 23 24 45
23.8 25.5 22.1 17.6 17.2 41.6 25.3 21.3 22.4 25.6
FAIR 92 44 48 28F 44F 14 20 20 24 36
23.0 22.4 23.5 22.4 30.3 12.4 22.0 18.5 22.4 20.5
NET POOR 83 18 65B 29 30 20 19 27 22 24
20.8 9.2 31.9 23.2 20.7 17.7 20.9 25.0 20.6 13.6
POOR 39 9 30B 12 17 10 8 13 13 13
9.8 4.6 14.7 9.6 11.7 8.8 8.8 12.0 12.1 7.4
VERY POOR 44 9 35B 17 13 10 11 14 9 11
11.0 4.6 17.2 13.6 9.0 8.8 12.1 13.0 8.4 6.2
DON’T KNOW/REFUSED 28 14 14 11 8 5 10 6 7 11
7.0 7.1 6.9 8.8 5.5 4.4 11.0 5.6 6.5 6.2
MEAN 3.46 3.90C 3.05 3.40 3.42 3.66 3.38 3.45 3.53 3.79
STD DEV 1.31 1.12 1.35 1.41 1.28 1.22 1.32 1.40 1.29 1.21
STD ERR 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09
--------------------------------
(sig=.05) (all_pairs) columns tested BC, DEF
This table is very similar to Table 101 at the beginning of this chapter. The only difference is that columns G, H, I, and J are not tested in this table because they are not independent. Notice that the footnote has nothing in it to differentiate between tests done during the print phase or table building phase.
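Because print phase testing works only from printed numbers, a comparable test can be reproduced by hand from the table. The Python below is a sketch that applies a standard pooled two-proportion z test to the NET GOOD row for MALE and FEMALE in Table 501. The exact formula Mentor uses is not given in this section, so treat the choice of test as an assumption; the point is only that frequencies and bases are all the inputs that are needed.

```python
from math import sqrt

def two_proportion_z(count1, base1, count2, base2):
    """Pooled z statistic for the difference between two independent proportions."""
    p1 = count1 / base1
    p2 = count2 / base2
    pooled = (count1 + count2) / (base1 + base2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / base1 + 1 / base2))
    return (p1 - p2) / std_error

# NET GOOD: MALE 120 of 196, FEMALE 77 of 204 (Table 501).
z = two_proportion_z(120, 196, 77, 204)
print(round(z, 2))   # well above 1.96, the usual two-sided cutoff at 95%
```

This agrees with the C marking printed beside the MALE frequency in the NET GOOD row.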
8.5.2 Changing the Confidence Level and the Type of Test
Changing the default significance levels or type of test procedure is done in exactly the same way as with the table building phase tests. For example, if you wanted to test at the 90% confidence level using the N-K test procedure you would add the options DO_STATISTICS=.90 and NEWMAN_KEULS_TEST onto your EDIT statement. Bi-level testing and using the approximation formula are also specified the same way. See 8.1.3 through 8.1.7 for more information on the EDIT option DO_STATISTICS. See 8.4 SIGNIFICANCE TESTING ON WEIGHTED TABLES for more information about changing the type of test.
TABLE_SET= { TAB502:
LOCAL_EDIT=: DO_PRINTER_STATISTICS,DO_STATISTICS_TESTS=BC,DEF
NEWMAN_KEULS_TEST,DO_STATISTICS=.90
COLUMN_STATISTICS_VALUES=VALUES(,5,4,3,,2,1) COLUMN_MEAN,COLUMN_STD,COLUMN_SE }
HEADER=: TABLE WITH STATISTICAL TESTING DONE DURING THE PRINT PHASE
CHANGING THE CONFIDENCE LEVEL AND USING THE NEWMAN-KEULS PROCEDURE }
TITLE= TAB501
TITLE_4= TAB501
STUB= TAB501
ROW= TAB501
STORE_TABLES=* }
The printed table would be similar to Table 501 except for some of the statistical markings and the footnote. The footnote would be as follows:
(sig=.10) (n_k) columns tested BC,DEF
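The relationship between the DO_STATISTICS confidence level and the sig= value in the footnote is simply one minus the confidence level. The Python sketch below shows that conversion along with the two-sided critical value from the normal distribution. The normal critical value is an assumption made for illustration; the Newman-Keuls procedure and small-sample tests use different critical values.

```python
from statistics import NormalDist

def footnote_and_critical(confidence):
    """Return the footnote sig label and a two-sided normal critical value."""
    sig = 1 - confidence
    z_critical = NormalDist().inv_cdf(1 - sig / 2)
    # Format as .10 rather than 0.10 to match the printed footnote.
    label = f"(sig={sig:.2f})".replace("=0.", "=.")
    return label, z_critical

label_90, z_90 = footnote_and_critical(0.90)
label_95, z_95 = footnote_and_critical(0.95)
print(label_90, round(z_90, 3))
print(label_95, round(z_95, 3))
```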
8.5.3 Changing the Type of Test by Row
When doing print phase statistical testing you can use the STUB option DO_STATISTICS to change the type of test being performed on that row or to exclude that row entirely from testing. This means that you can test some of the rows using the APP test, test other rows using the N-K test, and not test other rows. DO_STATISTICS can be set to any of ALL_POSSIBLE_PAIRS_TEST, NEWMAN_KEULS_TEST, ANOVA_SCAN, FISHER, or KRUSKAL_WALLIS_TEST. If -DO_STATISTICS is used, then that row will be excluded from the test.
In the example below the NET GOOD, FAIR, and NET POOR are tested with the APP test as specified on the EDIT statement. The COLUMN_MEAN is tested using the Kruskal-Wallis test. The rest of the rows are not tested.
TABLE_SET= { TAB503:
LOCAL_EDIT=: DO_PRINTER_STATISTICS,DO_STATISTICS_TESTS=BC,DEF ALL_POSSIBLE_PAIRS_TEST,DO_STATISTICS=.95
COLUMN_STATISTICS_VALUES=VALUES(,5,4,3,,2,1) COLUMN_MEAN,COLUMN_STD,COLUMN_SE }
HEADER=: TABLE WITH STATISTICAL TESTING DONE DURING THE PRINT PHASE
USING THE KRUSKAL WALLIS TEST ON THE MEAN }
TITLE= TAB501
TITLE_4= TAB501
TITLE_5=:\2NNETS AND FAIR MENTIONS ARE TESTED USING ALL PAIRS TEST
MEAN IS TESTED USING KRUSKAL WALLIS TEST }
STUB=:
NET GOOD
[-DO_STATISTICS] | VERY GOOD
[-DO_STATISTICS] | GOOD
FAIR
NET POOR
[-DO_STATISTICS] | POOR
[-DO_STATISTICS] | VERY POOR
[-DO_STATISTICS] DON’T KNOW/REFUSED
[PRINT_ROW=MEAN,DO_STATISTICS=KRUSKAL_WALLIS_TEST,LINE=0] MEAN
[PRINT_ROW=STD,LINE=0] STD DEV
[PRINT_ROW=SE,LINE=0] STD ERR }
ROW= TAB501
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH STATISTICAL TESTING DONE ON THE PRINTED NUMBERS
USING THE KRUSKAL WALLIS TEST ON THE MEAN
TABLE 503
RATING OF SERVICE
BASE= TOTAL SAMPLE
SEX AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
----- ---- ------ ----- ----- ----- ------ ------ ------ ------
TOTAL 400 196 204 125 145 113 91 108 107 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F)
NET GOOD 197 120C 77 57 63 74DE 42 55 54 105
49.2 61.2 37.7 45.6 43.4 65.5 46.2 50.9 50.5 59.7
VERY GOOD 102 70 32 35 38 27 19 32 30 60
25.5 35.7 15.7 28.0 26.2 23.9 20.9 29.6 28.0 34.1
GOOD 95 50 45 22 25 47 23 23 24 45
23.8 25.5 22.1 17.6 17.2 41.6 25.3 21.3 22.4 25.6
FAIR 92 44 48 28F 44F 14 20 20 24 36
23.0 22.4 23.5 22.4 30.3 12.4 22.0 18.5 22.4 20.5
NET POOR 83 18 65B 29 30 20 19 27 22 24
20.8 9.2 31.9 23.2 20.7 17.7 20.9 25.0 20.6 13.6
POOR 39 9 30 12 17 10 8 13 13 13
9.8 4.6 14.7 9.6 11.7 8.8 8.8 12.0 12.1 7.4
VERY POOR 44 9 35 17 13 10 11 14 9 11
11.0 4.6 17.2 13.6 9.0 8.8 12.1 13.0 8.4 6.2
DON’T KNOW/REFUSED 28 14 14 11 8 5 10 6 7 11
7.0 7.1 6.9 8.8 5.5 4.4 11.0 5.6 6.5 6.2
MEAN 3.46 3.90C 3.05 3.40 3.42 3.66 3.38 3.45 3.53 3.79
STD DEV 1.31 1.12 1.35 1.41 1.28 1.22 1.32 1.40 1.29 1.21
STD ERR 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09
NETS AND FAIR MENTIONS ARE TESTED USING ALL PAIRS TEST
MEAN IS TESTED USING KRUSKAL WALLIS TEST
--------------------------------
(sig=.05) (all_pairs) (k_w) columns tested BC, DEF
Notice that the standard footnote mentions that both the APP and Kruskal-Wallis tests were used, but does not say what rows were tested with which test. As in this example, you may want to use the TITLE_5 keyword to create a customized footnote.
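For readers unfamiliar with the Kruskal-Wallis test, the Python below sketches the textbook H statistic (with the usual tie correction) computed from printed frequencies, using the AGE columns of Table 503 and the scale values 5 through 1. Whether Mentor computes the statistic in exactly this way is an assumption; the function name and the dictionary layout are illustrative only.

```python
def kruskal_wallis_from_counts(counts_by_group):
    """Tie-corrected Kruskal-Wallis H from per-group {scale value: frequency} dicts."""
    values = sorted({v for group in counts_by_group for v in group})
    totals = {v: sum(g.get(v, 0) for g in counts_by_group) for v in values}
    n_total = sum(totals.values())

    # Every respondent giving the same answer shares one mid-rank.
    midrank, seen = {}, 0
    for v in values:
        midrank[v] = seen + (totals[v] + 1) / 2
        seen += totals[v]

    rank_term = 0.0
    for group in counts_by_group:
        n_group = sum(group.values())
        rank_sum = sum(count * midrank[v] for v, count in group.items())
        rank_term += rank_sum ** 2 / n_group

    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    tie_correction = 1 - sum(t ** 3 - t for t in totals.values()) / (n_total ** 3 - n_total)
    return h / tie_correction

# Frequencies for 18-30, 31-50, and 51-70 from Table 503 (5 = VERY GOOD ... 1 = VERY POOR).
age_columns = [
    {5: 35, 4: 22, 3: 28, 2: 12, 1: 17},
    {5: 38, 4: 25, 3: 44, 2: 17, 1: 13},
    {5: 27, 4: 47, 3: 14, 2: 10, 1: 10},
]
h_age = kruskal_wallis_from_counts(age_columns)
print(round(h_age, 2))   # compare with a chi-square critical value, 2 degrees of freedom
```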
8.6 EXCLUDING ROWS/COLUMNS FROM SIGNIFICANCE TESTING
Often when doing significance testing there will be only a particular row or couple of rows in the table that are of interest. The processing time and the number of undesirable letters that print on the table can be greatly reduced by only performing the statistical tests on the specific rows.
The Mentor program can be instructed to test mean rows only, test only specified rows, or to drop columns with low sample sizes from the testing.
8.6.1 Testing Mean Rows Only
To perform statistical testing on mean rows only you can use the SET option MEAN_STATISTICS_ONLY. This option causes the program to ignore all testing for percentages. Return to testing both means and percentages by turning the option off with -MEAN_STATISTICS_ONLY.
NOTE: The following set of commands defines a standard front end for the next set of examples.
>PURGESAME
>PRINT_FILE STAT6
~INPUT DATA
~SET DROP_LOCAL_EDIT,BEGIN_TABLE_NAME=T601
~DEFINE
STUB= STUB_TOP_TOT:
TOTAL
[SUPPRESS] NO ANSWER }
TABLE_SET= { BAN1:
EDIT=: COLUMN_WIDTH=7,STUB_WIDTH=30,-COLUMN_TNA,STATISTICS_DECIMALS=2,
-PERCENT_SIGN,RUNNING_LINES=1 }
STATISTICS=: I=BC,I=DEF,GHIJ;
BANNER=:
| SEX AGE ADVERTISING AWARENESS
| <=========> <=================> <=========================>
| TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
| ----- ---- ------ ----- ----- ----- ------ ------ ------ ------}
COLUMN=: TOTAL WITH [5^1/2] WITH [6^1//3] WITH [7^1//4]
}
Example:
TABLE_SET= { TAB601:
STUB_PREFACE= STUB_TOP_TOT
SET MEAN_STATISTICS_ONLY
LOCAL_EDIT=: DO_STATISTICS=.95 }
HEADER=: TABLE WITH STATISTICAL TESTS PERFORMED ON MEANS ONLY }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
NET GOOD
| VERY GOOD
| GOOD
FAIR
NET POOR
| POOR
| VERY POOR
DON'T KNOW/REFUSED
[STATISTICS_ROW] MEAN
[STATISTICS_ROW] STD DEVIATION
[STATISTICS_ROW] STD ERROR }
ROW=: [11^4,5/5/4/3/1,2/2/1/X] $[MEAN,STD,SE] [11]
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH STATISTICAL TESTS PERFORMED ON MEANS ONLY
TABLE 601
RATING OF SERVICE
BASE= TOTAL SAMPLE
SEX AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
----- ---- ------ ----- ----- ----- ------ ------ ------ ------
TOTAL 400 196 204 125 145 113 91 108 107 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (G) (H) (I) (J)
NET GOOD 197 120 77 57 63 74 42 55 54 105
49.2 61.2 37.7 45.6 43.4 65.5 46.2 50.9 50.5 59.7
VERY GOOD 102 70 32 35 38 27 19 32 30 60
25.5 35.7 15.7 28.0 26.2 23.9 20.9 29.6 28.0 34.1
GOOD 95 50 45 22 25 47 23 23 24 45
23.8 25.5 22.1 17.6 17.2 41.6 25.3 21.3 22.4 25.6
FAIR 92 44 48 28 44 14 20 20 24 36
23.0 22.4 23.5 22.4 30.3 12.4 22.0 18.5 22.4 20.5
NET POOR 83 18 65 29 30 20 19 27 22 24
20.8 9.2 31.9 23.2 20.7 17.7 20.9 25.0 20.6 13.6
POOR 39 9 30 12 17 10 8 13 13 13
9.8 4.6 14.7 9.6 11.7 8.8 8.8 12.0 12.1 7.4
VERY POOR 44 9 35 17 13 10 11 14 9 11
11.0 4.6 17.2 13.6 9.0 8.8 12.1 13.0 8.4 6.2
DON'T KNOW/REFUSED 28 14 14 11 8 5 10 6 7 11
7.0 7.1 6.9 8.8 5.5 4.4 11.0 5.6 6.5 6.2
MEAN 3.46 3.90C 3.05 3.40 3.42 3.66 3.38 3.45 3.53 3.79GH
STD DEVIATION 1.31 1.12 1.35 1.41 1.28 1.22 1.32 1.40 1.29 1.21
STD ERROR 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09
--------------------------------
(sig=.05) (all_pairs) columns tested BC, DEF, GHIJ
Compare this table with Table 101 and notice that only the MEAN row is marked with any letter because the tests on all the other rows were suppressed.
NOTE: There is no change in the footnote on the table, so you may want to include a customized notation somewhere on the table pointing out which rows were tested.
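A test on a mean row needs only the means and their standard errors. The Python below sketches a simple unpooled statistic for the MALE and FEMALE means in Table 601. It uses the rounded values printed on the table, and the unpooled formula is an assumption for illustration rather than a statement of how Mentor computes its test.

```python
from math import sqrt

def mean_difference_t(mean1, se1, mean2, se2):
    """Difference between two independent means divided by its standard error."""
    return (mean1 - mean2) / sqrt(se1 ** 2 + se2 ** 2)

# MALE: mean 3.90, std error 0.08. FEMALE: mean 3.05, std error 0.10.
t_value = mean_difference_t(3.90, 0.08, 3.05, 0.10)
print(round(t_value, 2))   # far beyond the usual 95% cutoff of about 1.96
```

This is consistent with the C marking printed beside the MALE mean.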
8.6.2 Excluding any Row from Statistical Testing
If only specific rows in the table are to be tested, you can either mark the rows that are to be tested or the rows that are to be excluded. If a simple variable is being defined, the keyword STATISTICS may be used inside parentheses in front of the code as part of the data definition. If the variable definition is complex (it uses joiners or functions), then the keyword $[DO_STATISTICS] must be used to mark the parts of the table that will be tested and the keyword $[-DO_STATISTICS] to mark which parts of the table will not be tested.
For example, suppose you only wanted to test the top two box (codes 5 and 4) and the bottom two box (codes 1 and 2) in a 5 point scale stored in data position 21. Using the STATISTICS keyword method the variable would look like:
[21^(STATISTICS)4,5/5/4/3/(STATISTICS)1,2/2/1]
This would cause only those categories marked with the STATISTICS keyword to be tested, while all other categories would not be tested.
Using the $[DO_STATISTICS] keyword method the variable would look like:
[21^4,5] $[-DO_STATISTICS] [21^5/4/3] $[DO_STATISTICS] [21^1,2] &
$[-DO_STATISTICS] [21^2/1]
The default is that categories are tested so the net of 4 and 5 will be tested. All categories after $[-DO_STATISTICS] are not tested, while $[DO_STATISTICS] turns testing back on.
If table printing phase statistical testing is being done, you can exclude a row from the test by using the STUB option -DO_STATISTICS. See 8.5.3 Changing the Type of Test by Row for more information.
NOTE: The $[DO_STATISTICS] keyword should not be confused with either the EDIT statement option DO_STATISTICS or the STUB option DO_STATISTICS.
The following example shows how to test only the top two box, bottom two box, and mean on a rating scale:
TABLE_SET= { TAB602:
LOCAL_EDIT=: DO_STATISTICS=.95 }
HEADER=: TABLE WITH STATISTICAL TESTS PERFORMED ON SELECTED ROWS ONLY }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= TOTAL SAMPLE }
TITLE_5=:\2N ONLY ROWS WITH (*) ARE TESTED }
STUB=:
NET GOOD (*)
| VERY GOOD
| GOOD
FAIR
NET POOR (*)
| POOR
| VERY POOR
DON'T KNOW/REFUSED
[STATISTICS_ROW] MEAN (*)
[STATISTICS_ROW] STD DEVIATION
[STATISTICS_ROW] STD ERROR }
ROW=: [11^(STATISTICS)4,5/5/4/3/(STATISTICS)1,2/2/1/X] $[MEAN,STD,SE] [11]
STORE_TABLES=* }
Here is an alternate way to write the row variable:
ROW602A:[11^4,5] $[-DO_STATISTICS] [11^5/4/3] &
$[DO_STATISTICS] [11^1,2] $[-DO_STATISTICS] [11^2/1/X] &
$[DO_STATISTICS, MEAN,STD,SE] [11]
Here is the table that is printed:
TABLE WITH STATISTICAL TESTS PERFORMED ON SELECTED ROWS ONLY
TABLE 602
RATING OF SERVICE
BASE= TOTAL SAMPLE
SEX AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
----- ---- ------ ----- ----- ----- ------ ------ ------ ------
TOTAL 400 196 204 125 145 113 91 108 107 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (G) (H) (I) (J)
NET GOOD (*) 197 120C 77 57 63 74DE 42 55 54 105G
49.2 61.2 37.7 45.6 43.4 65.5 46.2 50.9 50.5 59.7
VERY GOOD 102 70 32 35 38 27 19 32 30 60
25.5 35.7 15.7 28.0 26.2 23.9 20.9 29.6 28.0 34.1
GOOD 95 50 45 22 25 47 23 23 24 45
23.8 25.5 22.1 17.6 17.2 41.6 25.3 21.3 22.4 25.6
FAIR 92 44 48 28 44 14 20 20 24 36
23.0 22.4 23.5 22.4 30.3 12.4 22.0 18.5 22.4 20.5
NET POOR (*) 83 18 65B 29 30 20 19 27J 22 24
20.8 9.2 31.9 23.2 20.7 17.7 20.9 25.0 20.6 13.6
POOR 39 9 30 12 17 10 8 13 13 13
9.8 4.6 14.7 9.6 11.7 8.8 8.8 12.0 12.1 7.4
VERY POOR 44 9 35 17 13 10 11 14 9 11
11.0 4.6 17.2 13.6 9.0 8.8 12.1 13.0 8.4 6.2
DON'T KNOW/REFUSED 28 14 14 11 8 5 10 6 7 11
7.0 7.1 6.9 8.8 5.5 4.4 11.0 5.6 6.5 6.2
MEAN (*) 3.46 3.90C 3.05 3.40 3.42 3.66 3.38 3.45 3.53 3.79GH
STD DEVIATION 1.31 1.12 1.35 1.41 1.28 1.22 1.32 1.40 1.29 1.21
STD ERROR 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09
ONLY ROWS WITH (*) ARE TESTED
--------------------------------
(sig=.05) (all_pairs) columns tested BC, DEF, GHIJ
If you compare this table with Table 601, note that only the NET GOOD, NET POOR, and MEAN rows in this table have statistical markings. In addition, a TITLE_5 variable was defined to create a customized footnote.
8.6.3 Excluding Columns with Low Bases from Statistical Testing
If some of the columns in the test could have low bases you might want to exclude them from the testing. You may want to do this because the small bases might skew the tests, or because the sample is such that you do not want to report on any small base. The EDIT option MINIMUM_BASE= can be used to suppress not only statistical testing but also all the other values in that column. If MINIMUM_BASE is set to some value like 50, then any column that has a base less than 50 will print an asterisk (*) under the base row where the letter usually prints, and the rest of the column will be blank. The FLAG_MINIMUM_BASE option can be used in conjunction with MINIMUM_BASE. Instead of blanking the column, the program will print all the numbers in that column followed by an asterisk where the statistical markings would normally print.
TABLE_SET= { TAB603:
STATISTICS=: I=BC,I=DEF,GHIJ;
LOCAL_EDIT=: MINIMUM_BASE=100,DO_STATISTICS=.95 }
HEADER=: USING MINIMUM BASE OPTION TO SUPPRESS A COLUMN WITH A LOW BASE }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
NET GOOD
| VERY GOOD
| GOOD
FAIR
NET POOR
| POOR
| VERY POOR
DON'T KNOW/REFUSED
[STATISTICS_ROW] MEAN
[STATISTICS_ROW] STD DEVIATION
[STATISTICS_ROW] STD ERROR }
ROW=: [11^4,5/5/4/3/1,2/2/1/X] $[MEAN,STD,SE] [11]
STORE_TABLES=* }
Here is the table that is printed:
USING MINIMUM BASE OPTION TO SUPPRESS A COLUMN WITH A LOW BASE
TABLE 603
RATING OF SERVICE
BASE= TOTAL SAMPLE
SEX AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
----- ---- ------ ----- ----- ----- ------ ------ ------ ------
TOTAL 400 196 204 125 145 113 91 108 107 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (*) (H) (I) (J)
NET GOOD 197 120C 77 57 63 74DE 55 54 105
49.2 61.2 37.7 45.6 43.4 65.5 50.9 50.5 59.7
VERY GOOD 102 70C 32 35 38 27 32 30 60
25.5 35.7 15.7 28.0 26.2 23.9 29.6 28.0 34.1
GOOD 95 50 45 22 25 47DE 23 24 45
23.8 25.5 22.1 17.6 17.2 41.6 21.3 22.4 25.6
FAIR 92 44 48 28F 44F 14 20 24 36
23.0 22.4 23.5 22.4 30.3 12.4 18.5 22.4 20.5
NET POOR 83 18 65B 29 30 20 27J 22 24
20.8 9.2 31.9 23.2 20.7 17.7 25.0 20.6 13.6
POOR 39 9 30B 12 17 10 13 13 13
9.8 4.6 14.7 9.6 11.7 8.8 12.0 12.1 7.4
VERY POOR 44 9 35B 17 13 10 14J 9 11
11.0 4.6 17.2 13.6 9.0 8.8 13.0 8.4 6.2
DON'T KNOW/REFUSED 28 14 14 11 8 5 6 7 11
7.0 7.1 6.9 8.8 5.5 4.4 5.6 6.5 6.2
MEAN 3.46 3.90C 3.05 3.40 3.42 3.66 3.45 3.53 3.79H
STD DEVIATION 1.31 1.12 1.35 1.41 1.28 1.22 1.40 1.29 1.21
STD ERROR 0.07 0.08 0.10 0.13 0.11 0.12 0.14 0.13 0.09
--------------------------------
(sig=.05) (all_pairs) columns tested BC, DEF, GHIJ * - small base
Notice that the column BRND A is blank except for the base value. Also notice that the footnote includes a note that the asterisk denotes a small base. You can suppress only the statistical testing instead of the entire column by also using the EDIT option FLAG_MINIMUM_BASE. In the example below the only difference from Table 603 is this option.
TABLE_SET= { TAB604:
HEADER=: USING MINIMUM BASE OPTION TO FLAG A COLUMN WITH A LOW BASE }
LOCAL_EDIT=: MINIMUM_BASE=100,FLAG_MINIMUM_BASE,DO_STATISTICS=.95 }
TITLE= TAB603
TITLE_4= TAB603
STUB= TAB603
ROW= TAB603
STORE_TABLES=* }
Here is the table that is printed:
USING MINIMUM BASE OPTION TO FLAG A COLUMN WITH A LOW BASE TABLE 604 RATING OF SERVICE BASE= TOTAL SAMPLE SEX AGE ADVERTISING AWARENESS <=========> <=================> <=========================> TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D ----- ---- ------ ----- ----- ----- ------ ------ ------ ------ TOTAL 400 196 204 125 145 113 91 108 107 176 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 % % % % % % % % % % (B) (C) (D) (E) (F) (*) (H) (I) (J) NET GOOD (*) 197 120C 77 57 63 74DE 42* 55 54 105G 49.2 61.2 37.7 45.6 43.4 65.5 46.2 50.9 50.5 59.7 VERY GOOD 102 70 32 35 38 27 19* 32 30 60 25.5 35.7 15.7 28.0 26.2 23.9 20.9 29.6 28.0 34.1 GOOD 95 50 45 22 25 47 23* 23 24 45 23.8 25.5 22.1 17.6 17.2 41.6 25.3 21.3 22.4 25.6 FAIR 92 44 48 28 44 14 20* 20 24 36 23.0 22.4 23.5 22.4 30.3 12.4 22.0 18.5 22.4 20.5 NET POOR (*) 83 18 65B 29 30 20 19* 27J 22 24 20.8 9.2 31.9 23.2 20.7 17.7 20.9 25.0 20.6 13.6 POOR 39 9 30 12 17 10 8* 13 13 13 9.8 4.6 14.7 9.6 11.7 8.8 8.8 12.0 12.1 7.4 VERY POOR 44 9 35 17 13 10 11* 14 9 11 11.0 4.6 17.2 13.6 9.0 8.8 12.1 13.0 8.4 6.2 DON'T KNOW/REFUSED 28 14 14 11 8 5 10* 6 7 11 7.0 7.1 6.9 8.8 5.5 4.4 11.0 5.6 6.5 6.2 MEAN (*) 3.46 3.90C 3.05 3.40 3.42 3.66 3.38* 3.45 3.53 3.79H STD DEVIATION 1.31 1.12 1.35 1.41 1.28 1.22 1.32* 1.40 1.29 1.21 STD ERROR 0.07 0.08 0.10 0.13 0.11 0.12 0.15* 0.14 0.13 0.09 -------------------------------- (sig=.05) (all_pairs) columns tested BC, DEF, GHIJ * - small base
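The difference between Tables 603 and 604 comes down to one decision per column. The Python sketch below is only an illustration of that decision as described in this section; it is not Mentor code, and the function name and the labels "normal", "blank", and "flagged" are invented for the example.

```python
def column_treatment(bases, minimum_base, flag_minimum_base=False):
    """Classify each banner column the way this section describes.

    "normal"  - base is large enough; the column prints and is marked as usual
    "blank"   - MINIMUM_BASE alone: asterisk under the base, rest of column blank
    "flagged" - MINIMUM_BASE with FLAG_MINIMUM_BASE: numbers print with an asterisk
    (These labels are hypothetical names used only in this sketch.)
    """
    treatment = {}
    for column, base in bases.items():
        if base >= minimum_base:
            treatment[column] = "normal"
        elif flag_minimum_base:
            treatment[column] = "flagged"
        else:
            treatment[column] = "blank"
    return treatment


# Column bases taken from the TOTAL row of Tables 603 and 604.
BASES = {
    "TOTAL": 400, "MALE": 196, "FEMALE": 204,
    "18-30": 125, "31-50": 145, "51-70": 113,
    "BRND A": 91, "BRND B": 108, "BRND C": 107, "BRND D": 176,
}

table_603 = column_treatment(BASES, 100)
table_604 = column_treatment(BASES, 100, flag_minimum_base=True)
print(table_603["BRND A"], table_604["BRND A"])
```

With MINIMUM_BASE=100, only BRND A (base 91) falls below the limit, which matches the single asterisked column in both printed tables.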
8.7 PRINTING THE ACTUAL T AND SIGNIFICANCE VALUES
Both t values and the significance of the t value can be printed on the table either in addition to or instead of the statistical letter markings. The STUB options DO_T_TEST and DO_SIG_T are used to print these values. If you are printing values from a STATISTICS statement you can also use PRINTABLE_T for error checking. PRINTABLE_T checks that the STAT= tests are only in pairs and that no column is the second column in any pair more than once (you can only print one T value per column).
The STUB options DO_T_TEST and DO_SIG_T can be set to any of the following:
- DO_T_TEST=*            Print the t value for the last data row seen
- DO_T_TEST=n            Print the t value for the Nth row in the table
- DO_T_TEST=-n           Print the t value for the Nth row above this row in the table
- DO_T_TEST=PRINT_MEAN   Print the t value for the COLUMN_MEAN
In the example below the t values and their significance are printed for the top box, bottom box, and mean rows in the table.
NOTE: The following set of commands defines a standard front end for the next set of examples:
>PURGESAME >PRINT_FILE STAT7 ~INPUT DATA ~SET DROP_LOCAL_EDIT,BEGIN_TABLE_NAME=T701 ~DEFINE STUB= STUB_TOP_TOT: TOTAL [SUPPRESS] NO ANSWER } TABLE_SET= { BAN1: EDIT=: COLUMN_WIDTH=7,STUB_WIDTH=30,-COLUMN_TNA,STATISTICS_DECIMALS=2, -PERCENT_SIGN } BANNER=: | SEX AGE ADVERTISING AWARENESS | <==========> <==================> <===============================> | TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D | ----- ---- ------ ----- ----- ----- ------ ------ ------ ------} COLUMN=: TOTAL WITH [5^1/2] WITH [6^1//3] WITH [7^1//4] }
EX:
TABLE_SET= { TAB701:
STUB_PREFACE= STUB_TOP_TOT
STATISTICS=: PRINTABLE_T T=AB,T=AC,T=AD,T=AE,T=AF,T=AG,T=AH,T=AI,T=AJ
LOCAL_EDIT=: DO_STATISTICS=.95 }
HEADER=: TABLE WITH T AND SIGNIFICANCE VALUES PRINTED ON THE TABLE }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
NET GOOD
[DO_T_TEST=*,SKIP_LINES=0] T-VALUE
[DO_SIG_T=*,SKIP_LINES=0] SIGNIFICANCE
| VERY GOOD
| GOOD
FAIR
NET POOR
[DO_T_TEST=*,SKIP_LINES=0] T-VALUE
[DO_SIG_T=*,SKIP_LINES=0] SIGNIFICANCE
| POOR
| VERY POOR
DON'T KNOW/REFUSED
[STATISTICS_ROW] MEAN
[STATISTICS_ROW] STD DEVIATION
[STATISTICS_ROW] STD ERROR
[DO_T_TEST=9] T-VALUE
[DO_SIG_T=9] SIGNIFICANCE OF T }
ROW=: [11^4,5/5/4/3/1,2/2/1/X] $[MEAN,STD,SE] [11]
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH T AND SIGNIFICANCE VALUES PRINTED ON THE TABLE TABLE 701 RATING OF SERVICE BASE= TOTAL SAMPLE SEX AGE ADVERTISING AWARENESS <=========> <=================> <=========================> TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D ----- ---- ------ ----- ----- ----- ------ ------ ------ ------ TOTAL 400 196 204 125 145 113 91 108 107 176 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 % % % % % % % % % % (A) (B) (C) (D) (E) (F) (G) (H) (I) (J) NET GOOD 197C 120A 77 57 63 74A 42 55 54 105A 49.2 61.2 37.7 45.6 43.4 65.5 46.2 50.9 50.5 59.7 T-VALUE -4.69 4.69 0.98 1.75 -4.07 0.67 -0.41 -0.29 -3.69 SIGNIFICANCE 0.00 0.00 0.33 0.08 0.00 0.51 0.69 0.77 0.00 VERY GOOD 102C 70A 32 35 38 27 19 32 30 60A 25.5 35.7 15.7 28.0 26.2 23.9 20.9 29.6 28.0 34.1 GOOD 95E 50 45 22 25 47A 23 23 24 45 23.8 25.5 22.1 17.6 17.2 41.6 25.3 21.3 22.4 25.6 FAIR 92F 44 48 28 44A 14 20 20 24 36 23.0 22.4 23.5 22.4 30.3 12.4 22.0 18.5 22.4 20.5 NET POOR 83BJ 18 65A 29 30 20 19 27 22 24 20.8 9.2 31.9 23.2 20.7 17.7 20.9 25.0 20.6 13.6 T-VALUE 5.58 -5.58 -0.81 0.02 0.94 -0.03 -1.27 0.06 3.11 SIGNIFICANCE 0.00 0.00 0.42 0.98 0.35 0.97 0.20 0.95 0.00 POOR 39B 9 30A 12 17 10 8 13 13 13 9.8 4.6 14.7 9.6 11.7 8.8 8.8 12.0 12.1 7.4 VERY POOR 44BJ 9 35A 17 13 10 11 14 9 11 11.0 4.6 17.2 13.6 9.0 8.8 12.1 13.0 8.4 6.2 DON'T KNOW/REFUSED 28 14 14 11 8 5 10 6 7 11 7.0 7.1 6.9 8.8 5.5 4.4 11.0 5.6 6.5 6.2 MEAN 3.46C 3.90A 3.05 3.40 3.42 3.66 3.38 3.45 3.53 3.79A STD DEVIATION 1.31 1.12 1.35 1.41 1.28 1.22 1.32 1.4 1.29 1.21 STD ERROR 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09 T-VALUE -6.58 6.58 0.57 0.44 -1.84 0.62 0.10 -0.60 -4.38 SIGNIFICANCE OF T 0.00 0.00 0.57 0.67 0.06 0.54 0.91 0.55 0.00 --------------------------------- (sig=.05) (all_pairs) columns tested T= AB, T= AC, T= AD, T= AE, T= AF, T= AG, T= AH, T= AI, T= AJ
Notice that t values are positive when the first item in the test is greater than the second item, and negative when the opposite is true. Also notice that for any test with a significance of 0.05 or less, either the tested column is marked with the letter A (negative t value) or the Total column is marked with the tested column's letter (positive t value). Further notice that the t values for males and females are opposites of each other. This is because each is being tested inclusively against the total, which is actually the same as testing them against each other.
Basically the same table could be produced using the print phase tests. For more information on print phase tests see 8.5 PRINT PHASE STATISTICAL TESTING. Here is an example of printing the t values when doing print phase tests.
TABLE_SET= { TAB702: LOCAL_EDIT=: DO_PRINTER_STATISTICS,ALL_POSSIBLE_PAIRS_TEST,DO_STATISTICS=.95, DO_STATISTICS_TESTS=T=AB,T=AC,T=AD,T=AE,T=AF,T=AG,T=AH,T=AI,T=AJ COLUMN_STATISTICS_VALUES=VALUES(,5,4,3,,2,1)COLUMN_MEAN,COLUMN_STD,COLUMN_SE } HEADER=: TABLE WITH T AND SIGNIFICANCE VALUES PRINTED ON THE TABLE FOR TESTS PERFORMED ON THE NUMBERS ON THE PRINTED TABLE } TITLE=: RATING OF SERVICE } TITLE_4=: BASE= TOTAL SAMPLE } STUB=: NET GOOD [DO_T_TEST=*,SKIP_LINES=0] T-VALUE [DO_SIG_T=*,SKIP_LINES=0] SIGNIFICANCE | VERY GOOD | GOOD FAIR NET POOR [DO_T_TEST=*,SKIP_LINES=0] T-VALUE [DO_SIG_T=*,SKIP_LINES=0] SIGNIFICANCE | POOR | VERY POOR DON'T KNOW/REFUSED [PRINT_ROW=MEAN] MEAN [PRINT_ROW=STD] STANDARD DEVIATION [PRINT_ROW=SE] STANDARD ERROR [DO_T_TEST=PRINT_MEAN] T-VALUE [DO_SIG_T=PRINT_MEAN] SIGNIFICANCE OF T } ROW=: [11^4,5/5/4/3/1,2/2/1/X] STORE_TABLES=* }
The printed table will look basically the same as Table 701.
8.8 SIGNIFICANCE TESTING ON ROWS (PREFERENCE TESTING)
Significance testing on rows can be performed in two ways: a direct comparison test or a distributed preference test. In both cases only two rows may be compared at one time, although multiple pairs of rows may be compared in a single table. As with column testing, the STATISTICS statement and the DO_STATISTICS option on the EDIT statement control the tests.
NOTE: Row testing cannot be performed during the table-printing phase.
On the STATISTICS statement, rows are designated numerically rather than alphabetically as are the columns. The first data row is assigned the number “1”, the second data row the number “2”, and so on. Every data row is included in this count even if it is not printed. To do a direct comparison of rows use the letter D followed by an equal sign (=) before the two row numbers. Separate the two row numbers with a comma. Separate different pairs of rows with a space. To do a distributed preference test use P instead of D. Rows can only be tested sequentially and a given row may only be in one test on the table. For example, if rows 1 and 4 are being compared, then rows 5 and 6 could also be compared, but row 3 could not be compared with row 7 (row 4 being in between), nor could row 1 be compared to row 2 (row 1 is already being compared to row 4).
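The pairing rules above can be checked mechanically before a run. The Python sketch below is an illustration only, not part of Mentor; it assumes the rules amount to "pairs are listed in ascending order and the row ranges they span do not overlap", which covers the three cases in the example but may be stricter or looser than the program's own check.

```python
def check_row_pairs(pairs):
    """Return a list of problems found in a sequence of (first, second) row pairs.

    Assumed reading of the rules (hypothetical, based on the example in the text):
    each pair runs from a lower row to a higher row, and each pair must start
    after the previous pair has ended.
    """
    problems = []
    previous_end = 0
    for first, second in pairs:
        if first >= second:
            problems.append(f"{first},{second}: rows must be in ascending order")
        if first <= previous_end:
            problems.append(
                f"{first},{second}: overlaps a pair ending at row {previous_end}"
            )
        previous_end = max(previous_end, second)
    return problems


print(check_row_pairs([(1, 4), (5, 6)]))   # allowed in the text
print(check_row_pairs([(1, 4), (3, 7)]))   # row 4 lies in between
print(check_row_pairs([(1, 4), (1, 2)]))   # row 1 is already in a test
```

The first call returns an empty list; the other two each report one overlap.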
8.8.1 Direct Comparison Testing
A direct comparison of two rows is similar to the test that is performed on columns, except that the letter D must be specified to indicate that it is a direct test. To do a direct comparison of rows 1 and 2 use the following STATISTICS statement:
STATISTICS= ROWSTAT1: D=1,2
The following statement would test row 1 versus row 2, row 3 versus row 4, and row 5 versus row 6.
STATISTICS= ROWSTAT2: D=1,2 D=3,4 D=5,6
Row testing can be combined with column testing by specifying both the column and row tests on the same STATISTICS statement. The following statement would do column testing on columns B, C, and D, in addition to testing row 4 versus row 6.
STATISTICS= ROWSTAT3: BCD, D=4,6
The DO_STATISTICS option on the EDIT statement is again used to set the confidence level. The same setting is used for both the row and column tests. As with column testing, a footnote will be printed to indicate which rows were tested and the significance level used. If the difference is significant, a lower case “s” will print under the second row tested.
This is an example of a direct comparison of rows.
NOTE: The following set of commands defines a standard front end for the next set of examples:
>PURGESAME >PRINT_FILE STAT8 ~INPUT DATA ~SET DROP_LOCAL_EDIT,BEGIN_TABLE_NAME=T801 ~DEFINE STUB= STUBTOP1: TOTAL [SUPPRESS] NO ANSWER } TABLE_SET= { BAN1: EDIT=: COLUMN_WIDTH=7,STUB_WIDTH=30,-COLUMN_TNA,STUB_PREFACE=STUBTOP1, STATISTICS_DECIMALS=2,-PERCENT_SIGN,DO_STATISTICS=1 } BANNER=: | SEX AGE | <=========> <=================> | TOTAL MALE FEMALE 18-30 31-50 51-70 | ----- ---- ------ ----- ----- -----} COLUMN=: TOTAL WITH [5^1/2] WITH [6^1//3] }
And here is this example:
TABLE_SET= { TAB801:
STATISTICS=: D=1,2 D=4,5 D=7,8
HEADER=: TABLE WITH DIRECT STATISTICAL TESTING OF ROWS AT THE 95% CONFIDENCE LEVEL}
TITLE=: PREFERENCE OF PRODUCTS }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
[COMMENT,UNDERLINE] FIRST TEST
[STUB_INDENT=2] PREFER BRAND A
[STUB_INDENT=2] PREFER BRAND B
[STUB_INDENT=2] NO PREFERENCE A VS B
[COMMENT,UNDERLINE] SECOND TEST
[STUB_INDENT=2] PREFER BRAND C
[STUB_INDENT=2] PREFER BRAND D
[STUB_INDENT=2] NO PREFERENCE C VS D
[COMMENT,UNDERLINE] THIRD TEST
[STUB_INDENT=2] PREFER BRAND E
[STUB_INDENT=2] PREFER BRAND F
[STUB_INDENT=2] NO PREFERENCE E VS F }
ROW=: [7,8,9^1/2/X]
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH DIRECT STATISTICAL TESTING OF ROWS AT THE 95% CONFIDENCE LEVEL TABLE 801 PREFERENCE OF PRODUCTS TITLE_4=: BASE= TOTAL SAMPLE SEX AGE <=========> <=================> TOTAL MALE FEMALE 18-30 31-50 51-70 ----- ---- ------ ----- ----- ----- TOTAL 500 251 249 140 223 101 100.0 100.0 100.0 100.0 100.0 100.0 % % % % % % FIRST TEST ---------- (a) PREFER BRAND A 236 111 125 88 98 30 47.2 44.2 50.2 62.9 43.9 29.7 (b) PREFER BRAND B 214 113 101 43 101 59 42.8 45.0 40.6 30.7 45.3 58.4 s s NO PREFERENCE A VS B 50 27 23 9 24 12 10.0 10.8 9.2 6.4 10.8 11.9 SECOND TEST ----------- (d) PREFER BRAND C 266 125 141 78 121 48 53.2 49.8 56.6 55.7 54.3 47.5 (e) PREFER BRAND D 190 108 82 49 84 43 38.0 43.0 32.9 35.0 37.7 42.6 s s s s NO PREFERENCE C VS D 44 18 26 13 18 10 8.8 7.2 10.4 9.3 8.1 9.9 THIRD TEST ---------- (g) PREFER BRAND E 248 132 116 87 93 44 49.6 52.6 46.6 62.1 41.7 43.6 (h) PREFER BRAND F 187 87 100 40 96 42 37.4 34.7 40.2 28.6 43.0 41.6 s s s NO PREFERENCE E VS F 65 32 33 13 34 15 13.0 12.7 13.3 9.3 15.2 14.9 --------------------------------- (sig=.05) (all_pairs) rows tested a/b, d/e, g/h
Notice the “s” in the FEMALE column underneath the PREFER BRAND D row. This indicates that there is a significant difference between PREFER BRAND C and PREFER BRAND D for females. The blank under the MALE column in that row indicates that there is no significant difference for males. Also notice the additional lower case letter assigned to each row that was tested. This allows easy identification of which rows were tested against each other when compared to the footnote that prints at the bottom of the page.
8.8.2 Distributed Preference Testing
A distributed preference test allows a “No Preference” (or similar neutral third category) to be distributed between the two original categories while ensuring the integrity of the underlying statistical test. This is usually done for cosmetic purposes so that the percentages of the two preference categories add up to 100 percent.
The rules for the STATISTICS statement are the same as for the direct comparison, except the letter P is used instead of a D. To do a distributed preference test on rows 1 and 2, use the following STATISTICS statement:
STATISTICS= ROWSTAT4: P=1,2
In a distributed preference test, the SELECT_VALUE function is used to define the row variable. This ensures that the “No preference” response is evenly divided between the two categories (see 9.3.2 Functions for more information on the SELECT_VALUE function). A typical row definition might look like this:
ROW=: SELECT_VALUE([7^1/X],VALUES(1,.5)) WITH &
SELECT_VALUE([7^2/X],VALUES(1,.5))
This causes the X punch (“No preference”) to have a value of .5 for both categories, splitting it evenly between the two.
As with the direct comparison, significant differences are marked with an “s” underneath the second row being tested. However, unlike the direct comparison test, small (not significant) differences are marked with a lower case “ns” and statistically equal rows are marked with a lower case “e”.
The following example uses a distributed preference test to compare the same rows used in Table 801. Note the difference in the row variable definition.
TABLE_SET= { TAB802:
STATISTICS=: P=1,2 P=3,4 P=5,6
HEADER=: TABLE WITH DISTRIBUTED PREFERENCE TESTING OF ROWS AT THE 95% CONFIDENCE LEVEL}
TITLE=: PREFERENCE OF PRODUCTS }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
[COMMENT,UNDERLINE] FIRST TEST
[STUB_INDENT=2] PREFER BRAND A
[STUB_INDENT=2] PREFER BRAND B
[COMMENT,UNDERLINE] SECOND TEST
[STUB_INDENT=2] PREFER BRAND C
[STUB_INDENT=2] PREFER BRAND D
[COMMENT,UNDERLINE] THIRD TEST
[STUB_INDENT=2] PREFER BRAND E
[STUB_INDENT=2] PREFER BRAND F }
ROW=: SELECT_VALUE([7^1/X],VALUES(1,.5)) WITH &
SELECT_VALUE([7^2/X],VALUES(1,.5)) WITH &
SELECT_VALUE([8^1/X],VALUES(1,.5)) WITH &
SELECT_VALUE([8^2/X],VALUES(1,.5)) WITH &
SELECT_VALUE([9^1/X],VALUES(1,.5)) WITH &
SELECT_VALUE([9^2/X],VALUES(1,.5))
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH DISTRIBUTED PREFERENCE TESTING OF ROWS AT THE 95% CONFIDENCE LEVEL TABLE 802 PREFERENCE OF PRODUCTS BASE= TOTAL SAMPLE SEX AGE <=========> <=================> TOTAL MALE FEMALE 18-30 31-50 51-70 ----- ---- ------ ----- ----- ----- TOTAL 500 251 249 140 223 101 100.0 100.0 100.0 100.0 100.0 100.0 % % % % % % FIRST TEST ---------- (a) PREFER BRAND A 261 124 136 92 110 36 52.2 49.6 54.8 66.1 49.3 35.6 (b) PREFER BRAND B 239 126 112 48 113 65 47.8 50.4 45.2 33.9 50.7 64.4 e e ns s e s SECOND TEST ----------- (c) PREFER BRAND C 288 134 154 84 130 53 57.6 53.4 61.8 60.4 58.3 52.5 (d) PREFER BRAND D 212 117 95 56 93 48 42.4 46.6 38.2 39.6 41.7 47.5 s ns s s s e THIRD TEST ---------- (e) PREFER BRAND E 280 148 132 94 110 52 56.1 59.0 53.2 66.8 49.3 51.0 (f) PREFER BRAND F 220 103 116 46 113 50 43.9 41.0 46.8 33.2 50.7 49.0 s s ns s e e --------------------------------- (sig=.05) (all_pairs) rows tested a/b, c/d, e/f
Compare this table with Table 801 and notice how the frequencies and percentages have changed. Each number in this table equals the corresponding number in Table 801 plus half of the number in the matching NO PREFERENCE row. Also notice that “s” appears in the same places, but cells that were previously blank now contain either “ns” or “e”, depending upon the difference between the two cells. One additional thing to notice is that the footnote for this table has the same form as the one for Table 801, so the only way to tell which test was done is by looking to see if there are any “ns” or “e” markings on the table.
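The relationship between the two tables can be verified by hand for any column. The Python sketch below is only a worked check of the arithmetic using the TOTAL column of the first test; the function names are invented for the illustration and are not Mentor functions.

```python
def distribute_no_preference(first, second, no_preference):
    """Split the neutral responses evenly between the two preference rows."""
    half = no_preference / 2
    return first + half, second + half


def percent(count, base):
    """Percentage of the base, to one decimal place as printed on the tables."""
    return round(100 * count / base, 1)


# TOTAL column, FIRST TEST, Table 801: 236 prefer A, 214 prefer B, 50 no preference.
brand_a, brand_b = distribute_no_preference(236, 214, 50)
print(brand_a, brand_b)                            # frequencies after distribution
print(percent(brand_a, 500), percent(brand_b, 500))  # percentages on a base of 500
```

The results, 261 and 239 (52.2 and 47.8 percent), are the values printed in the TOTAL column of Table 802, and the two percentages now add to 100.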
8.9 CHI-SQUARE AND ANOVA TESTS
Statistical significance testing is often desirable as a part of cross-tabulation reporting. Such testing is used to determine whether or not a statistically significant relationship exists between two or more tabulated factors. Tests commonly used for this are chi-square and analysis of variance (ANOVA). ANOVA tests for significance between means. So, if significance testing is desired on any question for which means are calculated, ANOVA would be the likely choice. The chi-square test is for significance between parts of the column axis (banner) and the entire row axis (stub). It would be chosen for questions where calculation of means is not applicable. For both ANOVA and chi-square testing, row categories must be mutually exclusive, as must column categories within tested parts of the banner.
The ANOVA and chi-square tests discussed here are invoked by EDIT= statements rather than by variable/axis definition expressions and are performed by Mentor at the time of printing rather than at the time of numeric calculation. This approach shortens run times and improves overall efficiency. There are occasions when statistics must be computed at the time of numerical calculation. A discussion of invoking statistical testing with variable/axis definition expressions can be found in Appendix B: TILDE COMMANDS.
The following are some relevant keywords for creating tables with ANOVA and chi-square tests.
EDIT
Used in the ~DEFINE block, this controls numerous printing and percentage options. Each table can have its own EDIT statement, so options can be changed as required by varying question types. Some options that pertain to ANOVA and chi-square testing are:
TABLE_TESTS=<region> is used to specify table regions to be tested. Text labeling for the test is included in the region definition. The EDIT statement must include a separate TABLE_TESTS= command for each region tested.
-TABLE_TESTS causes statistical testing not to be performed. This option should be included in an EDIT statement separate from that which includes TABLE_TESTS=<region> options, and invoked for tables that do not require statistical testing.
COLUMN_STATISTICS_VALUES=VALUES(<values>) assigns response weights to the row categories. These weights are used in the calculation of statistics such as mean and standard deviation, and for ANOVA testing.
MEAN, STD, ANOVA, and CHI_SQUARE cause the corresponding statistical calculations to be performed and printed as part of the table. (STD is the abbreviation for standard deviation.)
-CHI_SQUARE_ANOVA_FORMAT places chi-square and ANOVA statistics in list form after the corresponding table. If this option is not used, these statistics will be printed directly under the table regions tested, provided the regions are wide enough. ANOVA and chi-square statistics for table regions with narrow column widths, such as yes/no questions in the banner, will not print and will cause Mentor to generate error messages.
SHOW_SIGNIFICANCE_ONLY causes only the significance to show under the table regions tested. This will not work with -CHI_SQUARE_ANOVA_FORMAT, so it cannot be done in list form.
MARK_CHI_SQUARE marks cells as significant based on chi-square testing. This is an alternative to Newman_Keuls, ANOVA_SCAN, and ALL_PAIRS testing. For each significant CHI_SQUARE test on the table, a formula is used to determine which cells are the most extreme. For bi-level testing, the process is repeated.
The syntax is: EDIT={edit1: MARK_CHI_SQUARE=abcde } See Mentor, Volume II, ~DEFINE EDIT for more details and examples.
LOCAL_EDIT=<name>
Used in the ~EXECUTE block, this invokes a previously defined EDIT statement. Options specified in the EDIT statement and also named in a LOCAL_EDIT command will take precedence over the same options in any other EDIT statements previously invoked. Options that are in a previously invoked EDIT statement, but not in the LOCAL_EDIT command, will stay in effect. If ~SET DROP_LOCAL_EDIT is used, a LOCAL_EDIT command is in effect only for the first table following it.
STUB
Text that will label each vertical category on the printed tables is defined using this statement. Control is offered over various options, but one is of particular interest:
[-COLUMN_STATISTICS_VALUES] excludes that category from statistical calculations. It is often used to exclude the “Don’t know” category. More detailed descriptions of the capabilities and syntax of these keywords can be found in Appendix B: TILDE COMMANDS, STUB=.
The example that follows illustrates how to create tables with ANOVA and chi-square tests. Also demonstrated are the following design characteristics:
- Separate definition of table regions for testing makes the specs more readable and easier to understand should maintenance be required in the future.
- Region definitions ($R) are created to be banner (column) specific, but not stub (row) specific by always typing “1 to LAST” for the row part of the definition. This way, it is necessary to type the column parts of the region definitions for a banner only once since the same banner regions are tested each time a specific banner is used. Testing of row categories is then controlled on a question-by-question basis through use of [-COLUMN_STATISTICS_VALUES].
- Separate EDIT statements invoked by LOCAL_EDIT commands control which statistical tests are to be performed on each question.
- For banner 1, the -CHI_SQUARE_ANOVA_FORMAT option is used to print ANOVA and chi-square statistics in list form following their corresponding tables.
~INPUT DATACLN ~SET AUTOMATIC_TABLES,DROP_LOCAL_EDIT ~DEFINE EDIT={STATS_OFF: -TABLE_TESTS } ''Banner 1 definitions '' column row '' Stat tests must be labeled region region '' since they will appear in list form in banner in stub '' ------------------------------------ --------- ------- BAN1_REG1: [$T="STAT TEST FOR SERVICE TYPE " $R 2 TO 3 BY 1 TO LAST] BAN1_REG2: [$T="STAT TEST FOR NEW SERVICE " $R 4 TO 5 BY 1 TO LAST] BAN1_REG3: [$T="STAT TEST FOR TAX PREPARATION " $R 6 TO 7 BY 1 TO LAST] BAN1_REG4: [$T="STAT TEST FOR FREQUENCY OF USE " $R 8 TO 10 BY 1 TO LAST] BAN1_REG5: [$T="STAT TEST FOR SALES " $R 11 TO 14 BY 1 TO LAST] EDIT={ BAN1_EDIT: -COLUMN_TNA,PERCENT_DECIMALS=0, COLUMN_WIDTH=5,STUB_WIDTH=25, -CHI_SQUARE_ANOVA_FORMAT, ''puts stat tests in ''list following table TABLE_TESTS=BAN1_REG1,TABLE_TESTS=BAN1_REG2, TABLE_TESTS=BAN1_REG3,TABLE_TESTS=BAN1_REG4, TABLE_TESTS=BAN1_REG5 } BANNER={BAN1_BANNER: '' REG1 REG2 REG3 REG4 REG5 '' <-------> <-------> <-------> <--------------> <---------------------> | SERVICE NEW TAX FREQUENCY SALES | TYPE SERVICE PREP. OF USE ======================= | ========= ========== ========= =============== 500- 1- | SW NW ME- <500 <1 4.9 5+ | TOTAL NEW OLD AREA AREA YES NO LOW DIUM HIGH MIL. BIL. BIL. BIL. | ----- --- --- ---- ---- --- -- --- ---- ---- ---- ---- ---- ---- } BAN1_COL: TOTAL WITH & [199^1/2] WITH & ([217#S] AND [15^1,2]) WITH & ([199^1] AND (([217#S] AND & [15^N1,2]) OR [217#L])) WITH & [78^1/2] WITH & [147^1,2,6/3/4,5] WITH & [182.8#1-499999/500000-999999/1000000-4999999/5000000-99999999] ''End of banner 1 definitions ''Banner 2 definitions '' column column '' region region '' in banner in stub '' --------- --------- BAN2_REG1: [$R 2 TO 4 BY 1 TO LAST] BAN2_REG2: [$R 5 TO 8 BY 1 TO LAST] ''Since -CHI-SQUARE-ANOVA-FORMAT is not used in this edit ''statement, statistics tests for BAN2 will appear under their ''corresponding table regions. 
EDIT={BAN2_EDIT:-COLUMN_TNA,PERCENT_DECIMALS=0, COLUMN_WIDTH=5,STUB_WIDTH=25, TABLE_TESTS=BAN2_REG1,TABLE_TESTS=BAN2_REG2 } BANNER={BAN2_BANNER: '' REG1 REG2 '' <--------------> <--------------------> | FREQUENCY SALES | OF USE ====================== | ================ 500- 1- | ME- <500 <1 4.9 5+ | TOTAL LOW DIUM HIGH MIL. BIL. BIL. BIL. | ----- ---- ---- ---- ---- ---- ---- ---- } BAN2_COL: TOTAL WITH [147^1,2,6/3/4,5] WITH & [182.8#1-499999/500000-999999/1000000-4999999/5000000-99999999] ''End of banner 2 definitions ''Stub definitions ''Question 4 TITLE={Q4_TITLE: Q4. Please indicate on a scale from 1 to 4 how satisfied you are with your overall relationship with this company. } STUB={Q4_STUB: 4 - Very satisfied 3 - Satisfied 2 - Dissatisfied 1 - Very dissatisfied [-COLUMN_STATISTICS_VALUES] Don't know/not sure ''excluded from ''statistics } Q4_ROW: [163^4//1/Y] EDIT={Q4_EDIT: COLUMN_STATISTICS_VALUES=VALUES(4,3,2,1),MEAN,STD,ANOVA } ''Question 19 TITLE={Q19_TITLE: Q19. Does this company prepare taxes? } STUB={Q19_STUB: Yes No [-COLUMN_STATISTICS_VALUES] Don't know/not sure ''excluded from ''statistics } Q19_ROW: [78^1//3] EDIT={Q19_EDIT: CHI_SQUARE } EDIT={Q19SIG_EDIT: CHI_SQUARE,SHOW_SIGNIFICANCE_ONLY } >CREATE_DB TABLES >PRINT_FILE TABLES ~EXECUTE ''BAN1's EDIT statement causes statistics tests to be printed as ''lists after the corresponding tables. BANNER=ban1_banner,EDIT=ban1_edit,COLUMN=ban1_col ''Question 4 LOCAL_EDIT=q4_edit,TITLE=q4_title,STUB=q4_stub,ROW=q4_row ''Question 19 with statistics test LOCAL_EDIT=q19_edit,TITLE=q19_title,STUB=q19_stub,ROW=q19_row ''BAN2's EDIT statement allows statistics tests to be printed ''under their corresponding table regions (default). 
BANNER=ban2_banner,EDIT=ban2_edit,COLUMN=ban2_col ''Question 19 with statistics test LOCAL_EDIT=q19_edit,TITLE=q19_title,STUB=Q19_STUB,ROW=Q19_ROW ''Question 19 with statistics test showing significance only LOCAL_EDIT=q19sig_edit,TITLE=q19_title,STUB=q19_stub,ROW=q19_row ''Question 19 without statistics test LOCAL_EDIT=stats_off,TITLE=q19_title,STUB=q19_stub,ROW=q19_row RESET,PRINT_ALL
Here are the tables that are printed:
TABLE 001 Q4. Please indicate on a scale from 1 to 4 how satisfied you are with your overall relationship with this company. SERVICE NEW TAX FREQUENCY SALES TYPE SERVICE PREP. OF USE =================== ======= ========= ======== ============= 500- 1- SW NW ME- <500 <1 4.9 5+ TOTAL NEW OLD AREA AREA YES NO LOW DIUM HIGH MIL. BIL. BIL. BIL. ----- --- --- ---- ---- --- -- --- ---- ---- ---- --- ---- ---- Total 151 96 55 12 84 64 79 56 30 41 37 18 49 25 100% 100% 100% 100% 100% 100% 100% 100% 100% 100% 100% 100% 100% 100% N/A - - - - - - - - - - - - - - 4 - Very satisfied 31 22 9 - 22 12 19 9 7 9 12 3 7 3 21% 23% 16% 26% 19% 24% 16% 23% 22% 32% 17% 14% 12% 3 - Satisfied 68 39 29 6 33 34 30 25 16 17 12 12 22 12 45% 41% 53% 50% 39% 53% 38% 45% 53% 41% 32% 67% 45% 48% 2 - Dissatisfied 38 28 10 5 23 11 25 16 5 11 12 1 14 6 25% 29% 18% 42% 27% 17% 32% 29% 17% 27% 32% 6% 29% 24% 1 - Very dissatisfied 8 4 4 1 3 5 2 4 1 2 - 1 4 2 5% 4% 7% 8% 4% 8% 3% 7% 3% 5% 6% 8% 8% Don't know/not sure 6 3 3 - 3 2 3 2 1 2 1 1 2 2 4% 3% 5% 4% 3% 4% 4% 3% 5% 3% 6% 4% 8% Mean 2.8 2.8 2.8 2.4 2.9 2.9 2.9 2.7 3.0 2.8 3.0 3.0 2.7 2.7 Standard Deviation 0.8 0.8 0.8 0.7 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.8 0.8 STAT TEST FOR SERVICE TYPE anova = 0.02, df1,df2 = (1,143) prob = 0.8694 STAT TEST FOR NEW SERVICE anova = 3.83, df1,df2 = (1,91) prob = 0.0504 STAT TEST FOR TAX PREPARATION anova = 0.01, df1,df2 = (1,136) prob = 0.9204 STAT TEST FOR FREQUENCY OF USE anova = 1.10, df1,df2 = (2,119) prob = 0.3374 STAT TEST FOR SALES anova = 1.50, df1,df2 = (3,119) prob = 0.2178
TABLE 002 Q19. Does this company prepare taxes? SERVICE NEW TAX FREQUENCY SALES TYPE SERVICE PREP. OF USE =================== ======= ========= ======== ============= 500- 1- SW NW ME- <500 <1 4.9 5+ TOTAL NEW OLD AREA AREA YES NO LOW DIUM HIGH MIL. BIL. BIL. BIL. ----- --- --- ---- ---- --- -- --- ---- ---- ---- --- ---- ---- Total 151 96 55 12 84 64 79 56 30 41 37 18 49 25 100% 100% 100% 100% 100% 100% 100% 100% 100% 100% 100% 100% 100% 100% N/A - - - - - - - - - - - - - - Yes 64 23 41 4 19 64 - 25 14 22 15 11 18 14 42% 24% 75% 33% 23% 100% 45% 47% 54% 41% 61% 37% 56% No 79 69 10 8 61 - 79 31 15 15 20 5 27 11 52% 72% 18% 67% 73% 100% 55% 50% 37% 54% 28% 55% 44% Don't know/not sure 8 4 4 - 4 - - - 1 4 2 2 4 - 5% 4% 7% 5% 3% 10% 5% 11% 8% STAT TEST FOR SERVICE TYPE chi_square = 38.51, d_f = 1, prob = 0.0000 STAT TEST FOR NEW SERVICE chi_square = 0.13, d_f = 1, prob E<5 STAT TEST FOR TAX PREPARATION chi_square = 138.98, d_f = 1, prob = 0.0000 STAT TEST FOR FREQUENCY OF USE chi_square = 2.00, d_f = 2, prob = 0.3694 STAT TEST FOR SALES chi_square = 4.93, d_f = 3, prob = 0.1764
TABLE 003
Q19. Does this company prepare taxes?

                                 FREQUENCY              SALES
                                  OF USE        ======================
                              ================        500-    1-
                                     ME-        <500    <1   4.9    5+
                       TOTAL   LOW  DIUM  HIGH  MIL.  BIL.  BIL.  BIL.
                       -----   ---  ----  ----  ----  ----  ----  ----
Total                    151    56    30    41    37    18    49    25
                        100%  100%  100%  100%  100%  100%  100%  100%
N/A                        -     -     -     -     -     -     -     -
Yes                       64    25    14    22    15    11    18    14
                         42%   45%   47%   54%   41%   61%   37%   56%
No                        79    31    15    15    20     5    27    11
                         52%   55%   50%   37%   54%   28%   55%   44%
Don't know/not sure        8     -     1     4     2     2     4     -
                          5%          3%   10%    5%   11%    8%
CHI-SQUARE:                    <-- 2.00 -->         <-- 4.93 -->
D.F.:                               2                    3
SIG:                              0.3694               0.1764
TABLE 004
Q19. Does this company prepare taxes?

                                 FREQUENCY              SALES
                                  OF USE        ======================
                              ================        500-    1-
                                     ME-        <500    <1   4.9    5+
                       TOTAL   LOW  DIUM  HIGH  MIL.  BIL.  BIL.  BIL.
                       -----   ---  ----  ----  ----  ----  ----  ----
Total                    151    56    30    41    37    18    49    25
                        100%  100%  100%  100%  100%  100%  100%  100%
N/A                        -     -     -     -     -     -     -     -
Yes                       64    25    14    22    15    11    18    14
                         42%   45%   47%   54%   41%   61%   37%   56%
No                        79    31    15    15    20     5    27    11
                         52%   55%   50%   37%   54%   28%   55%   44%
Don't know/not sure        8     -     1     4     2     2     4     -
                          5%          3%   10%    5%   11%    8%
CHI-SQUARE (SIG):             <-- 0.3694 -->       <-- 0.1764 -->
TABLE 005
Q19. Does this company prepare taxes?

                                 FREQUENCY              SALES
                                  OF USE        ======================
                              ================        500-    1-
                                     ME-        <500    <1   4.9    5+
                       TOTAL   LOW  DIUM  HIGH  MIL.  BIL.  BIL.  BIL.
                       -----   ---  ----  ----  ----  ----  ----  ----
Total                    151    56    30    41    37    18    49    25
                        100%  100%  100%  100%  100%  100%  100%  100%
N/A                        -     -     -     -     -     -     -     -
Yes                       64    25    14    22    15    11    18    14
                         42%   45%   47%   54%   41%   61%   37%   56%
No                        79    31    15    15    20     5    27    11
                         52%   55%   50%   37%   54%   28%   55%   44%
Don't know/not sure        8     -     1     4     2     2     4     -
                          5%          3%   10%    5%   11%    8%
DISCUSSION OF OUTPUT
While interpretation of these statistical tests for specific reports is beyond the scope of this manual, some explanation of the results might be helpful.
For table 001, output from the ANOVA was as follows:
STAT TEST FOR SERVICE TYPE anova = 0.02, df1,df2 = (1,143) prob = 0.8694 STAT TEST FOR NEW SERVICE anova = 3.83, df1,df2 = (1,91) prob = 0.0504 STAT TEST FOR TAX PREPARATION anova = 0.01, df1,df2 = (1,136) prob = 0.9204 STAT TEST FOR FREQUENCY OF USE anova = 1.10, df1,df2 = (2,119) prob = 0.3374 STAT TEST FOR SALES anova = 1.50, df1,df2 = (3,119) prob = 0.2178
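The number following “anova =” is the value of the F statistic for the region tested. “df1,df2” are the degrees of freedom for the F statistic’s numerator and denominator, respectively. “prob =” is the significance of the F statistic: the probability of obtaining an F value at least this large if there were no real relationship between the factors tested. The smaller this value, the less likely it is that the relationship is merely coincidental.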
To check whether the desired row and banner categories are in fact being used in the ANOVA calculation, the following equations can be used:
df1 = (the number of banner points included in the ANOVA) – 1
df2 = (the sum of frequencies of all cells included in the ANOVA) – (the number of banner points included in the ANOVA)
If a column is blank, it is not considered as being included in the test.
Checking the “STAT TEST FOR SALES” ANOVA above, there are four banner points included in the test, resulting in
df1 = 4 – 1 = 3
The frequencies of the cells included in this test (“DON’T KNOW/NOT SURE” is excluded) are “<500 MIL.”=36, “500-<1 BIL.”=17, “1-4.9 BIL.”=47, and “5+ BIL.”=23.
The number of banner points is four, resulting in
df2 = (36 + 17 + 47 + 23) – 4 = 119.
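The check above can be sketched in a few lines of Python for anyone verifying ANOVA output by hand. This is only an illustration and not part of Mentor; the function name anova_df is hypothetical, and the frequencies are the ones quoted above for the SALES region.

```python
def anova_df(cell_counts):
    """Return (df1, df2) for an ANOVA region.

    cell_counts holds one frequency per banner point included in the test;
    blank columns must already be left out of the list.
    """
    banner_points = len(cell_counts)
    df1 = banner_points - 1
    df2 = sum(cell_counts) - banner_points
    return df1, df2


# Frequencies quoted above for "STAT TEST FOR SALES"
sales_counts = [36, 17, 47, 23]
df1, df2 = anova_df(sales_counts)
print(df1, df2)  # 3 119
```

If the pair printed here differs from the “df1,df2” values in the list file, a row or banner point other than the intended ones is probably being included in the test.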
For table 002, the chi-square test produced the following output:
STAT TEST FOR SERVICE TYPE
chi_square = 38.51, d_f = 1, prob = 0.0000
STAT TEST FOR NEW SERVICE
chi_square = 0.13, d_f = 1, prob E<5
STAT TEST FOR TAX PREPARATION
chi_square = 138.98, d_f = 1, prob = 0.0000
STAT TEST FOR FREQUENCY OF USE
chi_square = 2.00, d_f = 2, prob = 0.3694
STAT TEST FOR SALES
chi_square = 4.93, d_f = 3, prob = 0.1764
Similar to the ANOVA output, the number following “chi_square=” is the value of the chi-square statistic for the region tested. “d_f” represents the degrees of freedom for this statistic. The same facts apply to “prob=” as did for ANOVA.
“E<5” means that the expected value of the frequency for 25% or more of the cells in the tested region is less than 5, possibly making probability calculations for the region invalid.
Checking that the desired categories are being tested is easier for the chi-square test than for the ANOVA. For degrees of freedom, the following equation applies:
d_f = (number of stub points included in the test – 1) * (number of banner points included in the test – 1).
If a row or column is blank, it is not considered as being included in the test.
Using “STAT TEST FOR SALES” as an example, “number of stub points included in the test”=2 since “DON’T KNOW/NOT SURE” is excluded, and “number of banner points included in the test”=4. Putting these numbers in the equation:
d_f = (2 – 1) * (4 – 1) = 1 * 3 = 3.
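As a hand check, the statistic, the degrees of freedom, and the “E<5” condition can be recomputed with a short Python sketch. It assumes an ordinary unweighted Pearson chi-square with no blank rows or columns in the region; whether Mentor computes the statistic in exactly this way is an assumption, although the result agrees with the printed value for the SALES region. The function name chi_square is hypothetical, and the frequencies are the “Yes” and “No” rows of the Q19 table.

```python
def chi_square(table):
    """Pearson chi-square for a list of rows of observed frequencies.

    Returns (statistic, d_f, low_expected), where low_expected is True when
    25% or more of the cells have an expected frequency below 5.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    low_cells = 0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
            if expected < 5:
                low_cells += 1

    d_f = (len(table) - 1) * (len(col_totals) - 1)
    n_cells = len(table) * len(col_totals)
    return statistic, d_f, low_cells / n_cells >= 0.25


# "Yes" and "No" rows under the four SALES banner points
# ("Don't know/not sure" is excluded from the test).
sales = [
    [15, 11, 18, 14],  # Yes
    [20, 5, 27, 11],   # No
]
statistic, d_f, low_expected = chi_square(sales)
print(round(statistic, 2), d_f, low_expected)  # 4.93 3 False
```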
OTHER ANOVA AND CHI-SQUARE OPTIONS
It is possible to mix ANOVA and chi-square tests on the same table. This is true only when the -CHI_SQUARE_ANOVA_FORMAT option is used to output the results in list format. Here is an example of an EDIT statement that would do this. The regions have been defined as in the above example.
EDIT={Q47_EDIT2: CHI_SQUARE=BAN1_REG1 CHI_SQUARE=BAN1_REG2 CHI_SQUARE=BAN1_REG3 CHI_SQUARE=BAN1_REG4 ANOVA=BAN1_REG5 }
NOTE: An alternative way of defining regions and invoking statistical tests is as follows:
EDIT={EDIT1:
TABLE_TESTS=[$R 1 TO 2 BY 1 TO LAST]
TABLE_TESTS=[$R 3 TO 4 BY 1 TO LAST]
TABLE_TESTS=[$R 5 TO 6 BY 1 TO LAST]
ANOVA
}
If this method is used, Mentor will not allow both ANOVA and chi-square on the same table.
Two other interesting possibilities are testing overlapping ranges and testing columns that are not next to each other. Again, -CHI_SQUARE_ANOVA_FORMAT must be used, and the horizontal axis may need some creative labeling. Here are some example specs:
BAN3_REG1: [$T="GENDER" $R 2 TO 3 BY 1 TO LAST]
BAN3_REG2: [$T="DRIVE CAR" $R 4 TO 5 BY 1 TO LAST]
BAN3_REG3: [$T="1-10, 11-49, & 50+ YRS OF DRIVING" $R 6,7,9 BY 1 TO LAST]
BAN3_REG4: [$T="<50 & 50+ YRS OF DRIVING" $R 8,9 BY 1 TO LAST]
''BAN3_REG3 defines columns which are not next to each other.
''BAN3_REG4 defines a region that overlaps BAN3_REG3.

EDIT={BAN3_EDIT:
-COLUMN_TNA,PERCENT_DECIMALS=0
COLUMN_WIDTH=7,STUB_WIDTH=25
-CHI_SQUARE_ANOVA_FORMAT
TABLE_TESTS=BAN1_REG1
TABLE_TESTS=BAN1_REG2
TABLE_TESTS=BAN1_REG3
TABLE_TESTS=BAN1_REG4
}

BANNER={BAN3_BANNER:
|             GENDER      DRIVE CAR       YEARS OF DRIVING
|          ============= ============= ===========================
|  TOTAL   MALE  FEMALE    YES     NO    1-10  11-49    <50    50+
|  -----   ----  ------    ---     --    ----  -----    ---    ---
}

BAN3_COL: TOTAL WITH &
[5^1/2] WITH &
[10^1/2] WITH &
[30.2*P#1-10/11-49/1-49/50-99]

EDIT={Q49_EDIT:
''Used as a LOCAL_EDIT to invoke tests for a
''specific question.
CHI_SQUARE=BAN3_REG1
CHI_SQUARE=BAN3_REG2
CHI_SQUARE=BAN3_REG3
CHI_SQUARE=BAN3_REG4
}
8.10 NOTES ON SIGNIFICANCE TESTING
This section contains some additional notes on significance testing in order to help you understand some of the problems or errors that might occur.
When using other sources to verify significance testing accuracy, be aware that the majority of other programs or textbooks do not take into consideration the consequences of dependent or inclusive tests, the Newman-Keuls procedure, and pooled variances. Mentor uses the additional information available to produce more reliable results in general.
8.10.1 What Can and Cannot Be Tested
Significance testing can only be performed on means or percentages produced from simple frequencies. As a general rule, a “simple” frequency counts respondents, not responses, nor does it include any mathematical calculations. Basically each data case must return either a 1 (it is there) or a 0 (it is not there) for the cells being tested.
The only exceptions to this are either a Mean or a weighted table. You can test cells with different weighting schemes, but only if a given data case has the same weight everywhere in that test.
The following constructions will not produce simple frequencies and therefore cannot be tested for significance:
- *L modifier
- all summary statistics other than means
- any arithmetic operation
- sigma
- sums
- NUMBER_OF_ITEMS or any other number returning function
These rules apply to both ROW and COLUMN definitions. In addition, if you have any kind of summary statistic such as a $[MEAN] in the COLUMN variable, you cannot test any part of the table. If you have any of the above constructions in the ROW other than summary statistics, you will not be able to test any part of the table, unless you turn off the testing around that item using $[-DO_STATISTICS] (See Section 8.6 EXCLUDING ROWS/COLUMNS FROM THE SIGNIFICANCE TESTING).
Also the AXIS commands $[BREAK] and $[BASE] are acceptable in ROW definitions but not in COLUMN definitions. $[OVERLAY] and $[NETOVERLAY] tables can be tested. If you wish to do a test on responses, you must either use a $[OVERLAY] or a READPROCEDURE to read each data case multiple times.
Tables that are created or modified by table manipulation cannot be tested for significance. Market share tables which use sums cannot be tested. If you are trying to test items based on market share we recommend that you test the mean of the items. Testing a straight market share is very dangerous as a single outlier can easily skew the results.
Any construction can be tested using print phase tests, but remember that Mentor does not check or complain if your testing is illogical. For example, you should never use Mentor to test two numbers that were created as a calculation.
8.10.2 Degrees of Freedom
Degrees of freedom (df) is a measure of sample size for use in statistical tests. The higher the degrees of freedom, the more reliable the resulting t values are. The degrees of freedom should be calculated as follows:
Let’s assume a sample of size n (either simple count or effective_n) with n1 in group 1, n2 in group 2, and n_both in the overlap.
The degrees of freedom for the All Possible Pairs Test procedure, regardless of the variance specified, or for the Newman-Keuls procedure when only testing two groups (all simple t-tests), is calculated as follows:
no overlap, means df = n1 + n2 – 2
no overlap, percents df = n1 + n2 – 1
overlap, means or percents df = n1 + n2 – n_both – 1
When you use the T= (for inclusive tests) option to take out the overlap, the program utilizes the no overlap formula which results in n – 2 for means and n – 1 for percents. If one group is completely contained in the other it always results in n – 1.
The degrees of freedom for the Newman-Keuls procedure when testing more than two groups would be:
no overlap df = n1 + n2 – 2
overlap df = n1 + n2 – n_both – 1
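The rules above can be collected into one small Python helper for checking the degrees of freedom reported for a pair of groups. It is a sketch for hand verification only; the function name pair_df and its parameters are hypothetical and do not correspond to any Mentor keyword.

```python
def pair_df(n1, n2, n_both=0, kind="means", newman_keuls_multi=False):
    """Degrees of freedom for a test of two groups, following the rules above.

    n1, n2             sizes of the two groups (simple counts or effective n)
    n_both             number of cases that fall in both groups
    kind               "means" or "percents"
    newman_keuls_multi True for the Newman-Keuls procedure with more than
                       two groups; False for simple t-tests
    """
    if kind not in ("means", "percents"):
        raise ValueError("kind must be 'means' or 'percents'")
    if n_both > 0:
        # Overlap: same formula for means and percents in both procedures.
        return n1 + n2 - n_both - 1
    if newman_keuls_multi or kind == "means":
        return n1 + n2 - 2
    return n1 + n2 - 1  # no overlap, percents, simple t-test


# Sample group sizes, chosen only for illustration
print(pair_df(12, 16, kind="means"))     # 26
print(pair_df(12, 16, kind="percents"))  # 27
print(pair_df(12, 16, n_both=4))         # 23
```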
8.10.3 Verifying Statistical Tests
It may be desirable to verify the results on tables that utilize Mentor’s statistical tests. We could use the formulas shown in Appendix A: STATISTICAL FORMULAS, but more often a cursory check that the correct columns and rows were tested is all that is needed.
The ~SET keyword STATISTICS_DUMP does this by sending a printout to the list file showing key elements of the statistical procedure.
For example, if the test performed was an independent test of the means which included a test of column A against column B, then a portion of the list file pertaining to this statistical test will look similar to this:
test 1 (216 len, 2 groups, err=0, base_row=0):I=AB row/col=(15,15; 1,2) means
ncases=28
group 1 12, 76, 564, 0,
group 2 16, 106, 778, 0,
effn,mean,std: 12 6.33333 2.74138 -- sumsq,sumsqadj,effn: 82.6667 1 12
effn,mean,std: 16 6.625 2.24722 -- sumsq,sumsqadj,effn: 75.75 1 16
tags: 1, 2,
getpoolv: 6.09295,2:26=158.417/26
doqs (6.09295,26) 0-1 tags: 1, 2,
-->0: (-0.437582,26)
SIGFAREA(26,-0.437582->0.05 for 1) returns 0.7572 from -0.309417 -> 0,0.7572
differences[ in AB: ]
This information can be used in numerous ways to check the statistical test performed. The first line tells us this is a test of two groups, the base row for statistics is the System Total row (base_row=0), it is an independent test of columns A and B (:I=AB), it is using row 15 and columns 1 and 2, and it is a test of means.
The number of cases in the statistics base is 28 (16 + 12). The line with “-->0: (-0.437582,26)” shows 0: (q-value, df). In the next line down, “returns 0.7572 from -0.309417 -> 0,0.7572” means “returns <significance> from <t-value>”.
A printout like the one shown above would be generated for each pair of columns tested.
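The figures in the means dump above can be reproduced by hand, which is a quick way to confirm how the printout is being read. The Python sketch below assumes that the first three numbers on each “group” line are the group size, the sum of the values, and the sum of the squared values, and that the q-value is the t-value multiplied by the square root of 2. Both readings are inferences from the printed numbers rather than documented definitions, and the function name pooled_t is hypothetical.

```python
import math


def pooled_t(n1, sum1, sumsq1, n2, sum2, sumsq2):
    """Pooled-variance t-test of two independent means from group sums."""
    mean1, mean2 = sum1 / n1, sum2 / n2
    # Sums of squared deviations from each group mean
    ss1 = sumsq1 - sum1 ** 2 / n1
    ss2 = sumsq2 - sum2 ** 2 / n2
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    q = t * math.sqrt(2)  # assumed relationship between q and t
    return {
        "means": (mean1, mean2),
        "std": (math.sqrt(ss1 / (n1 - 1)), math.sqrt(ss2 / (n2 - 1))),
        "df": df,
        "pooled_var": pooled_var,
        "t": t,
        "q": q,
    }


# "group 1 12, 76, 564" and "group 2 16, 106, 778" from the dump above
result = pooled_t(12, 76, 564, 16, 106, 778)
for name, value in result.items():
    print(name, value)
```

Under these assumptions the pooled variance, degrees of freedom, t-value, and q-value agree with the “getpoolv”, “-->0:”, and “returns ... from” lines of the dump to the precision printed.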
If our test was a dependent test of columns A, B, C, D, E, and F for all percent rows and the mean row, then a portion of the list file pertaining to this statistical test would look similar to this:
test 1 (1232 len, 6 groups, err=0, base_row=0):ABCDEF row/col=(3,3; 1,6) percents
ncases=166
group 1 12, 4, 4, 0,
group 2 16, 6, 6, 0,
group 3 14, 2, 2, 0,
group 4 27, 13, 13, 0,
group 5 69, 9, 9, 0,
group 6 28, 9, 9, 0,
sxy matrix all zero
effn,mean,std: 12 0.333333 0.492366 -- sumsq,sumsqadj,effn: 2.66667 1 12
effn,mean,std: 16 0.375 0.5 -- sumsq,sumsqadj,effn: 3.75 1 16
effn,mean,std: 14 0.142857 0.363137 -- sumsq,sumsqadj,effn: 1.71429 1 14
effn,mean,std: 27 0.481481 0.509175 -- sumsq,sumsqadj,effn: 6.74074 1 27
effn,mean,std: 69 0.130435 0.339248 -- sumsq,sumsqadj,effn: 7.82609 1 69
effn,mean,std: 28 0.321429 0.475595 -- sumsq,sumsqadj,effn: 6.10714 1 28
tags: 1, 2, 3, 4, 5, 6,
multiple comparisons
getpoolv: 0.1931,2:165=43/166
doq (0.1931,165) 0-5 tags: 1, 2, 3, 4, 5, 6,
-->0: (-0.351143,26) (1.55823,24) (-1.37423,37) (2.08774,79) (0.111041,38)
-->1: (2.04147,28) (-1.08619,41) (2.83657,83) (0.550136,42)
-->2: (-3.309,39) (0.136389,81) (-1.75572,40)
-->3: (4.97691,94) (1.90971,53)
-->4: (-2.74322,95)
doqs (0.1931,165) 0-5 tags: 1, 2, 3, 4, 5, 6,
-->0: (-0.316228,27) (1.59364,25) (-1.2021,38) (2.48386,80) (0.102869,39)
-->1: (1.99451,29) (-0.94989,42) (3.25042,84) (0.50417,43)
-->2: (-2.98179,40) (0.175692,82) (-1.73374,41)
-->3: (5.17632,95) (1.69734,54)
-->4: (-3.08478,96)
differences[ in ABCDEF: D vs E; ]
differences[ D vs E; ]
The first line tells us this is a test of six groups (ABCDEF), the base row is the System Total row (base_row=0), it is a dependent test of columns A, B, C, D, E, and F. The test is using row 3 and columns 1 to 6, and it is a test of percents. The line beginning with “doqs (0.1931,165)” shows:
(q-value,df) for
-->0: (A vs B) (A vs C) (A vs D) (A vs E) (A vs F)
-->1: (B vs C) (B vs D) (B vs E) (B vs F)
-->2: (C vs D) (C vs E) (C vs F)
-->3: (D vs E) (D vs F)
-->4: (E vs F)
A printout like this would be available for each row tested.
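When a test involves many columns, it can help to generate the pair layout rather than count positions by eye. The following Python sketch prints the same triangular arrangement for any list of column letters; the function name pair_rows is hypothetical and the code is only a reading aid for the dump, not something Mentor provides.

```python
def pair_rows(columns):
    """Return the column pairs in the order used by the "-->n:" lines.

    Row n pairs column n with every column to its right.
    """
    rows = []
    for i, first in enumerate(columns[:-1]):
        rows.append([f"{first} vs {other}" for other in columns[i + 1:]])
    return rows


for index, row in enumerate(pair_rows("ABCDEF")):
    print(f"-->{index}: " + " ".join(f"({pair})" for pair in row))
```

The position of a value within a “-->n:” line then identifies the pair it belongs to; for example, the first entry on the “-->3:” line is the D vs E comparison.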
8.10.4 Error and Warning Messages
The following is a list of error and warning messages that can occur while doing statistical testing. Each message has a brief description of why it might occur and how to fix it.
(ERROR #603) printable_t collision with code C
This error is caused when the STATISTICS= PRINTABLE_T option is used, but the column list is not printable. Column “C” is probably used multiple times. Either do not print t values or check your column list.
(ERROR #5055) table T002, test 3 (=GHIJ) is an i= test but has cells 4 and 2
Test of columns GHIJ was marked as independent, but is dependent. Data is either dirty or you are using the wrong test.
(ERROR #5056) table T001, test 4 (=ABC) (at 1): duplicated value or missing base
Test of columns ABC has a row containing items that are not in the base row, which makes the test illegal so that it cannot be run. If you do not actually want to test that row, you can downgrade the error to a warning, or silence it, with the SET option MISSING_OR_DUPLICATED_BASE=ERROR/WARNING/OKAY.
(ERROR #5091) Newman_Keuls test not ok with approximate significance 0.75
This means you cannot use the DO_STATISTICS APPROXIMATELY option in conjunction with the Newman-Keuls test.
(ERROR #5520) tables with COL/ROW_WEIGHT= cannot have STATS= without SET MULTIWGT
COLUMN_SHORT_WEIGHT or ROW_SHORT_WEIGHT has been used in conjunction with a STATISTICS statement. Use the SET option MULTIPLE_WEIGHT_STATISTICS to override.
(ERROR #5524) stat test 3 (=GHIJ) had an error during construction
Test of columns GHIJ was marked as inclusive, but is not. Data is either dirty or you are using the wrong test.
(ERROR #6736) table T011, test 1 (at 2) has conflicting weight values 1 vs 0.88
A data case cannot have multiple weights in a single test, even if SET MULTIPLE_WEIGHT_STATISTICS is used. Table 011 has a data case with weights of 1.00 and 0.88.
(ERROR #6831) col A has stats percent 102/372 different from vertical percent 102/295
There is a cell in column A in which the statistical base does not match the vertical percentage base. In particular the cell has a frequency of 102, the statistical base a value of 372, and the percentage base a value of 295.
(WARN #1141) to install this option, we have to clear out the existing tables first at (18):
SET option STATISTICS_BASE_AR or -STATISTICS_BASE_AR has been used in the middle of a run. No override possible.
(WARN #5157) we will use all_pairs tests as the default statistical test
No test was specified on the EDIT statement so the default All Possible Pairs Test will be used. Suppress this warning by specifying ALL_POSSIBLE_PAIRS_TEST on the main EDIT.
(WARN #5385) table T006 with STATS= TAB6_st but no stat tests to do or report
This warning can appear for multiple reasons. One is that there is a non-simple frequency in the table that was not tested.
(WARN #5621) MULTIPLE_WEIGHT override for stats testing: be careful!
SET option MULTIPLE_WEIGHT_STATS has been used. No override possible.
8.10.5 Commands Summary
The following is a list of all statements/keywords/options that affect statistical testing. Information about these keywords can be found in either Appendix B: TILDE COMMANDS or in the section reference mentioned on the right.
STATEMENT/KEYWORD/OPTION                 SECTION

AXIS
----
$[BASE]                                  8.2.2 and 8.4
$[EFFECTIVE_N]                           8.4
$[DO_STATISTICS]                         8.6.2
$[MEDIAN]                                7.2.2
$[INTERPOLATED_MEDIAN]                   7.2.2
[col.wid^code/(STATISTICS)/code/code]    7.2

EDIT=
-----
ALL_POSSIBLE_PAIRS_TEST                  8.3.1
ANOVA_SCAN                               8.3.3
DO_PRINTER_STATISTICS                    8.5.1
DO_STATISTICS                            8.1.3, 8.1.5, 8.1.6, and 8.1.7
DO_STATISTICS_TEST                       8.5.1
FISHER                                   8.3.3
FLAG_MINIMUM_BASE                        8.6.3
MARK_CHI_SQUARE                          8.9
MINIMUM_BASE                             8.6.3
NEWMAN_KEULS_TEST                        8.3.2
PAIRED_VARIANCE                          8.3.4
POOLED_VARIANCE                          8.3.4
SEPARATE_VARIANCE                        8.3.4
USUAL_VARIANCE                           8.3.4

STATISTICS=
-----------
ABCD                                     8.1.1
I=ABCD                                   8.1.1
T=ABCD                                   8.1.1 and 8.1.8
PRINTABLE_T                              8.1.1 and 8.7.1
D=1,2                                    8.8.1
P=1,2                                    8.8.2
RM=ABCD                                  8.3.3

STUB=
-----
BASE_ROW                                 8.2.1
DO_SIG_T=                                8.7.1
DO_STATISTICS=                           8.5.3
DO_T_TEST=                               8.7.1

SET OPTIONS
-----------
MEAN_STATISTICS_ONLY                     8.6.1
MISSING_OR_DUPLICATED_BASE
MULTIPLE_WEIGHT_STATISTICS               8.4.1
STATISTICS_BASE_AR                       8.2.1
STATISTICS_DUMP                          8.9.2