[toc]
INTRODUCTION
This chapter explains how to do significance testing (t tests), chi-square tests, ANOVA tests, and various other tests using the Mentor software. It also describes some of the program's methodology and some common problems that occur with this type of testing. The actual statistical formulas are not presented in this chapter, but you can find them in Appendix A: STATISTICAL FORMULAS.
This chapter assumes a basic understanding of table building and of CfMC terminology. If you are not familiar with table building, review chapters 4, 5 and 6. Although a basic understanding of statistics is not needed to produce the tests, it is very useful for understanding the process. In most cases a brief statistical description is provided, but for a more detailed reference consult a statistics textbook.
Significance tests can be produced in two different ways: (1) while the program is reading data cases (the table-building phase), or (2) from the numbers that are printed on the table (the table-printing phase). Tests created during the table-building phase work in almost all situations, while tests created during the table-printing phase only work in a limited set of situations (independent and unweighted). Most of this chapter deals with tests created during the table-building phase. See 8.5 PRINT PHASE STATISTICAL TESTING for information on the table-printing phase tests.
The Mentor program can report significance testing in two ways: (1) marking cells that are significantly greater than other cells with the letter associated with the lesser column or (2) printing the actual statistical value and/or its significance. Most of this chapter describes marking the significantly greater cells. See 8.7 PRINTING THE ACTUAL T AND SIGNIFICANCE VALUES for information on printing the actual T values on the table.
You can test both columns and rows, although column tests are more common. Most of this chapter describes column tests. See 8.8 SIGNIFICANCE TESTING ON ROWS (PREFERENCE TESTING) for information on row testing.
You can only test means or percentages produced from simple frequencies. A simple frequency is one in which a given respondent has a value of 1 or 0 in a given cell (except for any weighting). Items such as sigma, sums, and medians cannot be tested. For a detailed list of what cannot be tested, see 8.10 NOTES ON SIGNIFICANCE TESTING.
8.1 SIGNIFICANCE TESTING TO MARK CELLS
To use significance testing on a table to mark cells, two options are required: the STATISTICS statement (as an element of the table) and the EDIT statement option DO_STATISTICS. The STATISTICS statement defines which columns will be tested, while the DO_STATISTICS option sets the confidence level. The system defaults are set to the following:
- System total row is the statistical base
- Tables can only have a single weight
- Confidence level is set to 95%
- The All Possible Pairs Test is used for multiple column tests
8.1.1 The STATISTICS Statement
The STATISTICS statement defines which columns and/or rows to test against each other and whether the test is independent, dependent, inclusive, or printable (t values printed on the table). An independent test is one in which no case is in more than one of the cells being tested. A dependent test is one in which at least one case is in more than one of the cells. An inclusive test is one in which the first column in a test completely contains all the cases in all the rest of the columns. A printable test can only test pairs of columns, and no column can be the second column in more than one pair.
To set up a column test, first assign a name to the statement, and then list the letters associated with each column (no spaces). The first column in the table is assigned the letter A, the second the letter B, and so on. To test the first three columns of the table against each other the statement would look like this:
STATISTICS= STAT1: ABC
Assigning a name to the STATISTICS= variable is optional, but doing so allows you to reference the statement by its name; for instance in a TABLE_SET definition or in the ~EXECUTE block. If there is more than one test that needs to be performed on a given table, end the first test with a comma, and then enter the next set of columns. To identify the test as independent, put I= in front of that particular set of columns. If a test is inclusive, put T= in front of the columns set. If t values will be printed on the table, use the PRINTABLE_T option before the list of columns.
A typical STATISTICS statement looks like this:
STATISTICS= STAT2: I=BC,DEF,I=GHI,T=AKLM,BDGJ
This statement would create the following five tests:
- An independent test on columns B and C
- A dependent test on columns D, E, and F
- An independent test on columns G, H, and I
- An inclusive test on columns A, K, L, and M
- A dependent test on columns B, D, G, and J
NOTE: The STATISTICS statement, like all other table elements, can be assigned a default name when it is defined inside of a TABLE_SET. All examples in this chapter use this convention.
8.1.2 Independent, Dependent, Inclusive, and Printable Tests
An important distinction exists between independent and dependent tests. The formula for the dependent test contains additional calculations that rely entirely on the distribution of the individual data cases, so you generally cannot verify the results of a dependent test just by examining the values printed on the table. These calculations allow dependent tests to mark a smaller percentage difference as significant. For example, suppose the mean rating of brand A is being tested against the mean rating of brand B. If one respondent rated brand A higher than brand B, that respondent has a clear preference for brand A. However, if one person rated brand A higher than a different person rated brand B, the difference might simply be because the first person always rates high and the second always rates low, so the difference must be apportioned between the brands and the respondents.
If all the columns in a given test are independent, then the extra calculations for the dependent test drop out of the calculation. Computer processing time can be reduced and additional error checking performed if such tests are specified as independent (I=) on the STATISTICS= statement. This lets the program drop the unnecessary calculations and print an error message if the tests are not really independent.
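To make the distinction concrete, here is a minimal sketch in Python (outside Mentor) of how a dependent (paired) comparison differs from an independent comparison of the same ratings. The rating values are invented for illustration, and scipy's ttest_rel and ttest_ind are only stand-ins for the formulas Mentor actually uses.

# Illustration only: paired (dependent) vs. independent comparison of two brands.
from scipy import stats

brand_a = [5, 4, 4, 3, 5, 4, 2, 5]   # each respondent's rating of brand A
brand_b = [4, 4, 3, 3, 4, 3, 2, 4]   # the same respondents' ratings of brand B

# Dependent test: works from the per-respondent differences, so consistent
# within-person preferences count for more than between-person rating styles.
t_dep, p_dep = stats.ttest_rel(brand_a, brand_b)

# Independent test: ignores the pairing and treats the two sets of ratings
# as if they came from different respondents.
t_ind, p_ind = stats.ttest_ind(brand_a, brand_b)

print(t_dep, p_dep)   # typically more sensitive when the ratings really are paired
print(t_ind, p_ind)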
NEW PROTECTION VALUE IN SIGNIFICANCE TESTING
In the calculation of the variance used in Mentor’s significance testing, the protection limit on the lowest possible value in the denominator was changed from 0.0001 to 0.000001.
There is a new ~SET RINYV option which can be used to set this value back to the original default value of 0.0001 in the unlikely event the user needs this to reproduce past results.
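The mechanics of the protection limit are not spelled out beyond this, but conceptually it acts as a floor on the denominator of the variance calculation. A minimal Python sketch, assuming a simple floor (the function and variable names are hypothetical, not Mentor's):

# Hypothetical illustration of a denominator protection limit.
PROTECTION = 0.000001        # current default; ~SET RINYV restores the old 0.0001

def protected_divide(numerator, denominator, limit=PROTECTION):
    # Never divide by anything smaller than the protection limit.
    return numerator / max(denominator, limit)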
An inclusive test is one in which all the subsequent columns are completely contained in the first column. This test, sometimes referred to as completely dependent, is handled as follows: the resulting test is an independent test that is exactly equivalent to testing the contained column against everyone who is in the first column but not in the contained column.
Example:
• testing the total against males results in the same test as testing the females against the males
• testing people from California against people from San Francisco results in the same test as testing people from California, but not from San Francisco, against people from San Francisco
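For instance, using the counts from Table 101 later in this chapter: an inclusive test of the TOTAL column (400 cases) against the MALE column (196 cases) reduces to an independent test of the 204 FEMALE cases against the 196 MALE cases, since the females are exactly the people who are in TOTAL but not in MALE.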
If all the columns in a test have this property the program can perform additional error checking if the test has been specified as inclusive. This will cause the program to print an error message if it finds a case that is not in the first column, but is in a subsequent column.
If the t values or their significance are to be printed on the table, the program can check at definition time that there will be room on the table to print the values. Since the t value prints under the second column in the test, only pairs of columns may be tested, and no column can be the second column in more than one pair. The PRINTABLE_T option makes the program check the STATISTICS statement against these rules. If the PRINTABLE_T option is not used and more than one value is created in any cell, the error is not reported until the table is printed.
8.1.3 Setting the Confidence Level
Significance testing is done in a two-step process:
First, the program reads the data and does all the initial calculations determined by the elements on the STATISTICS statement.
Second, it prints the table using the confidence level specified on the DO_STATISTICS option on the EDIT statement.
The DO_STATISTICS= option can be set to any of the following: .80, .90, .95, .99, and APPROXIMATELY .nn (where nn is any value). You can combine options with a plus sign (+). DO_STATISTICS without an option defaults to a 95% confidence level.
NOTE: Neither DO_STATISTICS=.80 nor APPROXIMATELY .nn can be used in conjunction with the Newman-Keuls testing procedure discussed later in this chapter.
Here are some example settings of the DO_STATISTICS option:
DO_STATISTICS=.90
Sets the confidence level to 90%; the significance level to 10%
DO_STATISTICS=.99
Sets the confidence level to 99%; the significance level to 1%
DO_STATISTICS=.90+.95
Performs bi-level testing at both the 95% and 90% confidence levels.
DO_STATISTICS=APPROXIMATELY .85
Sets the confidence level to 85%, but uses an approximation formula for the t values rather than looking them up in a table.
DO_STATISTICS=.95+APPROXIMATELY .88
Performs bi-level testing at both the 95% confidence level (table look-up) and at the 88% level using an approximation formula.
The program will mark significantly higher cells by putting the letter associated with the lower cell to the right of the frequency of the higher cell. Multiple letters run into the next cell and then continue on the next line until all the letters are printed. Normally the letters print in upper case, unless you are doing bi-level testing: if the difference is significant at the higher level the letter prints in upper case, and in lower case if it is significant at the lower level but not at the higher level. For an example of bi-level testing, see 8.1.6 Bi-Level Testing (Testing at Two Different Confidence Levels).
When significance testing is done, the program will print a percentage sign, along with the appropriate letter for that column under the System Total row. This makes it easy to distinguish both the statistical base and the columns that were tested. Columns that are not tested will not have a letter designation under the total row. It will also print an automatic footnote at the bottom of each page which includes the significance level, the test used, and which columns were tested.
NOTE: A 95% confidence level is equivalent to a .05 significance level.
8.1.4 Standard Significance Testing
The minimum specification required to do significance testing is a STATISTICS statement, along with the EDIT option DO_STATISTICS. If either of these statements is specified without the other, then no testing will be done and no footnote will print on the table.
In most of the examples in this chapter, the STATISTICS statement and the EDIT options that affect the significance testing are specified in the TABLE_SET definition for a given table, but in general these options would be specified in the TABLE_SET that defines the banner and therefore apply to the entire run. Switching settings from table to table may be both confusing and difficult to check. Here is an example of standard significance testing at the 95% confidence level.
NOTE: The following set of commands defines a standard front end for the next set of examples.
>PURGE_SAME
>PRINT_FILE STAT1
~INPUT DATA
~SET DROP_LOCAL_EDIT,BEGIN_TABLE_NAME=T101
~DEFINE
STUB= STUBTOP1:
[BASE_ROW] TOTAL
[SUPPRESS] NO ANSWER }
TABLE_SET= { BAN1:
EDIT=: COLUMN_WIDTH=7,STUB_WIDTH=30,-COLUMN_TNA,STATISTICS_DECIMALS=2,
-PERCENT_SIGN }
STUB_PREFACE= STUBTOP1
BANNER=:
| GENDER AGE ADVERTISING AWARENESS
| <=========> <=================> <=========================>
| TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
| —– —- ——- —– —– —– —— —— —— —— }
COLUMN=: TOTAL WITH [5^1/2] WITH [6^1//3] WITH [7^1//4]
}
Here is the first example:
TABLE_SET= { TAB101:
STATISTICS=: I=BC,I=DEF,GHIJ
LOCAL_EDIT=: DO_STATISTICS }
HEADER=: TABLE WITH SIGNIFICANCE TESTING }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
NET GOOD
| VERY GOOD
| GOOD
| FAIR
NET POOR
| POOR
| VERY POOR
DON’T KNOW/REFUSED
[STAT,LINE=0] MEAN
[STAT,LINE=0] STD DEVIATION
[STAT,LINE=0] STD ERROR }
ROW=: [11^4,5/5/4/3/1,2/2/1/X] $[MEAN,STD,SE] [11] STORE= T101 }
~EXECUTE
TABLE_SET= BAN1
TABLE_SET= TAB101
Here is the table that is printed:
TABLE WITH SIGNIFICANCE TESTING
TABLE 101
RATING OF SERVICE
BASE= TOTAL SAMPLE
GENDER AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
—– —- —— —– —– —– —— —— —— ——
TOTAL 400 196 204 125 145 113 91 108 107 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (G) (H) (I) (J)
NET GOOD 197 120C 77 57 63 74DE 42 55 54 105G
49.2 61.2 37.7 45.6 43.4 65.5 46.2 50.9 50.5 59.7
VERY GOOD 102 70C 32 35 38 27 19 32 30 60G
25.5 35.7 15.7 28.0 26.2 23.9 20.9 29.6 28.0 34.1
GOOD 95 50 45 22 25 47DE 23 23 24 45
23.8 25.5 22.1 17.6 17.2 41.6 25.3 21.3 22.4 25.6
FAIR 92 44 48 28F 44F 14 20 20 24 36
23.0 22.4 23.5 22.4 30.3 12.4 22.0 18.5 22.4 20.5
NET POOR 83 18 65B 29 30 20 19 27J 22 24
20.8 9.2 31.9 23.2 20.7 17.7 20.9 25.0 20.6 13.6
POOR 39 9 30B 12 17 10 8 13 13 13
9.8 4.6 14.7 9.6 11.7 8.8 8.8 12.0 12.1 7.4
VERY POOR 44 9 35B 17 13 10 11 14J 9 11
11.0 4.6 17.2 13.6 9.0 8.8 12.1 13.0 8.4 6.2
DON’T KNOW/REFUSED 28 14 14 11 8 5 10 6 7 11
7.0 7.1 6.9 8.8 5.5 4.4 11.0 5.6 6.5 6.2
MEAN 3.46 3.90C 3.05 3.40 3.42 3.66 3.38 3.45 3.53 3.79GH
STD DEVIATION 1.31 1.12 1.35 1.41 1.28 1.22 1.32 1.40 1.29 1.21
STD ERROR 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09
———————————
(sig=.05) (all_pairs) columns tested BC, DEF, GHIJ
In Table 101, notice the “C” next to the frequency of 120 in the MALE column and NET GOOD row. This signifies that the percentage of males who gave a NET GOOD mention is significantly higher at the 95% level than the percentage of females who gave a NET GOOD mention. The letter “C” is used because the FEMALE column is the third column in the banner and is so designated under the TOTAL row at the top of the table. Also notice that there is no “A” under the TOTAL column because it was not included in any of the tests.
The footnote on the bottom of the page tells you which significance level was used (.05 or a 95% confidence level), which test was used (All Possible Pairs), and which columns were tested (BC, DEF, and GHIJ).
8.1.5 Changing the Confidence Level
The confidence level can easily be changed by using the EDIT option DO_STATISTICS. For instance, to test at the 99% confidence level, you would specify DO_STATISTICS=.99. Below is an example of a table tested at the 99% level. Table settings that are the same as Table 101 are restated here for clarity.
TABLE_SET= { TAB102:
STATISTICS=: I=BC,I=DEF,GHIJ
LOCAL_EDIT=: DO_STATISTICS=.99 }
HEADER=: TABLE WITH SIGNIFICANCE TESTING AT THE 99% CONFIDENCE LEVEL }
TITLE= TAB101
TITLE_4= TAB101
STUB= TAB101
ROW= TAB101
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH SIGNIFICANCE TESTING AT THE 99% CONFIDENCE LEVEL
TABLE 102
RATING OF SERVICE
BASE= TOTAL SAMPLE
GENDER AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
—– —- —— —– —– —– —— —— —— ——
TOTAL 400 196 204 125 145 113 91 108 107 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (G) (H) (I) (J)
NET GOOD 197 120C 77 57 63 74DE 42 55 54 105
49.2 61.2 37.7 45.6 43.4 65.5 46.2 50.9 50.5 59.7
VERY GOOD 102 70C 32 35 38 27 19 32 30 60
25.5 35.7 15.7 28.0 26.2 23.9 20.9 29.6 28.0 34.1
GOOD 95 50 45 22 25 47DE 23 23 24 45
23.8 25.5 22.1 17.6 17.2 41.6 25.3 21.3 22.4 25.6
FAIR 92 44 48 28 44F 14 20 20 24 36
23.0 22.4 23.5 22.4 30.3 12.4 22.0 18.5 22.4 20.5
NET POOR 83 18 65B 29 30 20 19 27J 22 24
20.8 9.2 31.9 23.2 20.7 17.7 20.9 25.0 20.6 13.6
POOR 39 9 30B 12 17 10 8 13 13 13
9.8 4.6 14.7 9.6 11.7 8.8 8.8 12.0 12.1 7.4
VERY POOR 44 9 35B 17 13 10 11 14 9 11
11.0 4.6 17.2 13.6 9.0 8.8 12.1 13.0 8.4 6.2
DON’T KNOW/REFUSED 28 14 14 11 8 5 10 6 7 11
7.0 7.1 6.9 8.8 5.5 4.4 11.0 5.6 6.5 6.2
MEAN 3.46 3.90C 3.05 3.40 3.42 3.66 3.38 3.45 3.53 3.79G
STD DEVIATION 1.31 1.12 1.35 1.41 1.28 1.22 1.32 1.40 1.29 1.21
STD ERROR 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09
———————————
(sig=.01) (all_pairs) columns tested BC, DEF, GHIJ
If you compare this table to Table 101, you will notice that there is no “G” in the NET GOOD row under the banner point BRND D. This is because these two cells are significantly different at the 95% level, but not at the 99% level. Also, notice the change in the table footnote.
8.1.6 Bi-Level Testing (Testing at Two Different Confidence Levels)
To do bi-level testing (testing at two different confidence levels on the same table), specify the two levels on the EDIT option DO_STATISTICS= and join them with a plus sign (+). For example, to test at both the 99% and 95% levels, say DO_STATISTICS=.99+.95. The example below is the same as Table 101 above, except for the DO_STATISTICS option.
The program will use upper- and lower-case letters to distinguish between cells that are significant at the higher and lower levels. The upper-case letter is always associated with the higher confidence level, regardless of the order in which the levels are specified on the DO_STATISTICS command.
TABLE_SET= { TAB103:
STATISTICS=: I=BC,I=DEF,GHIJ
LOCAL_EDIT=: DO_STATISTICS=.95+.99 }
HEADER=:TABLE WITH SIGNIFICANCE TESTING AT THE 95% AND 99% CONFIDENCE LEVELS }
TITLE= TAB101
TITLE_4= TAB101
STUB= TAB101
ROW= TAB101
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH SIGNIFICANCE TESTING AT THE 95% AND 99% CONFIDENCE LEVELS
TABLE 103
RATING OF SERVICE
BASE= TOTAL SAMPLE
GENDER AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
—– —- —— —– —– —– —— —— —— ——
TOTAL 400 196 204 125 145 113 91 108 107 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (G) (H) (I) (J)
NET GOOD 197 120C 77 57 63 74DE 42 55 54 105g
49.2 61.2 37.7 45.6 43.4 65.5 46.2 50.9 50.5 59.7
VERY GOOD 102 70C 32 35 38 27 19 32 30 60g
25.5 35.7 15.7 28.0 26.2 23.9 20.9 29.6 28.0 34.1
GOOD 95 50 45 22 25 47DE 23 23 24 45
23.8 25.5 22.1 17.6 17.2 41.6 25.3 21.3 22.4 25.6
FAIR 92 44 48 28f 44F 14 20 20 24 36
23.0 22.4 23.5 22.4 30.3 12.4 22.0 18.5 22.4 20.5
NET POOR 83 18 65B 29 30 20 19 27J 22 24
20.8 9.2 31.9 23.2 20.7 17.7 20.9 25.0 20.6 13.6
POOR 39 9 30B 12 17 10 8 13 13 13
9.8 4.6 14.7 9.6 11.7 8.8 8.8 12.0 12.1 7.4
VERY POOR 44 9 35B 17 13 10 11 14j 9 11
11.0 4.6 17.2 13.6 9.0 8.8 12.1 13.0 8.4 6.2
DON’T KNOW/REFUSED 28 14 14 11 8 5 10 6 7 11
7.0 7.1 6.9 8.8 5.5 4.4 11.0 5.6 6.5 6.2
MEAN 3.46 3.90C 3.05 3.40 3.42 3.66 3.38 3.45 3.53 3.79Gh
STD DEVIATION 1.31 1.12 1.35 1.41 1.28 1.22 1.32 1.40 1.29 1.21
STD ERROR 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09
———————————
(sig=.01 + sig=.05) (all_pairs) columns tested BC, DEF, GHIJ
If you compare the above table to both Table 101 and Table 102, you will notice that it is actually a combination of the two. Every letter marking from Table 102 is replicated on this table, and every letter marking from Table 101 that was not on Table 102 is printed here with a lower case letter. Notice the lower case “g” in the NET GOOD row under the BRND D column. This means that this cell is significantly greater than column G at the 95% level, but not at the 99% level. The footnote (sig=.01 + sig=.05) indicates testing at both levels.
8.1.7 Using Nonstandard Confidence Levels
The program can test at confidence levels other than 99%, 95%, 90%, or 80% by using the APPROXIMATELY option in conjunction with the DO_STATISTICS option on the EDIT statement. Use this option to test at any other confidence level that is desired. Be aware, though, that it uses a formula to determine whether an item is significant rather than looking up the t value in a table. This formula has the same error rates as the table values (i.e., its marking will be wrong the same percentage of the time), but for values on the border of significance its decision may differ from the table look-up.
NOTE: This option cannot be used in conjunction with the Newman-Keuls procedure.
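Mentor's internal approximation formula is not reproduced here, but the idea can be sketched in Python (outside Mentor): the standard levels come from a built-in table of critical values, while an arbitrary level gets a computed cutoff. The table entries below are the usual large-sample two-tailed critical values, and scipy's t quantile stands in for the approximation.

# Illustration only: table look-up vs. computed cutoff for a two-tailed test.
from scipy import stats

CRITICAL_TABLE = {0.80: 1.282, 0.90: 1.645, 0.95: 1.960, 0.99: 2.576}  # large-sample values

def critical_value(confidence, df=1000):
    if confidence in CRITICAL_TABLE:                 # standard level: look it up
        return CRITICAL_TABLE[confidence]
    alpha = 1.0 - confidence                         # e.g. 0.93 -> 0.07
    return stats.t.ppf(1.0 - alpha / 2.0, df)        # any other level: compute it

print(critical_value(0.95))   # 1.960 from the table
print(critical_value(0.93))   # roughly 1.81, computed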
This example is the same as Table 101, except for the DO_STATISTICS option on the EDIT statement
TABLE_SET= { TAB104:
STATISTICS=: I=BC,I=DEF,GHIJ
LOCAL_EDIT=: DO_STATISTICS= APPROXIMATELY .93 }
HEADER=: TABLE WITH SIGNIFICANCE TESTING AT ANOTHER CONFIDENCE LEVEL (93%) }
TITLE= TAB101
TITLE_4= TAB101
STUB= TAB101
ROW= TAB101
STORE_TABLES=* }
The printed table would look similar to Table 101, except for some of the statistical markings and the footnote. The footnote would be as follows:
(sig= apprx 0.07) (all_pairs) columns tested BC, DEF, GHIJ
8.1.8 Inclusive T Tests
If you have a situation where you are testing a set of columns against the total column or against a column that contains all the values in those columns, you may want to mark the test as inclusive by using the T= option in front of the list of columns. This option will verify that all subsequent columns are contained in the first column in the list and report an error if it finds a data case that is not contained.
This example is the same as Table 101 except for the STATISTICS statement.
TABLE_SET= { TAB105:
STATISTICS=: T=ABC,T=ADEF,GHIJ
LOCAL_EDIT=: DO_STATISTICS=.95 }
HEADER=:TABLE WITH INCLUSIVE SIGNIFICANCE TESTING AT THE 95% CONFIDENCE LEVEL}
TITLE= TAB101
TITLE_4= TAB101
STUB= TAB101
ROW= TAB101
STORE_TABLES=* }
The printed table would look similar to Table 101, except for some of the statistical markings and the footnote. The footnote would be as follows:
(sig=.05) (all_pairs) columns tested T= ABC, T= ADEF, GHIJ
Notice that the test of GHIJ is a dependent test, even though the others are inclusive.
8.2 Changing the Statistical Base
If the table is to represent percentages from a row other than the System Total row, you must specify which row will be the base. The program checks that the percentage base and the statistical base are the same row; if you change one, you must change the other. This keeps you from printing significance on cells that have no relationship to the percentages being printed.
NOTE: Significance testing for means is unaffected by the statistical base; instead, the base for each mean is calculated internally as part of calculating the mean.
8.2.1 Changing to the Any Response Row
If the System Any Response row is the percentage base, then you can make it the default statistical base by using the SET option STATISTICS_BASE_AR. If you do not use this option and percentage off the Any Response row, the program will print an error message that the percentage base is different from the statistical base. If the percentage base is the System Any Response row, you will probably want to create a new STUB_PREFACE to mark it as the base row for the test.
This STUB_PREFACE should use the STUB option BASE_ROW along with the PRINT_ROW=AR, so that the program will print the percentage signs and column letters under the Any Response row. If the System Total row is also printed, the STUB option -BASE_ROW on its definition can be used to suppress the percentage signs and letters from printing under it.
You can set the default statistical base back to the System-generated Total row by using the SET option -STATISTICS_BASE_AR. However, be careful not to make mistakes if you turn this option on and off multiple times in one run. It is recommended that in a given run you use either the System Total or System Any Response row as the percentage base and not flip back and forth.
NOTE: The following set of commands defines a standard front end for the next set of examples
>PURGE_SAME
>PRINT_FILE STAT2
~INPUT DATA
~SET DROP_LOCAL_EDIT,BEGIN_TABLE_NAME=T201
~DEFINE
TABLE_SET= { BAN1:
EDIT=:COLUMN_WIDTH=7,STUB_WIDTH=30,-COLUMN_TNA,STATISTICS_DECIMALS=2,-PERCENT_SIGN }
BANNER=:
| GENDER AGE ADVERTISING AWARENESS
| <=========> <=================> <=========================>
| TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
| —– —- —— —– —– —– —— —— —— ——}
COLUMN=: TOTAL WITH [5^1/2] WITH [6^1//3] WITH [7^1//4]
}
Example:
STUB= STUB_TOP_AR:
[-VERTICAL_PERCENT -BASE_ROW] TOTAL
[-VERTICAL_PERCENT] NO ANSWER
[PRINT_ROW=AR,BASE_ROW] TOTAL RESPONDING }
TABLE_SET= { TAB201:
SET STATISTICS_BASE_AR
STATISTICS=: I=BC,I=DEF,GHIJ
LOCAL_EDIT=: VERTICAL_PERCENT=AR,DO_STATISTICS=.95 }
STUB_PREFACE= STUB_TOP_AR
HEADER=: TABLE WITH SIGNIFICANCE TESTING BASED ON THE ANY RESPONSE ROW }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= THOSE RESPONDING }
STUB=:
NET GOOD
| VERY GOOD
| GOOD
| FAIR
NET POOR
| POOR
| VERY POOR
DON’T KNOW/REFUSED
[STAT,LINE=0] MEAN
[STAT,LINE=0] STD DEVIATION
[STAT,LINE=0] STD ERROR }
ROW=: [12^4,5/5/4/3/1,2/2/1/X] $[MEAN,STD,SE] [12]
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH SIGNIFICANCE TESTING BASED ON THE ANY RESPONSE ROW
TABLE 201
RATING OF SERVICE
BASE= THOSE RESPONDING
GENDER AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
—– —- —— —– —– —– —— —— —— ——
TOTAL 400 196 204 125 145 113 91 108 107 176
NO ANSWER 48 26 22 16 16 14 11 6 10 32
TOTAL RESPONDING 352 170 182 109 129 99 80 102 97 144
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (G) (H) (I) (J)
NET GOOD 171 102C 69 50 57 61DE 39 51 47 87I
48.6 60.0 37.9 45.9 44.2 61.6 48.8 50.0 48.5 60.4
VERY GOOD 86 58C 28 29 34 21 19 29 23 48
24.4 34.1 15.4 26.6 26.4 21.2 23.8 28.4 23.7 33.3
GOOD 85 44 41 21 23 40DE 20 22 24 39
24.1 25.9 22.5 19.3 17.8 40.4 25.0 21.6 24.7 27.1
FAIR 81 38 43 22 40F 14 16 18 24 30
23.0 22.4 23.6 20.2 31.0 14.1 20.0 17.6 24.7 20.8
NET POOR 76 18 58B 28 25 19 16 27J 20 19
21.6 10.6 31.9 25.7 19.4 19.2 20.0 26.5 20.6 13.2
POOR 35 9 26B 11 15 9 7 13 11 10
9.9 5.3 14.3 10.1 11.6 9.1 8.8 12.7 11.3 6.9
VERY POOR 41 9 32B 17 10 10 9 14J 9 9
11.6 5.3 17.6 15.6 7.8 10.1 11.2 13.7 9.3 6.2
DON’T KNOW/REFUSED 24 12 12 9 7 5 9 6 6 8
6.8 7.1 6.6 8.3 5.4 5.1 11.2 5.9 6.2 5.6
MEAN 3.43 3.84C 3.04 3.34 3.46 3.56 3.46 3.41 3.45 3.79HI
STD DEVIATION 1.32 1.15 1.35 1.44 1.25 1.24 1.33 1.42 1.27 1.20
STD ERROR 0.07 0.09 0.10 0.14 0.11 0.13 0.16 0.14 0.13 0.10
———————————
(sig=.05) (all_pairs) columns tested BC, DEF, GHIJ
8.2.2 Changing to Any Row in the Table
When doing significance testing, it is often useful to define a base row so that both you and the System can easily determine which row is the statistical base. Although a base row is only required when the statistical base is not the System Total or Any Response row, when the base changes in the body of the table, or when the table is weighted, it is often useful to create one anyway. This not only reduces possible errors, but also allows easy modification of the table in response to a future client request. Make sure that the stub option BASE_ROW is defined on any such row so that the table will be labeled properly.
To create a base row use the keyword $[BASE] on a variable definition followed by the definition of the base. Follow that definition with a set of empty brackets (i.e. $[]). This forces the program to go back to producing the default frequencies. The base definition actually creates two different categories in the variable. Therefore, the stub has to have two separate labels, one for each of the categories. If the table is not weighted then the two categories will have the same values in each, and one of them may be suppressed. However, if the table is weighted then the first row will have the weighted base and the second row will contain something called the effective base. See 8.4 SIGNIFICANCE TESTING ON WEIGHTED TABLES for a definition of the effective base and why it is needed.
In the example below, the System Total row and both categories created by the $[BASE] would all produce the same numbers, so only the first category created by the $[BASE] is printed. Notice the VERTICAL_PERCENT=* option on the base row to guarantee that it is both the statistical and percentage base.
STUB= STUB_TOP_SUP:
[SUPPRESS] TOTAL
[SUPPRESS] NO ANSWER }
TABLE_SET= { TAB202:
STATISTICS=: I=BC,I=DEF,GHIJ
LOCAL_EDIT=: DO_STATISTICS=.95 }
STUB_PREFACE= STUB_TOP_SUP
HEADER=: TABLE WITH SIGNIFICANCE TESTING AND HAS A BASE ROW SPECIFIED }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
[BASE_ROW,VERTICAL_PERCENT=*] TOTAL (STAT BASE)
[SUPPRESS] EFFECTIVE BASE
NET GOOD
| VERY GOOD
| GOOD
| FAIR
NET POOR
| POOR
| VERY POOR
DON’T KNOW/REFUSED
[STAT,LINE=0] MEAN
[STAT,LINE=0] STD DEVIATION
[STAT,LINE=0] STD ERROR }
ROW=: $[BASE] TOTAL $[] [12^4,5/5/4/3/1,2/2/1/X] $[MEAN,STD,SE] [12]
STORE_TABLES=* }
The printed table will look fundamentally the same as Table 201.
8.2.3 Changing in the Middle of a Table
If the table you wish to create has a percentage base which changes in the body of the table, then you must define a base row in the variable each time the percentage base changes. Suppose you were trying to do significance testing on a top box and bottom box table with a changing percent base (See 6.1.2 Top Box Tables with a Changing Percentage Base). For purposes of this example, only two brands are actually shown.
For each top box, the percentage base not only has to be defined after a $[BASE] keyword, but the STUB options BASE_ROW and VERTICAL_PERCENT=* should also be specified on each base stub.
TABLE_SET= { TAB203:
STATISTICS=: I=BC,I=DEF,GHIJ
LOCAL_EDIT=: DO_STATISTICS=.95 }
STUB_PREFACE= STUB_TOP_SUP
HEADER=: TABLE WITH SIGNIFICANCE TESTING WITH A CHANGING PERCENTAGE/STATISTICAL BASE }
TITLE=: SUMMARY OF RATING OF SERVICE FOR BRANDS A AND B}
TITLE_4=: BASE= THOSE WHO RATED THE BRAND }
STUB=:
[BASE_ROW,VERTICAL_PERCENT=*] TOTAL RATED BRAND A (STAT BASE)
[SUPPRESS] EFFECTIVE BASE
TOP BOX
BOTTOM BOX
[BASE_ROW,VERTICAL_PERCENT=*] TOTAL RATED BRAND B (STAT BASE)
[SUPPRESS] EFFECTIVE BASE
TOP BOX
BOTTOM BOX }
ROW=: $[BASE] [12^1-5] $[] [12^4,5/1,2] $[BASE] [21^1-5] $[] [21^4,5/1,2]
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH SIGNIFICANCE TESTING WITH A CHANGING PERCENTAGE/STATISTICAL BASE
TABLE 203
SUMMARY OF RATING OF SERVICE FOR BRANDS A AND B
BASE= THOSE WHO RATED THE BRAND
GENDER AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
—– —- —— —– —– —– —— —— —— ——
TOTAL RATED BRAND A (STAT 328 158 170 100 122 94 71 96 91 136
BASE) 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (G) (H) (I) (J)
TOP BOX 171 102C 69 50 57 61DE 39 51 47 87I
52.1 64.6 40.6 50.0 46.7 64.9 54.9 53.1 51.6 64.0
BOTTOM BOX 76 18 58B 28 25 19 16 27J 20 19
23.2 11.4 34.1 28.0 20.5 20.2 22.5 28.1 22.0 14.0
TOTAL RATED BRAND B (STAT 378 186 192 117 138 109 83 102 101 170
BASE) 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (G) (H) (I) (J)
TOP BOX 175 106C 69 53 55 65DE 38 46 48 98H
46.3 57.0 35.9 45.3 39.9 59.6 45.8 45.1 47.5 57.6
BOTTOM BOX 127 43 84B 39 45 35 28 42J 35J 41
33.6 23.1 43.8 33.3 32.6 32.1 33.7 41.2 34.7 24.1
———————————
(sig=.05) (all_pairs) columns tested BC, DEF, GHIJ
Notice that each base row is marked with percentage signs and the column letters designation, making it clear that the statistical base has changed.
8.3 Changing the Statistical Tests
Significance testing can produce two types of errors: marking a test as significant when it is not, or not marking a test as significant when it is. In general, the first type of error is the more dangerous of the two, as it may cause decisions to be made based on incorrect assumptions. There are various ways to change the formulas, thus reducing the possibility of this type of error.
There are five different sets of formulas that the program can use to do the significance testing.
- The All Possible Pairs Test (default)
- The Newman-Keuls Test
- The ANOVA-Scan Test
- The Fisher Test
- The Kruskal-Wallis Test
The following sections briefly describe each type of test, how each one works, and when it might be appropriate to use a particular test. For more detailed information about each test, consult a statistics textbook. Several of the tests are included in Appendix A: STATISTICAL FORMULAS.
8.3.1 The All Possible Pairs Test
The All Possible Pairs Test (APP) is the default. When more than two columns are tested at a time, the APP test will act as though each pair in the set of columns is being tested individually. For instance, the same set of tests would be performed whether your STATISTICS statement was specified as ABC or AB,AC,BC. In general, out of all the tests, the APP test is the most likely to mark cells as significantly different.
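The expansion into individual pairs can be illustrated with a few lines of Python (outside Mentor):

# ABC on a STATISTICS statement is tested as the pairs AB, AC, BC.
from itertools import combinations

columns = "ABC"
pairs = ["".join(pair) for pair in combinations(columns, 2)]
print(pairs)   # ['AB', 'AC', 'BC']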
For an example that uses this test, see Table 101 in 8.1.4 Standard Significance Testing.
8.3.2 The Newman-Keuls Test Procedure
In general, the Newman-Keuls procedure (N-K) is not as likely to mark a test as significant as the APP test. This is because it uses the additional information supplied by all the columns in the test to produce a more reliable result. When testing more than two columns at a time, the N-K procedure assumes the items are alike and uses this additional information in its calculations.
NOTE: Testing items that are not alike (such as males and people from Chicago) using the N-K procedure may cause results to be skewed. You should be careful when testing dissimilar items, especially when using any test other than APP.
The N-K procedure requires larger differences because you are testing like items: among a group of like items you would ordinarily expect a larger gap between the smallest and the largest value just by chance. As additional columns are added to the test, larger and larger differences between cells are required before they are seen as significantly different.
For example, suppose you flipped two coins 100 times each. One coin came up heads 42 times and the other came up heads 56 times. If you tested just these two samples against each other, you might guess that the second coin was more likely to come up heads due to the design of the coin. Now suppose you flipped five more coins and got values for heads like 47, 51, 54, 50, and 46. If you now look at all seven coins at one time, you would probably conclude that the coins are not actually different, but that the difference in the values is due to expected randomness.
The N-K procedure is different from the APP test in the following ways:
- The N-K procedure estimates which of the two columns is most likely to be different and tests it first. If they are not different, then it does not test any of the other pairs of columns. If they are different, then it looks for the next pair of columns that are most likely to be different and tests them. It continues this until it finds a pair that is not significantly different or until all the pairs have been tested. This procedure assumes that sample sizes for the columns being tested are similar, but does make corrections if they are not.
- The N-K procedure uses a pooled variance. This pooled variance is calculated by using a formula to combine the variance for each sample in the test. Since this pooled variance is derived from a much larger sample size than any of the individual samples, it is a more reliable estimation of the true variance. The effect of the pooled variance can be eliminated by use of one of the other VARIANCE options on the EDIT statement. See 8.3.4 Changing the Variance.
- The N-K procedure requires slightly higher t values to mark items as significantly different based on how many columns are being tested.
To use the N-K procedure instead of the APP test, use the option NEWMAN_KEULS_TEST on the EDIT statement.
The EDIT statement below shows an example of doing the Newman-Keuls procedure at the 95% confidence level.
NOTE: The following set of commands defines a standard front end for the next set of examples.
>PURGE_SAME
>PRINT_FILE STAT3
~INPUT DATA
~SET DROP_LOCAL_EDIT,BEGIN_TABLE_NAME=T301
~DEFINE
STUB= STUBTOP1:
TOTAL
[SUPPRESS] NO ANSWER }
TABLE_SET= { BAN1:
EDIT=: COLUMN_WIDTH=7,STUB_WIDTH=30,-COLUMN_TNA,STATISTICS_DECIMALS=2,
-PERCENT_SIGN,RUNNING_LINES=1 }
STUB_PREFACE= STUBTOP1
BANNER=:
| GENDER AGE ADVERTISING AWARENESS
| <=========> <=================> <=========================>
| TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
| —– —- —— —– —– —– —— —— —— ——}
COLUMN=: TOTAL WITH [5^1/2] WITH [6^1//3] WITH [7^1//4]
}
Example:
TABLE_SET= { TAB301:
STATISTICS=: I=BC,I=DEF,GHIJ
LOCAL_EDIT=: DO_STATISTICS=.95 NEWMAN_KEULS_TEST }
HEADER=:TABLE WITH SIGNIFICANCE TESTING USING THE NEWMAN-KEULS TESTING PROCEDURE }
TITLE=: RATING OF SERVICE }
TITLE_4=: BASE= TOTAL SAMPLE }
STUB=:
NET GOOD
| VERY GOOD
| GOOD
| FAIR
NET POOR
| POOR
| VERY POOR
DON’T KNOW/REFUSED
[STAT,LINE=0] MEAN
[STAT,LINE=0] STD DEVIATION
[STAT,LINE=0] STD ERROR }
ROW=: [11^4,5/5/4/3/1,2/2/1/X] $[MEAN,STD,SE] [11]
STORE_TABLES=* }
Here is the table that is printed:
TABLE WITH STATISTICAL TESTING USING THE NEWMAN-KEULS TESTING PROCEDURE
TABLE 301
RATING OF SERVICE
BASE= TOTAL SAMPLE
GENDER AGE ADVERTISING AWARENESS
<=========> <=================> <=========================>
TOTAL MALE FEMALE 18-30 31-50 51-70 BRND A BRND B BRND C BRND D
—– —- —— —– —– —– —— —— —— ——
TOTAL 400 196 204 125 145 113 91 108 107 176
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
% % % % % % % % % %
(B) (C) (D) (E) (F) (G) (H) (I) (J)
NET GOOD 197 120C 77 57 63 74DE 42 55 54 105
49.2 61.2 37.7 45.6 43.4 65.5 46.2 50.9 50.5 59.7
VERY GOOD 102 70C 32 35 38 27 19 32 30 60
25.5 35.7 15.7 28.0 26.2 23.9 20.9 29.6 28.0 34.1
GOOD 95 50 45 22 25 47DE 23 23 24 45
23.8 25.5 22.1 17.6 17.2 41.6 25.3 21.3 22.4 25.6
FAIR 92 44 48 28F 44F 14 20 20 24 36
23.0 22.4 23.5 22.4 30.3 12.4 22.0 18.5 22.4 20.5
NET POOR 83 18 65B 29 30 20 19 27J 22 24
20.8 9.2 31.9 23.2 20.7 17.7 20.9 25.0 20.6 13.6
POOR 39 9 30B 12 17 10 8 13 13 13
9.8 4.6 14.7 9.6 11.7 8.8 8.8 12.0 12.1 7.4
VERY POOR 44 9 35B 17 13 10 11 14 9 11
11.0 4.6 17.2 13.6 9.0 8.8 12.1 13.0 8.4 6.2
DON’T KNOW/REFUSED 28 14 14 11 8 5 10 6 7 11
7.0 7.1 6.9 8.8 5.5 4.4 11.0 5.6 6.5 6.2
MEAN 3.46 3.90C 3.05 3.40 3.42 3.66 3.38 3.45 3.53 3.79
STD DEVIATION 1.31 1.12 1.35 1.41 1.28 1.22 1.32 1.40 1.29 1.21
STD ERROR 0.07 0.08 0.10 0.13 0.11 0.12 0.15 0.14 0.13 0.09
———————————
(sig=.05) (n_k) columns tested BC, DEF, GHIJ
If you compare this table with Table 101 you will notice that the VERY POOR row under the column BRND B is marked with a “J” in Table 101, but not in this table. This means that the Newman-Keuls procedure determined that this was an expected difference when testing four separate samples, while the All Possible Pairs Test determined it was a significant difference. Notice that the footnote at the bottom of the page now says (n_k) instead of (all_pairs).
8.3.3 Other Testing Procedures
There are three other, less commonly used tests available: the ANOVA-Scan (A/S), Fisher, and Kruskal-Wallis (K/W) tests. These tests can only be performed on independent samples, and Kruskal-Wallis only works on means.
The ANOVA-Scan and Fisher tests are similar in that both first perform an ANOVA (analysis of variance) on the set of columns, and then only test the individual columns if the ANOVA shows significance. The difference between the two is that the Fisher test, like Newman-Keuls, takes into account that multiple columns are being tested and therefore requires a larger t value to show significance. In general, these two tests are less likely than the APP test to mark items as significant.
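A minimal Python sketch (outside Mentor, with invented group data and scipy as a stand-in for the actual formulas) of the gating idea behind both tests: run a one-way ANOVA on all the columns first, and only go on to the pairwise comparisons if the ANOVA itself is significant.

# Illustration of the "ANOVA first, then pairs" logic of ANOVA-Scan and Fisher.
from itertools import combinations
from scipy import stats

groups = {"D": [3, 4, 2, 5, 4, 3], "E": [4, 5, 4, 4, 5, 4], "F": [2, 3, 2, 3, 2, 3]}
alpha = 0.05

f_stat, f_p = stats.f_oneway(*groups.values())
if f_p < alpha:                                  # the ANOVA shows a difference somewhere
    for (name1, g1), (name2, g2) in combinations(groups.items(), 2):
        t_stat, t_p = stats.ttest_ind(g1, g2)    # ANOVA-Scan: pairs at the same level
        if t_p < alpha:                          # Fisher would use a stricter cutoff here
            print(name1, name2, "significantly different")
else:
    print("ANOVA not significant; no pairs are tested")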
The Kruskal-Wallis test only works on means of independent samples created with the EDIT option COLUMN_MEAN. It is normally done on rating scales and treats the ratings as ordinals instead of values, meaning each point in the scale is only treated as higher than the previous ones. In other words, a rating of 4 is not twice as high as a 2; it is just two ratings higher.
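For comparison, here is what treating the ratings as ordinals looks like in a Python sketch (again outside Mentor, with invented data); scipy's Kruskal-Wallis function ranks the values rather than using their distances.

# Kruskal-Wallis works on ranks, so a 4 is only "higher than" a 2, not twice as high.
from scipy import stats

ratings_g = [4, 5, 3, 4, 5, 4]   # independent samples of scale ratings
ratings_h = [3, 2, 4, 3, 3, 2]
ratings_i = [5, 4, 4, 5, 3, 4]

h_stat, p_value = stats.kruskal(ratings_g, ratings_h, ratings_i)
print(h_stat, p_value)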
For the ANOVA-Scan test, use the EDIT option ANOVA_SCAN, and for the Fisher test the EDIT option FISHER. The Kruskal-Wallis test is invoked by using the stub option DO_STATISTICS=KRUSKAL_WALLIS. See 8.5.3 Changing the Type of Test by Row for examples of the Kruskal-Wallis test.
Here is an example of an ANOVA-Scan test. This example is similar to the one for Table 301, except for the EDIT option ANOVA_SCAN and the STATISTICS statement that only specifies those tests that are independent.
TABLE_SET= { TAB302:
STATISTICS=: I=BC,I=DEF
LOCAL_EDIT=: DO_STATISTICS=.95,ANOVA_SCAN }
HEADER=: TABLE WITH STATISTICAL TESTING USING THE ANOVA SCAN TEST }
TITLE= TAB301
TITLE_4= TAB301
STUB= TAB301
ROW= TAB301
STORE_TABLES=* }
The printed table would look similar to Table 301, except for some of the statistical markings and the footnote. The footnote would be as follows:
(sig=.05) (anova_scan) columns tested BC, DEF
Here is an example of the Fisher test. It is the same as the one for Table 302, except that the EDIT option FISHER is used instead of ANOVA_SCAN. As with the ANOVA_SCAN, only the independent columns are specified on the STATISTICS statement.
TABLE_SET= { TAB303:
STATISTICS=: I=BC,I=DEF
LOCAL_EDIT=: DO_STATISTICS=.95,FISHER }
HEADER=: TABLE WITH STATISTICAL TESTING USING THE FISHER TEST }
TITLE= TAB301
TITLE_4= TAB301
STUB= TAB301
ROW= TAB301
STORE_TABLES=*
}
The printed table would look similar to Table 301, except for some of the statistical markings and the footnote. The footnote would be as follows:
(sig=.05) (Fisher) columns tested BC, DEF
REPEATED MEASURES OPTION
If you have a repeated measures test (everyone is in every banner point) or an incomplete replication (a person is in more than one of the banner points), then you must use the RM feature on the STATISTICS= statement with the ANOVA_SCAN or Fisher.
EX: STATISTICS=: RM=AB,RM=CD,RM=EFGH,RM=IJKL,RM=MNOP
This causes Mentor to accumulate the sum of within-person variance (and the corresponding degrees of freedom) so that it can be deducted from the total variance (and degrees of freedom) prior to calculating the F ratio for the analysis of variance. In the case of complete replication, this is equivalent to a properly done repeated-measures analysis of variance. In the case of incomplete replication, it has the same heuristic justification as that behind the use of the (incompletely based) correlation matrix in the Newman-Keuls procedure.
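For the complete-replication case, the deduction of within-person variance is the ordinary repeated-measures ANOVA decomposition. A minimal numpy sketch of that decomposition (outside Mentor, with made-up ratings):

# Repeated-measures decomposition: remove between-subject variation before the F ratio.
import numpy as np

# rows = respondents, columns = the products every respondent rated
x = np.array([[5, 4, 4, 3],
              [4, 4, 3, 3],
              [5, 5, 4, 4],
              [3, 3, 2, 2],
              [4, 5, 4, 3]], dtype=float)
n, k = x.shape
grand = x.mean()

ss_total    = ((x - grand) ** 2).sum()
ss_products = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between the products (columns)
ss_subjects = k * ((x.mean(axis=1) - grand) ** 2).sum()   # the within-person component
ss_error    = ss_total - ss_products - ss_subjects        # what is left for the F ratio

f_ratio = (ss_products / (k - 1)) / (ss_error / ((k - 1) * (n - 1)))
print(f_ratio)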
Here are the steps in this procedure:
- Both the Fisher and ANOVA_SCAN tests do an ANOVA first.
- Both test the ANOVA for significance at the level specified.
- If the ANOVA is significant, the program goes ahead; otherwise it stops.
- If the program goes ahead, then:
- the ANOVA_SCAN does an All Possible Pairs test at the same significance level.
- the Fisher does a more stringent test on the pairs. The Fisher test takes into account that multiple columns are being tested and requires a larger t value to show significance. The significance level used is:
the original significance level
-----------------------------------------------
(number of groups x (number of groups - 1)) / 2
where the number of groups is the number of banner points being tested.
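For example, with four banner points tested at the .05 level, each pair is tested at .05 / ((4 x 3) / 2) = .05 / 6, or roughly .0083.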
In general, the ANOVA_SCAN and Fisher are less likely to mark items as significant than the all possible pairs test. If you are weighting the data, you must use STATISTICS=RM even if this is an independent sample.
Following are three examples. Example One (stat1.spx) uses the All Possible Pairs test. Example Two (stat2.spx) reproduces the same tables using the ANOVA_SCAN and Repeated Measures. Example Three (stat3.spx) reproduces the same tables using the Fisher test and Repeated Measures.
EXAMPLE ONE – USING THE ALL POSSIBLE PAIRS TEST
~COMMENT stat1^spx
This job has 4 products but it is a 2 x 2 design. It is set up using multiple 80 column records. General
information is on record 1.
first product seen data [2/12.8]
second product seen data [3/12.8]
third product seen data [4/12.8]
fourth product seen data [5/12.8]
first product seen [8/1] \ code 1 = green high salt
second product seen [8/7] \ code 2 = blue low salt
third product seen [8/13] / code 3 = blue high salt
fourth product seen [8/19] / code 4 = green low salt
~DEFINE
”define a variable to keep track of first pass through the tables
firstpass[27/59^1]
”define a variable to keep track of first product seen
firstseen[27/60^1]
PROC= { proc1:
MODIFY firstpass = true ”this is first pass through with banner1
MODIFY firstseen = true ”this is first product seen
COPY [27/01] = [8/01]
COPY [27/12.8] = [2/12.8]
DO_TABLES LEAVE_OPEN
MODIFY firstseen = false ”this is not the first product seen
COPY [8/01] = [8/07]
COPY[2/12.8] = [3/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/13]
COPY [2/12.8] = [4/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/19]
COPY [2/12.8] = [5/12.8]
DO_TABLES
MODIFY firstpass = false ”this is not the first pass through
COPY [8/01] = [27/01]
COPY [2/12.8] = [27/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/07]
COPY [2/12.8] = [3/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/13]
COPY [2/12.8] = [4/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/19]
COPY [2/12.8] = [5/12.8]
DO_TABLES
}
banner1: [8/01^2/1/3/4] WITH &
([8/01^3/2/1/4] BY (total WITH firstseen))
banner2: [8/01^3/4/1/2] WITH &
((DUD WITH DUD WITH DUD WITH DUD) BY (DUD WITH DUD))
STUB={summary:
[FREQONLY] TOTAL
[SUPPRESS] NO ANSWER
}
STUB={preface:
[FREQONLY,-BASE_ROW] TOTAL
[FREQONLY] DON’T KNOW
[BASE_ROW,PRINT_ROW=AR,VPER=*] STAT BASE
}
EDIT={editsum: STUB_PREFACE=SUMMARY }
TABSET= {global:
GLOBAL_EDIT={:
-COL_TNA RANK_IF_INDICATED
RANK_COLUMN_BASE=1
-PERCENT_SIGN
STAT_DECIMALS=2
PUTCHARS=-Z–
CONTINUE=TOP
DO_STATISTICS=.90 + .95
ALL_POSSIBLE_PAIRS_TEST
}
}
TABSET= {ban1:
EDIT={: COL_WIDTH=6 STUB_WIDTH=24
STUB_PREFACE=preface }
STATS=: AB,CD,EFGH,IJKL
BAN={:
| PACKAGE PRODUCT TOTAL FIRST POSITION
| TYPE-TOTAL TYPE-TOTAL <——————–> <———————>
| <——–> <——–> BLUE BLUE GREEN GREEN BLUE BLUE GREEN GREEN
| HIGH LOWER HIGH LOWER HIGH LOWER HIGH LOWER HIGH LOWER
| BLUE GREEN SALT SALT SALT SALT SALT SALT SALT SALT SALT SALT
| ==== ===== ==== ===== ==== ==== ==== ==== ==== ==== ==== ====
}
COL=: banner1 when firstpass otherwise banner2
}
TABSET={q101:
LOCAL_EDIT=editsum
TITLE={: SUMMARY OF MEANS – DINNER ROLLS ATTRIBUTE RATINGS }
STUB={:
[STAT] OVERALL APPEARANCE OF PRODUCT AND PACKAGE
[STAT] NATURAL OR PROCESSED LOOK
[STAT] LOOKS MOIST OR DRY
[STAT] FEELINGS ABOUT THE MOISTNESS OR DRYNESS
[STAT] OVERALL COLOR
[STAT] LOOKS LIKE WHAT I EXPECTED
[STAT] EXPECTED LIKING OF TASTE
[STAT] ENVIRONMENTAL IMPACT OF PACKAGE
}
ROW=: &
$[MEAN] [2/12] &
$[MEAN] [2/13] &
$[MEAN] [2/14] &
$[MEAN] [2/15] &
$[MEAN] [2/16] &
$[MEAN] [2/17] &
$[MEAN] [2/18] &
$[MEAN] [2/19]
}
TABSET={q102:
TITLE={: OVERALL APPEARANCE OF PRODUCT AND PACKAGE }
STUB={:
[] TOP THREE BOX (NET)
[] TOP TWO BOX (NET)
[] (9) LOVE IT
[] (8)
[] (7)
[] (6)
[] (5)
[] (4)
[] (3)
[] (2)
[] (1) HATE IT
[] BOTTOM TWO BOX (NET)
[] BOTTOM THREE BOX (NET)
[STAT] MEAN
[STAT] STD DEVIATION
[STAT] STD ERROR
}
ROW=: [2/12^7.9/8,9/9//1/1,2/1.3] $[MEAN,STD,SE] [2/12]
}
~INPUT stats^tr WORK_LENGTH=2070 TOTAL_LENGTH=2160
>PRINTFILE stat1^prt
~SET
STATISTICS_BASE_AR
DROP_LOCAL_EDIT
” STAT_DUMP
~EXECUTE
READ_PROC=proc1
TABSET=global TABSET=ban1
TABSET=q101 TAB=*
TABSET=q102 TAB=*
~END
EXAMPLE TWO – USING THE ANOVA_SCAN AND REPEATED MEASURES
~COMMENT stat2^spx
This job has 4 products but it is a 2 x 2 design. It is set up using multiple 80 column records.
General information is on record 1.
first product seen data [2/12.8]
second product seen data [3/12.8]
third product seen data [4/12.8]
fourth product seen data [5/12.8]
first product seen [8/1] \ code 1 = green high salt
second product seen [8/7] \ code 2 = blue low salt
third product seen [8/13] / code 3 = blue high salt
fourth product seen [8/19] / code 4 = green low salt
~DEFINE
”define a variable to keep track of first pass through the tables
firstpass[27/59^1]
”define a variable to keep track of first product seen
firstseen[27/60^1]
PROC= { proc1:
MODIFY firstpass = true ”this is first pass through with banner1
MODIFY firstseen = true ”this is first product seen
COPY [27/01] = [8/01]
COPY [27/12.8] = [2/12.8]
DO_TABLES LEAVE_OPEN
MODIFY firstseen = false ”this is not the first product seen
COPY [8/01] = [8/07]
COPY [2/12.8] = [3/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/13]
COPY [2/12.8] = [4/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/19]
COPY [2/12.8] = [5/12.8]
DO_TABLES
MODIFY firstpass = false ”this is not the first pass through
COPY [8/01] = [27/01]
COPY [2/12.8] = [27/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/07]
COPY [2/12.8] = [3/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/13]
COPY [2/12.8] = [4/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/19]
COPY [2/12.8] = [5/12.8]
DO_TABLES
}
banner1: [8/01^2/1/3/4] WITH &
([8/01^3/2/1/4] BY (total WITH firstseen))
banner2: [8/01^3/4/1/2] WITH &
((DUD WITH DUD WITH DUD WITH DUD) BY (DUD WITH DUD))
STUB={summary:
[FREQONLY] TOTAL
[SUPPRESS] NO ANSWER
}
STUB={preface:
[FREQONLY,-BASE_ROW] TOTAL
[FREQONLY] DON’T KNOW
[BASE_ROW,PRINT_ROW=AR,VPER=*] STAT BASE
}
EDIT={editsum: STUB_PREFACE=SUMMARY }
TABSET= {global:
GLOBAL_EDIT={:
-COL_TNA
RANK_IF_INDICATED
RANK_COLUMN_BASE=1
-PERCENT_SIGN
STAT_DECIMALS=2
PUTCHARS=-Z–
CONTINUE=TOP
DO_STATISTICS=.90 + .95
ANOVA_SCAN
}
}
TABSET= {ban1:
EDIT={: COL_WIDTH=6 STUB_WIDTH=24 STUB_PREFACE=PREFACE }
STATS=: RM=AB,RM=CD,RM=EFGH,I=IJKL
BAN={:
| PACKAGE PRODUCT TOTAL FIRST POSITION
| TYPE-TOTAL TYPE-TOTAL <——————–> <——————–>
| <——–> <——–> BLUE BLUE GREEN GREEN BLUE BLUE GREEN GREEN
| HIGH LOWER HIGH LOWER HIGH LOWER HIGH LOWER HIGH LOWER
| BLUE GREEN SALT SALT SALT SALT SALT SALT SALT SALT SALT SALT
| ==== ===== ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
}
COL=: banner1 when firstpass otherwise banner2
}
TABSET={q101:
LOCAL_EDIT=editsum
TITLE={: SUMMARY OF MEANS – DINNER ROLLS ATTRIBUTE RATINGS }
STUB={:
[STAT] OVERALL APPEARANCE OF PRODUCT AND PACKAGE
[STAT] NATURAL OR PROCESSED LOOK
[STAT] LOOKS MOIST OR DRY
[STAT] FEELINGS ABOUT THE MOISTNESS OR DRYNESS
[STAT] OVERALL COLOR
[STAT] LOOKS LIKE WHAT I EXPECTED
[STAT] EXPECTED LIKING OF TASTE
[STAT] ENVIRONMENTAL IMPACT OF PACKAGE
}
ROW=: &
$[MEAN] [2/12] &
$[MEAN] [2/13] &
$[MEAN] [2/14] &
$[MEAN] [2/15] &
$[MEAN] [2/16] &
$[MEAN] [2/17] &
$[MEAN] [2/18] &
$[MEAN] [2/19]
}
TABSET={q102:
TITLE={: OVERALL APPEARANCE OF PRODUCT AND PACKAGE }
STUB={:
[] TOP THREE BOX (NET)
[] TOP TWO BOX (NET)
[] (9) LOVE IT
[] (8)
[] (7)
[] (6)
[] (5)
[] (4)
[] (3)
[] (2)
[] (1) HATE IT
[] BOTTOM TWO BOX (NET)
[] BOTTOM THREE BOX (NET)
[STAT] MEAN
[STAT] STD DEVIATION
[STAT] STD ERROR
}
ROW=: &
[2/12^7.9/8,9/9//1/1,2/1.3] $[MEAN,STD,SE] [2/12]
}
~INPUT stats^tr WORK_LENGTH=2070 TOTAL_LENGTH=2160
>PRINTFILE stat2^prt
~SET STATISTICS_BASE_AR, DROP_LOCAL_EDIT
” STAT_DUMP
~EXECUTE
READ_PROC=proc1
TABSET=global TABSET=ban1
TABSET=q101 TAB=*
TABSET=q102 TAB=*
~END
EXAMPLE THREE – USING THE FISHER TEST AND REPEATED MEASURES
~COMMENT stat3^spx
This job has 4 products but it is a 2 x 2 design. It is set up using multiple 80 column records.
General information is on record 1.
first product seen data [2/12.8]
second product seen data [3/12.8]
third product seen data [4/12.8]
fourth product seen data [5/12.8]
first product seen [8/1] \ code 1 = green high salt
second product seen [8/7] \ code 2 = blue low salt
third product seen [8/13] / code 3 = blue high salt
fourth product seen [8/19] / code 4 = green low salt
~DEFINE
”define a variable to keep track of first pass through the tables
firstpass[27/59^1]
”define a variable to keep track of first product seen
firstseen[27/60^1]
PROC= { proc1:
MODIFY firstpass = true ”this is first pass through with banner1
MODIFY firstseen = true ”this is first product seen
COPY [27/01] = [8/01]
COPY [27/12.8] = [2/12.8]
DO_TABLES LEAVE_OPEN
MODIFY firstseen = false ”this is not the first product seen
COPY [8/01] = [8/07]
COPY [2/12.8] = [3/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/13]
COPY [2/12.8] = [4/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/19]
COPY [2/12.8] = [5/12.8]
DO_TABLES
MODIFY firstpass = false ”this is not the first pass through
COPY [8/01] = [27/01]
COPY [2/12.8] = [27/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/07]
COPY [2/12.8] = [3/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/13]
COPY [2/12.8] = [4/12.8]
DO_TABLES LEAVE_OPEN
COPY [8/01] = [8/19]
COPY [2/12.8] = [5/12.8]
DO_TABLES
}
banner1: [8/01^2/1/3/4] WITH &
([8/01^3/2/1/4] BY (total WITH firstseen))
banner2: [8/01^3/4/1/2] WITH &
((DUD WITH DUD WITH DUD WITH DUD) BY (DUD WITH DUD))
STUB={summary:
[FREQONLY] TOTAL
[SUPPRESS] NO ANSWER
}
STUB={preface:
[FREQONLY,-BASE_ROW] TOTAL
[FREQONLY] DON’T KNOW
[BASE_ROW,PRINT_ROW=AR,VPER=*] STAT BASE
}
EDIT={editsum: STUB_PREFACE=SUMMARY }
TABSET= {global:
global_edit={:
-COL_TNA
RANK_IF_INDICATED
RANK_COLUMN_BASE=1
-PERCENT_SIGN
STAT_DECIMALS=2
PUTCHARS=-Z–
CONTINUE=TOP
DO_STATISTICS=.90 + .95
FISHER
}
}
TABSET= {ban1:
EDIT={: col_width=6 stub_width=24
STUB_PREFACE=PREFACE }
STATS=: RM=AB,RM=CD,RM=EFGH,I=IJKL
BAN={:
| PACKAGE PRODUCT TOTAL FIRST POSITION
| TYPE-TOTAL TYPE-TOTAL <——————–> <———————>
| <——–> <——–> BLUE BLUE GREEN GREEN BLUE BLUE GREEN GREEN
| HIGH LOWER HIGH LOWER HIGH LOWER HIGH LOWER HIGH LOWER
| BLUE GREEN SALT SALT SALT SALT SALT SALT SALT SALT SALT SALT
| ==== ===== ==== ==== ===== ==== ==== ==== ==== ==== ==== ====
}
COL=: banner1 when firstpass otherwise banner2
}
TABSET={q101:
LOCAL_EDIT=editsum
TITLE={: SUMMARY OF MEANS – DINNER ROLLS ATTRIBUTE RATINGS }
STUB={:
[STAT] OVERALL APPEARANCE OF PRODUCT AND PACKAGE
[STAT] NATURAL OR PROCESSED LOOK
[STAT] LOOKS MOIST OR DRY
[STAT] FEELINGS ABOUT THE MOISTNESS OR DRYNESS
[STAT] OVERALL COLOR
[STAT] LOOKS LIKE WHAT I EXPECTED
[STAT] EXPECTED LIKING OF TASTE
[STAT] ENVIRONMENTAL IMPACT OF PACKAGE
}
ROW=: &
$[MEAN] [2/12] &
$[MEAN] [2/13] &
$[MEAN] [2/14] &
$[MEAN] [2/15] &
$[MEAN] [2/16] &
$[MEAN] [2/17] &
$[MEAN] [2/18] &
$[MEAN] [2/19]
}
TABSET={q102:
TITLE={: OVERALL APPEARANCE OF PRODUCT AND PACKAGE }
STUB={:
[] TOP THREE BOX (NET)
[] TOP TWO BOX (NET)
[] (9) LOVE IT
[] (8)
[] (7)
[] (6)
[] (5)
[] (4)
[] (3)
[] (2)
[] (1) HATE IT
[] BOTTOM TWO BOX (NET)
[] BOTTOM THREE BOX (NET)
[STAT] MEAN
[STAT] STD DEVIATION
[STAT] STD ERROR
}
ROW=: [2/12^7.9/8,9/9//1/1,2/1.3] $[MEAN,STD,SE] [2/12]
}
~INPUT stats^tr WORK_LENGTH=2070 TOTAL_LENGTH=2160
>PRINTFILE stat3^prt
~SET
STATISTICS_BASE_AR
DROP_LOCAL_EDIT
” STAT_DUMP
~EXECUTE
READ_PROC=proc1
TABSET=global TABSET=ban1
TABSET=q101 TAB=*
TABSET=q102 TAB=*
~END
8.3.4 Changing the Variance
When determining significance, the Mentor program can calculate the variance of a multiple-column test in three different ways: a pooled variance, a paired variance, or a separate variance. Pooled variance uses a formula that combines the variance of all the columns in a test. Paired variance combines the variance of each pair in a test. Separate variance calculates the variance for each item separately. The default variance for the All Possible Pairs Test is a paired variance, while the default variance for the Newman-Keuls test is a pooled variance.
You can change the default variance used by the program with one of these EDIT options: USUAL_VARIANCE (default), POOLED_VARIANCE, PAIRED_VARIANCE, or SEPARATE_VARIANCE.
The following is a table of the type of variance that is used when testing means using a STATISTICS statement that looks like STATISTICS= STAT1: ABCDE.
EDIT OPTION          ALL POSSIBLE PAIRS    NEWMAN-KEULS
USUAL_VARIANCE       VAR(a,b)              VAR(a,b,c,d,e)
POOLED_VARIANCE      VAR(a,b,c,d,e)        VAR(a,b,c,d,e)
PAIRED_VARIANCE      VAR(a,b)              VAR(a,b)
SEPARATE_VARIANCE    VAR(a)+VAR(b)         VAR(a)+VAR(b)
Separate variance may be useful when trying to duplicate a formula from a textbook or some other program, but it only makes sense for independent or inclusive tests. Also, any overlap in the sample (after taking into account any inclusive test) will force separate variance to be ignored and the paired variance to be used.
None of this affects the testing done on percentages. The variance for percentage tests is always defined as the square root of p times (1 - p), where p is the sum of all the frequencies in the test divided by the sum of all the percentage bases in the test.
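As an illustration of the pooled value described above, here is a Python sketch (outside Mentor) using the MALE and FEMALE NET GOOD counts from Table 101. The z statistic built from it is just one conventional way to use the pooled quantity, not necessarily the exact statistic Mentor computes.

# Pooled proportion for a percentage test, per the definition above.
import math

x_male, n_male     = 120, 196    # NET GOOD frequency and base, MALE column of Table 101
x_female, n_female = 77, 204     # NET GOOD frequency and base, FEMALE column

p = (x_male + x_female) / (n_male + n_female)          # pooled p over all bases in the test
std_dev = math.sqrt(p * (1 - p))                       # the quantity described above

# One conventional test statistic built from the pooled value:
z = (x_male / n_male - x_female / n_female) / (std_dev * math.sqrt(1 / n_male + 1 / n_female))
print(p, std_dev, z)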