Cleaning Specs for Various Data Types
The attached spec, CLEANDATA.SPX, shows how to write a cleaning spec for each type of data, including all of the following:
Punch Data
  Single Column – Single Response
  Multiple Column – Single Response
  Single Column – Multiple Response
  Multiple Column – Multiple Response
  Multiple Non-Contiguous Columns – Single Response
  Multiple Non-Contiguous Columns – Multiple Response
Field/Coded Data
  Single Response
  Multiple Response
Numeric Data
String Data
Text Data
Other Examples
  Multiple (usually consecutive) locations with exactly the same data checking
  Constant Sum
  Awareness Grid
~Def Proc= {Cleaning:

''********** Punch Data / Cat Questions *****************
''Single Column - Single Response
CHecK [11^1//5/Y] "Optional Message"

''Multiple Column - Single Response
CHecK [12.2^1//15/23/24] "Optional Message"

''Single Column - Multiple Response (X and Y are marked exclusive)
CHecK [13*P^1//7/(-)X/(-)Y] "Optional Message"
''or
CHecK [13^1-7/X/Y] "Optional Message"

''Multiple Column - Multiple Response (X and Y in 2nd column are marked exclusive)
CHecK [14.2*P^1//15/(-)23/(-)24] "Optional Message"
''or
CHecK [14.2^1-15/23/24] "Optional Message"

''Single Response - Multiple Column - Non-Contiguous
CHecK [16.2*P^1//15/B] "Optional Message"
CHecK [19*P^1//9/X/Y/B] "Optional Message"
If [16.2^B] and [19^B] Then
  Error "No Answer in [16.2] and [19]"
Endif
If [16.2^NB] and [19^NB]
  Error "Answer in both [16.2] and [19]"
Endif

''Multiple Response - Multiple Column - Non-Contiguous
CHecK [16.2*P^1-15/B] "Optional Message"
CHecK [19*P^1-9/X/Y/B] "Optional Message"
If [16.2^B] and [19^B] Then
  Error "No Answer in [16.2] and [19]"
Endif
If [19^X,Y] and [16.2^NB]
  Error "19 is X or Y and 16.2 and 19 not single" [16.2$P] [19$P]
Endif

''********** Code Data / Fld Questions *****************
''Single Response
CHecK [21.2*Z#1//20/91/99]
''Multiple Response
CHecK [23.2,...,29.2*F*Z#1//20/91/(-)99]

''********** Numeric Data / Num Questions *****************
CHecK [31.3#1-100,dk,rf] "Optional Message"
''or
CHecK [31.3*R=1-100,,dk,rf] "Optional Message"

''********** String Data / Var Questions *****************
CHecK [36.20*P=10$] "Need at least 10 characters, no more than 20"

''********** Text Data / Text Questions *****************
CHecK [57$T] "Optional Message"

''********** Consecutive Columns with similar Data ************
'' This works for all types above
CHecK [58,...,80^1//5/9]

''********** Constant Sum **************************
'' The following 5 questions must add to 100
If [81.3] + [84.3] + [87.3] + [90.3] + [93.3] < 100 or Sum([81.3,...,93.3]) > 100 Then
  ERRor "Sum either > 100 or all answered and < 100" [81.3,...,93.3$]
Endif

''********** Awareness Grid / Fld Questions
''where 101.2 first unaided mention
''103.2-115.2 all other unaided mentions
''117.2-131.2 unaided advertising mentions
''133.2-141.2 aided mentions
''143.2-151.2 aided advertising mentions
If [101.2#1//20] Intersect [103.2,...,115.2*F#1//20] Then
  ERRor "First Mention Duplicates Other Mentions" [101.2$] [103.14$]
Endif
If [101.2,...,115*F#1//20] Contains [117.2,...,131*F#1//20] Else
  ERRor "Ad Aware, but not Aware" [101.16$] [117.16$]
Endif
If [101.2,...,115*F#1//5] Intersect [133.2,...,141*F#1//5] Then
  ERRor "Unaided Duplicates Aided Mentions" [101.16$] [133.10$]
Endif
If [101.2,...,115,133,135,...,141*F#1//5] Contains [143.2,...,151*F#1//5] Else
  ERRor "Something in Aided Ad, but not ever aware" [101.16$] [143.10$]
Endif
If [117.2,...,131*F#1//5] Intersect [143.2,...,151*F#1//5] Then
  ERRor "Unaided Ad Duplicates Aided Ad" [117.16$] [143.10$]
Endif
}
~End
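
For readers who want to see the logic of the Constant Sum and Awareness Grid checks outside the spec syntax, the following is a minimal Python sketch. It is not part of CLEANDATA.SPX. The function names and the use of None for an unanswered question are hypothetical, and it assumes that Contains means "is a superset of", that Intersect means "shares at least one code", and that the `+` expression only yields a comparable total when all five questions are answered (as the error message suggests).

```python
from typing import List, Optional, Sequence, Set


def constant_sum_errors(values: Sequence[Optional[int]], target: int = 100) -> List[str]:
    """Flag a constant-sum block whose answers do not add to the target.

    Assumption: None marks an unanswered question. A partial block is only
    an error when it already exceeds the target.
    """
    answered = [v for v in values if v is not None]
    total = sum(answered)
    all_answered = len(answered) == len(values)
    if total > target or (all_answered and total < target):
        return [f"Sum either > {target} or all answered and < {target}"]
    return []


def awareness_errors(
    first: Optional[int],
    other_unaided: Set[int],
    unaided_ad: Set[int],
    aided: Set[int],
    aided_ad: Set[int],
) -> List[str]:
    """Apply the five awareness-grid consistency checks to sets of brand codes."""
    errors: List[str] = []

    # All unaided mentions = first mention plus the other mentions.
    unaided = set(other_unaided)
    if first is not None:
        unaided.add(first)

    # The first mention must not reappear among the other mentions.
    if first is not None and first in other_unaided:
        errors.append("First Mention Duplicates Other Mentions")

    # Every unaided ad mention must also be an unaided awareness mention.
    if not unaided >= set(unaided_ad):
        errors.append("Ad Aware, but not Aware")

    # A brand cannot be both an unaided and an aided mention.
    if unaided & set(aided):
        errors.append("Unaided Duplicates Aided Mentions")

    # Every aided ad mention must appear in unaided or aided awareness.
    if not (unaided | set(aided)) >= set(aided_ad):
        errors.append("Something in Aided Ad, but not ever aware")

    # A brand cannot be both an unaided and an aided ad mention.
    if set(unaided_ad) & set(aided_ad):
        errors.append("Unaided Ad Duplicates Aided Ad")

    return errors
```

For example, `awareness_errors(1, {1, 2}, set(), set(), set())` reports only the first-mention duplicate, and `constant_sum_errors([10, None, None, None, None])` reports nothing because the block is incomplete and still under 100.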