Cleaning Specs for Various Data Types

The attached spec, CLEANDATA.SPX, shows how to write a cleaning spec for each of the following data types:

Punch Data

Single Column – Single Response

Multiple Column – Single Response

Single Column – Multiple Response

Multiple Column – Multiple Response

Multiple Non-Contiguous Columns – Single Response

Multiple Non-Contiguous Columns – Multiple Response

Field/Coded Data

Single Response

Multiple Response

Numeric Data

String Data

Text Data

Other Examples

Multiple (usually consecutive) locations with the same exact data checking

Constant Sum

Awareness grid

~Def Proc= {Cleaning:

''**********  Punch Data / Cat Questions *****************
''Single Column - Single Response
CHecK [11^1//5/Y]  "Optional Message"

''Multiple Column - Single Response
CHecK [12.2^1//15/23/24] "Optional Message"

''Single Column - Multiple Response (X and Y are marked exclusive)
CHecK [13*P^1//7/(-)X/(-)Y] "Optional Message"
''or
CHecK [13^1-7/X/Y] "Optional Message"

''Multiple Column - Multiple Response (X and Y in 2nd column are marked exclusive)
CHecK [14.2*P^1//15/(-)23/(-)24] "Optional Message"
''or
CHecK [14.2^1-15/23/24] "Optional Message"

''Single Response - Multiple Column - Non Contiguous
CHecK [16.2*P^1//15/B] "Optional Message"
CHecK [19*P^1//9/X/Y/B] "Optional Message"
If [16.2^B] and [19^B] Then
  Error "No Answer in [16.2] and [19]"
Endif
If [16.2^NB] and [19^NB] Then
  Error "Answer in both [16.2] and [19]"
Endif

''Multiple Response - Multiple Column - Non Contiguous
CHecK [16.2*P^1-15/B] "Optional Message"
CHecK [19*P^1-9/X/Y/B] "Optional Message"
If [16.2^B] and [19^B] Then
  Error "No Answer in [16.2] and [19]"
Endif
If [19^X,Y] and [16.2^NB] Then
  Error "19 is X or Y and 16.2 and 19 not single" [16.2$P] [19$P]
Endif

''**********  Code Data / Fld Questions *****************
''Single Response
CHecK [21.2*Z#1//20/91/99]
''Multiple Response
CHecK [23.2,...,29.2*F*Z#1//20/91/(-)99]

''**********  Numeric Data / Num Questions *****************
CHecK [31.3#1-100,dk,rf] "Optional Message"
''or
CHecK [31.3*R=1-100,,dk,rf] "Optional Message"

''**********  String Data / Var Questions *****************
CHecK [36.20*P=10$] "Need at least 10 characters, no more than 20"

''**********  Text Data / Text Questions *****************
CHecK [57$T] "Optional Message"

''**********  Consecutive Columns with similar Data ************
''  This works for all types above
CHecK [58,...,80^1//5/9]

''**********  Constant Sum  **************************
''   The following 5 questions must add to 100
If [81.3] + [84.3] + [87.3] + [90.3] + [93.3] < 100 or Sum([81.3,...,93.3]) > 100 Then
  ERRor "Sum either > 100 or all answered and < 100" [81.3,...,93.3$]
Endif

''**********  Awareness Grid / Fld Questions
''where 101.2  first unaided mention
''103.2-115.2  all other unaided mentions
''117.2-131.2  unaided advertising mentions
''133.2-141.2  aided mentions
''143.2-151.2  aided advertising mentions
If [101.2#1//20] Intersect [103.2,...,115.2*F#1//20] Then
  ERRor "First Mention Duplicates Other Mentions" [101.2$] [103.14$]
Endif
If [101.2,...,115*F#1//20] Contains [117.2,...,131*F#1//20] Else
  ERRor "Ad Aware, but not Aware" [101.16$] [117.16$]
Endif
If [101.2,...,115*F#1//5] Intersect [133.2,...,141*F#1//5] Then
  ERRor "Unaided Duplicates Aided Mentions" [101.16$] [133.10$]
Endif
If [101.2,...,115,133,135,...,141*F#1//5] Contains [143.2,...,151*F#1//5] Else
  ERRor "Something in Aided Ad, but not ever aware" [101.16$] [143.10$]
Endif
If [117.2,...,131*F#1//5] Intersect [143.2,...,151*F#1//5] Then
  ERRor "Unaided Ad Duplicates Aided Ad" [117.16$] [143.10$]
Endif
}
~End
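For readers who want to reason about the Constant Sum and Awareness Grid checks outside of the spec, the logic can be sketched in Python. This is only an illustration of the intended rules as the error messages describe them, not part of CLEANDATA.SPX. It assumes that a blank field is modeled as None, that Intersect means "the two lists share at least one code", and that Contains means "every code in the second list also appears in the first". The function names and the use of sets of brand codes are hypothetical.

```python
from typing import List, Optional, Sequence, Set


def constant_sum_error(values: Sequence[Optional[int]], target: int = 100) -> bool:
    """Return True when a constant-sum question set should be flagged.

    None stands for a blank field (an assumption about how missing data is modeled).
    """
    answered = [v for v in values if v is not None]
    total = sum(answered)
    all_answered = len(answered) == len(values)
    # A total over the target is always an error; a total under the target
    # is an error only once every field has been answered.
    return total > target or (all_answered and total < target)


def awareness_errors(
    first: Set[int],
    other_unaided: Set[int],
    unaided_ad: Set[int],
    aided: Set[int],
    aided_ad: Set[int],
) -> List[str]:
    """Return the awareness-grid error messages that apply to one respondent."""
    errors = []
    unaided = first | other_unaided
    if first & other_unaided:  # Intersect: any shared code
        errors.append("First Mention Duplicates Other Mentions")
    if not unaided_ad <= unaided:  # Contains ... Else: subset test fails
        errors.append("Ad Aware, but not Aware")
    if unaided & aided:
        errors.append("Unaided Duplicates Aided Mentions")
    if not aided_ad <= (unaided | aided):
        errors.append("Something in Aided Ad, but not ever aware")
    if unaided_ad & aided_ad:
        errors.append("Unaided Ad Duplicates Aided Ad")
    return errors
```

As a usage example, constant_sum_error([50, None, 20, None, None]) is not flagged because the respondent has not finished and the running total is still under 100, while constant_sum_error([60, 50, None, None, None]) is flagged because the total already exceeds 100.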

  • cleandata.spx