How is “Text” data stored and how much text area do I need?

When Survent executes a Text question, it stores a pointer in the single-column data location assigned to that question. The pointer points to the next available space in the text area, which is where all the text data is stored. If the Text question is skipped, no data is stored anywhere. If a Text question with sub-type blank_ok is executed and no text is entered, the front pointer is still stored, but it points to a blank response in the text area.

The text area is defined with the TEXTSTART option in the Study Header statement, which sets where the text area starts and how many columns it uses. If no length is specified, the text area spans from the TEXTSTART column to the end of the case. In version 8.7+, the entire text area and all Text question pointers must be below column 65,535. In earlier versions (8.6.1), the maximum column allowed for a text pointer or the text area was 32,700.

Text data stored in the text area is also compressed so that 2 characters are stored in each column. Each response stored in the text area also uses 2 more columns for a “back-pointer”, which points back to the originating Text question. For example, if 50 characters are typed in a Text question, 27 columns are used in the text area: 50 divided by 2 is 25, plus 2 more for the back-pointer.
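The arithmetic above can be sketched in Python. This only illustrates the rule as stated here (2 characters per column, plus 2 columns for the back-pointer). The function name `text_columns_used` is hypothetical and not part of Survent or Mentor, and rounding an odd character count up to a whole column is an assumption.

```python
import math

CHARS_PER_COLUMN = 2      # per the text: 2 characters are packed into each column
BACK_POINTER_COLUMNS = 2  # per the text: each stored response needs a 2-column back-pointer


def text_columns_used(num_chars: int) -> int:
    """Estimate the text-area columns consumed by one response.

    Assumption: an odd number of characters rounds up to a whole column.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return math.ceil(num_chars / CHARS_PER_COLUMN) + BACK_POINTER_COLUMNS


print(text_columns_used(50))  # 27, matching the example in the text
```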

The best way to determine whether you have enough text area is to first look at the QSP file, which always has the length filled in on the TEXT_START option. For example:

TEXT_START=12001.3000,

Doubling that length (3000 columns in this case) gives room for about 6000 characters. The average person types about 200 characters a minute, so someone would have to type for 30 minutes to fill the text area. The actual capacity is slightly less, since each stored response also uses 2 columns for its back-pointer, but this is a close approximation.
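As a rough illustration, the same estimate can be scripted. The sketch below assumes the option appears in the QSP file exactly in the `TEXT_START=start.length,` form shown above; the helper names `parse_text_start` and `estimate_capacity` are hypothetical, and the 200 characters per minute figure is the typing rate quoted in the text.

```python
from typing import Tuple


def parse_text_start(option: str) -> Tuple[int, int]:
    """Split a 'TEXT_START=start.length,' string into (start_column, length_in_columns)."""
    value = option.split("=", 1)[1].strip().rstrip(",")
    start, length = value.split(".")
    return int(start), int(length)


def estimate_capacity(length_columns: int, chars_per_minute: int = 200) -> Tuple[int, float]:
    """Return (approximate max characters, minutes of typing needed to fill the area).

    Ignores the 2-column back-pointer per response, so it slightly overestimates.
    """
    max_chars = length_columns * 2  # 2 characters per column
    return max_chars, max_chars / chars_per_minute


start, length = parse_text_start("TEXT_START=12001.3000,")
print(start, length)              # 12001 3000
print(estimate_capacity(length))  # (6000, 30.0)
```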

A good check is the dt * command in the ~Cleaner block of Mentor. It shows not only all the text on a given case, but also where the text pointers are, where the text sits in the text area, and, most importantly, how much space is still left in the text area.

~CLeaNer command or >help–>dt *

con: dt *
ID: 0001 (study code=text, int_id=0001):
TEX: 7001 <-> 12002.47=this is the first open end entered on this case
TEX: 7005 <-> 12028.45=this is another text data entry for this case
Free text area 12051.2950 and text area status is 2

7001 is the location of the text pointer. 12002.47 is the location in the text area where the text data is stored. Note that the .47 is the number of characters in the text response, not the number of columns it uses: even though this response is 47 characters long, the next text question starts at 12028, which is only 26 columns later. The 2950 shows how many columns are still free in the text area. Again, that is space for almost twice as many characters.
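To make the fields concrete, here is a small Python sketch that pulls apart the `TEX:` lines from the listing above. The regular expression is an assumption based only on the sample output shown here, not on any documented Mentor format, and `parse_tex_line` is a hypothetical helper name.

```python
import re

# Pattern inferred from the sample listing: "TEX: <pointer> <-> <column>.<chars>=<text>"
TEX_LINE = re.compile(r"^TEX:\s+(\d+)\s+<->\s+(\d+)\.(\d+)=(.*)$")


def parse_tex_line(line: str) -> dict:
    """Return the pointer location, text-area column, character count and text."""
    match = TEX_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a TEX line: " + line)
    pointer, column, chars, text = match.groups()
    return {"pointer": int(pointer), "column": int(column), "chars": int(chars), "text": text}


first = parse_tex_line("TEX: 7001 <-> 12002.47=this is the first open end entered on this case")
second = parse_tex_line("TEX: 7005 <-> 12028.45=this is another text data entry for this case")

print(first["chars"])                      # 47 characters typed
print(second["column"] - first["column"])  # 26 columns used by the first response
```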

Note that if you edit an existing text question, the layout of the questions in the text area can change drastically. The edited question is removed from the text area, all other questions are slid forward, and the newly edited text is stored at the end of the area.

For example, if you had a text area that looked like this:

ID: 0001 (study code=text, int_id=0001):
TEX: 7001 <-> 12002.21=this is the some data
TEX: 7005 <-> 12015.50=this is another text question with some data in it
TEX: 7008 <-> 12042.44=this is yet another question with data in it
TEX: 7009 <-> 12066.54=and this is the current last question in the text area
Free text area 12094.2907 and text area status is 2

And then you edited the 2nd question, in location 7005, the text area would look like this:

ID: 0001 (study code=text, int_id=0001):
TEX: 7001 <-> 12002.21=this is the some data
TEX: 7008 <-> 12015.44=this is yet another question with data in it
TEX: 7009 <-> 12039.54=and this is the current last question in the text area
TEX: 7005 <-> 12068.61=this question with some data in it and now it has been edited
Free text area 12099.2902 and text area status is 2

Notice that in addition to the text for 7005 being different, it has moved to the bottom of the list. The questions at 7008 and 7009 have moved up the list to become the 2nd and 3rd entries, and their locations in the text area have changed.
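The reshuffling can be modelled with a short Python sketch. It assumes the storage rule described earlier (2 characters per column, rounded up, plus 2 columns for the back-pointer) and takes the first text column, 12002, from the listings above. The names `columns_used`, `layout` and `edit_response` are hypothetical, and the "Free text area" line is not modelled.

```python
import math


def columns_used(num_chars: int) -> int:
    """Columns for one response: 2 chars per column (rounded up) plus a 2-column back-pointer."""
    return math.ceil(num_chars / 2) + 2


def layout(entries, first_column=12002):
    """Return (pointer, start_column, num_chars) for each (pointer, text) in storage order."""
    result = []
    column = first_column
    for pointer, text in entries:
        result.append((pointer, column, len(text)))
        column += columns_used(len(text))
    return result


def edit_response(entries, pointer, new_text):
    """Remove the edited response, keep the rest in order, and append the new text at the end."""
    kept = [(p, t) for p, t in entries if p != pointer]
    return kept + [(pointer, new_text)]


entries = [
    (7001, "this is the some data"),
    (7005, "this is another text question with some data in it"),
    (7008, "this is yet another question with data in it"),
    (7009, "and this is the current last question in the text area"),
]
before = layout(entries)
after = layout(edit_response(
    entries, 7005, "this question with some data in it and now it has been edited"))

for pointer, column, chars in after:
    print(f"TEX: {pointer} <-> {column}.{chars}")
```

Under these assumptions, the sketch gives the same pointer order and starting columns as the two listings above.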