So, this is slightly more low-level than the advice offered previously on
this thread, but in terms of implementing an expansion of select_multiple
variables, my approach is this:
- I pad the strings with spaces on either end, e.g. "1 2" >> " 1 2 " (can
be done via a simple concatenate command in whatever program you're using) - Then, I generate dummy variables for each possible option, e.g. dummy1
dummy2, and set them equal to 1 if the string contains the space-padded
number, e.g. search for " 1 ", " 2 ", etc. (Does that make sense? This is
to prevent an option like "14 11" from showing up in a search for "1").
In Excel, you could use the search function to create these dummies, e.g.
dummy1 >> =IF(ISNUMBER(SEARCH(" 1 ",A1)),1,0)
dummy2 >> =IF(ISNUMBER(SEARCH(" 2 ",A1)),1,0)
(Obviously, you'll need to update the cell reference from A1).
If you use stata, I've written an ado file that automatically does this for
coded answers 1-15, -996, -998, -999 (you could update these ranges as
appropriate to your project and coding strategy); it also takes a defined
value label list and uses that to name the dummy output. Code pasted below.
Would love to hear how other people achieve the same.
Katherine
ยทยทยท
--------------------------*! Program to expand variables that list multiple choices as a string, into
distinct individual choices. Assumes a string of space-separated number
codes and "other" responses.
program define kexpand
version 12
syntax varlist, valuelabel(string)
foreach v of varlist varlist' { replace
v' = subinstr(v', "other", "-996", .) // Replace "other" responses with code replace
v' = "" if v'=="." // Replace "." responses with blank string replace
v' = " " + v' + " " if
v'!="" // Pad string with
spaces on both ends
forval x= 1/15 { // Expand to
dummies using regex match
gen v'_
x' = .
replace v'_
x' = 0 if v'!="" replace
v'_x' = 1 if regexm(
v', " x' ") local varname "
:label valuelabel'
x''"
label var v'_
x' "`varname'"
}
foreach x in 996 998 999 { // Expand missing
values to dummies using regex match
gen v'_
x' = .
replace v'_
x' = 0 if v'!="" replace
v'_x' = 1 if regexm(
v', "x'") local varname "
:label valuelabel' -
x''"
label var v'_
x' "`varname'"
}
}
foreach v of varlist varlist' { // Drop if variable is totally empty foreach x of varlist
v'_* {
qui sum x' if r(mean) == 0 & r(max) == 0 & r(min) == 0 { drop
x'
di "`x' dropped because all values were zero or
missing"
}
}
}
end
On Thursday, April 4, 2013 2:17:17 PM UTC-4, theago...@gmail.com wrote:
Hi,
I wondered if anyone had any tips on analysing data gathered using a
'select multiple' question, with data processed by ODK aggregateAt the moment, for each set of responses it produces a string separated by
spaces in each cell, for example:1 2
3
1 3 4
2 4But the problem is, this makes things tricky for data analysis. Sometimes,
I am interested just in everyone who answered '2' - but it isn't possible
to filter the results to see just the people who answered '2' - because
some people answered '1 2', some '2 4' etc.I wondered what your tips were on dealing with this kind of data?
Thanks!
TG