Populating answer choices from household roster

1. What is the problem? Be very detailed.
I am trying to get information on which members of the household participate in various livelihoods. The roster is completed using a repeat loop. The livelihoods are identified from a select_multiple question. How do I set up a select_multiple question for each livelihood using household members as the selection options?

2. What app or server are you using and on what device and operating system? Include version numbers.
Using ODK XLSForm Online, macOS Big Sur

3. What have you tried to fix the problem?
I have set up the attached form. I can do this with a select_one question using select_one ${person_name}, but this allows the respondent to select only one household member per livelihood.

4. What steps can we take to reproduce the problem?
See the attached ODK form.
https://drive.google.com/file/d/1PEhJB7bkH46YTYlmCXGZp9bS7nNXMn9o/view?usp=sharing

5. Anything else we should know or have? If you have a test form or screenshots or logs, attach below.

Dynamic lists are hard, but possible.

Do you have a reasonable upper bound on the size of the household? If you can guarantee you would not need to do more than 10 (or 20, or 30), then there is a solution. This approach would use indexed-repeat and choice filtering.

https://docs.getodk.org/form-operators-functions/#indexed-repeat

First, calculate the size of the repeat (named person in your example form). It will be used later in a few places:

type: calculate, name: person_count, calculation: count(${person})

(EDIT: the function was changed from size to count)

Next, create as many calculates as your upper bound (10, 20, 30, etc.). The name is recorded inside the repeat as person_name.

type: calculate, name: person1, calculation: if(1 <= ${person_count}, indexed-repeat(${person_name}, ${person}, 1), "")
type: calculate, name: person2, calculation: if(2 <= ${person_count}, indexed-repeat(${person_name}, ${person}, 2), "")
...
type: calculate, name: person10, calculation: if(10 <= ${person_count}, indexed-repeat(${person_name}, ${person}, 10), "")
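
Typing out that many near-identical rows by hand is error-prone. As a rough sketch, the rows could be generated with a short Python script and pasted into the survey sheet. The helper name calculate_rows and the tab-separated output are assumptions for illustration; they are not part of XLSForm or any ODK tool.

```python
def calculate_rows(upper_bound, repeat="person", field="person_name"):
    """Build (type, name, calculation) rows for the unrolled calculates.

    Hypothetical helper: it only formats strings, it does not validate the form.
    """
    rows = []
    for i in range(1, upper_bound + 1):
        calculation = (
            f'if({i} <= ${{{repeat}_count}}, '
            f'indexed-repeat(${{{field}}}, ${{{repeat}}}, {i}), "")'
        )
        rows.append(("calculate", f"{repeat}{i}", calculation))
    return rows


if __name__ == "__main__":
    # Tab-separated so the output can be pasted into a spreadsheet.
    for row in calculate_rows(10):
        print("\t".join(row))
```

Changing the argument to 20 or 30 produces the longer list for a larger upper bound.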

Add a choice filter column to any question using the hh_persons choice list:

choice_filter: filter <= ${person_count}

Then your hh_persons list becomes (make sure to add a column called filter and put the value in there):

list_name: hh_persons, name: person1, label::English: ${person1}, filter: 1
list_name: hh_persons, name: person2, label::English: ${person2}, filter: 2
...
list_name: hh_persons, name: person10, label::English: ${person10}, filter: 10
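
To make the effect of the filter column concrete, here is a small Python sketch that builds the same hh_persons rows and mimics the comparison filter <= ${person_count}. The function names choice_rows and visible_choices are made up for this illustration, and the filtering is only an imitation of what the form engine does, not its actual implementation.

```python
def choice_rows(upper_bound, list_name="hh_persons"):
    """Build the unrolled choice rows, one per possible household member."""
    return [
        {
            "list_name": list_name,
            "name": f"person{i}",
            "label::English": f"${{person{i}}}",  # label refers to the calculate
            "filter": i,
        }
        for i in range(1, upper_bound + 1)
    ]


def visible_choices(rows, person_count):
    """Imitate the choice filter: keep rows whose filter value <= person_count."""
    return [row["name"] for row in rows if row["filter"] <= person_count]


if __name__ == "__main__":
    rows = choice_rows(10)
    # With three people in the roster, only the first three choices remain.
    print(visible_choices(rows, 3))
```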

Hope that helps! I have not tried this myself, but I have done similar things for my work.

The reason we don't currently allow select_multiple from repeat is that it's really easy to end up with results that are hard or impossible to analyze if the field it uses allows spaces (e.g. "Marie Pierre Denis" could refer to one, two, or three selected people). We could consider adding it.
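
The ambiguity is easy to reproduce outside ODK. The Python sketch below assumes only what is described here: a select_multiple answer is stored as one space-separated string. The two helper functions are hypothetical stand-ins for saving and reading such an answer.

```python
def store_select_multiple(selected_values):
    """Join the selected values the way a select_multiple answer is stored."""
    return " ".join(selected_values)


def read_select_multiple(stored):
    """Split a stored answer back into individual values."""
    return stored.split()


# One person whose name contains spaces...
one_person = store_select_multiple(["Marie Pierre Denis"])
# ...and three separate people.
three_people = store_select_multiple(["Marie", "Pierre", "Denis"])

# Both selections produce the same stored string, so the original
# selection cannot be recovered from the data alone.
print(one_person == three_people)
print(read_select_multiple(one_person))
```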

In the meantime, you could unroll the choices as @jpringle suggests and make sure you don't allow spaces. In that case, I'd recommend using the choice_filter expression position() <= ${person_count} and avoiding an extra filter column.

Do you know what kind of analysis you're going to end up doing? Could you consider flipping things around and having an occupation select multiple in the person repeat? Depending on what you want to do with the data, that could be easier to work with. In order to avoid including all occupations in the selection, you could ask which occupations are represented in the household before you build the household roster.

Thanks for pointing out the spaces problem. I had not thought of that, but your suggestion seems like an easy fix. I hope you reconsider adding select_multiple from repeat, since even the workaround suggested by @jpringle runs into the same problem. Or is it possible to rethink how select_multiple responses are recorded? Comma delimited instead of space delimited?

I don't understand how asking the occupation question first fixes the problem. Wouldn't I still be including a select_multiple with dynamically generated list of occupations after the household member repeat? It flips the format but the problem remains unless I am misunderstanding your suggestion. Is the distinction whether the select_multiple comes inside or outside the repeat?

Thanks for your response, however I couldn't get this to work.

When calculating the size of the repeat, I get an error using the function "size". Has the "size" function been replaced? I can use "sum" without errors; however, I cannot preview in a browser. Something is still not right.

Modified form... https://drive.google.com/file/d/1s0C1vsNIF4fAhDjK24GEBl95Q2qXlNbe/view?usp=sharing

Whoops: it is count. I always have to look this one up. See https://docs.getodk.org/form-repeats/

It should be

type: calculate, name: person_count, calculation: count(${person})

This actually isn't a problem in the solution I presented. The choice names used in the select_multiple are person1, person2, ... person10.

See above. This isn't a problem in the solution I presented.

Unfortunately, no, it isn't possible to rethink it at this stage. There are many, many tools that rely on the ODK spec, which is derived from XForms and a longstanding package called JavaRosa that implements XForms. The underlying data is an XML file, and each question gets a single string field to store the response. That is the limiting factor.
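
For analysis, the single string stored for each livelihood question can be mapped back to names, because the stored values are the space-free choice names person1, person2, and so on. The Python sketch below uses a simplified, hypothetical submission: the element names and the flat layout are assumptions for illustration, and a real submission from this form would be structured differently (the roster sits inside a repeat).

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical submission: names held by the calculates and
# one select_multiple answer stored as a single space-separated string.
SUBMISSION = """
<data>
  <person1>Marie Pierre</person1>
  <person2>Denis</person2>
  <person3>Kwame</person3>
  <farming>person1 person3</farming>
</data>
"""


def members_for(xml_text, question):
    """Return the member names selected for one livelihood question."""
    root = ET.fromstring(xml_text.strip())
    stored = root.findtext(question, default="")
    # Each stored token is a choice name; look up the matching calculate.
    return [root.findtext(choice, default=choice) for choice in stored.split()]


if __name__ == "__main__":
    print(members_for(SUBMISSION, "farming"))
```

Because the lookup goes through the choice names, a name such as "Marie Pierre" comes back intact even though it contains a space.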

I tested your form on my Enketo preview through ODK Central. I had to make a few changes (remove the group definitions in green because it was too much nesting) and change sum to count. This worked for me.

OK Thanks. I think ODK Aggregate must have different recursion limits. I'll try ODK Central.

Sorry, that's right! You can't have dynamic underlying values with that approach, so that may be a fine way to go if what you need are lists of person indexes for each occupation. I'm a little uncomfortable with it because it's "unrolled" rather than being dynamic based on the person count, and because being able to reference survey values from choices is more of a bug than an intentional feature. We do see that this pattern has spread, so it's unlikely we'd change it at this point, but it's a strange thing to do from the XForms spec perspective.

I would consider what analysis is needed and what is easier for respondents to answer. Do you just need counts of people per household for each occupation? Or do you need to connect individuals with their occupations? Is it easier for respondents to state each household member's occupations or to go occupation by occupation and list household members? You may also consider that it will be harder to identify mistakes going occupation by occupation. For example, if I ask a respondent about Kwame's occupations and I mark "doctor" and "fisherman", I might do an extra confirmation because it's an unusual combination. If I include Kwame in the doctor and fisherman lists, I probably won't notice anything interesting about Kwame.

The structure I described entails first asking "What occupations are represented in your household" with multiple choices. Let's call this question household_occupations.

Then for each household member, you can ask "What occupation(s) do you have?" and filter the list based on occupations in the household. The choice_filter expression would be something like selected(${household_occupations}, name). It's completely dynamic and will update if the overall household occupations are updated. It ties occupations to individuals. It does not have any problem with spaces because the choice list is the pre-defined occupation list; no list is built from household members.
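
As a rough illustration of how that filter narrows the list, the Python sketch below imitates selected() on a space-separated answer. The occupation names are invented for the example, and the selected function here is only a stand-in for the XPath function of the same name, not the ODK implementation.

```python
# Hypothetical pre-defined occupation choice names (no spaces in names).
OCCUPATIONS = ["farming", "fishing", "trading", "teaching"]


def selected(stored_answer, value):
    """Stand-in for selected(): is value one of the space-separated answers?"""
    return value in stored_answer.split()


def filtered_choices(household_occupations):
    """Occupations offered for each person, given the household-level answer."""
    return [name for name in OCCUPATIONS if selected(household_occupations, name)]


if __name__ == "__main__":
    # The household-level question was answered with two occupations.
    print(filtered_choices("fishing farming"))
    # Changing the household answer changes every person's list.
    print(filtered_choices("teaching"))
```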

Yes. Your suggested approach makes a lot of sense to me.