Numeric answer with "I don't know" option

I would like to design an XLSForm that captures data as follows:
1. How many people own dogs in your neighbourhood?
For this question the respondent is expected to answer with a number such as 10, but sometimes the respondent might not know the answer. In that case we do not want the question to remain blank; we expect them to respond with "I don't know".
How do we design the form so that it accepts either a number or "I don't know"?
@Batkinson @Grzesiek2010

I'm sure there are additional ways to do this but you could do one of the following.

[Option 1]
You can split it into two questions and use a relevant condition. Something like this:

| type | name | label | required | relevant | constraint |
| --- | --- | --- | --- | --- | --- |
| select_one yn | dogs_known | Do you know how many people own dogs in this neighborhood? | yes | | |
| integer | dogs_number | How many people own dogs? | yes | ${dogs_known} = 'yes' | .>=0 |

[Option 2]
Or you could use a number outside of the expected range for "don't know" (here, assuming that the answer will always be less than 200):

| type | name | label | hint | required | constraint | constraint_message |
| --- | --- | --- | --- | --- | --- | --- |
| integer | dogs_number | How many people own dogs? | (input "999" if don’t know) | yes | (.>=0 and .<200) or . = 999 | value must be 0-199 (or 999 for don’t know) |
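To make the Option 2 constraint easier to reason about, here is a minimal Python sketch of the same check. The function name and the constant are hypothetical and only mirror the expression `(.>=0 and .<200) or . = 999`; this is not how ODK evaluates constraints internally.

```python
# Hypothetical names, for illustration only.
DONT_KNOW_CODE = 999  # special value meaning "I don't know"


def is_valid_dogs_number(value: int) -> bool:
    """Return True when the value would satisfy the Option 2 constraint."""
    in_expected_range = 0 <= value < 200  # (.>=0 and .<200)
    return in_expected_range or value == DONT_KNOW_CODE  # or . = 999
```

With this sketch, 0, 199 and 999 are accepted, while -1 and 200 are rejected.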

Thank you @danbjoseph for your answer and great suggestions.

@oichirobert @danbjoseph I can suggest a third option, which I've developed recently.

Option 1 given by @danbjoseph has the advantage that the integer variable will either contain real data or be empty, so that statistics calculated on it should be relevant. (For instance, if only 700 out of 1000 respondents have answered that question, you would want to calculate an average value based on those 700 answers only.) On the other hand, a (minor) disadvantage is that it doubles the number of questions to ask (unless the first answer is "no", of course), which can become a problem if you have many such questions in your survey.

Option 2 has the advantage of combining both questions into a single one, using a special numeric code for "I don't know". However, the disadvantage is that it can mess up your statistics if you're not careful, or if the person who will analyse the data is not fully aware of this "trick". As an example, we had a survey using this method for some questions (with "9999" as the code number) and the survey managers initially got confused when visualising the data on the KoBoToolbox online platform, because the few "9999" answers led to insane average values for these variables.
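To illustrate how a code number can distort an average, here is a small Python sketch. The answers are made up, and the helper name is hypothetical; it simply drops the "9999" code before averaging.

```python
from statistics import mean

DONT_KNOW_CODE = 9999  # the code number used for "I don't know"


def mean_of_real_answers(answers, sentinel=DONT_KNOW_CODE):
    """Average only the real answers, ignoring the "don't know" code."""
    real = [a for a in answers if a != sentinel]
    return mean(real) if real else None


answers = [3, 5, 9999, 4]  # made-up data: one respondent did not know
naive_average = mean(answers)                  # includes the code number
clean_average = mean_of_real_answers(answers)  # real answers only
```

Here a single "9999" is enough to push the naive average into the thousands, while the three real answers average 4.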

The third option that I propose has the disadvantage of being a bit convoluted in the coding, but it combines the advantages of both options 1 and 2: only "real" data is recorded and only one question is asked (unless the enumerator enters contradictory data, as you will see, but this is assumed to be exceptional). The idea is to combine the two following questions on the same screen, using a group with a "field-list" appearance:

  1. "How many people own dogs?", as an 'integer' question.
  2. or "I don't know", as a 'select_one' question with only a single option: "I don't know".

Because you only want one of these questions to be answered, neither of them should be mandatory ('required'=false). Data control is then achieved using a subsequent 'note' (a warning), which will only be displayed if the answers given previously are contradictory. That is:

  • If both answers are empty.
  • Or if both answers are non-empty.

This is done with a 'relevant' formula for the note such as:

(${number}='' and ${number_idk}='') or (${number}!='' and ${number_idk}!='')

And to prevent the enumerator from moving forward without correcting the previous data, you define this note with 'required'=true. Of course, the note must explain why there is a problem and how to solve it! Hopefully, as I said, the warning note being displayed should only be the exception and would probably not happen more than once per enumerator (and maybe not at all if a preliminary training addresses this particular point).
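The logic of the 'relevant' formula for the warning note can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and an unanswered question is assumed to be represented by an empty string, as in the formula.

```python
def warning_is_relevant(number: str, number_idk: str) -> bool:
    """Return True when the warning note should be displayed."""
    both_empty = number == "" and number_idk == ""
    both_filled = number != "" and number_idk != ""
    # The note appears only when the two answers contradict each other.
    return both_empty or both_filled
```

The note stays hidden when exactly one of the two questions has been answered, which is the normal case.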

If you want to test this option, you can try it online or download this commented XLSForm: Numerical question with I don't know.xlsx (13.1 KB)


Alternatively, you can leave your original question (aka ${question}) as it is, with perhaps an additional hint saying something like "Please leave blank if you do not know". Then add a (hidden) calculation which will contain the actual response (${response}) that you are interested in capturing and subsequently processing, e.g.:

calculation: response = coalesce(${question}, 'I dont know')

That way, ${response} should always contain the result you desire.
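As a rough illustration of what the calculation does, here is a Python sketch. It assumes that `coalesce` returns its first argument unless that argument is empty, and that a blank answer is an empty string; the function names are hypothetical stand-ins, not ODK code.

```python
def coalesce(first: str, second: str) -> str:
    """Return the first value unless it is empty, otherwise the second."""
    return first if first != "" else second


def response(question: str) -> str:
    """Mirror of: response = coalesce(${question}, 'I dont know')."""
    return coalesce(question, "I dont know")
```

Note that in this sketch the result is a string in both cases, so whoever analyses the data would need to convert the numeric answers back to numbers.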

@sebmercier Thank you so much for your wonderful response. I will try that option you have suggested.

@Xiphware Thank you so much for being concerned and assisting me. I am always looking forward to learning from the entire crew of ODK Forum!
The above question is required: no skipping is allowed, and that is one of the conditions.

Please note there is no show/hide (i.e. skipping) involved; the original question "How many people own dogs?" is always visible and answerable. Rather, the value you wish to return for this question, either the number the user enters or "I dont know", is essentially calculated from their response, if any.

Either way, you have a few options here to choose from. I'm sure one will best suit your purpose. :slight_smile:


Thank you @Xiphware for your support

Thank you @sebmercier for your detailed explanation. Things are much much clearer than before.
I stand to be corrected on whether this is possible. I was wondering whether that question could be designed to capture text rather than numerical data: we set the data type to 'text' and add a hint telling respondents to enter "I don't know" when they do not know the answer to the question 'How many people own dogs in your neighbourhood?'
Here is a file to demonstrate that: Do_not_know.xlsx (11.4 KB)

Here is the link: Online form

What are the limitations of doing so?
There is a reason why the text data type accepts both text and numbers in the same text box.

You can absolutely have a question that takes (freeform) text, including digits, into which the user can enter a number "123" as well as regular text.

The main limitation, as I see it, is that regardless of what you put in your hint, the user may still enter "I dont know" (no apostrophe), or "i don't know" (lowercase i), or... So in addition to now making the user type in a bunch of standard response text, you will (or should) have the issue of validating that their response is valid.
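To give an idea of the validation work this implies, here is a Python sketch of how free-text answers might be cleaned up after collection. The function name and the normalisation rule are assumptions for illustration; real data would likely contain variants this does not catch.

```python
import re


def classify_text_answer(raw: str):
    """Return an int, the string "I don't know", or None if unrecognised."""
    cleaned = raw.strip()
    if re.fullmatch(r"[0-9]+", cleaned):
        return int(cleaned)
    # Ignore case, spaces and apostrophes when comparing the text.
    letters_only = re.sub(r"[^a-z]", "", cleaned.lower())
    if letters_only == "idontknow":
        return "I don't know"
    return None  # anything else needs manual review
```

For example, "123", "I dont know" and "i don't know" are all recognised, but an answer such as "about ten" is not.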

Avoiding making the user enter standard responses as plain text is good form design; it makes it a lot faster for the enumerator to fill in the form (using a select_one instead of typing in a long string) and it reduces data entry errors.