Strange output from "select_multiple" questions

I have output from an ODK survey that I was uploaded directly the ODK Collect app on the phone (v1.21.2) to a Google Drive Sheet. I just noticed that for the "select_multiple" questions, I got weird output. For example, we have one question that allows respondents to select multiple months, coded as 1-12. But for people that actually selected multiple months, there are weird numbers like 43686, 37623, 40399, etc. How do I make sense of these numbers and decipher which months were actually selected for this question?

Can you attach the form that you used?

HERE is a link to a form with the two questions on month that I referred to in my example. There were other "select_multiple" questions that had similar issues.

Hi, I just wanted to follow-up and see if you ever got a chance to take a look at the example form I put a link to and if you have any advice on what might be causing the strange output from the "select_multiple" questions. I'm going to be creating a similar survey to use in the coming weeks, and I would like to avoid this issue on the next round of surveys, if possible.

Hi @generic_name

I tried to reproduce your issue but to no avail. Could you share the sheet with answers?

HERE is the link to the output I got from those two questions.

@mmarciniak90
could you try to reproduce the issue? The result sheet shows that it happens like in 1/10 submissions.
Try v1.21.2, the attached form and various devices.

Thanks for details @generic_name I was able to reproduce the issue (with @mmarciniak90 help) and prepared a fix which should go in v1.23 (should be available in July). For now, I would recommend selecting those columns which will contain answers from select-multi questions and change their format to Plain text (In your spreadsheet Format -> Number -> Plain text). It won't fix those values you already submitted but it should avoid the same issue in the future.

Thank you. I appreciate you looking into this and trying to find a fix for the problem on future surveys. I'm wondering: is there anyway to figure out from the values that were produced from the previous survey what answers were selected?

I'm not sure if there it's a reliable way but in my case, it worked:

  1. Select a cell with the wrong value, eg. A4 43686
  2. Click Format->Number->Date
  3. It will give you 09/08/2019 what might indicate selected values were 9, 8, 19

but you have only 12 options in the form you attached...
did you have more by any chance? or maybe you used another form?

Hmmm... Yes, I only had 12 options, so numbers above 12 would not represent a valid response option.

If you are 100% sure you didn't send data with more options I have no idea then.

@generic_name If you only have 12 options, one thing you could try to "recover" the answers is send in some fake submissions (keep track of which) with some of those choices and see what the sheet converts them to.

You can also do this manually in Excel by playing around with the values.

  • To go from the number to a date: =TEXT("43686", "dd/mm/yyyy")
  • To go from the date to a number: =DATEVALUE("43686")

Is it an issue with how it uploaded to Google Sheets? Would I be able to retrieve the information if I just try to get the data from the phone on which these surveys were conducted and opened those files in a different format?

@generic_name, yes, this is specific to Google Sheets. If you have access to the devices, you can pull them off the device using ODK Briefcase and export to a CSV.

Excellent. I'll give that a try. Thank you so much for your help and advice!

1 Like

As of ODK Collect v1.23, all values are sent to Google Sheets from Collect as raw values. That means Google Sheets will not try to apply any kind of formatting. When you edit a cell, you will see a tick before the value (e.g. '1-2-3 rather than 1-2-3). You can perform calculations with those values as you normally would and you can apply a format (e.g. currency, date) to select columns yourself if you would like.

Hi, I totally understand the change since version 1.23 because I also ran in the date format issue like OP in the past, but now with the apostrophe in front of my entries I can't perform calculations anymore. Sum function gives 0 as a result unless you manually apply the numeric format to the cells. But that's not valid for new entries and the result is not automatically update, even if you applied the format to the whole column.

Am I missing something? @LN
Thanks for your help.
Daniele

I agree that this is a problem, @sneike. See Apostrophe in front of each answer in formula bar causing problems with formulas in other sheets - #2 by LN for more context and a suggested approach for now.

@LN thanks for pointing out that thread. I did several searches in the forum but I didn't find it. Looks like I will have to create new formatted sheets with converted values, but that is a real pain with big spreadsheets, many formulas and calculations, because you have to manually edit every single formula with the new converted data!

But yeah, I can't think of a better solution to fix the loss of imported data because of wong formats applied by spreadsheet.

Thanks a lot for your work and support.
Daniele

1 Like