Enketo not loading because of issues with CSV file used for pulldata

1. What is the issue? Please be detailed.
I have a CSV that I'm using to pulldata that works fine in ODK Collect but has issues with Enketo. Using Enketo I get the following error message:

Error occured during the loading of this form. We do not recommend you edit this record until this is resolved. Please contact sup...@ona.io with the link to this page and the error message below:
CSV column heading "" cannot be turned into a valid XML element

It seems as though others have had this issue with ODK Central and Kobo in the past, and while I'm using ONA, I think this more has to do with the format of my CSV file because I had it working in the past, but cannot find what part of the CSV I changed to break this. Is there any documentation available on formatting these CSV files? This is all I could find regarding Pulldata, but not the CSV file specifically. Perhaps this is an issue with ONA's server (I've already contacted them) but I figured I'd post here because I think it is an issue with the CSV format and others may have the same issue.

2. What steps can we take to reproduce this issue?
Try to open up the form in a web browser, or try to edit a previous submission and it brings up the same error.

3. What have you tried to fix the issue?
My original data that I used to make the CSV is in Google Sheets. I've tried Exporting from Google Sheets directly to CSV. Failure.

I've tried copying cells from google sheets into a new Excel workbook, then saving as both CSV UTF-8 (Comma Delimited) (.csv) or Comma Separated Values (.csv) and neither of those worked as I thought I remember @LN saying something about that being important, however, I was unable to track down anywhere that has specs on what the format of the CSV needs to be. I was unable to find any spaces in headers as that is what I think was causing issues in the past.

4. Upload any forms or screenshots you can share publicly below.
Import_test.csv (59.8 KB)

Hi Tyler, hope all is well.

Trying to open your CSV in python I get this:

And looking at the file using notepad++ text editor:

Seems that those strange characters are not being understood by Central (that reads utf-8 encoded files).

As Yaw has mentioned on the Insiders call this is the download link for notepad++: https://notepad-plus-plus.org/downloads/

Hope it helps!

Thanks for looking at this @rfvieira. I used both TextMate and VSCode to get a closer look, and VSCode (w Rainbow CSV Extension) does a good job highlighting the problem characters. I decided to just delete the whole French Column (Column G/7), that still didn't work. I then just deleted all of the values in the French except for the first couple rows with "Anacardier" Still no luck. I saw that there were a few Spanish/Portuguese characters (ñ and á) in the English common names so I deleted those and replaced them with "n" and "a". I tried again and still same error message. The other CSV also had 3 duplicate rows so I deleted two of them, still same error message.

Here is the updated CSV that still isn't working.
Import_2.csv (52.0 KB)

Based on everything I've tried I was hoping it was more of a file format issue not individual characters, but I can't seem to find the issue.

I've tried even small files like this example from the XLSForms Documentation by downloading the XLSX, opening in Excel, renaming the sheet name to Import, then saving as CSV UTF-8 (.csv), opening it in VSCode to double check, and then uploading it to ONA, only to receive the same error. Here is that short CSV file:
Import_from_XLSForms_fruits_example.csv (77 Bytes)

It wouldn't be that big of a deal but it renders Enketo useless which I don't use as much as ODK Collect, but still use frequently. The most frustrating part about this is I have 3 or 4 backed up CSVs, all of which have the same problem and I apparently never backed up the CSV that was working properly. :disappointed_relieved: :sob:

Sorry for bringing python to the table again, but I changed a bit the way I was reading your file, look:

By adding the parameter encoding_errors = 'backslashreplace' it starts replacing any bad characters on the input file so it can read the whole content and the on_bad_lines = 'warn' allows me to control what behaviour it will take if a line has a different number of columns for example (with more or less commas for example), in this case it will warn but still read the other lines.

The last code cell writes the CSV into a new file enforcing utf-8 encoding and comma as the separator (index=False means that I don't wanna export the indexes created while loading the file).

Now you can identify the changes made by this process on your file and change what is causing the error:

Tyler_Import_test.csv (60.2 KB)

@rfvieira, I went through your file and replaced all of the "" with normal English characters. I also noticed there was a blank line at the very end which I deleted, I still get the same result. Here is the file that cleaned from the one you gave me.

Import_3.csv (60.0 KB)

Well, at least no encoding erros :))

I tried to use your file on a select_one_from_file question but I got confused of what column should I use as the identifier (name column) and what should be displayed for the enumerator (label column).

How are you using this CSV? Can you give me an idea of the content of the survey sheet that references the file?

Do you have a device that at one time had the working CSV that you can lay your hands on? If so, install Files, navigate to the ODK folder, find the project / form and go back through the versions folders and take a copy of the CSV from here.

\org.odk.collect.android\files\projects\[yourprojectID]\forms\[yourformname]_[version#]-media\

Or connect over USB and browse from PC...

It's likely not your solution, but I did have an issue where excel was creating a slightly noncompliant CSV and I had to add a 'false' first column - header was 'a', no data in the cells below it, then my second column was 'name' etc etc.
And I also sometimes had errant cells that seemed empty but would cause problems, so going to the last column then selecting the next column to the right, then extending all the way to the end (ctrl-shift-right) and deleting, then going to the last row, selecting the next row below and extending to the bottom (ctrl-shift-down) and deleting would clear these out also.

@rfvieira , Here is my current XLSForm file: Farm_Observations.xlsx (163.2 KB)

Big picture
The survey form is used to manage farm tasks/events and tree observations designed for use by monolingual manual laborers, as well as French/English speaking/reading managers. See bottom for more details.

Specifics of CSV
The vast majority of surveys are done using a Tree ID number to make an observation, measurement, etc.

The CSV in particular is a list of all the trees that have ID numbers. When making an observation, enumerator either scans a barcode or manually enters the ID. The ID is used to pulldata for several different properties of the tree/plant, which determines some different questions later. Most of these pulldata calculations can be found in rows 28-57.

Details about Overall Survey/Project
The file/survey has become much larger than what I originally intended and thats mostly because ODK Collect is so AWESOME! It has become a game changer in a very positive way because it has allowed someone who is monolingual in a minority language and illiterate (but can read numbers decently) be able to collect consistent, detailed, and organized observations that previously were all just highly disorganized WhatsApp voice messages/pictures. We originally had about 300 grafted cashew trees and were taking detailed measurements/observations to confirm characteristics to verify the claims made about each variety. A couple years later, we added over 100 species with over 400 unique varieties of other fruit trees/plants, many of which don't have names in French and the English names are often stolen from Spanish/Portuguese. This is the first "big" project/survey file that I've created/used and at the moment it is primary used by myself and another monolingual, illiterate laborer. If there are comments/constructive criticisms about survey/form design that would help, feel free to PM me. Would love the feedback!

1 Like

I was able to get over 50 different Import.csv files from an Android device that was using ODK Collect using that method. All returned the same error except for a few where there was a space in one of the headers "Tree ID" instead of "Tree_ID", which returned this error:

CSV column heading "Tree ID" cannot be turned into a valid XML element

I deleted another column that wasn't being used by any pulldata calculation, which I really thought would have fixed it...still not working. Here is that CSV
Import_4.csv (55.1 KB)

I included my most current XLSForm in my response to @rfvieira . Is it possible that its a problem with something the XLSForm? Someone on the Insider Call mentioned they didn't think ONA upgraded anything recently. I can see on ONA that I have ODK Collect submissions from Sept 13 24 that were edited via Enketo on Sept 22, 24 as well as ODK Collect submissions from Sept 30 and Oct 1 that were edited on Oct 1 via Enketo. I'd hesitate to upload an old XLSForm because I've added fields since then with submissions that now have data.

Found the culprit...it was a DIFFERENT CSV file, one that was WAY simpler and easy to diagnose. I guess the positive side of this as it relates to developing Enketo/ODK Central is that having an error message that INCLUDES the name of the CSV in question would be extremely helpful in these cases so we don't go digging in the wrong place. Based on what I saw of the screenshots for this error message, this is something controlled by ODK Central correct? I'm not sure where the best place to be to suggest this would be...

BAD CSV with empty columns
Screenshot 2024-10-04 at 2.01.50 PM

Fixed CSV
Screenshot 2024-10-04 at 2.02.04 PM

Thank you to all you helped with suggestions for troubleshooting. I feel like a CSV master now and have better tools to diagnose these issues.

3 Likes

:face_with_symbols_over_mouth:

Glad you figured it out!