ODK Central corrupts large csv media files

1. What is the problem? Be very detailed.
ODK Central corrupts large csv media files, but not smaller files. The form with corrupted csv media files fails to work when downloaded to ODK Collect.

2. What app or server are you using and on what device and operating system? Include version numbers.
ODK Central (latest version, installed 9/25/2019) on DigitalOcean.

3. What have you tried to fix the problem?
Tried various test files to isolate the problem

4. What steps can we take to reproduce the problem?
I have a test form and alternative csv media files that I can send.

5. Anything else we should know or have? If you have a test form or screenshots or logs, attach below.
Sorry, I do not see a means to attach the test files to this message. Please advise how to upload the files. Thanks.

Addendum to support request:

Below are test files to reproduce the problem.

  • CentralTest19c.xlsx and CentralTest19c.xml are an extract from a much larger form.

  • WD_CentralTest19c.csv is an extract of relevant columns from the larger form's media file.

  • WD_CentralTest19c_FromCentral.csv is the same file after Central uploads it. ODK Collect (v1.23.3) cannot interpret the data in this file; it appears to be corrupted.

  • WD_CentralTest19c_small.csv is a much smaller version of the csv media file. It seems to work OK.

Any help with this issue would be greatly appreciated. Thanks! HB

[WD_CentralTest19c_FromCentral.csv|attachment] (191.9 KB)

[CentralTest19c.xml|attachment] (2.2 KB)
[CentralTest19c.xlsx|attachment] (22.7 KB)
[WD_CentralTest19c.csv|attachment] (1.4 MB)
[WD_CentralTest19c_small.csv|attachment] (16.5 KB)

links removed because they contain personal information

@Hayden_Boyd How much memory does your DigitalOcean box have? I think this is a memory issue. Read "Failed download records .zip folder from project submission page - #5 by LN" for a few options while we try to reproduce this. My guess is that bumping up your RAM or adding swap might help.

Thanks @yanokwa. FYI, I started with 1GB Digital Ocean memory as suggested by the ODK instructions, and increased to 2GB when I got an insufficient memory error message during Central installation. We can certainly add more memory if needed for the form and submissions. Fortunately, we will not start field work soon with this survey, so there is plenty of time to resolve this.

Thanks very much for sharing your files, @Hayden_Boyd. I have reproduced the problem and it turns out that Collect is not unzipping the file when it should be. You can change the extension of the small file to zip and verify that it correctly expands to your full content.
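As an alternative to renaming the file, the leading bytes of the downloaded file can show whether it is still compressed. The following is a minimal sketch, not part of Central or Collect: the function name `sniff_compression` is hypothetical, and it only assumes the standard ZIP and gzip file signatures, since the thread does not say which container format is involved.

```python
# Hypothetical helper: inspects a file's first bytes for known signatures.
ZIP_MAGIC = b"PK\x03\x04"   # local file header at the start of a ZIP archive
GZIP_MAGIC = b"\x1f\x8b"    # header of a gzip stream


def sniff_compression(path):
    """Return 'zip', 'gzip', or 'none' based on the file's leading bytes."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    # Plain CSV text (or anything else) does not match either signature.
    return "none"
```

Calling `sniff_compression("WD_CentralTest19c_FromCentral.csv")` on the file downloaded from Central would return something other than `"none"` if the file is being stored compressed.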

I'll update the thread as we work on a fix for Collect.

I stand corrected: @issa has found that the file is being ingested into Central while still zipped, and will be following up.

hey @Hayden_Boyd, i have some idea what's going on and i'll try to have a patch for you in the next day or two.

Thanks, @issa. I look forward to your patch.

hey @Hayden_Boyd:

this pull request: https://github.com/opendatakit/central/pull/99 should resolve your issue. it is preliminary, so i wouldn't advise using it on a mission-critical system, but i think the risk is low. i would still advise making a backup first if you have data you don't wish to lose, less because of the risk of this patch and more just because.. who knows? :slight_smile:

to install it:

  1. log into your server and navigate to the /central directory.
  2. apply the patch: git fetch and then git checkout issa/fix-large-uploads
  3. rebuild the nginx image: docker-compose build nginx
  4. restart the system: systemctl restart docker-compose@central

following this, re-upload your corrupted files and they should work correctly now.
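One way to confirm the re-upload worked is to download the media file from Central again and compare it with the original. The sketch below is only an illustration: the helper names `sha256_of` and `same_content` are hypothetical, and it assumes Central is expected to return the file byte-for-byte as uploaded.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large CSVs are never read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(original, downloaded):
    """True when both files have identical bytes (compared via SHA-256)."""
    return sha256_of(original) == sha256_of(downloaded)
```

For example, `same_content("WD_CentralTest19c.csv", "WD_CentralTest19c_FromCentral.csv")` compares the local original with a fresh download from Central.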
