Problems with large form submissions to Aggregate - 60-second request limit ODK Collect form submission error?

What is the problem? Please be detailed.

We are seeing an error a lot (3,000+ times in the last month) when trying to submit rather large forms (e.g. 30 MB); see the example stack trace below. Most of that size comes from photos and one video; the video alone is around 20 MB.

Does this error have to do with the 60-second request limit, whereby 10 MB submissions must finish within 60 seconds, so that slow or unreliable internet is the issue? We have had success submitting similarly sized forms over fast, reliable internet connections.

Also, is it a problem to have a single file (e.g. video) that is larger than 10 MB? The response code is "500 Internal Server Error" but I don't understand the line about the "Illegal character in path at index 30."

Here is the parsed stack trace from an example:

org.opendatakit.aggregate.task.UploadSubmissionsWorkerImpl: uploadAllSubmissions: org.opendatakit.aggregate.exception.ODKExternalServiceException: java.lang.IllegalStateException: java.net.URISyntaxException: Illegal character in path at index 30: http://35.192.48.24:8089/entry
at org.opendatakit.aggregate.externalservice.JsonServer.sendRequest (JsonServer.java:148)
at org.opendatakit.aggregate.externalservice.JsonServer.insertData (JsonServer.java:197)
at org.opendatakit.aggregate.externalservice.AbstractExternalService.sendSubmission (AbstractExternalService.java:138)
at org.opendatakit.aggregate.task.UploadSubmissionsWorkerImpl.sendSubmissions (UploadSubmissionsWorkerImpl.java:322)
at org.opendatakit.aggregate.task.UploadSubmissionsWorkerImpl.uploadSubmissions (UploadSubmissionsWorkerImpl.java:282)
at org.opendatakit.aggregate.task.UploadSubmissionsWorkerImpl.uploadAllSubmissions (UploadSubmissionsWorkerImpl.java:196)
at org.opendatakit.aggregate.task.gae.servlet.UploadSubmissionsTaskServlet.doGet (UploadSubmissionsTaskServlet.java:109)
at org.opendatakit.common.security.spring.SecurityContextHolderAwareAuthPreservingRequestFilter.doFilter (SecurityContextHolderAwareAuthPreservingRequestFilter.java:66)
at org.opendatakit.common.security.spring.DigestAuthenticationFilter.doFilter (DigestAuthenticationFilter.java:40)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter (FilterChainProxy.java:331)
at org.opendatakit.common.security.spring.OutOfBandUserFilter.doFilter (OutOfBandUserFilter.java:105)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter (FilterChainProxy.java:331)
at org.opendatakit.common.security.spring.Oauth2ResourceFilter.doFilter (Oauth2ResourceFilter.java:352)
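
A note on the "index 30" part of the message: the URL as printed is exactly 30 characters long, so index 30 (counting from 0) points just past its last visible character. The trace also appears to come from Aggregate forwarding data to an external JSON server (the `JsonServer` class), not from Collect's upload itself. One possible explanation, which is an assumption and not something the logs in this thread confirm, is that the URL configured for that external server ends with an invisible character such as a space or line break. A minimal Java sketch of that hypothesis (the class and method names are made up for illustration):

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriIndexDemo {
    // Returns the index java.net.URI reports for the first illegal character,
    // or -1 if the string parses without error.
    static int firstIllegalIndex(String url) {
        try {
            new URI(url);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        String visible = "http://35.192.48.24:8089/entry"; // URL as printed in the log
        String withTrailingSpace = visible + " ";           // hypothetical stored value

        System.out.println(visible.length());                     // 30
        System.out.println(firstIllegalIndex(visible));           // -1, parses fine
        System.out.println(firstIllegalIndex(withTrailingSpace)); // 30
    }
}
```

If this is what happened, re-entering the publisher URL without surrounding whitespace would be the thing to try; the attachment size would then be unrelated to this particular exception.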

What ODK tool and version are you using? And on what device and operating system version?

ODK Collect v1.17.0 (I'm not 100% sure that everyone trying to submit forms is on this release)
ODK Aggregate v1.4.15 on App Engine

What steps can we take to reproduce the problem?

I'm not sure.

What have you tried to fix the problem?

Assuming this has something to do with the 60-second timeout issue, we are trying to make sure that people submit their forms with fast/good internet.

Anything else we should know or have? If you have a test form or screenshots or logs, attach here.

Hi, @Jeff_Davids!

Sorry for taking so long to respond. Normally we like to be faster :S

The attachments you're describing are indeed big, and with poor connection speed this can be a problem, especially on App Engine. However, I can't see a clear relation between that and the stack trace you've attached.

Have you tried using faster connections while uploading big submissions? Did it make the problem go away?

In any case, I think I would need to look through your server's logs to get more information. If you can invite me (ggalmazor@gmail.com) to your Google Cloud project, I could take a peek.

Hey @ggalmazor!

Thanks for getting back to me. I just added you as an editor to our Google Cloud project. Let me know if you need different credentials to take a quick look and see if you find anything.

Yes, we always recommend that our citizen scientists use the best internet connection available, but here in Nepal, Internet Service Providers are not so bueno. Anyway, it would be great if you could take a look.

Also, I would really like to understand the 60-second request limit a bit better. One simple question is whether a file larger than 10 MB can be split into 2 or more separate 10 MB packages.

Thanks again for your help with this! We are so thankful for the ODK community! It is such a valuable tool! We are getting the word out to everyone we know.

Best,

jeff

Hi, @Jeff_Davids!

I'll take a look at your server next Monday. Thanks for giving me access.

Regarding your question, according to the docs, "ODK Collect splits submissions into multiple 10MB submission requests". More info here: https://docs.opendatakit.org/aggregate-limitations/#time-limit-may-be-exceeded-on-low-bandwidth-connections

I think that includes media attachments from submissions too, but I'll try to confirm that.
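
For a rough sense of what the time limit implies, the minimum sustained upload speed is simply request size divided by time. Here is a back-of-the-envelope sketch in Java, assuming decimal megabytes (10^6 bytes) and ignoring protocol overhead and latency, so real connections would need some headroom. `UploadBudget` and its method are illustrative names, not part of ODK:

```java
import java.util.Locale;

final class UploadBudget {
    // Minimum sustained upload throughput, in megabits per second, needed to
    // move `megabytes` (decimal MB) within `seconds`. Overhead is ignored.
    static double minMegabitsPerSecond(double megabytes, double seconds) {
        return megabytes * 8.0 / seconds;
    }

    public static void main(String[] args) {
        // One 10 MB request inside the 60-second window.
        System.out.println(String.format(Locale.ROOT,
                "10 MB in 60 s: %.2f Mbit/s", minMegabitsPerSecond(10, 60)));
        // Hypothetical case: a 20 MB file that had to travel in a single request.
        System.out.println(String.format(Locale.ROOT,
                "20 MB in 60 s: %.2f Mbit/s", minMegabitsPerSecond(20, 60)));
    }
}
```

This works out to roughly 1.33 Mbit/s for a 10 MB request and 2.67 Mbit/s for a 20 MB one. The second figure matters only if a single large attachment has to go in one request, which this thread does not settle.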

Hi, @Jeff_Davids!

I'm sorry I haven't had time to check on this yet. I just wanted to say that I haven't forgotten about it.

Hey @ggalmazor! Any update? While the errors aren't occurring any longer, it would be good to close the loop on this, specifically regarding our understanding of the 10 MB limit and the 60-second request limit, including whether a file larger than 10 MB can be split into 2 or more separate 10 MB packages.

I have some other questions/issues regarding a planned Aggregate upgrade, but I'll put those in a new thread! Thanks for your help with this.

Thanks for your patience, @Jeff_Davids!

I haven't been able to check your logs yet (sorry!). In any case, since you're reporting that it's no longer happening, I suggest we wait until it happens again and check it while it's fresh.

Regarding your question about splitting large submissions, I linked the docs that explain how these mechanics work. Does that document answer your question, or is there anything else you'd need to know? We can pull in other people with more Collect experience to add info.

Thanks for your follow-up message, @ggalmazor!

Yes, let's wait on doing any more investigation since the error has resolved itself for now.

After re-reading the link you provided a couple of times, it still isn't clear to me whether a single 20 MB file (e.g. a video) can be split and submitted as two 10 MB packages from ODK Collect. Sorry if I'm missing something. Understanding this isn't super urgent, but we do measurement campaigns that require video captures, and we have had a lot of issues with successfully submitting these forms, so I want to have a better understanding of what the file size limitations actually are on the GAE Aggregate side.

Thanks so much for all your help and support!
