Increasing RAM for Aggregate on App Engine

I'm technical support for a research project that has decided on using ODK Aggregate on Google App Engine as a backend (with ODK Collect to collect the data). The project has collected data for almost a year, and now it is time to start analyzing it. Annoyingly, at about the same time, Aggregate hit some limit that prevents access to the data.

What is the problem? Please be detailed.

Most of the time when trying to list submissions or publish/export data from Aggregate, nothing happens for about a minute, followed by the page reloading, and sometimes followed by an error page questioning if the bill to Google has been paid. That is, the app crashes and reloads.

Looking at the application error logs in App Engine, I see "OutOfMemoryError: Java heap space" errors at the exact same time the problem occurs. The class responsible for the error differs, but the three main contributors are BinaryContentManipulator.java, ImageUtilImpl.java and ProtocolSink.java.

Worth mentioning is that almost every entry collected has 1-3 pictures attached, usually full size pictures taken with the smartphone that is doing the collection.

When the instance class is F2 (max ram 256MB), the error is not much of a surprise when image manipulation is involved for hundreds of pictures. I would like to have the instance class changed to F4, but can't figure out how to do that.
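As a rough back-of-the-envelope check on why 256MB runs out quickly, here is a minimal Python sketch. The 4000x3000 resolution and 4 bytes per pixel are assumptions for illustration (the thread does not state the actual picture size or how Aggregate decodes images), and the usable Java heap is smaller than the instance's total memory.

```python
# Rough estimate of how many fully decoded images fit in an F2 instance.
# ASSUMPTIONS (not from the thread): 4000x3000 px photos, 4 bytes per pixel.

MIB = 1024 * 1024


def decoded_image_bytes(width_px: int, height_px: int, bytes_per_pixel: int = 4) -> int:
    """Memory needed to hold one uncompressed bitmap of the given size."""
    return width_px * height_px * bytes_per_pixel


# One assumed 12-megapixel smartphone picture, decoded in memory.
one_image = decoded_image_bytes(4000, 3000)

# Upper bound on decoded images that fit in 256 MiB, ignoring all other heap use.
f2_limit_bytes = 256 * MIB
images_that_fit = f2_limit_bytes // one_image

print(f"one decoded image: {one_image / MIB:.1f} MiB")
print(f"decoded images that fit in 256 MiB: {images_that_fit}")
```

Under these assumptions only a handful of decoded pictures fit at once, so listing or exporting many submissions with 1-3 attachments each could plausibly exhaust the heap.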

What ODK tool and version are you using? And on what device and operating system version?

ODK Aggregate is version v1.4.15 (according to the log), running on Google App Engine.

What steps can we take to reproduce the problem?

Probably requires hundreds of entries with several full size pictures on each entry to reproduce, since the test system with fewer entries did not catch the issue.

What have you tried to fix the problem?

Not much - the collection is still ongoing, so any major changes or upgrades are out of the question at the moment, and only the live Aggregate instance has data that triggers the problem - making testing difficult.

What I would like is to find a simple way of changing the instance class to F4, since that would probably "hide" the problem. Unfortunately, that appears to require a modified install of Aggregate since it is no longer possible to change instance type directly in the App Engine.

Hi @danielh, welcome to the community and thanks for the detailed issue. When you get a chance, please introduce yourself here. I'd also encourage you to add a real picture as your avatar because it helps build community!

To solve your problem, re-run the v1.4.15 installer and edit the ODKAggregate/default/WEB-INF/appengine-web.xml file to use bigger machines. The process is documented in greater detail at https://docs.opendatakit.org/aggregate-boost-performance/#web-server-size.
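For anyone who prefers to script the edit, here is a minimal Python sketch that sets the `instance-class` element in an appengine-web.xml document. The sample XML and the `set_instance_class` helper are hypothetical and only illustrate the idea; the real file generated by the installer contains other settings, so check the linked documentation before changing it.

```python
import xml.etree.ElementTree as ET


def set_instance_class(xml_text: str, instance_class: str = "F4") -> str:
    """Return xml_text with <instance-class> set to the given value.

    The element is created under the root if it is not already present.
    """
    root = ET.fromstring(xml_text)

    # Reuse whatever default namespace the root element declares, if any.
    namespace = root.tag[1:root.tag.index("}")] if root.tag.startswith("{") else ""
    if namespace:
        # Keep the namespace as the default one when serializing.
        ET.register_namespace("", namespace)
        tag = "{" + namespace + "}instance-class"
    else:
        tag = "instance-class"

    node = root.find(tag)
    if node is None:
        node = ET.SubElement(root, tag)
    node.text = instance_class
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily shortened stand-in for the real appengine-web.xml.
SAMPLE = """<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <application>example-app</application>
  <instance-class>F2</instance-class>
</appengine-web-app>"""

updated = set_instance_class(SAMPLE, "F4")
print(updated)
```

Note that the default ElementTree parser drops XML comments, so editing the file by hand may be preferable if you want to keep them. Either way, the change only takes effect after the app is uploaded again.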

I'd also strongly encourage you to set up nightly backups with Briefcase via the CLI because, well, App Engine's performance with managing very large files can be miserable. Backups also make it easier for you to safely upgrade to newer versions of Aggregate.

And speaking of newer versions, note that Google will be deprecating the Java 7 runtime in Jan 2019 and I believe Aggregate v1.4.15 uses that runtime. And if that isn't compelling, newer versions of Aggregate have security improvements and improvements to data export.

Finally, if you don't need high resolution pictures, you might want to consider resizing them on the device. https://docs.opendatakit.org/form-question-types/#scaling-down-images has more.

Thank you very much for your reply!

Short introduction: done! :slight_smile:

I thought I had looked everywhere for a solution, but obviously not. Thank you for pointing me in the right direction. :slight_smile:

However, there appears to be a discrepancy between the version number that can be found in the App Engine log (which I used to find "1.4.15 Production"), and the version number found under the preferences tab in Site Admin (which I just found out says "v1.6.1"). Which one is to be trusted? I'm leaning towards 1.6.1 since I do not get any deprecation warning for Java 7 in the App Engine administration. Unfortunately I have no way of verifying this by looking at the original installation media, since I do not have access to the computer used for the original install (I was thrown into the project only two months ago).

Thank you once again for your help!

Don't trust the log. It's a bug :sob: and I've filed it at https://github.com/opendatakit/aggregate/issues/355. We'll take care of it in the next release.

v1.6.1 is pretty recent, so re-run the installer from https://github.com/opendatakit/aggregate/releases. If you are feeling a little adventurous, you can upgrade to v1.7.1 since it's pretty safe.

Thank you! I thought it could be a bug, but did not want to make such a claim without knowing for sure.

May I make a suggestion? Remove/hide the part about looking in the log on https://docs.opendatakit.org/aggregate-upgrade/, since it can't be trusted until 2.0 is released (according to the bug report you filed) - and the other versions, where the version number is correct, use a deprecated Java runtime that will be removed in less than a month anyway. That way future readers reduce their risk of unintentional downgrades.

Is it possible to upgrade directly to 1.7.1 (changing to instance type F4 at the same time), or should I do 1.6.1->1.7.0->1.7.1 if I would like to upgrade?

Thank you once again for your help!

1.6.1 -> 1.7.1 is a safe jump.

@ggalmazor Given the version bug in current versions of Aggregate, is there any other way to tell the version in the log?