for some guidance.
With large datasets, the big question is: where and how are you going to
analyze the results?
Once datasets get large, both the visualization tools and the file export
functionality in ODK Aggregate will stop working (both require holding all
the data in memory), and the submissions list will cease to be
When that happens, there are two options for working with the data:
(1) publish the data to an analysis server (e.g., Fusion Tables or Google
Spreadsheets), or (2) use ODK Briefcase to download the dataset to your
computer, generate the CSVs locally, and do the analysis on those CSVs.
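As a rough sketch of option (2), the snippet below tallies one column of a Briefcase-generated CSV by streaming it row by row, so the whole dataset never has to fit in memory. The function name count_by_column, the file name survey.csv, and the column name district are hypothetical placeholders, not anything defined by ODK; substitute the names from your own export.

```python
import csv
from collections import Counter


def count_by_column(csv_path, column):
    """Stream a CSV one row at a time and tally the values in one column.

    csv_path and column are placeholders for your own export file and
    one of its header names.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # DictReader yields one row at a time, so memory use stays flat
        # no matter how many submissions the file contains.
        for row in csv.DictReader(handle):
            counts[row[column]] += 1
    return counts


if __name__ == "__main__":
    import os

    # Hypothetical file and column names; adjust to match your export.
    if os.path.exists("survey.csv"):
        for value, total in count_by_column("survey.csv", "district").most_common():
            print(value, total)
```

The same row-at-a-time pattern should work for sums or filters; the point is to avoid loading the full file at once, which is the step that breaks down on the server.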
For smaller datasets, ODK Aggregate can be used as a datastore-of-record
for your survey efforts; that is, it can hold all collected records. As
datasets get larger and as the analysis and visualization become too
complex for ODK Aggregate, it may be more appropriate to view ODK Aggregate
as a waypoint in the flow of these records into a more functional data
In that usage scenario, the "Purge Submissions" functionality found on the
Forms Management / Submission Admin sub-tab may be useful. You can remove
older submissions once they have successfully been moved onto the data
The very short answer to your question is that there is no limitation in
the software. It will continue to operate until either:
(1) it runs out of memory, or
(2) it runs too slowly to complete a form submission or other interaction
within 60 seconds (at which point ODK Collect will be unable to submit data
into the system).
If you avoid the visualization and file-export actions and handle the data
via ODK Briefcase or via publishers, very little demand is placed on memory,
so (1) is not a concern. Loading a server with hundreds of forms does consume
memory, however: each form definition is held in memory for performance
reasons. The only remedy, as you go to hundreds of forms, is to go to a
larger machine and larger JVM
Slow operations may begin to appear as the data access queries that filter
the data to a specific set of rows take longer and longer to execute.
There is very little that can be done on AppEngine to speed that up. If you
are running MySQL or PostgreSQL, there are many things a good DBA can do.
Purging the older collected data will also address this issue.
On Wed, Jun 10, 2015 at 11:12 AM, Makhate Makhate email@example.com wrote:
What is the maximum number of records that "Aggregate" can handle? We want
to do a survey, and we have a population of about 2 million and about
500 000 households on average.
We're thinking that we may have to use 8 forms per person;
that's 8 x 2 million = 16 million forms. Can ODK handle that?
University of Washington