ODK Aggregate future plans

Together with my team, I would like to share our ideas on how to improve ODK Aggregate and what its future development and maintenance might look like. We've based our thoughts on the experience we've gained working with ODK on the Poverty Stoplight project over the last few months, and on what we've found on the ODK forum and Slack channels.

First of all, we need to simplify the deployment process. That would speed up our work on the next improvements, and it would also make deployment easier for you once the final version is released. Searching the ODK forum, a majority of Aggregate-related questions and topics concern problems during deployment, for both users and developers, so we believe it is an issue worth addressing.

Fresh UI - we could create a simplified, more user-friendly UI for all Aggregate users. We are aware that the design of the UI hasn't changed much since it was first implemented. We believe that redesigning it would make Aggregate more usable, as user experience expectations have changed over the last few years and the current UI is often misleading and unintuitive. A great example of what can be done is Caden Howell's work on ODK Hamster (https://github.com/benetech/odk-hamster and https://github.com/benetech/odk-hamsterball-java).

A RESTful API is a highly requested change for Aggregate. It would enable easier integration with other tools by providing a modern way of connecting services. Currently, Aggregate uses a mix of technologies: servlets, a REST API for ODK Tables, and GWT. Standardizing on a single, widely used solution would make a huge difference for integration and for maintenance of the code. We could use JAX-RS and Jersey to achieve it, just as is done for ODK Tables. Additionally, once the RESTful API is finished, we would suggest documenting it with, for example, Swagger. Switching to a RESTful API would definitely be time consuming, but the effort is worth it.
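
To make the idea more concrete, here is a rough sketch of what such a resource could look like with JAX-RS, served by Jersey with a JSON provider such as Jackson. The /forms path and the FormSummary type are made up for illustration and are not existing Aggregate classes:

```java
import java.util.Arrays;
import java.util.List;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Hypothetical sketch of a JAX-RS resource that exposes form metadata as JSON.
@Path("/forms")
public class FormsResource {

  // Illustrative payload type, not an existing Aggregate class.
  public static class FormSummary {
    public String formId;
    public String title;

    public FormSummary(String formId, String title) {
      this.formId = formId;
      this.title = title;
    }
  }

  @GET
  @Produces(MediaType.APPLICATION_JSON)
  public List<FormSummary> listForms() {
    // A real implementation would query Aggregate's datastore; Jersey plus a
    // JSON provider (e.g., Jackson) serializes the returned list automatically.
    return Arrays.asList(
        new FormSummary("household_survey", "Household Survey"),
        new FormSummary("water_points", "Water Points"));
  }
}
```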

We could also introduce offices, so there would be no need to run many deployed Aggregate instances when you need to split Aggregate's content (forms and instances). An admin would be able to see all instances and forms. The idea of splitting the content of a single Aggregate instance with permissions depending on user rights was successfully implemented for ODK 2.0 during our work with Benetech, so it shouldn't be difficult to achieve a similar result suited to our needs.

Apart from the above, we could think about changing the authentication to a token-based one instead of the old basic authentication over HTTPS. I've seen this topic being raised here.
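
Just to illustrate the idea (this is a hypothetical sketch, not something Aggregate has today), a token check could start as a simple servlet filter that looks for a bearer token; the validation below is a placeholder that would have to be backed by a real token issuer, such as a JWT verifier or a server-side token store:

```java
import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical bearer-token filter; not part of Aggregate today.
public class TokenAuthFilter implements Filter {

  @Override
  public void init(FilterConfig filterConfig) {
  }

  @Override
  public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
      throws IOException, ServletException {
    HttpServletRequest request = (HttpServletRequest) req;
    HttpServletResponse response = (HttpServletResponse) res;

    String header = request.getHeader("Authorization");
    if (header != null && header.startsWith("Bearer ") && isValid(header.substring(7))) {
      chain.doFilter(req, res);
    } else {
      response.sendError(HttpServletResponse.SC_UNAUTHORIZED);
    }
  }

  private boolean isValid(String token) {
    // Placeholder check; a real implementation would verify a signed token
    // (e.g., JWT) or look the token up in a server-side store.
    return !token.isEmpty();
  }

  @Override
  public void destroy() {
  }
}
```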

Other, smaller things worth considering are:

  • Submission counters to measure and visualize the growth in the number of submitted data records.
  • Filtering of instance IDs based on submission date, as there is currently no such functionality within ODK.

Finally, we could add the possibility to edit data on the Aggregate side. If an office manager or admin sees a mistake in a submission, they could correct it. In fact, whether this is even a desirable feature should be discussed.

I would greatly appreciate your feedback on our ideas.

Thanks,
Jakub

5 Likes

Three things that really popped out to me from the implementer end:

The ability to split the content of a single Aggregate instance with permissions depending on user rights is immensely helpful for implementation in the field.

Submission counters and filtering (or ordering) on date would be helpful as well.

However, I am less enthused about editing data on the Aggregate side--this can cause as many problems as it solves. We really try to do any data cleaning as a separate step so that you can always undo any changes by redownloading the data or starting fresh from the downloaded data. Especially if skip patterns and programming were correct in surveys, it's quite possible that an "error" requiring a "correction" is just a misunderstanding of the survey structure that needs to be explained, not changed.

2 Likes

I agree with you about splitting the content on Aggregate. We've successfully done it in our implementation: we could decide which forms belong to which office, regional office managers could only see their own forms, and the admin was able to see them all and control who sees what by adjusting permissions and assigning managers to particular offices. I believe that increases the usability and versatility of Aggregate, since such a division of content always stays optional rather than being a must.

As for the possibility of editing the collected data, to be honest I would rather keep the already stored data uneditable on Aggregate, unless there are strong votes for such a solution.

2 Likes

I agree with @redlar that simpler deployment, a cleaner UI, and a RESTful API would be huge wins for Aggregate, but before we can do all that, we need to be deliberate about how we move it from UW into the community so we can incentivize contributions from more folks.

I'd propose that we first do the things we've done to all the community-managed tools.

  • Document how to get started (readme, contributing)
  • Standardize on build tools (IntelliJ, Gradle, etc)
  • Introduce continuous integration and codestyle
  • Move and triage relevant issues to Aggregate repo
  • Move and triage the relevant documentation

I think the above will take a month or so, and once we have that in place, it will be easier to bring on contributors to add some basic changes.

  • Upgrade to latest version of JavaRosa
  • Add some basic usage analytics
  • Take on one or two bug fixes
  • Start to simplify deployment for users

Once that's done, we'll have developed some experience in the community about the state of Aggregate, we'll know what the long-standing issues are, and then we can plan a way forward as far as new functionality.

1 Like

UW is happy to hand off Aggregate support at any time.

It is important to note that Benetech's ODK Hamster is designed for the ODK 2.0 tool suite, which is already based on RESTful APIs. The ODK 2.0 tool suite uses a RESTful protocol, versus the OpenRosa standards used for the ODK 1.x tool suite.

My personal opinion is that a fresh UI should be done that abandons GWT, so I am in agreement with Jakub. I also personally do not think Aggregate needs to support both the 1.x and 2.x tool suites (which will eventually be renamed), as I think a micro-services approach using something like Docker is preferable. The advantage of micro-services is that deployers will be able to run the subset of the cloud services a deployment actually needs by spinning up only the components being used. It also makes it much easier to be modular and to mix and match services. It also avoids huge codebases like Aggregate, which over the years has become harder for people to understand because strict modular design was not enforced. A micro-services approach is modular by design, and since the various parts run in separate processes, it forces developers to adhere to a modular system.

As part of the ODK 2.0 tool suite release scheduled for September, the new Sync-Endpoints for ODK 2.0 will be released. If you would like to check out the RC1 version of the Sync-Endpoints, see the RC1 announcement for instructions.

Additionally, @linl33 is currently modifying a scaled down version of https://github.com/benetech/odk-hamster to be part of the Sync-Endpoint micro-service cloud that will replace the GWT UI carried over from Aggregate. The ODK 2.0 community is appreciative of Benetech's contribution and hopes to have integrated a portion of ODK Hamster into the Sync-Endpoint micro-service cloud by the RC2 release in August.

3 Likes

Thanks for the go-ahead, Waylon. I'll get started on moving the issues over from the opendatakit repo to the aggregate repo and work with Jakub on standardizing on the build tools and getting CI up and running!

I'll update this topic with any major changes to the plan I posted above.

Oh, and one more thing!

If you are familiar with Java EE and you'd like to help, please reply to this thread. The more people contribute, the faster we can improve Aggregate.

1 Like

Definitely time to hand off ODK Aggregate.
No plans for any changes to the codebase from the UW team.

Before any restructuring, I recommend:

(1) An end-of-life announcement for Google Standard Environment AppEngine and Datastore at the 1.4.15 release. This is an overly restrictive, painful system to work in, and it has imposed many limitations on the code, such as the inability to dynamically sort submission results by any data column.

Users will then need to use ODK Briefcase to migrate data off of that system into future releases.

(2) Jettison Google Datastore support. Change the Google environment to use Google CloudSQL.

I would also encourage abandoning MySQL support. The archaic 65kB row-size limit in MySQL is burdensome to support. PostgreSQL and MS SQL Server offer 1MB row sizes, which are much more reasonable.

In particular, with a 1MB row and by jettisoning Datastore support, you can offer display and sorting of submissions by any arbitrary data column, with the caveat that sorting would only be available for the first 3000 columns.

With MySQL's 65kB rows, you are limited to as few as the first 256 columns without considerable codebase expansion and effort, which strikes me as entirely inadequate.

If you intend to revise the server to support dynamic form schema changes, then you should change to a key-value-store implementation. This would be a major re-write.

Regardless, you should change the VERSION column to be a text field, as per the spec (it is currently an integer field).

(3) Move to Docker packaging for the Google Flexible Environment AppEngine and for alternative hosting environments.

(4) Moving off of Servlet 2.5 to newer async Servlet technology (3.0 or higher) is highly desirable.

Coupled with the abandonment of Datastore, this would allow streaming processing of MIME content, which would reduce memory requirements for binary attachment retrieval and form attachment submission and increase the memory efficiency of the server. This could also benefit the publishers (though the rewrite there would be trickier).
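
As a rough sketch of what that could look like on a Servlet 3.0+ container (the servlet, URL pattern, and openAttachment helper below are hypothetical, not Aggregate code), attachment retrieval could be streamed in fixed-size chunks instead of buffering the whole blob in memory:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical sketch of async, streaming attachment retrieval.
@WebServlet(urlPatterns = "/attachments/*", asyncSupported = true)
public class AttachmentServlet extends HttpServlet {

  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
    AsyncContext async = req.startAsync();
    async.start(() -> {
      resp.setContentType("application/octet-stream");
      try (InputStream blob = openAttachment(req.getPathInfo());
           OutputStream out = resp.getOutputStream()) {
        byte[] buffer = new byte[8192];
        int read;
        // Copy in fixed-size chunks so the whole attachment never sits in memory.
        while ((read = blob.read(buffer)) != -1) {
          out.write(buffer, 0, read);
        }
      } catch (IOException e) {
        resp.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
      } finally {
        async.complete();
      }
    });
  }

  // Placeholder: a real implementation would open the blob from the database or blobstore.
  private static InputStream openAttachment(String pathInfo) {
    return new ByteArrayInputStream(new byte[0]);
  }
}
```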

(5) Eliminate CSV and KML document generation on the server. This is complex and would be better supported in ODK Briefcase.

(6) A RESTful API to replace the internal GWT RPC is useful.

GWT was initially chosen because there was no clearly superior web UI layer toolset/language, and it was a better use of developer resources to leverage Java coding skills for the UI layer than to choose a different technology and maintain an entirely different codebase.

The downside of this is that you will need UI developers for the new UI layer built on top of this RESTful API. The plus side is that you could consider translations. If you did more form processing, you could also expose a RESTful API that would provide localizations of the choice-list values as display strings (rather than their stored value strings).

This multi-server architecture (server for UI layer; server for RESTful layer -- not yet a micro-services architecture) was not possible in the Google Standard Environment AppEngine deployments as they were first released.

(7) JAX-RS based RESTful API.

Note that this is problematic when used for data object annotations if you want to access those objects from a native Android Java app. Android does not support most of the JAXB functionality because Android actively excludes XML support. That means most of these annotations cause compiler failures, at least for older Android API levels.

For the 2.0 tools, we needed to use the Jackson annotations to provide RESTful object library portability to the Android tools. While the server can vend both XML and JSON content, the Android implementation only handles JSON content.

I therefore advise avoiding JAX-RS annotations in the data object definitions; use Jackson annotations instead.

That said, JAX-RS for the service layer definitions is OK. I don't know how this would impact Swagger.
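
A minimal sketch of that split, with made-up names (SubmissionSummary and SubmissionService are not existing Aggregate types): Jackson annotations stay on the shared data object so it can also compile in an Android client, while the JAX-RS annotations live only in the server-side service definition:

```java
import java.util.List;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;

// Shared data object: Jackson annotations only, so the same class can be reused
// in an Android client without pulling in JAXB/JAX-RS. Names are illustrative.
class SubmissionSummary {
  private final String instanceId;
  private final String submissionDate;

  @JsonCreator
  SubmissionSummary(@JsonProperty("instanceId") String instanceId,
                    @JsonProperty("submissionDate") String submissionDate) {
    this.instanceId = instanceId;
    this.submissionDate = submissionDate;
  }

  @JsonProperty("instanceId")
  public String getInstanceId() {
    return instanceId;
  }

  @JsonProperty("submissionDate")
  public String getSubmissionDate() {
    return submissionDate;
  }
}

// Service layer: JAX-RS annotations stay here, on the server only.
@Path("/submissions")
interface SubmissionService {

  @GET
  @Produces(MediaType.APPLICATION_JSON)
  List<SubmissionSummary> list();
}
```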

I don't remember the specifics, but I think I encountered a problem with Jersey on the Servlet 2.5 container of Standard Environment AppEngine.

Also, the AppEngine standard environment has a bad habit of screwing up the request headers, particularly around gzip content encoding. Hopefully the dynamic environment doesn't screw those up. For these reasons, the current Aggregate sync code uses Wink, which is sub-optimal.

(8) Chunked (streaming) compression.

Because of a slew of Google Standard Environment AppEngine issues and the documentation required for Servlet 2.5 deployments, the server code does not use chunked (streaming) gzip encoding, but an older-style full-gzip content encoding.

This should be changed to use chunked encoding.
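
For illustration, assuming the Servlet 3.1 API and leaving getWriter() handling aside, a streaming gzip wrapper is not much code; because no Content-Length is set, the container falls back to chunked transfer encoding and compressed bytes can be flushed as they are produced:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.WriteListener;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;

// Hypothetical sketch: compress the response as it is written, never set a
// Content-Length, and let the container use chunked transfer encoding.
public class StreamingGzipFilter implements Filter {

  @Override
  public void init(FilterConfig filterConfig) {
  }

  @Override
  public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
      throws IOException, ServletException {
    HttpServletRequest request = (HttpServletRequest) req;
    HttpServletResponse response = (HttpServletResponse) res;

    String acceptEncoding = request.getHeader("Accept-Encoding");
    if (acceptEncoding == null || !acceptEncoding.contains("gzip")) {
      chain.doFilter(req, res);
      return;
    }

    response.setHeader("Content-Encoding", "gzip");
    // syncFlush = true so partial output is pushed through as it is produced.
    GZIPOutputStream gzip = new GZIPOutputStream(response.getOutputStream(), true);
    try {
      chain.doFilter(req, new HttpServletResponseWrapper(response) {
        @Override
        public ServletOutputStream getOutputStream() {
          return wrap(gzip);
        }
      });
    } finally {
      gzip.finish();
    }
  }

  // Adapts a plain OutputStream to the ServletOutputStream API.
  private static ServletOutputStream wrap(OutputStream out) {
    return new ServletOutputStream() {
      @Override
      public void write(int b) throws IOException {
        out.write(b);
      }

      @Override
      public boolean isReady() {
        return true;
      }

      @Override
      public void setWriteListener(WriteListener listener) {
      }
    };
  }

  @Override
  public void destroy() {
  }
}
```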

If Docker containers for the servers are provided, the container configuration to support this would be automatic, and therefore the documentation burden could be avoided.

(9) Authentication. The server uses Spring Security for its authentication and authorization stack. This can support multiple authentication mechanisms. Aggregate 1.4.15 supports digest auth or basic auth (either/or, not both), non-secure OAuth2 Google tokens, and out-of-band authentication (OAuth 1.0 on AppEngine). Because it supports digest and basic auth, it also uses session cookies. You can't really change this unless you update the OpenRosa spec to move away from digest auth.
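
For reference only, the shape of the Spring Security Java configuration involved is small; the sketch below is an illustrative basic-auth-over-HTTPS setup in the Spring Security 4/5 style, not Aggregate's actual configuration (which also wires digest auth and the other mechanisms):

```java
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;

// Illustrative sketch: every request requires authentication, credentials are
// supplied via basic auth, and HTTPS is enforced so they are not sent in the clear.
@Configuration
@EnableWebSecurity
public class BasicAuthSecurityConfig extends WebSecurityConfigurerAdapter {

  @Override
  protected void configure(HttpSecurity http) throws Exception {
    http
        .authorizeRequests()
            .anyRequest().authenticated()
            .and()
        .httpBasic()
            .and()
        .requiresChannel()
            .anyRequest().requiresSecure();
  }
}
```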

(10) Testing AppEngine is critical. In particular, testing in a high-latency/low-bandwidth environment will yield different store-and-forward interactions than testing in a high-bandwidth environment.

Good luck; it is a complete mess. Hopefully the dynamic environment is not as badly broken as the standard environment. Note, too, that the development server behaves differently from the actual AppEngine servers, so you need to test against the real servers to identify the real issues there, not just within the dev environment. Getting it to work in both places is not fun.

5 Likes

Thanks to everyone who has chimed in here and especially to @Mitch_S and @W_Brunette for sharing their invaluable experience!

Sadly for us but exciting for him, @redlar has been busy with travel. We miss you @redlar and hope you can share your experience with the ODK community again at some point!

I'd like to introduce everyone to @ggalmazor who has joined the Nafundi team and has fearlessly started making Aggregate improvements. His priorities are to make Aggregate easier to build and deploy and to investigate recurring issues. As he takes the lead on those aspects, he’ll be soliciting feedback and opinions here and on GitHub issues. Welcome, @ggalmazor!

The first big task @ggalmazor has taken on is migrating the project to use Gradle as the build tool and recommending IntelliJ as the development environment. These changes make Aggregate easier to build locally and provide some standardization across ODK projects. That work is available for review at https://github.com/opendatakit/aggregate/pull/112

3 Likes

Hi everyone!

I wanted to give a quick heads-up about an initiative we're working on. We are trying to mitigate the data corruption problems users are currently having with Aggregate, especially when it is deployed on AppEngine.

We are working on two fronts:

  • First, stabilize Aggregate's behavior when data is corrupted, preventing what we've colloquially named "refresh hell" (the UI starts refreshing, and it's impossible to interact with Aggregate) and other adverse effects. (#154)
  • Second, create a new Database Repair tool that lets users know which kind of data corruption they're suffering and apply fixes. (#135)

Unfortunately, Aggregate can suffer from a range of possible data corruption causes, and our efforts to solve this problem are quite limited. We can only work on things we are already aware of (through feedback, experience, etc.) or that are self-evident in the code, but we suspect there are many more undocumented causes that we don't know about. We need the community's help here to move forward.

We'd like to collaborate with Aggregate users who are currently having data corruption problems to test our work in new scenarios and improve our efforts to make Aggregate much more stable. That's why we're planning to start a beta program for Aggregate.

It's important to note that participating in this upcoming Aggregate beta program will involve installing, well, beta versions of Aggregate that could damage data or produce data loss.

We will write a post with all the details in the next few days. In the meantime, it would be great to have your feedback and questions about this here.

2 Likes

To defend Aggregate a little bit, I don't think the database glitches that I'm aware of (constant refreshes, duplicate entries in repeats) are all that common.

We see it show up occasionally on the forum, but I've only seen it once or twice in the many years I've been installing Aggregate for campaigns. Further, Aggregate has been downloaded more than a million times and people do successfully get their data in and out. That is, I'm not aware of users losing data while using Aggregate.

I'm super excited about shipping the repair tool because:

  1. It'll give us real visibility into what, if any, problems exist in the production versions of Aggregate.
  2. It will also give users the ability to fix the common problems that Mitch and I had to occasionally fix manually.

And maybe that's how we structure the beta. The first beta is to show where there are inconsistencies in the database, then the second beta is to try to fix those inconsistencies.

For those following this topic, ODK Aggregate v1.5 Beta just shipped and your feedback would be greatly appreciated!

1 Like