Issue accessing data on Aggregate

Hi all -- I've been AWOL from the community for a while as year-end work piled on, along with some personal items to attend to. I'm in a bit of a bind at the moment on a project I'm working on in Ethiopia. I appreciate any support that can be lent -- I understand we are entering the holidays.

What is the problem? Please be detailed.
When trying to pull data from Aggregate using Briefcase, we get a 'failed' status after the same instance each time.

When trying to view/access data directly from Aggregate, we get the following error, but the uuid is different each time.

Error: Problem persisting data or accessing data ([RepeatSubmissionType.getValueFromEntity] SQL: SELECT * FROM opendatakit.PRIME_IE_HOUSEHOLD_V2_M6_M6R1 WHERE _PARENT_AURI = uuid:bd43de82-5fa7-4798-b19a-21a5c6d08fbd ORDER BY _PARENT_AURI exception: This request (c9344201b4e87ec9) started at 2017/12/20 21:12:23.531 UTC and was still executing at 2017/12/20 21:13:23.311 UTC.)

What ODK tool and version are you using? And on what device and operating system version?

We are running the survey on the most up-to-date Collect, and using Aggregate v1.4.9.

What steps can we take to reproduce the problem?

I can provide developer access to replicate this on the server to an ODK team member.

What have you tried to fix the problem?

For uuids which we had already downloaded, but which were showing up as an error, I deleted them directly from Aggregate using the 'filter' option under the 'submissions' tab. This did not work.

I found the following resource on GitHub --> https://github.com/opendatakit/opendatakit/wiki/Aggregate-AppEngine-Troubleshooting which seems to describe the issue, but when following the instructions I cannot seem to locate the specific 'Kinds' which are noted under repairing the form definition table.

Anything else we should know or have? If you have a test form or screenshots or logs, attach here.

The form is very large, with a number of repeat groups and nested groups. Equally, internet is very poor, with teams often dropping and regaining connectivity while submitting forms.

Hi @Lloyd_Banwart, we have a fix pending for this issue, but it needs to go through more testing before we release. If you give me developer access, I'm glad to apply the fix manually. I'll send you an email with instructions in a few minutes...

This is me sending a digital hug and much thanks.

To bring this to a close: Yaw offered generous assistance to initially overcome this issue. It seems that if I had been running the most up-to-date version of Aggregate, the likelihood of this occurring would have been much lower. After the issue was addressed, I disabled submissions and updated Aggregate to the latest version. The issue has not recurred since then (about 2,500 more submissions).

Thanks ODK community!

~lloyd

Glad to hear it worked out, @Lloyd_Banwart!

You are correct that upgrading to 1.4.11 or moving to a bigger machine (F4/B4) would have reduced the probability of this happening. And although it doesn't happen often, I wanted to write up a summary of what I did to fix the problem and what we'll be doing in future releases.

First, I verified this was a problem by downloading the relevant form with Briefcase. This also helped create a backup in case something went wrong. The Aggregate and Briefcase logs reported that there was indeed a problem. Then I followed the instructions at Aggregate-AppEngine-Troubleshooting to try to repair the form table.

The deviations from the above instructions were:

  1. The Google Cloud Datastore Admin page kept reporting a 500 error so I could not back up the Aggregate database! I'm not sure why this happened, but it made this repair a bit riskier. I didn't have any other options, so I had to be extra careful.

  2. The Google Cloud Datastore's query by kind UI did not have a complete list of all the tables on the server. This was really surprising. As a workaround, I uploaded the form to my local Aggregate install, then used that as a reference to identify all tables. Then I used query by GQL to list each one and check for duplicates.

  3. There were about 80 tables in all, and maybe 40 of them had duplicates! Most had 4-5 and the biggest had 20 or so. The rough process was to find the table in the DB that had the repeats, use the copytables Chrome extension to copy the table, put it in Excel, identify the dupes, then copy each of their UUIDs, and delete them one at a time. It's pretty dreary work.

  4. After that, I ran Briefcase again to make sure I didn't miss duplicates and that was that. The whole process takes a couple of hours...
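For anyone who ends up doing the same cleanup, the "identify the dupes" step can probably be scripted instead of done by eye in Excel. Below is a minimal sketch, not what I actually ran. It assumes the copied table has been saved as CSV, that a duplicate means two rows sharing the same `_PARENT_AURI` and `_ORDINAL_NUMBER`, and that `_URI` and `_LAST_UPDATE_DATE` hold the row id and an ISO 8601 timestamp. Apart from `_PARENT_AURI`, which appears in the error message above, those column names and the definition of a duplicate are assumptions to check against your own tables. The function name and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def find_duplicate_uris(rows,
                        key_fields=("_PARENT_AURI", "_ORDINAL_NUMBER"),
                        id_field="_URI",
                        date_field="_LAST_UPDATE_DATE"):
    """Return ids of rows to delete, keeping the newest row per key.

    The column names are assumptions; adjust them to the exported table.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[f] for f in key_fields)].append(row)

    to_delete = []
    for group in groups.values():
        if len(group) > 1:
            # ISO 8601 timestamps in one format sort chronologically as text.
            group.sort(key=lambda r: r[date_field])
            # Everything except the last (newest) row is a deletion candidate.
            to_delete.extend(r[id_field] for r in group[:-1])
    return to_delete


# Made-up sample standing in for a table copied out of the Datastore UI.
sample = """_URI,_PARENT_AURI,_ORDINAL_NUMBER,_LAST_UPDATE_DATE
uuid:a1,uuid:p1,1,2017-12-20T21:10:00Z
uuid:a2,uuid:p1,1,2017-12-20T21:12:00Z
uuid:a3,uuid:p1,2,2017-12-20T21:12:00Z
"""
rows = list(csv.DictReader(io.StringIO(sample)))
print(find_duplicate_uris(rows))
```

The script only lists candidates. The deletes themselves still happen by hand in the Datastore UI, which is safer when there is no backup to fall back on.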

So although this is rare, the fact that it takes so much time to fix makes it a priority for the next release of Aggregate. That release will have these fixes.

  1. Instead of crashing when there are duplicate or missing rows, we will export the most recent duplicate. This change has been finished and is at https://github.com/opendatakit/aggregate/pull/153.

  2. Generate a report of any issues with submissions. Users can share this report with devs so we have better insights into what is going wrong with the data store. @ggalmazor will be working on this and he'll have an issue filed soon!

  3. Alert users when there is a new release. I've filed an issue here https://github.com/opendatakit/aggregate/issues/185 so we don't forget.
