Problem persisting data or accessing data (Somehow DB entities for publisher got into problem state)

What is the problem? Please be detailed!

I recently upgraded from 1.3.4 -> 1.4.2 -> 1.4.3 -> 1.4.4 -> 1.4.14 as described in https://github.com/opendatakit/opendatakit/wiki/Aggregate-Release-Notes#general-upgrade-steps.

Soon after logging in to ODK Aggregate after the update, I’m getting the error: Error: Problem persisting data or accessing data (Somehow DB entities for publisher got into problem state)

What ODK tool and version are you using? And on what device and operating system version?

ODK Aggregate 1.4.14 on Google Cloud Platform.

What steps can we take to reproduce the problem?

Log into our server at https://pma2020gh.appspot.com/. If you would like to do that, let me know and I can provide credentials.

What you have you tried to fix the problem?

Anything else we should know or have? If you have a test form or screenshots or logs, attach here.

After upgrading the server, we wanted to delete all data but keep our 'appspot.com' URL. Since there is no programmatic way to delete form submissions, we deleted those manually. We also purged submissions.

This error appears as a popup alert message on login, and as a flash message at the top of the screen after clicking ‘ok’ in that alert. It goes away if I navigate or refresh the page, but sometimes comes back.

There’s one form on the server I can’t delete, and some submissions showing which I also cannot delete for that form.

I've examined the logs, but they didn't appear helpful to me. I also examined 'error reporting' on google cloud (different module from 'logging') and saw the following error, which I'm not sure is related, but also occurred after upgrading the server: Type 'org.opendatakit.common.security.client.RealmSecurityInfo' was not assignable to 'com.google.gwt.user.client.rpc.IsSerializable' and did not have a custom field serializer.For security purposes, this type will not be serialized.: instance = org.opendatakit.common.security.client.RealmSecurityInfo@1edab541

I've posted the .json from the log related to the first error that is the topic of this post, as well as the stacktrace related to this RealmSecurityInfo error here in an open Github issue: https://github.com/opendatakit/opendatakit/issues/1106

I have also tried to follow the instructions for manual resolution found here: https://github.com/opendatakit/opendatakit/wiki/Aggregate-AppEngine-Troubleshooting#repairing-the-form-definition-table

However I am seeing that there are no "opendatakit._" (e.g. 'opendatakit._form_info_submission_association') namespaced tables on my server. I scrolled all the way to the bottom and none of them are there. The only kind of tables that appear are:

  • _ah_SESSION
  • _STRING_FIELD_LENGTHS_
  • TASK_LOCK
  • _COST_LOGGING_
  • Many, many tables for what appear to be all form definitions, of a name like "opendatakit.<form_name>" where <form_name> is a form name string of character set [A-Z0-9_].

This leads me to believe that something went seriously wrong when the server was upgrading.

We would just delete the instance and create a new one, but then we would lose our URL ( pma2020gh.appspot.com in this instance), as it appears that disabling/deleting an app takes 30 days, and we can't wait that long. All of our people who use this server are familiar with this URL, so we'd like to keep it.

This is a hard one, Joe. You've done a lot of troubleshooting already and so thanks for being so detailed with your findings!

I don't know what, if anything, went wrong with the migration, but if you want to solve the immediate problem, here's one dangerous, untested, and foolish thing to try.

Create a new App Engine instance with a random URL that you don't care about. Install Aggregate on that new instance with a few forms and submissions. In the Cloud Console, delete all the DB tables that Aggregate has created. Your Aggregate instance should now to be very broken. Re-install Aggregate and I believe you should have a working install with no data and the same random URL.

If that process works, then you have a workaround. But because this is a dangerous, untested, and foolish thing to try, I would not do this on your production server without a confirmed working backup and being OK with potentially losing that URL.

To be even clearer, the point of doing what I suggested above on the random URL/server is to test this technique to make sure it works. If you are convinced that it works, then you can consider it for your production server.

I have had this problem a number of times and have resorted to the technique Yaw mentions. Essentially, I use the Google Cloud Console and use the database admin tool to delete all the tables pertaining to form definitions and form fields. These all have the same naming pattern, so they are easy to spot. I do not delte any other tables - so if they do not have the repetitive form (all starting with the same string), then I do not delete those.

As I said, I have had to resort to this a number of times and it has always worked and (touch wood), it has never broken anything........

1 Like

Thanks for the replies.

@Mark_Schormann Thanks for the comments. I would definitely try that, except for the fact that virtually all non-form-instance tables somehow wound up deleted in the process.

@yanokwa That sounds like a good idea, testing a DB wipe and a re-install, assuming we absolutely must keep the same appspot.com URL. I also agree with your recommendation that this is highly ill-advised, and it would be better for us to do a new deployment, so that is what we will do.

1 Like