More rapid automatic updates of forms

What high-level problem are you trying to solve?
Dynamically linking data across forms in near real-time.

Short Version
We want to be able to automatically check for and update forms in ODK Collect at intervals shorter than 15 minutes. Right now the available intervals are 15 minutes, 1 hour, 6 hours, and 24 hours. A 1-minute interval would allow users to collect data about the same entity using different forms, linked by a common ID, with less lag.

Use Case 1 - Registration Data
Consider a health post that has an intake form which captures only very basic information (name, age, etc.). It generates a patient or case ID number, ${case_id}, which will link this case across everything that happens afterward.

Once a patient sees a nurse, the nurse fills in a form describing the patient's complaints, etc. In that form, we use "select_one case_id" to select from the case IDs that have been generated via the intake form.

With automatic form updates in ODK Collect, this works very nicely. But the fastest automatic updates check every 15 minutes, and it would be helpful if there were a faster option (e.g. every minute).

Use Case 2 - One HH, Multiple Respondents
Consider a household (HH) survey that seeks information from a male and a female adult. Interviews need to be conducted by enumerators of the same sex as the respondent, and data needs to be linked between the male and female respondents and the broader household questions.

We always avoid manual entry of the unique IDs that will later be used to link responses from two datasets, because the error rate is too high and the errors are difficult to resolve. So in this use case we could have a household introduction survey with a household roster. Once submitted, that form generates a Household ID, linked to the name of the head of household.

Now, when the female enumerator selects the female respondent to interview, they can select from a menu of household IDs, which will link her response to the household. They can also select, within a household, her roster/member ID, to link her to information captured about her during the household roster.

Broader Usage
Both these use cases work off-the-shelf with Ona, which allows linking a dataset (updated dynamically) to a form. But the 15-minute lag is problematic.

This change would also benefit non-Ona users: it allows the same dynamic linking for any user who is able to download data from one form and update the .csv attached to secondary forms via API.
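
For non-Ona users, the step of turning downloaded submissions into the .csv attached to a secondary form could look roughly like the sketch below. It assumes the submissions have already been fetched as a list of dictionaries. The field names `household_id` and `head_name`, the function name `build_choices_csv`, and the `name`/`label` column headers are all assumptions for illustration; the download and upload calls are left out because they depend on the server in use.

```python
import csv
import io


def build_choices_csv(submissions):
    """Return CSV text with one row per unique household ID.

    `submissions` is a list of dicts; the keys `household_id` and
    `head_name` are hypothetical field names from the first form.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "label"])  # assumed column headers
    seen = set()
    for row in submissions:
        household_id = row["household_id"]
        if household_id in seen:
            continue  # keep only the first occurrence of each ID
        seen.add(household_id)
        label = f"{household_id} - {row['head_name']}"
        writer.writerow([household_id, label])
    return buffer.getvalue()
```

The resulting text would then be uploaded as the attachment of the secondary form, so that devices pick it up on their next form update.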

Any ideas on how ODK could help you solve it?
Allow users to opt in to automatic form updates using a more frequent interval (1 minute).

Upload any helpful links, sketches, and videos.
This seems related to discussions and implementation of entity-based data collection here. However, I think it is a simpler solution for some specific use cases and users, and (I think) would work without an ODK server. From the little I've read on entity-based data in ODK, it requires a backend approval/review of the data, which is comparatively labor-intensive and also slows down the ability to use the Follow-Up Form.

Many surveys are done in remote areas, so solutions for form/entity references would need to work locally, without a server or internet. Examples include basic household data (or a household roster), parallel interviews of males and females, and even weighing/measuring children aged < 5 years.

Hint: Selecting a HH ID does not guarantee correct data. For example, we encountered a case where the interviewer swapped two household IDs, so the real member data were cross-linked. We think IT will remain only a (powerful) tool; your enumerators and local supervisors stay your most essential resource for high data quality.

Having this ability to work locally without internet would be great, but that seems like a much bigger challenge.

This option uses some existing functionality and would benefit a number of cases. If you expect your enumerators might work in an area without internet, you could always have an option to enter the HH/case ID manually into the secondary forms. Then most enumerators can benefit from the pre-loaded choice list and it reduces cleaning time and data entry errors.

Thanks for sharing your need and describing your use case in detail!

The short answer is that higher-frequency automatic form updates in Collect is something we'll consider but is not high on our priority list at the moment. We are putting the pieces in place to do offline entity workflows and want to focus on that. As @wroos says, we know it's essential for many workflows.

People do certainly use entity lists or attached CSVs that update automatically to be accessed in follow-up forms just as you describe. What I've seen form designers do is add a note at the beginning of the form that says something like ":recycle: Make sure you have updated the form before starting". You can also put household selection as the first question so that the data collector can immediately verify whether they have access to the household they need before getting far into the survey. If they don't see the household they need, they can exit the form and refresh.

There are a couple of reasons that increasing the automatic refresh rate may not be as simple as it seems.

First, the Android mechanism we use to do these updates is limited to a minimum interval of 15 minutes. For higher-frequency updates we'd have to add a different approach.
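
As a rough model of why simply adding a 1-minute setting would not take effect, the sketch below treats the scheduler as enforcing a 15-minute floor. It assumes shorter requests are raised to the floor rather than rejected, which the post does not specify; `MIN_INTERVAL_MINUTES` and `effective_interval` are hypothetical names, not part of Collect or Android.

```python
MIN_INTERVAL_MINUTES = 15  # floor described in the post above


def effective_interval(requested_minutes):
    """Return the interval a floor-enforcing scheduler would actually use.

    Assumption: requests below the floor are raised to it, not rejected.
    """
    return max(requested_minutes, MIN_INTERVAL_MINUTES)
```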

Second, downloads of attached CSVs are currently all or nothing. That is, if you have 20,000 rows in your CSV and add one, you'll have to download all 20,001 rows again. That makes the automatic updates heavy operations for the server. One of the things we're working on with entities is allowing progressive updates so that devices only need to download rows that have changed or been added.
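
To illustrate the difference between all-or-nothing and progressive updates, here is a minimal sketch that works out which rows a device would need if only added or changed rows were sent. It is not how entity updates are implemented; the function name `changed_rows` and the `name` key column are assumptions for illustration.

```python
def changed_rows(old_rows, new_rows, key="name"):
    """Return the rows in new_rows that are new or differ from old_rows.

    Rows are dicts; `key` is the column that identifies a row.
    Deleted rows are not reported by this sketch.
    """
    old_by_key = {row[key]: row for row in old_rows}
    return [row for row in new_rows if old_by_key.get(row[key]) != row]
```

With 20,000 existing rows and one added, this yields a single row to transfer instead of all 20,001.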

Hi Helene. Thanks for this response and for clarifying the challenges. I did not realize the update frequency was limited to 15 minutes by Android. And while the size of downloaded CSV files isn't an issue for us right now, it would be in the long run, so having the option to only download new rows of the CSV will be really helpful.

I think for our purposes, we can probably rig up an alternative using Tasker to trigger ODK to update the forms more frequently. If we use this option, I'll share the Tasker profile and other details here in case they are useful for others.

Thanks for your insights!
