Update an Entity from a form submission with server-side conflict detection and resolution

Entities are a way to represent people, places and things that can be shared between forms and updated over time. They make it easier to do things like longitudinal data collection or case management. You can learn more in the documentation.

Currently, forms can create Entities but any other Entity management must be done from Central. Over the next few months, we will be expanding how form submissions can interact with Entities to work towards fully offline workflows.

Our first step will be to make it possible for forms to update Entity properties so you can share information about an Entity's status across forms. For example, you will be able to update the Entity each time a workflow step is completed to enforce a sequence.

Updates are challenging when form users could be offline for some time! When users are offline, they could see out-of-date Entity data and even make an update that conflicts with someone else's. In this initial work, updates will only be applied on the server. We will then add offline updates to Collect.

:zap: As we develop our approach to Entity updates, we'd like your feedback. We'll use this thread to share our high-level approach and answer any questions you might have.

Key assumptions and context

Field and office distinction

Many projects that would benefit from Entities involve a strong distinction between “field” and “office”. Often that’s because the individuals working in the office are different from those in the field, but it can also reflect different modes that the same individual is in (e.g. in the field, I want to capture as much as possible; in the office, I want to take my time to make sense of what happened in the field across time and space).

There are a few common update types

We intend ODK Entities to be minimal representations of real world people/places/things and NOT 1:1 matches with form submissions. Entities are more about driving workflows and less about managing data. These are the update types we expect:

  • State update to drive workflow: updates a small number of Properties that represent an Entity’s current state. This will be used to drive workflows by determining which forms see the Entity, which questions get asked about the Entity within a form, etc.
  • Error fixing: updates an unchanging property that could have had an error in it (e.g. a birthdate)
  • Rare update: updates a property that changes rarely (e.g. a first name)
  • Non-overlapping updates: different forms update separate properties of an Entity. For example, one form captures a tree’s circumference and another captures its height because the measurement process is different and it makes sense to have separate forms.

Broadly, we expect that Entities will generally have few properties and that it will be rare for a single form to use or update all of an Entity's properties.

Conflicts happen in specific contexts

Given the kinds of updates we anticipate, we believe it will generally be rare for two or more offline users to modify the same Entity. Contexts where we believe conflicts are likely:

  • Offline, multi-enumerator workflow: Entities go through multiple offline steps carried out by different individuals
  • Unpredictable encounters: highly dynamic workflows where field staff are offline for a long time and Entities can be encountered at any time
  • Process problem: a workflow error happens (e.g. a patient goes to two clinics the same day for the same reason)

Goals for Entity updates

  • Allow field staff to continue making progress even if they can’t get the latest Entities
  • Make an effort to provide the latest Entity data even when conflicts have been detected
  • Display detected conflicts to the Central user. This covers two cases: a submission that updated an Entity based on an outdated version without touching the same properties as other updates, and multiple submissions based on the same Entity version that updated the same property or properties.
  • Give office staff the responsibility for making difficult decisions around conflict resolution. They can view full history and coordinate between field staff involved in the conflict

XLSForm specification

The entities sheet

| list_name | entity_id | update |
| --- | --- | --- |
| trees | ${tree} | true() |

  • list_name: the target Entity List to update an Entity in.
  • entity_id: the id of the Entity to update in the Entity List. Form designers are responsible for writing an expression that evaluates to the UUID of an Entity in the specified Entity List.
  • update: condition for applying the update specified by this submission. Examples: a failed visit might not need to lead to an update or a form may specify non-overlapping conditions for creation and update.
  • label: optionally, an update can change an Entity's label (this can be very powerful! For example, you could add :white_check_mark: :yellow_circle: or :x: to the front of an Entity's label depending on its status after the current form is submitted)

The survey sheet

Entity properties to update are specified by the same save_to column used in an Entity creating form:

| type | name | label | save_to |
| --- | --- | --- | --- |
| select_one_from_file trees | tree | Which tree? | |
| start | start | | latest_visit_start |
| decimal | circumference | Circumference | latest_circumference |
| text | weather | What’s the weather? | |
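
To make the role of save_to concrete, here is a minimal Python sketch of how a submission to the form above might be turned into an Entity update. It is only an illustration: `SURVEY_ROWS`, `build_entity_update` and the sample values are hypothetical names invented for this example, not Central's implementation, and the update condition is passed in as an already-evaluated boolean.

```python
# Hypothetical, simplified model of the survey sheet above: only the
# name and save_to columns matter for building the update.
SURVEY_ROWS = [
    {"name": "tree", "save_to": None},
    {"name": "start", "save_to": "latest_visit_start"},
    {"name": "circumference", "save_to": "latest_circumference"},
    {"name": "weather", "save_to": None},
]


def build_entity_update(survey_rows, submission, entity_id_field, update_condition=True):
    """Return (entity_id, properties) for an update, or None if the
    update condition evaluated to false for this submission."""
    if not update_condition:
        return None
    # Only fields with a save_to value contribute to the Entity update.
    properties = {
        row["save_to"]: submission[row["name"]]
        for row in survey_rows
        if row["save_to"] and row["name"] in submission
    }
    return submission[entity_id_field], properties


# Assumed sample submission; the id value is made up.
submission = {
    "tree": "abc-123",
    "start": "2023-10-01T09:00:00Z",
    "circumference": "2.4",
    "weather": "sunny",
}
entity_id, properties = build_entity_update(SURVEY_ROWS, submission, "tree")
```

In this sketch the weather answer stays in the submission but never reaches the Entity, because its save_to cell is empty.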

We are iterating on the XForms spec and more details around the XLSForm spec in this Google doc. If you are interested in diving into those details, please leave comments in that document!

Conflicts

We currently plan to automatically apply Entity updates in the order they are received, regardless of an Entity's conflict state.

A conflict occurs when a submission is based on an Entity version that is out of date. We will additionally indicate when such a submission also overlaps with other updates, that is, when the properties it updates include properties modified by updates that were applied since the Entity version it was based on.
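
The two situations described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Central's implementation: the `Entity` class, the `apply_update` function and the labels `"out_of_date"` and `"overlapping"` are hypothetical, and versions are assumed to be sequential integers starting at 1.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    version: int = 1
    properties: dict = field(default_factory=dict)
    # history[i] holds the property names changed by the update that
    # produced version i + 2 (version 1 is the original creation).
    history: list = field(default_factory=list)


def apply_update(entity, base_version, changes):
    """Apply changes in arrival order and report the conflict state.

    Returns None (no conflict), "out_of_date" (stale base version but
    no shared properties) or "overlapping" (stale base version and at
    least one shared property). The update is applied in every case.
    """
    conflict = None
    if base_version < entity.version:
        # Collect properties touched by updates applied after base_version.
        touched_since = set()
        for changed in entity.history[base_version - 1:]:
            touched_since |= changed
        conflict = "overlapping" if touched_since & set(changes) else "out_of_date"
    entity.properties.update(changes)
    entity.history.append(set(changes))
    entity.version += 1
    return conflict
```

For example, if two enumerators both start from version 1 of a tree and one updates its circumference while the other updates its height, the second submission to arrive would be flagged `"out_of_date"`. A third submission from version 1 that changes the circumference again would be flagged `"overlapping"`.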

Very useful explanation

I have a question. In the future, in a fully offline workflow, will the upload order be consistent when the field agent submits data? For example, will the "Registration form" be uploaded first and then the "Follow-up form"?

The submission order from clients will not be guaranteed. The server will be responsible for ensuring that entity actions are applied in the intended order.

Here's what we wrote in the Google doc on that subject:

Sequential integer versions are easy to work with but are not guaranteed to be unique across clients and don’t uniquely identify the contents of updates. That means they do not work well in contexts where multiple offline clients are likely to make several updates AND submit those updates at the same time. For example, clients A and B get EntityA version 1. They both go offline and generate versions 2, 3, 4. If they then submit at the same time, the server may interleave versions from the two clients leading to a difficult history to understand. We believe that this kind of scenario is rare.

Likely approach: If a server receives a submission with a baseVersion greater than the server’s current version for that Entity, the submission is held in a processing queue until earlier submissions are received.
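
As a rough sketch of the holding behavior in that quoted approach, assuming sequential integer versions starting at 1 and a single client per Entity: the `HoldingServer` class and its method names are hypothetical and only illustrate the idea, not how Central processes submissions.

```python
from collections import defaultdict


class HoldingServer:
    """Toy server that holds submissions whose baseVersion is ahead of
    the server's current version for that Entity."""

    def __init__(self):
        self.versions = defaultdict(lambda: 1)  # entity_id -> current version
        self.held = defaultdict(list)  # entity_id -> waiting (base_version, label)
        self.applied = []  # labels of applied submissions, in order

    def receive(self, entity_id, base_version, label):
        if base_version > self.versions[entity_id]:
            # An earlier submission has not arrived yet: hold this one.
            self.held[entity_id].append((base_version, label))
            return
        self._apply(entity_id, label)
        self._release(entity_id)

    def _apply(self, entity_id, label):
        self.versions[entity_id] += 1
        self.applied.append(label)

    def _release(self, entity_id):
        # Keep applying held submissions, lowest base version first,
        # for as long as the server version has caught up with them.
        progressed = True
        while progressed:
            progressed = False
            for item in sorted(self.held[entity_id]):
                if item[0] <= self.versions[entity_id]:
                    self.held[entity_id].remove(item)
                    self._apply(entity_id, item[1])
                    progressed = True
                    break


server = HoldingServer()
server.receive("tree-1", 2, "second visit")  # arrives early, so it is held
server.receive("tree-1", 1, "first visit")   # applied, then releases the held one
```

After both calls, `server.applied` lists the first visit before the second visit even though they arrived in the opposite order.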

We have since changed our thinking a little bit and are planning to have a stronger guarantee of order. We'll likely keep baseVersion as the base server version that was used and add an additional attribute to indicate what local client version the submission resulted in. There are a few notes at the bottom of the doc around that. We will finish that spec design once Collect has an offline entity representation and solicit feedback on it as well.

Entity updates are available starting in ODK Central v2023.5.

Additionally, we have

  • updated the "Additional columns" sheet in the XLSForm template to reflect the new entity_id and update_if columns for the entities sheet.
  • published a tutorial. At the end of the tutorial you'll have a form that community members can use to report problems and another form that staff members can use to see problems on a map and address them. Once problems are addressed, they are no longer shown on the map.
  • updated documentation to describe updates and conflicts.
  • added entity updates to the ODK XForms spec.

We hope you find this helpful and look forward to your feedback about this feature area!