Create and update Entities from repeats

Developers may be interested in the companion ODK XForms spec proposal.

Background

Repeats make it possible to capture data about multiple things of the same kind within the context of a single form. This means that capturing data about each item can be embedded in a broader flow, and logic can be used to validate across those items.

Users want to create Entities from those repeats so that they can follow up with them. Some practical examples:

  • In an agricultural context, it's common to have to capture information about many fields in the same farm. See @dast's Insiders lightning talk and showcase post.
  • Household surveys often involve capturing info about individual household members associated with that household. See Form linking.

Currently, a single form submission can only create or update a single Entity. We propose making it possible for a submission to either create or update a single Entity OR create/update multiple Entities in a single Entity List from a repeat.

XLSForm spec

entities sheet:

We propose adding a repeat column in which a reference to a repeat can be specified. All other references used in other columns such as label, create_if, entity_id, etc, would be evaluated in the context of that repeat.

For now, exactly one Entity declaration would be allowed.

| list_name | repeat | label |
| --- | --- | --- |
| trees | ${tree} | ${species} |

survey sheet:

When an Entity declaration on the entities sheet specifies a repeat, save_to values would only be allowed inside that repeat.

| type | name | label | save_to |
| --- | --- | --- | --- |
| begin_repeat | tree | Tree | |
| geopoint | location | Get tree location | geometry |
| text | species | Species | species |
| end_repeat | | | |
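
The save_to rule above can be checked mechanically. The following is a minimal Python sketch, not pyxform's actual validation: it models the two example sheets as lists of dicts (an assumption made for illustration) and reports any question whose save_to falls outside the repeat named on the entities sheet. The function name `save_to_outside_repeat` is hypothetical, and groups and nested repeats are not handled specially.

```python
import re

# Rows mirror the example sheets above. The list-of-dicts layout is an
# assumption for illustration, not how pyxform represents forms.
ENTITIES = [{"list_name": "trees", "repeat": "${tree}", "label": "${species}"}]

SURVEY = [
    {"type": "begin_repeat", "name": "tree", "label": "Tree"},
    {"type": "geopoint", "name": "location", "label": "Get tree location", "save_to": "geometry"},
    {"type": "text", "name": "species", "label": "Species", "save_to": "species"},
    {"type": "end_repeat"},
]


def save_to_outside_repeat(entities, survey):
    """Return names of questions whose save_to is outside the declared repeat."""
    match = re.fullmatch(r"\$\{(.+)\}", entities[0].get("repeat", ""))
    target = match.group(1) if match else None  # None means a top-level declaration
    stack = []      # names of the repeats that are currently open
    offenders = []
    for row in survey:
        kind = row.get("type", "")
        if kind == "begin_repeat":
            stack.append(row["name"])
        elif kind == "end_repeat":
            stack.pop()
        elif row.get("save_to"):
            current = stack[-1] if stack else None
            if current != target:
                offenders.append(row["name"])
    return offenders


print(save_to_outside_repeat(ENTITIES, SURVEY))  # []
```

Adding a top-level question with a save_to value to `SURVEY` would make the function return that question's name, which is the situation the proposal disallows.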

Future work

In the future, we expect to further expand to the following:

  • create/update multiple Entities in a single Entity List from a repeat AND a parent Entity in the same or a different Entity List
  • create/update multiple different Entities from the top level and from multiple repeats

Please let us know if you have any feedback!


Absolutely in support of this — great initiative!

This would be a great step forward. Do I understand correctly that this is not yet implemented and that only one entity from one repeat could be populated with data from the submission?

To bridge the gap until this feature is properly implemented in ODK, we have developed a script for pyodk that retrieves selected data from the repeats of a given form and saves it in a CSV, which is then uploaded to Central to populate the form the next time the dataset is used. Would that be worth sharing in a showcase?
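
For readers who want a feel for the middle step of such a workaround, here is a minimal sketch of turning repeat rows into CSV text. It is not the script described above: fetching the rows from Central and uploading the result (for example with pyODK) are left out, and the row shape, the `label` column, and the function name `repeat_rows_to_csv` are all assumptions for illustration. Check what columns your Central version expects before relying on this layout.

```python
import csv
import io


def repeat_rows_to_csv(rows, label_field, properties):
    """Build CSV text with a label column plus one column per property."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["label", *properties], lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        record = {"label": row[label_field]}
        for prop in properties:
            # Missing values become empty cells rather than raising an error.
            record[prop] = row.get(prop, "")
        writer.writerow(record)
    return buffer.getvalue()


# Hypothetical repeat rows, one flat dict per repeat instance.
rows = [
    {"species": "oak", "height_m": "12"},
    {"species": "pine", "height_m": "9"},
]
print(repeat_rows_to_csv(rows, label_field="species", properties=["species", "height_m"]))
```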


You can also do this with QuODK - you can load the submission data (repeats) into QGIS then use that as the layer to create (or update) an entity list (exported as a CSV). Followed by a pyODK script to talk to Central.

So certainly not as simple as your tool, but there is the opportunity to adjust any geometry [or any other attributes] if needed, so it has some 'strength'. You can also be selective as to which attributes to include in the entity list.

I'd like to say that I'd thought of this specific use case when designing QuODK, but I confess that I only realised it was applicable when I read your post. :slight_smile:

I would find that useful - any 'grown up' scripts for pyODK would be really useful. My kindergarten scripts are a little wobbly so I find examples a good way to learn.


Hi @LN,

Exciting developments! A small clarification for implementation with this future spec: how would properties shared across all repeat entities be managed? For example, in a household survey you might first interview the household head, then enumerate household members in a repeat. If I want to register the household’s village (selected, say, from a predefined list in the main body of the XLSForm) as a property shared between all entities for this submission, will it be possible to save_to outside the repeat? Or do I understand correctly that I will instead have to create a calculated field inside the repeat that retrieves this value, so it can be stored with save_to?


> You can also do this with QuODK - you can load the submission data (repeats) into QGIS then use that as the layer to create (or update) an entity list (exported as a CSV). Followed by a pyODK script to talk to Central.

I still have to test your tool. So many things going on at same time :collision:

> I would find that useful - any 'grown up' scripts for pyODK would be really useful. My kindergarten scripts are a little wobbly so I find examples a good way to learn.

I’m not quite sure if our script really is grown up. AI was helping a lot. But yeah, it seems to work.

On the assumption that Collect beta functionality exists for entities from repeats, I made a quick test form, similar to @LN's example at the top, to see what would happen. For now it only uses save_to against fields in one repeat, as also creating entities outside a repeat is incomplete. Using Central 2025.3.1.

Spec is still at 2024.1.0 so I guess this thread is the spec discussion for 2025.1 in the current Collect beta?

I modified an existing demo entity form to put a repeat around the fields, added the repeat column to the entities sheet, and found:

  • Enketo - allows the first repeat. In the second and later repeats, the top note shows a read-only text field under it, the following question has no selections visible, and the questions after that are not shown at all. :cross_mark:
  • WF - allows >1 values in the repeat, can create/update ok. :white_check_mark:
  • Collect 2025.4.0 beta 2 with experimental entities spec 2025.1 off: form doesn't load
  • Collect 2025.4.0 beta 2 with experimental entities spec 2025.1 on: can create/update multiple entities ok :white_check_mark:

@Tyler_Depke - looks like we can start testing repeat entities workflows :smiley:


I spotted this in the docs, saving the entity ID in the submission that creates/updates it via this as a calculate:
/data/meta/entity/@id

This didn't work as is for entities from repeats; I expect I need to include the repeat name somewhere in the XPath. Looking at a submission XML, I can see <meta> and @id under the repeat, and this works:

/data/your_repeat_name/meta/entity/@id

Will groups affect this in some future implementation where a single submission can create entities both from repeats and outside the repeat? Whether the fields are inside or outside a group, the expression would need to identify where the meta block exists. Currently the group in my repeat is at the same level as the meta block.

Finally managed to test this new feature.

All worked as expected and as described in the specs by @LN, although I haven't yet tested create_if and update_if. :slightly_smiling_face:

But yes, it only works with the Beta version of Collect with experimental entities spec 2025.1 on. Thanks to @ahblake. I would not have found this in the Project settings –> Experimental.

But this feature will only reveal its true power once the following are possible at the same time:

  • Entities of the parent area

  • Entities of one or more repeat groups

We continue with imperfect work-arounds until then.


Thank you so much @ahblake :folded_hands: I also wanted to play with repeat entities and I initially missed the experimental entities spec. Your detailed post was super helpful (as usual) in helping me through how to use it.

Thanks for tying all the pieces together, @ahblake! Great to see how much enthusiasm there is for this functionality.

Yes, exactly. This means you have more fields in your submission that you don't really need but it makes the relationship between form fields and Entity properties easier to see in form design.

I would recommend using a relative expression. For example, if you have a calculate directly in the repeat (at the same level as the meta block, as you said) that needs to access the Entity system ID, the expression ../meta/entity/@id should work. That way you don't need to think about nesting outside the repeat, only inside. Could you please tell us how you're using the system ID? Is it to create links between Entities?
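
To illustrate why the relative expression works, here is a small Python sketch that walks a simplified, hypothetical submission. The XML below is an assumption for illustration (real submissions carry namespaces and more attributes), and ElementTree's path subset is not the XForms evaluator; the point is only that each repeat instance holds its own meta block, so the lookup is made relative to the instance rather than from the document root.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical submission: each repeat instance has its own
# meta/entity block, and the top-level meta block has no entity element.
SUBMISSION = """
<data id="trees_form">
  <tree>
    <species>oak</species>
    <meta><entity dataset="trees" id="uuid-1" create="1"/></meta>
  </tree>
  <tree>
    <species>pine</species>
    <meta><entity dataset="trees" id="uuid-2" create="1"/></meta>
  </tree>
  <meta><instanceID>uuid-submission</instanceID></meta>
</data>
"""


def entity_ids(root, repeat_name):
    """Collect the Entity ID found inside each instance of the named repeat."""
    ids = []
    for instance in root.findall(repeat_name):
        # Path relative to the repeat instance, like ../meta/entity/@id
        # evaluated from a calculate that sits directly in the repeat.
        entity = instance.find("meta/entity")
        if entity is not None:
            ids.append(entity.get("id"))
    return ids


root = ET.fromstring(SUBMISSION)
print(entity_ids(root, "tree"))   # ['uuid-1', 'uuid-2']
print(root.find("meta/entity"))   # None: no entity block at the top level
```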

Here's an overview of the status:

  • pyxform as released in Central v2025.3+ can generate forms following the 2025.1.0 Entities spec as outlined in the top post, but only for a single declaration per form, either at the top level or in a first-level repeat (not in nested repeats)
  • Central v2025.3+ can use the new spec
  • Collect v2025.4 is still in beta; it can use the new spec when opted into in settings. In the next beta, it will create Entities from repeats offline, still requiring opt-in.
  • Enketo can't use the new spec because of this issue which we don't currently have plans to fix. Those who need web-based forms for Entities with repeats will be able to use Web Forms.

Here is a simple form to create multiple Entities in the trees list: [Entities in repeats](https://docs.google.com/spreadsheets/d/1bHnF-1W39P2uNrE3Od8BIh7DtF4ErB-sK4T4flPEJkw/edit?usp=sharing)

We initially thought we would start by releasing what we have now: limited pyxform support and Central support with opt-in support in Collect. However, as we've created more test forms and explored the problem space more, we would like to have a clearer path to having performant links between Entities before we release.

As @dast says, almost every context in which it's helpful to create or update Entities in a repeat also involves a parent Entity (and possibly grandparent!). Currently, form designers are entirely responsible for making sure they establish the links they need. Because the system has no knowledge of those links, looking up the children from the parent is not performant (as @dast and @ahblake have pointed out in other threads).
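
As a rough illustration of the performance point, the sketch below contrasts scanning a whole child list for each parent with building a parent-to-children index once. It does not describe how Central or Collect work internally; the Entity rows and the `household_id` property name are hypothetical stand-ins for a link that a form designer has defined by hand.

```python
from collections import defaultdict

# Hypothetical Entity rows: each member stores its household's ID in a
# designer-defined property (household_id is an assumed name).
members = [
    {"id": "m1", "label": "Ana", "household_id": "h1"},
    {"id": "m2", "label": "Ben", "household_id": "h2"},
    {"id": "m3", "label": "Cai", "household_id": "h1"},
]


def children_by_scan(members, household_id):
    """Without knowledge of the link, every lookup scans the whole list."""
    return [m["id"] for m in members if m["household_id"] == household_id]


def build_index(members):
    """With a known link property, one pass builds a parent -> children map."""
    index = defaultdict(list)
    for m in members:
        index[m["household_id"]].append(m["id"])
    return index


index = build_index(members)
print(children_by_scan(members, "h1"))  # ['m1', 'm3']
print(index["h1"])                      # ['m1', 'm3']
```

Both calls return the same children; the difference is that the scan repeats its full pass for every parent, while the index is built once and then answers each lookup directly.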

Changing the way those links are represented after they've been created would be a time-consuming operation for each data collector. So we're doing some design work now to see how we can do this thoughtfully.

One additional point that has kept us from releasing as things are is that as we write test forms, it "feels like" the links between Entities should be created implicitly based on the shape of the form and so it's easy to forget to define them.

We're working hard to get this right, thanks for all your input so far! For those of you who have tried the functionality, please consider sharing your test forms and the projected sizes of your biggest Entity Lists.

If you link Entities already or are planning to, it would also be helpful to learn about relationships you want to represent that are not simple parent-child ones. For example, if you want to represent links to multiple parents, that would be helpful to know. That would be something like having participants that you want to be able to access from households, schools, and districts lists. In that case, participants can be said to have three parents.

CC @laurespake from ODK Collect v2025.4 Beta: guidance hints shown by default, updated repeat dialog - #10 by laurespake; it would be great to know more about whether you need any linking between Entities and if so how you plan to represent it.


:person_facepalming: duh! Of course.

My thinking right now is that I want to include the ID of the entity (especially in the create case) with the submission, so that any report refers uniquely to the entity it concerns. So rather than "finding #1 at this time on this report", which could then be finding #17 in a subsequent submission, it also includes/uses the entity ID.

Likewise, the entity might have a field for which submission created it, or a join of every submission that has touched it, but these values are likely better retrieved from the entity audit trail.
