Selecting a map feature to collect data about

LN · July 20, 2020, 10:38pm

Thanks @danbjoseph and @TAB for driving towards a spec for selecting a map feature to collect information about. We know ODK is already broadly used to collect data that is spatial in nature and I think being able to use previously-defined map features in addition to creating new points, traces and shapes would be very powerful. I really look forward to reading specific scenarios that drive out the "business needs." I want to share a few things that are on my mind.

First, in the TSC meeting notes I asked @MartijnR about what Enketo users and developers might be interested in related to this general space. I ask because there’s a lot of user value to Enketo and Collect not deviating too significantly. So I think we should be aware of and at least consider Enketo priorities around these geo workflows.

Second, I see three broad approaches with various tradeoffs:

1. Stay within ODK clients’ form-centric model and communicate map feature data as a form attachment. I see two sub-options here: a GeoFeatures question type that lets users pick a feature within the context of a particular form instance or exposing a way to pass a feature ID into a form instance. In the latter case, Collect would add selectable features on the existing form map. The advantage of the latter is that data collectors would see map features in the context of previously-collected form instances.
2. Provide specialized functionality for communicating geo features to be selected outside of any form context. That means focusing on this case and accepting that there might be redundancy with future more general entity-based functionality.
3. Treat this as a subcase of entity-based data collection. That doesn’t necessarily mean having to answer everything about the more general case but it means trying to come up with APIs and workflows that can be used for it.

As @yanokwa mentioned in the last TSC call, anything that stays within the form-centric model is relatively straightforward. That’s why @zestyping and I had previously framed it as a stepping stone. My sense is that even if/when we do have an entity-based model, form-centric map feature selection would still be in use because it will be simpler and adequate for folks with less dynamic data collection workflows.

My understanding from the last TSC meeting call and notes is that y’all were starting to rally around something like the second option. This would require a little more designing than the form-centric option but probably not a whole lot. The implementation would likely not be much more complex. However, the concern I have is that it potentially adds another “data collection mode” to continue supporting in the future.

Let’s say we agree on some API for clients to get features data that’s outside of a form context and let users select features to launch a form list. As @danbjoseph and others have mentioned, there could be a different mode in “Fill Blank Form” that displays a map. Once we add a general entity concept, we’d be supporting a text list and map for “loose forms” (what we have now) and a text list and map for “entities.” Similarly, on the server side, there’d be an endpoint for pulling map features for “loose forms” and some way to pull entity data that also includes map features. No single component of this duplication is hugely complex but I think that the explosion of combinations of features that ODK already has makes software maintenance, documentation writing, support, and coherent evolution of the tools really difficult.

The third option would entail thinking of the map features as entities and having servers communicate entity lists and the forms that apply to them. Naturally, there are a lot of unknowns here. I don’t know how realistic it is to design APIs for entity-based data collection without considering the full workflow.

My current thinking is that trying to get a form-centric model out to users relatively quickly and then leveraging that implementation in a future entity-based model strikes a good balance between improving things in the short term and providing ideal functionality in the long term. I’m nervous about the second option but perhaps more concrete details around the path to an entity-based model and the API for communicating feature data would change my mind.

Xiphware · July 21, 2020, 12:50am

FYI, somewhat related (perhaps?) - and a concept I've used in iXForms/GoMobile - is the notion of a form submission being geo-referenced; that is, the submission data contains a specific (or multiple?) geopoint/geoshape/geotrace value(s). If so, then those form instances can be displayed and selected from a map view as well as the usual list view.

If the form 'submission' represents an entity then the UI mechanism(s) used for picking entities can display them as a list view or - for those entities which happen to be georeferenced - alternatively a map view. Similarly, if the form submission is more just the typical survey data capture then, again, the associated UI for browsing survey results can present a user selectable list vs map view for them.

(I use a very ad hoc approach to determining whether the submission data is 'geo-referenced' or not: find the first geopoint binding in the XML. This should be better and more formally spec'd.) But anyway, it's a pretty simple - and useful (for me) - approach: anytime you list things, if they're geo-referenced, have a list-vs-map view that you can flick between to pick one, which I've found doesn't impose a huge cognitive leap on the UX.
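A minimal sketch of the "first geopoint binding" heuristic described above, assuming an ODK-style XForm whose binds live in the XForms namespace. The function name and the sample form are made up for illustration; this is not the iXForms/GoMobile implementation.

```python
import xml.etree.ElementTree as ET

# Namespace used for <bind> elements in ODK-style XForms.
XFORMS_NS = "http://www.w3.org/2002/xforms"


def first_geopoint_nodeset(form_xml):
    """Return the nodeset of the first geopoint bind in document order, or None."""
    root = ET.fromstring(form_xml)
    for bind in root.iter(f"{{{XFORMS_NS}}}bind"):
        if bind.get("type") == "geopoint":
            return bind.get("nodeset")
    return None


# Hypothetical form definition, only detailed enough to exercise the function.
SAMPLE_FORM = """<h:html xmlns="http://www.w3.org/2002/xforms"
        xmlns:h="http://www.w3.org/1999/xhtml">
  <h:head>
    <model>
      <instance><data id="pools"><name/><location/></data></instance>
      <bind nodeset="/data/name" type="string"/>
      <bind nodeset="/data/location" type="geopoint"/>
    </model>
  </h:head>
  <h:body/>
</h:html>"""

print(first_geopoint_nodeset(SAMPLE_FORM))  # prints /data/location
```

A form with no geopoint bind returns None, which a client could use to decide whether to offer the map view at all. Extending the check to geoshape and geotrace would be a natural next step.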

LN · July 21, 2020, 4:18am

Thanks for sharing, @Xiphware. Collect does this for submissions as well but only on a per-form basis. That’s what the first option I outline could extend. That is, there’d be some way to attach map feature data to a form definition and those map features would be displayed and selectable on the map that currently displays submissions for that form.

Paul_Storry · July 21, 2020, 8:33am

@LN I read your post with interest. This is something we implemented in our fork of Collect sometime ago. The approach we took was essentially your approach 1. We use a standard field type with a custom appearance, and upload the maps as media. Features (polygons, lines, or points) can then be selected on the map.

LN · July 21, 2020, 6:42pm

Something you wrote made me a little uncomfortable/intrigued, @Xiphware, but I couldn't put my finger on it yesterday. I just realized it's this sentence. I think one of the really critical things about an "entity" concept is that it can weave together several different form instances/submissions that were filled in about that entity. For example, there might be a form to collect basic swimming pool details, a swimming pool inspection form, a swimming pool incident report form, etc. So in my mind an entity can't be represented by a single submission. Does this match the entity concept you've established? Do you do something like create a virtual submission that combines submissions from multiple form definitions that relate to the same entity? That question aside, it sounds like your mechanism for feeding in selectable map features is to create entities on the server and communicate those entities to the client, is that right?

That's great to hear. Is your fork open source and one you could share here? Do you feel like selecting the feature inside the form filling flow is desirable? Essential? What if you could instead/additionally see exactly the map in your screenshot, tap one of the features to launch a new form filling session, and see all the features you've collected data about marked in some way?

Xiphware · July 21, 2020, 8:45pm

In our context, an 'entity' (actually, ostensibly a DB table row) is, for convenience, represented as a 'form' instance; the XForm in question being used to both display the entity data in a nice way using the rich suite of data types, widget appearances, grouping related entity fields into logical subgroups, etc... In addition, this so-called 'entity form' can be used to both capture new entities (start a new entity form), as well as updating existing entities (pre-populate an entity form with an existing entity's record, and edit stuff).

So no, an entity isn't represented by one fixed submission, but rather the current entity is essentially the latest 'submission' (for lack of a better word...) of the entity form having a specific pre-defined entityid. [although in reality you could probably toss out any old entity submissions with the same entityid, unless you wanted to keep a historical record of each entity].

Basically, rather than invent a new mechanism to persist (and render, and capture) entities, I just re-purpose XForms/instance XML and treat them a little differently in the framework (eg entity forms are specifically identified, there's a specific field in the instance XML that uniquely identifies the entityid, ...). Currently this is ad hoc to iXForms/GoMobile - taking this any further would require a more formal specification around these so-called 'entity forms'...

But my thought for leveraging Central for doing this - eg for longitudinal swimming pool inspections - would be to have a specific form in each project that captures the entity data; each different type of entity (swimming pool vs public toilet vs restaurant...) would be a different 'project', with an entity form (having a specific field that identifies the entityid) and all associated forms that you might want to inspect or survey against these things (again, all these forms should probably contain some reference to the entityid against which the inspection or survey was conducted). All I'd need from Central would be an API that gives me all submissions of a form having a specific entityid: the latest one of these 'submissions' for the entity form would effectively give me that entity's current record data, and all submissions of other form types would give me the inspection history of that entity. [but I haven't implemented this API yet].
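To make the "latest submission wins" idea concrete, here is a rough sketch over an in-memory list of submissions. It assumes the submissions have already been fetched somehow; the field names (form_id, entityid, submitted_at), the form IDs, and both function names are hypothetical and do not correspond to any existing Central API.

```python
from datetime import datetime

# Hypothetical, already-downloaded submissions; all field names are assumptions.
SUBMISSIONS = [
    {"form_id": "pool_entity", "entityid": "pool-1",
     "submitted_at": "2020-01-10T09:00:00", "data": {"name": "Main pool"}},
    {"form_id": "pool_entity", "entityid": "pool-1",
     "submitted_at": "2020-06-02T14:30:00", "data": {"name": "Main pool (renovated)"}},
    {"form_id": "pool_inspection", "entityid": "pool-1",
     "submitted_at": "2020-03-15T11:00:00", "data": {"result": "pass"}},
    {"form_id": "pool_inspection", "entityid": "pool-2",
     "submitted_at": "2020-03-16T11:00:00", "data": {"result": "fail"}},
]


def _when(submission):
    """Parse the submission timestamp so records can be ordered."""
    return datetime.fromisoformat(submission["submitted_at"])


def current_entity_record(submissions, entity_form_id, entity_id):
    """Latest entity-form submission for this entityid, or None if there is none."""
    matching = [s for s in submissions
                if s["form_id"] == entity_form_id and s["entityid"] == entity_id]
    return max(matching, key=_when) if matching else None


def inspection_history(submissions, entity_form_id, entity_id):
    """Submissions of all other forms for this entityid, oldest first."""
    others = [s for s in submissions
              if s["form_id"] != entity_form_id and s["entityid"] == entity_id]
    return sorted(others, key=_when)


current = current_entity_record(SUBMISSIONS, "pool_entity", "pool-1")
print(current["data"]["name"])  # prints Main pool (renovated)
print(len(inspection_history(SUBMISSIONS, "pool_entity", "pool-1")))  # prints 1
```

Keeping every entity-form submission and choosing the latest at read time preserves the historical record mentioned above; discarding older ones would only be a storage optimisation.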

Paul_Storry · July 22, 2020, 8:44am

In this particular case, the select feature is inside a repeat loop, so selecting a feature inside the form is essential, although launching a form from a map might work well too. With regards to seeing "all the features you've collected data about", using a theme on an attribute of the spatial data would be useful.

mathieubossaert · July 22, 2020, 11:54am

Hi @LN, great to see all the effort made for users!
About the geo enhancement, I have 2 or 3 scenarios in mind (I need to think more about it, but I want to share it before leaving my computer for a few weeks).

The first one is the ability to click/select an object to start an entire form for it.

For example, I want to come back to a pond every year to describe it and list all the frogs I hear. So I can click on the pond and begin the form. When the census is done, the pond appears on the map in a different way.

the second scenario is relative to our plant census method.

We collect data about targeted species for each cell of a square grid (10 to 20 meters). Currently I draw this grid over the map in an MBTiles file, but there is no interaction with it. Once the map and GPS confirm that we are in the cell, we create a geopoint and answer the questions, and once the data is in the database I can intersect it with the grid to attach each record to the right cell. A great evolution would be to select the cell where I am as a select-one list entry (as we can do with SVG), fill in the data for this cell, then move on in the field to the next one and fill it in. Previously inventoried cells should be shown differently to show the effort.
The base of the form is close to it : Infinite loop and dynamically delete already selected options from list

A third one about crops parcels inventory with wine growers.

It is quite the same logic as the second one, with irregular grids.
How do I train the vines, do I use chemicals, do I cut the grass between the vines...
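The post-hoc "intersect with the grid" step from the second scenario can be sketched as below for a regular square grid. This assumes points and grid origin are in the same projected coordinate system in metres (for example UTM); the function name, origin, and coordinates are invented for illustration, and an irregular grid like the wine parcels would need a real point-in-polygon test instead.

```python
import math


def grid_cell(x, y, origin_x, origin_y, cell_size):
    """Return the (column, row) of the square grid cell containing point (x, y).

    origin_x/origin_y is the grid's lower-left corner and cell_size the cell
    edge length, all in the same projected units (metres assumed here).
    """
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    return col, row


# Hypothetical 10 m grid and two collected geopoints (already projected).
ORIGIN_X, ORIGIN_Y, CELL = 500000.0, 4800000.0, 10.0
observations = [
    {"species": "Orchis", "x": 500023.5, "y": 4800007.2},
    {"species": "Iris", "x": 500004.0, "y": 4800031.9},
]

for obs in observations:
    obs["cell"] = grid_cell(obs["x"], obs["y"], ORIGIN_X, ORIGIN_Y, CELL)
    print(obs["species"], obs["cell"])
```

With selectable cells in the form, this calculation would no longer be needed: the cell identifier would come straight from the tapped feature.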

danbjoseph · July 22, 2020, 4:55pm

the use cases i've written down so far are here:

The scenarios in brief:

20 previously collected locations of households (and some additional identifying data such as the head of household name) to visit and conduct a long survey. the survey needs to access the selected household identifier and attributes to confirm at the right place and link the data later in analysis. the initial survey to locate the households and the 2nd more detailed survey are different forms and have a methodology/process step in between that is external to ODK Collect/Central.
status updates on a collection of points. like above but with a repeat survey at interval, just need the geo-part to support navigating to and confirming you're at the right place. like @Xiphware's example, the latest entry for a given feature identifier is the current status.
OpenStreetMap data collection. select a feature and answer survey being able to use the existing feature attributes in the survey logic. also need to be able to create new features. need to be able to see what you've edited and added while continuing data collection. it's assumed that during any "round" of data collection there is only 1 device being used in a given area (i.e. teams are assigned grid cells to stay inside) and a feature is only edited/added once. any round of data collection is followed by reconciliation with OSM. and any subsequent round of data collection is preceded by loading a freshly updated base layer of features to the phones.

Xiphware · July 22, 2020, 6:29pm

This sounds very much like a longitudinal survey usecase: the first 'survey' captures the essential household (aka 'entity') data - eg location, household name, etc - and the 2nd, or subsequent, surveys capture additional or followup data, with the originally captured location of the entity used to direct the surveyor where to return.

danbjoseph · July 22, 2020, 9:34pm

I guess all of my use cases are longitudinal, I'm just not looking for the ODK suite of tools to manage the version, merge anything, or help with the continuity between rounds of survey other than giving me an easier way to record the identifier and access some data. I think I mostly want something like pulling data from a CSV except I want to pull data from a GeoJSON (and be able to select which GeoJSON feature via a map... and track on a single device which ones i've pulled... and also be able to design a flow that lets me complete a survey for a new feature in addition to selecting an existing one).
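As a rough illustration of "like pulling data from a CSV, but from a GeoJSON", the sketch below looks up a feature's attributes by identifier and tracks on-device which features have been used. It is not how Collect works; the function name, the household_id property, and the sample data are all assumptions.

```python
import json


def feature_properties(geojson_text, feature_id, id_property="household_id"):
    """Return the properties of the feature whose id matches, or None.

    Checks the optional top-level GeoJSON "id" member first, then a property
    named id_property (a hypothetical attribute name).
    """
    collection = json.loads(geojson_text)
    for feature in collection.get("features", []):
        props = feature.get("properties") or {}
        if feature.get("id") == feature_id or props.get(id_property) == feature_id:
            return props
    return None


# Hypothetical base layer of previously located households.
HOUSEHOLDS = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "hh-001",
         "geometry": {"type": "Point", "coordinates": [28.28, -15.41]},
         "properties": {"household_id": "hh-001", "head_name": "A. Banda"}},
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [28.29, -15.42]},
         "properties": {"household_id": "hh-002", "head_name": "C. Phiri"}},
    ],
})

visited = set()  # identifiers already surveyed on this device

selected = feature_properties(HOUSEHOLDS, "hh-002")
if selected is not None:
    visited.add(selected["household_id"])
    print(selected["head_name"])  # prints C. Phiri
```

The selected properties would then prefill or drive logic in the survey, and the visited set is what a map could use to style features that already have data.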

I'm expecting to have to create a new base layer geo-file of features each time I want different or updated attributes available within the survey. If I'm visualizing the data, I expect to have to use my analysis workflow or BI software to do things like group by the identifier and find the most recent entry.

RubenFoquet · July 24, 2020, 7:42am

Hi LN,

Thanks a lot for opening this up. I find myself on the user side of ODK geo tools and I have no knowledge of developing tools. I am answering your request to gather specific scenarios that drive out the "business needs".

I work for a forest restoration NGO in Zambia. I introduced ODK to the workflow of our field teams. Geo-data is essential. The following functionalities would be preferable:

1. Visualise an editable project map: when we go into the field, we want to have an offline map which visualises all features that were collected with that form in the past. For instance, we collect locations of all beehives established in the forest. I want to be able to move out in the field and navigate to these point features. I also want to be able to edit their characteristics in the field. Ideally, multiple layers, stemming from different forms, would be visualised in one offline map. This enables my team to have an overview of all farmers, all beehives and all restoration polygons in this project map.
2. Syncing a project map: it would be easy if the offline project map can be updated with one simple click in the office before the team heads back to the field. I am thinking some sort of map in the cloud could help as a synchronisation tool.
3. Longitudinal data collection per feature: we do monitor tree growth annually, beehive production bi-annually etc. It would be very useful to have a data flow integrated in the geo-tool to structure monitoring data. When in the field, a new update of some predefined fields can be added to the feature.

I currently store all my geo-data, after collection in ODK, in QGIS as geopackage layers. I was thinking to shift to QField to manage and collect data in the field, though the user-friendliness of ODK in form design is much appreciated. I am hence looking forward to further development of map features in ODK.

I have more ideas, but let's stick to these for now. I wonder whether these would be achievable?

Ruben

LN · August 3, 2020, 10:27pm

That makes sense. Virtually all usages of repeats that I've seen could be represented as multiple instances of the same form definition linked by something like a map or a hierarchical menu (e.g. household > people > pets are each a different form definition).

Is your fork one that you could share?

Yes, great, that is consistent with what I remembered from our in-person discussions at the convening. @Xiphware and I chatted a bit off-forum. And indeed, I think this comes back to one of the major questions we had coming out of the convening discussions: whether "the entity" is defined across a single form definition ("entity form") or across many. This may have significant repercussions on performance, complexity, and expressiveness (I say "may" because I haven't put in enough design time to really feel confident about the tradeoffs). To put this in very concrete terms in the context of this discussion, the question is whether "location" would be editable only through the "entity form" or not. Let's say I have patients with addresses. This is the difference between updating the patient's address in a scheduled visit form vs. going to the entity form to make that change. Swimming pools have the advantage of being relatively static so this isn't a very big deal. Some entity types will be much more likely to have their core properties change over time than others.

This seems like a really clear win to me and seems like something we can do by attaching geo data to a single form definition ("pond frog survey" in @mathieubossaert's example). Features defined in that geo data file would appear on the existing form instance map. Tapping one would start a form instance with fields populated from the geo data file.

Wouldn't it be even better to have each cell be a geo feature with a feature ID and be able to tap on a cell to start a form? That would be essentially the same thing as the frog case.

Let's help the wine growers.

I asked for some clarifications in the doc. While I agree with @Xiphware that these cases are longitudinal, as you say, it's possible to improve these significantly without supporting the full workflow. In all of the cases described, I think this can be achieved in the context of a single form definition.

Thanks for chiming in, @RubenFoquet. I think your first point is likely to be addressed in the relatively near-term. We'll make sure to loop you in to ongoing conversations.

Points 2 and 3 are squarely in the realm of entity-based data collection. I don't think we'd tackle those workflow aspects as part of this work but hopefully we get there sooner than later. It's a big piece of work and we have to figure out how to resource it.

mathieubossaert · August 4, 2020, 6:30am

Hi @LN, I understand what you mean, but in the example of plant monitoring, if we start a form for each cell (geo object), a lot of data will be common to all cells (date, users, weather, water level...) and some steps will be repetitive and maybe time consuming (go to the end of the form, finalize it...). Geo improvement could be very useful both to start forms and to loop over map objects in a repeat.

LN · August 4, 2020, 11:38pm

Ah yes, I see. So there's a hierarchical aspect. In an ideal world, I think this would be entity-based and there'd be two entities: area/sampling session and then cells. You could select an area, fill out some information about it and then fill out information about each of its cells.

Do you need to access the general information (e.g. weather) when you're collecting data about a cell? Can the first level of the hierarchy be defined ahead of time? For example, if the first level represents an "area", what I'd do is have two forms. The first form would have a map with features that represent areas. You could tap an area and fill out the metadata you mentioned (visit date, weather, etc). Then after you did that once for the current area, you'd exit out of that form and go into the cell form. You'd see a map with cells, tap a cell and fill out only data specific to that cell. You wouldn't have to specify anything about the area because that would be part of the properties for each cell. You'd then link the area data to each cell in analysis. How does that sound?
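The "link the area data to each cell in analysis" step could look something like the sketch below, assuming each cell submission carries the ID of its area as a property. The function name, field names, and rows are invented for illustration.

```python
def join_cells_to_areas(cell_rows, area_rows, key="area_id"):
    """Attach each area's visit metadata to the cell rows that reference it.

    Cell values win on name clashes; cells with no matching area are kept
    unchanged so missing area visits are easy to spot.
    """
    areas_by_id = {area[key]: area for area in area_rows}
    return [{**areas_by_id.get(cell[key], {}), **cell} for cell in cell_rows]


# Hypothetical exports from the two forms.
area_visits = [
    {"area_id": "A1", "visit_date": "2020-08-04", "weather": "sunny"},
]
cell_records = [
    {"area_id": "A1", "cell_id": "A1-03", "species_count": 4},
    {"area_id": "A1", "cell_id": "A1-04", "species_count": 0},
    {"area_id": "B7", "cell_id": "B7-01", "species_count": 2},
]

for row in join_cells_to_areas(cell_records, area_visits):
    print(row)
```

This assumes one area visit per area; with repeated visits the join key would also need the visit date or a session identifier.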

mathieubossaert · August 5, 2020, 7:30am

It sounds great to me, and greater if the link between "general / metadata form" and "cell/map object" can be as transparent as possible for the data collector.

qke · February 14, 2025, 8:41am

Hello, is feature 1 still in development? I have tested a lot of applications and each time I cannot see, while offline, the polygons that I collected in previous forms. Do you have a method to display them when I collect a new polygon? This is to make sure that they do not overlap when I am in the field.
Thanks!

spwoodcock · February 19, 2025, 1:24pm

As far as I know there is no way to display existing geometries when you collect a new one.

It's possible to display the collected geometries in the form, just not in the same question as the one collecting a new geom.

To do that, use Entities instead of external geometry datasets.

If you:

1. Create an entity list.
2. Create a geopoint question that has a save_to field pointing to an entity list property.
3. Use the create_if entity creation setting.

Then when you collect a new geopoint, it should save to the entity list.

Then the next form you open will sync the entity list data and a select_one_from_file question should be able to view the geometries saved to the entity list.