We are doing research to better understand why invalid (self-intersecting) polygons occur during data collection. We’ve seen this as a challenge specifically in agriculture, environmental monitoring, and public health. Our goal is to understand the contexts and behaviours that cause these errors, so we can help data collectors prevent them and reduce rework for project managers.
Problems we’ve heard so far
- Data collectors sometimes create invalid polygons due to GPS drift or user error, such as crossing over already mapped areas.
- Current workflows lack real-time validation or clear guidance on fixing geometry issues.
- Errors are discovered after submission, requiring rework. Fixing them afterwards is very hard because the data isn't available and editing in Enketo is difficult.
What we want to better understand
1. Where the errors occur
- When is the invalid point created? Was it the last point, the one collected just before it, or did the polygon become invalid because of a much earlier point or something else?
2. Recording behaviours
- How do users decide which recording method to use?
- During automatic mode, what are your data collectors typically doing with their phone? For example, is it in their hand and they are looking at it the entire time, or in a pocket?
- If they are using placement or manual mode, where is their phone?
3. Strategies to prevent invalid polygons
- What strategies have helped you and the data collectors reduce these errors?
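If you have raw traces to share, a small script can help answer the first question above (which point made the shape invalid). The sketch below is only an illustration, not how ODK Collect validates shapes: the function names are hypothetical, coordinates are treated as flat x/y values (a reasonable approximation only for small areas), and only clear crossings are detected, not segments that merely touch or overlap. It also does not check the closing segment from the last point back to the first.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude) treated as planar


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: > 0 if c is left of a->b, < 0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True only for a proper crossing (each segment's ends lie strictly on
    opposite sides of the other segment)."""
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def first_crossing(points: List[Point]) -> Optional[Tuple[int, int]]:
    """Walk the trace in recording order and return (i, j) for the first
    point i whose incoming segment (i-1 -> i) crosses an earlier,
    non-adjacent segment (j -> j+1). Returns None if no crossing is found."""
    for i in range(2, len(points)):
        for j in range(0, i - 2):
            if _segments_cross(points[i - 1], points[i], points[j], points[j + 1]):
                return i, j
    return None


# Hypothetical trace: the fourth point (index 3) cuts back across the first segment.
trace = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (1.0, -1.0)]
print(first_crossing(trace))  # (3, 0)
```

A large gap between the two returned indices suggests the collector crossed an area mapped much earlier, while a small gap points to something that happened around the most recent points.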
This work builds on a previous discussion about recording methods and whether those should be defined in the form design or remain in ODK Collect. From that discussion, we learned it’s important for users to choose their method, and the use cases shared were incredibly helpful.
We’d love to continue that conversation as we explore how we can design a solution to prevent invalid polygons during data collection based on common behaviours.












