Hello everyone!
I am a geographer and technologist who works with Indigenous communities to build and use digital tools that help them achieve their goals in territorial defense and land management with as much autonomy and as little dependency on outside support as possible. I have worked in the field with Indigenous communities in South America (especially Suriname, Brazil, Colombia, and Ecuador), in Canada, and in East Africa (in particular Kenya), and have worked remotely with communities across the world.
I am a long-time ODK user who has gone from helping Indigenous people use ODK Collect in the field to integrating the toolkit more closely into a biocultural monitoring system for communities. I thought I'd share what I'm working on, which involves self-hosting ODK Central, integrating with the API, and incorporating some newer features like Entities.
Background with ODK and open source tools
I actually started using ODK back in 2015, when I was with a non-profit organization called the Amazon Conservation Team, working on a project to help communities in Suriname and Colombia use ODK to streamline data collection for land management. The status quo at the time was that community members would collect GPS coordinates using a handheld Garmin GPS, and write down the UTM coordinates on a paper form with a description of the geopoint. As you can imagine, this process entailed a high rate of error, a steep learning curve for non-technical users, and a ton of manual processing later down the road. Once we found out that GeoODK and later ODK Collect could be used to collect exactly the same data but with better questionnaires and without needing to write down coordinates, we co-created ODK Collect forms with the communities in their own languages and with visual labels, and held a series of train-the-trainer workshops so that communities could be in a better position to manage their own data collection processes. You can read about that here.
While this streamlined the data collection process substantially and the community members were quite happy to have an easier tool to work with, the problem we faced at the time was that these communities are located in very remote parts of the Amazon rainforest or in high-altitude areas, so sending submissions to an Aggregate server proved very difficult due to the low quality or total absence of connectivity. For villages that were fortunate enough to have satellite WiFi, any submission with a media attachment would most often time out, leading to a lot of waiting and frustration. For entirely offline areas, community members had to actually send their devices to a connected area and wait for them to be returned (this was before ODK Briefcase). And that was just for uploading the data. To actually receive the data back in a tangible, useful format (e.g. visualized on a map or referenced in a chart in a report), the communities had to wait even longer, because that basically depended on us, the technical team of the project, to download the data from ODK Aggregate, process it, and find a way to bring it back to the community.
To solve this problem, we tried a couple of things. We found out about Portable OpenStreetMap (POSM), and actually implemented this in the field using Intel NUCs in a few places, pretty much exclusively to use the ODK Aggregate server on there. That worked out pretty well, but still required someone locally to know how to use the server, work with tabular data, and visualize it on a map, and that was still a lot for Indigenous communities with limited experience with digital technology. And well, around the same time, Esri's Survey123 came into the picture, and the Amazon Conservation Team embraced that tool since they had a non-profit partnership with Esri, and the ArcGIS toolkit lets you visualize the data on a web map and web apps instantaneously (at a cost, of course, and requiring usage of cloud services).
For my part, these and other experiences in the field led me to take a career pivot from mainly helping communities apply existing technology to being more hands-on in building the technologies, and directly co-designing them with communities. Steeped in the realities of communities being entirely offline, and also increasingly demanding full control and sovereignty over their data and expressing concerns about using tools from big tech companies like Google or Esri, I started Terrastories, a free and open-source tool for mapping place-based oral histories that runs in the browser and can be hosted online, but also entirely offline on either a mesh network or via a WiFi hotspot. I also worked with Digital Democracy and contributed to the co-creation process of Mapeo, a FOSS tool for offline field mapping that stores data in a decentralized way on the devices themselves (i.e. no centralized server) using peer-to-peer sharing protocols, and with a UI that was entirely co-designed with Indigenous communities in the Amazon. At Dd, we also started the Earth Defenders Toolkit project, which is a platform with hands-on guides and case studies about the use of digital tools for an audience of community members, and now has an offline deployment that bundles together the platform and some of the most important tools for Indigenous communities, and serves those in a way similar to POSM.
A return to ODK: Central, Entities, and more for biocultural monitoring
Currently, I am working with Conservation Metrics, where we are partnering with a US-based non-profit called Nia Tero and three of their Indigenous partner communities to build a biocultural monitoring system that will allow them to track their own self-determined vision of well-being via indicators and metrics about their communities and their territories. The system as currently envisioned will combine data collection tools (including both Mapeo and ODK) with data visualization tools such as Apache Superset and customized Mapbox/Maplibre maps to allow communities to instantaneously explore and make sense of their field data using different views. We are also working on a workflow to circulate change detection alerts (for example, about encroaching deforestation, logging, gold mining, or other threats) to the devices being used by Indigenous communities in the field, either in the format of offline background maps or as a GeoJSON attachment to a survey.
Some of our user requirements are that the tools we work with be free and primarily open-source, self-hostable, able to work offline, and translatable, that they allow communities to own and control their data, and that they can be operated with as little dependence on outside support as possible.
While there are other mapping tools out there being used by Indigenous communities, ODK/XLSForm continues to be the go-to tool for non-geospatial data collection, so it will be part of our toolkit on that basis. However, one of the neat things for me as a returning ODK user is to see how many useful mapping features have been built into the toolkit since I last used it! For example, offline maps (in either raster or vector format) are a huge asset for Indigenous field data collection workflows. The ability to change the colors of past submissions on a map is really great and helpful as well. And then there is Entities, which I'll get to in a bit. I look forward to seeing even more functionality for maps in ODK (and would also be glad to help think through future features) - for example, being able to add an mbtiles file via the app UI directly, style vector data (e.g. by adding a style.json file), and maybe have the option to add a label to points on the map so they can be more easily distinguished when there are numerous points in one region.
One of the tools we are building is a messaging bus service that hits the APIs of the respective data collection tools, downloads all of the submissions and media attachments into a secure data lakehouse owned by the communities, and places them in an optimized format for retrieval by the other services (such as Superset and Mapbox). The goal is essentially to automate the whole process I described above of downloading data from ODK Central and formatting it for use in a third-party tool. Using the service, ODK data will be downloaded once submitted and appear almost right away in the third-party services, much like how it works in Esri. One cool thing that we've already built for Superset in particular is taking the XLSForm translations and using them to create bespoke charts and dashboards in each of the languages, which may be Western or Indigenous.
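To make the "download and reformat" step a bit more concrete, here is a minimal sketch in Python of what one pull from ODK Central could look like. It is not our actual service code. It assumes ODK Central's OData Submissions endpoint and a session token sent as a Bearer header; the function names, the server URL, and the "/"-separated column naming are all hypothetical choices for illustration, and media attachments are not handled here.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import quote


def odata_submissions_url(base_url: str, project_id: int, form_id: str) -> str:
    """Build the OData Submissions URL for one form on an ODK Central server."""
    root = base_url.rstrip("/")
    return f"{root}/v1/projects/{project_id}/forms/{quote(form_id, safe='')}.svc/Submissions"


def flatten_submission(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Turn nested form groups into flat 'group/field' columns.

    Assumption: geo values arrive as GeoJSON-style objects (with 'type' and
    'coordinates' keys), so those are kept whole for the mapping tools.
    """
    flat: dict[str, Any] = {}
    for key, value in record.items():
        path = f"{prefix}{key}"
        is_geometry = isinstance(value, dict) and "type" in value and "coordinates" in value
        if isinstance(value, dict) and not is_geometry:
            flat.update(flatten_submission(value, prefix=f"{path}/"))
        else:
            flat[path] = value
    return flat


def fetch_submissions(base_url: str, project_id: int, form_id: str, token: str) -> list[dict[str, Any]]:
    """Download all submissions for a form and return them as flat rows.

    `token` is assumed to be a session token obtained from Central beforehand.
    """
    request = urllib.request.Request(
        odata_submissions_url(base_url, project_id, form_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # OData responses carry the submission rows under the "value" key.
    return [flatten_submission(row) for row in payload["value"]]
```

The flat rows could then be written to whatever storage the other services read from; keeping the flattening separate from the network call makes it easy to test offline and to reuse for other data collection tools.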
We are still in an overall research phase for the biocultural monitoring system, where we are looking at what's already out there that we can use, before we start narrowing down and building more. Here are some of the things we are tinkering around with and considering that relate to ODK:
- Entities: Entities can be a game-changer for monitoring by Indigenous communities, because they allow the user to revisit the same place or incident and report on the status of what happened there. Take the example of an oil spill (a sadly common incident in the Ecuadorian Amazon). Once encountered, community members can create a geopoint for the location where the incident occurred, and then report back on the status of the pollution on successive visits. We are also thinking about using Entities to circulate change detection alerts: once a new change detection alert is created, we can use our messaging bus service to update a GeoJSON file on an ODK Central server, so that ODK Collect users can download it, view the alerts on a map, and submit reports on whichever incident they are visiting in the field. Generally, I'm really thrilled about Entities and feel that it's a feature that most, if not all, data collection tools are currently lacking.
- Other ways to visualize ODK submissions: according to our user interviews thus far, the most helpful ways to visualize field data are using maps, charts & graphs, tables, and a gallery of media attachments. (Notably, these are the services provided by the KoboToolbox server as well.) We would like to find open-source, self-hostable services that can do these things and that can be integrated with our messaging bus tool to automate the process of bringing in ODK and other data. Thus far, for maps, we are considering building our own Mapbox/Maplibre tool. For charts & graphs, we're looking at Superset since we have not found anything else for easily making charts and dashboards that is free/open-source (I know Tableau and Power BI are very often used by other ODK users, but these don't meet our user requirements). For tables or a media gallery, we have yet to find the right tools. If anyone has ideas for any of the above, or experience working with ODK data in a tool like Superset, we would love to learn more about that!
- Form builder: In my experience, one of the big blockers for Indigenous communities in autonomously using ODK is the learning curve involved in making forms. XLSForm is very powerful but does require an understanding of logic and how data values work, and depending on the complexity of the form, comfort in writing some basic validations with variables and curly brackets. This is usually asking too much, even for some of our Indigenous community users with a higher-than-usual level of experience with technology. Hence, form builders are a very important feature for us. The KPI form builder used by KoboToolbox and Ona is quite nice, but it's not yet clear to us how easy it will be to carve this out of the overall stack that either service uses. It's likely much easier to deploy ODK Build, and we're very interested in doing so, but would like to see if there is openness to, or any existing effort toward, improving the UI to make it more intuitive and easy to use - we might be able to contribute.
- ODK Central Docker images: For offline deployment, we are using Docker Compose to manage the DevOps of all of the different services we are bundling together. ODK Central has a really great Docker deployment setup, but for our infrastructure, which uses Kubernetes/clustering, we could benefit from having published images, e.g. on Docker Hub. I see from previous threads that this might get tricky with authentication. But we may want to think through this with anyone else interested, to see if it's nevertheless possible.
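For the change detection alert idea in the Entities item above, here is a rough Python sketch of how a batch of alerts might be turned into a GeoJSON file that a form could reference. The alert fields (`alert_id`, `lon`, `lat`, `alert_type`, `detected_on`) are hypothetical, and the use of a top-level `id` and a `title` property reflects my understanding of what ODK Collect expects for selecting from a map, which should be checked against the ODK docs. Uploading the file to Central is left out.

```python
import json
from typing import Any


def alerts_to_geojson(alerts: list[dict[str, Any]]) -> dict[str, Any]:
    """Convert alert records into a GeoJSON FeatureCollection of points."""
    features = []
    for alert in alerts:
        features.append({
            "type": "Feature",
            # Assumption: a unique id per feature lets reports refer back to the alert.
            "id": alert["alert_id"],
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as longitude, then latitude.
                "coordinates": [alert["lon"], alert["lat"]],
            },
            "properties": {
                # Assumption: 'title' is what gets shown as the label on the map.
                "title": f"{alert['alert_type']} ({alert['detected_on']})",
                "alert_type": alert["alert_type"],
                "detected_on": alert["detected_on"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


def write_alerts_file(alerts: list[dict[str, Any]], path: str) -> None:
    """Write the alerts to disk as a .geojson file, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(alerts_to_geojson(alerts), handle, ensure_ascii=False, indent=2)
```

Regenerating the whole file on each run keeps the logic simple; whether that is fast enough over a satellite connection would depend on how many alerts are active at once.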
If anyone has any thoughts or ideas to share about any of the above, they are most welcome indeed!
This is already quite long so I'll leave it there, but I hope it's interesting for folks (if you've made it this far, thanks for reading!) and that through our use case we can find ways to contribute. Generally, I'm really glad to be returning to the ODK ecosystem (big shout out already to @LN, who has pointed me to a number of the above features that were new to me). I'll also be happy to keep sharing how things are going as we build things out for the biocultural monitoring system.