ODK to collect species and habitat localities, as well as pressures and threats to ecosystems

Primary Topic / Field of Application

Ecology, Nature conservation

Context

Who we are

The "Conservatoire d'espaces naturels du Languedoc-Roussillon" is a nature conservancy NGO based in Montpellier in the South of France.
Our team of around 30 people includes ecologists, naturalists, agro-ecology specialists, project managers, administrative staff and GIS administrators, working at 7 sites in the region and managing approximately 12,000 ha (about 30,000 acres). In September we will merge with our neighbors from Midi-Pyrénées to fit the boundaries of our new region, and together become the Conservatoire d'espaces naturels d'Occitanie, employing 60 people.
At the national level there are 30 Conservatoires d'espaces naturels employing around 1000 people, federated under the "Fédération des C.E.N." and its dedicated team.

Traditionally, naturalists and ecologists write their field notes in a paper notebook, and then manually transcribe their data into a computer when they are back in the office, via a custom web tool, spreadsheet, or GIS file. With spreadsheets and GIS files, a further operation is then required to consolidate the data into our central GIS database.

As a consequence, it could take a month before the data was available to other colleagues for analysis. Approximately 10 years ago we developed a dedicated web interface for data entry, which saved considerable time by removing the need for action by a GIS engineer, and made data available as soon as it was entered.
We also had some prior experience (from 2007) of collecting data with a PDA and ArcPad, which was particularly interesting because it effectively made the data available in real time.

Since 2006, our Geographical Information System (GIS), located in Montpellier, has been organized around a central PostgreSQL/PostGIS database connected to several open-source tools: QGIS, Lizmap, JasperStudio, Redash, and ODK Aggregate.
ODK and Redash are the most recent tools we added to our IT infrastructure.

At the regional level, 25 colleagues and direct partners are now using ODK against our database. @nathalie_H and I are presently the principal form designers on the team.

This was our presentation at FOSS4G-fr 2018:

16mai_Cauchy_Bossaert-CENLR_0.pdf (3.0 MB)
(in it you can see our colleagues out in the field and the web tools we use with ODK).

How we met ODK

In 2015, a colleague from another region (@Remy_CLEMENT) showed us how they used GeoODK to help field technicians report their field work (grazing, tree cutting...).
A year later we discussed creating a dedicated form for our common naturalist database. We spent two days in Lyon creating the form and writing SQL queries to pull data directly from ODK Aggregate into our own database.
Two weeks later, 2 colleagues became beta testers and subsequently adopted our tool. They estimated that the form saved them 5 days per person; I spent five days creating the tool.

In 2016 @Remy_CLEMENT and I gave our first course to colleagues from other regions. This blueprint has evolved into a standard course offering, which we have now run four times for approximately 25 people from National Parks, CEN, botanical conservatories, regional parks, and various NGOs. The course is divided into two parts: 1.5 days on form design, and 1.5 days on installing ODK Aggregate and creating SQL views and triggers to interact with other databases.

Why We Use ODK for Mobile Data Collection

Our field season is quite dense, with long days and little time to spend in front of a computer. So our computer time was typically postponed to the end of the field season (and was not our most fun time of the year...). ODK gave us a good way to transform this previously unrewarding computer time into more actual days in the field, and more time for interesting data analysis and report writing.

Our days in the field are long, and spent in poorly connected areas. So our mobile data tool had to allow offline work and provide a stable, trusted storage system. The tool also had to be as easy to use as possible and had to constrain user input to ensure the acquired data was as reliable as possible (e.g. constrained vocabularies and defined input types). The mobile devices themselves had to offer good battery life in the field, and be protected against water and dust.

Form users

Our ODK users are mainly colleagues and specialists in a naturalist domain (plants, animals, mushrooms, natural habitats), but in some cases or studies they could be farmers, wine growers, etc.

Form logic

The form described in this Showcase is our main form, initially created in 2016. The first version allowed users to collect basic information about species and habitats. Each subsequent revision (in 2017, 2019 and 2020) improved upon it by adding more adaptive questions and choice lists according to the observations. This year we added three new features (described here), but the 2019 version probably represents our biggest improvement, and is the result of Jean Baïsez's university work.

Our form is used as a notebook for collecting species and habitat data, and it can also record threats to or pressures on nature, as well as management advice. All these data types are geo-located using ODK geo widgets and may be additionally documented with pictures taken with the phone.
Here is a logical schema of the form.

Tips and tricks

within the form...

1. Choice lists from a large external CSV file

Our reference taxonomic list comes from the National Natural History Museum.
This file contains more than 400 000 taxa for animals, plants, mushrooms, etc.
So we had to find a way to easily find the species we want in this big list.
We use the search() function in combination with startswith.

| type | name | label | hint | constraint | calculation | required | appearance | default |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| text | recherche_espece_animale | Nom de l'espèce animale : | au moins 3 lettres | string-length(${recherche_espece_animale})>2 | | yes | | |
| select_one list_espece | lb_nom_animalia | Sélectionnez l'espèce : | | | | yes | quick search('espece_animale', 'startswith', 'lb_nom_key', ${recherche_espece_animale}) | |
| calculate | cd_nom_animalia | | | | pulldata('espece_animale','cd_nom_key','lb_cd_nom_key',${lb_nom_animalia}) | | | |
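
The lookup logic of these three rows can be sketched outside of ODK. The following Python snippet is only an illustration, not how Collect implements search() or pulldata(): the sample rows and codes are made up, the helper names are hypothetical, and it assumes that the selected choice value comes from the lb_cd_nom_key column and that prefix matching ignores case.

```python
import csv
import io

# Made-up stand-in for the espece_animale.csv media file (codes are not real).
SAMPLE_CSV = """lb_nom_key,lb_cd_nom_key,cd_nom_key
Lacerta agilis,Lacerta agilis|900001,900001
Lacerta bilineata,Lacerta bilineata|900002,900002
Lutra lutra,Lutra lutra|900003,900003
"""


def load_rows(text):
    """Parse the CSV text into a list of dicts, one per taxon."""
    return list(csv.DictReader(io.StringIO(text)))


def search_startswith(rows, column, typed):
    """Mimic the 'startswith' search: keep rows whose column starts with the typed text.

    The form's constraint requires at least 3 letters, so shorter input returns nothing.
    Case-insensitive matching is an assumption of this sketch.
    """
    if len(typed) <= 2:
        return []
    prefix = typed.lower()
    return [row for row in rows if row[column].lower().startswith(prefix)]


def pulldata(rows, wanted_column, key_column, key_value):
    """Return wanted_column from the first row whose key_column equals key_value."""
    for row in rows:
        if row[key_column] == key_value:
            return row[wanted_column]
    return ""


rows = load_rows(SAMPLE_CSV)
matches = search_startswith(rows, "lb_nom_key", "lac")   # user typed "lac"
selected = matches[0]["lb_cd_nom_key"]                   # user picks the first choice
cd_nom = pulldata(rows, "cd_nom_key", "lb_cd_nom_key", selected)
```

Filtering before the list is displayed is what keeps a 400 000-row reference list usable on a phone: the user only ever sees the handful of taxa matching the typed prefix.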

2. Choices list styling

The form highlights the official species name in the species list, and shows its synonym in another format.

3. Personalized settings and metadata

The form can be easily customized or personalized; for example, a user may only want to use the form for collecting plant locations (lat+long). Each digit from 0 to 9 represents a different setting: 1 to 6 for the thematic sub-forms, and 7 to 9 for geo-location types.

  1. animals
  2. plants
  3. mushrooms
  4. natural habitats
  5. threat / pressure
  6. general observation
  7. point
  8. line
  9. polygon

Such personal settings do not yet exist in ODK. So we found a workaround by using the phone number from Collect's personal settings.
In case the phone number from Collect's personal settings is empty, we initialize it with an "all options" combination, "0123456789":

| type | name | calculation |
| --- | --- | --- |
| phonenumber | phonenumber | |
| calculate | custom_setting | coalesce(${phonenumber}, '0123456789') |

To implement this customization, we use a test in the choice_filter or relevant column of the associated group or field. For example, the choices presented by the geo-location method select_one are configured from the settings with this test:

contains(${custom_setting},filter)

and the choices sheet looks like this:

| list_name | name | label | filter |
| --- | --- | --- | --- |
| metode_geo | point | point | 7 |
| metode_geo | long_lat | coordinates input | 7 |
| metode_geo | line | line | 8 |
| metode_geo | polygon | polygon | 9 |
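
To make the effect of this filter concrete, here is a small Python sketch of the same logic. It is an illustration only: the function names are hypothetical, and it simply mirrors coalesce() and contains() as used above, with the choices taken from the sheet.

```python
# Choices of the metode_geo list, copied from the choices sheet above.
CHOICES = [
    {"name": "point", "label": "point", "filter": "7"},
    {"name": "long_lat", "label": "coordinates input", "filter": "7"},
    {"name": "line", "label": "line", "filter": "8"},
    {"name": "polygon", "label": "polygon", "filter": "9"},
]


def coalesce(value, fallback):
    """Return value unless it is empty, like the XLSForm coalesce() function."""
    return value if value else fallback


def visible_choices(phonenumber):
    """Return the choice names kept by contains(${custom_setting}, filter)."""
    custom_setting = coalesce(phonenumber, "0123456789")
    return [c["name"] for c in CHOICES if c["filter"] in custom_setting]


all_methods = visible_choices("")        # nothing set: every option is shown
plants_points = visible_choices("27")    # plants sub-form, point methods only
lines_polygons = visible_choices("289")  # plants, lines and polygons
```

A user who enters "27" as the phone number therefore sees only the plant sub-form and the two point-based geo-location methods.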

4. Form adapts to the observation type

The form adapts its input fields according to the type of the observation; specifically, plant descriptions differ from animal descriptions. For example, it is not relevant to ask for the behavior (French: 'comportement') of a plant, but it is for animals. Drilling down deeper, behaviors can vary between birds, spiders, amphibians...

| type | name | label | relevant | choice_filter |
| --- | --- | --- | --- | --- |
| select_one comportement | comportement | Comportement | ${type_observation} = 'Animalia' and ${groupe} != '' | (filter = ${groupe}) or (filter =1) |

where the "choices" are:

| list_name | name | label | filter |
| --- | --- | --- | --- |
| comportement | 18-Nid vu avec un adulte couvant | 18-Nid vu avec un adulte couvant | Oiseaux |
| comportement | construction toile | construction toile | Arachnides |

The ${groupe} field is calculated from the selected species, using the CSV media files:

| type | name | calculation |
| --- | --- | --- |
| calculate | groupe | concat(pulldata('espece_animale', 'groupe', 'cd_nom_key', ${cd_nom_animalia}), pulldata('espece_plante', 'groupe', 'cd_nom_key', ${cd_nom_plantae}), pulldata('espece_champi', 'groupe', 'cd_nom_key', ${cd_nom_fungi})) |

Here we check the 3 CSV species lists to find the taxonomic group. This could certainly be improved: instead of reading all three CSV files, we should only check the one corresponding to the value selected in ${type_observation}.
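
The improvement suggested here, together with the behavior filter, can be sketched in Python. This is a hedged illustration rather than form code: the species codes, the non-animal group names, the generic behavior with filter 1, and the 'Plantae' and 'Fungi' values of type_observation are all assumptions made for the example.

```python
# One lookup table per CSV media file, keyed by type_observation.
# Codes and the plant/fungus groups are hypothetical placeholders.
GROUPS_BY_TYPE = {
    "Animalia": {"900001": "Oiseaux", "900002": "Arachnides"},
    "Plantae": {"900101": "hypothetical_plant_group"},
    "Fungi": {"900201": "hypothetical_fungus_group"},
}

# Behavior choices; the last one is a made-up entry using the generic filter "1".
BEHAVIOURS = [
    {"name": "18-Nid vu avec un adulte couvant", "filter": "Oiseaux"},
    {"name": "construction toile", "filter": "Arachnides"},
    {"name": "hypothetical_generic_behaviour", "filter": "1"},
]


def lookup_groupe(type_observation, cd_nom):
    """Read only the table matching type_observation instead of all three."""
    return GROUPS_BY_TYPE.get(type_observation, {}).get(cd_nom, "")


def behaviour_choices(type_observation, cd_nom):
    """Apply the 'relevant' test, then the choice_filter, as in the form."""
    groupe = lookup_groupe(type_observation, cd_nom)
    # relevant: ${type_observation} = 'Animalia' and ${groupe} != ''
    if type_observation != "Animalia" or groupe == "":
        return []
    # choice_filter: (filter = ${groupe}) or (filter = 1)
    return [b["name"] for b in BEHAVIOURS if b["filter"] in (groupe, "1")]


bird = behaviour_choices("Animalia", "900001")
spider = behaviour_choices("Animalia", "900002")
plant = behaviour_choices("Plantae", "900101")
```

In the form itself, the same idea would mean choosing which pulldata() call to evaluate from the value of ${type_observation}, rather than concatenating the results of all three.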

on the database side

5. External tool integration

Uploaded pictures are automatically saved to files on our server (by a database task; the French page is yet to be translated) and shown in other tools, such as QGIS and our custom-written web tool:

Note: a web tool account is automatically created for an ODK user if they were not previously known to the system.

Solved issues and unresolved "problems"

Map to show previous location where form was filled

ODK Collect can now show on a map the previous instances of a form whose first question is a geopoint. Although this feature is very interesting, it is not really helpful for our generalist form, but it may be useful for others, such as phytosociological surveys. We will keep an eye on the evolution of the geo widgets, which we are sure will help us a lot.
That said, we could use a start-geopoint at the beginning of the form to show previous instances on a map and help users easily find their data.

Add a summary of observed species

At the end of each location inventory, it would be very useful to show the list of all species observed in that place by the user. This is already technically possible, but our colleagues only asked for this feature very recently.

Rename each loop with the species name instead of rank

We have been asked recently to add this feature to help with form navigation. We need to investigate what is possible in group naming with a variable, and will keep an eye on ODK form navigation developments.

Relevant forum topics

About "autocomplete" search in select_one for species :

About custom / personalized settings :

Add more metadata field -> Custom form metadata

About form navigation:

About choices list with long labels and html styling:

Screenshots

User details (name / email)

These values are obtained from ODK Collect's metadata. If they are not available or not set, the user can fill them in or change the default values. The user may also add other people who accompanied them in the field.

Geo/location method choice

This is the beginning question for the current observation. This list of choices can be personalized, as described previously.


We have since added the option to manually enter latitude and longitude decimal values, for people who may be filling in the form at home from written notes.

Geopoint example


In the future we would like to be able to move the map under the central target to give the user more accuracy.

Observation type

What kind of observation do we want to record for this first location? This list can be personalized by the user, as described previously.

Auto-completion and list styling

Our taxonomy uses synonyms to name taxa. At any given time, one of these synonyms is the valid name for the taxon. Knowledge of the taxonomy evolves over time, so taxa can be split or merged. Here we use HTML styling to show both the valid name and the synonym (if applicable) in the selection list. The user types at least 3 letters to choose a species or genus from the list.

Observation details

Here is an example for an animal, for each age group: adult, juvenile, undefined. The user must indicate the gender of the observed animal. If they select "male", the form displays a question asking for the number of adult males. At the end of the form we calculate and show the total number of animals seen in this observation.

Personal settings

Here is the usual screen showing the user's settings in Collect, containing the phone number and other options. As we saw before, the phone number is used to store the user's personal settings (1 to 6 for the thematic sub-forms, and 7 to 9 for geo-location types).

Data processing tools

ODK is tightly integrated into our information system. We have used PostgreSQL and PostGIS since 2006, and we retrieve data directly from Aggregate's PostgreSQL database server, using foreign data wrappers, views, and a cron task to save new observations into our historical database.
Our main database connects to Aggregate's via a PostgreSQL FDW. Each form table is created as a foreign table in the central database. We then create a view to format the data as needed, and every X minutes we integrate only the new data (rows whose "_URI" is not yet in our main database).
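
The "integrate only new data" step boils down to an anti-join on the submission identifier. The sketch below illustrates the idea under loud assumptions: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, a plain table (odk_form_view) in place of the view over the foreign table, and hypothetical table and column names. It is not our production SQL.

```python
import sqlite3

# In-memory SQLite database standing in for the central PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Stand-in for the view built on the foreign (Aggregate) form table.
    CREATE TABLE odk_form_view (uri TEXT PRIMARY KEY, species TEXT);
    -- Stand-in for the historical observations table.
    CREATE TABLE observations (uri TEXT PRIMARY KEY, species TEXT);
    """
)


def integrate_new_rows(conn):
    """Copy rows whose uri is not yet in observations; return how many were added."""
    before = conn.execute("SELECT COUNT(*) FROM observations").fetchone()[0]
    conn.execute(
        """
        INSERT INTO observations (uri, species)
        SELECT v.uri, v.species
        FROM odk_form_view AS v
        WHERE NOT EXISTS (SELECT 1 FROM observations AS o WHERE o.uri = v.uri)
        """
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM observations").fetchone()[0]
    return after - before


# Two submissions arrive, then the periodic task runs twice, then one more arrives.
conn.executemany(
    "INSERT INTO odk_form_view VALUES (?, ?)",
    [("uuid:a", "Lutra lutra"), ("uuid:b", "Lacerta bilineata")],
)
first_run = integrate_new_rows(conn)
second_run = integrate_new_rows(conn)
conn.execute("INSERT INTO odk_form_view VALUES (?, ?)", ("uuid:c", "Lacerta agilis"))
third_run = integrate_new_rows(conn)
```

Because the test is on the identifier, running the task again when nothing has arrived inserts nothing, which is what makes it safe to schedule every few minutes.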

This database is used by several tools, such as QGIS to create maps, and Redash and JasperStudio to generate web-based or static PDF dashboards.

This Redash screenshot shows how ODK (green line) has been replacing web input (blue line) for data entry since our initial adoption of ODK in 2015:

Resources (link to xls and multimedia files)

Here are the files (media files have been truncated) to try out this form: sicen_2020.zip (2.3 MB)

These are PostgreSQL functions to transform binary data (photo) stored in PostgreSQL into files: https://framagit.org/mathieubossaert/sql_divers/snippets/3620

Perspectives

on the data collection side...

We now cover 80% of our needs. Once an issue in JavaRosa is fixed, we will reach 90%!

on the server side

Using the new ODK Central instead of Aggregate is going to be a big step forward:

  • it lets us manage users and groups with privileges,
  • it will eventually provide a web interface to the form (see "Integration of Enketo into ODK Central"),
  • we have to check whether the tight SQL connection between the Central database and our own database will remain easy; otherwise we may have to use PostgreSQL's JSON capabilities to integrate the data into our GIS,
  • Central allows us to query data directly with Redash through a JSON API and create attractive dashboards.

Conclusion

ODK has become a core tool in our information system. The three most important factors are:

  1. it is easy to integrate into our existing database environment, thanks to its PostgreSQL backend,
  2. we don't need to expend time, money and effort on developing a custom mobile data acquisition app,
  3. we can instead focus on translating our existing field methods into XLSForm; minimal informatics skills are needed to create our forms.

Acknowledgments

Many thanks to @Xiphware and @LN for their advice.
Thanks to @Xiphware for the time devoted to proofreading this showcase and for the corrections and suggestions made.
Thank you to all ODK contributors for the tools you make and the quality of the discussions over the forum.

Thanks for this kind of stuff.
