Updating external datasets from another form's submissions data from within a PostgreSQL database

Background

As @Florian_May's showcase, the background of this is about turtle monitoring, but in France, and not about a marine species. In fact Caretta caretta is back on the french Mediterranean beaches since 2018 but does not yet required a heavy monitoring with ODK and authorities try to keep breeding places confident.

The turtle we study here is a small one, living in fresh water, called Emys Orbicularis. It is fun that I spend 2 years working on it 20 years ago just before I met computer sciences and became a GIS admin.

To monitor local populations, we put traps on the field the first day and days 2, 3 and 4 we catch turtles in the traps.
Once captured, if the individual is already marked we take some measurements, if not we mark it first.

Last few week @Vivian_Inereli developed a dedicated form she'll maybe expose here...

What came up in the discussion with "turtle collectors" is the fact that traps are moved every week and located on a map.
The user simply note the trap number on the paper when he catch a turtle.

We purposed to manage traps deployment with an ODK form.
Once that form OK we wanted to adapt this test and make it automated as possible.

Implementation

So let's go.
We developed 2 forms :

  • CMR_Cistude_pose_pieges wich is used to place the traps
  • and CMR_Cistude_captures wich is used to really monitor trapped individuals

CMR_Cistude_captures.zip (96,7 Ko)
CMR_Cistude_pose_pieges.xlsx (13,1 Ko)

All SQL queries below are ran in our "Organization" database. Nothing happens on Central's database.

We use a cron task (each day at 10 PM) to run queries below.

We use central2pg functions, located in the data_from_central schema to get fresh data about traps :

SELECT data_from_central.data_from_central_to_pg(
    'my_login','my_password','my.central.fdqn',5,
    'CMR_Cistude_pose_pieges', -- form ID
    'data_from_central', -- schema where to create tables and store data
    'point,point_auto' -- columns to ignore in json transformation to database attributes (geojson fields of GeoWidgets)
);

We created a view called cmr_cistude_pose_pieges_session_courante that returns only traps from the last (max) date.
We COPY the result TO a geojson file, somewhere postgres linux user can write.
At the moment we run that query even if there is no new trap.

COPY (
    WITH places AS (
        SELECT id_piege, label, value, geometry
    FROM data_from_central.cmr_cistude_pose_pieges_session_courante
    )
SELECT
  json_build_object(
    'type', 'FeatureCollection',
    'features', json_agg(ST_AsGeoJSON(t.*)::json)
  )
FROM places AS t
      ) to '/home/postgres/medias_odk/pieges.geojson';

We implement an informative version number for the form, containing the year concatenate to the number of the day in the year (DOY) coded on 3 chars and completed with '0' its the left.

concat(extract(YEAR FROM date_max),lpad(extract(DOY FROM date_max)::text,3,'0'))

We compare the current form version to the hypothetical newer one. If the "hypothetical" one is greater than the current one, we create a new draft.
To do so and following steps, we added new functions to central2pg set of functions.

SELECT data_from_central.create_draft('my_login','my_password','my.central.fdqn',5,'CMR_Cistude_captures')
FROM data_from_central.cmr_cistude_pose_pieges_session_courante
WHERE concat(extract(YEAR FROM date_max),lpad(extract(doy FROM date_max)::text,3,'0')) > data_from_central.get_form_version('my_login','my_password','my.central.fdqn',5,'CMR_Cistude_captures')
LIMIT 1;

Once the draft is created, we push the fresh geojson as a form attachment to the draft.
-> to be done : such a function need to be written for csv or this one need to be adapted.

SELECT data_from_central.push_media_to_central('my_login','my_password','my.central.fdqn',5,'CMR_Cistude_captures', '/home/postgres/medias_odk', 'pieges.geojson')
FROM data_from_central.cmr_cistude_pose_pieges_session_courante
WHERE concat(extract(YEAR FROM date_max),lpad(extract(doy FROM date_max)::text,3,'0'))::integer > data_from_central.get_form_version('my_login','my_password','my.central.fdqn',5,'CMR_Cistude_captures')::integer
LIMIT 1;

And then publish the new version :wink:

SELECT data_from_central.publish_form_version('my_login','my_password','my.central.fdqn',5,
    'CMR_Cistude_captures', concat(extract(YEAR FROM date_max),lpad(extract(doy FROM date_max)::text,3,'0'))::integer                            )
FROM data_from_central.cmr_cistude_pose_pieges_session_courante
WHERE concat(extract(YEAR FROM date_max),lpad(extract(doy FROM date_max)::text,3,'0')) > data_from_central.get_form_version('my_login','my_password','my.central.fdqn',5,'CMR_Cistude_captures')
LIMIT 1;
5 Likes

Amazing. Thank you so much for sharing.

-> to be done : such a function need to be written for csv or this one need to be adapted.

Any tips on CSV adaptation? Often csv data is updated in my case Rural dairy farm case studies e.g. license plates, cell phone #s

I will first simply copy/adapt the push_json_media_to_central() function for csv files. I will try it by the end of the day and update the repo to add push_csv_media_to_central() function.
In a second time will probably add a generic function to upload csv or geojson or xml attachments that checks the file extension to adapt the Content-type Header.

1 Like

@alios82 We finally now use a generic function to push any geojson, csv or xml media as attachment to draft form : push_media_to_central(). I made the modification in the showcase exemple.
And here you can find the new function's definition.

2 Likes

Hurray! Much appreciated.

1 Like