Central Entity uploads from file

Reply by October 2nd!

Hello all! We are working hard on Entities. Entities are a way to represent people, places and things that can be shared between forms and updated over time. They make it easier to do things like longitudinal data collection or case management. You can learn more in the documentation.

We are starting to look at bulk uploading Entity data directly into Central from data files, which we know is necessary for many of you to begin making use of Entities.

We have a lot of work to do and questions to answer in this area, but for now we are starting with a couple of basic questions that we'd like your feedback on.

Question 1
Given the choice, would you prefer to upload Entity data files to Central as CSV (comma-separated values) files or Excel files? What's easier for you to imagine doing? If it doesn't matter to you or you're not sure, select both.

  • CSV (.csv) files
  • Excel (.xlsx) files

Question 2
How often do you need to bulk load Entity data from a data file? Please choose all that apply.

  • Once, when the project begins, I will load all the data from a file
  • Sometimes I want to replace all the Entity data with the data from a file
  • Sometimes I want to add many Entities to the existing records in Central, from a file

Thank you for participating and giving us valuable feedback! Your experiences help shape our decisions.

Please feel free to leave additional comments below with your needs for a bulk upload Entities feature. Do you upload on a schedule, for example every week? Do you have a strange file format you have to deal with? Does your bulk data come out of some other system or process? We can't address everything but hearing your individual stories and the context around your use of the feature can help us even more than the responses to this poll.

The polls will close in 2 weeks, on October 2nd. Thank you!

7 Likes

@tgachet you mentioned in another thread:

Would you be willing to say a little bit more about your use case? In particular, it would be helpful to know roughly what your entities represent. Are they values that change rarely but need to be shared between forms (e.g. county names)?

Hi!

In my use case, entity lists change regularly (e.g. study ID or name, observer's company, GeoJSON location). These data don't have to be shared between forms because they are generated by a third-party web application, and I want to share them with my forms in Central (no linked forms yet).
Currently, I use the media .csv function to update the lists, but this requires publishing a new version of the form in Central every time (and downloading all media each time on the ODK Collect side).

I would like to be able to automate updating entity lists without having to publish a new form version and without having to download all the entity lists again in ODK Collect once in the field.

I hope I'm clear enough! :slightly_smiling_face:

1 Like

Thank you, @tgachet, that makes a lot of sense! In that case, you might actually be well-served already by the API I described in the other thread! Now that you've tried it, does it feel like it would work for your needs, or would you have a strong preference for a way to specify your updates in a single file? I imagine that since this would be done programmatically you'd have a preference for CSV over XLSX, right?

Hello @LN, I'm going to test the API for updating entities (which has been available since version 2023.3 of Central, if I'm not mistaken?).

From what I understand, I will almost certainly not need to go through CSV and the Central web application, because I will query the API directly from my ETL tool, which is connected to the database containing the data I want to share through datasets (lists of entities)!

1 Like

Hello @LN!
I finally tested the creation of entities through the API and it works very well!

This is exactly what I wanted: to be able to insert or delete entities without using forms.
I call the API from the FME ETL tool and can transform data from my database tables into entities, transparently for ODK Collect users!
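For readers who want to try the same thing from a script rather than an ETL tool, here is a minimal sketch of what the two requests might look like. It only builds the requests and does not send them. The server URL, project ID, Entity List name and property names are hypothetical, and the endpoint paths and body shape reflect my understanding of the Central API, so please check them against the API documentation for your Central version.

```python
import json
import uuid
from urllib.parse import quote
from urllib.request import Request

# Hypothetical values for illustration only.
BASE_URL = "https://central.example.org"
PROJECT_ID = 1
ENTITY_LIST = "studies"


def entities_url(base_url, project_id, entity_list):
    """URL of the Entities collection for one Entity List (assumed path)."""
    return f"{base_url}/v1/projects/{project_id}/datasets/{quote(entity_list)}/entities"


def build_create_request(token, label, data, entity_uuid=None):
    """Build (but do not send) a request that creates one Entity."""
    body = {
        "uuid": entity_uuid or str(uuid.uuid4()),
        "label": label,
        "data": data,  # property values are kept as strings in this sketch
    }
    return Request(
        entities_url(BASE_URL, PROJECT_ID, ENTITY_LIST),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_delete_request(token, entity_uuid):
    """Build (but do not send) a request that deletes one Entity by UUID."""
    return Request(
        f"{entities_url(BASE_URL, PROJECT_ID, ENTITY_LIST)}/{quote(entity_uuid)}",
        headers={"Authorization": f"Bearer {token}"},
        method="DELETE",
    )
```

Passing either request to `urllib.request.urlopen` would send it. The token is assumed to be a session token obtained from Central beforehand.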

To test and familiarize myself with the ODK Central API, I use the open-source Bruno software. Here is my public GitHub repository, which lets you load a ready-made working environment for the ODK Central API in Bruno.

4 Likes

Thanks so much for that feedback, @tgachet!

In the next Central release, we'll be adding the following to the API:

  • bulk upload from JSON (the frontend will accept CSV initially)
  • entity list creation
  • entity property declaration
3 Likes
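As a rough illustration of how a CSV file could be turned into a JSON body for a bulk upload, here is a minimal sketch. The payload shape (an `entities` array plus a `source` object describing the file) is my assumption about the bulk endpoint, and the function name and column names are hypothetical, so treat this as a starting point rather than the actual format.

```python
import csv
import io
import json


def csv_to_bulk_payload(csv_text, source_name, label_column="label"):
    """Convert CSV text into an assumed bulk-upload payload.

    Every column except the label column becomes an Entity property.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    entities = []
    for row in reader:
        label = (row.get(label_column) or "").strip()
        if not label:
            raise ValueError(f"line {reader.line_num}: missing {label_column!r}")
        # Skip surplus cells (key None) and replace missing cells with "".
        data = {
            key: (value or "")
            for key, value in row.items()
            if key is not None and key != label_column
        }
        entities.append({"label": label, "data": data})
    return {
        "entities": entities,
        "source": {"name": source_name, "size": len(csv_text.encode("utf-8"))},
    }


if __name__ == "__main__":
    sample = "label,study_id,company\nStudy A,12,ACME\nStudy B,13,Globex\n"
    print(json.dumps(csv_to_bulk_payload(sample, "studies.csv"), indent=2))
```

Rows without a label raise an error early, so a bad file is caught before anything is sent to the server.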

@tgachet I think we'll talk about entities in April, and I hope you'll introduce me to Bruno!

1 Like

With pleasure, @mathieubossaert! :wink:

Hello!

I'm continuing my tests with entities :slightly_smiling_face: and I found a small flaw in how entities are updated in ODK Collect.
When I add entities to an entity list (always with the API), there is no problem on the Collect side: I receive the new entities.
When I delete an entity (with the API), the change does not take effect in Collect (I can still see the deleted entity).
The deletion only takes effect in Collect if it is accompanied by an addition (so Collect only refreshes its entities when something is added)!

You have discovered https://github.com/getodk/central/issues/599, sorry about that! @ktuite is on it and we should have a fix soon.

OK, no worries! I didn't think to check the issues. I will do so and post if necessary!

Thank you @tgachet for your GitHub repository using Bruno. I've used it successfully with ODK Central!

Thank you @LN for the great work on the API. I'm looking forward to the next Central release for the bulk upload, such a neat feature! Approximately when will it be available? We have a data collection starting on April 11th that would benefit greatly from this.

2 Likes

It's possible we'll have released by then but it's going to be tight so I would recommend making alternative arrangements such as setting up a script that can do your upload.

We have nearly all of the functionality for the release in place. We're now starting quality assurance and will fix all issues discovered before we start a dedicated regression testing pass. Once we pass regression testing, we'll be ready to release.

2 Likes

Importing CSVs into Entity Lists was added in v2024.1: https://docs.getodk.org/central-entities/#importing-csvs-into-entity-lists

We will eventually add update by CSV. You can do updates today by using the API, for example, using the pyodk merge function: https://getodk.github.io/pyodk/entities/#pyodk._endpoints.entities.EntityService.merge
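To make the difference between "add many Entities" and "replace all the Entity data" from Question 2 concrete, here is a minimal sketch of the comparison that a merge-style update has to perform. It is not pyodk's implementation and it does not talk to a server. The function name, the `study_id` match key and the `delete_not_matched` flag are illustrative choices for this example.

```python
def plan_merge(source_rows, existing, match_key="study_id", delete_not_matched=False):
    """Compare rows from a file with existing Entities and plan the changes.

    Rows are matched on match_key. Deletions are planned only when
    delete_not_matched is True, which corresponds to a full replace.
    """
    existing_by_key = {entity[match_key]: entity for entity in existing}
    source_by_key = {row[match_key]: row for row in source_rows}

    to_insert = [
        row for key, row in source_by_key.items() if key not in existing_by_key
    ]
    to_update = [
        row
        for key, row in source_by_key.items()
        if key in existing_by_key and existing_by_key[key] != row
    ]
    to_delete = []
    if delete_not_matched:
        to_delete = [
            entity
            for key, entity in existing_by_key.items()
            if key not in source_by_key
        ]
    return {"insert": to_insert, "update": to_update, "delete": to_delete}


if __name__ == "__main__":
    existing = [
        {"study_id": "1", "label": "Study A"},
        {"study_id": "2", "label": "Study B"},
    ]
    source = [
        {"study_id": "2", "label": "Study B (renamed)"},
        {"study_id": "3", "label": "Study C"},
    ]
    print(plan_merge(source, existing))
    print(plan_merge(source, existing, delete_not_matched=True))
```

With the default settings the file only adds and updates. Setting `delete_not_matched=True` also plans the removal of anything missing from the file, so it is worth reviewing the plan before applying it.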