1. What is the issue? Please be detailed.
On ODK Central, I can only delete entities one by one. I would like to delete many at once. Is that possible?
2. What steps can we take to reproduce this issue?
Simply upload way too many entities.
3. What have you tried to fix the issue?
I deleted entities one by one, but that takes far too long for 75K.
4. Upload any forms or screenshots you can share publicly below.
I think pyODK is a good way to go here (probably for your whole workflow), as it has some functions that make entity management more straightforward.
You can do this by creating a CSV (on your PC) with all the entities that you wish to delete (e.g. download the Entity List from Central and keep only the rows to remove). Then run a loop that gives pyODK the UUID of each entity to delete (this is stored in Central as __id).
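To make that recipe a bit more concrete, here is a minimal sketch of the loop. The file name, project ID and Entity List name are placeholders. The part I have not verified against your pyODK version is the delete call itself: the sketch assumes pyODK's generic client.delete(...) HTTP method pointed at Central's entity DELETE endpoint, so please check both before running it on real data. The loop takes the delete function as an argument, so you can dry-run it with a stub first.

```python
import csv
from typing import Callable, Iterable


def read_entity_ids(csv_path: str, id_column: str = "__id") -> list[str]:
    """Return the non-empty values of the ID column from an Entity List CSV."""
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        reader = csv.DictReader(f)
        ids = [(row.get(id_column) or "").strip() for row in reader]
    return [uuid for uuid in ids if uuid]


def delete_entities(
    uuids: Iterable[str], delete_one: Callable[[str], object]
) -> tuple[list[str], list[tuple[str, str]]]:
    """Call delete_one for each UUID; collect successes and failures."""
    deleted, failed = [], []
    for uuid in uuids:
        try:
            delete_one(uuid)
            deleted.append(uuid)
        except Exception as err:  # keep going so one bad row does not stop 75K
            failed.append((uuid, str(err)))
    return deleted, failed


def make_pyodk_deleter(client, project_id: int, entity_list_name: str):
    """Build a delete_one function on top of a pyODK Client.

    ASSUMPTION: client.delete(path) issues an HTTP DELETE relative to the
    Central API root, and the path below matches Central's entity endpoint.
    """
    def delete_one(uuid: str):
        path = f"projects/{project_id}/datasets/{entity_list_name}/entities/{uuid}"
        response = client.delete(path)
        response.raise_for_status()
        return response
    return delete_one
```

Usage would look roughly like this (all values hypothetical): open a client with `from pyodk.client import Client` and `with Client() as client:`, then call `delete_entities(read_entity_ids("to_delete.csv"), make_pyodk_deleter(client, 1, "plots"))` and inspect the returned `failed` list.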
I think there is currently no function to delete a whole Entity List, but I believe it is in progress. That would be 'too easy'.
My advice would be to become familiar with managing submissions and Entities through pyODK (or ruODK if you already work with R). Working at a scale of 75K+ will give you a big headache otherwise, and if you go into a data collection phase without these tools, it will hurt. A friend told me...
If you are working with geometry-based entities, check out QuODK as it might help (assuming that you are familiar with QGIS!).
Other tools are available, and there are still gaps in functionality for that specific 'recipe' I describe above. But it will get you closer.
I know that this is not a 'cut and paste' answer, but it is a fishing rod, rather than a fish... Good luck.
I want to be able to manage this big list of entities over time, so I would rather use pyODK to manage the entities. Is there a way to filter the entities based on their properties? Nothing is mentioned in the pyODK docs. So I need to maintain the whole list on my side as well, to work out which ones to delete, right?
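One way to approach the filtering question without keeping a permanent local copy is to download the Entity List when needed and filter it in Python. I am not certain which filters Central applies server-side for entity properties, so this minimal sketch filters after download. The property names (status, owner_id) are hypothetical, and the row shape (a dict with __id, label and the properties) is my assumption about what pyODK's client.entities.get_table(...)["value"] returns.

```python
from typing import Any, Callable, Iterable, Mapping

Row = Mapping[str, Any]


def select_entity_ids(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> list[str]:
    """Return the __id of every entity row for which predicate(row) is true."""
    return [row["__id"] for row in rows if predicate(row)]


# Hypothetical rows, shaped like the "value" list of an OData entity table.
rows = [
    {"__id": "uuid-1", "label": "Plot A", "status": "inactive", "owner_id": "17"},
    {"__id": "uuid-2", "label": "Plot B", "status": "active", "owner_id": "17"},
    {"__id": "uuid-3", "label": "Plot C", "status": "inactive", "owner_id": "42"},
]

# Example: every inactive plot that belonged to owner 17.
to_delete = select_entity_ids(
    rows, lambda r: r.get("status") == "inactive" and r.get("owner_id") == "17"
)
```

With a real client, `rows` would come from something like `client.entities.get_table(entity_list_name="plots")["value"]` (Entity List name hypothetical), and `to_delete` can then be fed to whatever delete loop you use.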
Can you share a little bit more about the workflow? That context would help us make more helpful recommendations. Questions that come to my mind:
When and why is a plot deleted? Would a plot need to be undeleted?
You have an external system that has the plot data. Are you doing a one-time import into Central and managing everything there, or are you flowing data between ODK and some other system?
Yes, we have a large dataset of 75K plots mapped over the years. The project will carry on mapping new plots and also updating plots when ownership changes. For example, an owner passes away and his 2 sons each inherit part of the land. We need to delete the father's plot entity and create 2 new entities.
In terms of implementation with ODK, we need a form for "deleting" the father's plot. Then the field agent will create 2 entities through the usual new mapping form. Once the mapping has been done and the 2 new entities created, the data engineer has to clean the geometry (remove self-intersections, duplicated points, ...). This means accessing the geometries, fixing them with QGIS if necessary, then updating the entities in ODK with the fixed geometry.
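As an illustration of the simplest of those cleaning steps, here is a minimal sketch that drops consecutive duplicated points from a geoshape value. It assumes the geometry is stored as an ODK geoshape string ("lat lon alt acc" points separated by semicolons). It does not attempt to fix self-intersections, which would still need QGIS or a geometry library.

```python
def drop_duplicate_points(geoshape: str) -> str:
    """Remove consecutive points that share the same latitude and longitude.

    The closing point of the ring (equal to the first point) is kept,
    because it is not adjacent to the first point.
    """
    cleaned: list[list[str]] = []
    previous = None
    for point in geoshape.split(";"):
        parts = point.split()
        if len(parts) < 2:
            continue  # skip empty fragments such as a trailing ";"
        lat_lon = (float(parts[0]), float(parts[1]))
        if lat_lon == previous:
            continue  # same position as the point just before it
        cleaned.append(parts)
        previous = lat_lon
    return ";".join(" ".join(parts) for parts in cleaned)


raw = "-1.0 36.0 0 0;-1.0 36.0 0 0;-1.0 36.1 0 0;-1.1 36.1 0 0;-1.0 36.0 0 0;"
fixed = drop_duplicate_points(raw)
```

The cleaned string can then be written back to the entity's geometry property with whichever update route you settle on.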
Yes, you can have a form that deactivates the father, activates the children, and so on. Or use some Python script to do that.
You could use the status property to identify which plots need cleaning and have another script export just those Entities as GeoJSON for the data engineers to import and re-import.
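A rough sketch of that export step, under a few assumptions: the entity rows are dicts with __id, label and the properties; the property names status and geometry and the value needs_cleaning are hypothetical; and the geometry is an ODK geoshape string ("lat lon alt acc" points separated by semicolons). GeoJSON wants [lon, lat] order, so the coordinates are swapped.

```python
import json
from typing import Any, Iterable, Mapping


def geoshape_to_ring(geoshape: str) -> list[list[float]]:
    """Convert an ODK geoshape string into a closed GeoJSON ring ([lon, lat])."""
    ring = []
    for point in geoshape.split(";"):
        parts = point.split()
        if len(parts) < 2:
            continue  # skip empty fragments
        lat, lon = float(parts[0]), float(parts[1])
        ring.append([lon, lat])
    if ring and ring[0] != ring[-1]:
        ring.append(list(ring[0]))  # GeoJSON rings must be closed
    return ring


def entities_to_geojson(
    rows: Iterable[Mapping[str, Any]],
    status_value: str = "needs_cleaning",
    status_key: str = "status",
    geometry_key: str = "geometry",
) -> dict:
    """Build a FeatureCollection from the rows whose status matches."""
    features = []
    for row in rows:
        if row.get(status_key) != status_value:
            continue
        ring = geoshape_to_ring(row.get(geometry_key) or "")
        if len(ring) < 4:
            continue  # not a valid polygon ring
        properties = {
            k: v for k, v in row.items() if k not in (geometry_key, "__system")
        }
        features.append({
            "type": "Feature",
            "id": row["__id"],
            "geometry": {"type": "Polygon", "coordinates": [ring]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


def write_geojson(collection: dict, path: str) -> None:
    """Write the FeatureCollection to a file that QGIS can open."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(collection, f, indent=2)
```

Keeping __id in each feature means the edited geometry can be matched back to the right entity when it is re-imported.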
That's essentially what I designed QuODK to be able to do (version 1.2). The last (or maybe latest) piece in my puzzle is using the pyODK merge function to allow update of specific properties within the entity (e.g. just the geometry or an owner ID) - work in progress!
Not disagreeing with @yanokwa on the use of GeoJSON especially if you have a dedicated data engineer to write scripts and transfer data. However, if you are already using QGIS, QuODK will load submissions and entities to the Canvas as temporary layers to allow editing (of geometry and attributes)... Then you can export any features as a CSV (with a subset of attributes if required) to send back to Central using pyODK. Not quite automated, but hopefully just enough checkpoints in there to prevent inadvertent data loss.