No filter in odk-central

I installed ODK Central today because I like the ruODK package in R. When trying to find out how to retrieve a single submission by filtering, I noted:

"Currently, there are no paging or filtering options, so listing Submissions will get you every Submission in the system, every time."

Downloading all records is not realistic for me: I always want to retrieve a single record of one patient at a given visit, not the whole patient database. So something like query={"pat_no":"1234"}.

Is there a workaround? Back to old-style?

Could you do what you need using the $skip and $top parameters?

The $top and $skip querystring parameters, specified by OData, apply limit and offset operations to the data, respectively. The $count parameter, also an OData standard, will annotate the response data with the total row count, regardless of the scoping requested by $top and $skip. While paging is possible through these parameters, it will not greatly improve the performance of exporting data. ODK Central prefers to bulk-export all of its data at once if possible.

see https://odkcentral.docs.apiary.io/#reference/odata-endpoints/odata-form-service/data-document
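As a rough illustration of the paging parameters quoted above, here is a minimal Python sketch that only builds the request URLs. The server name, project ID, and form ID are hypothetical, and the `.svc` URL layout is an assumption based on the linked API documentation; authentication and the actual HTTP request are left out.

```python
from urllib.parse import urlencode

# Hypothetical service URL; replace host, project ID and form ID with your own.
SERVICE_URL = "https://central.example.org/v1/projects/1/forms/lab_values.svc"


def odata_page_url(service_url, top, skip=0, count=False):
    """Build a Submissions URL with $top (limit) and $skip (offset)."""
    params = {"$top": top, "$skip": skip}
    if count:
        # $count annotates the response with the total row count.
        params["$count"] = "true"
    # safe="$" keeps the OData parameter names readable in the URL.
    return f"{service_url}/Submissions?" + urlencode(params, safe="$")


def page_offsets(total, page_size):
    """Return the $skip offsets needed to walk through `total` rows."""
    return list(range(0, total, page_size))


if __name__ == "__main__":
    for offset in page_offsets(120, 50):
        print(odata_page_url(SERVICE_URL, top=50, skip=offset, count=True))
```

Note that this only pages through the table; it does not locate a particular patient.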

Sorry, no, I have no clue where in the table I find patient 123...

What do you need to do with a single record? Please describe your high-level need in as much detail as you can (what answers are you trying to get from the data, what existing systems do you need to integrate with, etc).

Hi all, my answer to Dieter's question (from an ruODK perspective): https://github.com/dbca-wa/ruODK/issues/72

Downloading all data into one protected code repo / server and filtering locally works well for my use case, but I'm aware that my data (threatened species in politically sensitive locations) is not on the same level of data privacy requirements and liability as human patient data.

Interested to hear about Dieter's use case!

Example: Create individual reports (1 patient per pdf, for example, generated by rmarkdown) of all laboratory values of patients last week who tested positive for Corona.

This is a standard SQL query, but as I see it, ODK is more geared towards summary statistics.

@Florian_May's response is an interesting read, but I feel uneasy about downloading everything and filtering locally. The "last week" part could be done on the server, as I read Florian's post.

Looks to me like this does not scale: it works well at the start and gets sluggish over time. I should test rather than go by feel, though.

See how to get a subset in Kobo.

Interesting use case!

Until ODK Central supports OData queries, the options are limited to the "download all data first" scenario.

In favour of this approach, at 1 PDF per patient I'd expect the combined report render time to outweigh the total download time (including earlier records), especially if media attachments are nonexistent or their download is skipped (ruODK::odata_submission_get(download=FALSE)).
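To make the "download all data first, then filter locally" scenario concrete, here is a minimal Python sketch of the local filtering step (the same idea applies in R after an ruODK download). It assumes the rows have already been downloaded as a list of dicts. The field names `pat_no` and `corona_test` are hypothetical, and the `__system.submissionDate` layout is an assumption about the OData row format.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as 2020-04-07T09:15:00.000Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_submissions(rows, pat_no=None, since=None):
    """Keep rows matching the patient number and/or submitted on or after `since`."""
    kept = []
    for row in rows:
        if pat_no is not None and row.get("pat_no") != pat_no:
            continue
        if since is not None and parse_ts(row["__system"]["submissionDate"]) < since:
            continue
        kept.append(row)
    return kept


# Made-up sample rows standing in for an already downloaded table.
SAMPLE_ROWS = [
    {"pat_no": "1234", "corona_test": "positive",
     "__system": {"submissionDate": "2020-04-07T09:15:00.000Z"}},
    {"pat_no": "1234", "corona_test": "negative",
     "__system": {"submissionDate": "2020-03-01T08:00:00.000Z"}},
    {"pat_no": "5678", "corona_test": "positive",
     "__system": {"submissionDate": "2020-04-08T14:30:00.000Z"}},
]

if __name__ == "__main__":
    week_start = datetime(2020, 4, 6, tzinfo=timezone.utc)
    for row in filter_submissions(SAMPLE_ROWS, since=week_start):
        print(row["pat_no"], row["corona_test"])
```

Each remaining row could then be handed to the per-patient report step.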

@issa might have insight into performance for larger amounts of data.
@dr_michaelmarks might be able to share real-world performance considerations with their famous 1 million+ Ebola records (Aggregate & encrypted forms & Briefcase).

We've discussed the ability to filter by submission metadata, particularly submission date and submitter ID. That sounds like it would help for pulling all patients last week. Do you have other filtering use cases as well?

One other thought: I'm not sure this is an official part of the API, but I think the ….svc/Submissions endpoint returns submissions in creation order (so that newer submissions come first). If you know you need the most recent submissions, you might be able to leverage that fact.
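If that ordering holds (and, as said, it may not be an official guarantee), the idea could be sketched in Python roughly as follows: page through the newest rows and stop at the first one older than a cutoff. The `fetch_page` callable is a hypothetical stand-in for whatever performs the $top/$skip request, and the `__system.submissionDate` layout is an assumption.

```python
def recent_rows(fetch_page, cutoff, page_size=100):
    """Collect rows submitted at or after `cutoff`, assuming newest-first order.

    fetch_page(top, skip) must return a list of rows (hypothetical helper).
    Timestamps are compared as strings, which assumes they all share the
    same ISO 8601 UTC format.
    """
    recent = []
    skip = 0
    while True:
        page = fetch_page(page_size, skip)
        if not page:
            return recent  # ran out of data
        for row in page:
            if row["__system"]["submissionDate"] < cutoff:
                return recent  # everything after this row is older
            recent.append(row)
        skip += page_size


# Made-up in-memory table, newest first, used instead of a real server.
FAKE_TABLE = [
    {"__system": {"submissionDate": f"2020-04-{day:02d}T12:00:00.000Z"}}
    for day in range(10, 0, -1)
]


def fake_fetch_page(top, skip):
    """Stand-in for an HTTP request using $top and $skip."""
    return FAKE_TABLE[skip:skip + top]


if __name__ == "__main__":
    rows = recent_rows(fake_fetch_page, "2020-04-06T00:00:00.000Z", page_size=3)
    print(len(rows), "recent submissions")
```

The benefit is that only the newest pages are requested, but it depends entirely on the ordering assumption.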

I also wanted to link to a few related discussions:

Filtering by submission data is already possible in two steps: download the full list first, then download the individual submissions. That is no more than a workaround.

Kobo has the fairly hidden MongoDB-style query (the "where" part):

https://kf.kobotoolbox.org/api/v2/assets/atuMmdddddt/data/?query={"pat_gruppe/pat_name":"Lukumbu"}&format=json

and in theory also the select part using fields, which I did not get to work.
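For reference, a small Python sketch of how a URL like the one above could be assembled with proper escaping. It only builds the string; the asset UID and field path are taken from the example above, and the helper name `kobo_query_url` is made up for illustration.

```python
import json
from urllib.parse import urlencode


def kobo_query_url(base_url, asset_uid, query):
    """Build a Kobo data URL with a JSON "where" query, URL-encoded."""
    params = {
        # Compact JSON, e.g. {"pat_gruppe/pat_name":"Lukumbu"}
        "query": json.dumps(query, separators=(",", ":")),
        "format": "json",
    }
    return f"{base_url}/api/v2/assets/{asset_uid}/data/?" + urlencode(params)


if __name__ == "__main__":
    print(kobo_query_url(
        "https://kf.kobotoolbox.org",
        "atuMmdddddt",
        {"pat_gruppe/pat_name": "Lukumbu"},
    ))
```

Encoding the JSON avoids problems with braces, quotes, and slashes in the query string.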

supporting strict equality filter in odata export is definitely not beyond consideration. the only issue is that we open the can of worms that is parsing odata filter expressions, and rejecting all the cases we can't handle (which will be most of them).

perhaps if our team grows we can entertain the idea of actually servicing a broad range of filter expressions. for now i think supporting strict equality filter over odata is a reasonable ask, but with our team as small as it is i'm not sure exactly when we will get to it. right now the focus is on web forms rather than on export.
