Retrieving Dynamic Media from Entity

1. What is the issue? Please be detailed.

Summary
Essentially, my question is: how do I create a select multiple or select one question that looks like this: https://docs.getodk.org/form-question-types/#select-widget-with-no-buttons-appearance but uses media collected in a different form using entities?

It is important to note that tapping an answer shown as an image should play the audio associated with that image. I only need this to work in Collect, not in Enketo. I've made an example form with images and audio here:

Selects Image Audio.zip (3.3 MB)
Looks like this:


SStest.mov.zip (162.8 KB)

The example form uses images as form attachments, with the choices simply listed in the choices tab. It shows what I would want the final question in a follow-up form to look like, except that I want the media to be retrieved from an entity instead of from form attachments.

Details of Desired Workflow Using Entities

I'm working in a context where a minority language is primarily used orally and doesn't have an official written form. I would like to use entities to quickly capture auditory linguistic data in one form whenever new words, concepts, or important examples are found. Later, I would like to be able to review that linguistic data in a different form. The first form would be used on an ongoing basis to create new entities for large quantities of linguistic data. Let's say that for any given language project we would expect to have AT LEAST 5-10k words. Each word would be an entity and would contain, at a bare minimum, a short audio recording of that word, optionally another longer audio recording of that word being used in a phrase or sentence, and often an image. If there is no image, a text string would be required. There would also be several other selects that could be any number of things (keywords, parts of speech, subject/category, etc.). Over a period of years, this list would continually be added to. I don't know what the maximum number of submissions would be, but my guess would be in the 30k-50k range.

The current goal of the 2nd form would be to review linguistic data using filters/cascading, where I could apply several different filters to limit the choices shown in a select multiple (preferably) or select one question.

It seems like this issue may be similar to this: Dynamic media - video in choice list from dynamic CSV - #7 by LN

In the following discussion LN mentioned this on April 11 2024:

That is precisely what I'm looking to do.

In the ODK Question Types documentation it says "Selects from external datasets can be used in all the same ways as internal selects". However, in the Entities documentation it states, "Another difference is that there currently isn't support for media or translations in Entity Lists."

Are there any future plans to include media or translations in entity lists?

2. What steps can we take to reproduce this issue?
Download the two files below and upload the forms to Central. Make a couple of submissions with the Add participant1 form, then wait 15 minutes for the entity to update for the Participant follow-up form.

Add participant1.xlsx (20.4 KB)
Participant follow-up.xlsx (77.3 KB)

3. What have you tried to fix the issue?
So far, I just get the text of the file name rather than the media itself, as seen in the screenshot below:

I'm new to entities and Central but I'm really wanting to make this work. Attached below are the example forms I've used to try to get the functionality I'm talking about. I tried adding an audio, image, and media::image column to the entities tab in my XLSForm which can be seen here:

Add participant_ex.xlsx (20.4 KB)
Participant follow-up.xlsx (77.3 KB)

This is the error I get when trying to upload the attached Add participant_ex XLSX form to Central. I was trying to do what I thought was explained here: Pull data with photos - #8 by TobiasMcNulty, but in the entities tab, where I added a media::image column.

The XLSForm could not be converted: The entities sheet included the following unexpected column(s): 'audio', 'image', 'media'. These columns are not supported by this version of pyxform. Please either: check the spelling of the column names, remove the columns, or update pyxform.

This might be obvious, and it's not a seamless solution due to the extra steps required, but it might get you through until entities have media support:

Have you tried generating a choice list of your entities with the media filenames (image/audio) that result from the submissions, including that list in your follow-up form, and adding a non-relevant select that uses it? Central will then ask for the attachments to be uploaded. If you create your dynamic filenames using a concat('jr://images/', ${filename}) type expression (or store this as a value in the entity list), then the form shouldn't fail, as it has access to those media.

I saw your post regarding this option. While I may look into that, the main reason I want to avoid it is that with my current workflows for this project (and another), small amounts of data will be added on a frequent basis (daily). That extra step seems doable if I only had to do it every month or so, but it wouldn't give me immediate (daily or near daily) access to the data I've collected without continually updating the form.

Another reason I'm trying to avoid it has to do with internet access. Right now in ODK Central I see one button to download all media attachments. My internet is via mobile data only. As my dataset grows to, let's say, 5GB, and I grow it incrementally by 10MB/day, would I have to download 5.01GB on day 2, 5.02GB on day 3, etc.? I don't really do any coding, but I was able to do my own ODK Central installation, and it seems like there isn't an easy way to do incremental downloads of media on the backend.

Would there be a way to modify the link below, which is currently behind the button to download all media attachments, so that I could download only the media attachments from a given day forward, or filter them in some other way?

https://odk.example.com/v1/projects/1/forms/FFF/submissions.csv.zip?splitSelectMultiples=false&groupPaths=true&deletedFields=false

I'm looking forward to offline entities even though they currently don't support media, but if entities worked with media offline in the future, this would be a game changer!

There are better ways to do this, but two manual ways are to: 1) filter your submissions to a date range (e.g. daily, as yesterday), then download all, and the zip will only contain media from that range; or 2) change the status to 'approved' once downloaded, then filter to exclude approved submissions and download the remainder.
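For anyone who would rather script the first option than click through the filter each day, here is a rough Python sketch that builds a date-filtered version of the export link quoted above. It assumes the export endpoint accepts an OData-style `$filter` parameter on `__system/submissionDate`, which is worth checking against the API documentation for your Central version. The function name `filtered_export_url` and the example values are made up for illustration.

```python
from urllib.parse import quote, urlencode


def filtered_export_url(server, project_id, form_id, start, end=None):
    """Build a submissions.csv.zip URL limited to a submission date range.

    Assumption: the server accepts an OData-style $filter on
    __system/submissionDate for this endpoint.
    """
    conditions = [f"__system/submissionDate ge {start}"]
    if end is not None:
        conditions.append(f"__system/submissionDate lt {end}")
    params = {
        # Same options as the link behind the download button.
        "splitSelectMultiples": "false",
        "groupPaths": "true",
        "deletedFields": "false",
        "$filter": " and ".join(conditions),
    }
    # Keep "$" and "/" readable; spaces and colons are percent-encoded.
    query = urlencode(params, quote_via=quote, safe="$/")
    path = f"/v1/projects/{project_id}/forms/{quote(form_id)}/submissions.csv.zip"
    return f"{server}{path}?{query}"


if __name__ == "__main__":
    # Hypothetical values matching the example link in this thread.
    print(filtered_export_url(
        "https://odk.example.com", 1, "FFF", "2024-05-01T00:00:00.000Z"
    ))
```

The resulting link still needs an authenticated session or credentials, just like the original download button.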

Wow, that was more obvious than I'd like to admit. I guess this solves the internet issue for using this as a workaround.

@ahblake Is there anything else you think I should keep in mind while using this workflow as a workaround until entities get support for media? I want to make sure that if I start using a form and collect a large amount of data, things are set up right so that the same form can be used down the line. Ideally, it shouldn't take a lot of work to switch the 2nd form over to importing media from an entity created in the first form, instead of relying on the CSV file and all the media files as form attachments.

Thanks again for your help!

I'm only dipping my toes into entities now and starting to find my own current limitations (media like yourself, sorting, etc), so I'm not aware of all the potential pitfalls.

My thought, though, is that if you were to set up your capture form to write the filenames of the image and audio to values in your entity list now, e.g. filename_photo and filename_audio, then you would have all the data you need to build your choice list. You could use the entity OData URL to pull and transform your entity list regularly, then update your form with a current choice list that has the entries and their filenames.
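To make the transform step a bit more concrete, here is a minimal Python sketch that turns entity rows into choice-list rows. It assumes each row from the entity OData feed is a dictionary with `__id` and `label` keys plus the `filename_photo` and `filename_audio` properties suggested above. The list name `words`, the function names, and the sample data are hypothetical, and fetching the rows from Central is left out.

```python
import csv
import io

CHOICE_COLUMNS = ["list_name", "name", "label", "media::image", "media::audio"]


def entities_to_choices(entities, list_name="words"):
    """Map entity rows to XLSForm choice rows with media filename columns."""
    rows = []
    for entity in entities:
        rows.append({
            "list_name": list_name,
            "name": entity["__id"],  # assumed key for the entity ID
            "label": entity.get("label", ""),
            # Property names follow the suggestion in this thread.
            "media::image": entity.get("filename_photo") or "",
            "media::audio": entity.get("filename_audio") or "",
        })
    return rows


def choices_to_csv(rows):
    """Serialize choice rows as CSV text, ready to paste into a choices sheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CHOICE_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample rows standing in for the OData response.
    sample = [
        {"__id": "a1", "label": "water", "filename_photo": "water.jpg",
         "filename_audio": "water.m4a"},
        {"__id": "b2", "label": "fire", "filename_audio": "fire.m4a"},
    ]
    print(choices_to_csv(entities_to_choices(sample)))
```

Entities without a photo simply get an empty image cell, so their text label is what shows up in the select.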

When media are supported in entities, hopefully there's a simple way to upload/link files to an entity, and your entity list will already have the filenames for matching. Presumably the media support would let you create new entities and directly store these media against the entity in the designated/reserved values for this, eg entity:image1 and entity:audio1 or attachment:image1 etc.

And your followup form would still point at the entity list, but instead of also needing an internal choice list that has each entity ID plus filenames, and uploading media as attachments, it would be modified to get the media from the linked blobs in the entity list.

Thanks, we should make the no media/translation distinction clearer from the start.

Yes. For translations, we have a promising proposal here that we haven't had a chance to prioritize more work on yet.

I think @ahblake has cooked up the best workaround there is for now! If you haven't seen it already, this post may provide a little more context on the way things are now.

You may also find it interesting to take a look at this spreadsheet where we track the high-level functionality we believe we have ahead of us as we expand the Entities model. You'll see that story 0.09 is about attaching files to Entities. We track the work we have scheduled in the roadmap and that spreadsheet is a dynamic thinking and planning tool.

That's the intent! There's likely going to be some tricky work needed for Collect/Enketo to be able to access media that belongs to an Entity rather than a Form, possibly including some additions to the form spec.

I'm glad that sounds sensible and wasn't pure madness. I'll explore this some more myself when time permits.

@Tyler_Depke btw, if you add a Power Query to your form workbook that pulls the entity list via OData and transforms it into the required columns and format, you can then dynamically spill the contents of that table into your choices worksheet (at the bottom, so it doesn't overwrite other choices). Updating then becomes:

  • open form
  • refresh query (spill will update your choice list automatically)
  • save and upload new draft
  • upload required new attachments
  • publish
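For the "upload required new attachments" step, a small Python sketch like the one below could work out which media files the refreshed choice list references that have not been attached to the form yet. It assumes the choice rows are dictionaries with `media::image` and `media::audio` columns and that you already have a list of the filenames uploaded so far. The function name and sample data are hypothetical.

```python
def attachments_to_upload(choice_rows, already_uploaded,
                          media_columns=("media::image", "media::audio")):
    """Return the sorted media filenames referenced by the choices but not yet uploaded."""
    needed = {
        row[column]
        for row in choice_rows
        for column in media_columns
        if row.get(column)  # skip missing or empty cells
    }
    return sorted(needed - set(already_uploaded))


if __name__ == "__main__":
    # Made-up example: one new word was added since the last publish.
    choices = [
        {"name": "a1", "media::image": "water.jpg", "media::audio": "water.m4a"},
        {"name": "b2", "media::image": "", "media::audio": "fire.m4a"},
    ]
    print(attachments_to_upload(choices, ["water.jpg", "water.m4a"]))
```

That keeps each upload limited to the new files, which matters on a mobile data connection.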

Not a "good" solution, but an interim one.

I haven't had time to continue working on this so I haven't gotten the workaround successfully working yet, but I'll be working on this next month and I'll be back if I have more questions!

@LN

Thanks for posting the spreadsheet with all that info about high-level functionality for Entities, as I hadn't seen that. I had seen the roadmap before, and it is also helpful. Thank you again for all the work you and the entire ODK team put into this. Changing the world for the better, one line of code at a time!

I thought of two more workflows that both involve obtaining photos from previous submissions via entities. I thought I'd share them here in case y'all need more detailed user stories. If it would be better to start a new thread, let me know.

Workflow 1

This past week I was doing some tree inspections in our nursery and I found a tree that didn't have an ID tag. I was in the process of labeling several trees and this happens every couple of months when tags get damaged or lost over time. When trees are planted in a permanent location, this isn't as much of an issue because of GPS/mapping, but these are still in a nursery in buckets.

Based on the labels that I had, I knew that this tree was supposed to be one of several numbers. If I had entities that retrieved media from previous inspections, I could have scanned each potential tree ID and viewed images from past inspections for that ID to confirm that it was indeed the same tree. I have wanted to do this for years, because every time I run into this issue I am unable to resolve it in the field, and it takes several steps to resolve. I typically make a note, comb through the backend when connected to the internet on my desktop, and download the images for the potential tree ID. Sometimes I even have to go back to the field and do a comparison there using those downloaded photos. When there are multiple trees with this issue, or I go back and the potential tree ID wasn't correct, this creates several back-and-forth steps that could easily be avoided in the field with media in offline entities.

Workflow 2
We're planning to start tours of our demonstration farm in the future, and we have many species of trees that people in this area are unfamiliar with and have never seen before. I would really like either to have a device that is given to people for a tour and stays up to date with all submission data, or to let participants download a form and receive the associated media files from previous submissions via an offline transfer from another Android device. Then, as they do a tour, they could scan trees they are unfamiliar with to view photos of the fruit(s) from previous harvests, see harvest data, and see other data such as when the tree was planted, or even pictures from inspections each year. If video were possible, it might also be beneficial to have that, to show people an overview of the tree and talk about it during different stages of growth.

I should note that I'd desire both of these workflows to use media in entities offline, which would likely create large local files, but as long as performance was acceptable, storage space shouldn't be an issue for both of these use cases.

I don't know what the differences are technically with regards to photos vs audio, vs video for adding this functionality to entities, but if I had to prioritize these three I would do so accordingly:

  1. Photos
  2. Audio
  3. Video