Retrieving multiple values as string from csv with instance

Dear community,

Is it possible to get all values in a column of a csv or entity that correspond to the filter as a string?

As an example this standard string only returns the first value corresponding to the filter:

instance('external_csv')/root/item[search_column = ${matching_value}]/value

But the external_csv contains multiple rows that correspond to the filter:

The goal would be to return all these values in a string:

vale_1, value_2, value_3

The values returned would be the same as in a select_multiple_from_file question with a choice_filter applied but in our case the string would be used per example in a note or hint to guide remunerators.

Thanks for any suggestions.

You should be able to use the join function for this:

join(" ", instance('external_csv')/root/item[search_column = ${matching_value}]/value)

If you want it to be exactly like the result of a select multiple, join with a " " as I showed. You can also join with ", " or any other separator you would like.

If without a join or some other way of accessing a specific or multiple matches, the expression is interpreted as

instance('external_csv')/root/item[search_column = ${matching_value}][position() = 1]/value)

Notice the extra [position() = 1] -- the first match is automatically used. And yes, you can use a second expression in square brackets to index into a filtered result set!

Thank you @LN ! This works!

In a further step I want to retrieve just a part of the values as a number and save it in a comma separted string to retrieve the highest value in the string.

As an example:

The values to retrieve with

join(", ", instance('external_csv')/root/item[search_column = ${matching_value}]/value)

look like this: CRS-2344-DAF1, CRS-2344-DAF2, CRS-2344-DAF3

I would like to retrieve only the last number of the values in the string: 1, 2, 3

and then get the highest value in that string with max().

I tried it with this:

max(join(", ", substr(instance('external_csv')/root/item[search_column = ${matching_value}]/value, 12)))

But this returns just the first value (1 in the example) and not the highest (3 in the example).

A way out is to hardcode this with:

instance('external_csv')/root/item[search_column = ${matching_value}][position() = 1]/value)

and save each value in a seperate field and then do max() calculations on these fields.

This workaround works with a limited number of values. But not if there are hundreds of values such as:

CRS-2344, CRS-2345, CRS-2345, CRS-2347, CRS-2348, ...

with the aim to retrieve first the highest number (2348) and then generate a new code for p.e. agricultural plots based on the existing codes already stored in an entity-list (or a csv). In the example the new code automatically generated should be: CRS-2349

Is there any solution to this?

Thanks for any hints.

What you would like to do is apply the substr function to each result of your lookup expression and then compute the maximum value of that resulting set.

What you've tried doesn't quite work because calling substr on a result with multiple values doesn't work. You'd like to call it on each individual value instead but we don't have a way to express that. Similarly, you can't call max on a comma-separated string, only on a result set of multiple values.

You can kind of sort of almost do what you want with a repeat but currently there isn't a way to hide repeat from the user or prevent it from being submitted so it's not ideal. @Xiphware has built a number of interesting forms using this technique and you can see an example at ODK geofence (v1)

What I would recommend now is storing the individual components of the ID in addition to the full ID. It doesn't feel great to have that redundancy but it shouldn't have a big impact on performance or data storage since the values are so small and use a limited range of characters. To be explicit, I mean storing CRS-2344-DAF1, 2344 and 1 and ensuring through your form design that they are always synchronized.

As a side note, all of the computations that we do in forms use an expanded subset of XPath 1.0. XPath 2.0+ includes syntax like:

instance('external_csv')/root/item[search_column = ${matching_value}]/substr(value, 4, 8)

This makes it possible to specify a function/functions to apply to each value in a nodeset and would address this need. That's something we'd like to support one day.

Thank you so much for looking into this subject again @LN !

I am currently using a pragmatic approach by just counting the number of codes and then increase the count by 1. This can lead to errors if the highest number in a stored code is higher than the count + 1. The enumerator in this case can adjust the code manually. Not an ideal solution but it works.

I mean storing CRS-2344-DAF1, 2344 and 1 and ensuring through your form design that they are always synchronized.

This is a good idea. I wil try this in a new version of the form.

1 Like