When creating select options from external (and internal) data in XLSForm, the "name" and "label" data columns are hardcoded in the XForm output (and therefore required in the data) when using:
- select_one_from_file filename.csv
- select_multiple_from_file filename.csv
- (are there others?)
This means that you'd normally have to modify the data file to rename columns (or XML nodes), which is not great.
This is purely a pyxform/XLSForm restriction because Collect, Enketo, iXForms, etc. are perfectly capable of dealing with other names.
How about adding outrageous (but very nice) support for something like:
select_multiple_from_file hh-data.csv using hh_number as value and hh_name as label (where either one or two column/node names can be changed)
What do you think? Implementation considerations/problems in pyxform would be very valid arguments, of course.
Sorry I didn't respond to this earlier. Agreed this is important to do.
How about using the parameters column in XLSForm? We added it for question types that need any kind of specialized data. That would mean introducing label parameters for those types, both of which are optional. There's no possible validation because the external files wouldn't be available to pyxform. So whatever values are provided would just be passed on to the XML.
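A minimal sketch of how such a parameters cell could be read and passed through without validation. The function names are hypothetical and not part of pyxform; the parameter keys `value` and `label` are assumed from the opening request ("using hh_number as value and hh_name as label"), and the fallbacks are the currently hardcoded "name" and "label".

```python
def parse_parameters(cell: str) -> dict:
    """Split a parameters cell such as 'value=hh_number label=hh_name'."""
    params = {}
    for token in cell.split():
        key, sep, val = token.partition("=")
        if not sep or not key or not val:
            raise ValueError(f"Malformed parameter: {token!r}")
        params[key] = val
    return params


def choice_refs(cell: str) -> tuple:
    """Return (value_ref, label_ref) for the itemset.

    Both parameters are optional. Nothing is checked against the external
    file, because that file is not available when the form is converted.
    """
    params = parse_parameters(cell)
    return params.get("value", "name"), params.get("label", "label")


print(choice_refs("value=hh_number label=hh_name"))
```

Omitting either parameter keeps today's behaviour for that column, so existing forms would convert unchanged.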
Yes, using parameters would be perfect to override
Thanks, @martijnr! I've been having a Briefcase-related conversation with a user that overlaps with this and which led me to an enketo-transformer issue and a Kobo forum post. He's @Freeedim in those places and I'm hoping we can unify the various threads and make progress.
He rightfully points out that it's important to be able to specify labels in multiple languages so columns for the underlying value and a single label aren't enough.
I don't find the jr:itext multi language machinery very compatible with external secondary instances because it expects all the translations to be in the form definition. This defeats the purpose of pushing data to external files. Unless I'm missing something, I think we might need to consider extensions to the specs for jr:itext and external secondary instances to fully support using external secondary instances as sources of select choices.
search() provides multi-lingual support by:
- having the user define column names in the choices sheet (for example, country in the name column, country_name in the label column, nom_pays in the label::French (fr) column, etc.)
- using a static choice list with a single item in the form definition where the values for name define the columns to query in the external CSV
- using a jr:itext call for the ref attribute of the label if the form is multilingual (for example, jr:itext('/data/produce/name:label') in external-csv-search-language.xml)
- having a text item with a value that represents the corresponding label column in the CSV for each language in the itext block (for example, the text block for /data/multi_produce/name:label in the form above has the value label::French in French).
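The column-indirection idea in the list above can be sketched roughly as follows. This is not how Collect implements search(); it only illustrates a single choices row naming the CSV columns to read for each language. The CSV contents are made up, and the assumption that country, country_name and nom_pays sit in the name, label and label::French (fr) columns is mine.

```python
import csv
import io

# Hypothetical single choices-sheet row: each cell names a CSV column.
CHOICE_ROW = {
    "name": "country",
    "label": "country_name",
    "label::French (fr)": "nom_pays",
}

# Made-up external CSV content for illustration only.
CSV_TEXT = "country,country_name,nom_pays\nfr,France,France\nde,Germany,Allemagne\n"


def load_choices(csv_text, choice_row, label_column):
    """Return (value, label) pairs using the columns named in the choices row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    value_col = choice_row["name"]
    label_col = choice_row[label_column]
    return [(row[value_col], row[label_col]) for row in reader]


french = load_choices(CSV_TEXT, CHOICE_ROW, "label::French (fr)")
```

Switching the form language only changes which entry of the choices row is used to pick the label column; the CSV itself stays untouched.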
@martijnr am I forgetting a straightforward mechanism for specifying multiple language labels in an external secondary instance? This is a big thing that has kept us from moving on from
Good point. Would be nice to figure that out too. Thanks!
Nothing defined afaik, but actually Enketo does support an undocumented, rogue and forgotten method that relies on using ::nl postfixes to CSV columns or lang="nl" attributes in XML nodes. I wonder if that less flexible solution would be acceptable, as it's a very lightweight solution and I like it.
It would be exposed by a (new) translation function call. Enketo chose the function name translate [1] and it could be used in the above proposal like this:
resulting in the XForm output:
<value ref="hh_number" />
<label ref="translate(hh_name)" />
That new XPath function would return from its node-set parameter the first XML node (including transformed CSV to XML) with a lang attribute that matches the current language.
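As a rough illustration of that behaviour (not Enketo's actual code), the lookup could be sketched as below. The function name `first_in_language` and the sample household data are hypothetical.

```python
import xml.etree.ElementTree as ET


def first_in_language(nodes, current_lang):
    """Return the first node whose lang attribute equals current_lang, else None."""
    for node in nodes:
        if node.get("lang") == current_lang:
            return node
    return None


# Made-up item, shaped like a CSV row transformed to XML with per-language labels.
item = ET.fromstring(
    "<item>"
    "<hh_number>12</hh_number>"
    '<hh_name lang="en">Smith household</hh_name>'
    '<hh_name lang="nl">Huishouden Smith</hh_name>'
    "</item>"
)

match = first_in_language(item.findall("hh_name"), "nl")
```

What should happen when no node matches the current language (fall back to a default, or show nothing) is not settled in the description above, so the sketch simply returns None.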
[1] Very bad name because there is an XPath 1.0 function with that name that does something entirely different. We could use something like
A related note around select_one_from_file:
I noticed that in the last version of Enketo a select_one with search() or a select_one_external renders the form without options. I then moved to select_one_from_file (as suggested here: https://github.com/kobotoolbox/enketo-express/issues/545#issuecomment-242851549), which works very well in Enketo but makes ODK Collect crash if I have labels in multiple languages.
So it seems that, at the moment, a form with external choices cannot be used in both Enketo and Collect.
To be precise, we don't have a specification for localized external choices. select_one_from_file works the same in Enketo and Collect as far as I know and neither supports localization with forms produced by pyxform at the moment. What @martijnr describes here implies it is likely possible to get localized files working with Enketo with modifications to the form definition XML but these modifications would currently be outside the ODK XForms spec. On the Collect side, the search() appearance/function lets you specify columns to use for different languages' labels on the choices sheet. This is also currently outside of the ODK XForms spec.
Could you please share a form that results in a crash? Here is an example that works the same in Collect and Enketo. The unlocalized labels are shown as expected and the localized columns are ignored.
To be explicit, I brought in the localization concept into this thread because I think it's critical and because I also think it may make us go down a different path for the column customization. Often when people have external CSVs, those come from external processes, and the ideal for them would be to be able to use their files unchanged.
For example, they may have columns name and nom and they'd like to be able to specify that values in the name column are to be used as the label in the default language and that nom is to be used as the label in the French (fr) language. Putting all of this in parameters seems like it could become quite cumbersome. I'm not sure what a reasonable ODK XForms spec would look like for this (maybe a special instance to represent the mapping?) but I think it's important to at least explore. We may end up having to go with a convention like the ::language (lang code) suffix for practicality reasons but in many cases it will be less convenient from a user standpoint.
I'd really like to take this on but it will likely be a sizable project that touches multiple tools so I'm not sure when it will happen. Spec ideas are most welcome in the meantime.
This is the revised proposal that includes translation support that @LN and I have come up with and want to present for further discussion. It includes additional label parameters with a language code (e.g. label_fr) as shown in this example:
||value=id label_en=species label_fr=espèces
It works the same with XML external data files. It also works with choice_filters (so unchanged).
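A minimal sketch of how the example parameters above could be split into a value column and a per-language label mapping. The function name is hypothetical and this is not pyxform's implementation; falling back to "name" when no value parameter is given is an assumption based on the currently hardcoded column.

```python
def split_label_parameters(cell):
    """Return (value_column, {language_code: label_column}) for a parameters cell."""
    value_col = "name"  # assumed fallback: the currently hardcoded column
    labels = {}
    for token in cell.split():
        key, _, col = token.partition("=")
        if key == "value":
            value_col = col
        elif key.startswith("label_"):
            # 'label_fr' -> language code 'fr'
            labels[key[len("label_"):]] = col
    return value_col, labels


value_col, labels = split_label_parameters("value=id label_en=species label_fr=espèces")
```

The resulting mapping is what would feed one itext entry per language, each holding the name of the column to read for that language.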
Compared to Enketo's rogue feature mentioned above this proposal does not require a specific CSV data structure, except that column headings should (ideally) be valid XML node names (we could work around that limitation though). So it provides the ultimate CSV flexibility. A small disadvantage is that it wouldn't work with XML files that use e.g. lang attributes to provide translations. However, since we require a specific XML structure (root > item) for select_from_file anyway, that does not seem like a big deal.
The earlier example without translations remains unchanged.
Pyxform would produce the following output (ignore this if you're not an XForm enthusiast or developer):
<translation default="true()" lang="English (en)">
<translation lang="French (fr)">
<instance id="my_data" src="jr://file-csv/mydata.csv" />
<label ref="./*[local-name() = jr:itext('external_instance-my_data-0')]"/>
The reason for introducing local-name() instead of name() is to provide better support for external XML data which will likely be namespaced. local-name() will ignore the namespace prefix.
(Not 100% sure if this ref value is XForms-compliant. Tbc.)
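To make the intent of that ref concrete, here is a rough Python sketch of what an engine would have to do: look up the column name for the current language in itext, then pick the child of the item whose local name matches. The ITEXT dictionary, the namespace URI and the item data are all made up for illustration; this is not JavaRosa or Enketo code.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for jr:itext('external_instance-my_data-0') per language.
ITEXT = {"en": "species", "fr": "espèces"}


def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix, like XPath local-name()."""
    return tag.rsplit("}", 1)[-1]


def resolve_label(item, column):
    """Return the text of the first child whose local name equals column."""
    for child in item:
        if local_name(child.tag) == column:
            return child.text
    return None


# Made-up namespaced item from an external XML data file.
item = ET.fromstring(
    '<item xmlns:d="http://example.org/data">'
    "<d:id>1</d:id><d:species>Oak</d:species><d:espèces>Chêne</d:espèces>"
    "</item>"
)

label_fr = resolve_label(item, ITEXT["fr"])
```

Comparing on the local name is what lets the same parameters work whether or not the external file uses a namespace prefix.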
Any feedback would be very welcome!
P.S. Thanks @LN! Note that I made a few tweaks to change parameter name back to value in XLSForm, use * instead of node() and changed the example to not use
That's a good point. The only ref with a predicate I can quickly find in a W3C XForms example is this one and I don't know how related it is. In our spec, I think ref is always a simple path expression without a predicate with an exception for jr:itext. The JavaRosa implementation definitely would need to be reworked to support an expression with a predicate and I can't immediately tell how difficult that would be.
That said, it continues to be the most flexible and simple approach I can think of so unless we get other proposals, I'd be ok with moving forward.
It's inconsistent with the choices sheet but I don't know how many form designers would make the connection between the two. This seems fine to me.
Makes sense. And to confirm, if there were both label::French (fr) columns, the accepted parameters would be label_fr, right? I say "accepted" but I would be in favor of making defining labels for all languages otherwise declared in the form an enforced requirement. If the source document doesn't actually have columns for every language, a form designer could just do e.g.
CC @Xiphware who would probably find this one kind of fun, too.
Good find! Definitely would not be able to ever support predicates in ref for anything other than querying an external data file (not primary instance/bind) in Enketo. So if we go ahead, we should probably clarify its limited use in itemset value/label children in our XForms spec.
Yes, I agree and also agree that we may as well make things easier for ourselves by being strict for translated forms.
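A minimal sketch of what that strictness could look like at conversion time, assuming the form's declared languages are available as short codes that match the label_xx suffixes (how "French (fr)" maps to "fr" is not specified in this thread). The function name is hypothetical.

```python
def missing_label_languages(form_langs, cell):
    """Return language codes declared in the form but lacking a label_xx parameter."""
    provided = {
        token.split("=", 1)[0][len("label_"):]
        for token in cell.split()
        if token.startswith("label_") and "=" in token
    }
    return sorted(set(form_langs) - provided)


missing = missing_label_languages(["en", "fr"], "value=id label_en=species")
if missing:
    # A converter could raise an error here instead of printing.
    print("Missing label parameters for languages:", ", ".join(missing))
```

A designer whose file lacks a column for some language could still satisfy the check by pointing that language's parameter at an existing column.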