1. What is the issue? Please be detailed.
Choices whose names begin with digits followed by a dash cannot be used as defaults, and the behaviour differs between single and multiple values:
A single value, e.g. 1000-A123, will validate and upload but does not select the default option.
A calculation, e.g. once(concat('1000-A123')), will upload and set the default in Enketo, Webforms and Collect.
More than one value, e.g. 1000-A123 1001-B456, will not validate or upload.
A calculation, e.g. once(concat('1000-A123 1001-B456')), will upload and set the defaults in Enketo, Webforms and Collect.
2. What steps can we take to reproduce this issue?
Try the attached form
delete the default for testdefault9 to allow it to upload
3. What have you tried to fix the issue?
Created a test form with varying combinations of letter/number/dash/underscore
The docs do say that choice names must begin with a letter or underscore. However, this is not enforced on upload and seemingly is no longer true, as entity name values will often (28%?) start with a number. The relevant docs text:
"name: the name of the field represented by each row. It may not contain spaces and must start with a letter or underscore."
4. Upload any forms or screenshots you can share publicly below.
Upload error for a field with a default as two strings beginning with digits
The XLSForm could not be converted: ODK Validate Errors:
XForm is invalid.
: Invalid XPath in value set action declaration: '1000-A123 1001-B456'
Problem found at nodeset: ${model}[@xforms-version=1.0.0]/setvalue
With element
I could be wrong, but I think pyxform is responsible for this behaviour. I guess it's treating 6-g as 6 (minus) g and 7-h as 7 (minus) h, and trying to evaluate those expressions before assigning the default values, which fails. I tried bypassing pyxform by uploading the .xml to ODK Central (instead of the .xlsx) with the same defaults (6-g 7-h), and it worked! Attaching the same below:
Thanks for the detailed report and test form @ahblake. On XLSForm online and the webforms preview (staging) it seems to work if the value is wrapped in a "string" function, i.e. string('6-g') selects one of the multi-choice values, and string('6-g 7-h') selects those two values. For these values the default looks like an expression, so it's validated as an XPath. Could you try this workaround and let us know if it is suitable? A similar issue that was resolved with this workaround was discussed here: URL widget does not allow sites that has dash(-) on the URL - #6 by Xiphware
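The string('') workaround described above can be applied mechanically. Here is a minimal Python sketch, assuming a hypothetical helper named `as_literal_default` (it is not part of pyxform or any ODK tool), that wraps a static default so it reads as an XPath string literal:

```python
def as_literal_default(value: str) -> str:
    """Wrap a static default in string(...) so it is read as a literal.

    Hypothetical helper for illustration only; not part of pyxform.
    """
    # XPath 1.0 string literals have no escape sequences, so pick whichever
    # quote character does not occur in the value.
    if "'" not in value:
        return f"string('{value}')"
    if '"' not in value:
        return f'string("{value}")'
    raise ValueError("value contains both quote characters")


print(as_literal_default("6-g"))      # string('6-g')
print(as_literal_default("6-g 7-h"))  # string('6-g 7-h')
```

The result is what would go in the default column of the XLSForm in place of the bare value.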
As for pyxform behaviour, unfortunately this is an area where it can be difficult to guess user intent (is it an expression or a static value?), and so it has been the subject of many patches over the years. When pyxform tries to detect a dynamic default expression that needs to go into setvalue, it parses 6-g as "the numeric value six, minus the value of an element called g", or in other words the token pattern "[number constant][math operator][element name]", which looks much more like a dynamic default expression than a static value. If it wasn't considered a dynamic default, the default would be emitted in the data instance, as per the manual workaround @MinimalPotato added (as shown in code here). For more examples see the test cases here (also on L547, a-2 parses as [name], 1-1 parses as [number][number], a-b parses as [name]) and source function here. Arguably (argued here) it's reasonable to ask users to specifically indicate that problematic literal default values are to be treated as literal by wrapping them in string(''), or the alternative suggestion is to implement a parameter which could be used to put the value into the data instance instead.
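The token patterns mentioned above can be reproduced with a toy lexer. This is a minimal sketch and not pyxform's actual lexer: the regular expressions and the names `TOKEN_SPEC` and `token_kinds` are assumptions, chosen only to mirror the cases quoted in this thread.

```python
import re

# Illustrative patterns only; pyxform's real lexer is more complete.
TOKEN_SPEC = [
    ("NUMBER", r"-?\d+(?:\.\d+)?"),          # optional sign, so "-1" is one number
    ("NAME", r"[A-Za-z_][A-Za-z0-9_.\-]*"),  # may contain '-', may not start with a digit
    ("OPERATOR", r"[-+*]"),
    ("SPACE", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC)
)


def token_kinds(value):
    """Return the sequence of token kinds found in value, ignoring whitespace."""
    kinds = []
    pos = 0
    while pos < len(value):
        match = TOKEN_RE.match(value, pos)
        if match is None:
            raise ValueError(f"unexpected character {value[pos]!r} at position {pos}")
        if match.lastgroup != "SPACE":
            kinds.append(match.lastgroup)
        pos = match.end()
    return kinds


for sample in ("6-g", "a-2", "1-1", "a-b"):
    print(sample, token_kinds(sample))
```

Under these assumed patterns, 6-g splits into a number, an operator and a name, while a-2 and a-b each stay a single name because a name may contain a dash but may not start with a digit.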
Here's my analysis notes, please correct anything you think is off:
app behaviours
The form can be uploaded to Central if converted without ODK Validate validation
expected because Central generally trusts pyxform / ODK Validate to give valid XForms
Setting a default with an invalid XPath doesn't work in any form client (Enketo/Collect/Webforms)
expected since ODK Validate rejects it - though the docs are inconsistent (see below)
pyxform raises an error (from ODK Validate / Javarosa)
expected since it doesn't work but maybe pyxform could be a bit more specific about the actual problem or suggest the workaround
documentation
the ODK docs say choice names "may not contain spaces and must start with a letter or underscore"
seems like a copypasta from the survey question name section that says about the same thing
the XLSForm docs say only that choice names are "unique variable name for that answer choice"
there are examples of forms with numeric defaults but not with selects
form processing
the error comes from JavaRosa (via ODK Validate, via pyxform), from code that seems to attempt to parse dynamic default values (setvalue) as an XPath (source here)
pyxform doesn't complain about choice names starting with a number, and actually has a test (added May 2021) that makes sure numeric choices are allowed, although the PR source changes didn't have anything to do with that specifically (it was about detecting dynamic labels) PR here
pyxform validates choice names when used in multiple choice, to not allow values containing a space (which is fine I think, otherwise in the above example it couldn't select 2 defaults) source here
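The space rule matters because a multiple choice value is a space-separated list of choice names. Here is a minimal sketch of both sides of that rule, using hypothetical helper names (this is not pyxform's validation code):

```python
def selected_choices(default):
    """Split a multiple choice default into the choice names it selects."""
    return default.split()


def names_with_spaces(choice_names):
    """Return the choice names that contain whitespace."""
    return [name for name in choice_names if any(ch.isspace() for ch in name)]


print(selected_choices("1000-A123 1001-B456"))  # ['1000-A123', '1001-B456']
print(names_with_spaces(["6-g", "7 h"]))        # ['7 h']
```

A choice named "7 h" would be indistinguishable from the two choices "7" and "h" once it is stored in a multiple choice value, which is why only that name is flagged.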
So it seems like follow up actions could be
docs: update the ODK docs to match XLSForm docs i.e. no character rules for choice value names except that multi-choice can't have spaces, and the choices must be unique
docs: mention the requirement that dynamic default values have to be valid XPaths (possibly covered by this issue) so the string('') workaround might be needed if a static value looks like an expression
pyxform: add a test to document the current behaviour as expected (there's a lot of similar ones but not exactly the cases shown here)
webforms/collect/enketo: add a test (if not already) that the string('') workaround pattern works
I think knowing that the below operator is a valid value for the default column is helpful (I had tried other expressions like concat and not all are accepted), and remembering that values that aren't pure numbers or text should be wrapped in this expression to avoid the pyxform issues. Adding it to the docs would be helpful also.