XLSForm spec proposal: add syntax to make it easy to use a value from the last saved instance

This proposal provides user-friendly XLSForm access to the feature for remembering previously entered values. Implementation in pyxform depends on a decision at "Spec proposal: add first-load event to replace xforms-ready".

I propose adding a yes/no column to represent whether a particular question's value should default to the last saved value. For example:

survey

| type | name | label | default_to_last |
| --- | --- | --- | --- |
| text | state | State | yes |
| text | street | Street | |
| select_one animal_type | animal | Animal | yes |

If the column contains yes for a particular question, that question's value will default to the last saved value.

As usual, the hardest part is naming. I think including the word "default" is helpful. This makes it clear that it's related to the existing default column in that it means the client will show a value that can be edited by the surveyor (or not). Some other ideas:

  • default_to_latest
  • default_to_previous
  • fill_from_latest
  • remember_value

I believe @cooperka and @tomsmyth will be adding this to a visual form builder and perhaps we could coordinate on naming.

Alternatives considered
An alternate approach would be to use the existing support for external instances and a more general dynamic_default column. This would generate a setvalue triggered on form load with whatever value is in the column. The form author could then enter something like instance('__last-saved')/data/a in that column to get the desired behavior. I think it's too hard to use and explain and so the added flexibility isn't worth much.
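
To make this alternative concrete, here is a minimal Python sketch of the element such a dynamic_default column might generate. The helper name `build_setvalue` is hypothetical and not part of pyxform, and `xforms-ready` is used only as a stand-in event because the first-load event name was still undecided at this point.

```python
from xml.sax.saxutils import quoteattr


def build_setvalue(ref: str, value: str, event: str) -> str:
    """Return a <setvalue> element as a string.

    Hypothetical helper for illustration only; not pyxform's API.
    quoteattr escapes the value and picks safe surrounding quotes.
    """
    return "<setvalue event={} ref={} value={}/>".format(
        quoteattr(event), quoteattr(ref), quoteattr(value)
    )


# The expression the form author would have to type in the column.
setvalue_xml = build_setvalue(
    ref="/data/a",
    value="instance('__last-saved')/data/a",
    event="xforms-ready",  # placeholder; the real event was under discussion
)
print(setvalue_xml)
```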

CC @Ukang_a_Dickson @yanokwa @martijnr @Xiphware @ggalmazor

I was in a private chat with @cooperka and he pointed out that it matters that this is the "last saved" record, as opposed to the last opened or last created one. With that reminder, my preferred name now is default_last_saved.

I'm generally OK with this, but had some questions.

  1. Why is this a new column and not a parameter? My guess is that parameters apply to a type of question and this applies to all questions. It'd be good to be explicit in the issue that's filed.

  2. Currently in pyxform, yes is an alias for true(). That is, most (all?) columns are passthrough expressions, and that would not be true (pun :laughing:) in this case. It feels dangerous to me, but I don't really have a good alternative. It's not like we can use sounds_good or no_prob as the value...

I swear I think about these proposals deeply before publishing them!

Writing out a description of the feature and its usage here got me disliking the yes/no column idea. I think it's too limiting not to be able to easily do transformations on the last saved value. I now prefer the dynamic_default column, with the addition of some way to hide the complexity of instance('__last-saved')/data/a.

This proposal includes two parts:

  • a new dynamic_default column. The contents are passed through to the value attribute of the setvalue action (see documentation) triggered on first load.
  • a new transformation to hide instance('__last-saved')/data/a. In the example below, I suggest a __last suffix. For example, just as ${a} expands to /data/a, ${a__last} would expand to instance('__last-saved')/data/a
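
The suffix transformation described in the second bullet could be sketched in Python roughly as follows. This is an illustration under assumptions, not pyxform's implementation: the function name `expand_refs` is hypothetical, and every field is assumed to sit directly under a root node named `data` (no groups or repeats).

```python
import re

# Matches ${...} references as they appear in XLSForm expressions.
REF = re.compile(r"\$\{([^}]+)\}")
LAST_SUFFIX = "__last"


def expand_refs(expression: str, root: str = "data") -> str:
    """Expand ${name} and ${name__last} into XPath (hypothetical helper)."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name.endswith(LAST_SUFFIX):
            # Strip the suffix and point at the last-saved instance instead.
            field = name[: -len(LAST_SUFFIX)]
            return f"instance('__last-saved')/{root}/{field}"
        return f"/{root}/{name}"

    return REF.sub(replace, expression)


print(expand_refs("${a}"))
print(expand_refs("${a__last}"))
```

One consequence of this design is that a form author could no longer name a field something ending in `__last` without it being treated as a last-saved reference.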

I didn't explicitly mention this in the prior proposal but either way, pyxform would add the proper XML to define the __last-saved instance if one of these dynamic defaults is used.

survey

| type | name | label | dynamic_default |
| --- | --- | --- | --- |
| text | street | Street | ${street__last} |
| date | disaster_date | Disaster date | today() |
| integer | patient_count | How many patients have you seen today? | if(${patient_count__last} = '', 0, ${patient_count__last} + 1) |
| select_one yes_no | same_street | Are you still on ${street__last}? | |

This column would provide value beyond just making previously-entered values available. For example, form designers have been wanting to set a date to default to today's date for some time.

This definitely sounds like a better approach.

One proposed change (this may well be a naive question): would it be possible to just use the existing default column, and extend it so that you could enter values like the examples you provide (${street__last}, today(), etc)?

It feels like having columns for default and dynamic_default is a bit redundant, and I'm not sure that there's any precedent for having two separate columns for such closely related functionality.

Thanks so much for the feedback, @adam.butler, and very good point about default vs. dynamic_default. They do use different mechanisms "under the hood" so that's why I think of them as separate but I can't think of a use for specifying both. That is, any static default would be immediately overwritten by the dynamic default on form load anyway, I think.

So I do agree it would be ideal to combine them. From an implementation perspective, that means either all defaults need to use the setvalue mechanism or the XLSForm parser (pyxform) needs to be able to identify dynamic defaults. I don't like using setvalue for all defaults because it makes the XML harder to read and would make the form a little slower. I think regular expressions would be sufficient for pyxform to identify dynamic defaults.
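
As a rough illustration of the regular-expression idea, the sketch below treats a default as dynamic if it contains a ${...} reference or something that looks like a function call. The pattern and the name `is_dynamic_default` are assumptions made for this example, not what pyxform does, and the heuristic would misclassify a literal text default such as `foo(bar)`.

```python
import re

# Heuristic: a ${...} reference, or a name immediately followed by "(".
DYNAMIC_PATTERN = re.compile(r"\$\{[^}]+\}|[A-Za-z_][A-Za-z0-9_\-:.]*\(")


def is_dynamic_default(value: str) -> bool:
    """Guess whether a default cell needs setvalue rather than a static value."""
    return bool(DYNAMIC_PATTERN.search(value))


for cell in ["Springfield", "42", "today()", "${street__last}"]:
    kind = "dynamic" if is_dynamic_default(cell) else "static"
    print(f"{cell!r}: {kind}")
```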

I'm not a fan of the ${street__last} syntax.

I think it's hard to see how many underscores you have and it all feels a bit too magical.

I'd prefer something like last-saved(${street}), but I'm not sure whether this adds a lot of implementation complexity to pyxform. Also, this would be the first example of a function that isn't passthrough, so maybe we don't want to go there.

One (probably terrible) alternative might be to introduce an entirely new syntax: !{street}

What do you think, @Ukang_a_Dickson?

How about ${street:last}? I think this would make it still usable in the label column (which I don't think last-saved(${street}) would be), and it avoids the need for a totally new syntax: it's just the addition of a kind of pseudo-class to the existing dollar sign notation...

I'm not sure what valid characters are for the thing inside the {} though...

Super fun discussion!

Thinking about this some more, I'm tending towards Yaw's proposals. The ${...} syntax is clearly interpolating a variable value into a string using a common idiomatic syntax, but adding a __ or a : (or even a ::) diverges from that common idiom into the realms of magic. Another argument against __ is that it's explicitly mentioned in the spec section on Markdown as a way of bolding text.

Of course, if there was another part of the XLSForm spec that already used the same syntax to apply a magical function to a value, then it would be more OK (it's like rule #1 of improvising: if you make a mistake, repeat it - that way it's not a mistake anymore).

There might be an argument for saying that ${last_saved('street')} is the most logical syntax, since it says "get the value of the variable named 'street' from the last saved form and interpolate it here". It feels more logical to me, but others might see it as an abomination...

last-saved(${street}) is pretty consistent with selected(${favorite_topping}, 'cheese'), so that could work, although I have to say that the whole string interpolation idiom falls apart with this usage.

And then !{street}... well the spec explicitly says:

Note the ${ } around the variable likes_pizza. These are required in order for the form to reference the variable from the previous question.

I could easily imagine a similar note to explain the meaning of !{ }.

I'd also support finding a way to keep this new functionality within the existing default column. It makes intuitive sense, which means it will be easier to explain to users.

!{street} is nice and short; we could consider other characters instead of ! (e.g., % or #) since ! often implies not.

Along the same syntax logic, a word instead of a character could work as well: data{street}

The advantage of not using last-saved as the prefix is that this implementation would work well for future extensions of this feature, i.e. towards case management. In that case data is not pulled literally from the last saved instance but could come from another source.

I think your broadened concept always requires specifying a record in some way, right? That is, to query an arbitrary record you need to specify both the field you want and the record you want it from. I see how a short syntax like !{} can specify a specific record (e.g. the last one) but it's not clear to me how that could be generalized. Maybe you could share an example?

You're right, it broadens the potential scope for this new handle based on features we haven't built yet, i.e. case management, so it's probably premature to try and consider this use case. I suppose examples would be

recordID/data{street} or data{recordID/street}

in which case recordID could be the UUID or a different unique lookup field that specifies the record from which data should be retrieved. If and when we support case management, this would be a potential way of implementing this using the default column. It would be nice if the new syntax we're introducing now can logically be extended to accommodate this new feature. But of course it could do that also with a syntax of !{street}.

I like the idea of consistent syntax between representing a value from the last saved instance and one from any instance as @Tino_Kreutzer is describing but I don't feel confident that I know enough about what the general case will look like to design for it.

My sense is that the ID of a record that needs to be consulted would always be dynamic (so ${recordID}) which makes it hard to use as a prefix and that the goal would be to fetch records based on many different characteristics, not just ID. I'd guess that advanced users will use XPath directly (as they already do) and there will be convenience functions for those who don't need the full flexibility (like pulldata or indexed-repeat).

I have a slight preference for something that leverages the existing ${} syntax because it feels like, to use @adam.butler's language, it's the same interpolation but with a qualifier:

  • ${street} goes to /data/street
  • whatever new syntax is agreed on goes to instance('__last-saved')/data/street

I agree that __ is hard to deal with in a user-facing context so let's take ${street__last} off the table. To answer @tomsmyth's question, only characters that are valid in XML element names are allowed in field names: https://www.xml.com/pub/a/2001/07/25/namingparts.html. If we do stay within ${}, then, we can use a separator that is not valid in XML to make absolutely sure it can't conflict with a user-given name. Something like ${street#last-saved} or ${last-saved#street}.
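
The difference between the two separators can be illustrated with a small Python check. It uses a simplified, ASCII-only approximation of the XML name rules (real XML names also allow the colon and many non-ASCII characters), and the function name is hypothetical.

```python
import re

# Simplified, ASCII-only approximation of an XML element name:
# a letter or underscore, then letters, digits, "_", "." or "-".
XML_NAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_valid_field_name(name: str) -> bool:
    """Return True if name could be used as a field name (approximation)."""
    return XML_NAME_ASCII.fullmatch(name) is not None


# A form author could legitimately pick this name, so "__" is ambiguous.
print(is_valid_field_name("street__last"))
# "#" is never allowed in a name, so this can only be the special syntax.
print(is_valid_field_name("street#last-saved"))
```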

Introducing a whole new thing like !{} or #{} feels a bit heavy for something that probably won't be used in so many forms. I also think it's hard to remember which special character to use. I'm not deeply against it, though.

I'm not thrilled about something that looks like a function but isn't but I could be convinced.

Yes, this is my thought exactly. It's quite like the pseudo-class concept in CSS. What about ${street|last}?

My experience is that non-developers can spend a LONG time hunting for the pipe character on their keyboards and/or use a capital i or a lowercase L and get very confused. It's especially confusing because some keyboards have it labeled as a split pipe (¦).

An additional requirement: whatever characters need to be typed should be recognizable to anyone.

That's a good point about the split pipe, @LN - I also think that # is a good option. I've been thinking about @Tino_Kreutzer's idea of thinking forward to possible case management use cases. It seems like it would be helpful to come up with a generic way of saying "the value of field x of entity y". In this particular example, x = "street" and y = "the last saved form". In a possible case management scenario, we would probably want to reference fields on the root entity (where "root entity" means e.g. "the patient I'm reporting on", or "the tree that I return to measure every month") (NB I'm not saying that "root entity" is the best name for this, just using it for the sake of these examples!).

In the case management scenario, we would probably want to reference these fields in labels (e.g. "what is [patient name]'s temperature?") or skip logic (e.g. "skip the next question if the tree is partially in shade"), rather than default values, so that's worth bearing in mind: it would be good to come up with a syntax that is also usable in both of those contexts.

It seems like we all agree on ${...} as being a reasonable way of representing "the value of", so now we just need to work out the best way to say "field x of entity y". The two options we've talked about are x#y and y(x), but I wonder whether it might also be worth considering something along the lines of y#x, as @LN suggested?

Possible values of y would be pre-defined, such as last-saved or root-entity (or more succinctly, last and entity). So in the examples that we have:

  1. Last filled street: ${street#last} or ${last#street} (it might be worth allowing an abbreviation if the field name being accessed is the same as the current field name, so then it would be ${#last} or ${last#})
  2. Patient name in label: what is ${full_name#entity}'s temperature? or what is ${entity#full_name}'s temperature?
  3. Tree status in skip logic: ${partial_shade#entity}=yes or ${entity#partial_shade}=yes

I think my vote would go to ${y#x}, i.e. ${last#street} / ${entity#full_name} / ${entity#partial_shade}

@adam.butler I like where you're going.

Here's what I'm understanding for ${entity#full_name}:

  • there'd be some kind of standard identifier (e.g. recordId) linking the current form instance to info previously collected about the entity this form instance is concerned with. Presumably, the previously-collected info would be in an external secondary instance representing all entities and their info.
  • the XForm would define a __entities instance to give access to all of the entities' info
  • entity in ${entity#full_name} would expand to something like instance('__entities')/recordId/data
  • the entity# shortcut would only allow referring to values related to the entity this form is about, not other entities. For example, if I'm defining a form that will collect information about houses, I can use it to refer to information previously collected about a specific house but I can't use it to refer to information about the neighbor's house or an occupant of the house that info is being collected about.

Did I get that right?

I think it's a useful concept even with the limitations in my last bullet above. I'm on board for introducing ${last#<unqualified field name>} for now with the goal of expanding to other prefixed keywords like ${entity#<unqualified field name>}.
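
A minimal Python sketch of how the ${last#...} form might be expanded while leaving room for further keywords. The names `expand_qualified_refs` and `PREFIX_TO_INSTANCE` are hypothetical, fields are assumed to sit directly under a `data` root, and `entity` is deliberately left unmapped because its expansion has not been designed.

```python
import re

# ${field} or ${keyword#field}; "#" cannot occur in a field name.
QUALIFIED_REF = re.compile(r"\$\{(?:([a-z\-]+)#)?([A-Za-z_][A-Za-z0-9_.\-]*)\}")

# Only "last" is defined for now; other keywords could be added later.
PREFIX_TO_INSTANCE = {"last": "__last-saved"}


def expand_qualified_refs(expression: str, root: str = "data") -> str:
    """Expand plain and keyword-qualified references (hypothetical helper)."""

    def replace(match: re.Match) -> str:
        prefix, field = match.group(1), match.group(2)
        if prefix is None:
            return f"/{root}/{field}"
        if prefix not in PREFIX_TO_INSTANCE:
            raise ValueError(f"unknown qualifier: {prefix}")
        return f"instance('{PREFIX_TO_INSTANCE[prefix]}')/{root}/{field}"

    return QUALIFIED_REF.sub(replace, expression)


print(expand_qualified_refs("${street}"))
print(expand_qualified_refs("${last#street}"))
```

Keeping the keyword-to-instance mapping in one table means that supporting a future keyword would only require a new entry, provided its expansion is as simple as this one.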

Yes, yes, yes and yes :slight_smile: - and thank you for making all of that explicit @LN !

Just as an addendum to your last bullet: if the occupants were somehow marked as being a direct attribute of the house entity, then I think that it ought to be possible to refer to them using this syntax. But that's probably a discussion for another day...

A possible issue I see with this definition of #entity is that it appears (correct me if I'm wrong...) to tie entities to a specific instantiation of a specific form id+version (?). Whereas in the general case, and certainly in mine, you can perform multiple (and completely different) 'inspections' (aka fill in completely different forms) about the same 'entity', so this vague "entity" thingy exists independent of any particular form, let alone a specific form version.

In this context, what is an 'entity'? Or should it instead be called, say, "specific form instance"?

"Entity" to me conveys a physically unique object, whereas a form instance/submission is more a partial snapshot in time, unique only unto itself.

@Xiphware the way I see case management working in ODK is that you would have two different types of form: entity forms and report forms (this nomenclature is not yet written in stone). There's more information about the proposed approach here:

It's due for another round of TSC discussion in the near future.