Reconsidering jr:// scheme and external data URIs

(I only just started catching up on this, so please excuse any dumb questions that have already been answered/addressed elsewhere)

In https://opendatakit.github.io/xforms-spec/#uris, "jr://..." URIs are used to reference external files. I don't quite understand why the file metatype needs to be a component of the URI; eg in these URIs:

jr://images/path/to/file.png
jr://file/path/to/file.xml

why are the 'images' and 'file' prefixes necessary? That is, why can't all external files (attachments, media files, external itemsets, etc) - irrespective of their type - simply be referenced as-is relative to the client sandbox root?

There seems to be a level of loose redundancy between the pathname and the filetype (and looseness can often cause issues down the road...)

I wasn’t there when the spec was designed so I can’t say what the thinking was at the time. My guess is that there may have been a reason at some point and in some implementation to have different stores for different kinds of files, especially given how different their sizes might be. Even then, clients could have done content type detection and I can't think of why it wouldn't have been done that way. The last conversation I remember related to this was here with @martijnr.

In addition, jr is not the best name for the URI scheme, which should have been implementation-agnostic.

If there are 'contentious' (unresolved?) issues around these URIs, should we try to sort them out before setting in stone how external datasets/itemsets should finally be handled universally (potentially deprecating a bunch of stuff as a consequence...)?

Sorry, again, I'm new to the conversation so I'm unfamiliar with the history and nuances. But to a casual observer, it does seem like maybe we still need to solidify these URIs somewhat before building something long-term on top of them?

Of course, personally I'm not REMOTELY close to being able to support external itemsets yet (wherever they may be stashed away!...) so please feel free to completely ignore my impertinent commentary :grin:

With URIs to reference external files, ie file://..., or http://... for that matter, could we just have the XForms client (Collect, Enketo, my iXForms, ...) simply treat them as-is? For example, a file:// will necessarily be resolved against whatever the local client's sandbox root happens to be (eg /sdcard/odk/...), whereas an http:// reference to an itemset (or media file, or...) would do just that - fetch it from a remote http server. That is, do away with the jr: namespace, and the filetype path prefixes too (jr://images vs jr://audio vs jr://video ...). As you allude to, there doesn't seem to be a compelling reason to diverge to a custom jr:// namespace, nor do I see a compelling reason to treat a standardized (aka uniform) reference to external data - that is, data outside the contents of the XForm definition itself - any differently from any other Uniform Resource Identifier [sic].
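To make the proposal concrete, here is a rough sketch of what "treat them as-is" could look like in a client. It is only an illustration under assumptions: the `resolve` function and the `/sandbox/form-media` root are hypothetical names, not part of any spec or existing client.

```python
from pathlib import PurePosixPath
from typing import Tuple
from urllib.parse import urlsplit

# Hypothetical sandbox root; each client would supply its own.
SANDBOX_ROOT = PurePosixPath("/sandbox/form-media")


def resolve(uri: str, root: PurePosixPath = SANDBOX_ROOT) -> Tuple[str, str]:
    """Classify a resource URI as ("remote", url) or ("local", path)."""
    parts = urlsplit(uri)
    if parts.scheme in ("http", "https"):
        # Remote resources are fetched as-is.
        return ("remote", uri)
    if parts.scheme == "file":
        # Assumption: the path is interpreted relative to the sandbox root,
        # not the host filesystem root.
        rel = PurePosixPath(parts.path.lstrip("/"))
        if ".." in rel.parts:
            raise ValueError("path escapes the sandbox: " + uri)
        return ("local", str(root / rel))
    raise ValueError("unsupported scheme: " + repr(parts.scheme))


print(resolve("file:///path/to/file.png"))
print(resolve("https://example.org/countries.csv"))
```

One wrinkle the sketch ignores: in a two-slash form such as file://path/to/file.png, the first segment ("path") sits in the host position of the URI, so a standard parser reports it as the network location rather than as part of the path.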

I've split this off to its own connected thread so it's easier to follow the different strands.

My sense is that there's no way the jr scheme (note that it's not related to the jr XML namespace prefix that's often used by convention) and type prefixes can truly go away since they predate even ODK and are in a lot of existing forms. Of course, they could be marked as deprecated but actually removing support seems it would be painful for no user-facing gain that I can see.

The file and file-csv prefixes were introduced by CommCare. They were kept in the ODK spec, and implemented first in Enketo, for compatibility with CommCare.

For those reasons, I don't see designing changes to the URIs supported as a blocker for other ongoing work. But what do I know, I'm sure things have changed around these parts while I've been gone!

I agree it would definitely be worth supporting HTTP(S) URLs. @martijnr , I believe Enketo does this already?

Perhaps the TSC can explore whether it's worth additionally supporting standard local URIs. Someone might remember why it wasn't done that way in the first place.

In my understanding the custom scheme indicates a location in the OpenRosa XForms manifest, without having to provide the whole manifest (http) URI. I think this was a very good idea as it allows flexibility of server manifest implementation and portability/copy-ability of forms, and supports the architecture in which the XForm and its resources are considered as one package.
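A minimal sketch of that indirection, for illustration only. The manifest is modelled here as a plain filename-to-download-URL mapping (a simplified stand-in, not the real manifest format), the example.org URLs are made up, and the rule that the manifest filename is everything after the type prefix is an assumption.

```python
from urllib.parse import urlsplit

# Simplified, hypothetical stand-in for a parsed manifest:
# filename -> download URL chosen by the server.
MANIFEST = {
    "path/to/file.png": "https://example.org/media/1",
    "countries.csv": "https://example.org/media/2",
}


def lookup(jr_uri, manifest):
    """Map a jr:// URI to (type prefix, download URL) via the manifest."""
    parts = urlsplit(jr_uri)
    if parts.scheme != "jr":
        raise ValueError("not a jr:// URI: " + jr_uri)
    prefix = parts.netloc              # e.g. "images", "file-csv"
    filename = parts.path.lstrip("/")  # assumed to match the manifest entry
    return prefix, manifest[filename]


print(lookup("jr://images/path/to/file.png", MANIFEST))
print(lookup("jr://file-csv/countries.csv", MANIFEST))
```

The form only ever names the file; where the bytes actually live is decided by the manifest, so the server can change download URLs without touching the form.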

The jr: name is not the best (and certainly not any more), but to me seems too much hassle to change now.

I agree (for use cases where resources do not really belong to a single form, e.g. a list of all countries) and yes Enketo supports this (I believe). Note that I believe some custom servers simply create an entry in the manifest pointing to such an external URL and still use the jr: scheme to refer to this in the form.

For example, a file:// will necessarily be resolved against whatever the local client's sandbox root happens to be (eg /sdcard/odk/...)

Not sure about a local file:// URI.

While the cat's away, the XForms mice do rather play... :grin:

My guess is that the jr://images syntax is a remnant of the J2ME days.

Assuming the ecosystem will grow (and I think it will), it seems reasonable that we can mark jr:// as deprecated, and going forward we roll out file:// for any file. It is work, yes, but it gets us closer to an easier-to-implement spec and that does have user-facing benefits.

Is that convincing to you, @martijnr and @LN?

That makes sense to me for the custom scheme but do you also see a reason for the type prefixes?

That said, I agree that introducing a change is low-priority and maybe not worth doing.

Moreover, @martijnr made me think more deeply about the following:

That would be an unusual way to interpret a file:// URI, wouldn't it? Aren't they supposed to be absolute?

That seems like a good strategy. I'm not as convinced that supporting HTTP(S) URLs in the form is as valuable, then.

Generally speaking - eg serving pages for a website, iOS, ... - the filesystem presented to each 'application' is typically sandboxed in some way; apps simply aren't given access to the host root filesystem. So I think it's perfectly natural that a file:// would be considered relative to whatever sandbox root the application happens to be running under. It's 'absolute' relative to the filesystem view presented to the app by its runtime environment.

do you also see a reason for the type prefixes?

At the moment for external data, it avoids having to determine whether the data is XML or CSV. Though determining this automatically is not hard to do, there may be some performance impact for a huge file.
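As a rough illustration of automatic detection, here is a heuristic sketch (not what any client is known to do) that only inspects the start of the file. `sniff_external_data` and `probe_size` are hypothetical names.

```python
import io


def sniff_external_data(stream, probe_size=1024):
    """Guess "xml" or "csv" from the first bytes of a binary stream."""
    head = stream.read(probe_size)
    # Drop a UTF-8 byte order mark and leading whitespace, if present.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    head = head.lstrip()
    # Heuristic: XML documents start with "<"; anything else is assumed CSV.
    return "xml" if head.startswith(b"<") else "csv"


print(sniff_external_data(io.BytesIO(b'<?xml version="1.0"?><root/>')))
print(sniff_external_data(io.BytesIO(b"name,label\na,A\n")))
```

This reads at most `probe_size` bytes, so it stays cheap for large files, but it does not validate anything: a CSV whose first cell starts with "<" would be misclassified, and confirming that a file is well-formed would still mean parsing all of it.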

For images/video/audio, I believe we always use <itext> with form="image", form="video", form="audio", so I don't think we need the type prefix.

Generally speaking - eg serving pages for a website, iOS, ... - the filesystem presented to each 'application' is typically sandboxed in some way; apps simply aren't given access to the host root filesystem.

So we could say for an online-only webform, the file system root is the XForms manifest? I just want to make sure we don't assume a client downloads a bunch of files that are then available locally, because that's already not the case at the moment. So if we'd use file:// for everything it should not be documented as referring to a local file.
