Placeholder/alt text for images (eg screen readers)

What high-level problem are you trying to solve?

Address the desire to be able to display a text placeholder for images when they cannot be displayed, or when doing so is not appropriate: for example, when using a screen reader, or when a form needs to be rendered in a purely textual manner (eg preview, automated testing, etc).

Any ideas on how ODK could help you solve it?

  1. Introduce a new (optional) feature to XLSForm to be able to provide a text 'description' alongside any image media attachment, in either the main survey or in choice options; eg image_description

  2. Introduce a corresponding new (optional) form= attribute option (for text strings) that - in addition to the existing <value form="image"> that today associates image media to the text - can provide an additional textual description of the image; eg <value form="image_description">. [I'll call this option A]
    Alternatively, instead of adding a new form= attribute, extend the existing <value> element, only for form="image", to include an optional description field [I'll call this option B]

Upload any helpful links, sketches, and videos.

This feature request is driven by the desire to associate a text description placeholder with any image shown in a form, primarily to support screen readers for web-based forms. Specifically, if this information is available in a form, then the (web) renderer, eg Enketo or ODK's new Web-Forms, can use it to populate the alt attribute in the resulting HTML rendition, eg <img src="img_girl.jpg" alt="Girl in a jacket">, to then be picked up by a screen reader.

Further, as noted, there may be other circumstances where a purely text-based rendering of forms is desired - eg preview, printing, testing - so the availability of image descriptions in the form definition could have uses elsewhere too.

Proposal: XLSForm image_description column

Add a new column to XLSForm for specifying an image description. This column will contain text to associate with the image, eg a description. This text will obviously need to change according to the selected language (just as the image itself can already differ according to the current language!).

For a monolingual form, this would look like:
MonoLingualImages_proposed.xlsx (18.6 KB)

For a multi-lingual form this would look like:
MultiLingualImages_proposed.xlsx (19.0 KB)

Proposal: XForm new form attribute [option A flavor]

The above new image_description column would - under option [A] - be translated into a new optional element alongside the existing optional media associated with text. For the above example of a monolingual form, the XForm would look like:
MonoLingualImages_proposed.xml (3.1 KB)

For a multi-lingual form, the XForm would look like:
MultiLingualImages_proposed.xml (4.7 KB)
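
For readers without the attachments, a minimal sketch of the multi-lingual case under option [A] might look roughly like this. It is not the content of the attached files: the itext id, language names, file names and description strings are all invented for illustration, and form="image_description" is the proposed value, not something in the current ODK XForms spec.

```xml
<!-- Sketch only: form="image_description" is the PROPOSED value and is not
     part of the current ODK XForms spec. Ids, languages, file names and
     description strings are hypothetical. -->
<itext>
  <translation lang="English (en)">
    <text id="/data/species/eagle:label">
      <value>Eagle</value>
      <value form="image">jr://images/eagle_en.jpg</value>
      <!-- Proposed: text a client could use as alt text for the image above -->
      <value form="image_description">A golden eagle perched on a branch</value>
    </text>
  </translation>
  <translation lang="French (fr)">
    <text id="/data/species/eagle:label">
      <value>Aigle</value>
      <!-- A translation may point at a different image file -->
      <value form="image">jr://images/eagle_fr.jpg</value>
      <value form="image_description">Un aigle royal perché sur une branche</value>
    </text>
  </translation>
</itext>
```

In XLSForm terms this would presumably correspond to columns such as image_description::English (en) and image_description::French (fr), following the pattern of the existing translatable columns.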

I believe this is consistent with how related things like media and guidance hints are being expressed in ODK XForms presently:

These image descriptions are basically supplementary to existing images (in both form controls and options), which in turn are supplementary to text labels. Hence it seems natural to add them as a new additional form="image_description" attribute to the existing form="short" (unused?) and form="image" extensions for labels.

(an option [B] flavor of this is ostensibly the same; I summarize the difference in Unresolved Issues, below)

Unresolved Issues

  • select a suitable new name for the XLSForm column containing image descriptions. I propose image_description (see above XLSForm examples), but have no strong preference.

  • [A] select a suitable name for a new form='...' attribute in XForm with which to define the associated text description; eg <value form="image_description">eagle</value> (see above XForm examples)

  • or [B] select a suitable optional attribute name (only) for the existing form='image' attribute in XForm, with which to add additional optional text for the specified image file; eg <value form="image" description="whale">a.jpg</value>

I have a slight preference for [A], just because in [B] the way you specify the image filename (XML element body) and the way you specify the image description (XML element attribute) are quite different. [A] also lends itself well to specifying an image description alone, although I don't see a description without an image being particularly useful for screen readers, nor when it might ever be required.

If [A] is the chosen path, then another unresolved issue - for both XLSForm and XForm - is:

  • if an image description is provided but an actual image is not, is this a fatal error (in either XLSForm, or in XForm for that matter) or just a warning?
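
To make the difference concrete, the two options and the description-without-image case could be sketched side by side as follows. Neither variant exists in the current spec; the ids and label text are invented, and the file name and description strings are simply reused from the bullets above.

```xml
<!-- Sketch only: neither option exists in the current spec. Ids and label
     text are hypothetical. -->
<translation lang="default">
  <!-- Option [A]: the description is a sibling <value> with its own form -->
  <text id="/data/q_a:label">
    <value>Which animal did you see?</value>
    <value form="image">jr://images/a.jpg</value>
    <value form="image_description">eagle</value>
  </text>

  <!-- Option [B]: the description is an attribute on the image <value> -->
  <text id="/data/q_b:label">
    <value>Which animal did you see?</value>
    <value form="image" description="whale">jr://images/a.jpg</value>
  </text>

  <!-- Only expressible under [A]: a description with no image. Whether this
       should be an error or a warning is the open question above. -->
  <text id="/data/q_c:label">
    <value>Which animal did you see?</value>
    <value form="image_description">eagle</value>
  </text>
</translation>
```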

Other things considered

For the purpose of populating alt text just for the images shown against form controls - eg logos at the start of a form - this could also be accomplished using custom body attributes.

However, this would not readily accommodate multi-language translations for image descriptions (it would effectively entail adding a custom attribute for every language!).

Nor can corresponding custom attributes be specified for choice options AFAIK (and even if you could, you are back to a custom attribute for every language...). This means you couldn't provide screen readers with an alt text description for image-only select questions.
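
To illustrate the limitation, a custom-attribute workaround might end up looking something like the sketch below. The attribute names alt_en and alt_fr are invented purely for illustration, and no client is assumed to interpret them.

```xml
<!-- Sketch only: 'alt_en' and 'alt_fr' are hypothetical custom attributes.
     One attribute is needed per language, which is the scaling problem
     described above. The ref and label id are also hypothetical. -->
<input ref="/data/logo"
       alt_en="Ministry of Health logo"
       alt_fr="Logo du ministère de la Santé">
  <label ref="jr:itext('/data/logo:label')"/>
</input>
```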

BTW, another thing considered (and rejected) - for the purpose of populating web-rendered <img alt="xxx"> text - would be to have the web client (ie Enketo) simply always populate the alt text with, say, the label text, which is already being translated. Or possibly repurposing the existing hint or guidance_hint to acquire this text.

The problem with this is that it precludes the existing desired use case of displaying a logo at the top of the form (without any text!). Hence the proposal for purely separate placeholder text for the image.

Note, such image descriptions/alt text could also be useful for quickly previewing forms, without having to worry about the actual media files just yet. For example, the XLSForm Online preview currently only shows "image" for every placeholder.

Thanks @Xiphware. Just to add that Kobo would happily implement this new feature. But as always, we'd love the wise input of @LN and others, since this would add an optional column to pyxform and would introduce an XForm spec change. :pray:

Thanks for posting this! I support this feature from an accessibility perspective and would like to use it in my work on Kobo and Enketo.

Thanks for the thorough proposal and happy birthday! :birthday:

This feels more like a downside than an advantage to me. Specifically, like you said, it means we have to figure out what to do with the case where a description exists without an image. Can you elaborate a bit on why you like [A] more? The description does feel like attribute data to me so [B] feels a bit more natural.

Have you considered accessibility needs for video and audio? I think it'd be worth giving them a bit of thought to see whether it could change this proposal. Both questions and choices can have any or all of image, video and audio associated with them.

I think it's defensible to recommend that any necessary description of audio or video go in label or hint text but you may have other ideas.

The short form is used by Collect in the summary screen. It's not currently exposed in pyxform so is probably really rare.

So my interpretation of all the various permitted label form embellishments is that the XForm definition is basically providing a number of additional and/or alternative ways (ie different 'forms') in which to render its label - eg as abbreviated text, or visually, or audibly, ... - depending on the desired context and needs. For example:

<text id="how-old-label">
    <value>How old are you?</value>
    <value form="short">Age</value>
    <value form="image">jr://images/b.jpg</value>
    <value form="big-image">jr://images/b_big.jpg</value>
    <value form="audio">jr://audio/goldeneagle.mp3</value>
</text>

So in that sense, the text to be rendered in the context of a screen reader (when the control appearance or content doesn't permit rendering the regular label text...) is mostly just another flavor of a form embellishment. Hence "form=image_description", or equivalently "form=accessibility", seems a fairly natural extension of this rather functional approach to providing various alternatives to show instead of the base text label.

It is also the case that with images we already have a situation where a form="big-image" alternative effectively embellishes an existing one ("image") even further. So I don't see form="image_description" implicitly requiring an associated form="image" as necessarily divergent or incongruous (from media: "Specifying “big-image” alone has no effect, you must always include “image”"). Whereas introducing a description= attribute to the existing <value> element would be something entirely new to this sub-framework.

But again, it's a slight preference and I certainly have no violent objection to the other.

That's compelling! Sounds good to me.

Any thoughts around any potential future desire to add accessibility information related to video and audio? In HTML5, both can include fallback content within their tag. But I think there's also a general recommendation to include descriptive text around the media assets so they may not need to be addressed in a special way.

The obvious extrapolation - if we choose to go with a form="image_description" (or whatever name we want to call it) - would be a similar form="video_description" to provide text in lieu of a video snippet for things like screen readers. And then, of course, a form="audio_description" would seem to naturally follow.
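
As a rough sketch of that extrapolation, the earlier how-old-label example could grow as follows. All three *_description forms are proposed names only, and the video file name and description strings are invented for illustration.

```xml
<!-- Sketch only: the *_description forms are proposed names, not part of
     the current spec. The video file and all description strings are
     hypothetical. -->
<text id="how-old-label">
  <value>How old are you?</value>
  <value form="image">jr://images/b.jpg</value>
  <value form="image_description">A child holding up five fingers</value>
  <value form="audio">jr://audio/goldeneagle.mp3</value>
  <value form="audio_description">Recording of a golden eagle call</value>
  <value form="video">jr://video/b.mp4</value>
  <value form="video_description">A short clip of a child counting on their fingers</value>
</text>
```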

I think it's also worth noting again that these can serve a useful purpose for things other than screen readers. This text can, for example, be used to provide a more meaningful placeholder when quickly previewing forms (without having to fetch everything in the manifest), or, especially in the case of audio, when printing the form.

[It also just occurred to me that another potential issue with going with an option [B] approach - that is, introducing a new optional description attribute for all these media-related <value> elements - is there is now the potential of, say, having both image and big-image each having their own (different!) description [so presumably a screen reader may then need to flip between them?] Or what does a client do when the big-image has a description but the image doesn't? ...
A separate <media-type>_description form flavor - option [A] - would avoid this potential ambiguity.]
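
The ambiguity could be sketched like this. The description= attribute is the hypothetical option [B] syntax, and the ids and description strings are invented for illustration.

```xml
<!-- Sketch only: description= is the hypothetical option [B] attribute. -->
<translation lang="default">
  <!-- Case 1: both images carry different descriptions. Which one should a
       client hand to the screen reader? -->
  <text id="case-1-label">
    <value>How old are you?</value>
    <value form="image" description="A child holding up five fingers">jr://images/b.jpg</value>
    <value form="big-image" description="Close-up of a raised hand">jr://images/b_big.jpg</value>
  </text>

  <!-- Case 2: only the big-image has a description -->
  <text id="case-2-label">
    <value>How old are you?</value>
    <value form="image">jr://images/b.jpg</value>
    <value form="big-image" description="Close-up of a raised hand">jr://images/b_big.jpg</value>
  </text>

  <!-- Option [A] equivalent: there is a single place for the description,
       covering the image at whatever size it is shown -->
  <text id="option-a-label">
    <value>How old are you?</value>
    <value form="image">jr://images/b.jpg</value>
    <value form="big-image">jr://images/b_big.jpg</value>
    <value form="image_description">A child holding up five fingers</value>
  </text>
</translation>
```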

Bump. So is there a general consensus that this would indeed be a useful new feature to add to ODK forms, and that the proposal above ([A]) is a reasonable approach to it? If so, we can begin sizing the effort entailed in getting a PR ready, initially targeted at browser-based screen readers with Enketo.

Or are there other outstanding issues that you think still need to be addressed first before proceeding? Thanks.

I'd like to get a sense of how likely it is that there will be a desire to have targeted alternative text for audio and video, possibly for the uses you mentioned. If you expect it's likely, I think it's worth spending a bit more time trying to come up with a way to express it in XLSForm that doesn't require this explosion of columns. It could be something like a single alt_text::lang column with key/value pairs like we do with parameters, for example (image="", audio=""). In that case we'd want to take that approach in XLSForm from the start.
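
As a rough illustration of the single-column idea, one hypothetical cell and one conceivable XForm output are sketched below. The column name, the cell syntax, the ids, the file names and the *_description forms are all assumptions, not anything pyxform supports today.

```xml
<!-- Sketch only. Hypothetical XLSForm cell in an alt_text::English (en)
     column:

       image="A golden eagle in flight" audio="Call of a golden eagle"

     One conceivable XForm output, reusing the option [A] naming discussed
     earlier in the thread (all names are proposals, not current spec): -->
<text id="/data/bird:label">
  <value>Golden eagle</value>
  <value form="image">jr://images/eagle.jpg</value>
  <value form="image_description">A golden eagle in flight</value>
  <value form="audio">jr://audio/goldeneagle.mp3</value>
  <value form="audio_description">Call of a golden eagle</value>
</text>
```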

Unless anyone else has anything to add, I think the client work introducing the new forms could start any time.

Not sure I fully understand... Are you saying that the underlying proposed XForm representation - ie a new form attribute (eg form="image_description", or some other keyword) - is appropriate, but that there is still a question over how this should best be exposed in XLSForm (ie multiple columns for each media-type flavor vs a single column with key/value pairs...)?

Client work can only meaningfully proceed once a suitable XForm representation is agreed upon.

I think the bar for adding more translatable columns should be relatively high. Currently there are many such columns available in XLS/XForms to meet a wide range of needs: label, hint, guidance_hint, constraint_message, required_message, image, audio, video. Maintaining / testing these with all possible feature regressions in pyxform is a challenge, and sometimes adding too many options can make it harder to learn and use the tools.

Can the use cases be elaborated with practical usages or motivating examples? The examples are quite abstract e.g. an option labelled "yes" having alt-text "eagle", and a question labelled "This is an English note" with "whale". If it is just about logos then maybe a different approach altogether is needed. Is it a realistic/likely scenario that a user would specify and translate alt-text, and require the no-buttons appearance such that this potentially useful content is not visible by default, and be deploying their survey exclusively to browser clients (in order to make use of the alt-text, assuming Android/Collect has no equivalent to alt-text)? I would assume any useful alt-text is good content that would be helpful for all users - in which case the existing text options of label/hint/guidance_hint could be used, and a screen reader would pick up these text fields. I mean, I don't really need subtitles yet but I always enable them in Netflix because they help comprehension sometimes. Is it realistic that a user would need to specify different alt-text for each media type? Has there been user research conducted with people that use screen readers, and people that design forms for those users?

Also I'm not clear on why the option of multi-purposing label/hint/guidance_hint was "rejected". If WebForms/Enketo could interpret a setting (survey-level, or question-level like the existing "appearance" column) to use text content as alt-text instead of displaying it normally, then the requirement is met. The WebForms/Enketo teams should probably have a look at this because it would be premature to add support to XForms/pyxform without knowing how, whether, or when those clients can use it - or what other solutions might be possible or feasible.