'lang' attributes

Hi,

I'm building an XSLT stylesheet that transforms an ODK form into a valid HTML5
form.

I noticed build.opendatakit.org is using 3-character (ISO 639-2?)
language codes for itext translations. I prefer this to the JavaRosa
guidance to use 'human-readable' language names, because ISO codes
make it much easier to match the UI language to a form language.
However, I was wondering whether there is a reason not to use
the shorter 2-character (ISO 639-1) code instead (optionally with the
country subtag, so e.g. just 'en' for English, with 'en-US' and 'en-GB'
as options, as is done in HTML). Using the shortest ISO 639 language
code also seems to be more in line with what the XML spec mentions
(http://www.w3.org/TR/REC-xml/#sec-lang-tag and http://tools.ietf.org/html/bcp47).
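To illustrate what I mean by matching the UI language to a form language, here is a rough Python sketch. It assumes the form's itext languages are BCP 47 style tags and uses a simplified truncation fallback ('en-US' falls back to 'en'). The function name `pick_form_language` and its parameters are made up for this example and are not part of ODK or JavaRosa.

```python
def pick_form_language(ui_lang, form_langs, default=None):
    """Return the form language that best matches ui_lang, or default.

    Hypothetical helper: tries the full tag first, then drops trailing
    subtags one at a time ('en-US' -> 'en'). Matching is case-insensitive.
    """
    # Map lower-cased tags back to the spelling used in the form.
    available = {lang.lower(): lang for lang in form_langs}
    candidate = ui_lang.lower()
    while candidate:
        if candidate in available:
            return available[candidate]
        # Drop the last subtag; a tag without '-' becomes '' and ends the loop.
        candidate = candidate.rpartition("-")[0]
    return default


print(pick_form_language("en-US", ["fr", "en"]))
print(pick_form_language("de", ["fr", "en"], default="fr"))
```

With human-readable names such as 'English' there is nothing to truncate or compare against the browser's language setting, which is why codes give more scope here.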

Cheers,
Martijn

Martijn:

Will you be able to share this new tool with the community? Do you
have a timeframe for its release? I think many will find it useful.

Thanks,
Gaetano

On Wed, Mar 21, 2012 at 10:39 AM, Martijn van de Rijdt wrote:

--
Post: opendatakit@googlegroups.com
Unsubscribe: opendatakit+unsubscribe@googlegroups.com
Options: http://groups.google.com/group/opendatakit?hl=en

Hi Martijn:

We chose the 3-character code because it seemed to be more comprehensive, and given how varied our community is, it made sense to be broad. There's not really a strong reason beyond that. If you feel strongly that a 2-character code is better, we can make the change, but otherwise I'd rather not have to go through and convert the existing data within Build. Does anyone else in the community have an opinion here?

Thanks,
Clint

On Wednesday, March 21, 2012 at 4:47 PM, Gaetano Borriello wrote:

@Clint: Thanks. A 3-letter code does indeed leave room to add more
languages in the future if necessary. If there is no reason to aim
for stronger compatibility with HTML (which doesn't validate
with 3-letter lang codes), I'll just add a replace step (or accept the
HTML5 validation errors).
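The replace step could look roughly like the Python sketch below. The lookup table is a small hand-picked sample for illustration, not a complete ISO 639-2 to ISO 639-1 mapping, and the names `ISO639_2_TO_1` and `to_html_lang` are hypothetical. Codes without a 2-letter equivalent (e.g. 'haw' for Hawaiian) are passed through unchanged.

```python
# Partial sample table (assumption: a real transformation would need the
# full ISO 639-2 list, including both bibliographic and terminologic codes).
ISO639_2_TO_1 = {
    "eng": "en",
    "fra": "fr", "fre": "fr",
    "deu": "de", "ger": "de",
    "nld": "nl", "dut": "nl",
    "spa": "es",
    "swa": "sw",
}


def to_html_lang(code):
    """Replace a 3-letter primary language subtag with its 2-letter form.

    Any further subtags (e.g. a region) are kept as they are; unknown
    primary subtags are lower-cased and otherwise left alone.
    """
    base, sep, rest = code.partition("-")
    short = ISO639_2_TO_1.get(base.lower(), base.lower())
    return short + sep + rest


print(to_html_lang("eng"))
print(to_html_lang("haw"))
```

In the XSLT itself this would presumably be a lookup against a small mapping document, but the logic is the same.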

@Gaetano: I hope to release everything as open source eventually, but I
am still figuring out what the right moment would be. The XSLT stylesheet
also contains some stuff that is specific to the (offline-capable) web
app I'm developing. It may not be a logical choice for a universally
useful transformation and will not be very robust, i.e. it will only work
out of the box with 1 particular XSLT processor (but those are not
necessarily reasons not to release it). Once I've got something up and
running and have added CSS and JavaScript to provide the basic
validation, skip etc. functionality, I'll post a link. By the way,
there is also the much more advanced stand-alone XSLTForms
(http://www.agencexml.com/xsltforms). Although this is pure XForms (and
JavaRosa forms will therefore fail), I believe there is scope for
collaboration with this project. (I know the author is open to this.)

Cheers,
Martijn

On Mar 21, 5:59 pm, Clint Tseng wrote: