Dari and Pashto in ODK?

Pau_Varela · March 1, 2011, 12:50pm

thanks yaw,

I'll take a look at the conversion tool and the sed script you sent
me. The script option was always there, but I wanted to be sure I
wasn't missing anything... now you've confirmed that native2ascii is
the problem.

most important for me was to let the community know it's possible to
use ODK-Collect with Farsi characters, something that, as far as I
know, no one had tested before.

best,

pau.

2011/2/25 Yaw Anokwa:
> hey pau,
>
> the problem is that native2ascii doesn't provide xml-safe characters.
> not much javarosa or odk can do about that.
>
> http://rishida.net/tools/conversion/ is an online tool, but it should
> give you the hex ncrs you need.
>
> another option is to write a script that takes the native2ascii output
> and makes it xml safe. i haven't thoroughly tested this, but it should
> give you the general idea.
>
> # create ascii version of in.txt and write it to out.txt
> native2ascii -encoding utf8 in.txt out.txt
>
> # replace \u with &#x. put ; after the four char code.
> sed -i '' -e 's/\\u/\&#x/g' -e 's/\(x[0-9a-z]\{4\}\)/\1\;/g' out.txt
>
>
> On Fri, Feb 25, 2011 at 01:44, Pau Varela wrote:
>> Main problem, and this is a question for the list (although it's not
>> 100% odk, sorry), was to convert characters to utf-8, since I used
>> native2ascii to convert them from farsi to utf-8. Odk (javarosa?) is
>> not able to read the produced output by native2ascii and I had to
>> manually convert it to some xml-readable format.. (e.g: from "\u067e"
>> to "&#x067e;" )
>
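
For anyone picking up the sed idea from the quoted message, here is a slightly tighter variant as a rough sketch, not a tested recipe. In the two-expression version, the second pattern matches any "x" followed by four letters or digits, so ordinary text such as "example" would also get a semicolon. The version below rewrites each escape in a single step instead. It assumes the input only contains four-digit `\uXXXX` escapes as written by native2ascii; `to_ncr` is a made-up helper name, not part of ODK or JavaRosa.

```shell
# Hypothetical helper (not part of ODK or JavaRosa): read text on stdin,
# rewrite each \uXXXX escape as an XML hex character reference (&#xXXXX;),
# and write the result to stdout. Assumes four-hex-digit escapes only.
to_ncr() {
  # \\u matches a literal backslash followed by "u"; the four hex digits
  # are captured and reused as \1. The & in the replacement is escaped
  # because a bare & means "the whole match" to sed.
  sed -E 's/\\u([0-9a-fA-F]{4})/\&#x\1;/g'
}

# Example: only the escapes change, the surrounding text is left alone.
printf '%s\n' 'label=\u067e\u0634 example' | to_ncr
# prints: label=&#x067e;&#x0634; example
```

To run it on the native2ascii output from the quoted message, something like `to_ncr < out.txt > out-xml.txt` should do, where `out-xml.txt` is just a placeholder file name.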