Hi all,
I've been trying to create a form which includes Amharic (Ethiopian)
characters. As far as I can tell, the form I uploaded is correctly
encoded as UTF-8. At least, it displays the characters correctly
when I open the file in my text editor (gedit on Ubuntu).
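
Since a text editor may guess the encoding, a stricter check is to decode the raw bytes as UTF-8 and see whether that fails. This is only a minimal sketch: the helper name is_valid_utf8 and the sample word are illustrative and not part of ODK.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# Sample Amharic text (the word "selam"), encoded the way the form should be.
good_bytes = "ሰላም".encode("utf-8")

# The same bytes with the last one cut off, i.e. a broken multi-byte sequence.
bad_bytes = good_bytes[:-1]

good_result = is_valid_utf8(good_bytes)
bad_result = is_valid_utf8(bad_bytes)
```

To check a real file, read it in binary mode (open(path, "rb").read()) and pass the bytes to the helper, so that no decoding happens before the check.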
However, after uploading the form, if I view it (on the Form XML viewer
page) or download it via ODKAggregate, the Amharic characters are not
displayed correctly. You can see the form at: https://hew-datacollect.appspot.com/formXml?formId=Amharic_test.
I saw another post regarding issues with Cyrillic scripts
(http://groups.google.com/group/opendatakit/browse_thread/thread/d5620877b4e9cf05/d727d6e415c5695f?lnk=gst&q=language+encoding#d727d6e415c5695f),
so I assume this is the same issue. The HTTP response headers when I
download the form show that UTF-8 is set correctly, and the XML
declaration sets it to UTF-8 too. For info, the HTTP headers when I
download the form are:
https://hew-datacollect.appspot.com/formXml?formId=Amharic_test
GET /formXml?formId=Amharic_test HTTP/1.1
Host: hew-datacollect.appspot.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:5.0) Gecko/20100101 Firefox/5.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
DNT: 1
Connection: keep-alive
Cookie: JSESSIONID=3h591gjuahT7_6WeZqfSlQ
HTTP/1.1 200 OK
Content-Type: text/xml; charset=utf-8
Content-Disposition: attachment; filename="TestingAmharic.xml";
Content-Encoding: gzip
Server: Google Frontend
Cache-Control: private
Content-Length: 487
On the other hand, I can submit a form response (via ODKCollect) which
includes Amharic script, and it is displayed fine when I view the
form responses on ODKAggregate. So it seems to me that there may be an
issue with the encoding that the form uploader accepts?
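
To illustrate the kind of failure I suspect, here is a minimal sketch of what happens when UTF-8 bytes are decoded with a single-byte charset. It is an assumption that something like this happens on upload; I have not confirmed it in the ODKAggregate code. The choice of ISO-8859-1 as the wrong charset and the function name misdecode are both hypothetical.

```python
def misdecode(text: str, wrong_charset: str = "iso-8859-1") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong charset."""
    return text.encode("utf-8").decode(wrong_charset)


original = "ሰላም"  # three Amharic characters, three UTF-8 bytes each

# Each byte becomes its own character, so the text turns into unrelated symbols.
garbled = misdecode(original)

# With ISO-8859-1 the damage is reversible, because every byte maps to
# exactly one character: re-encode with the wrong charset, decode as UTF-8.
recovered = garbled.encode("iso-8859-1").decode("utf-8")
```

If the stored form shows roughly three junk characters for every Amharic character, that would fit this pattern. If it shows question marks instead, the characters were probably lost at encoding time and cannot be recovered this way.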
I'd really like to get a fix for this, but I'm not sure where to start
looking in the ODKAggregate code. If someone can give me some
pointers, I'm happy to see if I can figure out where the problem is.
Cheers,
Alex