I've been trying to create a form which includes Amharic (Ethiopian)
characters. The uploaded form is (as far as I can tell) encoded
correctly as UTF-8; at least, it displays the characters correctly
when I open the file in my text editor (gedit on Ubuntu).
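
A text editor can guess the encoding, so a strict decode is a more reliable check than just looking at the file. Below is a minimal sketch in Python; the function names and the file name in the usage note are placeholders for illustration, not part of ODK.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (strict mode)."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


def check_file(path: str) -> bool:
    """Read a file as raw bytes and test whether it is valid UTF-8."""
    with open(path, "rb") as handle:
        return is_valid_utf8(handle.read())


# Sample Amharic text encoded as UTF-8 passes the check.
sample = "ሰላም".encode("utf-8")
print(is_valid_utf8(sample))

# A truncated multi-byte sequence fails it.
print(is_valid_utf8(sample[:2]))
```

Calling check_file("TestingAmharic.xml") on the local copy of the form (path assumed) would confirm whether the file on disk really is UTF-8 before it is uploaded.
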
However, after uploading the form, when I view it (on the Form XML viewer
page) or download it via ODKAggregate, the Amharic characters are not
displayed correctly. You can see the form at: https://hew-datacollect.appspot.com/formXml?formId=Amharic_test.
I saw another post regarding issues with Cyrillic scripts (http://
+encoding#d727d6e415c5695f), so I assume this is the same issue. The
HTTP headers when I download the form show that UTF-8 is set correctly,
and the XML declaration specifies UTF-8 too. For info, the HTTP
headers when I download the form are:
GET /formXml?formId=Amharic_test HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:5.0) Gecko/20100101
Accept-Encoding: gzip, deflate
HTTP/1.1 200 OK
Content-Type: text/xml; charset=utf-8
Content-Disposition: attachment; filename="TestingAmharic.xml";
Server: Google Frontend
On the other hand, I can submit a form response (via ODKCollect) which
includes Amharic script, and it is displayed fine when I view the
form responses on ODKAggregate. So it seems to me that there may be an
issue with the encoding that the form uploader accepts?
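
To illustrate the kind of failure I suspect, here is a small Python sketch of two common ways UTF-8 text gets damaged when it is handled with the wrong charset. This is only a guess at the failure mode, not a description of what ODKAggregate actually does, and the function names are made up for the example.

```python
def misdecode_as_latin1(text: str) -> str:
    """Simulate reading UTF-8 bytes as if they were ISO-8859-1 (Latin-1)."""
    return text.encode("utf-8").decode("latin-1")


def lossy_ascii(text: str) -> str:
    """Simulate forcing text through a charset that lacks the characters."""
    return text.encode("ascii", errors="replace").decode("ascii")


original = "ሰላም"

# Wrong decoding: every UTF-8 byte becomes its own (wrong) character.
garbled = misdecode_as_latin1(original)
print(repr(garbled))

# This damage is reversible, because no bytes were lost.
print(garbled.encode("latin-1").decode("utf-8") == original)

# Lossy conversion: each Amharic character becomes '?' and cannot be recovered.
print(lossy_ascii(original))
```

If the downloaded form shows question marks, the characters were probably lost during upload or storage; if it shows runs of accented Latin characters instead, the bytes are likely intact and only the decoding step is wrong.
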
I'd really like to get a fix for this, but I'm not sure where to start
looking in the ODKAggregate code. If someone can give me some
pointers, I'm happy to see if I can figure out where the problem is.