Search CSV with UTF-8

Hi everybody,

While testing a simple search-and-select example with a CSV file, I get an
unknown column error in the last step of the form.
If I use the original CSV file (before conversion to UTF-8), everything works
fine.

Any idea what is happening?

I have attached the XLS and XML form files (for Aggregate) and both CSV
files: the original ASCII one and the one converted to UTF-8. To test,
rename either one to dynamicdata.csv.

When I choose the original ASCII dynamicdata.csv as the media file, everything
works, but some characters show up as unknown in the client.
When I choose the UTF-8 dynamicdata.csv as the media file, the last step
returns an "unknown column hhid_key" error instead of the area.

To convert the file I use UltraEdit32 with its ASCII to UTF-8 conversion.
The original file comes from Excel's Save As (comma delimited), with ";"
replaced by ",".

Best Regards
Panos

PS: pulldata in a calculation also doesn't work well when I use the UTF-8
file. Is something wrong?

eco.rar (10.7 KB)

What I'm trying to do is select a customer, then select an order number,
and finally pull data (typea, typeb, typec) based on hhid_key.

Thanks in advance Panos

Panos,

It looks like the UTF-8 encoding your program is using is adding one or
more invisible characters to the beginning of the file, to indicate the
encoding. Perhaps we should handle and ignore those, but they are being
treated as part of the hhid_key name.
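
As a rough illustration of the point above (this is not ODK code; the typea
column and the sample values are made up for the example), here is a minimal
Python sketch of how a header read from a BOM-prefixed file ends up with an
invisible character attached to the first column name:

```python
import codecs
import csv
import io

# Hypothetical CSV content; only the hhid_key name comes from the thread.
text = "hhid_key,typea\n1001,x\n"

# Simulate an editor that saves UTF-8 with a BOM (bytes EF BB BF).
raw_with_bom = codecs.BOM_UTF8 + text.encode("utf-8")

# Decoding as plain UTF-8 keeps the BOM as the character U+FEFF.
decoded = raw_with_bom.decode("utf-8")
header = next(csv.reader(io.StringIO(decoded)))

print(header)                # the first name is '\ufeffhhid_key'
print("hhid_key" in header)  # False: a lookup by the plain name fails
```

Any reader that does not strip the BOM will behave the same way, which would
explain an "unknown column hhid_key" error for a file that looks correct in
an editor.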

One work-around would be to add an extra, blank column at the beginning,
perhaps named "ignore". That would probably solve it, as only the "ignore"
name would be messed up.
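
A small Python sketch of why the extra column helps (again only an
illustration; the sample row is invented): the BOM attaches to whichever name
comes first, so with a throwaway "ignore" column in front, hhid_key stays
clean.

```python
import codecs
import csv
import io

# Same kind of file as before, but with a dummy first column named "ignore".
text = "ignore,hhid_key,typea\n,1001,x\n"
raw_with_bom = codecs.BOM_UTF8 + text.encode("utf-8")

rows = list(csv.DictReader(io.StringIO(raw_with_bom.decode("utf-8"))))
first = rows[0]

print(list(first.keys()))  # only the "ignore" name carries the BOM
print(first["hhid_key"])   # the real key column can be looked up by name
```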

Alternatively, use the UTF-8 encoder that's built into the latest version
of Briefcase. That's meant to make it easy to convert .csv files to UTF-8
format.
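
If Briefcase is not at hand, the same kind of conversion can be sketched in a
few lines of Python. This is not the Briefcase encoder, just an illustration:
the function name convert_to_utf8, the cp1253 source encoding, and the sample
row are assumptions, so substitute whatever code page Excel actually used
when saving the file.

```python
import codecs
import os
import tempfile


def convert_to_utf8(src, dst, source_encoding):
    """Re-encode a text file as UTF-8; Python's "utf-8" codec writes no BOM."""
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding=source_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)


# Demo on throwaway files. cp1253 (Greek) is only an assumed source encoding.
sample = "hhid_key,area\n1001,Αθήνα\n"
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "original.csv")
    dst = os.path.join(tmp, "dynamicdata.csv")
    with open(src, "wb") as f:
        f.write(sample.encode("cp1253"))
    convert_to_utf8(src, dst, "cp1253")
    with open(dst, "rb") as f:
        out_bytes = f.read()

print(out_bytes.startswith(codecs.BOM_UTF8))  # False: no BOM was written
```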

Best,

Chris

··· On Mon, Jun 2, 2014 at 9:23 AM, Panos Papadatos wrote:

--
You received this message because you are subscribed to the Google Groups
"ODK Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to opendatakit-developers+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Christopher,

Your solution worked on the first try.

I use a dummy column and everything works fine: search with a simple column
or a two-field pulldata, and everything else.
UTF-8 characters also display perfectly.

Thank you very much for your quick response.

Panos

FYI: the offending characters are the UTF-8 "BOM" character sequence.

Notepad++, for example, can be configured to "Encode in UTF-8" or "Encode
in UTF-8 without BOM"

Some editors emit this as the first 3 bytes of a file (EF BB BF) to flag the
file as a UTF-8 encoded file.

Note that SurveyCTO contributed a UTF-8 conversion utility, which won't add
a BOM character.

If you are using that, it means that whatever program was used to emit the
CSV is emitting this character sequence.
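
For a file that already carries the BOM, one option is to strip it after the
fact. A minimal Python sketch, assuming the file is otherwise valid UTF-8
(the helper name strip_utf8_bom and the sample content are hypothetical):

```python
import codecs
import os
import tempfile


def strip_utf8_bom(path):
    """Rewrite the file without a leading UTF-8 BOM; return True if one was removed."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True


# Demo on a throwaway file so nothing real is modified.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "dynamicdata.csv")
    with open(demo, "wb") as f:
        f.write(codecs.BOM_UTF8 + b"hhid_key,typea\n1001,x\n")
    removed = strip_utf8_bom(demo)
    with open(demo, "rb") as f:
        cleaned = f.read()

print(removed)      # True: a BOM was found and removed
print(cleaned[:8])  # the file now starts directly with the column name
```

Working on raw bytes avoids guessing at line endings or re-encoding the rest
of the file; only the three leading bytes are dropped.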

--
Mitch Sundt
Software Engineer
University of Washington
mitchellsundt@gmail.com