How to Work With Large Data with Looped / Group Entries

What is the problem? Please be detailed.
I've used ODK to conduct a very large household survey. We have more than 10,000 submissions in our Aggregate instance, which runs on a Tomcat server hosted on a VPS.
I can't filter group / looped data entries in ODK Aggregate. Group data is also not visible in Excel. How can I work with this sort of data? Are there any recommended tools for this?

Please help me with suggestions and best practices.
Thanks in advance.

Some options are:

The first will let you then use the data set with any analysis tool you need but will not update automatically as new forms are submitted. The last two will update as new forms come in.

But how can I display group / looped data in Excel? The group columns only contain a link to Aggregate.

Please read about pulling forms and exporting using Briefcase. That is how you can get Excel documents for analysis.

But the normal method will not export group / looped data. Are there any solutions for repeating groups?

All three options I listed above will allow you to get an Excel file with repeats. Pulling submissions with Briefcase is probably the simplest. Please try one.
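Once you have the export, the repeat rows usually need to be matched back to their parent submission before you can analyse them in one sheet. Here is a minimal sketch of that join in Python, assuming the export gives you one CSV for the main form and one for the repeat, and that the repeat file has a `PARENT_KEY` column holding the `KEY` of its parent submission. Those column names, the `attach_repeats` helper, and the sample data are assumptions for illustration, so check them against your own files.

```python
import csv
import io


def attach_repeats(parent_rows, repeat_rows, key="KEY", parent_key="PARENT_KEY"):
    """Return one flat row per repeat entry, with its parent's columns attached."""
    # Index parent submissions by their key for fast lookup.
    parents = {row[key]: row for row in parent_rows}
    joined = []
    for child in repeat_rows:
        parent = parents.get(child[parent_key])
        if parent is None:
            continue  # repeat row without a matching parent; skipped in this sketch
        # Prefix parent columns so they cannot collide with repeat columns.
        merged = {f"parent_{name}": value for name, value in parent.items()}
        merged.update(child)
        joined.append(merged)
    return joined


# Tiny in-memory stand-ins for the two exported CSV files (made-up data).
households_csv = "KEY,village\nuuid:1,Alpha\nuuid:2,Beta\n"
members_csv = (
    "PARENT_KEY,KEY,name\n"
    "uuid:1,uuid:1/members[1],Asha\n"
    "uuid:1,uuid:1/members[2],Ravi\n"
    "uuid:2,uuid:2/members[1],Mina\n"
)

households = list(csv.DictReader(io.StringIO(households_csv)))
members = list(csv.DictReader(io.StringIO(members_csv)))
flat = attach_repeats(households, members)
```

With real files you would pass `csv.DictReader(open(path, encoding="utf-8", newline=""))` instead of the in-memory strings, then write `flat` out with `csv.DictWriter`.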

None of these options works for me :frowning_face:

  • My survey data contains Unicode characters. When I export from Briefcase, ??? marks appear instead of the actual letters, even though Aggregate displays all Unicode characters correctly. So exporting from Briefcase is not an option for me.

  • I've tried publishing to Fusion Tables and to a spreadsheet, but both methods only publish about 100 rows. I actually have more than 10,000 rows. Is there any solution to this problem?

It's unlikely that the Unicode issue is caused by Briefcase. What program are you using to open the CSV files? If it's Excel, you will need to specify the character encoding when you import the file. The instructions at Arabic language in ODK - #17 by yanokwa will likely help with that.
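If you would rather not set the encoding by hand on every import, one possible workaround is to rewrite the CSV as UTF-8 with a byte order mark, which Excel generally uses to detect UTF-8 when a file is opened directly. This is only a sketch and assumes the exported file is already valid UTF-8; the `add_utf8_bom` helper, the file names, and the sample text are made up for illustration.

```python
import os
import tempfile


def add_utf8_bom(src_path, dst_path):
    """Copy a UTF-8 CSV to dst_path, writing a byte order mark at the start."""
    # "utf-8-sig" on read drops an existing BOM, so the output never gets two.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    # "utf-8-sig" on write puts the BOM bytes EF BB BF in front of the content.
    with open(dst_path, "w", encoding="utf-8-sig", newline="") as dst:
        dst.write(text)


# Demonstration with a throwaway file containing non-ASCII text (made-up data).
sample_text = "KEY,name\nuuid:1,سلام\n"
with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "export.csv")
    converted = os.path.join(workdir, "export_for_excel.csv")
    with open(original, "w", encoding="utf-8", newline="") as handle:
        handle.write(sample_text)
    add_utf8_bom(original, converted)
    with open(converted, "rb") as handle:
        converted_bytes = handle.read()
```

The converted copy is only for opening in Excel; keep the original export for any other analysis tools, since some of them treat the byte order mark as part of the first column name.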

Let's try Briefcase first and if that's still not working we can look at the publishers. My guess is that you are getting rate limited and that the rest of the records will appear over time. The publishers are better adapted to processing data as it comes in.