Briefcase CLI parallel download

Hi,

I am really excited about the new parallel download feature in Briefcase. It will be very useful for our project. Thanks! Is it possible to use this feature from the command line interface?


Hmm. I haven't tried it on the CLI, but it might already work. Since this is an experimental feature, maybe you can help us experiment!

  1. Try the Briefcase 1.7.0 release on your server with parallel downloading in the GUI and see if it is indeed faster for your use-case.
  2. Then try leaving parallel downloading turned on in the GUI, but quit Briefcase and run a fresh test using the CLI.
  3. Report back on what you find. If it works, awesome, we need to document it. If it doesn't work, then we need to file an issue to add it to the CLI, because I agree it would be very useful.

I will try this and get back to you. How are the options chosen in the GUI stored on the server?

The GUI options aren't stored on the server. They are stored in Briefcase in the settings tab.

I have been playing around with this on an AWS server this morning.

When downloading 500 submissions there is a significant speedup (from 4 min to 1.30 min) with parallel download enabled. The result is similar with both the GUI and the CLI. Unfortunately, when using parallel download I do get quite a few 500 Internal Server Error responses from the server. I assume this is partly why the feature is still experimental.

It is great that checking whether a record exists before downloading it is now much faster. I played around with this myself on an earlier version of Briefcase. In my use case we have maybe 200 000 existing submissions and want to download the 100 new ones. I found that for this use case increasing the chunk size made a very big difference. I think it is currently 100, and I increased it to 10 000. This provided a large speedup for me. So in the future maybe it could be possible to specify this as a command line argument.
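As a rough illustration of why the chunk size matters here, the sketch below counts how many chunks are needed to walk through the existing submissions. It assumes one server round trip per chunk, which is an assumption for illustration and not a confirmed description of Briefcase internals; `ChunkCount` and `chunks` are hypothetical names.

```java
public class ChunkCount {
    // Number of chunks needed to cover `total` items, rounding up.
    // Assumption: each chunk costs one request to the server.
    static long chunks(long total, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        return (total + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        long existing = 200_000; // figure quoted in the post above
        System.out.println("chunk size 100:    " + chunks(existing, 100) + " chunks");
        System.out.println("chunk size 10000:  " + chunks(existing, 10_000) + " chunks");
    }
}
```

Under that assumption, going from 100 to 10 000 cuts the number of round trips by a factor of 100, which would be consistent with the speedup described, although the real cost per chunk on the server is not known from this thread.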

I also just want to mention that on a Linux system the preferences chosen in the GUI are stored in ~/.java/.userPrefs/org/opendatakit/briefcase. So it is possible to edit them directly, even without using the GUI.
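That directory is where the standard Java Preferences API keeps per-user settings on Linux, so the values can also be inspected from a small program. The sketch below is a minimal example, assuming the node path mirrors the directory path above; `BriefcasePrefs` and `read` are hypothetical names, the actual key names are not listed in this thread, and some keys may live in child nodes rather than directly under this one.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.prefs.BackingStoreException;
import java.util.prefs.Preferences;

public class BriefcasePrefs {
    // Read all key/value pairs stored directly under a user preferences node.
    // Returns an empty map if the node does not exist (and does not create it).
    static Map<String, String> read(String nodePath) throws BackingStoreException {
        Map<String, String> values = new TreeMap<>();
        Preferences root = Preferences.userRoot();
        if (!root.nodeExists(nodePath)) {
            return values;
        }
        Preferences node = root.node(nodePath);
        for (String key : node.keys()) {
            values.put(key, node.get(key, ""));
        }
        return values;
    }

    public static void main(String[] args) throws BackingStoreException {
        // Assumed node path, taken from the directory mentioned above.
        read("org/opendatakit/briefcase")
            .forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Run it as the same user that runs Briefcase, since user preferences are stored per account.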


Gunnar, I'm glad to hear it's working on the CLI and thanks for the tip on the preferences.

I see that you have some code changes on your fork already for Briefcase to adjust the chunk size. We'd love to have this code in trunk, if possible, so you don't have to maintain a fork.

Can you file an issue so we can discuss if bumping the number or making it configurable is the right approach? Once we have some consensus, it'd be awesome to see a PR from you so we can get this into trunk.

Please also file another issue describing your setup and the kinds of internal errors that you are getting. And if you have any insights into how we can reduce those errors (e.g., do they go away if you reduce the number of parallel threads), include those in the issue.

I will create an issue and try to submit a pull request this week. My Java is a bit rusty, but I will try to clean up my fork a bit first.
