Dear ODK Support Team,
Greetings from Chile. We are implementing ODK for field inspection/enforcement-style workflows (operational data capture, often offline), and we’ve encountered a scaling limitation we need guidance on.
Issue summary
We use a dynamic choice list populated from an external CSV.
The list appears to be capped at 14,998 records.
Our operational registry of “agents / inspectable entities” is ~178,000 records (target ~180,000).
What we need
A recommended approach to allow users to search/select an entity in the field (Android, offline-capable) without requiring custom development by a vendor.
Options we’ve considered
Split into multiple forms / smaller lists (e.g., macro-zones North/Central/South).
Cascading selection (macro-zone → region → municipality → agent) via filtering (choice_filter / itemsets).
Use select_one_from_file and/or search() appearance to avoid loading the full list.
Replace list selection with direct entry of an identifier (ID / RUT / code) or QR/barcode scan.
Evaluate ODK Central Entities (if appropriate) as a master registry for large catalogs.
Questions for guidance
Is the 14,998 cap a known “hard limit”? If yes, which component enforces it (Collect, Enketo/Web Forms, XLSForm/pyxform, Central, etc.)?
Is there any configuration (server or client) that can raise this limit in a self-hosted deployment?
For a catalog near 180k, which option(s) above do you recommend as best practice, balancing performance (Android), usability, and operational robustness?
Specifically: does select_one_from_file + search/filter bypass the limit in practice, or does the cap still apply? If it works, what is the recommended implementation pattern?
Any official documentation / guidance on large catalogs (suggested thresholds, performance considerations, do’s/don’ts)?
Environment (details available upon request)
We can provide:
ODK Central version (self-hosted)
ODK Collect and/or Enketo versions used
CSV structure and the XLSForm pattern we are implementing
At this stage, we don’t have a shareable minimal XLSForm/CSV package prepared, but we can produce and share one promptly if needed to reproduce the behavior.
We’d appreciate your direction on the most stable and scalable design path so we can move forward with rollout.
Kind regards,
It is not a known hard limit. Can you please share what you are seeing that gives that impression? There are no hard caps anywhere, but depending on how you're accessing values in the CSV, performance could be prohibitively poor.
Sharing a form with the structure you anticipate would really help with offering performance guidance. If you can share a minimal XLSForm/CSV package, that would let others on the forum chime in. Alternatively, you can share a form by private message that I can analyze with the team.
@Carlos_Navarro_Jerez Welcome to the ODK community!
As @LN mentioned, there isn’t a cap on the CSV file length. I wrote a blog post about how I implemented a similar drill-down list using the search() parameters on a list of ~24K records. It may be of use to you. I haven’t had as much luck with select_one_from_file.
For datasets of your size, I would recommend using the _key suffix on the CSV column you look up by, to speed up lookups, as discussed in the XLSForm documentation.
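If it helps, here is a rough sketch of how the _key rename could be scripted when preparing the CSV, rather than editing a ~180k-row file by hand. The column names (agent_id, label, region) and the helper name add_key_suffix are made up for illustration; your real registry will differ, and the form would need to reference the renamed column afterwards.

```python
import csv
import io


def add_key_suffix(src, dst, column):
    """Copy CSV rows from src to dst, renaming `column` to `column + "_key"`.

    `src` and `dst` are open text file objects. Only the header row changes;
    data rows are copied through untouched.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    if column not in header:
        raise ValueError(f"column {column!r} not found in header {header}")
    # Rename only the requested column, and leave it alone if it is
    # already suffixed.
    writer.writerow([
        name + "_key" if name == column and not name.endswith("_key") else name
        for name in header
    ])
    writer.writerows(reader)


# Hypothetical three-column sample standing in for the real registry.
sample = io.StringIO(
    "agent_id,label,region\n"
    "A001,Example Agent One,North\n"
    "A002,Example Agent Two,South\n"
)
out = io.StringIO()
add_key_suffix(sample, out, "agent_id")
print(out.getvalue())
```

For a real file you would pass file objects opened with open(path, newline="", encoding="utf-8") instead of the in-memory sample.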
Best of luck!