Hélène Langet (@thalie)
Swiss Tropical & Public Health Institute
What contributions have you made to ODK?
My contributions have mainly consisted of providing feedback to the ODK team from a user's point of view (e.g., feedback on application use cases and beta features such as the built-in audio recording for ODK Collect or data management in ODK Central, and reporting and documenting bugs, especially when handling encrypted data), and more marginally of direct technical contributions to the ODK ecosystem (export of encrypted data in RuODK, supervision of an MSc student developing a generic R package to help users generate automated reports).
These contributions were based on my experience as data lead in the multi-country research evaluation conducted for the “Tools for Integrated Management of Childhood Illness” (TIMCI) project, in which all research data are collected with Enketo/ODK Collect, stored in ODK Central, and processed through RuODK and Rmarkdown to generate analyses as well as operational and quality reports.
I also dedicated a significant amount of my time in the project to coaching collaborators with various backgrounds on the use of ODK Collect/ODK Central and, for those collaborators with the most advanced IT skills, also on RuODK and the ODK apiary (Python/R). Finally, I have advised collaborators involved in other health data projects on whether ODK is the best data collection tool for their needs.
How do you believe your contributions have benefited ODK?
I think the TIMCI project is a very good showcase of the potential of ODK for complex data collection in challenging environments. To my knowledge, no other tool can offer the same range of services in such settings, and existing solutions dedicated to investigational product clinical trials would have lacked the flexibility necessary to collect all the data required to meet the project objectives.
Using the ODK ecosystem enables remarkable consistency across all research processes, triangulation of data between studies, and the management of fully reproducible research pipelines. Whenever possible, I have shared with my collaborators and the ODK community the know-how that can be transferred from my experience in this project, and especially concrete starting points for implementing its underlying concepts.
I also encouraged ODK users to use more advanced or new features (as many of them remain unknown to those who do not regularly read the forum and the technical documentation). I believe that this exchange has benefited the ODK community and can be further cascaded to new users. In addition, the wide range of data needs encountered in TIMCI (which spans from pragmatic clinical trials with scheduled follow-ups to qualitative interviews, by way of clinical observation, time-flow and health cost surveys) has allowed me to report practical challenges encountered by ODK users to the development team while, I hope, avoiding the pitfall of being locked into an overly specific use case.
What do you believe the top priorities for ODK should be?
First, it seems essential to me that the ODK solution continues to be as robust, as generic and as modular as possible while addressing the needs of a wide range of users. For me, these are key strengths of ODK and among the main reasons for its success. Adding longitudinal features will definitely open new opportunities for all users. I also really liked the idea suggested by Florian, if I am not mistaken, of a parent (meta-)form and child forms, which could pave the way for an even more modular approach to designing forms.
Second, it seems equally important to me to continue the excellent work initiated by the team to strengthen the consistency (e.g., Enketo/ODK Collect) and attractiveness of the ODK ecosystem. In this respect, ODK users would certainly benefit either from further development of generic data monitoring/quality services that help them quickly identify major flaws in the data collection, or from further development of interconnectivity with data analysis/visualization tools or pipelines that can provide these services without requiring advanced data skills (e.g., by exploiting what is available in the metadata and audit log, which are currently underexploited by most users).
Last, and probably more specific to data collection involving human subjects, offering more granularity for data access and protection would make it easier to work with these data without compromising on data protection (usually only a subset of a full dataset requires the highest level of protection, e.g., personally identifiable information or sensitive information such as HIV status, ethnicity, etc.).
How will you help ODK accomplish those priorities?
I have a strong innovation background (10 years of research with GE Healthcare and Philips) with core technical expertise in data analysis/visualization/interpretation, and a keen interest in population health, preferably in very remote settings, so I would enjoy any strategic/roadmap discussions for developing ODK further.
I can provide a systems-thinking approach, since I have hands-on experience with the whole data lifecycle, from design and collection to analysis, for generating public health evidence in the TIMCI project (as the quality of any data analysis depends on the quality of all the upstream components). This is obviously only one experience among others and must be weighed accordingly.
I also have good experience with data regulations and with the publication of scientific evidence at the technology/application interface to support and guide the development of new technologies. I am also very interested in contributing more actively to knowledge transfer and capacity building to increase the pool of ODK users and empower existing ODK users.
How many hours a week can you commit to participating on the TAB?
What other data collection projects, social impact projects, or open source projects are you involved with?
- Tools for Integrated Management of Childhood Illness (TIMCI): Tanzania and India (Uttar Pradesh)
- Kenya and Senegal (Myanmar discontinued due to the current political situation)
Please share any links to public resources (e.g., resume, blog, Github) that help support your application.
- https://github.com/SwissTPH/timci - The GitHub project that contains the code of the R package that is used by each research partner for all the data management activities (including de-identification, generation of follow-up logs, automated form publication, and the generation of operational and data summary reports). Note: the package is very project-specific and cannot be reused as such outside of the TIMCI project. It is also of uneven quality, as I am not supposed to be coding at all (...), but some of the ideas and know-how developed here (e.g., on how to parametrize and modularize RMarkdown documents, build fancy LaTeX tables, etc.) can definitely be reused in any new project.
- https://github.com/SwissTPH/repvisforODK - The GitHub project that contains the code of the R package that was developed by Lucas Silbernagel during his internship to generate a generic HTML data summary from any ODK Central project (Rmarkdown, plotly)