Datasets in Central are now Entity Lists: please help translate!

We have removed the word "Dataset" in Central and replaced it with "Entity List." This means much of the text related to Entities will need to be translated again and we would greatly appreciate your help doing this before the next release in two weeks.

When updating text for French, I copied the previous translation and then only had small edits to make.

I found it generally easier to express the concepts using "Entity List" rather than "Dataset" and I hope you will too! (@mathieubossaert and other French speakers, feel free to make edits or discuss in this thread)

You should not feel like you have to literally translate "Entity List". In particular, if your language does not have a convention of capitalizing words to make concepts stand out, please don't match capitalization. In French, the most comfortable translation I found is closer to "list of entities."

We decided to make this change after many demos and conversations about this functionality. We initially believed that the very generic word "Dataset" would be an asset, but the feedback we've received is that it was difficult to connect meaningfully to "Entities." We previously had a broader view of what "Datasets" might do in the ODK world, and we also thought we would differentiate more between Entities that are the subjects of forms (e.g. a tree you're collecting data about) and lists of values that act like metadata (e.g. a list of counties that trees could be located in). We now believe that focusing on lists of Entities, no matter how they're used, will make the concepts more approachable.

We will make a companion change to XLSForm to alias list_name in the entities tab to dataset. Forms that use dataset will continue working without any change.

These text updates are only end-user-facing. We will continue using dataset in the form specification, the Central API, and the internal Central implementation. We will update corresponding documentation to make it clear that end-user-facing systems in the ODK world use "Entity List" for this concept. Any other software that implements the specs could choose to continue using the more generic "dataset" or introduce other specialized language like "register", "task list", etc.

Thank you!


I do, @ln. As you said, it is easier: there is only one new concept (the "entity") to consider, individually or in a list (instead of a "dataset" that might be understood as another concept).
Maybe other French speakers (@thalie, @dickoah, @tgachet, @GuilhemD...) could express their points of view :wink:


Hi!
I also approve of the term "Liste d'entités". As @mathieubossaert said, "Dataset" ("Jeu de données" in French) can be confusing!
I haven't tested this concept yet because I would like to be able to update the entity list only via an import as an admin (not via forms), but I'm going to try it anyway :slightly_smiling_face:


If you haven't already, make sure to provide feedback in our poll about updating entity lists via import, @tgachet! [Official] Community POLL: Central Entity uploads from file

If you're feeling really impatient, I have bulk added entities with a Python script like this one using pyodk:

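import csv
import uuid
from pyodk.client import Client

# Connection settings come from your pyodk configuration
client = Client()

# utf-8-sig strips a byte order mark if the file starts with one
with open('entities.csv', encoding='utf-8-sig') as f:
    reader = csv.DictReader(f)
    for row in reader:
        first_name = row['First']
        last_name = row['Last']

        # Each entity gets a fresh UUID, a label, and its property values
        entity = {
            'uuid': str(uuid.uuid4()),
            'label': f'{first_name} {last_name}',
            'data': {'first_name': first_name, 'last_name': last_name},
        }
        print(entity)
        # Replace <projectid> with your project ID; "users" is the Entity List name
        r = client.post('projects/<projectid>/datasets/users/entities', json=entity)
        print(r.text)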

You could also make it dynamically use the column header names and update entities in a similar way using this endpoint. Note that the API will continue using dataset!
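
As a rough sketch of the "dynamic column headers" idea, something like the following could build each payload from whatever columns the CSV has. The helper name `row_to_entity` and the `label_columns` argument are my own inventions for illustration, and this assumes your headers already match the property names of your Entity List. It only builds the payloads; you would still send each one with `client.post` as in the script above.

```python
import csv
import io
import uuid


def row_to_entity(row, label_columns):
    """Build an entity payload from one CSV row (hypothetical helper).

    Every column header is used as a property name as-is, so the headers
    are assumed to match the Entity List's property names.
    """
    # Join the chosen columns to form the label, skipping empty values
    label = " ".join(row[c] for c in label_columns if row.get(c))
    return {
        "uuid": str(uuid.uuid4()),
        "label": label,
        "data": dict(row),
    }


# In-memory sample standing in for entities.csv
sample = io.StringIO("first_name,last_name\nAda,Lovelace\n")
entities = [
    row_to_entity(row, ["first_name", "last_name"])
    for row in csv.DictReader(sample)
]
print(entities[0])
```

Keeping the payload building separate from the request makes it easy to print and check a few entities before anything is sent to the server.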


Thanks @LN for all these resources! I'm going to test all of this :grinning:
