Hi All,
Apologies if this makes no coherent sense...
Whilst I understand the popular use case and desire for these features, I am concerned that the implementation could compromise the integrity of data downloaded from the server to a local workstation. I'll preface this by saying that some good guidelines on how to use the feature effectively could probably solve a lot of the issues, so I may be worrying about nothing.
Whilst we will no doubt implement something that only keeps the correct, most recent file on the server (i.e. by keeping the link to the most recent photo), people will get duplicate photos on their local hard drives if they're appending new files to an existing media folder. Most of these issues go away if each download goes to a fresh folder, but if photos from previous downloads have been used elsewhere (in Google Photos or similar) then we have a problem.
Imagine this scenario, where X6S9O9 is the hash/uuid attached to every version of every photo taken of me (in a single field on the XLSForm)
A photo is taken and the new feature labels it as follows
Cristy_Roberts_X6S9O9.jpg
Then there's an edit because my name was wrong
Kristy_Roberts_X6S9O9.jpg
Downloading the data gives me a local folder with the newer file
media/Kristy_Roberts_X6S9O9.jpg
But then I take a new photo because the first was out of focus, and also do another edit because the name is still wrong
Chrissy_Roberts_X6S9O9.jpg
and then download.
My new media folder now looks like this
media/Chrissy_Roberts_X6S9O9.jpg
media/Kristy_Roberts_X6S9O9.jpg
Then I share the data with my friend. They have no idea which file is the most recent unless they fish around in the EXIF data, so they may well use the older one, which is out of focus. Assuming that there are also some names that start with D, E, F, G, H, I and J in the data set, the two files won't sort together lexicographically, so the duplicate will likely be missed in any case.
media/Chrissy_Roberts_X6S9O9.jpg
media/Crusty_Robards_XK39DJ.jpg
media/Dev_Jamme_TW98OW.jpg
media/Dilys_Barnards_P0292K1.jpg
media/Edwina_Currey_PL1KD9.jpg
media/Eliot_Dillards_X6S8O9.jpg
media/Fredwina_Curtley_JSNAM1.jpg
media/Gustave_Rombards_B92KN1.jpg
media/Kirsty_Remmilard_L1K2MA.jpg
media/Kristoffer_Mumbarlard_X87J1B.jpg
media/Kristy_Roberts_X6S9O9.jpg
media/Krusty_Roberds_X6S9O9.jpg
media/Kyllian_Zemenides_XS92L1.jpg
Yes, I had a lot of fun making up those names, but did you spot the other duplicate that I snuck in there or were you too busy looking for Kristy and Chrissy?
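For anyone handed a folder like this, one rough way to spot the hidden duplicates is to group the files by their trailing hash. This is only a sketch in Python, not anything that exists in Central or Briefcase: it assumes the hash is always the last underscore-separated part of the filename, and the function name `group_by_hash` is made up for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_hash(paths):
    """Group file paths by the trailing hash in names like Name_Surname_HASH.jpg."""
    groups = defaultdict(list)
    for path in paths:
        stem = PurePosixPath(path).stem       # e.g. "Kristy_Roberts_X6S9O9"
        file_hash = stem.rsplit("_", 1)[-1]   # assumption: the hash is the last part
        groups[file_hash].append(path)
    # Keep only the hashes that appear on more than one file
    return {h: sorted(files) for h, files in groups.items() if len(files) > 1}


files = [
    "media/Chrissy_Roberts_X6S9O9.jpg",
    "media/Crusty_Robards_XK39DJ.jpg",
    "media/Eliot_Dillards_X6S8O9.jpg",
    "media/Kristy_Roberts_X6S9O9.jpg",
    "media/Krusty_Roberds_X6S9O9.jpg",
]

duplicates = group_by_hash(files)
for file_hash, names in duplicates.items():
    print(file_hash, names)
```

This finds every file that shares a hash, but it still can't say which one is the most recent, which is the real problem.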
So should we then put the hash first? It would then be easier to see multiple files for one person but hard to find the person because they'd no longer be in alphabetical order.
media/X6S9O9_Chrissy_Roberts.jpg
media/X6S9O9_Kristy_Roberts.jpg
media/X6S9O9_Krusty_Roberds.jpg
If I gave my friend the folder of photos and no data set, the only way they would know that Kristy is the same person as Chrissy is by looking at the hash (which is random, so hard to read), or by having insight into the data. With a more common name than mine, you'd hit the further problem of confusing these files with those of other people who have the same name but different hashes.
media/Chrissy_Roberts_2022-08-09_X6S9O9.jpg
media/Kristy_Roberts_2022-05-09_X6S9O9.jpg
media/Kristy_Roberts_2021-05-09_X2LJ1A.jpg
media/Kristy_Roberts_2020-01-14_PI14JX.jpg
media/Krusty_Roberds_2022-05-09_X6S9O9.jpg
If we used timestamps instead of hashes, it would be marginally less problematic with regard to which is the most recent file, but the current timestamp format is epoch time, which sorts nicely for the computer but is no use to anyone who wants this kind of simple-to-use feature. If we're going this way, then an ISO 8601 date (YYYY-MM-DD, which also sorts lexicographically but is still meaningful to humans) should be an option, as the whole point is to make it easier for the user to understand what the photo is by looking at the filename.
media/Chrissy_Roberts_2022-08-09_X6S9O9.jpg
media/Kristy_Roberts_2022-05-09_X6S9O9.jpg
media/Krusty_Roberds_2022-05-09_X6S9O9.jpg
This helps with dates, but not with the naming and sorting issues, unless the hash goes first, like this
media/X6S9O9_Chrissy_Roberts_2022-08-09.jpg
media/X6S9O9_Kristy_Roberts_2022-05-09.jpg
media/X6S9O9_Krusty_Roberds_2022-05-09.jpg
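To make the date idea concrete, here is a minimal Python sketch that turns an epoch timestamp into a YYYY-MM-DD date and builds a hash-first filename from it. It assumes the epoch value is in seconds and in UTC, and the function names and the exact filename layout are my own invention for illustration, not how the feature actually works.

```python
from datetime import datetime, timezone


def iso_date_from_epoch(epoch_seconds):
    """Convert an epoch timestamp (assumed: seconds, UTC) to YYYY-MM-DD."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d")


def hash_first_name(file_hash, first_name, surname, epoch_seconds, ext="jpg"):
    """Build a name like HASH_First_Surname_YYYY-MM-DD.jpg (layout is an assumption)."""
    date = iso_date_from_epoch(epoch_seconds)
    return f"{file_hash}_{first_name}_{surname}_{date}.{ext}"


# 1660046400 is midday UTC on 9 August 2022
newer = hash_first_name("X6S9O9", "Chrissy", "Roberts", 1660046400)
# 1652097600 is midday UTC on 9 May 2022
older = hash_first_name("X6S9O9", "Kristy", "Roberts", 1652097600)

print(sorted([older, newer]))
```

Note that with the name between the hash and the date, the files for one person group together, but they sort by name rather than by date within that group.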
What if we had two photos from different fields? UUID would be useful to link these, but hashes not so much. How do we make it easy to find all photos from one person? Add more fields?
For me, the only really useful way to use this feature is to organise by multiple fields.
Here, I have some photos that are organised by UK county, town, postcode district, surname, first name, date and hash. That's very useful for getting a nicely organised folder of photos, and I suspect it is the way many people will end up using this (i.e. in household surveys, wildlife monitoring, entity-based stuff etc). Having a hierarchical organisation for the name makes it highly searchable.
media/Sussex_Burgess_Hill_RH15_P0011_Wolfeschlegelsteinhausenbergerdorff_Hubert_2022-08-09_X6S9O9.jpg
You can also add in something to differentiate photos from multiple fields like
media/Sussex_Brighton_RH1_P0011_Smith_Harry_Face_2022-01-13_NHA9OK.jpg
media/Sussex_Brighton_RH1_P0011_Smith_Harry_Hand_2022-01-13_ASM8U2.jpg
media/Sussex_Brighton_RH1_P0011_Smith_Harry_House_2022-01-13_NKH4H1.jpg
but the more fields you add, the worse the problem of long file paths becomes. PC users doing this may have to bear in mind that Windows has a default limit of (I think) 260 characters for the full path to a file.
In the above example with the very long surname, we reach nearly 100 characters just for the file name, so burying the folder too deep in a file tree could cause problems. I expect this is not really an issue with Central, where downloads are in a level 1 subdirectory and power users will be able to set the working directory to escape the issue. Briefcase users may find this more problematic, as its folder structure is very nested.
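A quick way to check whether a given download location would hit the limit is sketched below in Python. It assumes the default Windows limit of 260 characters (MAX_PATH, which includes the terminating null, so 259 usable characters); the function name and the example download folders are hypothetical.

```python
WINDOWS_MAX_PATH = 260  # default limit; counts the terminating null character


def too_long_for_windows(base_dir, relative_path, limit=WINDOWS_MAX_PATH):
    """Return True if base_dir + relative_path would exceed the path limit."""
    full_path = base_dir.rstrip("\\/") + "\\" + relative_path.replace("/", "\\")
    return len(full_path) >= limit  # one character is reserved for the null


long_name = (
    "Sussex_Burgess_Hill_RH15_P0011_"
    "Wolfeschlegelsteinhausenbergerdorff_Hubert_2022-08-09_X6S9O9.jpg"
)
relative = "media/" + long_name

shallow = r"C:\Users\chrissy\Downloads\project"   # hypothetical folder
deep = "C:\\" + "\\".join(["nested_folder"] * 12)  # hypothetical deep tree

print(len(long_name))
print(too_long_for_windows(shallow, relative))
print(too_long_for_windows(deep, relative))
```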
Finally, there's an extension of the concept, where we could specify folders as well as filenames.
In my example I would want to specify the following file tree to organise my photos
media/county/town/postcode_district/household/
so my files would be like this
media/Somerset/Williton/TA4/P0099/Ollerton_Gustav_2022-09-30_LSA2KN.jpg
media/Sussex/Brighton/BN1/P0009/West_Kanye_2022-08-03_LKNM12.jpg
media/Sussex/Brighton/BN1/P0009/West_Lizzo_2022-08-03_67SAK2.jpg
media/Sussex/Brighton/BN2/P0015/Bojang_Muhammed_2021-02-12_MNM23S.jpg
media/Sussex/Brighton/BN3/P0027/West_Lizzo_2022-08-03_67SAK2.jpg
media/Sussex/Burgess_Hill/RH15/P0011/Iqbal_Saaida_2022-08-09_JNA872.jpg
media/Sussex/Burgess_Hill/RH15/P0011/Smith_Bridget_2022-08-09_KJL98S.jpg
media/Sussex/Burgess_Hill/RH15/P0011/Smith_Eliot_2022-08-09_J2JN42.jpg
media/Sussex/Burgess_Hill/RH15/P0011/Smith_John_2022-08-09_X6S9O9.jpg
media/Sussex/Burgess_Hill/RH12/P0012/Lee_Kim_2022-08-10_A8J79D.jpg
media/Sussex/Burgess_Hill/RH12/P0012/Lee_Sang_2022-08-10_LKJLAS.jpg
This is far more useful to me than including all the extra fields in the filename.
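As a rough illustration of what I mean, here is a Python sketch that builds such a path from a record, given one list of fields for the folders and another for the filename. The field names, the `media_path` function and the idea of passing two field lists are all assumptions of mine, not an existing setting.

```python
from pathlib import PurePosixPath


def safe(part):
    """Make a field value usable in a path: no spaces or path separators."""
    text = str(part).strip()
    for bad in (" ", "/", "\\"):
        text = text.replace(bad, "_")
    return text


def media_path(record, folder_fields, name_fields, ext="jpg"):
    """Build media/<folders...>/<name parts joined by _>.<ext> from a record."""
    folders = [safe(record[field]) for field in folder_fields]
    filename = "_".join(safe(record[field]) for field in name_fields) + "." + ext
    return str(PurePosixPath("media", *folders, filename))


record = {
    "county": "Sussex",
    "town": "Burgess Hill",
    "postcode_district": "RH15",
    "household": "P0011",
    "surname": "Smith",
    "first_name": "John",
    "date": "2022-08-09",
    "hash": "X6S9O9",
}

path = media_path(
    record,
    folder_fields=["county", "town", "postcode_district", "household"],
    name_fields=["surname", "first_name", "date", "hash"],
)
print(path)
```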
Admittedly, it means that the integrity of the system only holds until someone changes the folder names. But that's equally true of renaming files, which brings me full circle to my original concern about orphaned files.
I really hope that this makes sense and is useful to the team when thinking further about this feature.
Chrissy