Community Wishlist Survey 2023/Archive/Find and delete similar photos in Commons

Find and delete similar photos in Commons

 Requires community consensus

  • Problem: There are millions of photos on Commons. On some subjects, there are hundreds of very similar photos. For example: Commons:Category:South facade of the Basilique du Sacré-Cœur de Montmartre. There is no added or encyclopedic value in having so many similar photos.
  • Proposed solution: Create a template and a system like suppression proposal (a deletion request), listing the photo to preserve and the photos to delete. Administrators, moderators or experienced users review the proposal and either delete the listed photos or keep them.
  • Who would benefit: All users
  • More comments: There is a template named superseded, but it is not enough to delete similar photos.
  • Phabricator tickets:
  • Proposer: Tangopaso (talk) 22:00, 23 January 2023 (UTC)

Discussion

  • This could be useful, but the category you linked actually shows a nice variety of photos I wouldn't consider to be duplicates. They are of the same subject matter (the Basilique du Sacré-Cœur de Montmartre), but also they're not: one shows its architecture in detail, there are three examples of pre-1910 postcards of the Basilica from different angles, one is a face-on shot, etc.
    But some do seem close enough that you could maybe propose them for deletion under a new criterion for deletion: that two photos are too similar and one is clearly the better option. I don't know if the problem is big enough to warrant an entirely new tool or discussion board, but I'd say a new criterion for deletion could work for this. Ineffablebookkeeper (talk) 23:11, 23 January 2023 (UTC)
  • Bad idea. We don't save any server space by "deleting" files, and many files on Commons are used externally by 3rd parties. Deleting such files would break the chain of attribution (e.g. in the case of CC-BY-SA) and cause potential copyright issues for external re-users who link back to Commons. -FASTILY 01:30, 24 January 2023 (UTC)
  • I agree with Ineffablebookkeeper: there should indeed be a possibility of deleting photos that are very similar but are not considered to be duplicates, and this might be implemented by new criteria for deletion. There are too many photos that would fit. The advantages are for end users: people who are looking for photos and are overwhelmed by many similar photographs. If they see only a good selection, they can find what they are looking for more easily and quickly. To meet the objections of User:Fastily, the files of the deleted photographs might get redirects (provided that the attribution and licence are the same). --JopkeB (talk) 04:42, 25 January 2023 (UTC)
    "the files of the deleted photographs might get redirects (provided that the attribution and licence are the same)". That's literally not how copyright/attribution works. -FASTILY 07:04, 25 January 2023 (UTC)
  • Strong oppose. WMF is doing just the opposite with picsome: making Commons into a general repository for stock photos. It is a terrible idea to force re-users to use an image that passes the aesthetic filter of an admin at Commons. If there are a million photos of the same object, then it is the duty of Commons users to help re-users identify the one that is the "one" for this specific re-user, by providing depicts-SDCs. --C.Suthorn (talk) 11:17, 25 January 2023 (UTC)
  • Strong oppose. Per above. Also, similarity of files may be subjective: one user can say that '2 files are near-duplicates' while another can say 'quite different'. Юрий Д.К. (talk) 19:13, 25 January 2023 (UTC)
  • We have a method: you can suggest them for deletion, but I don't know why it would matter. Commons shouldn't be the arbiter of what is the "best" version of a photo. Different projects may find different ones useful. Why is this at Meta anyway? Is there a proposal to delete a version at English because a version at the Polish one (for example) is "better"? -- Ricky81682 (talk) 19:29, 25 January 2023 (UTC)
    • Correction. I meant to ask how having this at Meta would solve the issue. A discussion (presumably) at Commons that says the version used at one wiki is to be deleted because it's bad compared to another one is not a productive discussion. -- Ricky81682 (talk) 23:39, 25 January 2023 (UTC)
  • @Tangopaso: You mention 'models' a couple of times here. Which models are you referring to? Could you add a link to more information, for people who aren't familiar with this system? Also, you mention "a system like suppression proposal"; is that a separate or previous wishlist proposal that you mean? Could you link to it? Thanks! SWilson (WMF) (talk) 04:19, 27 January 2023 (UTC)
    Probably "Template".
    But: at Commons, media files can be nominated as a "valued", "good" or "excellent" file, and categories come with a button to show only these v/g/e media files. So the proposal is actually fulfilled in a way. Plus, images are singled out in Wikidata as the representative image of a Wikidata item. C.Suthorn (talk) 17:31, 28 January 2023 (UTC)
    I'd agree with the strong oppose: For one thing, properly documenting a restoration means having the original, a PNG version (lossless, so it can be further edited), and a JPEG (as PNGs don't display well). How would we even begin to have something WMF creates deal with that unless it's just asking volunteers to? Adam Cuerden (talk) 01:27, 29 January 2023 (UTC)
  • This would require community consensus on Commons for the deletion of images of the type the proposer supposes are not wanted; no such consensus has been demonstrated. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:35, 1 February 2023 (UTC)
  • I am Tangopaso.
    Sorry, but I am French and my English is perhaps not fluent. I think I used the word model instead of template.
    5 or 6 times I proposed the suppression (deletion) of an image with the topic very similar to this one, or with the template superseded. All of them were rejected.
    I will respect the consensus, of course. But in some years, there will be thousands of photos of the same subject. It removes the encyclopedic value of Wikipedia. To keep an encyclopedia up to date, an adding process is necessary, but in my opinion a cleaning process is also necessary. --Tangopaso (talk) 22:16, 2 February 2023 (UTC)
    The different language versions of Wikipedia are encyclopedias. Commons is not an encyclopedia. It is a media repository. There will not be thousands of images of the same subject. There are thousands of images of the same subject. That is intentional. And if you work on a language version of Wikipedia, you do select one of these thousands of images for the article in that language's Wikipedia. But there are millions of other uses for these thousands of images. There is Wiktionary and Wikivoyage. There is the Star Trek Wiki, which is not even part of the WMF sites. There are millions of other websites, newspapers, books and so on. It is not your choice, nor anybody else's, to decide who should be allowed to use an image. There are ways to promote images: good, useful, excellent images, the superseded template. But Wiki does not censor. C.Suthorn (talk) 05:37, 3 February 2023 (UTC)
  • @Tangopaso: Looking at the two aspects of this proposal, there is no community consensus for making it easier to delete similar files from Commons. The other aspect, building a system to find the similar files, does I think sound interesting, but there is no actual use case that's supported by the community. If the Commons community were already engaged in deleting similar files, and wanted a quicker way of identifying them, then I think this proposal would be great. But without that work being active (and in fact being mostly discouraged), there's no point in building the tools to help it. SWilson (WMF) (talk) 02:56, 6 February 2023 (UTC)