Commons:Village pump
This page is used for discussions of the operations, technical issues, and policies of Wikimedia Commons. Recent sections with no replies for 7 days and sections tagged with {{Section resolved|1=--~~~~}} may be archived; for old discussions, see the archives; the latest archive is Commons:Village pump/Archive/2024/07. Please note:
Purposes which do not meet the scope of this page:
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 1 day and sections whose most recent comment is older than 7 days.
May 31
I'm unable to use the image I just uploaded.
Hi, I don't seem to be able to use the file https://commons.wikimedia.org/wiki/File:M_F_Gervais_Holy_Roman_Empire.pdf. It shows up in Commons, but in Wikipedia I'm not able to use it. Why? It happened for my last file and someone 'did' something... I don't know what was done but it worked. What should I do to fix it? — Preceding unsigned comment added by M F Gervais (talk • contribs) 18:45, 31 May 2024 (UTC)
- @M F Gervais: It is there and it is functional; however, because it is so big and unwieldy as a PDF, it takes a while to render, especially when it has to build the image cache first.
- Now, because PDFs are typically multipage documents, they can need extra formatting if you are trying to display them through standard wiki formatting; see mw:help:images. PDFs should not be used if you want to display an image; please upload an image file per Com:File types — Preceding unsigned comment added by Billinghurst (talk • contribs) 07:59, 1 June 2024 (UTC)
June 11
New designs for logo detection tool
Hello all! We're happy to share that we will work on logo detection in the following months and that we defined an initial approach for this.
You can read more at the project page and you can have your say in the project's talk.
We want your feedback on it, and we need your insights on how to further tune the detection tool.
Thanks for your attention! Sannita (WMF) (talk) 13:54, 11 June 2024 (UTC)
- I'm rather confused. The general feedback seemed to me to amount to "logo detection isn't very useful." I was told by a couple of people when I asked informally, "Don't worry, it isn't like logo detection is the goal, this was just a side effect of work on something else that someone thought might be useful." And now you say that further work is proceeding on this front? What, exactly, put this on the front burner, especially given that we are constantly being reminded that dev has very limited resources for Commons? What is the problem we are trying to solve? - |Jmabel ! talk 22:25, 11 June 2024 (UTC)
- @Jmabel Our impression, to be fair, was quite the opposite: that it was something that could be useful in dealing with the third-most frequent rationale for requests for deletions (the first two being copyvios and FoP, which we found it was impossible to tackle in an automated way). There was more difficulty in defining how this could be implemented, but not on its usefulness. This is why we are re-opening the feedback period, to understand how it could be implemented. Sannita (WMF) (talk) 10:36, 13 June 2024 (UTC)
- @Sannita (WMF) "third-most frequent rationale for requests for deletions (the first two being copyvios" - This doesn't make sense at all. The only reason we would delete a logo is because it's a copyvio, not because it's a logo. There are scores of logos which are in the public domain, either by age or by lack of creativity, while others get licensed under free licenses. I'm not sure why we should discourage people from uploading that specific content with such a warning, when those exact same rules apply to everything else. As it is, I tend to not support that implementation. And as JMabel mentioned, it's disheartening to see that resources were wasted developing such an apparently useless tool, when there are clearly established priorities (see the old wish lists, for instance). Darwin Ahoy! 16:16, 13 June 2024 (UTC)
- @Sannita (WMF), Jmabel, and DarwIn: I'll leave others to decide on the best or most suited UI for the logo detection. As for the feature, I am supportive of this, but conditionally. I suggest this feature should be mandatory for users who do not have the appropriate user rights, that is, users who are not admins/sysops, license reviewers, and/or autopatrolled. Users who are in these three user groups are free to upload logos and should not be slapped with this filter, since they are already aware of copyright issues and TOO considerations for logos. If possible, the feature should effectively block uses of "FileExporter" and other cross-wiki file transfer tools. And one more thing, I suggest the filter can prohibit new users (those who are not autoconfirmed) from uploading or importing logos (even photos showing logos that are non-de minimis/non-incidental). Hopefully, this will trim down at least a third or less (my guess) of deletion requests that contribute to the perennial backlogs. There are many more areas in Commons that also need attention and resolution, like Commons:Categories for discussion/Older (some open discussions were from before the lockdown era of 2020). JWilz12345 (Talk|Contrib's.) 08:30, 14 June 2024 (UTC)
- @JWilz12345: I think the plan is for this to become a secret feature. It has no effect on the upload itself and nobody but the uploader will know about the warning. Possibly, the same effect could have been achieved by merely editing the current interface and noting "if it's a logo, follow logo guidelines". Enhancing999 (talk) 08:43, 14 June 2024 (UTC)
- Just my opinion, but having a specific warning to the uploader saying the image might be a logo seems rather pointless, if not borderline condescending towards users. People generally know what they are uploading images of. The less clear thing is what license to use in any specific instance, and I don't really see how this deals with that. A better thing would probably just be a specific checkbox for logos that automatically adds a license and puts the image in a specific category for images that need reviewing on upload. Otherwise people are just going to ignore the warning, just like they are already ignoring guidelines by uploading the image to begin with. What we really need is better ways to review and deal with problematic images on our end though. Not try to unload that on uploaders by overcomplicating the UploadWizard with a bunch of warnings, extra boxes, and the like. --Adamant1 (talk) 20:52, 15 June 2024 (UTC)
- @Adamant1: anything related to copyright is already complicated enough. That's perhaps a price to pay for establishing/creating a free media repository site like Commons, or more so, Wikipedia itself way back more than 20 years ago. Something that founders Wales and Sanger likely did not foresee or anticipate. (Note: just a part of my thoughts, and not representative of my general perspective on the Wikimedia movement, which I still support in the context of mandating global FoP). JWilz12345 (Talk|Contrib's.) 21:12, 15 June 2024 (UTC)
- Thanks everyone for your comments! @Adamant1 about the checkbox, we thought of that option too, but ultimately decided against it, because we didn't want to clutter the UploadWizard too much and make it more complicated for legitimate uploaders to upload a legitimate logo, or fall into the "I'll just ignore that" kind of case. Anyway, our goal is to get to a better and more seamless way of uploading media, but this will take more designing, prototyping, and testing, so it won't happen overnight.
- To everyone, we're open to ideas for eventual moderation of logos in general, given that we don't want to unload a new bunch of work on volunteers without there being consensus. Sannita (WMF) (talk) 14:08, 20 June 2024 (UTC)
- I just want to provide some context on @Sannita (WMF)'s post above ... what we're working towards here is an automatic process by which we reliably estimate the likelihood that an uploaded image will be deleted for any reason
- If we had that process we'd be able to inform users that their upload is likely to be deleted (and why) during the upload process, which would be a better (and more educational) user experience than we have now. Also moderators would be able to find (and deal with) potentially problematic uploads much more easily
- Our initial experiments with machine learning showed we can detect logos reliably, and they're a pretty common reason for DRs, so logo detection seemed like a promising place to start CParle (WMF) (talk) 14:36, 20 June 2024 (UTC)
- There may be a misunderstanding here: being a logo is not a reason to delete. We have tens of thousands of logos legitimately on Commons. Laying aside logos that are PD because they are very old, or created by certain governments that don't claim copyrights, etc., an enormous number of logos are below the threshold of originality for copyright, especially in countries like the U.S. where that threshold is quite high. False positives -- discouraging or (worse) preventing upload of content that could legitimately be hosted on Commons -- are at least as bad as, and arguably worse than, false negatives, i.e. letting a "bad" file through. We can always delete a bad file; we cannot conjure a file we don't get to see. - Jmabel ! talk 19:36, 20 June 2024 (UTC)
- > being a logo is not a reason to delete
- Absolutely, but being a logo is a signal that the upload is more likely to get deleted. We're not proposing to prevent logo uploads, just to alert the user if what they've uploaded looks like a logo, and attempt to educate them about the copyright implications (and also flag possible logos so that patrollers can check them) CParle (WMF) (talk) 10:56, 21 June 2024 (UTC)
- I'm not sure logos are actually among the things where the highest percentage get deleted. But maybe they are. Do we have any available statistics on this? - Jmabel ! talk 19:24, 21 June 2024 (UTC)
- @Jmabel Sure, the most recent statistics we have are available at phab:T340546. Sannita (WMF) (talk) 16:15, 24 June 2024 (UTC)
- @Sannita (WMF): I may be missing something, but I don't readily see anything there that even suggests what percentage of logos are deleted, compared to what percentage of uploads in general. Is it there and I'm missing it, or is it just not there? - Jmabel ! talk 18:22, 24 June 2024 (UTC)
- @Jmabel In this comment there is a direct quote of the last part of the analysis that breaks down reasons for deletion. Sannita (WMF) (talk) 09:14, 25 June 2024 (UTC)
- @Sannita (WMF): yes, I saw that. It says, in effect, "lots of logos are deleted" but with no indication of how many are kept, and how that ratio compares to other categories of files. - Jmabel ! talk 21:42, 26 June 2024 (UTC)
- To use an old joke, I'm pretty sure that roughly 90% of bad uploads are by right-handed people... - Jmabel ! talk 21:43, 26 June 2024 (UTC)
- I see, this could be in fact an improvement in data collecting, that I will be sure to share with the team. Sannita (WMF) (talk) 13:34, 27 June 2024 (UTC)
@Jmabel, DarwIn, JWilz12345, Adamant1, Enhancing999, Alachuckthebuck, and SCP-2000: First of all, thank you for your comments, and apologies for pinging you directly. Considering the possibility of moving forward on this topic, we have a couple of questions to ask you about it:
- Do you think the user interface notice in the UploadWizard about detecting a potential logo should be limited only to certain classes of users (i.e. explicitly exclude autoconfirmed users and higher)?
- Do you think a template should be created and added automatically, for ease of moderation, when a file flagged by the logo detection system is uploaded to Commons with the "own work" option selected?
Let me know what you think about it, and thanks in advance for your time and patience! Sannita (WMF) (talk) 10:04, 2 July 2024 (UTC)
- @Sannita (WMF): Hi, just a comment regarding data from phab:T340546 (P49530 Commons deleted pages most frequent edit messages, only the 20 most frequent reasons). I summed up the deletions by reason, and I get: 94,459 for copyright violations (lines 3+10+12+13+14+16+17), 76,385 for "Personal photo by non-contributors (F10)" (lines 2+5+9), and only 6,693 for logos. So logos are far down among the reasons for deletion. This also reflects my personal experience as an admin. "Per COM:SPEEDY" usually means empty categories. Yann (talk) 10:41, 2 July 2024 (UTC)
- @Yann Thanks for your consideration. Logos might be down among deletions, but they can also be a stepping stone towards finding a solution to other, more frequent reasons for deletion. For now we decided to focus on a limited but easily detectable part of the deletion process, and we got a tool that can provide a more than decent result at that. With the experience built on this section, we can try to tackle other reasons for deletion. Sannita (WMF) (talk) 16:48, 2 July 2024 (UTC)
- @Sannita (WMF): IMO there is an even easier target for detecting copyright violations: all files with an external link as source (or anything like Google, Facebook, etc.). They should be included in a category for checking if the uploader is not an experienced user. Yann (talk) 17:05, 2 July 2024 (UTC)
- What we would need is some kind of AbuseFilter that adds categories or templates after the upload process. One of the many parameters the filter can use could then be the result of the logo detection tool. Many parts for this already exist; only the ability of the AbuseFilter to make edits would be needed, and the new logo detection tool would need to hand over the information to the AbuseFilter. GPSLeo (talk) 18:12, 2 July 2024 (UTC)
- We hadn't thought of this @Yann, it's a really good idea. Added phab:T369273 CParle (WMF) (talk) 10:42, 4 July 2024 (UTC)
- Not sure who the "we" is, but at Commons talk:WMF support for Commons/Upload Wizard Improvements/Logo detection I proposed 11 April of this year that what we want here is tagging, especially for logos claimed as "own work". - Jmabel ! talk 19:03, 4 July 2024 (UTC)
- @Yann Hey, just to follow up on your comment about external links: we got some statistics on phab:T369273#9965672 that show that a high percentage (90%+) of media files showing an external link get deleted. They account for ~7k deletions in 2023, compared to ~8.5k for logos. Sannita (WMF) (talk) 13:33, 11 July 2024 (UTC)
June 28
New York Public Library
Does anyone know what accession number we should be using for items from the Collections of the New York Public Library? I'm using this image as a template example. -Broichmore (talk) 11:06, 28 June 2024 (UTC)
- FWIW, just for one example, https://digitalcollections.nypl.org/items/225503f5-ae89-40e1-91e3-2f8cf6cc8297 gives three identifiers: RLIN/OCLC, NYPL catalog ID (B-number); and a UUID.
- Our own File:Great Falls of the Potomac, from Robert N. Dennis collection of stereoscopic views.jpg gives three identifiers, but they don't seem to correspond to these in an obvious way: Catalog Call Number (which is clearly not a "B-number"), Record ID, and Digital ID.
- https://archives.nypl.org/mus/24078#c1553136 gives IDs like "b. 164 f. 25" and "b. 110 f. 9-10"; it would not surprise me, though, if those were strictly IDs within the Lou Reed papers, and not in the NYPL collection as a whole, especially given the low numbers.
- Looking at this, I'd be surprised if they have a single system that spans their entire archive.
- @Broichmore: Have you considered contacting NYPL to see if they have a suggestion as to what we might use? - Jmabel ! talk 20:28, 28 June 2024 (UTC)
- No, I haven't contacted them, as yet.
- I suspect, that like the Library of Congress (LOC), this was one of the first GLAMs uploaded en-masse, consequently, it varies in the templates employed. So far, I've discovered that the accession number might be the NYPL catalog ID (B-number), think that's their shelf number.
- There is a source template, I stumbled on, based on the image ID. Also, a default sort based on that NYPL catalog ID (B-number) again. See image for an example. - Broichmore (talk) 21:18, 28 June 2024 (UTC)
- @Broichmore: I work at NYPL and am happy to help make sense of the identifiers as much as I can, although as Jmabel suspected there is a lot of variation in cataloging practices and identifiers between libraries, curatorial divisions, and systems.
- Does Commons have a specification for "File information" metadata where "Accession number" is described? I was unable to find any documentation, but my initial sense is the call number (displayed as "Shelf locator" on NYPL Digital Collections records and "Research Call Number" in NYPL Research Catalog records) is the most akin to an "accession number," which is a term more common to museums than libraries/archives.
- Some of the other identifiers mentioned by Jmabel are external to NYPL (RLIN/OCLC), system-specific (NYPL catalog ID (B-number)), or are not identifiers (the "b. 164 f. 25" and "b. 110 f. 9-10" displayed at archives.nypl.org is a box number and folder number that describes the location of a folder within a larger archival collection).
- UUIDs and Image IDs are automatically assigned to any individual file available on NYPL Digital Collections and are (for now) the most reliable way to link back to the source of a file.
- NYPL staff are not doing much work in Commons yet, and much of what was uploaded was done years ago/by the community, but we do have a WikiProject for Wikidata that may also be a useful reference. --Infopetal (talk) 20:45, 5 July 2024 (UTC)
- The official template is Template:Nypl, accessible from the front page at Category:New York Public Library.
- It's not really suitable for casual, occasional use, which is what I need.
- Using this image as a template example.
- For accession number I'm now using the UUID number.
- I have used a template for source/photographer employing the IMAGEID number. Clicking on it takes you only one step away from the PERMALINK url. So, I've added the PERMALINK url. It's not exactly ideal, but will do me for now.
- Someone in the past has added a defaultsort using the b number with NYPL as a prefix, and the plate number or sequential number in a set as a suffix.
- Until better comes along I'm inclined to use this as a temporary fix. -Broichmore (talk) 10:34, 6 July 2024 (UTC)
- Where is the "Accession number" field specified for Wikimedia Commons? I would not consider a UUID to be an accession number, but I don't know the context within Wikimedia Commons. --Infopetal (talk) 15:13, 8 July 2024 (UTC)
- It's supposed to be the most unique number (the permanent specific number) for the item. Broichmore (talk) 19:43, 8 July 2024 (UTC)
- For the physical item or the digital item? UUIDs only correspond to digital objects (which can represent a collection, container, item, or image) and are generated by the software used for NYPL Digital Collections. Based on the template you linked, my sense is the `ID_uuid` parameter should be used for an item UUID. The call number (displayed as "Shelf locator" on NYPL Digital Collections records and "Research Call Number" in NYPL Research Catalog record) would be a more permanent, system-agnostic identifier for a physical object akin to the library/museum/archive definition of an accession number, but there may be multiple digital objects (each with their own UUID) that correspond to a call number. --Infopetal (talk) 20:15, 9 July 2024 (UTC)
Finding my own "published" content
Is there any way to do a sane query for files that I've uploaded (and/or for files where my account name occurs as part of the wikisource text on the file page) and where the {{Published}} template is on the talk page? - Jmabel ! talk 20:17, 28 June 2024 (UTC)
- Still hoping for some ideas here. - Jmabel ! talk 19:19, 2 July 2024 (UTC)
- PETScan would be the obvious idea, and while it can find files whose talk pages contain certain templates, it does not seem to be able to find files uploaded by certain users. Maybe this is the kind of application that a hidden user category of your photos would be useful for? Felix QW (talk) 15:25, 5 July 2024 (UTC)
- @Felix QW: I want to do this precisely so that I can mark these published files with a hidden category.
- No, I do not particularly want to place all of my 70,000 or so photos here on Commons in a single hidden category. I can't even think of a way that would be done without the process of doing so becoming a problem. - Jmabel ! talk 16:08, 5 July 2024 (UTC)
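- One possible approach, sketched here under several assumptions rather than offered as a tested tool: query the MediaWiki Action API directly, listing an account's uploads with `list=allimages` (`aiuser` together with `aisort=timestamp`) and the File talk pages that transclude the template with `list=embeddedin` (`einamespace=7`), then intersect the two sets. The helper names and the user agent string are hypothetical, the sketch does not cover files where the account name only appears in the file page text, and an account with tens of thousands of uploads will need many continuation requests.

```python
import json
import urllib.parse
import urllib.request

API = "https://commons.wikimedia.org/w/api.php"


def api_rows(params, list_name):
    """Yield rows of a list query, following the API's continuation tokens."""
    params = dict(params, action="query", format="json", list=list_name)
    while True:
        url = API + "?" + urllib.parse.urlencode(params)
        # Hypothetical user agent; replace with one identifying your tool.
        req = urllib.request.Request(url, headers={"User-Agent": "published-finder-sketch/0.1"})
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        yield from data.get("query", {}).get(list_name, [])
        if "continue" not in data:
            break
        params.update(data["continue"])


def uploads_by(user):
    """Titles ("File:...") of files uploaded by the given account."""
    rows = api_rows({"aiuser": user, "aisort": "timestamp", "ailimit": "max"}, "allimages")
    return {row["title"] for row in rows}


def talk_pages_with(template):
    """Titles ("File talk:...") of File talk pages transcluding the template."""
    rows = api_rows({"eititle": template, "einamespace": "7", "eilimit": "max"}, "embeddedin")
    return {row["title"] for row in rows}


def published_uploads(upload_titles, talk_titles):
    """Pure set logic: uploads whose File talk page appears in talk_titles."""
    prefix = "File talk:"
    subjects = {"File:" + t[len(prefix):] for t in talk_titles if t.startswith(prefix)}
    return sorted(set(upload_titles) & subjects)


# Usage (performs network requests, so it is left commented out):
# mine = uploads_by("ExampleUser")              # hypothetical account name
# talks = talk_pages_with("Template:Published")
# for title in published_uploads(mine, talks):
#     print(title)
```

The resulting list of titles could then be fed to whatever batch tool is used to add the hidden category, so that only the published files are touched rather than all uploads.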
June 29
Technical needs survey proposals
As somebody asked here, it would be good to have some news about the plans for implementing the proposals selected in the technical needs survey months ago. Seemingly, nothing new has been made public since the survey finished. MGeog2022 (talk) 21:03, 29 June 2024 (UTC)
- Interesting. In the meantime, the annual plan for 2024/2025 was discussed and I don't think any of this was brought up. Enhancing999 (talk) 22:59, 29 June 2024 (UTC)
- This would be really disappointing, as some topics were already discussed for years --PantheraLeo1359531 😺 (talk) 11:10, 30 June 2024 (UTC)
- This isn't good news. I hope they are eventually implemented, and the survey ends up being worthwhile. MGeog2022 (talk) 14:16, 3 July 2024 (UTC)
- I try to annoy/bring up the topic until it's added :) --PantheraLeo1359531 😺 (talk) 17:57, 4 July 2024 (UTC)
- Not sure if that would work well. For the next annual period, you could try to prepare something that can fit into the plan (with the level of generality required by the plan up to the level of detail that can bring actual change to the Commons community). Rather than requesting specific tools (which is a technical implementation question for devs), I'd attempt to phrase it in terms of functionality that should be available (users don't really care how it's done), highlighting how that fits into the general objectives.
- Once that is prepared (ideally before the end of 2024), you could send it to the WMF board (a Commons contributor sits there) so they can request the staff at WMF to include it in the plan. Enhancing999 (talk) 07:53, 5 July 2024 (UTC)
- Maybe it would help if phabricator issues were created for the proposals.
- Has the survey been brought up somewhere and/or addressed by WMF people by now? Prototyperspective (talk) 16:32, 7 July 2024 (UTC)
- we should watch out for Commons:Village pump#The Community Wishlist is reopening July 15, 2024 and m:Community_Wishlist_Survey/Future_Of_The_Wishlist/Preview_of_the_New_Wishlist#July_1,_2024:_The_Community_Wishlist_is_re-opening_Jul_15,_2024._Here's_what_to_expect,_and_how_to_prepare. RZuo (talk) 21:33, 7 July 2024 (UTC)
- Didn't we already have that with the result mentioned above? Maybe time to attempt a different approach. Enhancing999 (talk) 21:46, 7 July 2024 (UTC)
July 02
Categories by day: date format?
Dates that clarify or disambiguate categories, such as categories for individual sports matches, should they be written as 6 October 2021 or 2021-10-06?
I would assume that the all-numeric format is the better option, since it automatically sorts similarly named categories by date, and since categories for specific days use that format (e.g. Category:2021-10-06). But both formats are in use and I haven't been able to find any instructions as to which one should be used in these cases.
Sinigh (talk) 11:38, 2 July 2024 (UTC)
- In the HTML script stick with 2021-10-06 format. Broichmore (talk) 18:24, 2 July 2024 (UTC)
- Yes, definitely! But what about the name of the category? Sinigh (talk) 18:33, 2 July 2024 (UTC)
- If it's down to an individual day, I'd strongly favor ISO notation (2021-10-06 in this case). If it was just the month, then June 2021. - Jmabel ! talk 19:12, 2 July 2024 (UTC)
- Agree. I think both of those choices make the most sense for category trees. Sinigh (talk) 20:28, 2 July 2024 (UTC)
- I've been using the format '2023-01 text' for categories where only the month matters (if not only the year). I think that is the most reasonable standard so that the cats are sorted chronologically on category pages. Prototyperspective (talk) 21:48, 2 July 2024 (UTC)
- You can also solve that with a DEFAULTSORT. - Jmabel ! talk 22:45, 2 July 2024 (UTC)
- I know, but they are (more) unreliable, and why not simply use this format? Putting the verbatim month into the title isn't useful and just causes problems, because people don't add a sortkey and the title gets longer. Prototyperspective (talk) 11:40, 4 July 2024 (UTC)
- Yes, there are defaultsorting navigational templates for month categories. Sinigh (talk) 13:19, 3 July 2024 (UTC)
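- To illustrate the sorting argument made above, here is a minimal sketch. It assumes that category listings order titles by plain string comparison when no sort key is set, which is a simplification of the actual collation; the dates and helper names are chosen only for illustration.

```python
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def iso_label(d: date) -> str:
    """All-numeric label, e.g. 2021-10-06."""
    return d.isoformat()


def long_label(d: date) -> str:
    """Day-month-year label, e.g. 6 October 2021."""
    return f"{d.day} {MONTHS[d.month - 1]} {d.year}"


def sorts_chronologically(days, label) -> bool:
    """True if sorting by the label string gives the same order as sorting by date."""
    return sorted(days, key=label) == sorted(days)


DAYS = [date(2021, 10, 6), date(2013, 4, 16), date(2021, 6, 2), date(2020, 12, 25)]

print(sorts_chronologically(DAYS, iso_label))   # True
print(sorts_chronologically(DAYS, long_label))  # False
```

With the long format, "2 June 2021" sorts before "25 December 2020", which is why those titles need an explicit sort key while the all-numeric ones do not.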
- "both formats are in use"
- where is "6 October 2021" systematically in use?
- in any case, prefer yyyy-mm-dd over other formats. RZuo (talk) 16:33, 4 July 2024 (UTC)
- Good question, I should have provided an example in my first post. I'm referring to categories like those you'll find in this one: Category:Women's association football matches in Sweden. In this particular case, there seems to be a preference for a comma followed by "D Month YYYY", but dates in brackets are not uncommon elsewhere. (This category also showcases the typically arbitrary mix of "vs", "v", and hyphens both with and without spaces, and not a single unspaced en dash to be seen.) Sinigh (talk) 18:39, 4 July 2024 (UTC)
- Tyresö FF-Umeå IK, 16 April 2013 should be the agreed format IMO. This is the date format used by Wikipedia too, in the main. The numerical date in reverse can be hidden from view to force sorting. Broichmore (talk) 09:35, 7 July 2024 (UTC)
- If so, please note that the first punctuation mark should be an en dash: Tyresö FF–Umeå IK. As for the date format, that soccer game is actually a very good example for this thread. The category for the date itself is numeric: Category:2013-04-16, which contains two sports games with Day-Month name-Year dates, but also 60+ categories, in and including Category:Photographs taken on 2013-04-16, that instead use the all-numeric format. At the same time, there is indeed a general preference for the Day-Mn-Year format across Wikimedia. So you understand why I saw the need to ask my initial question.
- Maybe it's intentional that date formats vary depending on the subject of the category? Sinigh (talk) 13:31, 7 July 2024 (UTC)
- sports matches is a topic that needs a standard the most, but it seems commons never had a standard?
- i just made some considerations:
- there're two kinds of matches: two sided matches and "first past the post" matches (like track and field, borrowing an election jargon).
- i think for two sided matches it's easy. the category title should include both teams' names and the date, that's all.
- in the rare case when on the same day there are more than one match between teams with the same names (e.g. both a football match and a handball match between both france and japan), then we can add the actual sport as the disambiguation, e.g. France vs Japan (2022-11-11, U17 women's football), France vs Japan (2022-11-11, women's football). in the rare case when two sides face each other more than once on the same day, titles can be John Doe vs John Smith (2022-11-11, 1st), John Doe vs John Smith (2022-11-11, 2nd)
- vs, v., -, or something else, which symbol to use, is up to community to decide.
- for matches that're not between exactly two sides, i guess titles can just be the name of the event, from big to small, e.g. 2024 summer olympics - women's 100m freestyle - qualifier - group b (2024-11-11).
- absolutely no reason not to use yyyy-mm-dd. other formats are messy for everything -- appearance, sorting, intelligibility for non-english speakers, problem of preference between en-gb and en-us...
- RZuo (talk) 21:26, 7 July 2024 (UTC)
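The sorting argument made above for yyyy-mm-dd can be illustrated with a small sketch. It assumes that category listings are ordered roughly like a plain string sort when no sortkey is set; the real MediaWiki collation may differ in details, and the dates and the helper `long_form` are made up for illustration.

```python
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

def long_form(d: date) -> str:
    """Render a date as 'D Month YYYY', e.g. '6 October 2021'."""
    return f"{d.day} {MONTHS[d.month - 1]} {d.year}"

# Illustrative dates only; two of them appear in the thread above.
dates = [date(2021, 10, 6), date(2013, 4, 16), date(2024, 7, 2), date(2020, 12, 25)]
chronological = sorted(dates)

# Plain string sort, standing in for a category listing without sortkeys.
iso_sorted = sorted(d.isoformat() for d in dates)
long_sorted = sorted(long_form(d) for d in dates)

print(iso_sorted)   # ISO strings come out in chronological order
print(long_sorted)  # 'D Month YYYY' strings do not
```

With the long format, chronological order has to be restored per category through a sortkey or DEFAULTSORT, which is the extra step discussed above.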
Script for deletion sorting?
Is there a script/tool I can use to assist deletion sorting? I regularly come across DRs that should be categorized, or which were never updated with the result. There are a few such scripts on enwp, but deletion sorting works differently there. — Rhododendrites talk | 14:17, 2 July 2024 (UTC)
Apparently not, so now there's User:Rhododendrites/drsort.js. Still a work in progress. Please leave feedback/advice on the talk page. — Rhododendrites talk | 13:26, 6 July 2024 (UTC)
July 03
We need someone to maintain CropTool
It would appear that CropTool is once again broken to the point of unusability. The person who got it working again in February 2024 did not take on responsibility to maintain it over time, which is of course their prerogative. We really need someone to maintain this quite valuable tool. - Jmabel ! talk 17:58, 3 July 2024 (UTC)
- I hope this comes across without any challenging or derisive tone, Jmabel, but I have no idea what you're talking about. I've used CropTool hundreds of times and very rarely had any problems. I've used it in the past 48 hours. What kinds of issues are you experiencing? Is this a browser thing? —Justin (koavf)❤T☮C☺M☯ 18:10, 3 July 2024 (UTC)
- @Koavf: A variety of issues, ranging from not getting it to load the image at all to going all through the process and not having it saved. I literally cannot remember the last time it worked for a rotate-and-crop (not counting a multiple of 90°), but it's been months. After half a dozen times in a row that it didn't work for me a couple of weeks ago, I've just switched to download, use GIMP, upload. We keep getting reports here and the VP (etc.) that it isn't working, and you are actually the first I've heard from in a month saying that is not a 100% experience; I'm actually a bit surprised to hear it is sometimes working. - Jmabel ! talk 20:12, 3 July 2024 (UTC)
- I don't use it very often, but I've used CropTool a few hundred times in the last year and never noticed any issue. However, most times I don't rotate the image. Pere prlpz (talk) 20:34, 3 July 2024 (UTC)
- For rotations, I just use a little tool to request User:Rotatebot make the change. —Justin (koavf)❤T☮C☺M☯ 21:13, 3 July 2024 (UTC)
- Unless I'm mistaken, Rotatebot will only do multiples of 90°; it will not do (for example) a 1.27° rotation and corresponding crop, nor do I readily see how anyone could know that is exactly what they want without a visual tool. - Jmabel ! talk 22:25, 3 July 2024 (UTC)
- It can do rotations down to 1 degree. From the tool I have:
- If you request a rotation by 90, 180 or 270° Rotatebot will do this in a few hours. If you request a rotation by any other angle it will probably take longer.
- So as long as you can translate whatever you want into 360 degrees, then you should be good. (I should also point out that I've never asked for a rotation other than 90/180/270.) Again, I don't want to downplay if you or anyone else is having problems, but between CropTool cropping and the RotateLink gadget, I don't see the immediate issues. Agreed that long-term maintenance is certainly highly important tho, as these tools will break somewhere down the line. —Justin (koavf)❤T☮C☺M☯ 22:36, 3 July 2024 (UTC)
For the record, I tried the CropTool again, since several people above said it was working. I started from a 4000 x 6000 px image already on Commons. The tool spent somewhere upwards of 30 seconds failing to fetch the image, then prompted me again for a URL. Repeated twice, at which point I'm comfortable in saying it is still broken for me. - Jmabel ! talk 03:34, 5 July 2024 (UTC)
- I just tried to crop and rotate an image and it worked: File:Edifici Catalana Occidente (cropped).jpg. Maybe I was just lucky or maybe there is a problem related with the image you tried, your configuration or your equipment. What file did you try to crop?--Pere prlpz (talk) 08:43, 5 July 2024 (UTC)
- I think a mere crop is more likely to fail for larger file sizes and during busier hours. Other features are completely broken. Enhancing999 (talk) 09:13, 5 July 2024 (UTC)
Recently I've been unable to overwrite using CropTool, getting the error "The overwrite option is disabled because this is a multipage file", for example when trying to crop File:Galerius Arch (Thessaloniki).jpg. I've never seen this error until recently. It seems to work correctly when saving a new file. Consigned (talk) 18:03, 5 July 2024 (UTC)
- Occasionally, I get that message (for single page files or files with a blank second page), but it still overwrites. Enhancing999 (talk) 18:06, 5 July 2024 (UTC)
Now I got what I think may be the same error as you. When trying to rotate 270º (without cropping) file:Diagonal 441 - Muntaner 223-225 - 20240618 173626.jpg, I got "[Error] undefined". Cropping the same image without rotating seems to work fine.--Pere prlpz (talk) 22:06, 7 July 2024 (UTC)
July 05
German currency files without machine-readable license
Category:Files with no license using PD-GermanGov-currency has 3,409 files which reside in Category:Files with no machine-readable license due to the fact that they use {{PD-GermanGov-currency}} ex-license which was decommissioned some years ago. Some previous discussions:
- Commons:Village_pump/Copyright/Archive/2012/07#German_currency (2012)
- Commons:Deletion requests/Template:PD-GermanGov-currency (from 2013)
- COM:CUR#Germany
Those files were nominated for deletion in 2013, but saved because it was determined that due to "the response from the German government, which is now an OTRS ticket, it would appear that the hosting of these files is in line with Commons' policies." That is great news, but the files still have to have some valid license. Can some German speakers or people understanding nuances of German law help us determine if we need to create some new license templates, resurrect {{PD-GermanGov-currency}}, or delete those files? Jarekt (talk) 03:45, 5 July 2024 (UTC)
- Per COM:CUR#Germany, German currency units are "Not OK except for Deutsche Mark bank notes". They can still be ok though if they're some kind of PD-old or PD-ineligible. The Deutsche Mark bank notes would probably need some new specialised license tag. The others will need to be examined one by one if they're ok for PD-old or PD-ineligible reasons, else they will need to be nominated for deletion. This is an ongoing process, just like the one concerning German stamps. I've looked at files showing German currency every now and then and either nominated them for deletion (Category:Currency related deletion requests) or replaced the PD-GermanGov-currency tag with the proper license tags. --Rosenzweig τ 07:46, 5 July 2024 (UTC)
- @Rosenzweig: That is great work you are doing in Category:German currency-related deletion requests/deleted, and a bit sad, as nobody wants to delete good files. The issue is that last time we looked at this in 2013, people were debating about 500 files in Category:PD-GermanGov-currency but 11 years later we have 3697 files in the same category, so it seems like we gain German currency files with no license much faster than we lose them. Maybe we can start with a new PD template for Deutsche Mark bank notes. What would be the rationale behind it? --Jarekt (talk) 12:56, 5 July 2024 (UTC)
- I've also removed the deprecated tag from quite a few and added proper PD license tags, so it's not just deletions. The problem is that the PD-GermanGov-currency tag wasn't really properly deprecated until November 2023, so uploaders kept using it and adding files. The DM bank note conditions seem to be in VRT ticket:2012081410006029, someone would have to look at that. Though reading through Commons:Deletion requests/Template:PD-GermanGov-currency I'm a bit skeptical if they would be enough for today's Wikimedia Commons. --Rosenzweig τ 13:55, 5 July 2024 (UTC)
@Rosenzweig: , According to Krd, the ticket:2012081410006029 contains an e-mail from the German Federal Bank stating:
- They cannot answer if DM notes are PDGov ("Amtliches Werk"). This has to be decided by court if required, but they are not aware of any precedent. They do not object to the use of the images if they are unmodified and used in good faith.
- They don't have any business in Euro, GDR currency or Reichsmark and refer to the department of finance, or the KFW regarding GDR.
That does not seem like a good basis for a PD template. So we would have to assume that all the files in Category:Files with no license using PD-GermanGov-currency are copyrighted unless we can prove otherwise. Is PD-old-70 our best option or are there some other exceptions which can be used? --Jarekt (talk) 13:30, 8 July 2024 (UTC)
- I suspected it might be something like that, sadly.
- We might be able to use {{PD-Germany-§134-KUG}} for some files. That is a very tricky template with several requirements:
- Can only be used for works published for the first time before 1966.
- Those works must be at least 70 years old, so as of 2024, only works before 1954 are eligible. Next year, works from 1954 will become eligible, etc.
- A personal author/artist MUST NOT be named on/in the work. Which should be usually the case with bank notes (not always with 1920s emergency money though), but coins might contain initials of the designer, which is enough for them to be named.
- A corporate entity of a specific kind (a legal entity under public law, de:Körperschaft des öffentlichen Rechts (Deutschland)) MUST be named on the work. The German Federal Bank, the German state itself or one of its subdivisions would fulfil this criterion.
- That would need to be examined and decided on a case by case basis. For any works past 1965, we could not use it.
- And, per Commons:Licensing, we also would have to consider US copyright (the URAA), which would mean only works which are at least 95 years old are ok. Unless we can find some provision in US law that (foreign) currency units are generally in the PD, which I'm not aware of right now. ---Rosenzweig τ 13:54, 8 July 2024 (UTC)
- @Rosenzweig: , That would also mean that coins and banknotes published after 1953 can not use {{PD-old-70}}, {{PD-anon-70-EU}} or {{PD-Germany-§134-KUG}}. Those can be isolated and proposed for deletion. As for US laws, I have not seen any deletions of works in PD in home country but not in the US in last decade or so, so it is less of a priority, but it would not hurt to add {{PD-US-expired}} to currency from before 1929. Another possible approach would be to rewrite {{PD-GermanGov-currency}} as a "previously considered PD" but now no known restrictions license tag before nominating for deletion. Those files do not have known restrictions and seem no worse than other files with no known restrictions license tags. --Jarekt (talk) 16:50, 8 July 2024 (UTC)
- @Jarekt: I disagree about US copyright, COM:Licensing is an official policy here at Wikimedia Commons, and files are still deleted because they're not yet in the public domain in the US. If you doubt that, just take a look at current deletion requests. So we should stick to the official policy. The German wikipedia only applies German/Austrian/Swiss copyright law, so some files could perhaps be reuploaded there on demand.
- As for a "No known restrictions" license tag, are there really no restrictions? The Federal Bank more or less declares that they won't interfere (similarly here), but is that enough and free enough for a license tag? --Rosenzweig τ 17:35, 9 July 2024 (UTC)
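For triaging the files, the {{PD-Germany-§134-KUG}} conditions and the 95-year US rule as described in the comments above could be sketched roughly as follows. This is only a rough filter built from those comments, not an authoritative reading of the law or of the template; the class `CurrencyWork`, its fields, and both function names are hypothetical, and every file would still need the case-by-case review mentioned above.

```python
from dataclasses import dataclass

@dataclass
class CurrencyWork:
    """Hypothetical record for one coin or bank note (fields are assumptions)."""
    first_published: int           # year of first publication
    names_personal_author: bool    # designer name or initials appear on the work
    names_public_law_entity: bool  # e.g. a federal bank or the state is named

def kug134_candidate(work: CurrencyWork, current_year: int = 2024) -> bool:
    """Check the four conditions listed in the thread for PD-Germany-§134-KUG."""
    return (
        work.first_published < 1966                  # first published before 1966
        and work.first_published < current_year - 70 # in 2024: before 1954
        and not work.names_personal_author           # no personal author named
        and work.names_public_law_entity             # public-law entity named
    )

def us_expired_candidate(work: CurrencyWork, current_year: int = 2024) -> bool:
    """95-year rule as described in the thread (in 2024: before 1929)."""
    return work.first_published < current_year - 95

note_1923 = CurrencyWork(1923, names_personal_author=False, names_public_law_entity=True)
note_1960 = CurrencyWork(1960, names_personal_author=False, names_public_law_entity=True)
print(kug134_candidate(note_1923), us_expired_candidate(note_1923))
print(kug134_candidate(note_1960), us_expired_candidate(note_1960))
```

A work first published in 1966 or later never passes the first check, which matches the remark that the template cannot be used for works past 1965.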
July 06
POTY (Picture of the Year) competition needs help!
POTY desperately needs new volunteers who can do the things required to run the competition. With the current state of the committee, it is likely that there will be no POTY this year, as the main member who ran scripts for the competition has burned out from doing Wikipedia tasks and isn't up for it. Others on the committee are also missing in action.
Check out the discussion here. Shawnqual (talk) 04:46, 6 July 2024 (UTC)
- In the past few months there was a conversation on Commons with someone from the WMF about WMF prioritization of Wikipedias vs Commons this year. Does anyone remember who that is, could they be pinged to make them aware of this issue? I recall part of that conversation was on how Commons is not particularly focused on disseminating and sharing its content - POTY is probably the most visible initiative for Commons to share its top quality images, including outside Wikimedia, and widely engage with Wikimedia contributors across all projects (are there any stats on how many people engage with POTY, and from which wikis?). I'm sure resources for 2024 are already prioritized but visibility could help for 2025. cc @Rhododendrites who I believe was part of that conversation. - Consigned (talk) 10:11, 6 July 2024 (UTC)
- This has now come up in several places. The conversation you're referring to is probably this one. There's also some background on the POTY talk page, and on Jimbo's enwp page. — Rhododendrites talk | 11:35, 6 July 2024 (UTC)
Please, someone save POTY! This is a very important matter. The POTY contest has been completed successfully every year since 2006 and we cannot let it die! Any help is welcome. -- Giles Laurent (talk) 09:18, 8 July 2024 (UTC)
Long term preservation of media files
In relation to the point Long-term structured digital preservation of humanity's media that has been included here, I'd like to comment on a general idea. I know that it isn't feasible at the moment, but I think about it as a wish (or dream) for the future. There are very promising new storage technologies in development that will be great for very long term data storage. Once they are available at an affordable cost, many people or institutions will use them to ensure their data is never lost. But what data deserves that more than... human knowledge put together? This goes beyond Commons: all Wikimedia text dumps should also be included there. Text dumps are currently "only" about 25 TB in size (even including the last 5 versions; each one of them covers full history for all Wiki pages). All media files in Commons currently total 543.48 TB. 5D optical data storage disks could store up to 360 TB of data, so 2 of them could store all Commons media files, plus full text history of all wiki pages from all Wikimedia sites (including Commons itself). I don't know when it will become affordable, but, once it is, I think it is absolutely desirable that WMF produces, each year or every few years, those 2 (or 3 or 4, no more should be needed, even if total size grows a lot) disks of human knowledge. Each pair/group of disks could be stored in different places, so very long term preservation of all content is ensured. If cost ends up being really low, its use can also be very promising for Internet Archive, so the earthquake risk for its contents (that I have expressed several times, both here and in Wikipedia village pump, and even at their own forum) would be virtually eliminated. MGeog2022 (talk) 13:04, 6 July 2024 (UTC)
- It is very important to think about how human knowledge and documents of it can outlast many millennia. Right now, there are several research efforts going on into which media might be the best for storing many hundreds of terabytes of data, like "Project Silica" or "Cerabyte". Research takes a lot of time until the respective storage media is available for the market. For writing and using the said 5D medium, we need a machine with a femtosecond laser, which is very uncommon. We need storage media where writing and reading will also be possible in the far future. Preserving data amounts under 1 PB isn't that expensive and can be assured easily. Next to classic HDDs, files could also be stored on LTO tapes, which are rather cheap and are good for archiving. Apart from this, we need distributed archiving and redundancies, and storage media that is not exposed to (human) failures and errors, or malware, or other threats. But as storage media gets better from year to year, we are already able to store many, many petabytes in a small space (the Internet Archive has a high storage density with its petabox). And in theory, you would need only forty 26 TB HDDs for one petabyte, which fit in any (smaller) room. :) --PantheraLeo1359531 😺 (talk) 14:21, 7 July 2024 (UTC)
- But the biggest problems are errors made by human users, disasters, wars, and people who don't think about or don't care about archiving their works (photos or videos of events etc.) they created. --PantheraLeo1359531 😺 (talk) 14:22, 7 July 2024 (UTC)
- Another problem is how organisations like Internet Archive or Google, or Meta or whoever, care about preservation. The Internet Archive stores very many petabytes that have to be managed and to be maintained. Companies like Meta or Google keep highly relevant files (look at images contributed to Google Maps; many people use Instagram to post relevant scenes) alive. What happens after their liquidation? How are their files treated and how to handle issues like copyright and privacy? What about archives of administrations, of towns, of newspapers, etc. pp... Many questions, but unfortunately too few answers. --PantheraLeo1359531 😺 (talk) 14:27, 7 July 2024 (UTC)
- According to the Wikipedia article about 5D optical data storage, it was used in 2018 to store a copy of Isaac Asimov's Foundation trilogy. That made me think that it was a matter of time (let's say 15-20 years, for example) for it to be available at an affordable cost to WMF. Of course there are lots of storage technologies available; the point with 5D storage is that the same physical disc could last for (theoretically) billions of years, so, once a backup is created, it needs no further maintenance. Well, up until 2021, there were no true backups of Wikimedia Commons media files (only several production copies); now, there are backups in both main WMF datacenters. Before 2013 or so, there was only 1 main WMF datacenter (no copies or backups outside of it). I've read about plans to also implement offline backups that can provide even greater security. Being optimistic, if the vast majority of files survived for many years when there weren't even proper backups (and even when all copies were at the same datacenter), they are much more likely to survive now that they are backed up following established standards. When the day arrives that high capacity media with an indefinite lifetime is available, it would be great to also write those backups to such media, but probably it isn't strictly needed, since well-maintained, frequent backups would in fact give almost the same level of security.
- By the way, I think it is worth publicly recognizing the work by Jaime Crespo of WMF, in backups, and particularly media backups. They were an absolute need, and it seems that almost all work was done by him; it's really great.
- What about archives of administrations, of towns, of newspapers: this is a complex issue. Data that enters public domain can be uploaded to Internet Archive, or, if applicable, to Commons. Governments of all levels should take care of their historical data, and eventually, to have a kind of world central repository, most of it could also end up in Archive or, in some cases, Commons. Archives of important newspapers are usually well-kept, and, if they enter public domain, could also be centralized in Internet Archive or a similar site.
- Internet Archive or Google, or Meta: Google and Meta have enough money to store an enormous size of data, but their purpose is to monetize it, not to preserve it indefinitely. On the other hand, Internet Archive's problem is the combination of its small budget with the really big amount of data that it stores, probably along with some wrong decisions they have made. MGeog2022 (talk) 20:27, 7 July 2024 (UTC)
- Sorry if I am trying to go too far too soon: thinking more about reality and less about dreams :-), probably the suggestion I should make is to implement offline backups (for example, on tape), if there are no immediate plans for it just now. MGeog2022 (talk) 20:38, 7 July 2024 (UTC)
- For me, talking about these topics and possible scenarios is interesting, and worth discussing :) --PantheraLeo1359531 😺 (talk) 06:44, 9 July 2024 (UTC)
- Watch some of the videos on Iron Mountain, Boyers, Pennsylvania facility storing digital and physical media, and Granite Mountain Records Vault and compare to 2008 Universal Studios fire original master recordings; and w:Double Fold where archives discard/sell original material after microfilming. --RAN (talk) 15:15, 7 July 2024 (UTC)
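The capacity figures quoted in this thread can be checked with a few lines of arithmetic. This sketch only reuses the numbers given above (543.48 TB of media, about 25 TB of text dumps, 360 TB per 5D disc, 26 TB per HDD) and assumes decimal units (1 PB = 1000 TB) with no redundancy or filesystem overhead.

```python
import math

# Figures as quoted in the thread above.
COMMONS_MEDIA_TB = 543.48
TEXT_DUMPS_TB = 25
DISC_5D_TB = 360
HDD_TB = 26
PETABYTE_TB = 1000  # assumption: decimal units

total_tb = COMMONS_MEDIA_TB + TEXT_DUMPS_TB
discs_needed = math.ceil(total_tb / DISC_5D_TB)
hdds_per_petabyte = math.ceil(PETABYTE_TB / HDD_TB)

print(f"total: {total_tb:.2f} TB")
print(f"5D discs needed: {discs_needed}")        # 2
print(f"26 TB HDDs per PB: {hdds_per_petabyte}") # 39
```

Two discs cover the current total with roughly 150 TB to spare, and the "forty HDDs per petabyte" estimate is a fair round figure.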
July 07
Distributed_by
See: File:Onorio Moretti (1881-1939) obituary in The Boston Globe of Boston, Massachusetts on October 24, 1939.jpg in the structured data. I want to allow Distributed_by to be used in structured_data. AP UPI news articles are distributed to news outlets. Recordings and movies have distribution companies involved. I tried to make a change at Distributed_by but it did not work. RAN (talk) 15:05, 7 July 2024 (UTC)
- What I am looking for is to allow distributed_by in structured data. --RAN (talk) 20:00, 8 July 2024 (UTC)
Negative boosted AI images
Should Template:Negative boosted template be added to Template:AI upscaled and Template:PD-algorithm? You could argue that most people who are using Commons would rather prefer to find non-AI images when using the search function.--Trade (talk) 21:04, 7 July 2024 (UTC)
- It depends on the subject, for some subjects (and not only AI-specific ones) some AI images are the most relevant, especially when it comes to recent art genres. It's more the quality of the image that matters than how it was created. There may not be very many cases of such but I oppose it being added to PD-algorithm. I'd support it being added to items in Category:AI misgeneration. Prototyperspective (talk) 21:30, 7 July 2024 (UTC)
- For actively bad images it would make sense to also include the supercategory of Category:Poor quality AI-generated images, which can look superficially useful as a thumbnail but have some fundamental error at full size. Belbury (talk) 13:22, 9 July 2024 (UTC)
- (Wrt Template:AI upscaled) Overwriting images with upscaled versions (whether with AI or otherwise) is against Commons policy. Should these edits not be reverted? ReneeWrites (talk) 23:00, 7 July 2024 (UTC)
- Commons is chock full of improperly overwritten files. The way the regulars talk about it here on VP, you would think something's being done, but not really. RadioKAOS / Talk to me, Billy / Transmissions 00:04, 8 July 2024 (UTC)
- In fairness that's why COM:RFR was created. I don't know how effective it is in curbing destructive overwritings but it's not like there have been no attempts. The issue is that judging by what pages the template links to, the vast majority of these upscalings were done by an experienced Commons user. I assume this was a blind spot for them and they're just not aware of this policy. ReneeWrites (talk) 07:10, 8 July 2024 (UTC)
- @ReneeWrites: they absolutely should be reverted.
- @RadioKAOS: I certainly revert these on sight, unless it is genuinely own work by the uploader and they make that decision themselves to overwrite. - Jmabel ! talk 02:41, 8 July 2024 (UTC)
- The template is to flag that an image may contain speculative AI-hallucinated content, not that the Commons image has been overwritten. Most AI-upscaled images are uploaded as separate files (sometimes, frustratingly, instead of the original, where a user pulling historical photos from old newspapers thinks they look better with the AI treatment). Belbury (talk) 10:57, 8 July 2024 (UTC)
- "You could argue that most people who are using Commons would rather prefer to find non-AI images" We can argue lots of things of course, but is this actually the case ... I'd avoid this kind of reasoning, we do not know why people look for the things they look for. Better to make actual judgements on proper arguments. The negative boosted template is a pretty aggressive downrank (similar to that which deletions get) which I would not advise using for anything but literal trash, and thus I would not advise in this case. If it should be downranked with a different value is something that I'd be open to (based on arguments), but we would have to define and add extra levels of derank templates for that to work. —TheDJ (talk • contribs) 14:33, 9 July 2024 (UTC)
- Some Commons users won't mind being served AI images. A number of Wikipedia projects seem very relaxed about using AI-upscaled or entirely AI-generated images for biography articles, apparently preferring them to lower-quality historical images. If I myself was looking for a drawing or an icon I wouldn't dismiss an AI-generated image out of hand.
- But there is definitely an issue where a user may find it hard to distinguish real photographs from artificial ones, when searching. Right now, a search for "astronaut" is okay for the early results being a mix of photos and cartoony artwork - the photos are all (I think!) genuine, the cartoons are either AI or hand-drawn. There's nothing misleading there. But a search for "london steam train" has a completely fake AI image in the first few results, which only becomes apparent if you click through or hover to read the filenames. Belbury (talk) 13:01, 10 July 2024 (UTC)
- Good points. In your second link the problem is mitigated to some extent by the good file naming: when hovering over the AI image, one sees a tooltip over it and the URL in the bottom left with "AI-generated" in the title (even though at VP/Proposals many have opposed a policy guideline for descriptive file naming). I think the best approach would be to have a tag in a corner of the thumbnail image that clarifies the image is entirely/largely made using some AI tool, so that this is visible even when glancing over the search results. Prototyperspective (talk) 20:09, 10 July 2024 (UTC)
July 08
License template request: AGPLv3 only
Could someone create the license template Template:AGPLv3 only? We currently only have Template:AGPL, and that's not acceptable for files that don't use the "or any later version" clause.
We already have Template:GPLv3 only, which makes this distinction for Template:GPLv3.
I'd create the template myself, as I've recently uploaded an AGPLv3-only file, but I have zero experience with template creation and wouldn't even know where to start.
It also occurred to me that there might be some files incorrectly marked as AGPL, since the correct license tag doesn't exist and apparently never has, so the logical step for the lazy is to just use the AGPL template and be done with the upload (which I'll also do, though I'll change the license to the correct one once someone creates the template).
Dunno if this helps, but the Wikidata ID for AGPLv3 (ie. AGPLv3 only) is GNU Affero General Public License, version 3.0 (Q27017232) and AGPLv3 (containing the "any later version" clause) is GNU Affero General Public License, version 3.0 or later (Q27020062). --Veikk0.ma (talk) 01:25, 8 July 2024 (UTC)
- I've created Template:AGPLv3 only, but I'm waiting for a translation admin to mark Template:AGPLv3 only/i18n for translation. I might have done something wrong with the i18n stuff so if a more experienced user/translation admin can check that would be wonderful. —Matrix(!) {user - talk? - uselesscontributions} 10:25, 9 July 2024 (UTC)
- @Veikk0.ma: Done, see Template:AGPLv3 only, you can use/translate it now. —Matrix(!) {user - talk? - uselesscontributions} 17:11, 11 July 2024 (UTC)
- Many thanks! --Veikk0.ma (talk) 00:00, 12 July 2024 (UTC)
Voting to ratify the Wikimedia Movement Charter is ending soon
- You can find this message translated into additional languages on Meta-wiki. Please help translate to your language
Hello everyone,
This is a kind reminder that the voting period to ratify the Wikimedia Movement Charter will be closed on July 9, 2024, at 23:59 UTC.
If you have not voted yet, please vote on SecurePoll.
On behalf of the Charter Electoral Commission,
RamzyM (WMF) 03:45, 8 July 2024 (UTC)
ISO 24138 - International Standard Content Code - ISCC
What is the position of Commons on the International Standard Content Code (ISCC - ISO 24138) https://iscc.io/ ?
As neither the English-language Wikipedia nor the German-language Wikipedia has an article on ISCC, lists it on the ISCC disambiguation page, or mentions it anywhere at all, I fear that there is no position on ISCC at Commons? @Legoktm: --C.Suthorn (@Life_is@no-pony.farm - p7.ee/p) (talk) 10:30, 8 July 2024 (UTC)
- I wasn't aware of ISCC up until now. Was there a specific reason you pinged me? Legoktm (talk) 02:20, 9 July 2024 (UTC)
- Yes, someone tagged you about ISCC on Mastodon a few days ago (in connection with an announcement about ISCC and the expressed hope that Commons would use it). C.Suthorn (@Life_is@no-pony.farm - p7.ee/p) (talk) 02:33, 10 July 2024 (UTC)
How about adding a SDC prop for ISCC? --C.Suthorn (@Life_is@no-pony.farm - p7.ee/p) (talk) 10:31, 8 July 2024 (UTC)
Identifying and categorising special building in Japan
This building is just outside the Category:Hakata Station main station. It has green plants and even a waterfall.
Smiley.toerist (talk) 11:37, 8 July 2024 (UTC)
- It’s the Miyako Hotel, Hakata, at (33.58977,130.42283). Dogfennydd (talk) 17:22, 8 July 2024 (UTC)
- @Dogfennydd@Smiley.toerist done categorizing! It is a very good thing that Japanese copyright law does allow commercial uses of images of their architecture. JWilz12345 (Talk|Contrib's.) 00:42, 9 July 2024 (UTC)
Potentially confusing page naming
We have both Commons:Primeiros passos and Commons:First steps/pt with the level-1 heading "Primeiros passos". This seems potentially very confusing. Any thoughts? - Jmabel ! talk 18:26, 8 July 2024 (UTC)
- Easy, like it or not, English is the default language for the project. It's a pragmatic decision, and obviously assists search. Other languages can be echoed by creating a companion page in Wikidata. Broichmore (talk) 19:10, 8 July 2024 (UTC)
- @Broichmore: one of us is missing the other's point; I'm not sure which. For Portuguese-language users, this results in two rather different pages that effectively have the same title. Are you saying that's not at all a problem, or not one worth fixing, or something else? And I can't see what Wikidata has to do with the matter at all. - Jmabel ! talk 00:09, 9 July 2024 (UTC)
- Yes, I missed the point. The two pages should, could be, amalgamated TBH. Broichmore (talk) 10:00, 9 July 2024 (UTC)
- On closer look, this seems to be a problem also for Commons:First_steps/de and Commons:Erste Schritte and around 20-30 more languages (everything in {{Header|Lang-FS}}) as a lot of these translations were created pre-2010 before we had mw:Extension:Translate. Ideally, they should be redirected to the correct version (Commons:First steps/xx) (the versions are so different there's little point merging). —Matrix(!) {user - talk? - uselesscontributions} 10:34, 9 July 2024 (UTC)
- But that seems to raise the question of whether Commons:Primeiros passos, Commons:Erste Schritte, etc. have some content worth preserving.
- I'd suggest that some people whose native language is other than English might want to look into this for their respective native language. - Jmabel ! talk 16:54, 9 July 2024 (UTC)
July 10
U4C Special Election - Call for Candidates
- You can find this message translated into additional languages on Meta-wiki. Please help translate to your language
Hello all,
A special election has been called to fill additional vacancies on the U4C. The call for candidates phase is open from now through July 19, 2024.
The Universal Code of Conduct Coordinating Committee (U4C) is a global group dedicated to providing an equitable and consistent implementation of the UCoC. Community members are invited to submit their applications in the special election for the U4C. For more information and the responsibilities of the U4C, please review the U4C Charter.
In this special election, according to chapter 2 of the U4C charter, there are 9 seats available on the U4C: four community-at-large seats and five regional seats to ensure the U4C represents the diversity of the movement. No more than two members of the U4C can be elected from the same home wiki. Therefore, candidates must not have English Wikipedia, German Wikipedia, or Italian Wikipedia as their home wiki.
Read more and submit your application on Meta-wiki.
In cooperation with the U4C,
-- Keegan (WMF) (talk) 00:02, 10 July 2024 (UTC)
Does Pywikibot still work?
Yesterday I uploaded File:Brückner Vielflache Fig. 42.jpg using Pywikibot. After one image I stopped the script to fix a detail in the description. Since then I have not been able to upload the other images. The main error messages are "An error occurred for uri https://commons.wikimedia.org/w/api.php" and "The write operation timed out".
all error messages |
---|
Uploading file to commons:commons...
Sleeping for 7.3 seconds, 2024-07-10 17:23:06
ERROR: An error occurred for uri https://commons.wikimedia.org/w/api.php
ERROR: Traceback (most recent call last):
File "/home/PATH/env/lib/python3.12/site-packages/pywikibot/data/api/_requests.py", line 684, in _http_request
response = http.request(self.site, uri=uri,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/PATH/env/lib/python3.12/site-packages/pywikibot/comms/http.py", line 283, in request
r = fetch(baseuri, headers=headers, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/PATH/env/lib/python3.12/site-packages/pywikibot/comms/http.py", line 457, in fetch
callback(response)
File "/home/PATH/env/lib/python3.12/site-packages/pywikibot/comms/http.py", line 343, in error_handling_callback
raise response from None
File "/home/PATH/env/lib/python3.12/site-packages/pywikibot/comms/http.py", line 448, in fetch
response = session.request(method, uri,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/PATH/env/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/PATH/env/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/PATH/env/lib/python3.12/site-packages/requests/adapters.py", line 501, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', TimeoutError('The write operation timed out'))
WARNING: Waiting 5.0 seconds before retrying. |
Was something changed yesterday? My script is the same that has worked for the one uploaded file. --Watchduck (quack) 15:42, 10 July 2024 (UTC)
- I successfully uploaded a photo today. If you don't already, try passing asynchronous=True to the upload. If that doesn't help, try setting socket_timeout for Pywikibot, e.g.:
  import pywikibot

  # Give slow writes more time before the socket gives up (seconds)
  pywikibot.config.socket_timeout = 120

  site = pywikibot.Site("commons", "commons")  # Wikimedia Commons
  site.login()

  image_file_path = "/tmp/example.jpg"
  commons_file_name = "File:example.jpg"
  file_page = pywikibot.FilePage(site, commons_file_name)
  if file_page.exists():
      print(f"The file {commons_file_name} exists.")
      exit()

  # wikitext is the file description page text, defined elsewhere in your script
  file_page.text = wikitext
  summary = "Uploading example.jpg"
  file_page.upload(image_file_path, comment=summary, asynchronous=True)
- --Zache (talk) 16:48, 10 July 2024 (UTC)
pywikibot.config.socket_timeout = 120 worked. Thanks. --Watchduck (quack) 17:02, 10 July 2024 (UTC)
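For anyone uploading a whole series of images like this, here is a minimal sketch of a batch loop that records a failed file and moves on instead of stopping the script. It does not call Pywikibot itself: upload_one is a hypothetical callable you would supply (for example a small function wrapping file_page.upload), and the choice of OSError as the error type to skip on is an assumption, so adjust skip_on to whatever your script actually raises.

```python
def upload_batch(paths, upload_one, skip_on=(OSError,)):
    """Try to upload every path; collect failures instead of aborting.

    upload_one is a hypothetical callable taking one path (not a Pywikibot API).
    skip_on is an assumed tuple of exception types that should be recorded
    and skipped; TimeoutError and ConnectionError are both OSError subclasses.
    """
    uploaded, failed = [], []
    for path in paths:
        try:
            upload_one(path)
        except skip_on as err:
            # Remember what went wrong so the file can be retried later
            failed.append((path, str(err)))
        else:
            uploaded.append(path)
    return uploaded, failed


if __name__ == "__main__":
    # Stand-in uploader for demonstration: pretends one file times out
    def fake_upload(path):
        if path.endswith("43.jpg"):
            raise TimeoutError("The write operation timed out")

    done, errors = upload_batch(["Fig. 42.jpg", "Fig. 43.jpg"], fake_upload)
    print("uploaded:", done)
    print("failed:", errors)
```

The failed list can then be fed back into the same function after raising the timeout, which avoids re-uploading files that already went through.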
July 12
"campaign323@ISA"
Does anyone know the significance of "campaign323@ISA"? I am seeing a lot of bad "depicts" with that as an edit summary. - Jmabel ! talk 01:59, 12 July 2024 (UTC)