User talk:The Earwig/Archive 18


sigma.toolforge.org

Looks like toolforge:sigma got shut down in the Grid Engine deprecation (see phab:T320041). User:Σ is inactive, and you're the only other listed maintainer. Are you planning to migrate it, or should I start trying to find someone to help? AntiCompositeNumber (talk) 00:42, 21 December 2023 (UTC)

@AntiCompositeNumber: Ah. No, the timeline's been so protracted, I haven't been actively following things and didn't know this was happening today. (The date in my mind was early next year.) I could probably do it, but certainly can't allocate time right now to immediately fix this. — The Earwig (talk) 03:27, 21 December 2023 (UTC)
Yeah, today they started shutting down tools whose maintainers hadn't requested more time. The Grid won't be shut down completely until February, though. I've left a note on the phab task asking for the tool to be un-disabled in the meantime. AntiCompositeNumber (talk) 03:43, 21 December 2023 (UTC)
Thanks! — The Earwig (talk) 03:44, 21 December 2023 (UTC)
Hi, I'm available today or tomorrow and would have time to fix this if it is possible to add me as a co-maintainer. I might need some time to familiarize myself with the infra though, as it looks like the tool isn't open source. 0xDeadbeef→∞ (talk to me) 04:02, 21 December 2023 (UTC)
Thanks for volunteering, 0xDeadbeef! I've added you as a co-maintainer. There's supposed to be a code repository but it must've disappeared (any idea where that ended up, Lego?). The active code is in ~/www/python/src and possibly other places; there are local changes not in sync with the git repo. Feel free to ping if you have any questions, though honestly, beyond what I just said, I probably know as much as you do about this. — The Earwig (talk) 04:10, 21 December 2023 (UTC)
The repository is there, it's just marked as private. It's up to date with what's on Toolforge, aside from all the uncommitted changes that is. Probably best to push the repository to Wikimedia GitLab tbh. Legoktm (talk) 04:25, 21 December 2023 (UTC)
I just did, at https://gitlab.wikimedia.org/toolforge-repos/sigma 0xDeadbeef→∞ (talk to me) 05:09, 21 December 2023 (UTC)
Btw, has the "AFD Stats" page at https://sigma.toolforge.org/afdstats always been like that? 0xDeadbeef→∞ (talk to me) 06:41, 21 December 2023 (UTC)
Besides the weird AFD Stats page, I've restored the others and they seem to be running fine. Lowercase sigmabot III's two daily jobs have been converted to use the new framework. Let me know if there are any other errors. 0xDeadbeef→∞ (talk to me) 07:13, 21 December 2023 (UTC)
@0xDeadbeef: Thanks a bunch! I don't think AFD Stats has always been broken, but people are mostly using https://afdstats.toolforge.org/ now, so it's not a priority to fix. Maybe I can take a look at that myself later. I also noticed the main page at https://sigma.toolforge.org/ still displays the 410 Gone error, though the individual tools are fine; did we have an index page before that disappeared? Scratch that, just some bad caching on my end. All good. — The Earwig (talk) 14:02, 21 December 2023 (UTC)
Well... it seems like the afdstats tool is also still on the grid, cf. https://github.com/enterprisey/afdstats/pull/27. Ping @Enterprisey! Legoktm (talk) 07:00, 22 December 2023 (UTC)

The Signpost: 24 December 2023

A solstice greeting

❄️ Happy holidays! ❄️

Hi Ben! I'd like to wish you a splendid solstice season as we wrap up the year. Here is an artwork, made individually for you, to celebrate. It was great to meet you in Toronto, and looking forward to collaborations in the coming year! Take care, and thanks for all you do to make Wikipedia better!
Cheers,
{{u|Sdkb}}talk
 
Solstice Celebration for The Earwig, 2023, DALL·E 3.
Note: The vibes are winter solsticey. If you're in the southern hemisphere, oops, apologies.

{{u|Sdkb}}talk 07:06, 24 December 2023 (UTC)

Thanks very much, Sdkb! Great meeting you as well. All the best to you in the new year. — The Earwig (talk) 20:30, 24 December 2023 (UTC)

Merry Christmas!

Hello, The Earwig! Thank you for your work to maintain and improve Wikipedia! Wishing you a Merry Christmas and a Happy New Year!
Chris Troutman (talk) 23:15, 24 December 2023 (UTC)

Spread the WikiLove and leave other users this message by adding {{subst:Multi-language Season's Greetings}}

Copyvio tool is down

Hello Ben. Sorry to bother you, but the copyvio tool is down; it's been down for about an hour and a half with 504 gateway timeout errors. Any help appreciated. Thanks, — Diannaa (talk) 16:56, 23 December 2023 (UTC)

Thanks; I've noticed things being a little spotty over the past couple weeks, but haven't identified a cause yet (i.e. no single culprit for increased usage). I'll continue to keep an eye out. — The Earwig (talk) 18:59, 23 December 2023 (UTC)
Sorry to bother you today of all days, but the tool is suffering outages again, and has currently been down for an hour and a half. Thanks, — Diannaa (talk) 17:29, 25 December 2023 (UTC)

Administrators' newsletter – January 2024

News and updates for administrators from the past month (December 2023).

  Arbitration

  Miscellaneous


Sent by MediaWiki message delivery (talk) 11:54, 1 January 2024 (UTC)

The Signpost: 10 January 2024

User:Reports bot

Hi Earwig, I am enquiring about User:Reports bot and its task to update Wikipedia:WikiProject Women in Red/Metrics. There is a proposal to update the WikiProject banner for this project and I'm just checking that it won't disrupt the work of the bot? Best regards — Martin (MSGJ · talk) 22:33, 18 January 2024 (UTC)

Hey MSGJ, I don’t see any issue with this. The bot is flexible about the page contents, provided its Reports bot variable comments on the individual metric pages are preserved. — The Earwig alt (talk) 22:44, 18 January 2024 (UTC)
Thanks. Not planning to change that page itself but only the banner {{WIR}} used to tag relevant pages within the scope of the project. It was just in case your bot was relying on any specific template or categories to find these pages. — Martin (MSGJ · talk) 09:01, 19 January 2024 (UTC)

Temporary Password

I am User:Wxao Zesty, and I am requesting that a temporary password be sent to my email, since the last one did not go through. 216.176.69.228 (talk) 20:02, 19 January 2024 (UTC)

The Signpost: 31 January 2024

Administrators' newsletter – February 2024

News and updates for administrators from the past month (January 2024).

 

  CheckUser changes

  Wugapodes

  Interface administrator changes

 

  Guideline and policy news

  • An RfC about increasing the inactivity requirement for Interface administrators is open for feedback.

  Technical news

  • Pages that use the JSON contentmodel will now use tabs instead of spaces for auto-indentation. This will significantly reduce the page size. (T326065)
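
As a rough illustration of why tab indentation shrinks a pretty-printed JSON page, the sketch below serializes the same data both ways and compares byte counts. The sample data is hypothetical, and the four-space width used for the "spaces" case is an assumption about the earlier style, not something stated in the newsletter.

```python
import json

# Hypothetical nested data standing in for a page with the JSON content model.
sample = {
    "settings": {
        "levels": [
            {"name": "a", "values": [1, 2, 3]},
            {"name": "b", "values": [4, 5, 6]},
        ]
    }
}


def serialized_size(data, indent):
    """Return the UTF-8 byte length of the pretty-printed JSON."""
    return len(json.dumps(data, indent=indent).encode("utf-8"))


# Assumption: the earlier style indented with four spaces per nesting level.
spaces_size = serialized_size(sample, 4)
# One tab character per nesting level.
tabs_size = serialized_size(sample, "\t")

print(f"spaces: {spaces_size} bytes, tabs: {tabs_size} bytes")
```

Each indented line saves three bytes per nesting level, so the saving grows with both the number of lines and their depth.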

  Arbitration

  • Following a motion, the Arbitration Committee adopted a new enforcement restriction on January 4, 2024, wherein the Committee may apply the 'Reliable source consensus-required restriction' to specified topic areas.
  • Community feedback is requested for a draft to replace the "Information for administrators processing requests" section at WP:AE.

  Miscellaneous


Sent by MediaWiki message delivery (talk) 18:01, 1 February 2024 (UTC)

Using The Wikipedia Library for copyvio detection

Hello. I noticed that large chunks of this section of herbicide are copied directly from this source (you'll need to log in) but the copyvio detector doesn't pick it up: [1]. I can't find a tool to show it nicely, but it is especially obvious if you look at the original diff: [2]. Presumably it isn't detected because the tool can't access the full text? I just wondered whether you'd considered linking up the detector with WP:TWL so that it can check the full text? Admittedly, I am not sure whether the publishers permit automated access, but you would think that they would like us to be checking whether their copyright is being violated! @Samwalton9 (WMF): just in case they can add anything. SmartSE (talk) 10:29, 19 December 2023 (UTC)

@Smartse It's an interesting idea! I don't think we could do anything immediately, but if it would be feasible/helpful we could initiate a conversation with one or more of the library's partners about this. Perhaps EBSCO, given that they're our search provider? I'm not sure on the details of how this would work. Samwalton9 (WMF) (talk) 12:56, 19 December 2023 (UTC)
Hey Smartse. I'm with Samwalton9 that this would be really cool to support, but I'd be very surprised if TWL's partners would be willing to open up a service to us that would enable the copyvio detector to check content programmatically. Initiating a conversation couldn't hurt, though. — The Earwig (talk) 03:56, 21 December 2023 (UTC)
@The Earwig It's not impossible to imagine - TWL's partners are often concerned that WP editors are going to be copying content, so being able to say "we want to make absolutely sure that's not happening" could be seen quite positively. Would EBSCO be the right organisation, do you think, since they run (and provide us with) EBSCO Discovery Service? Samwalton9 (WMF) (talk) 09:51, 21 December 2023 (UTC)
@Samwalton9 (WMF): I was initially thinking of just searching the sources cited in the article. Apparently, most of the full texts can be accessed by appending the DOI to https://doi-org.wikipedialibrary.idm.oclc.org/ so it shouldn't be too difficult to programmatically access the full text (notwithstanding the authentication and any rate-limiting) and then the text could be compared as the tool already does. I'm not familiar with EBSCO, but I imagine that using that would be more complicated as you would need to take chunks of the article, query the search engine repeatedly and then check full texts that could be matches. I also posted about this at meta:Talk:CopyPatrol#Can_the_tool_access_paywalled_full_texts? and the iThenticate service can detect it in a new edit - see the hit for link.springer.com - even though the full text is paywalled, so maybe using that service in this tool could be an option as well? It seems like that tool does a pretty good job of catching new copyvios but we are less capable of detecting old instances. SmartSE (talk) 12:26, 21 December 2023 (UTC)
Checking the DOIs of sources directly cited would be a good start and wouldn't require us to get a search engine working, so we could try that (though the full scope is of course somewhat limited). If I'm to do that through TWL's proxy, we'd need to get the bot access somehow and confirm this usage is within their terms. @Samwalton9: I'm also unfamiliar with EBSCO and from skimming the linked pages it's not clear to me if they offer a search API that I would be able to use for what SmartSE described (query the search engine repeatedly given text snippets from the article and receive results that enable me to get the full text of the source for comparison). I see discussion of end-user search tools, but not an API. One change to the copyvio detector I am sure we will need to make is not showing the user the full text of the suspected source, only the copied snippets. — The Earwig (talk) 14:19, 21 December 2023 (UTC)
@The Earwig Is this a helpful link? Once we've confirmed this is a viable and useful approach I'd be happy to bring this up with them. Samwalton9 (WMF) (talk) 16:07, 8 January 2024 (UTC)
@Samwalton9 (WMF): Probably. I can't say for sure (the API documentation requires an account, and I still don't know the terms of use), but it looks like the right direction. Thanks! — The Earwig (talk) 17:01, 8 January 2024 (UTC)
Alright, I'll get an initial conversation kicked off with them and see how feasible this is. I'll be in touch! Samwalton9 (WMF) (talk) 10:33, 12 January 2024 (UTC)
@The Earwig Good news! We met with EBSCO today and they're enthusiastic about the idea. Their main question was around request load - do you have any data/estimates about how many daily or monthly requests Copyvios makes?
The other topic we talked about was how pulling the text through would work (or not). EDS has access to all these databases to index for searching, but not necessarily for displaying full text. Even if they did, that would be for subscribing customers so there would be some concern about pulling the full text through to display publicly in the tool. It might be the case that they could return some information about finding a match in a source, but perhaps not display the actual matched text directly. That's something we'll need to get more clarity on with them, but perhaps even if that is the case we could make some UI changes to highlight that a match was found in EDS, and the relevant URL, but not display the matching text? Happy to think that through with you.
If this still sounds feasible to you I'd be happy to copy you into our email thread so you could ask any more specific questions you might have. Samwalton9 (WMF) (talk) 16:25, 5 February 2024 (UTC)
@Samwalton9 (WMF): Sounds good, thanks for the update! We can definitely indicate a match without including the full text if needed. There is already some support in the tool for this with the Turnitin option.
Regarding request rate, the tool checks about 1,200 articles per day or 36,000 per month. I'd be surprised if that's too much for them, but we could make the new functionality opt-in like Turnitin, so users have to check a box to use EDS, which will drastically reduce the rate (the Turnitin feature is used only 100 times/day). — The Earwig (talk) 16:54, 5 February 2024 (UTC)
@The Earwig Thanks for the data! I remember reading somewhere that the tool makes multiple requests per article check, is that right? I wonder if you have a sense of how many actual API requests are being made? Samwalton9 (WMF) (talk) 13:05, 6 February 2024 (UTC)
@Samwalton9 (WMF): Yes, that's right – up to 8 per article, depending on page size, but again, configurable. Altogether for Google Search the number is under 10k for most days. — The Earwig (talk) 14:41, 6 February 2024 (UTC)
Great, thanks! I've cc'd you on an email. Samwalton9 (WMF) (talk) 15:36, 6 February 2024 (UTC)
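
For anyone reading this thread later, here is a minimal sketch of two things discussed above: building a proxied full-text URL by appending a DOI to the prefix SmartSE mentioned, and the upper bound on daily search requests implied by the figures given (about 1,200 checks per day, up to 8 queries each). The function names and the sample DOI are hypothetical and are not taken from the copyvio detector's code. Nothing here fetches anything: authentication, rate limiting, and whether automated access is permitted at all were still open questions in the discussion.

```python
from urllib.parse import quote

# Proxy prefix quoted in the thread; a DOI is appended to it.
TWL_DOI_PROXY = "https://doi-org.wikipedialibrary.idm.oclc.org/"


def proxied_doi_url(doi: str) -> str:
    """Build the proxied URL for a DOI (hypothetical helper).

    Accepts a bare DOI, a "doi:" prefixed DOI, or a doi.org URL.
    """
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Keep "/" literal so the DOI's prefix/suffix separator survives.
    return TWL_DOI_PROXY + quote(doi, safe="/")


def max_daily_requests(checks_per_day: int, max_queries_per_check: int) -> int:
    """Upper bound on search requests if every check used its full quota."""
    return checks_per_day * max_queries_per_check


# Sample DOI is made up for illustration only.
print(proxied_doi_url("doi:10.1000/example"))
print(max_daily_requests(1200, 8))
```

The bound of 9,600 is consistent with the "under 10k for most days" figure quoted for Google Search, since not every check uses all 8 queries.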

The Signpost: 13 February 2024

lowercase sigmabot III

Hi! I reached out to Σ by email about lowercase sigmabot III, which had not been archiving anything (with the exceptions of AN and ANI) since last week. They responded (by email) saying "Please reach out to Earwig for this issue. The crontab was erased somehow, which means that it's no longer running the bot on its schedule. I'm not sure what changed but I think he will know where to look" and that "For the time being I just kicked it off manually." Thank you for any insight you might have! HouseBlaster (talk · he/him) 15:07, 28 February 2024 (UTC)

Thanks for letting me know. I'll take a look at this. — The Earwig (talk) 15:29, 28 February 2024 (UTC)
It's not clear what the original issue was, but I've jiggled things a bit, so if we're lucky it won't happen again. — The Earwig (talk) 16:29, 28 February 2024 (UTC)
Thank you! HouseBlaster (talk · he/him) 17:05, 28 February 2024 (UTC)

Administrators' newsletter – March 2024

News and updates for administrators from the past month (February 2024).

  Guideline and policy news

  Technical news

  • The mobile site history pages now use the same HTML as the desktop history pages. (T353388)

  Miscellaneous


Sent by MediaWiki message delivery (talk) 12:22, 1 March 2024 (UTC)

The Signpost: 2 March 2024

Revdel-responder

Hi, it could be a WP:THURSDAY thing but the revdel-responder script seems to have a problem today. I keep getting a message "Sorry! revdel-responder failed to parse the page content". I'm not good enough at interpreting the console to work out what's gone wrong. Nthep (talk) 11:44, 14 March 2024 (UTC)

Not sure if anything has happened during the day, but it seems to have resolved itself. Nthep (talk) 18:49, 14 March 2024 (UTC)
Thanks for letting me know, Nthep. It's possible that was some intermittent error. If you run across it again, let me know the page, or send me the text from the console (right click -> Inspect -> "Console" tab, there should be a line starting with "Error while parsing page content"). — The Earwig (talk) 03:48, 15 March 2024 (UTC)