User talk:HBC NameWatcherBot/Blacklist/Archive 3

LickFoo Anchoring

{{editprotected}} The "LickFoo" regex pattern should be better anchored - as currently written it matches "click here", which, while possibly promotional, shouldn't really fall under that regex. A suggested replacement would be:

;(?<!c)lick ?(my|his|her):REGEX,LABEL(LickFoo)

I initially thought of using a "\b" word boundary metacharacter, but that wouldn't work if it was inside of a username built as a single word. The negative lookbehind expression should do the trick - it'll only match if "lick" isn't preceded by a "c", which is probably sufficient for the single FP reported on the entry. Other suggestions or a slightly different implementation welcome as well, of course. —Krellis (Talk) 15:30, 18 February 2009 (UTC)

  Not done - I think that this is a better solution to the problem. עוד מישהו Od Mishehu 07:07, 19 February 2009 (UTC)
For future reference, \b does match at the beginning and end of a string as well as next to any non-word character. Chillum 13:58, 19 February 2009 (UTC)
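
The difference between the three variants discussed in this thread can be checked with a short script. This is only a sketch in Python's `re` module, not the bot's own code; the sample usernames are made up for illustration, and `re.IGNORECASE` stands in for the bot's case-insensitive matching.

```python
import re

# Original entry, the proposed lookbehind version, and a \b-anchored version.
OLD = re.compile(r"lick ?(my|his|her)", re.IGNORECASE)
ANCHORED = re.compile(r"(?<!c)lick ?(my|his|her)", re.IGNORECASE)
BOUNDARY = re.compile(r"\blick ?(my|his|her)", re.IGNORECASE)


def hits(pattern, name):
    """Return True if the pattern matches anywhere in the username."""
    return pattern.search(name) is not None


# Hypothetical usernames, chosen only to exercise each pattern.
for name in ("click here", "Xlickmyshoes", "lickmyshoes"):
    print(name, hits(OLD, name), hits(ANCHORED, name), hits(BOUNDARY, name))
```

The lookbehind version drops the "click here" false positive while still matching inside a single-word name, which is the case the word-boundary version misses.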

Request for addition to Blacklist

{{editprotected}}

I have a request to add a serial vandal and sockpuppeteer to this bot's watch list. The serial vandal usually makes a single change to Jimbo Wales's userpage, changing the 'founded' to 'co-founded'. This vandal has created more than 20 socks to do this. The usernames are also usually similar, consisting of variants of The Hollaback Girl, such as There Ain't No Hollaback Girl, or Hollabck Girl, or the latter with girl replaced by "Boy".— dαlus Contribs 07:17, 27 February 2009 (UTC)

How does hollab(a?)ck (girl|boy) seem to you? (please note that these regexes are NOT case sensitive.) עוד מישהו Od Mishehu 07:03, 1 March 2009 (UTC)
As long as it works, I'll be happy. If you check out the sock master's page, then follow the confirmed sock link, you'll see the socks are about to reach thirty.— dαlus Contribs 09:07, 1 March 2009 (UTC)
  Done - hopefully this will help you catch the sockpuppets more quickly. עוד מישהו Od Mishehu 11:04, 1 March 2009 (UTC)
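
For reference, the proposed pattern can be tried against the name variants mentioned in the request. This is a sketch using Python's `re` module rather than the bot itself, with `re.IGNORECASE` assumed to mirror the bot's case-insensitive matching.

```python
import re

# Pattern as proposed above; the optional "a" covers the "Hollabck" spelling.
HOLLABACK = re.compile(r"hollab(a?)ck (girl|boy)", re.IGNORECASE)

# Name variants quoted in the request.
reported = [
    "The Hollaback Girl",
    "There Ain't No Hollaback Girl",
    "Hollabck Girl",
    "Hollabck Boy",
]

for name in reported:
    print(name, HOLLABACK.search(name) is not None)
```

Note that the pattern requires a literal space before "girl" or "boy", so a run-together variant would not be caught without loosening it (for example with ` ?`).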

wikipedia

What's wrong with using wikipedia in your username?? 98.162.148.46 (talk) 18:20, 1 March 2009 (UTC)

I don't know. The user who added it hasn't been active since June '07, so no point in asking him/her. The addition was done with no edit summary. Please also note that there isn't necessarily something wrong with all of these - some of them are highly indicative of trouble making, but not wrong themselves. Even the most blatant ones probably have the occasional exceptions. עוד מישהו Od Mishehu 11:49, 3 March 2009 (UTC)

Delete wikipedia off the blacklist! --98.162.148.46 (talk) 05:14, 7 March 2009 (UTC)

There used to be a rule against having Wikipedia in your username. Chillum 06:39, 7 March 2009 (UTC)
What rule was that? What did it look like? --98.162.148.46 (talk) 22:49, 7 March 2009 (UTC)
There's nothing wrong with "wikipedia" inherently but in most uses, it looks like the name would imply a position of authority. E.G. User:WikipediaEditor may sound like this user is claiming to be the Editor of Wikipedia (like the Editor of a Newspaper). So they're good for the bot's list because they are worth checking. Mangojuicetalk 19:24, 20 March 2009 (UTC)

Regex

The regex "sock *puppet|meat *puppet" can be simplified to "(sock|meat) *puppet". Indianopilot (talk) 20:58, 2 April 2009 (UTC)
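
A quick way to convince oneself that the two forms are equivalent is to run both over a handful of names. The sketch below uses Python's `re` module; the sample strings are invented for the test and are not taken from any report.

```python
import re

LONG_FORM = re.compile(r"sock *puppet|meat *puppet", re.IGNORECASE)
SHORT_FORM = re.compile(r"(sock|meat) *puppet", re.IGNORECASE)

# Hypothetical inputs covering matches and non-matches.
samples = ["sockpuppet", "Meat  Puppet", "sock puppet 3", "puppet", "socks puppet"]

for s in samples:
    # Both patterns should agree on every sample.
    print(s, bool(LONG_FORM.search(s)), bool(SHORT_FORM.search(s)))
```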

"on dunlops" / "on michelins"

Sock puppets or impersonation: This has become a JA/G alternative to "on wheels" --Rrburke(talk) 17:56, 7 April 2009 (UTC)

Group names

Hi there. I've been doing some work at WP:UAA lately, specifically I've been going through the User Creation log. I always search for accounts with "and" or "&" in them, as these are often operated by multiple people. Perhaps you could add a function that searches for these to the namewatcher bot? I realize "and" by itself would be problematic (Sandy, Randy, Amanda, etc.), but " and " might work. --Gardenhoser! (talk) 14:59, 21 April 2009 (UTC)

I believe that "and" is probably too prone to false positives. Blacklisting " and " (with the spaces) and the "&" character seem potentially reasonable, but I don't want to do it unless there is more consensus for such a move. עוד מישהו Od Mishehu 11:15, 22 April 2009 (UTC)
This would look like this:
(\band\b|&)
REGEX,LABEL(and),NOTE(The use of the term "and" may indicate a multiple person account)
As for the rate of false positives, you can always add: ALTERNATE_TARGET(User:Someuser/Username filter test area) to make it report to a private page for a while so as to examine its usefulness. You will also need to add <!-- HBC NameWatcherBot v1.0.1 allowed --> to the destination page or the bot won't report there. Chillum 15:18, 4 September 2009 (UTC)
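
To illustrate how the word-boundary version avoids the false positives raised earlier in this thread (Sandy, Randy, Amanda), here is a small sketch in Python's `re` module. It only exercises the regex; the bot's flags (LABEL, NOTE, ALTERNATE_TARGET) are not modelled, and the two-person names are made up.

```python
import re

GROUP_NAME = re.compile(r"(\band\b|&)", re.IGNORECASE)


def looks_shared(name):
    """Return True if the name contains a standalone "and" or an ampersand."""
    return GROUP_NAME.search(name) is not None


for name in ("Tom and Jerry", "Tom&Jerry", "Sandy", "Randy", "Amanda"):
    print(name, looks_shared(name))
```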

One to remove from the sockpuppets section

{{editprotected}}   Clerk note: Please can the match on kieran in user names be removed. It has an appalling false positive rate. Many new accounts are registered, and they are pretty much never socks of this user. Having it here just adds to the SPI clerk workload. Mayalld (talk) 11:37, 30 April 2009 (UTC)

  Done. — Martin (MSGJ · talk) 12:14, 30 April 2009 (UTC)

Request

Can someone please add

[A-Z]{5,}
REGEX,WAIT_TILL_EDIT,ALTERNATE_TARGET(User:Triplestop/Spam watch)

I am watching for a certain troll, see Category:Wikipedia_sockpuppets_of_ALLCAPS_FOREVER Triplestop x3 15:00, 12 July 2009 (UTC)

This seems like it would be prone to a lot of false positives... --- RockMFR 15:50, 12 July 2009 (UTC)
Did this edit cover your request? It reports something similar to what you want.TNXMan 15:53, 12 July 2009 (UTC)
I think the bot uses case-insensitive matching; I can review the source code later to be sure. If so, this regex would match any name with five or more letters in a row. Chillum 15:59, 12 July 2009 (UTC)
After reading the source code I can confirm that REGEXes are matched without regard to case. I may add a flag for a filter to be case sensitive when I have some time, but I am busy at work this week. Chillum 16:01, 12 July 2009 (UTC)

Ok, never mind then. I will just look through the new user log for possible attack names. Thanks, Triplestop x3 16:37, 12 July 2009 (UTC)

When I wrote the bot I did not even include regular expressions; they were added later. The code it was built on was case insensitive and there were no use cases at the time for making the distinction, so I just added REGEXes as insensitive. Now that an actual use case has come up I will try to make the time to add the CASE_SENSITIVE flag to the bot. This change will be backwards compatible, as filters without this flag will still be treated the same. Chillum 16:41, 12 July 2009 (UTC)
I have made the needed changes to add the CASE_SENSITIVE flag. I have also notified the current bot operator of this change. Once the change is up and running I can add that pattern with this new flag for you. Chillum 18:32, 12 July 2009 (UTC)
This change is now in place, though it has been so long I don't know if this particular pattern is needed anymore. Chillum 00:26, 4 September 2009 (UTC)
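
The effect of case sensitivity on this pattern can be shown outside the bot. The sketch below uses Python's `re` module, where `re.IGNORECASE` plays the role of the bot's default behaviour and the plain pattern plays the role of a CASE_SENSITIVE filter; how the bot implements the flag internally is not shown here. The lowercase and mixed sample names are hypothetical.

```python
import re

INSENSITIVE = re.compile(r"[A-Z]{5,}", re.IGNORECASE)  # default bot behaviour
SENSITIVE = re.compile(r"[A-Z]{5,}")                   # with a case-sensitive flag

for name in ("ALLCAPS FOREVER", "quietuser", "Ab1Cd2"):
    print(
        name,
        INSENSITIVE.search(name) is not None,
        SENSITIVE.search(name) is not None,
    )
```

Without case sensitivity the pattern fires on any run of five or more letters, which is why it was not usable before the flag existed.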

regex morphs

I've noticed that the bot does remove offensive names, but doesn't notice tweaked names like User:Ffuk1. I suggest the regexps ought to be modified to accommodate these names. ManishEarthTalkStalk 02:24, 3 September 2009 (UTC)

It certainly is possible. Another issue is that of similar characters like "pr1ck". This would require a lot of work to do with regexes, perhaps I could add a setting to detect homoglyphs. To do that I would need to compile a table of similar looking characters, or find an existing table. Once I had that then it would not be hard to add a HOMOGLYPH flag. Chillum 02:28, 3 September 2009 (UTC)
For stuff like pr1ck, there is no need for complicated regexes, just create a function which changes all the chars in the input string following these rules:

1--> i or l
3-->e
0-->o
@-->a
$-->s
!-->l or i
7-->t
and so on and so forth.
Usage of leet is also very popular, and there is a table for leet at Leet#Orthography.
The original problem can be easily solved by merging repeated characters. Also, many MMORPG games with built-in chat have their own filters, which might be open-source. Or, ask User:ClueBot for his filters. ManishEarthTalkStalk 12:13, 4 September 2009 (UTC)
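
The substitution rules and the repeated-character merge suggested above could be sketched roughly as follows. This is an illustration in Python, not the bot's code: the function names, the sample usernames, and the idea of collapsing repeats in the blacklist terms as well are all assumptions made for the example. Ambiguous characters such as "1" (i or l) are handled by generating every candidate spelling.

```python
from itertools import groupby, product

# Substitution table taken from the list above; each value holds the
# possible replacements for that character.
SUBSTITUTIONS = {
    "1": "il",
    "3": "e",
    "0": "o",
    "@": "a",
    "$": "s",
    "!": "li",
    "7": "t",
}


def collapse_repeats(text):
    """Merge runs of the same character: 'vaaandal' -> 'vandal'."""
    return "".join(ch for ch, _ in groupby(text))


def normalized_variants(name):
    """Return every de-leeted, repeat-collapsed spelling of a username."""
    options = [SUBSTITUTIONS.get(ch, ch) for ch in name.lower()]
    return {collapse_repeats("".join(combo)) for combo in product(*options)}


def is_flagged(name, blacklist):
    """Check a name against plain-text blacklist terms after normalizing both."""
    targets = {collapse_repeats(term.lower()) for term in blacklist}
    return any(t in v for v in normalized_variants(name) for t in targets)


print(sorted(normalized_variants("Adm1n")))  # ['admin', 'admln']
print(is_flagged("V@nnd@l99", ["vandal"]))   # True
```

Collapsing repeats also changes legitimate double letters ("wheels" becomes "whels"), which is why the sketch applies the same collapse to the blacklist terms before comparing.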

Yes that is the general idea behind the HOMOGLYPH flag I plan to eventually add. What I really need is an extensive table of the homoglyphs, I have made a crude start here in a format the bot can understand: User:HBC NameWatcherBot/Homoglyphs. Chillum 15:10, 4 September 2009 (UTC)
What about character combos : ph-->f, c or k-->ck?? ManishEarthTalkStalk 05:48, 5 September 2009 (UTC)
I doubt both caps and lower are needed. Just change the entire string to lowercase and have just glyphs for lowercase. More concise and easier to understand. ManishEarthTalkStalk 05:51, 5 September 2009 (UTC)
I've made User:HBC NameWatcherBot/Homoglyphs/Leet for leet combos. ManishEarthTalkStalk 05:57, 5 September 2009 (UTC)
I have made some progress in designing a routine. For example it will take the string "on wheels" and automatically create a regex like so: "(o|0)n (w|vv)h((e|3)(e|3)|(e|3)a)(l|\!|1|i|\|)(s|z)" so that it can make matches like this: '0n vvh33|z' matches 'on wheels' or even '0ri \/\/|-|3&|z' matches 'on wheels'. This can also handle combos such as f->ph etc. I still need to build an extensive table and integrate it into the bot but these early results are promising. Chillum 16:58, 7 September 2009 (UTC)
Great! Where's the list?ManishEarthTalkStalk 12:59, 9 September 2009 (UTC)
User:HBC NameWatcherBot/Homoglyphs. I have added most of the obvious unicode homoglyphs from a few sources. Chillum 13:28, 9 September 2009 (UTC)
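
As a rough illustration of the approach described in this thread, a plain phrase can be expanded into a regex with one alternation per character. The sketch below is in Python and is not the bot's routine: the `HOMOGLYPHS` table is a tiny hypothetical stand-in for the on-wiki table, the function name is made up, and multi-character combos on the input side (such as "ee" to "ea") are left out.

```python
import re

# Hypothetical miniature table; the real list lives on the Homoglyphs subpage.
HOMOGLYPHS = {
    "o": ["0"],
    "w": ["vv"],
    "e": ["3"],
    "l": ["1", "i", "|", "!"],
    "s": ["z"],
}


def homoglyph_regex(phrase):
    """Build a case-insensitive regex matching look-alike spellings of phrase."""
    parts = []
    for ch in phrase.lower():
        alternatives = [ch] + HOMOGLYPHS.get(ch, [])
        if len(alternatives) == 1:
            parts.append(re.escape(ch))
        else:
            # Escape each alternative so characters like "|" are literal.
            parts.append("(?:" + "|".join(re.escape(a) for a in alternatives) + ")")
    return re.compile("".join(parts), re.IGNORECASE)


pattern = homoglyph_regex("on wheels")
print(pattern.search("0n vvh33|z") is not None)  # True
```

Generating the regex from the phrase keeps the blacklist entries readable, at the cost of a larger compiled pattern for each entry.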

KissFoo

The regexp "kiss ?(my|his|her)" should probably have a note of low-confidence; not everything someone has to say about kissing is inappropriate. Intelligentsiumreview 23:39, 19 October 2009 (UTC)

I will add that flag now. Thanks. Chillum 23:48, 19 October 2009 (UTC)

Simple sock request

We could use a regexp of "Willy wonka and the \b\w\b factory" to track down socks of User:Willy wonka and the dikipedia factory to automatically list in the bot-reported SPI page. MuZemike 04:19, 20 October 2009 (UTC)

The settings would be REGEX,WAIT_TILL_EDIT,SOCK_PUPPET(Willy wonka and the dikipedia factory),ALTERNATE_TARGET(Wikipedia:Sockpuppet investigations/SPI/Subpage - Bot reported cases) MuZemike 04:21, 20 October 2009 (UTC)

Ok. Adding:
;Willy wonka and the \b\w\b factory:REGEX,WAIT_TILL_EDIT,SOCK_PUPPET(Willy wonka and the dikipedia factory),ALTERNATE_TARGET(Wikipedia:Sockpuppet investigations/SPI/Subpage - Bot reported cases)
To the blacklist. I am glad to see my ALTERNATE_TARGET feature being used. Chillum 05:06, 20 October 2009 (UTC)
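
One thing worth checking with this entry: `\b\w\b` only matches a single word character, so the pattern as written would not match a multi-letter word in that position. The sketch below, using Python's `re` module, compares it with a `\w+` variant. The `\w+` version is a suggestion for comparison only, not what was added to the blacklist, and the one-letter sample name is hypothetical.

```python
import re

AS_ADDED = re.compile(r"Willy wonka and the \b\w\b factory", re.IGNORECASE)
# Hypothetical alternative: allow one or more word characters.
WIDER = re.compile(r"Willy wonka and the \w+ factory", re.IGNORECASE)

master = "Willy wonka and the dikipedia factory"
one_letter = "Willy wonka and the x factory"

print(AS_ADDED.search(master) is not None)      # False
print(WIDER.search(master) is not None)         # True
print(AS_ADDED.search(one_letter) is not None)  # True
```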

Another sock request

A regexp of "Railizardz \b[1-9][0-9]\b" with the settings of REGEX,WAIT_TILL_EDIT,SOCK_PUPPET(Railizardz),ALTERNATE_TARGET(Wikipedia:Sockpuppet investigations/SPI/Subpage - Bot reported cases) would also be greatly appreciated as we have many of these socks popping up without a possibility of a rangeblock or anything. Thank you, MuZemike 20:03, 24 October 2009 (UTC)

Will do. Chillum 02:40, 25 October 2009 (UTC)
;Railizardz \b[1-9][0-9]\b:REGEX,WAIT_TILL_EDIT,SOCK_PUPPET(Railizardz),ALTERNATE_TARGET(Wikipedia:Sockpuppet investigations/SPI/Subpage - Bot reported cases)
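
For clarity on what this entry covers: `\b[1-9][0-9]\b` matches exactly a two-digit number from 10 to 99 with no leading zero. A small sketch in Python's `re` module, with made-up account names, shows the boundaries.

```python
import re

RAILIZARDZ = re.compile(r"Railizardz \b[1-9][0-9]\b", re.IGNORECASE)


def is_candidate(name):
    """Return True if the name fits the two-digit Railizardz pattern."""
    return RAILIZARDZ.search(name) is not None


# Hypothetical names used only to show which numbers are covered.
for name in ("Railizardz 42", "Railizardz 7", "Railizardz 123", "Railizardz 05"):
    print(name, is_candidate(name))
```

If the socks ever move to one-digit or three-digit suffixes, the character classes would need widening.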

CiscoWorks vandal

Please add a filter to catch sockpuppets of the CiscoWorks vandal. These usernames typically include the strings "cisco" or "ocsic" (which is Cisco spelled backwards). These have been used to vandalize articles such as CiscoWorks. Please see Wikipedia:Sockpuppet investigations/Doesntworkciscoworks/Archive for details. Tckma (talk) 17:29, 2 November 2009 (UTC)

Capital eye versus lowercase ell and accented letters.

I see a lot of usernames that get around the blacklist of "vandal" by using a capital eye ("vandaI") instead of a lowercase ell ("vandal"). I also see many usernames that try to get around blacklisted words (like "admin") by using accented vowels, particularly í and ú. Finally, I often see "admin" as "ardmin" to get around the blacklist. Tckma (talk) 17:34, 2 November 2009 (UTC)
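
One possible way to handle the first two evasions is sketched below in Python. It is not part of the bot: the function names and sample usernames are hypothetical. Accents are removed by Unicode decomposition, and the capital-eye trick is covered by a character class in the pattern itself. The "ardmin" variant is not addressed here.

```python
import re
import unicodedata


def strip_accents(name):
    """Decompose accented letters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# With case-insensitive matching, a capital eye is just "i", so the final
# letter is allowed to be either "l" or "i".
VANDAL = re.compile(r"vanda[li]", re.IGNORECASE)
ADMIN = re.compile(r"admin", re.IGNORECASE)


def flagged(name, pattern):
    """Match the pattern against the accent-stripped username."""
    return pattern.search(strip_accents(name)) is not None


print(strip_accents("admín"))        # admin
print(flagged("VandaI42", VANDAL))   # True
print(flagged("ádmín", ADMIN))       # True
```

The `[li]` class trades a few extra false positives (any name containing "vandai") for catching the look-alike spelling.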