User:Sj/proxyscanning

Since March 28, 2005, Wikipedia has tried to automatically block edits coming from open proxies. Here is a short description of this new feature.

------

Transcript of a conversation about this on #mediawiki (permission pending!)

  • <Duesentrieb> TimStarling: how did the proxy-block-test turn out? or are you still working on it?
  • <TimStarling> we're using a blacklist supplied by SORBS... it's active now
  • <silsor> are you still linking to http://en.wikipedia.org/wiki/User:SORBS_DNSBL
  • <TimStarling> there will probably still be a lot of proxies that aren't in the list, it'll just slow them down a bit because they have to work out which ones they are
  • <TimStarling> the only really good solution is to scan them ourselves
  • <Duesentrieb> too slow, don't you think? And there's always AOL...
  • <TimStarling> no, you can scan proxies
  • <TimStarling> I've said this a few times on this channel in the last few weeks... it's entirely possible
  • <silsor> we just got our first catch
  • <silsor> <ziedoros> Argh...."Your user name or IP address has been blocked by SORBS DNSBL."
  • <silsor> <ziedoros> Any idea where I should send my protests?
  • <mark-> SORBS has many false positives
  • <mark-> Blitzed OPM is much smaller, but has almost no false positives
  • <Duesentrieb> tim: is the blocking for edits only, or also for reading?
  • <TimStarling> for edits only
  • <Duesentrieb> tim: good. Are blocks logged somewhere?
  • <TimStarling> no
  • <mark-> testing from wp-servers isn't allowed... scripts for that exist, though
  • <TimStarling> we've already got a scanning script... not much good though
  • <TimStarling> we'd be better off using nmap
  • <mark-> nah... use libopm
  • <TimStarling> we can test from wikipedia servers, I don't know how many times I'll have to say that
  • <Duesentrieb> TimStarling: but testing "real time", after the user pressed submit, but before he gets a response, would be way too slow, no?
  • <TimStarling> no, you start the test on the request of the edit page.. we were only testing common HTTP proxy ports, not every port
  • <mark-> TimStarling: I heard that the current colo didn't allow us to
  • <TimStarling> Mark: the problem was just that we weren't prepared to deal with complaints
  • <TimStarling> we just have to have a system in place... lots of organisations do it
  • <Duesentrieb> TimStarling: bots may not even request the edit page, though...
  • <bw`> the colo doesn't have a problem with scanning for proxies
  • <bw`> but it would help for abuse tracking reasons if they all came from the same IP
  • <mark-> at Blitzed, we use RT for that so wikipedia can use OTRS
  • <mark-> TimStarling: libopm is an ultra-fast proxy scanning lib
  • <mark-> it shouldn't be hard to make bindings for PHP; they already exist for Perl, for example
  • <TimStarling> if we were going to set up proxy scanning, we'd have a dedicated machine, with a reverse DNS entry like "proxyscanner.wikimedia.org"
  • <TimStarling> and that machine would have apache running on it, and connecting to it would give you an information page intended to discourage complaints
  • <mark-> proxyscanner-please-visit-the-webserver-on-this-host.wikimedia.org.. something like that is what most DNSBLs use
  • <mark-> it helps a bit but we still need to deal with reports
  • <bw`> that doesn't usually help
  • <bw`> windows users with firewall software will still mail you threatening to call the FBI
  • <TimStarling> heh
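
The scanning approach TimStarling describes above (start the test when the edit page is requested, and probe only common HTTP proxy ports) could look roughly like the Python sketch below. It is an illustration only, not the scanning script mentioned in the log: the port list, the function names and the choice of an HTTP CONNECT probe are all assumptions. An address would count as a likely open proxy if any probed port agrees to tunnel to a host the scanner controls.

```python
import socket

# Assumed list of ports commonly used by HTTP proxies (not taken from the log).
COMMON_PROXY_PORTS = (80, 3128, 8000, 8080)


def build_probe(target_host: str, target_port: int) -> bytes:
    """Build an HTTP CONNECT request asking a proxy to tunnel to the target."""
    return f"CONNECT {target_host}:{target_port} HTTP/1.0\r\n\r\n".encode("ascii")


def looks_open(reply: bytes) -> bool:
    """Return True if the first line of the reply is an HTTP 2xx status line."""
    parts = reply.split(b"\r\n", 1)[0].split()
    return len(parts) >= 2 and parts[0].startswith(b"HTTP/") and parts[1][:1] == b"2"


def probe_port(ip: str, port: int, target_host: str, target_port: int,
               timeout: float = 3.0) -> bool:
    """Try one port; any connection error or timeout counts as 'not open'."""
    try:
        with socket.create_connection((ip, port), timeout=timeout) as conn:
            conn.sendall(build_probe(target_host, target_port))
            return looks_open(conn.recv(256))
    except OSError:
        return False


def scan(ip: str, target_host: str, target_port: int,
         ports=COMMON_PROXY_PORTS) -> list:
    """Return the ports on `ip` that accepted the CONNECT probe."""
    return [p for p in ports if probe_port(ip, p, target_host, target_port)]
```

Probing four ports with a short timeout is what keeps the check fast enough to finish between the edit-page request and the submit. As suggested in the conversation, the probes would best come from a single dedicated machine so that abuse reports all point at one address.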



More conversation about this (permission pending!)


  • Mär 29 23:25:02 <nsh-> chaper, do we use DNSBL to place automatic IP blocks?
  • Mär 29 23:25:10 <nsh-> chaper, and if so, do we have a whitelist for it?
  • Mär 29 23:25:19 <nsh-> chaper, and if not, can you make one and add an IP to it for me
  • Mär 29 23:25:24 <nsh-> blargh
  • Mär 29 23:28:31 <silsor> nsh-: you're blocked by the new DNSBL?
  • Mär 29 23:28:42 <silsor> nsh-: chaper isn't really in charge of that...
  • Mär 29 23:29:07 <nsh-> silsor, not me, but a chappie in #wikimedia
  • Mär 29 23:29:12 <nsh-> silsor, whoever is then ;-)
  • Mär 29 23:29:13 <silsor> oh, the same guy from yesterday
  • Mär 29 23:33:54 <nsh-> so, what we doing about this DNSBL whitelist lack-of issue?
  • Mär 29 23:36:59 * nsh- writes script to get someone with shell access to create a DNSBL whitelist
  • Mär 29 23:38:46 <nsh-> what's important is getting this IP whitelisted
  • Mär 29 23:38:54 <nsh-> so if someone could get onto that, it'd be cool
  • Mär 29 23:42:05 * nsh- considers being told to fuck off in no uncertain terms a lot less annoying than being ignored
  • Mär 29 23:42:12 <nsh-> :-/
  • Mär 29 23:43:50 <chaper> fsck off in no uncertain terms
  • Mär 29 23:43:59 <chaper> I wouldn't even know where to begin restarting bots.
  • Mär 29 23:44:19 <nsh-> not the bots :-)
  • Mär 29 23:44:36 <nsh-> allowing the innocent guy whose IP got onto the DNSBL list use wikipedia
  • Mär 29 23:44:49 <nsh-> i don't really care about the bots old boy
  • Mär 29 23:44:50 <nsh-> :-)
  • Mär 29 23:45:08 <nsh-> that's a lie, i love them, but they aren't a problem, just candy.
  • Mär 29 23:45:15 <nsh-> disenfranchisement is a problem :-)
  • Mär 29 23:45:44 <chaper> Ah. That makes sense. Of course, I wouldn't really know what to do with that, either.
  • Mär 29 23:45:49 <chaper> I'm just the technician.
  • Mär 29 23:46:10 <nsh-> ok
  • Mär 29 23:49:50 <Duesentrieb> nsh-: hm... how much trouble do we have due to the proxy-blocking-thingy?
  • Mär 29 23:50:03 <Duesentrieb> is it active on *all* 'pedias, btw?
  • Mär 29 23:50:18 <nsh-> i assume so
  • Mär 29 23:50:53 <Duesentrieb> is the blocking pseudo-admin always the same (SORBS DNSBL) ?
  • Mär 29 23:51:11 <nsh-> one person being unnecessarily disenfranchised is just as important a problem to fix as the whole site being down, of course
  • Mär 29 23:51:34 <nsh-> Duesentrieb, I presume it works below the Wiki level
  • Mär 29 23:51:44 <nsh-> on the apache setup, or lower
  • Mär 29 23:52:03 <Duesentrieb> nsh: no, blocking is done for edits only
  • Mär 29 23:52:07 <nsh-> oh
  • Mär 29 23:52:14 <nsh-> must be a pseudo-admin then
  • Mär 29 23:52:15 <Duesentrieb> i think it uses the normal ip-blocking method
  • Mär 29 23:52:28 <nsh-> or a direct write to the DB
  • Mär 29 23:52:32 <Duesentrieb> pseudo-admin auto-blocks ip if its in the list
  • Mär 29 23:52:43 <nsh-> without an entry to block log
  • Mär 29 23:52:55 <Duesentrieb> nsh: really? hm...
  • Mär 29 23:53:18 <Duesentrieb> maybe it's not a user at all, just a link to a non-existent user page in the block-message
  • Mär 29 23:53:25 <Duesentrieb> too bad tim is sleeping late today.
  • Mär 29 23:53:34 <nsh-> Duesentrieb, just you could mess up the block log by editing with a few hundred SORBS proxies
  • Mär 29 23:54:00 <Duesentrieb> nsh-: yea - but i would expect to appear there only if someone actually tries to use it - not all at once.
  • Mär 29 23:54:06 <Duesentrieb> but i'm not sure
  • Mär 29 23:54:23 <nsh-> mmmm
  • Mär 29 23:54:42 <Duesentrieb> anyway, have a look at this: http://en.wikipedia.org/wiki/User:SORBS_DNSBL
  • Mär 29 23:54:42 <nsh-> aye
  • Mär 29 23:55:13 <Duesentrieb> maybe we should tell admins on different 'pedias to put some info into that pseudo-users page, too
  • Mär 29 23:55:18 <Duesentrieb> i've already done it for de
  • Mär 29 23:55:19 <nsh-> aye
  • Mär 29 23:55:23 <nsh-> thanks
  • Mär 29 23:55:30 <nsh-> ok, i assume any admin can unblock then
  • Mär 29 23:55:37 <nsh-> but it'll be auto-added again
  • Mär 29 23:55:38 <nsh-> margh
  • Mär 29 23:55:48 <nsh-> can we trick the DB somehow
  • Mär 29 23:56:00 <Duesentrieb> nsh: you want a way to overwrite?
  • Mär 29 23:56:12 <Duesentrieb> i expect a whitelist-mechanism would be trivial to implement
  • Mär 29 23:56:40 <Duesentrieb> but the best way is always to tell sorbs to update their db - they promise to do so
  • Mär 29 23:56:47 <Duesentrieb> but it takes a few days.
  • Mär 29 23:56:51 <nsh-> Duesentrieb, yeah, but i don't want to wait for Tim to wake up to get it done
  • Mär 29 23:56:51 <nsh-> ;-)
  • Mär 29 23:57:03 <nsh-> mmm
  • Mär 29 23:57:46 <Duesentrieb> how many people have that problem, anyway?
  • Mär 29 23:58:11 <nsh-> only one that i've heard of
  • Mär 29 23:58:18 <nsh-> but how many would know to complain on IRC?
  • Mär 29 23:58:20 <nsh-> :-)
  • Mär 29 23:58:28 <nsh-> I assume it happens to a few people every day
  • Mär 30 00:00:32 <Duesentrieb> well, put that info on the page of User:SORBS DNSBL
  • Mär 30 00:01:08 <Duesentrieb> but, anyway... does this person have a way to get around that, maybe by accessing the WP from a different box, not at work or something?
  • Mär 30 00:01:21 <Duesentrieb> Schools may have that problem frequently i guess...
  • Mär 30 00:01:47 <Duesentrieb> hm...
  • Mär 30 00:02:01 <Duesentrieb> maybe the block should not apply to logged-in users
  • Mär 30 00:02:34 <Duesentrieb> i believe that this is a general issue - if a non-blocked, logged-in user accesses via a blocked ip - should s/he be blocked?
  • Mär 30 00:02:39 <Duesentrieb> IMHO, no.
  • Mär 30 00:03:12 <Duesentrieb> or we could have an extra flag for that. But that may get complicated...
  • Mär 30 00:07:14 <Duesentrieb> nsh-: have a look at http://meta.wikimedia.org/wiki/User:Sj/proxyscanning
  • Mär 30 00:07:28 <nsh-> thanks Duesentrieb
  • Mär 30 00:08:31 <Duesentrieb> nsh-: this is a writeup of the thing as i understand it so far from the conversation quoted there. I still need tim to proofread it.
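
For readers unfamiliar with DNSBLs, the check and the two relaxations proposed above (a whitelist, and an exemption for logged-in users) can be sketched in Python as follows. This is not MediaWiki's code, and the log does not say how the real check is implemented. The zone name `dnsbl.example.org`, the function names and the `exempt_logged_in` switch are hypothetical. Only the query convention is standard DNSBL practice: reverse the IPv4 octets, append the list's zone, and treat any successful answer as "listed".

```python
import ipaddress
import socket
from typing import Callable, Collection


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str,
              resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """An address is listed if the DNSBL name resolves; a lookup failure means not listed."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:  # socket.gaierror (name not found) is a subclass of OSError
        return False
    return True


def may_edit(ip: str, logged_in: bool, whitelist: Collection[str],
             listed: Callable[[str], bool],
             exempt_logged_in: bool = True) -> bool:
    """Hypothetical edit policy combining the proposals from the conversation."""
    if ip in whitelist:                  # manual override for known false positives
        return True
    if logged_in and exempt_logged_in:   # proposed exemption for logged-in users
        return True
    return not listed(ip)                # otherwise defer to the blacklist
```

A call such as `may_edit(ip, logged_in, whitelist, lambda a: is_listed(a, "dnsbl.example.org"))` would run only on edit requests, which matches the statement that reading is never blocked. Checking the whitelist first means a false positive can be fixed locally without waiting for the list operator to update its database.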


  • Mär 30 01:09:10 <Duesentrieb> nsh-: arg! i'm SORBS-Blocked now :(
  • Mär 30 01:09:22 <Duesentrieb> Their list should be a little smarter about dynamic IPs...
  • Mär 30 01:09:27 <nsh-> Duesentrieb, blarghx0r
  • Mär 30 01:09:33 <Duesentrieb> *g*
  • Mär 30 01:09:40 <Duesentrieb> ok, reconnecting...
  • Mär 30 01:09:44 <nsh-> did you do the online test?
  • Mär 30 01:09:46 <nsh-> k
  • Mär 30 01:11:18 --- Disconnected ().
  • Mär 30 01:11:22 --> You are now talking on #mediawiki
  • Mär 30 01:11:35 <Duesentrieb_> re
  • Mär 30 01:11:59 <Duesentrieb_> nsh-: "online test" ??
  • Mär 30 01:12:33 <nsh-> Duesentrieb: the "retest" link from that User page you linked me to
  • Mär 30 01:13:06 <Duesentrieb_> ah, that one... errr... no, i would have to walk to the router to find out the external IP :)
  • Mär 30 01:16:51 <Duesentrieb_> hm - logged in users are currently blocked if they access from a blocked ip, right?
  • Mär 30 01:16:58 <Duesentrieb_> Shouldn't that be changed?
  • Mär 30 01:17:13 <Duesentrieb_> It would also take care of the more annoying false positives of the SORBS-block
  • Mär 30 01:17:33 <nsh-> Duesentrieb, i'm not a fan of the whole blocking thing anyway, so i won't comment
  • Mär 30 01:17:47 --- Grunt is now known as GruntWillBBL
  • Mär 30 01:18:30 <Duesentrieb_> nsh-: well, i can't think of anything better... at least not of IPs. Blocking accounts is a different story.
  • Mär 30 01:18:31 --- ABCD is now known as ABCD_away
  • Mär 30 01:19:01 <nsh-> Duesentrieb: it's a huge philosophical debate, and i don't wanna get sucked into it
  • Mär 30 01:19:12 <nsh-> Duesentrieb: but basically, blocking a user is resorting to force
  • Mär 30 01:19:21 <nsh-> and Wiki is about everyone having equal power
  • Mär 30 01:19:28 <nsh-> and thus having to solve problems rationally
  • Mär 30 01:19:33 <nsh-> instead of resorting to force
  • Mär 30 01:19:41 <nsh-> blocking is therefore anathemic to wiki
  • Mär 30 01:20:03 <Duesentrieb_> nsh-: for accounts, i'm with you. But for anonymous vandals... well... i can't think of an alternative. But let's not argue about that here and now:)
  • Mär 30 01:20:14 <nsh-> brion, set an rc.3 script for them and i'll love you forever
  • Mär 30 01:20:20 * FoeNyx starts camping in #frrc ;)
  • Mär 30 01:20:34 <nsh-> brion, don't and i'll find out your cell phone number and call you every time zwinger restarts
  • Mär 30 01:20:38 <nsh-> :-)
  • Mär 30 01:21:07 <nsh-> Duesentrieb: it's a debate to be held in person and in the presence of whiskey
  • Mär 30 01:21:10 <nsh-> :-)
  • Mär 30 01:21:38 <Duesentrieb_> nsh-: indeed.
  • Mär 30 01:22:15 <Duesentrieb_> brion: do you know your way around the ipblocklist?


  • Mär 30 01:24:22 <FoeNyx> btw Tim suggested that wp could do the open proxy check, that might lead to legal threat in france too
  • Mär 30 01:24:32 <FoeNyx> not only fbi
  • Mär 30 01:25:01 <mark-> ?
  • Mär 30 01:25:19 <mark-> legal thread?
  • Mär 30 01:25:22 <mark-> threat
  • Mär 30 01:25:48 <FoeNyx> errm I misexpressed myself in english again ?
  • Mär 30 01:26:00 <mark-> what do you mean? :)
  • Mär 30 01:26:07 <FoeNyx> mark-> troll, or anon, users can sue us
  • Mär 30 01:26:22 <mark-> based on what?
  • Mär 30 01:26:55 <FoeNyx> mark-> accessing or trying to access a computer without authorisation is forbidden ..
  • Mär 30 01:27:15 <mark-> foenyx, right now wp is not doing any proxy scanning
  • Mär 30 01:27:20 <mark-> and once we do, it's not a problem either
  • Mär 30 01:27:35 <FoeNyx> mark-> I know but Tim say we could do it
  • Mär 30 01:27:39 <mark-> hell, we (blitzed) have been doing this for some 3 or 4 years now
  • Mär 30 01:28:02 <FoeNyx> mark-> luckily for you no french sued you ;)
  • Mär 30 01:28:12 <mark-> they can't
  • Mär 30 01:28:34 <FoeNyx> mark-> but they could annoy of french local chapter
  • Mär 30 01:28:34 <wikibugs> (NEW) Logged in users should be able to edit even if - http://bugzilla.wikimedia.org/show_bug.cgi?id=1779
  • Mär 30 01:29:28 <mark-> do any of these sues actually happen in practice, in france?
  • Mär 30 01:29:36 <mark-> because proxy scanning is so common
  • Mär 30 01:30:07 <FoeNyx> mark-> ppl usually dont care, but troll wanting to sue someone could probably use it
  • Mär 30 01:30:21 <FoeNyx> brion> thank you brion :)
  • Mär 30 01:30:29 <FoeNyx> wb nsh_
  • Mär 30 01:30:30 <mark-> FoeNyx: I think they should try that then :)
  • Mär 30 01:31:03 <mark-> maybe we shouldn't be doing this, but not because of legal reasons...
  • Mär 30 01:31:12 <mark-> technically it has many drawbacks, and a low hitrate
  • Mär 30 01:31:13 <FoeNyx> mark-> they could win, and french chapter close, and wp banned in france :p
  • Mär 30 01:31:30 <mark-> FoeNyx: doubt it ;)