Thursday, 26 February 2015

Blocking Aggressive/Incoherent Bot

I have a weird bot pummeling my site. It COULD be some sort of low-level denial-of-service attack, but I think that's unlikely. I'm looking for suggestions on blocking it because it's rapidly chewing through all of my CPU and bandwidth allotments.


Here's what it does:

  1. Roughly 650 page requests per minute, like clockwork, constantly, for weeks
  2. Large list of IPs -- hundreds, rotating, with geolocations scattered randomly all around the world
  3. Rotating user agent strings, many of which are for legit browsers
  4. HTTP_REFERER is often, but not always, filled with a spam site
  5. And weirdest of all, the GET requests almost always generate 404 errors because most are for fully qualified URLs that are NOT ON MY SITE. When they are not full URLs, they are for pages or resources that don't exist and never have, and they don't even appear to be exploit attempts.

Here are some sample records from my server logs:



80.84.53.26 - - [24/Feb/2015:06:15:43 -0600] "GET http://ift.tt/1wrNNHT HTTP/1.1" 404 - "http://ift.tt/1zhQiqZ" "Opera/9.20 (Windows NT 6.0; U; en)"
54.147.200.126 - - [24/Feb/2015:06:15:44 -0600] "GET http://ift.tt/1zhQgiX HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; Ubuntu; MSIE 9.0; Trident/5.0; zh-CN)"
91.121.161.167 - - [24/Feb/2015:06:15:44 -0600] "GET http://ift.tt/1wrNPzH HTTP/1.1" 404 - "http://78.37.100.242/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
185.2.101.78 - - [24/Feb/2015:06:15:43 -0600] "GET http://mail.yahoo.com/ HTTP/1.1" 200 269726 "-" "Mozilla/4.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.21022; .NET CLR 3.5.30729; MS-RTC LM 8; .NET CLR 3.0.30729)"
142.0.140.68 - - [24/Feb/2015:06:15:44 -0600] "GET http://ift.tt/1zhQir2] HTTP/1.0" 404 - "http://ift.tt/1wrNNI0" "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/532.0 (KHTML, like Gecko) Chrome/4.0.206.1 Safari/532.0"
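
To make the pattern in these records concrete, here is a minimal Python sketch that parses a line in this format and flags requests whose target is a full URL for some other host. It assumes the log uses Apache's "combined" layout, as the samples appear to; `MY_HOSTS`, `parse_line`, and `is_foreign_absolute_request` are hypothetical names, and `example.com` is a placeholder for my real hostname.

```python
import re
from urllib.parse import urlsplit

# Assumed layout: ip, identd, user, [time], "request", status, bytes,
# "referer", "user agent" (Apache "combined" format).
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Placeholder hostnames; replace with the names the site really answers to.
MY_HOSTS = {"example.com", "www.example.com"}


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def is_foreign_absolute_request(target, my_hosts=MY_HOSTS):
    """True when the request target is a full URL naming a host that is not ours."""
    try:
        parts = urlsplit(target)
    except ValueError:
        # Unparseable target; this sketch leaves it for other checks.
        return False
    if not parts.scheme or not parts.netloc:
        return False  # ordinary relative target such as /index.html
    return (parts.hostname or "").lower() not in my_hosts
```

Run against the first sample record, `parse_line` yields the client IP and the target `http://ift.tt/1wrNNHT`, which `is_foreign_absolute_request` flags. An ordinary request for `/index.html` is not flagged.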


This is the third time I've dealt with these same conditions. It last happened about six months ago. For reference, my site is a blog about baseball (on a blogging platform I built myself) with a few hundred regular visitors. I'm in the US, but my site contains no state secrets!


For now I've redirected all 404 errors to a script which dynamically modifies my .htaccess file to instantly ban IPs that make incoherent requests. That works, but I don't think it's sustainable.
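
For anyone who wants to picture that stopgap, the rule-generation part might look roughly like the Python sketch below. This is not my actual script: `build_deny_block` is a hypothetical helper, and it assumes Apache 2.2-style access directives (`Order`, `Allow from`, `Deny from`); Apache 2.4 would use `Require not ip` instead.

```python
import ipaddress


def build_deny_block(ips):
    """Build .htaccess text that denies each distinct IP in `ips`.

    Assumes Apache 2.2-style directives. Raises ValueError if any entry
    is not a valid IP address, so junk never reaches the config file.
    """
    # De-duplicate, then sort by IP version and numeric value for stable output.
    addresses = sorted(
        {ipaddress.ip_address(ip.strip()) for ip in ips},
        key=lambda addr: (addr.version, int(addr)),
    )
    lines = ["Order Allow,Deny", "Allow from all"]
    lines += [f"Deny from {addr}" for addr in addresses]
    return "\n".join(lines) + "\n"
```

With hundreds of rotating addresses, the generated block grows by one line per banned IP, which is part of why this does not feel sustainable.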


What is this thing? And what's the best-practice method of blocking it? Thanks.




