Filtering low-level CC requests and blocking low-level crawlers with nginx

Published 2016-4-15. Category: Site building

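A CC attack, put simply, simulates a large number of users continuously requesting pages (typically pages that involve heavy data operations and therefore consume a lot of CPU) until the targeted server's resources are exhausted. The most common symptom is failed database connections.
Low-level CC attacks usually send an empty user_agent or one with a fixed name, so we can return an error status code for specific user_agent values and filter those requests out.
Low-level crawlers are blocked the same way: return an error status code for specific user_agent values.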

The nginx configuration is as follows:

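# Return 403 when the User-Agent contains any of these names (case-insensitive match).
if ($http_user_agent ~* (ApacheBench|pingback|AhrefsBot|Jorgee|Scrapy|BOT/0\.1|CrawlDaddy|HttpClient)) { return 403; }
# Return 403 when the User-Agent header is missing or empty.
if ($http_user_agent = "") { return 403; }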

Usage:

You can copy the code above directly into the location / { } block of your site's configuration file (e.g. mysite.com.conf).

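Alternatively, create a file named agent_deny.conf in the /usr/local/nginx/conf/ directory, copy the code into it, and then add the following line inside the server { } block of your site's configuration file (e.g. mysite.com.conf):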

include /usr/local/nginx/conf/agent_deny.conf;
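
The file-creation step can be scripted. The following is only a sketch: the function name write_agent_deny is made up for illustration, the default directory is taken from the include path above, and the rule text assumes that AhrefsBot and Jorgee are meant to be two separate names.

```shell
#!/bin/sh
# Hypothetical helper: write the two filter rules into agent_deny.conf.
# Usage: write_agent_deny [CONF_DIR]   (defaults to /usr/local/nginx/conf)
write_agent_deny() {
    conf_dir=${1:-/usr/local/nginx/conf}
    # Quoted 'EOF' keeps $http_user_agent and the backslash literal.
    cat > "$conf_dir/agent_deny.conf" <<'EOF'
if ($http_user_agent ~* (ApacheBench|pingback|AhrefsBot|Jorgee|Scrapy|BOT/0\.1|CrawlDaddy|HttpClient)) { return 403; }
if ($http_user_agent = "") { return 403; }
EOF
}
```

Calling write_agent_deny with no argument writes to the default directory, which normally requires root privileges. Passing a scratch directory first makes it easy to inspect the generated file before it goes live.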

I use the second method:

server {
    listen 80;
    server_name www.risingsun.cc;
    access_log /data/wwwlogs/www.risingsun.cc_nginx.log combined;
    index index.html index.htm index.php;
    include /usr/local/nginx/conf/wordpress.conf;
    include /usr/local/nginx/conf/agent_deny.conf; # insert this line
    # ... (rest of the server block not shown)
}

Then save the file and restart nginx:

service nginx restart
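
A typo in the included file would stop nginx from starting, so it can be worth testing the configuration before applying it. The sketch below is one possible way to do that and is not part of the original setup: safe_reload is a made-up name, and its optional argument is the path to the nginx binary, assumed to be on PATH when omitted.

```shell
#!/bin/sh
# Hypothetical helper: reload nginx only if the configuration test passes.
# Usage: safe_reload [NGINX_BINARY]
safe_reload() {
    nginx_bin=${1:-nginx}
    # "nginx -t" parses the configuration and reports syntax errors.
    if "$nginx_bin" -t; then
        # "nginx -s reload" asks the running master process to reload.
        "$nginx_bin" -s reload
    else
        echo "configuration test failed; nginx left untouched" >&2
        return 1
    fi
}
```

For an installation under /usr/local/nginx the binary is often /usr/local/nginx/sbin/nginx, but that path is a guess and should be checked on your own system.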

Test whether the rules have taken effect:

Simulate an ApacheBench CC attack: returns a 403 status code

my@my:~# curl -I -A 'ApacheBench' www.risingsun.cc
HTTP/1.1 403 Forbidden
Server: nginx
Date: Fri, 15 Apr 2016 09:56:02 GMT
Content-Type: text/html
Content-Length: 162
Connection: keep-alive

Simulate an HttpClient request (listed below as a TCP attack tool): returns a 403 status code

my@my:~# curl -I -A 'HttpClient' www.risingsun.cc
HTTP/1.1 403 Forbidden
Server: nginx
Date: Fri, 15 Apr 2016 09:57:41 GMT
Content-Type: text/html
Content-Length: 162
Connection: keep-alive

Simulate a crawler with an empty UA: returns a 403 status code

my@my:~# curl -I -A '' www.risingsun.cc
HTTP/1.1 403 Forbidden
Server: nginx
Date: Fri, 15 Apr 2016 09:58:15 GMT
Content-Type: text/html
Content-Length: 162
Connection: keep-alive
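
The three manual checks above can be combined into one loop. This is a minimal sketch: check_uas is a made-up name, and it prints only the status code instead of the full response headers.

```shell
#!/bin/sh
# Hypothetical helper: print the HTTP status code returned for each User-Agent.
# Usage: check_uas URL UA...
check_uas() {
    url=$1
    shift
    for ua in "$@"; do
        # -s: no progress output, -I: HEAD request as in the tests above,
        # -o /dev/null: discard the headers, -w: print only the status code.
        code=$(curl -s -I -o /dev/null -w '%{http_code}' -A "$ua" "$url")
        printf '%s\t%s\n' "$ua" "$code"
    done
}
```

For example, check_uas www.risingsun.cc ApacheBench HttpClient '' 'Mozilla/5.0' should report 403 for the first three if the rules are active. Including an ordinary browser UA is a useful sanity check that normal visitors are not blocked as well.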

Appendix: list of junk UAs:

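User agent                Purpose
FeedDemon                 content scraping
BOT/0.1                   SQL injection
CrawlDaddy                SQL injection
Java                      content scraping
Jullo                     content scraping
Feedly                    content scraping
UniversalFeedParser       content scraping
ApacheBench               CC attack tool
Swiftbot                  useless crawler
YandexBot                 useless crawler
AhrefsBot                 useless crawler
jikeSpider                useless crawler
MJ12bot                   useless crawler
ZmEu                      phpMyAdmin vulnerability scanning
WinHttp                   content scraping, CC attack
EasouSpider               useless crawler
HttpClient                TCP attack
Microsoft URL Control     scanning
YYSpider                  useless crawler
jaunty                    WordPress brute-force scanner
oBot                      useless crawler
Python-urllib             content scraping
Indy Library              scanning
FlightDeckReports Bot     useless crawler
Linguee Bot               useless crawler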
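
To block more names from this list, the rule can be generated instead of edited by hand. The sketch below is an illustration only: escape_regex and build_ua_rule are made-up names, each name is matched as a literal substring, and names containing backslashes or double quotes are not handled.

```shell
#!/bin/sh
# Hypothetical helpers: build an nginx "if" rule from a list of UA names.

escape_regex() {
    # Backslash-escape regex metacharacters so that names such as
    # "BOT/0.1" are matched literally.
    printf '%s' "$1" | sed 's/[].[*^$+?(){}|]/\\&/g'
}

# Usage: build_ua_rule NAME...
build_ua_rule() {
    pattern=""
    for name in "$@"; do
        escaped=$(escape_regex "$name")
        if [ -z "$pattern" ]; then
            pattern=$escaped
        else
            pattern="$pattern|$escaped"
        fi
    done
    # The regex is double-quoted so names with spaces stay one nginx argument.
    printf 'if ($http_user_agent ~* "(%s)") { return 403; }\n' "$pattern"
}
```

For instance, build_ua_rule ZmEu 'Indy Library' 'BOT/0.1' prints a single if line that can be pasted into agent_deny.conf. Short names such as Java or oBot match any user agent containing them, so review the generated rule before deploying it.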

Do not reproduce without permission. Source: Rising Sun's Blog.
