Too many search engine crawlers are overloading the server


The NGINX access log looks like this:

104.196.126.216 - - [10/Mar/2021:02:31:05 +0800] "GET /robots.txt HTTP/1.1" 200 109 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)" "xxx.com" "text/plain" "/htdocs/robots.txt" 1024
93.158.90.105 - - [10/Mar/2021:02:31:25 +0800] "GET /?p=394 HTTP/1.1" 200 11039 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0" "xxx.com" "text/html" "/htdocs/index.php" 822753
93.158.90.31 - - [10/Mar/2021:02:31:25 +0800] "GET /?m=2012-11-20 HTTP/1.1" 200 7797 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0" "xxx.com" "text/html" "/htdocs/index.php" 1070295
93.158.90.88 - - [10/Mar/2021:02:31:25 +0800] "GET /?p=3042 HTTP/1.1" 200 9487 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0" "xxx.com" "text/html" "/htdocs/index.php" 763384
130.255.162.154 - - [10/Mar/2021:02:31:26 +0800] "GET /index.php/nggallery/thumbnails?page_id=3626 HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0" "xxx.com" "text/html" "/htdocs/index.php" 771114
192.36.24.93 - - [10/Mar/2021:02:31:25 +0800] "GET /?m=2018-07-06 HTTP/1.1" 200 7834 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0" "xxx.com" "text/html" "/htdocs/index.php" 939284
93.158.90.80 - - [10/Mar/2021:02:31:25 +0800] "GET /?p=4819&replytocom=24619 HTTP/1.1" 200 11778 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0" "xxx.com" "text/html" "/htdocs/index.php" 1692363
93.158.90.96 - - [10/Mar/2021:02:31:25 +0800] "GET /?p=3164 HTTP/1.1" 200 9470 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0" "xxx.com" "text/html" "/htdocs/index.php" 1464669
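
Before deciding which crawlers to block, it can help to tally the log by user agent or by client IP. The following is a minimal sketch, assuming every line follows the log format shown above. The names `LINE_RE`, `tally`, and `SAMPLE` are made up for illustration and are not part of NGINX.

```python
import re
from collections import Counter

# Leading fields of the log format shown above:
# ip - - [time] "request" status bytes "referer" "user agent" ...
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally(lines, field):
    """Count log lines by one named field, e.g. "agent" or "ip"."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # lines that do not fit the pattern are skipped
            counts[match.group(field)] += 1
    return counts


# Two lines copied from the log above, used as sample input.
SAMPLE = [
    '104.196.126.216 - - [10/Mar/2021:02:31:05 +0800] "GET /robots.txt HTTP/1.1" '
    '200 109 "-" "ZoominfoBot (zoominfobot at zoominfo dot com)" "xxx.com" '
    '"text/plain" "/htdocs/robots.txt" 1024',
    '93.158.90.105 - - [10/Mar/2021:02:31:25 +0800] "GET /?p=394 HTTP/1.1" '
    '200 11039 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) '
    'Gecko/20100101 Firefox/84.0" "xxx.com" "text/html" "/htdocs/index.php" 822753',
]

if __name__ == "__main__":
    # Print the user agents seen, most frequent first.
    for agent, hits in tally(SAMPLE, "agent").most_common():
        print(hits, agent)
```

On a real server, the lines would come from the access log file rather than from `SAMPLE`. Tallying by `"ip"` as well is useful here, because many of the requests above carry an ordinary Firefox user agent and cannot be told apart by name.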

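Solution: to reduce the load on the server, use a robots.txt file to block unwanted crawlers. This only works for crawlers that honor robots.txt. No further explanation needed; just copy the rules below.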

User-agent: AhrefsBot
Disallow: /
User-agent: DotBot
Disallow: /
User-agent: SemrushBot
Disallow: /
User-agent: Uptimebot
Disallow: /
User-agent: MJ12bot
Disallow: /
User-agent: MegaIndex.ru
Disallow: /
User-agent: ZoominfoBot
Disallow: /
User-agent: Mail.Ru
Disallow: /
User-agent: SeznamBot
Disallow: /
User-agent: BLEXBot
Disallow: /
User-agent: ExtLinksBot
Disallow: /
User-agent: aiHitBot
Disallow: /
User-agent: Researchscan
Disallow: /
User-agent: DnyzBot
Disallow: /
User-agent: spbot
Disallow: /
User-agent: YandexBot
Disallow: /
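
The rules above can be sanity-checked offline before they are deployed. The following is a rough sketch using `urllib.robotparser` from the Python standard library. `RULES` is a shortened copy of the list, and `is_allowed` is a hypothetical helper name. The check only shows how Python's parser reads the file; it does not show that a given crawler will actually obey it.

```python
from urllib.robotparser import RobotFileParser

# A shortened copy of the rules above; the full list is parsed the same way.
RULES = """\
User-agent: AhrefsBot
Disallow: /
User-agent: ZoominfoBot
Disallow: /
"""


def is_allowed(rules_text, user_agent, path="/"):
    """Return True if the rules let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Googlebot is not listed in the rules, so it should remain allowed.
    for agent in ("AhrefsBot", "ZoominfoBot", "Googlebot"):
        print(agent, is_allowed(RULES, agent, "/?p=394"))
```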

Anything else can be blocked as needed, for example the admin backend, JS files, and CSS files.