Blocking every spider and crawler except Baidu with NGINX

# Decide whether to block based on the User-Agent
map $http_user_agent $block_spider {
    default 0;
    ~*Baiduspider 0;       # allow the Baidu crawler; keep first, regexes are tested in order

    # Search engine spiders
    ~*Googlebot 1;
    ~*bingbot 1;
    ~*360Spider 1;
    ~*Sogou 1;
    ~*YisouSpider 1;
    ~*Bytespider 1;
    ~*YandexBot 1;
    ~*DuckDuckBot 1;
    ~*AhrefsBot 1;
    ~*SemrushBot 1;
    ~*MJ12bot 1;

    # AI crawlers (2025 edition)
    ~*ChatGPT 1;
    ~*ClaudeBot 1;
    ~*Anthropic 1;
    ~*OpenAI 1;
    ~*Perplexity 1;
    ~*DeepSeek 1;
    ~*YouBot 1;
    ~*GrokBot 1;
    ~*Gemini 1;
    ~*Atlas 1;
    ~*Comet 1;
    ~*AIbot 1;

    # Generic crawler keywords
    ~*crawler 1;
    ~*spider 1;
    ~*bot 1;
}

server {
    location / {
        if ($block_spider) {
            return 403;
        }
        # normal request handling
    }
}
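
Because the `if` above sits inside `location /`, a request served by any other location block would not pass through it. A minimal sketch, assuming the `map` from the snippet above is defined in the `http` context, moves the check up to the `server` level so that it runs for every request to that server. The name `example.com` is a placeholder, not something from the original configuration.

```nginx
server {
    listen 80;
    server_name example.com;   # placeholder

    # Server-level check: runs before a location is selected
    if ($block_spider) {
        return 403;
    }

    location / {
        # normal request handling
    }
}
```

Note that NGINX tests the regular expressions of a `map` in the order they appear, so the `Baiduspider` entry has to stay above the generic `~*spider` pattern for the exception to take effect.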

Here is a compiled list of known AI crawler IP ranges.

Known AI crawler IP ranges (2025 edition)

Crawler	Known IP ranges	Notes
OpenAI (ChatGPT, GPTBot)	`20.171.0.0/16`, `40.83.0.0/16`, `104.18.0.0/20`	Officially published GPTBot crawler IP ranges, used for ChatGPT data collection
Anthropic (ClaudeBot)	`34.160.0.0/16`, `34.149.0.0/16`	Hosted on Google Cloud, mainly in US regions
Perplexity AI	`34.80.0.0/16`, `34.81.0.0/16`, `34.82.0.0/16`	Runs mainly on GCP nodes (Taiwan / Singapore / US)
Google Gemini / DeepMind	`66.249.64.0/19`, `64.233.160.0/19`	Shares some ranges with Googlebot; the Gemini crawler is among them
You.com (YouBot)	`3.64.0.0/16`, `3.65.0.0/16`	Hosted in AWS Europe regions
XAI (GrokBot)	`44.192.0.0/16`, `44.193.0.0/16`	Hosted in AWS US East
DeepSeek	`101.32.0.0/16`, `101.33.0.0/16`	Mainland China and Hong Kong nodes, often seen leaving through Tencent Cloud
Other generic AI crawlers	`52.167.0.0/16`, `52.168.0.0/16`	Common on Microsoft Azure; some AI crawlers use these ranges in disguise

20.171.0.0/16
40.83.0.0/16
104.18.0.0/20
34.160.0.0/16
34.149.0.0/16
34.80.0.0/16
34.81.0.0/16
34.82.0.0/16
66.249.64.0/19
64.233.160.0/19
3.64.0.0/16
3.65.0.0/16
44.192.0.0/16
44.193.0.0/16
101.32.0.0/16
101.33.0.0/16
52.167.0.0/16
52.168.0.0/16

So we can configure NGINX like this:

http {
    # Define the AI crawler IP range blocking rules
    geo $block_ai {
        default 0;

        # OpenAI (ChatGPT, GPTBot)
        20.171.0.0/16 1;
        40.83.0.0/16 1;
        104.18.0.0/20 1;

        # Anthropic (ClaudeBot)
        34.160.0.0/16 1;
        34.149.0.0/16 1;

        # Perplexity AI
        34.80.0.0/16 1;
        34.81.0.0/16 1;
        34.82.0.0/16 1;

        # Google Gemini / DeepMind
        66.249.64.0/19 1;
        64.233.160.0/19 1;

        # You.com (YouBot)
        3.64.0.0/16 1;
        3.65.0.0/16 1;

        # XAI (GrokBot)
        44.192.0.0/16 1;
        44.193.0.0/16 1;

        # DeepSeek
        101.32.0.0/16 1;
        101.33.0.0/16 1;

        # Other generic AI crawlers (Azure)
        52.167.0.0/16 1;
        52.168.0.0/16 1;
    }

    server {
        listen 80;
        server_name yourdomain.com;

        location / {
            if ($block_ai) {
                return 403;
            }

            # normal request handling
            root /var/www/html;
            index index.html index.php;
        }
    }
}
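
By default `geo` matches against `$remote_addr`. When NGINX sits behind a load balancer or CDN, every request appears to come from that proxy and none of the ranges above would match. The following is a hedged sketch using the `proxy` parameter of `geo`, which makes NGINX take the client address from the `X-Forwarded-For` header when the connection comes from a trusted address. The `10.0.0.0/8` network is an assumed placeholder for your own proxy, and only two of the ranges are repeated for brevity.

```nginx
geo $block_ai {
    default 0;

    # Assumption: the front proxy connects from this private network.
    # Requests from it are matched by X-Forwarded-For instead of $remote_addr.
    proxy 10.0.0.0/8;

    20.171.0.0/16 1;
    34.160.0.0/16 1;
    # ... remaining ranges from the list above
}
```

Only list addresses you actually control under `proxy`, since any client connecting from a trusted address can set `X-Forwarded-For` to whatever it likes.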

Or combine everything into one configuration:

http {
    # Define the AI crawler IP range blocking rules
    geo $block_ai {
        default 0;

        # OpenAI (ChatGPT, GPTBot)
        20.171.0.0/16 1;
        40.83.0.0/16 1;
        104.18.0.0/20 1;

        # Anthropic (ClaudeBot)
        34.160.0.0/16 1;
        34.149.0.0/16 1;

        # Perplexity AI
        34.80.0.0/16 1;
        34.81.0.0/16 1;
        34.82.0.0/16 1;

        # Google Gemini / DeepMind
        66.249.64.0/19 1;
        64.233.160.0/19 1;

        # You.com (YouBot)
        3.64.0.0/16 1;
        3.65.0.0/16 1;

        # XAI (GrokBot)
        44.192.0.0/16 1;
        44.193.0.0/16 1;

        # DeepSeek
        101.32.0.0/16 1;
        101.33.0.0/16 1;

        # Other generic AI crawlers (Azure)
        52.167.0.0/16 1;
        52.168.0.0/16 1;
    }

    # Decide whether to block based on the User-Agent
    map $http_user_agent $block_spider {
        default 0;
        ~*Baiduspider 0;       # allow the Baidu crawler; keep first, regexes are tested in order

        # Search engine spiders
        ~*Googlebot 1;
        ~*bingbot 1;
        ~*360Spider 1;
        ~*Sogou 1;
        ~*YisouSpider 1;
        ~*Bytespider 1;
        ~*YandexBot 1;
        ~*DuckDuckBot 1;
        ~*AhrefsBot 1;
        ~*SemrushBot 1;
        ~*MJ12bot 1;

        # AI crawlers (2025 edition)
        ~*ChatGPT 1;
        ~*ClaudeBot 1;
        ~*Anthropic 1;
        ~*OpenAI 1;
        ~*Perplexity 1;
        ~*DeepSeek 1;
        ~*YouBot 1;
        ~*GrokBot 1;
        ~*Gemini 1;
        ~*Atlas 1;
        ~*Comet 1;
        ~*AIbot 1;

        # Generic crawler keywords
        ~*crawler 1;
        ~*spider 1;
        ~*bot 1;
    }

    server {
        listen 80;
        server_name yourdomain.com;

        location / {
            # Two layers of protection: IP range + User-Agent
            if ($block_ai) {
                return 403;
            }
            if ($block_spider) {
                return 403;
            }

            # normal request handling
            root /var/www/html;
            index index.html index.php;
        }
    }
}
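
The two `if` blocks can also be folded into a single flag. This is a small sketch, assuming `$block_ai` and `$block_spider` are defined as in the configuration above; the variable name `$block_request` is made up for illustration.

```nginx
# Place in the http context, alongside the geo and map blocks above.
# The key is "<IP flag>:<UA flag>"; only "0:0" is let through.
map "$block_ai:$block_spider" $block_request {
    default 1;
    "0:0"   0;
}

server {
    listen 80;
    server_name yourdomain.com;

    location / {
        # Single check covering both the IP range and the User-Agent
        if ($block_request) {
            return 403;
        }

        root /var/www/html;
        index index.html index.php;
    }
}
```

The behavior matches the two separate checks: a request is rejected if either flag is set, so a Baiduspider User-Agent arriving from a listed IP range is still blocked.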

