


Configuring the User-Agent method of filtering crawlers in Nginx
It’s hard to see, it’s basically a user agent called “yisouspider” that swiped the screen. I don’t know where the spider is from at first glance. It’s so shameless. .
Find the root directory configuration area, add the user agent filter judgment statement, and find that the one called "yisouspider" will directly return 403
Note 1: If you need to add multiple filters, do this
($http_user_agent ~* "spider1|spider2|spider3|spider4")
, just separate it with |
Note 2: If you are using a subdirectory blog, like mine, then you need to find a section like "location /blog/" to modify it
location / { ......其它配置 if ($http_user_agent ~* "yisouspider") { return 403; } }
After completing the configuration and saving wq, reload nginx, then use the following command to test yourself and change the address yourself. If curl is not installed, I have no choice but to install it myself with apt or yum. It comes with a magic tool.
curl -i -a "yisouspider" www.slyar.com/blog/
Just see 403 returned, indicating that the configuration is successful
The above is the detailed content of Configuring the User-Agent method of filtering crawlers in Nginx. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undress AI Tool
Undress images for free

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics











NGINX and Apache are both powerful web servers, each with unique advantages and disadvantages in terms of performance, scalability and efficiency. 1) NGINX performs well when handling static content and reverse proxying, suitable for high concurrency scenarios. 2) Apache performs better when processing dynamic content and is suitable for projects that require rich module support. The selection of a server should be decided based on project requirements and scenarios.

NGINX is more suitable for handling high concurrent connections, while Apache is more suitable for scenarios where complex configurations and module extensions are required. 1.NGINX is known for its high performance and low resource consumption, and is suitable for high concurrency. 2.Apache is known for its stability and rich module extensions, which are suitable for complex configuration needs.

NGINX and Apache each have their own advantages and disadvantages, and the choice should be based on specific needs. 1.NGINX is suitable for high concurrency scenarios because of its asynchronous non-blocking architecture. 2. Apache is suitable for low-concurrency scenarios that require complex configurations, because of its modular design.

PHP code can be executed in many ways: 1. Use the command line to directly enter the "php file name" to execute the script; 2. Put the file into the document root directory and access it through the browser through the web server; 3. Run it in the IDE and use the built-in debugging tool; 4. Use the online PHP sandbox or code execution platform for testing.

Understanding Nginx's configuration file path and initial settings is very important because it is the first step in optimizing and managing a web server. 1) The configuration file path is usually /etc/nginx/nginx.conf. The syntax can be found and tested using the nginx-t command. 2) The initial settings include global settings (such as user, worker_processes) and HTTP settings (such as include, log_format). These settings allow customization and extension according to requirements. Incorrect configuration may lead to performance issues and security vulnerabilities.

Linux system restricts user resources through the ulimit command to prevent excessive use of resources. 1.ulimit is a built-in shell command that can limit the number of file descriptors (-n), memory size (-v), thread count (-u), etc., which are divided into soft limit (current effective value) and hard limit (maximum upper limit). 2. Use the ulimit command directly for temporary modification, such as ulimit-n2048, but it is only valid for the current session. 3. For permanent effect, you need to modify /etc/security/limits.conf and PAM configuration files, and add sessionrequiredpam_limits.so. 4. The systemd service needs to set Lim in the unit file

When configuring Nginx on Debian system, the following are some practical tips: The basic structure of the configuration file global settings: Define behavioral parameters that affect the entire Nginx service, such as the number of worker threads and the permissions of running users. Event handling part: Deciding how Nginx deals with network connections is a key configuration for improving performance. HTTP service part: contains a large number of settings related to HTTP service, and can embed multiple servers and location blocks. Core configuration options worker_connections: Define the maximum number of connections that each worker thread can handle, usually set to 1024. multi_accept: Activate the multi-connection reception mode and enhance the ability of concurrent processing. s

NGINXserveswebcontentandactsasareverseproxy,loadbalancer,andmore.1)ItefficientlyservesstaticcontentlikeHTMLandimages.2)Itfunctionsasareverseproxyandloadbalancer,distributingtrafficacrossservers.3)NGINXenhancesperformancethroughcaching.4)Itofferssecur
