Nginx安装配置

可以直接看到最下面的HTTPS.

Nginx安装

我的系统如下:

No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.3 LTS
Release:	16.04
Codename:	xenial

安装(如果有apache服务器, 建议卸载了, 或者改Nginx的默认端口):

sudo apt-get install nginx

此时已经开启了80端口, 并且配置处在etc/nginx

lsof -i:80

cd /etc/nginx

Nginx服务一般配置

将配置放于conf.d/*

PHP配置(可忽视)

server{
	listen 80;
	server_name php.youdomain.com;
	charset utf-8;
	access_log /data/logs/nginx/www.youdomain.com.log;
	#error_log /data/logs/nginx/www.youdomain.com.err;
    	
	location / {
        	root   /data/www/php/blog;
		index index.html index.php;
		#访问路径的文件不存在则重写URL转交给ThinkPHP处理
		if (!-e $request_filename) {
			rewrite  ^/(.*)$  /index.php/$1  last;
			break;
		}
	}
	
	## Images and static content is treated different
	location ~* ^.+.(jpg|jpeg|gif|css|png|js|ico|xml)$ {
	
		access_log        off;
		expires           30d;
		root /data/www/php/blog;
	 }

	location ~\.php/?.*$ {
   	root        /data/www/php/blog;
        fastcgi_pass   127.0.0.1:9000;
        fastcgi_index  index.php;
        #加载Nginx默认"服务器环境变量"配置
        include        fastcgi.conf;
        
        #设置PATH_INFO并改写SCRIPT_FILENAME,SCRIPT_NAME服务器环境变量
        set $fastcgi_script_name2 $fastcgi_script_name;
        if ($fastcgi_script_name ~ "^(.+\.php)(/.+)$") {
            set $fastcgi_script_name2 $1;
            set $path_info $2;
        }
        fastcgi_param   PATH_INFO $path_info;
        fastcgi_param   SCRIPT_FILENAME   $document_root$fastcgi_script_name2;
        fastcgi_param   SCRIPT_NAME   $fastcgi_script_name2;        
	}
}

反向代理配置

通过server_name, 用域名访问, 全部会到80端口, 根据域名会转发到8080

域名请A记录到该机器IP地址.


vim /etc/nginx/conf.d/www.youdomain.com.conf

server{
	listen 80;
	# 本地测试时可以将域名改为: 127.0.0.1
	server_name www.youdomain.com;
	charset utf-8;
	access_log /root/logs/nginx/www.youdomain.com.log;
	#error_log /data/logs/nginx/www.youdomain.com.err;
    location / {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://localhost:8080;
     }   
	 
    # 这个就是反爬虫文件了
    include /etc/nginx/anti_spider.conf;
}

日志文件要先建立:

sudo mkdir -p /root/logs/nginx

查看配置是否无误, 并重启:

sudo nginx -t
sudo service nginx restart
sudo nginx -s reload

访问127.0.0.1会发现502错误, 因为8080端口我们没开! 此时访问localhost会发现, 这时Nginx欢迎页面出来了, 这是默认80端口页面!

反爬虫配置

增加反爬虫配额文件:

sudo vim /etc/nginx/anti_spider.conf

#禁止Scrapy等工具的抓取  
if ($http_user_agent ~* (Scrapy|Curl|HttpClient)) {  
     return 403;  
}  
  
#禁止指定UA及UA为空的访问  
if ($http_user_agent ~ "WinHttp|WebZIP|FetchURL|node-superagent|java/|FeedDemon|Jullo|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|Java|Feedly|Apache-HttpAsyncClient|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|BOT/0.1|YandexBot|FlightDeckReports|Linguee Bot|^$" ) {  
     return 403;               
}  
  
#禁止非GET|HEAD|POST方式的抓取  
if ($request_method !~ ^(GET|HEAD|POST)$) {  
    return 403;  
}  

#屏蔽单个IP的命令是
#deny 123.45.6.7
#封整个段即从123.0.0.1到123.255.255.254的命令
#deny 123.0.0.0/8
#封IP段即从123.45.0.1到123.45.255.254的命令
#deny 124.45.0.0/16
#封IP段即从123.45.6.1到123.45.6.254的命令是
#deny 123.45.6.0/24

# 以下IP皆为流氓
deny 58.95.66.0/24;

在网站配置server段中都插入include /etc/nginx/anti_spider.conf, 见上文. 你可以在默认的80端口配置上加上此句:sudo vim sites-available/default

重启:

sudo nginx -s reload

爬虫UA常见:

FeedDemon             内容采集  
BOT/0.1 (BOT for JCE) sql注入  
CrawlDaddy            sql注入  
Java                  内容采集  
Jullo                 内容采集  
Feedly                内容采集  
UniversalFeedParser   内容采集  
ApacheBench           cc攻击器  
Swiftbot              无用爬虫  
YandexBot             无用爬虫  
AhrefsBot             无用爬虫  
YisouSpider           无用爬虫(已被UC神马搜索收购,此蜘蛛可以放开!)  
jikeSpider            无用爬虫  
MJ12bot               无用爬虫  
ZmEu phpmyadmin       漏洞扫描  
WinHttp               采集cc攻击  
EasouSpider           无用爬虫  
HttpClient            tcp攻击  
Microsoft URL Control 扫描  
YYSpider              无用爬虫  
jaunty                wordpress爆破扫描器  
oBot                  无用爬虫  
Python-urllib         内容采集  
Indy Library          扫描  
FlightDeckReports Bot 无用爬虫  
Linguee Bot           无用爬虫  

使用curl -A 模拟抓取即可,比如:

# -A表示User-Agent
# -X表示方法: POST/GET
# -I表示只显示响应头部
curl -X GET -I -A 'YYSpider' localhost

HTTP/1.1 403 Forbidden
Server: nginx/1.10.3 (Ubuntu)
Date: Fri, 08 Dec 2017 10:07:15 GMT
Content-Type: text/html
Content-Length: 178
Connection: keep-alive

模拟UA为空的抓取:

curl -I -A ' ' localhost

模拟百度蜘蛛的抓取:

curl -I -A 'Baiduspider' localhost

重定向或者静态配置

    # 静态资源的根目录
    root /data/index/;

    # 静态
    location /cn {
           index index.html;
           try_files $uri $uri/ /cn/index.html;
    }

    # 重定向
    location / {
           rewrite ^(.*)$ https://${server_name}/cn permanent;
    }

支持HTTPS

生成免费证书,根据提示需要进行域名解析,加一个DNS txt解析。

certbot certonly --preferred-challenges dns --manual -d "你的域名.com" --server https://acme-v02.api.letsencrypt.org/directory

IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/你的域名.com/fullchain.pem
   Your key file has been saved at:
   /etc/letsencrypt/live/你的域名.com/privkey.pem
   Your cert will expire on 2019-11-05. To obtain a new or tweaked
   version of this certificate in the future, simply run certbot
   again. To non-interactively renew *all* of your certificates, run
   "certbot renew"
 - If you like Certbot, please consider supporting our work by:

   Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
   Donating to EFF:                    https://eff.org/donate-le

重新续期。

certbot renew

生成的证书和密钥:

/etc/letsencrypt/live/你的域名.com/fullchain.pem

/etc/letsencrypt/live/你的域名.com/privkey.pem

随便进一个目录生成一些强有力的辅助配置:

cd /data/cert
openssl rand 48 > session_ticket.key
openssl  dhparam -out dhparam.pem 2048

最安全的Nginx配置你的域名.conf

server {
    listen      443 ssl http2;
    server_name  你的域名;

    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    #ssl on;
    ssl_certificate /etc/letsencrypt/live/你的域名.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/你的域名.com/privkey.pem;
    ssl_dhparam /data/cert/dhparam.pem;

    ssl_session_timeout 5m;
    ssl_session_cache builtin:1000 shared:SSL:10m;
    ssl_session_tickets on;
    # openssl rand 48 > session_ticket.key
    ssl_session_ticket_key /data/cert/session_ticket.key;
    #ssl_protocols SSLv2 SSLv3 TLSv1;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;

    ssl_ciphers "ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES128-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:DES-CBC3-SHA:HIGH:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4";
    #ssl_ciphers ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP;
    ssl_prefer_server_ciphers on;

    ssl_stapling on;
    ssl_stapling_verify on;
    ssl_trusted_certificate /etc/letsencrypt/live/你的域名.com/fullchain.pem;
    resolver 8.8.8.8 8.8.4.4  valid=300s;
    resolver_timeout 10s;
    add_header Strict-Transport-Security "max-age=31536000; includeSubdomains;";

    # 其他的一些配置放在这里
    access_log /root/logs/nginx/www.youdomain.com.log;
    #error_log /data/logs/nginx/www.youdomain.com.err;

    # 静态资源的根目录
    root /data/index/;

    # 静态
    location /cn {
           index index.html;
           try_files $uri $uri/ /cn/index.html;
    }

    # 重定向
    location / {
           rewrite ^(.*)$ https://${server_name}/cn permanent;
    }

    # 反向代理
    location /api {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://localhost:8080;
     }   
	 
    # 这个就是反爬虫文件了
    include /etc/nginx/anti_spider.conf;
    
}