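Do You Really Understand Nginx? Pitfalls of the upstream and proxy Modules

The upstream module and the proxy module (http_proxy_module) come up constantly in day-to-day work, especially when implementing load balancing and reverse proxying.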

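Used together, the upstream module and http_proxy_module provide the following:

Load balancing: the upstream module defines a group of backend servers and distributes requests among them according to a load-balancing method, while http_proxy_module does the actual forwarding of each request to the chosen server.
High availability: when a backend server becomes unavailable, the upstream module can route requests to the remaining available servers, and http_proxy_module proxies those requests to them.
Reverse proxying: http_proxy_module accepts client requests and forwards them to backend servers; pointing it at a server group defined by upstream adds load balancing and high availability.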

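What the upstream module does

Load balancing: grouping several backend servers in one upstream block lets Nginx spread client requests across them (weighted round-robin by default), which improves performance and throughput.
High availability: when a backend server is unavailable, Nginx can automatically pass the request to another available server, keeping the service reliable.
Health checks: open source Nginx checks backends passively. It counts failed attempts on real requests (max_fails, fail_timeout) and temporarily stops sending traffic to a server that fails too often. Active, periodic health checks are not part of open source Nginx; they require NGINX Plus or a third-party module.
Failover recovery: once a failed backend server is healthy again, Nginx resumes sending requests to it.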
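
As a rough illustration of the load-balancing role described above, the sketch below shows how a method and weights might be declared. The group name backend_sketch and the ports 7001 and 7002 are made up for this example and are not part of the setup used in this article.

```nginx
# Hypothetical upstream group; name and ports are illustrative only.
upstream backend_sketch {
    # With no method directive, nginx uses weighted round-robin.
    # Uncommenting the next line would instead prefer the server
    # with the fewest active connections (weights still apply).
    # least_conn;

    server 127.0.0.1:7001 weight=3;  # gets roughly 3 of every 4 requests
    server 127.0.0.1:7002;           # weight defaults to 1
}
```

Weights only express a ratio, so weight=3 next to the default weight=1 means about three quarters of the requests go to the first server while both are healthy.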

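My understanding (key points)

upstream defines several server addresses. When proxying, Nginx picks one of them for each request according to the upstream configuration, so the load is spread fairly evenly across the servers.
When one server is unavailable, upstream can automatically pass the request to the next address, chosen by the load-balancing method (which takes weight into account). This is how high availability is achieved.
upstream performs passive checks: if a server fails while handling real user requests, it is marked unavailable once max_fails failures occur within fail_timeout. For the following fail_timeout period no requests are sent to it; after that period Nginx tries it again.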
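
The passive check described above is driven entirely by two server parameters. The following is a minimal sketch of how they might be written; the group name passive_check_sketch and the ports are assumptions chosen for illustration.

```nginx
# Hypothetical group showing the passive-check parameters.
upstream passive_check_sketch {
    # When omitted, the defaults are max_fails=1 and fail_timeout=10s.
    # Here: 3 failed attempts within 30s mark the server unavailable
    # for the next 30s; after that nginx tries it again.
    server 127.0.0.1:7001 max_fails=3 fail_timeout=30s;
    server 127.0.0.1:7002 max_fails=3 fail_timeout=30s;

    # max_fails=0 would disable failure accounting for a server:
    # server 127.0.0.1:7003 max_fails=0;
}
```

Note that fail_timeout plays two roles: it is both the window in which failures are counted and the length of time the server is then skipped.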

Pitfall experiments

We usually start with a simple, standard configuration like the one below.

# log_format and upstream belong in the http context.
log_format main '$request_time $remote_addr -  "$request"  "$upstream_addr"- $upstream_status ';

# Two backends; 6002 receives twice as many requests as 6001.
upstream test {
    server localhost:6001 weight=1 max_fails=3 fail_timeout=30s;
    server localhost:6002 weight=2 max_fails=3 fail_timeout=30s;
    zone zone_for_test 1m;
}

# Front-end server: /proxy is forwarded to the "test" upstream group.
server {
    listen 80;
    server_name test.com;
    access_log /var/log/nginx/access_80.log main;
    root /var/www/html;

    location /proxy {
        proxy_pass http://test/;
    }

    location / {
        # default_type sets the Content-Type of the returned body;
        # add_header would send a second Content-Type header.
        default_type text/html;
        return 200 '<html><body><h1>Hello Test Com!</h1></body></html>';
    }
}

# Backend 1
server {
    listen 6001;
    server_name test6001.com;
    root /var/www/html;

    location / {
        default_type text/html;
        return 200 '<html><body><h1>Hello 6001!</h1></body></html>';
    }
}

# Backend 2
server {
    listen 6002;
    server_name test6002.com;
    root /var/www/html;
    index 6001.html;

    location / {
        default_type text/html;
        return 200 '<html><body><h1>Hello 6002!</h1></body></html>';
    }
}

However, this configuration does not actually deliver high availability. The experiments below show why, step by step.
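
When running experiments like these, it can help to log a little more per-attempt detail. The sketch below is one possible variant of the log format; the format name upstream_debug is hypothetical, and the variables used are standard nginx variables.

```nginx
# Goes in the http context. "upstream_debug" is a made-up format name.
# When a request is tried on several servers, $upstream_addr,
# $upstream_status and $upstream_response_time each hold one
# comma-separated value per attempt.
log_format upstream_debug '$remote_addr "$request" status=$status '
                          'upstream_addr="$upstream_addr" '
                          'upstream_status="$upstream_status" '
                          'upstream_response_time="$upstream_response_time"';
```

To use it, the access_log line in the port 80 server would reference upstream_debug instead of main.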

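Step one: simulate one backend going down completely

Delete the server block whose server_name is test6002.com from the configuration above, so that nothing listens on port 6002.

Requesting http://test.com/proxy many times never returns a service-unavailable error. The log file access_80.log contains entries like the one below, which shows that upstream did pass the request on to the next server: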

"127.0.0.1:6002, 127.0.0.1:6001"- 502, 200

Step two: simulate one backend returning 500 because of an application error

Change the server block whose server_name is test6002.com as follows, so that it always returns 500:

server {
    listen 6002;
    server_name test6002.com;
    root /var/www/html;
    index 6001.html;
    location / {
       return 500;
    }
}

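Request http://test.com/proxy many times again. This time some of the responses are 500 errors.

Whenever a 500 is returned, the log contains the entry below, which shows that upstream did not pass the request on to the next server: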

"127.0.0.1:6002"- 500

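Step three: simulate one backend returning 502 because of an application error

In step one, where the backend was completely down, Nginx recorded status 502 for the failed attempt. What happens if the backend itself deliberately returns 502?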

server {
    listen 6002;
    server_name test6002.com;
    root /var/www/html;
    index 6001.html;
    location / {
       return 502;
    }
}

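Request http://test.com/proxy many times again. This time some of the responses are 502 errors.

Whenever a 502 is returned, the log contains the entry below, which shows that upstream did not pass the request on to the next server: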

"127.0.0.1:6002"- 502

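Experiment summary

The three steps above cover the cases we run into most often, and lead to these conclusions:

With the default settings, upstream passes a request to the next server only when the backend is completely down (that is, when no response can be obtained from it).
When the backend does respond and returns 500 or 502 itself, that 500 or 502 is returned to the user directly.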

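Cause and fix

This is usually the frustrating part: searching online turns up few useful answers.

The official documentation does explain the cause. Its description of the max_fails parameter contains the following passage: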

max_fails=number
sets the number of unsuccessful attempts to communicate
with the server that should happen in the duration set by the fail_timeout parameter 
to consider the server unavailable for a duration also set by the fail_timeout parameter. 
What is considered an unsuccessful attempt is defined by the proxy_next_upstream, fastcgi_next_upstream, uwsgi_next_upstream, scgi_next_upstream, memcached_next_upstream, and grpc_next_upstream directives.

The key sentence is this one:

What is considered an unsuccessful attempt is defined by the proxy_next_upstream, fastcgi_next_upstream, uwsgi_next_upstream, scgi_next_upstream, memcached_next_upstream, and grpc_next_upstream directives.

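In other words, what Nginx counts as a failed attempt is determined by the directives below. Only the failure types listed in these directives cause the request to be passed to the next server.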

proxy_next_upstream
fastcgi_next_upstream
uwsgi_next_upstream
scgi_next_upstream
memcached_next_upstream
grpc_next_upstream

The first directive, proxy_next_upstream, belongs to the proxy module, so a fix looks within reach. Here is how the official documentation describes proxy_next_upstream:

Syntax: proxy_next_upstream error | timeout | invalid_header | http_500 | http_502 | http_503 | http_504 | http_403 | http_404 | http_429 | non_idempotent | off ...;
Default: proxy_next_upstream error timeout;
Context: http, server, location

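As shown, proxy_next_upstream has a default value, "error timeout". That explains the conclusions above: a refused connection counts as an error, but a 500 or 502 response that the backend returns itself does not.

So we can add 500 and 502 to the value of proxy_next_upstream to solve the problem. The modified configuration looks like this: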

server {
    listen 80;
    server_name test.com;
    access_log /var/log/nginx/access_80.log main;
    root /var/www/html;

    location /proxy {
        # Also treat 500 and 502 responses as failed attempts,
        # so the request is passed to the next server.
        proxy_next_upstream error timeout http_500 http_502;
        proxy_pass http://test/;
    }

    location / {
        # default_type sets the Content-Type of the returned body.
        default_type text/html;
        return 200 '<html><body><h1>Hello Test Com!</h1></body></html>';
    }
}

Summary

Setting proxy_next_upstream error timeout http_500 http_502 gives better high availability.
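
Retrying on 500 and 502 means a single slow or failing request can be attempted on several servers. One possible refinement, sketched below, is to bound the retries with proxy_next_upstream_tries and proxy_next_upstream_timeout. The values 2 and 5s are arbitrary assumptions for illustration, not recommendations from this article.

```nginx
location /proxy {
    proxy_next_upstream error timeout http_500 http_502;

    # Limit how many tries one request may use (0, the default, means no limit).
    proxy_next_upstream_tries 2;

    # Limit the total time during which a request can still be passed
    # to the next server (0, the default, means no limit).
    proxy_next_upstream_timeout 5s;

    proxy_pass http://test/;
}
```

Keep in mind that once a request has been sent to a server, nginx normally does not pass requests with non-idempotent methods (POST, LOCK, PATCH) to the next server unless non_idempotent is also listed in proxy_next_upstream. A POST that receives a 500 would therefore still be returned to the client as is.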
