Overview
--------
- Nginx can provide load balancing to a group of servers
- Advantages of load balancing:
  a. optimize resource utilization
  b. maximize throughput
  c. reduce latency
  d. ensure fault tolerance
Proxying Traffic to a group of servers
--------------------------------------
- setup is similar to the `Secure TCP to Upstream` KB
example config
|
http {
    upstream backend {
        # the `server` directive here is different from
        # the one used in virtual servers
        server backend1.example.com;
        # member 2
        server backend2.example.com;
        # backup
        server 192.0.0.1 backup;
    }
    server {
        location / {
            # call the upstream group by name
            proxy_pass http://backend;
        }
    }
}
|
actual testing
|
# version
nginx version: nginx/1.10.2
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC)
built with OpenSSL 1.0.1e-fips 11 Feb 2013

# proxy server config
[/etc/nginx/conf.d]_$ cat virt1.conf
upstream backend {
    server backend1.home.net;
    server backend2.home.net;
}
server {
    server_name virt1.home.net;
    location / {
        proxy_pass http://backend;
    }
}
[/etc/nginx/conf.d]_$

# backend1 config
[/etc/nginx/conf.d]_$ cat backend1.conf
server {
    listen 192.168.1.11:80;
    server_name backend1.home.net;
    root /data;
    location / {
        autoindex on;
    }
}
[/etc/nginx/conf.d]_$

# backend2 config
[/etc/nginx/conf.d]_$ cat backend2.conf
server {
    listen 192.168.1.12:80;
    server_name backend2.home.net;
    root /data;
    location / {
        autoindex on;
    }
}
[/etc/nginx/conf.d]_$
|
Load Balancing Methods
----------------------
Round-Robin
- requests are distributed evenly across servers
- server weights are taken into consideration
- default method (no directive needed to enable it)
|
upstream backend {
    server backend1.example.com;
    server backend2.example.com;
}
|
least_conn
- request is sent to the server w/ the least number of active connections
- server weights are taken into consideration
|
upstream backend {
    least_conn;
    server backend1.example.com;
    server backend2.example.com;
}
|
ip_hash
- destination of a request is determined from the client's IP (first 3 octets of IPv4 or the whole IPv6 address)
- guarantees that requests from the same IP go to the same server unless that server is unavailable
- if a server needs to be removed temporarily, it can be marked `down` to preserve the current
  hashing of client IPs (requests destined for that server go to the next server in the group)
|
upstream backend {
    ip_hash;
    server backend1.example.com;
    server backend2.example.com;
}

# in case one server is down for maintenance:
upstream backend {
    server backend1.example.com;
    server backend2.example.com;
    server backend3.example.com down;
}
|
hash
- generic hash method
- destination server is determined by a user-defined key (text, variable, or a combination, e.g. source IP:port)
- `consistent` option enables "ketama" consistent-hash load balancing
- requests are distributed across the servers based on the user-defined hash value
- if a server is added/removed, only a few keys are remapped, w/c minimizes cache misses
|
upstream backend {
    hash $request_uri consistent;
    server backend1.example.com;
    server backend2.example.com;
}
}
|
least_time
- available on NGINX Plus only
- selects the server w/ the lowest average latency and the least number of connections
- lowest average latency is computed based on one of:
  a. header    - time to receive the 1st byte from the server
  b. last_byte - time to receive the full response from the server
|
upstream backend {
    least_time header;
    server backend1.example.com;
    server backend2.example.com;
}
}
|
Server Weights
--------------
- by default, Nginx distributes requests among servers according to their weights using the round-robin method
- default weight is 1
example config
|
upstream backend {
    # This member has the highest weight of 5, meaning 5 out of 6
    # requests will go to this server.
    server backend1.example.com weight=5;
    # This member has a default weight of 1.
    server backend2.example.com;
    # This backup server doesn't receive requests unless both of the
    # servers above are down (weight=1).
    server 192.0.0.1 backup;
}
|
Server Slow Start
-----------------
- prevents a recently recovered server from being overwhelmed by connections, w/c may
  time out and cause the server to be marked as failed again
- allows the server to gradually recover its weight from zero to its nominal value
- this is done using the `slow_start` parameter of the `server` directive
- note: if there is only 1 server in a group, the following parameters will be ignored
  a. max_fails
  b. fail_timeout
  c. slow_start
config
|
upstream backend {
    server backend1.example.com slow_start=30s;
    server backend2.example.com;
    server 192.0.0.1 backup;
}
|
Enabling Session Persistence
----------------------------
- NGINX Plus can identify user sessions and route requests to the same upstream server
- NGINX Plus supports 3 session persistence methods using the `sticky` directive:
  1. sticky cookie
     - simplest session persistence method
     - adds a session cookie to the first response from the upstream group,
       identifying the server w/c sent the response
     - when the client issues the next request, it will contain the cookie and
       Nginx will route it to the same upstream server
  2. sticky route
     - when the client sends its 1st request, a route is assigned to that client
     - all subsequent requests are compared to the `route` option
     - route information is taken from a cookie or the URI
  3. cookie learn
     - NGINX Plus finds session identifiers by inspecting requests and responses
     - NGINX Plus learns w/c upstream server corresponds to w/c session identifier
     - session identifiers are passed in an HTTP cookie
     - if a request contains a session identifier already "learned", NGINX Plus
       forwards the request to the corresponding server
     - doesn't require keeping cookies on the client side
     - all info is kept server-side in a shared memory zone
     - most sophisticated of the three session persistence methods
Sample configs:
sticky cookie
|
upstream backend {
    server backend1.example.com;
    server backend2.example.com;
    # srv_id  -> cookie name
    # expires -> time until the cookie expires in the client's browser
    # domain  -> domain for w/c the cookie is set (optional)
    # path    -> path for w/c the cookie is set (optional)
    sticky cookie srv_id expires=1h domain=.example.com path=/;
}
|
sticky route
|
upstream backend {
    server backend1.example.com route=a;
    server backend2.example.com route=b;
    sticky route $route_cookie $route_uri;
}
|
cookie learn
|
upstream backend {
    # upstream servers create a session by setting the "EXAMPLECOOKIE" cookie in their responses
    server backend1.example.com;
    server backend2.example.com;
    # create -> indicates how a new session is created (required)
    # lookup -> indicates how to search for existing sessions (required)
    # zone   -> shared memory zone where all cookies are kept
    sticky learn
        create=$upstream_cookie_examplecookie
        lookup=$cookie_examplecookie
        zone=client_sessions:1m
        timeout=1h;
    # sequence:
    # 1. a new session is created from the cookie "EXAMPLECOOKIE" sent by the upstream server
    # 2. existing sessions are searched for in the cookie "EXAMPLECOOKIE" sent by the client
}
|
Limiting number of connections
------------------------------
- requires NGINX Plus
- the number of connections to an upstream server can be limited using the `max_conns` parameter
- if the `max_conns` limit has been reached, further requests can be placed in a queue if the `queue` directive is specified
- if the queue fills up, or the upstream server cannot be selected during the timeout period specified in `timeout`, the client receives an error
- NOTE: the `max_conns` limit is ignored if idle keepalive connections have been opened in other worker processes
- as a result, the number of connections can exceed `max_conns` in a configuration where the memory is shared w/ multiple worker processes
sample config
|
upstream backend {
    server backend1.example.com max_conns=3;
    server backend2.example.com;
    queue 100 timeout=70;
}
|
Passive Health Monitoring
-------------------------
- when Nginx considers a server unavailable, it stops sending requests to that server until it is considered active again
- `server` parameters w/c control when a server is considered unavailable:
  a. fail_timeout (default: 10 seconds)
  b. max_fails (default: 1 attempt)
sample config
|
upstream backend {
    server backend1.example.com;
    # server is marked unavailable for 30 seconds if 3 attempts fail within that period
    server backend2.example.com max_fails=3 fail_timeout=30s;
    # server is marked unavailable for 10 seconds (default) if 2 attempts fail within that period
    server backend3.example.com max_fails=2;
}
|
Active Health Monitoring
------------------------
- a more sophisticated way of monitoring compared to "Passive Health Monitoring"
- periodically sends special requests to the servers and checks their responses
- to activate: include `health_check` within the `location` block that passes requests to the group
sample config
|
http {
    upstream backend {
        # shared memory zone among worker processes
        # - stores the configuration of the server group
        # - enables workers to use the same set of counters to keep track
        #   of responses from servers in the group
        # - the counters include:
        #   a. current number of connections to each server in the group
        #   b. number of failed attempts to pass a request to a server
        zone backend 64k;
        server backend1.example.com;
        server backend2.example.com;
        server backend3.example.com;
        server backend4.example.com;
    }
    server {
        location / {
            # - by default, NGINX Plus sends a `/` request to each member of the group every 5 seconds
            # - if an error/timeout occurs (proxied server responds with a status code other than 2xx/3xx),
            #   the health check fails for that server
            # - a server that fails a health check is considered unhealthy and NGINX Plus stops sending
            #   requests to it until it passes a health check again
            proxy_pass http://backend;
            health_check;
        }
    }
}
|
default behaviour can be overridden using parameters of the `health_check` directive
|
location / {
    proxy_pass http://backend;
    health_check interval=10 fails=3 passes=2;
    # interval - interval between checks, in seconds
    # fails    - number of failed health checks for a server to be considered unhealthy
    # passes   - number of successful health checks for a server to be considered healthy again
}
|
setting a specific URI to request on every health check
|
location / {
    proxy_pass http://backend;
    health_check uri=/some/path;
    # every health check will append the uri to the IP/hostname of
    # each server in the group
    # e.g. http://backend1.example.com/some/path
}
|
specifying a custom condition that a response must satisfy for the health check to pass, using the `match` directive
|
http {
    ...
    match server_ok {
        status 200-399;              # status code must be between 200 and 399
        body !~ "maintenance mode";  # body must not match the regex "maintenance mode"
    }
    server {
        ...
        location / {
            proxy_pass http://backend;
            health_check match=server_ok;
        }
    }
}
|
other examples using `match`
|
# example 1
match welcome {
    status 200;
    header Content-Type = text/html;
    body ~ "Welcome to nginx!";
}

# example 2
match not_redirect {
    status ! 301-303 307;
    header ! Refresh;
}
|
Sharing Data with Multiple Worker Processes
-------------------------------------------
- if the `upstream` block doesn't include the `zone` directive:
  a. each worker process keeps its own copy of the server group config
  b. each maintains its own set of related counters
  c. the server group configuration isn't changeable
- if the `upstream` block does include the `zone` directive:
  a. the server group config is placed in a memory area shared among all workers
  b. workers utilize the same set of counters
  c. the group is dynamically configurable
- the `zone` directive is mandatory for the following (minimal example below):
  a. health checks
  b. on-the-fly configuration
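a minimal sketch of declaring a shared memory zone for a group, reusing the zone/server names from the examples above:
|
http {
    upstream backend {
        # `zone <name> <size>` places the group configuration and run-time counters
        # in shared memory, so all worker processes work with the same state
        zone backend 64k;
        server backend1.example.com;
        server backend2.example.com;
    }
    server {
        location / {
            proxy_pass http://backend;
        }
    }
}
|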
example scenario if the server configuration is not shared
|
For example, if the configuration of a group is not shared, each worker process maintains its
own counter for failed attempts to pass a request to a server (see the max_fails parameter).
In this case, each request gets to only one worker process. When the worker process that is
selected to process a request fails to transmit the request to a server, other worker
processes don't know anything about it. While some worker process can consider a server
unavailable, others may still send requests to this server. For a server to be definitively
considered unavailable, max_fails multiplied by the number of worker processes of failed
attempts should happen within the timeframe set by fail_timeout. On the other hand, the
zone directive guarantees the expected behavior.
|
some notes on the `least_conn` directive
|
The least_conn load balancing method might not work as expected without the zone directive,
at least on small loads. This method of tcp and http load balancing passes a request to the
server with the least number of active connections. Again, if the configuration of the group
is not shared, each worker process uses its own counter for the number of connections. And
if one worker process passes a request to a server, the other worker process can also
pass a request to the same server. However, you can increase the number of requests to
reduce this effect. On high loads requests are distributed among worker processes evenly,
and the least_conn load balancing method works as expected.
|
setting the size of the zone
- there are no exact settings; it all depends on usage patterns. Each feature, such as
  `sticky cookie`/`route`/`learn` session persistence, will affect the zone size
  (see the sketch after the quote below).
|
For example, the 256 Kb zone with the sticky_route session persistence method and a single
health check can hold up to:
- 128 servers (adding a single peer by specifying IP:port);
- 88 servers (adding a single peer by specifying hostname:port, hostname resolves to a single IP);
- 12 servers (adding multiple peers by specifying hostname:port, hostname resolves to many IPs).
|
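a sketch of that quoted scenario, combining directives already shown above; the 256k size follows the quoted estimate, and $route_cookie/$route_uri are assumed to be defined elsewhere (e.g. via map blocks):
|
http {
    upstream backend {
        # shared memory zone sized per the estimate quoted above
        zone backend 256k;
        server backend1.example.com route=a;
        server backend2.example.com route=b;
        # sticky route session persistence (NGINX Plus)
        sticky route $route_cookie $route_uri;
    }
    server {
        location / {
            proxy_pass http://backend;
            # the single health check assumed in the estimate (NGINX Plus)
            health_check;
        }
    }
}
|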
Configuring Load Balancing using DNS
------------------------------------
- NGINX Plus can monitor changes of server IP addresses in DNS and apply them w/o a restart
- this is done via the `resolver` directive together with the `resolve` parameter of `server`
- by default, Nginx re-resolves DNS records based on their TTL
  > this can be overridden using the `valid` parameter
- if a hostname resolves to multiple IPs, the IPs are stored in the configuration and load balanced
sample config
|
http {
    resolver 10.0.0.1 valid=300s ipv6=off;
    resolver_timeout 10s;
    server {
        location / {
            proxy_pass http://backend;
        }
    }
    upstream backend {
        zone backend 32k;
        least_conn;
        ...
        server backend1.example.com resolve;
        server backend2.example.com resolve;
    }
}
|
Load Balancing of Microsoft Exchange Servers
--------------------------------------------
- in release 7 and later, NGINX Plus can proxy traffic to Microsoft Exchange servers and load balance it
sample config
|
http {
    ...
    upstream exchange {
        zone exchange 64k;
        # enable proxying of requests with NTLM authentication
        ntlm;
        server exchange1.example.com;
        server exchange2.example.com;
    }
    server {
        listen              443 ssl;
        ssl_certificate     /etc/nginx/ssl/company.com.crt;
        ssl_certificate_key /etc/nginx/ssl/company.com.key;
        ssl_protocols       TLSv1 TLSv1.1 TLSv1.2;
        location / {
            # pass requests to the Exchange upstream group; NTLM requires
            # HTTP/1.1 keepalive connections to the upstream
            proxy_pass         https://exchange;
            proxy_http_version 1.1;
            proxy_set_header   Connection "";
        }
    }
}
|
On-the-Fly Configuration
------------------------
- in NGINX Plus, it is possible to modify the load balancing configuration on the fly:
  > view all servers
  > view a server in a group
  > modify a parameter for a particular server
  > add server(s)
  > remove server(s)
- the configuration is stored in shared memory
- changes are discarded every time the Nginx config file is reloaded
setup
|
http {
    ...
    # Configuration of the server group
    upstream appservers {
        zone appservers 64k;
        server appserv1.example.com weight=5;
        server appserv2.example.com:8080 fail_timeout=5s;
        server reserve1.example.com:8080 backup;
        server reserve2.example.com:8080 backup;
    }
    server {
        # Location that proxies requests to the group
        location / {
            proxy_pass http://appservers;
            health_check;
        }
        # This allows on-the-fly configuration via the API interface
        location /upstream_conf {
            upstream_conf;
            allow 127.0.0.1;
            deny all;
        }
    }
}
|
Configuring Persistence of On-the-Fly Configuration
---------------------------------------------------
- changes are stored in a special state file so they are not discarded on every config reload
setup
|
http {
    ...
    upstream appservers {
        zone appservers 64k;
        # This file is modified by configuration commands from the `upstream_conf`
        # API interface. You should avoid modifying this file directly.
        state /var/lib/nginx/state/appservers.conf;
        # All these servers should be moved to the state file using the upstream_conf API:
        # server appserv1.example.com weight=5;
        # server appserv2.example.com:8080 fail_timeout=5s;
        # server reserve1.example.com:8080 backup;
        # server reserve2.example.com:8080 backup;
    }
}
|
APIs used to configure Upstream servers on the fly
--------------------------------------------------
- view all backup servers
- add a new server to the group
- remove a server from a group
- modify a parameter for a particular server
(example requests below)
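example requests, assuming the `/upstream_conf` location configured in the setup above; these follow the query syntax of the `upstream_conf` API (since deprecated in favour of the newer NGINX Plus API), and the server address and id values are illustrative
|
# view all backup servers in the group
http://127.0.0.1/upstream_conf?upstream=appservers&backup=

# add a new server to the group
http://127.0.0.1/upstream_conf?add=&upstream=appservers&server=appserv3.example.com:8080&weight=2&max_fails=3

# remove a server from the group (id as reported by a previous view request)
http://127.0.0.1/upstream_conf?remove=&upstream=appservers&id=2

# modify a parameter for a particular server (here: mark server id=2 as down)
http://127.0.0.1/upstream_conf?upstream=appservers&id=2&down=
|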