Monday, June 25, 2018

Load Balancing in Nginx


Overview
--------

- Nginx can provide load balancing to a group of servers
- Advantages of load balancing
    a. optimize resource utilization
    b. maximize throughput
    c. reduce latency
    d. fault tolerance

Proxying Traffic to a group of servers
--------------------------------------

- setup is similar to `Secure TCP to Upstream` KB

example config
http {
    upstream backend {
        # `server` directive here is different from
        # the one used in virtual servers
        server backend1.example.com;

        # member 2
        server backend2.example.com;

        # backup
        server 192.0.0.1 backup;
    }
    server {
        location / {
            # call the upstream via name
            proxy_pass http://backend;
        }
    }
}
actual testing


# version

nginx version: nginx/1.10.2
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC)
built with OpenSSL 1.0.1e-fips 11 Feb 2013

# proxy server config
 
[/etc/nginx/conf.d]_$ cat virt1.conf
upstream backend {
  server backend1.home.net;
  server backend2.home.net;
 }

server {
  server_name virt1.home.net;

  location / {
    proxy_pass http://backend;
  }
}
[/etc/nginx/conf.d]_$

# backend1 config
 
[/etc/nginx/conf.d]_$ cat backend1.conf
server {
  listen 192.168.1.11:80;
  server_name backend1.home.net;
  root /data;

  location / {
    autoindex on;
  }
}
[/etc/nginx/conf.d]_$

# backend2 config
 
[/etc/nginx/conf.d]_$ cat backend2.conf
server {
  listen 192.168.1.12:80;
  server_name backend2.home.net;
  root /data;

  location / {
    autoindex on;
  }
}
[/etc/nginx/conf.d]_$

Load Balancing Methods
----------------------

Round-Robin

 - requests are distributed evenly across servers
 - server weights are taken into consideration
 - default method (no directive for enabling it)
upstream backend {
   server backend1.example.com;
   server backend2.example.com;
}
least_conn

 - request is sent to server w/ least number of
   active connections
 - server weights are taken into consideration
upstream backend {
    least_conn;

    server backend1.example.com;
    server backend2.example.com;
}
ip_hash

 - destination of the request is determined from the client's
   IP address (first 3 octets of an IPv4 address, or the whole IPv6 address)
 - guarantees that requests from the same IP go to the
   same server unless that server is unavailable
 - if a server needs to be removed temporarily, it
   can be marked as `down` to preserve the current hashing
   of client IPs (requests to that server will go
   to the next server in the group)
upstream backend {
    ip_hash;

    server backend1.example.com;
    server backend2.example.com;
}

in case 1 server is down for maintenance:

 
upstream backend {
    server backend1.example.com;
    server backend2.example.com;
    server backend3.example.com down;
}
hash

 - generic hash method
 - destination server is determined by a user-defined
   key (a text string, variable, or combination, e.g. source IP:PORT)
 - `consistent` option enables "ketama" consistent hash
   load balancing
 - requests are distributed across the servers based
   on the user-defined hashed key value
 - if a server is added/removed, only a few keys will
   be remapped, w/c minimizes cache misses
upstream backend {
    hash $request_uri consistent;

    server backend1.example.com;
    server backend2.example.com;
}
least_time

 - available in NGINX Plus only
 - selects the server w/ the lowest average latency and the
   least number of active connections
 - lowest average latency is computed based on:
    a. header - time to receive 1st byte from server
    b. last_byte - time to receive full response
                   from server
upstream backend {
    least_time header;

    server backend1.example.com;
    server backend2.example.com;
}

Server Weights
--------------

- by default, Nginx distributes requests among servers according to their
  weights using round-robin method
- default weight is 1

example config
upstream backend {

    # This member has the highest weight of 5, meaning 5 out of every 6
    # requests will go to this server.
    server backend1.example.com weight=5;

    # This member has a default weight of 1.
    server backend2.example.com;
   
    # this server doesn't receive requests unless both of the servers
    # above are down (it keeps the default weight of 1)
    server 192.0.0.1 backup;
}

Server Slow Start
-----------------

- prevents a recently recovered server from being overwhelmed by connections w/c
  may time out and cause the server to be marked as failed again
- allows the server to gradually recover its weight from zero to its nominal value
- this is done using the `slow_start` parameter of the `server` directive
- note: if there is only 1 server in a group, the following parameters will be
  ignored
    a. max_fails
    b. fail_timeout
    c. slow_start

config
upstream backend {
    server backend1.example.com slow_start=30s;
    server backend2.example.com;
    server 192.0.0.1 backup;
}

Enabling Session Persistence
----------------------------

- NGINX Plus can identify user sessions and route requests to the same upstream
  server
- NGINX Plus supports 3 session persistence methods using the `sticky` directive:
    1. sticky cookie
         - simplest session persistence method
         - adds a session cookie to the first response from the upstream group,
           identifying the server w/c sent the response
         - when the client issues the next request, it will contain the cookie
           and Nginx will route it to the same upstream server
    2. sticky route
         - when the client sends its 1st request, a route is assigned to that client
         - all subsequent requests are compared to the `route` parameter of the
           `server` directive
         - route information is taken from a cookie or the request URI
    3. cookie learn
         - NGINX Plus finds session identifiers by inspecting requests and
           responses
         - NGINX Plus learns w/c upstream server corresponds to w/c session
           identifier
         - session identifiers are passed in an HTTP cookie
         - if a request contains a session identifier already "learned", NGINX
           Plus forwards the request to the corresponding server
         - doesn't require keeping cookies on the client side
         - all info is kept server-side in a shared memory zone
         - most sophisticated session persistence method

Sample configs:

sticky cookie
upstream backend {
    server backend1.example.com;
    server backend2.example.com;

    # srv_id  -> cookie name
    # expires -> time cookie will expire on client browser
    # domain  -> domain for w/c cookie is set (optional)
    # path    -> path for w/c cookie is set (optional)
    sticky cookie srv_id expires=1h domain=.example.com path=/;
}
sticky route
upstream backend {
    server backend1.example.com route=a;
    server backend2.example.com route=b;

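    # note: $route_cookie and $route_uri are not built-in variables; they are
    # assumed to be defined elsewhere (e.g. via `map` blocks that extract the
    # route value from a cookie or from the request URI)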
    sticky route $route_cookie $route_uri;
}
cookie learn
upstream backend {
   # the upstream application creates a session by setting the "EXAMPLECOOKIE"
   # cookie in its response; NGINX Plus learns w/c server set it and routes
   # later requests carrying that cookie back to the same server
   server backend1.example.com;
   server backend2.example.com;


   # create -> indicates how a new session is created (required)
   # lookup -> indicates how to search for existing sessions (required)
   # zone   -> shared memory zone where all session information is kept
   sticky learn
       create=$upstream_cookie_examplecookie
       lookup=$cookie_examplecookie
       zone=client_sessions:1m
       timeout=1h;
   # sequence:
   #   1. new session is created from cookie "EXAMPLECOOKIE" sent by upstream server
   #   2. existing sessions are searched in cookie "EXAMPLECOOKIE"
}

Limiting number of connections
------------------------------

- requires NGINX Plus
- the number of connections to an upstream server can be limited using the
  `max_conns` parameter
- if the `max_conns` limit has been reached, further requests can be placed in a
  queue if the `queue` directive is specified
- if the queue is full or the upstream server cannot be selected within the
  period specified by the `timeout` parameter, the client receives an error
- NOTE: `max_conns` will be ignored if idle keepalive connections have been
  opened in other worker processes
- as a result, the number of connections can exceed `max_conns` in a
  configuration where the memory is shared w/ multiple worker processes

sample config
upstream backend {
    server backend1.example.com  max_conns=3;
    server backend2.example.com;

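    # hold up to 100 requests waiting for a free server; a queued request
    # times out w/ an error after 70 seconds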
    queue 100 timeout=70;
}

Passive Health Monitoring
-------------------------

- when Nginx considers a server unavailable, it stops sending requests to that
  server until it is considered active again
- `server` parameters w/c control when a server is considered unavailable:
    a. fail_timeout (default: 10 seconds)
    b. max_fails (default: 1 attempt)

sample config
upstream backend {               
    server backend1.example.com;

    # server is marked unavailable for 30 seconds after 3 failed attempts within 30 seconds
    server backend2.example.com max_fails=3 fail_timeout=30s;

    # server is marked unavailable for 10 seconds (the default fail_timeout) after 2 failed attempts
    server backend3.example.com max_fails=2;
}

Active Health Monitoring
------------------------

- a more sophisticated way of monitoring compared to "Passive Health Monitoring"
- periodically sends special requests to servers and checks their response
- to activate: include `health_check` within the `location` block that passes
  request to the group

sample config
http {
    upstream backend {
       
        # shared memory zone among worker processes
        #   - stores configuration of server group
        #   - enables workers to use same set of counters to keep track
        #     of responses from servers in the group
        #   - the counters include:
        #       a. current number of connections to each server in the group
        #       b. number of failed attempts to pass a request to a server
        zone backend 64k;

        server backend1.example.com;
        server backend2.example.com;
        server backend3.example.com;
        server backend4.example.com;
    }

    server {
        location / {
            # - by default, NGINX Plus sends a request for `/` to each member of the group every 5 seconds
            # - if an error/timeout occurs (or the proxied server responds with a status code other than 2xx/3xx),
            #   the health check fails for that server
            # - a server that fails a health check is considered unhealthy and NGINX Plus stops sending requests
            #   to it until it passes a health check again
            proxy_pass http://backend;
            health_check;
        }
    }
}
the default behaviour can be overridden
using parameters of the `health_check` directive
location / {
    proxy_pass http://backend;
    health_check interval=10 fails=3 passes=2;
    # interval - interval between checks in seconds
    # fails    - # of failed health checks to consider a server unhealthy
    # passes   - # of successful health checks to consider a server healthy
}
setting a specific URI to request
on every health check
location / {
    proxy_pass http://backend;
    health_check uri=/some/path;
    # every health check will append the uri to the IP/hostname of
    # each server in the group
    # e.g. a check against backend1.example.com becomes a request
    # for backend1.example.com/some/path
}
specifying custom conditions a response
must satisfy for the health check to
pass, using the `match` directive
http {
    ...

    match server_ok {
        status 200-399;  # response status must be between 200 and 399
        body !~ "maintenance mode";  # response body must not match the regex "maintenance mode"
    }

    server {
        ...

        location / {
            proxy_pass http://backend;
            health_check match=server_ok;
        }
    }
}
more examples using `match`
# example 1
match welcome {
    status 200;
    header Content-Type = text/html;
    body ~ "Welcome to nginx!";
}

# example 2
match not_redirect {
    status ! 301-303 307;  # status must not be 301-303 or 307
    header ! Refresh;      # the "Refresh" header must not be present
}

Sharing Data with Multiple Worker Processes
-------------------------------------------

- if the `upstream` block doesn't include the `zone` directive:
    a. each worker process keeps its own copy of the server group config
    b. each maintains its own set of related counters
    c. the server group configuration cannot be modified on the fly
- if the `upstream` block does include the `zone` directive:
    a. the server group config is placed in a memory area shared among all workers
    b. workers utilize the same set of counters
    c. the configuration is dynamically configurable
- `zone` directive is mandatory for:
    a. health checks
    b. on-the-fly configurations
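
example (a minimal sketch; the group name `backend`, the member hostnames, and
the 64k zone size are placeholder values)
upstream backend {
    # place the group configuration and its run-time counters in shared
    # memory so that all worker processes see the same state
    zone backend 64k;

    server backend1.example.com;
    server backend2.example.com;
}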

example scenario if server configuration
is not shared
For example, if the configuration of a group is not shared, each worker process maintains its
own counter for failed attempts to pass a request to a server (see the max_fails parameter).
In this case, each request gets to only one worker process. When the worker process that is
selected to process a request fails to transmit the request to a server, the other worker
processes don't know anything about it. While some worker process may consider a server
unavailable, others may still send requests to this server. For a server to be definitively
considered unavailable, the number of failed attempts within the timeframe set by fail_timeout
must reach max_fails multiplied by the number of worker processes. On the other hand, the
zone directive guarantees the expected behavior.
some notes on `least_conn` directive
The least_conn load balancing method might not work as expected without the zone directive,
at least on small loads. This method of TCP and HTTP load balancing passes a request to the
server with the least number of active connections. Again, if the configuration of the group
is not shared, each worker process uses its own counter for the number of connections. And
if one worker process passes a request to a server, another worker process can also
pass a request to the same server. However, this effect diminishes as the number of requests
grows: on high loads requests are distributed among worker processes evenly,
and the least_conn load balancing method works as expected.
setting the size of the zone

 - there are no exact sizing rules
 - it all depends on usage patterns
 - each feature, such as:
     `sticky cookie`
     `sticky route`
     `sticky learn`
   will affect the zone size.
For example, the 256 Kb zone with the sticky_route session persistence method and a single
health check can hold up to:

128 servers (adding a single peer by specifying IP:port);
88 servers (adding a single peer by specifying hostname:port, hostname resolves to single IP);
12 servers (adding multiple peers by specifying hostname:port, hostname resolves to many IPs).

Configuring Load Balancing using DNS
------------------------------------

- Nginx Plus can monitor changes of server IP addresses and apply them w/o a restart
- this is done via the `resolver` directive and the `resolve` parameter of the
  `server` directive
- by default, Nginx resolves DNS records based on their TTL
    > this can be overridden using the `valid` parameter
- if a hostname resolves to multiple IPs, the IPs are stored in the configuration
  and load balanced

sample config
http {
    resolver 10.0.0.1 valid=300s ipv6=off;
    resolver_timeout 10s;

    server {
        location / {
            proxy_pass http://backend;
        }
    }
  
    upstream backend {
        zone backend 32k;
        least_conn;
        ...
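        # `resolve` tells NGINX Plus to periodically re-resolve the hostname
        # (using the `resolver` above) and update the group w/o a restart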
        server backend1.example.com resolve;
        server backend2.example.com resolve;
    }
}

Load Balancing of Microsoft Exchange Servers
--------------------------------------------

- in Release 7 and later, Nginx Plus can proxy traffic to Microsoft Exchange
  servers and load balance it

sample config
http {
    ...
    upstream exchange {
        zone exchange 64k;
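        # `ntlm` allows proxying requests that use NTLM authentication
        # (Exchange relies on NTLM)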
        ntlm;
        server exchange1.example.com;
        server exchange2.example.com;
    }

    server {
        listen              443 ssl;
        ssl_certificate     /etc/nginx/ssl/company.com.crt;
        ssl_certificate_key /etc/nginx/ssl/company.com.key;
        ssl_protocols       TLSv1 TLSv1.1 TLSv1.2;

        location / {
            proxy_pass         https://exchange;
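            # HTTP/1.1 with an empty Connection header keeps upstream
            # connections alive, w/c NTLM authentication relies on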
            proxy_http_version 1.1;
            proxy_set_header   Connection "";
        }
    }
}

On-the-Fly Configuration
------------------------

- In Nginx Plus, it is possible to modify load balancing configuration
    > view all servers
    > view a server in a group
    > modify parameter for a particular server
    > add server(s)
    > remove server(s)
- the modified configuration is stored in shared memory
- changes are discarded every time the Nginx config file is reloaded

setup
http {
    ...
    # Configuration of the server group
    upstream appservers {
        zone appservers 64k;

        server appserv1.example.com      weight=5;
        server appserv2.example.com:8080 fail_timeout=5s;

        server reserve1.example.com:8080 backup;
        server reserve2.example.com:8080 backup;
    }

    server {
        # Location that proxies requests to the group
        location / {
            proxy_pass http://appservers;
            health_check;
        }

        # This allows on-the-fly configuration via API interface
        location /upstream_conf {
            upstream_conf;
            allow 127.0.0.1;
            deny  all;
        }
    }
}

Configuring Persistence of On-the-Fly Configuration
---------------------------------------------------

- changes are stored in a special state file so they are not discarded on every
  config reload

setup
http {
    ...
    upstream appservers {
        zone appservers 64k;
        # This file is modified by configuration commands from `upstream_conf`
        # API interface. You should avoid modifying this file directly.
        state /var/lib/nginx/state/appservers.conf;

        # All these servers should be moved to the file using the upstream_conf API:
        # server appserv1.example.com      weight=5;
        # server appserv2.example.com:8080 fail_timeout=5s;
        # server reserve1.example.com:8080 backup;
        # server reserve2.example.com:8080 backup;
    }
}

APIs used to configure Upstream servers on the fly
--------------------------------------------------

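- the `/upstream_conf` location defined in the on-the-fly setup above accepts
  plain HTTP requests for viewing and modifying the server group
- the examples below are a sketch based on the query parameters of the
  `upstream_conf` module; exact parameter names and behaviour may differ
  between NGINX Plus releases

# view all servers of the group
http://127.0.0.1/upstream_conf?upstream=appservers

# view a single server by its id
http://127.0.0.1/upstream_conf?upstream=appservers&id=0

# modify a parameter of a particular server (e.g. mark it down)
http://127.0.0.1/upstream_conf?upstream=appservers&id=0&down=

# add a server to the group
http://127.0.0.1/upstream_conf?upstream=appservers&add=&server=appserv3.example.com:8080

# remove a server from the group
http://127.0.0.1/upstream_conf?upstream=appservers&remove=&id=0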
