Saturday, March 20, 2021

Network Time Protocol (NTP)

Introduction to NTP

  Devices across a network often need accurate time, which they get by synchronizing to a reference clock. One setup that depends heavily on accurate time is a Kerberized environment: Kerberos uses timestamps to determine the validity of tickets such as the TGT (Ticket Granting Ticket). One way to keep accurate time is the Network Time Protocol (NTP), which runs over UDP port 123.

  In a typical setup, a client has NTP installed and synchronizes to a reference clock, which is either a central NTP server within your organization or a publicly accessible NTP server such as 0.rhel.pool.ntp.org.

Strata and atomic clocks

In NTP, the "stratum" (plural "strata") is the level at which a reference clock sits, counted as its distance from an authoritative time source. Atomic clocks, the most accurate clocks in the world, sit at stratum 0.

Strata levels:
0   - most accurate clocks: atomic clocks, GPS, mobile phone systems
1   - devices with GPS/atomic clocks attached
2   - syncs from stratum 1; serves stratum 3 clients
...
15 - lowest valid stratum
16 - unsynchronized; devices here are not synced to any source

The Drift File

  The drift file records the frequency offset of your system's clock relative to a reference clock, in PPM (parts per million). On RHEL/CentOS it is /var/lib/ntp/drift. If this file is present when ntpd is started or restarted, ntpd initializes its frequency correction from the content of the file instead of measuring it again, so the clock settles quickly. You will see a line like the following in /var/log/messages:

  Mar 25 21:55:11 rhel6 ntpd[8991]: frequency initialized -37.951 PPM from /var/lib/ntp/drift

If the file doesn't exist, ntpd starts in a special mode for about 15 minutes while it measures the frequency error, then returns to normal mode. ntpd rewrites the drift file about once an hour by writing a temporary file and renaming it over the old one, so it is important that the ntp user has write permission on the /var/lib/ntp directory.

  A positive value means that your clock is fast compared to the reference and a negative value means it's slow. A day has 86,400 seconds, so dividing by 1,000,000 gives 0.0864 seconds per day for each PPM. That factor is what we need to convert the content of the drift file into seconds per day.

  Using the above excerpt from /var/log/messages as an example, the drift file's content is -37.951, and multiplying it by 0.0864 gives -3.2789664. It means that, left uncorrected, your system clock would fall behind the reference by about 3.3 seconds per day. That might not sound like a lot, but from a time synchronization perspective it adds up quickly: acceptable offsets are usually in the millisecond range.
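
The conversion above can be scripted. The following is a minimal sketch, assuming a POSIX shell with awk available; the function name ppm_to_sec_per_day is made up for illustration and is not part of the ntp package.

```shell
#!/bin/sh
# Convert a drift value in PPM to seconds per day.
# 86,400 s/day divided by 1,000,000 gives 0.0864 s/day for each PPM.
# ppm_to_sec_per_day is a hypothetical helper name, not an ntp tool.
ppm_to_sec_per_day() {
    awk -v ppm="$1" 'BEGIN { printf "%.3f\n", ppm * 0.0864 }'
}

# Value taken from the log excerpt above
ppm_to_sec_per_day -37.951   # prints -3.279
```

On a real system, the argument could be read from the drift file itself, for example: ppm_to_sec_per_day "$(cat /var/lib/ntp/drift)"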

The NTP daemon (NTPD)

When launched, you will see the following in ps output.

    /usr/sbin/ntpd -u ntp:ntp -p /var/run/ntpd.pid -g

That is the case in RHEL 6.X; in RHEL 7.X, the "-p" part is omitted. "-u ntp:ntp" drops the privileges of ntpd to the ntp user and ntp group, while "-p /var/run/ntpd.pid" names the file that holds the PID. By default, ntpd exits when it sees a difference of more than 1000 seconds between your time and the reference clock. "-g" overrides that default behavior once, letting ntpd set the clock regardless of the initial offset and then continue to sync to the reference.

Configuring an NTP client

1. Install NTP package
  # yum install -y ntp
2. Add your reference clocks (NTP servers)
  # vi /etc/ntp.conf
  server my.ntp1.server iburst  # iburst speeds up the initial sync by sending a burst of packets while the server is unreachable
  server my.ntp2.server
3. Start and enable NTPD at boot
  RHEL 6/SysV-based systems:
  # service ntpd start
  # chkconfig ntpd on
  RHEL 7.X/Systemd-based servers:
  # systemctl start ntpd
  # systemctl enable ntpd
4. Check if your system is synchronizing to the NTP server. NTP may take a while to sync.
  # ntpstat  # keyword to look for is "synchronised to X at stratum Y..."
  # ntpq -p
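
Because the first sync can take a while, the check in step 4 can be wrapped in a small polling loop. This is only a sketch: describe_ntpstat_rc and wait_for_sync are hypothetical helper names, and the exit codes are the ones documented in ntpstat(1) (0 = synchronised, 1 = not synchronised, 2 = state indeterminate, for example when ntpd cannot be contacted).

```shell
#!/bin/sh
# Map an ntpstat exit status to a short message (hypothetical helper).
describe_ntpstat_rc() {
    case "$1" in
        0) echo "synchronised" ;;
        1) echo "not synchronised" ;;
        2) echo "indeterminate" ;;
        *) echo "unexpected exit status: $1" ;;
    esac
}

# Poll ntpstat until it reports sync or the attempts run out.
# $1 = number of attempts (default 10), $2 = seconds between attempts (default 30)
wait_for_sync() {
    attempts="${1:-10}"
    interval="${2:-30}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        ntpstat >/dev/null 2>&1
        rc=$?
        echo "attempt $i: $(describe_ntpstat_rc "$rc")"
        if [ "$rc" -eq 0 ]; then
            return 0
        fi
        # Only wait if another attempt follows
        if [ "$i" -lt "$attempts" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    return 1
}
```

For example, "wait_for_sync 20 15" would check every 15 seconds for up to 20 attempts and return 0 as soon as ntpstat reports that the clock is synchronised.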

Configuring an NTP server

1. Install NTP package
  # yum install -y ntp
2. Add your reference clocks (NTP servers). Since we are configuring an NTP server to serve time to devices inside our organization, we sync it to a publicly accessible NTP pool.
  # vi /etc/ntp.conf
  server 0.centos.pool.ntp.org iburst
  server 1.centos.pool.ntp.org
3. Update ntp.conf to allow access to our clients
  # vi /etc/ntp.conf
  restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
  # hint: the line above is usually present but commented out in the default ntp.conf, so there's no need to memorize the format
4. Start and enable NTPD at boot
  RHEL 6/SysV-based systems:
  # service ntpd start
  # chkconfig ntpd on
  RHEL 7.X/Systemd-based servers:
  # systemctl start ntpd
  # systemctl enable ntpd
5. Open UDP port 123 on the firewall (if a firewall is active)
  RHEL 6.X/SysV-based systems:
  # iptables -I INPUT -m state --state NEW -p udp --dport 123 -j ACCEPT
  # service iptables save
  # hint: -I puts the rule ahead of any existing REJECT rule; "save" writes it to /etc/sysconfig/iptables so it survives a restart
  RHEL 7.X/Systemd-based servers:
  # firewall-cmd --add-service=ntp --permanent
  # firewall-cmd --reload
6. Monitor if you are syncing to an internet NTP pool
  # ntpstat  # keyword to look for is "synchronised to X at stratum Y..."
  # ntpq -p
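
To monitor this from a script, the source that ntpd has selected can be pulled out of the "ntpq -p" listing, since that row is tagged with a leading "*". A minimal sketch, assuming awk is available; selected_source is a hypothetical helper name and the sample rows are illustrative.

```shell
#!/bin/sh
# Print the remote and stratum of the peer tagged '*' (the selected source).
# selected_source is a hypothetical helper; it reads "ntpq -p" output on stdin.
selected_source() {
    awk '/^\*/ { print substr($1, 2), "stratum", $3 }'
}

# Illustrative rows in the "ntpq -p" layout
sample='     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*80.26.104.184   217.127.32.90    2 u   66  256  377  470.247   32.058  33.497
+128.95.231.7    140.142.2.8      3 u  254  256  377  217.646   -3.832   2.734'

printf '%s\n' "$sample" | selected_source   # prints: 80.26.104.184 stratum 2
```

On a live server the same function can be fed directly: ntpq -pn | selected_source. Empty output means no source has been selected yet.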

Understanding "ntpq -p" output

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*80.26.104.184   217.127.32.90    2 u   66  256  377  470.247   32.058  33.497
+128.95.231.7    140.142.2.8      3 u  254  256  377  217.646   -3.832   2.734
+64.112.189.11   128.10.252.6     2 u    2  256  377  258.208    2.395  47.246
 127.127.1.0     LOCAL(0)        10 l   56   64  377    0.000    0.000   0.002

Where:
  remote - reference clock/NTP server; 127.127.1.0 is the local clock driver (your own system clock)
    * = selected as primary time source
    + = candidate included in the combined computation
    - = discarded as an outlier
  refid - the time source that "remote" itself is synchronized to
  st - stratum of "remote"
  t - server type (u = unicast, b = broadcast, m = multicast, l = local)
  when - time since the last response from the server (in seconds)
  poll - how frequently the server is queried (in seconds)
  reach - success/failure record of the last 8 queries as an octal bitmask (rightmost bit = most recent query)
    377 = 11111111 = all of the last 8 queries succeeded
    257 = 10101111 = the last 4 queries succeeded;
    the 5th and 7th most recent queries failed
  delay - network round trip time (in milliseconds)
  offset - difference between local clock and remote clock (in milliseconds)
  jitter - difference of successive time values from server (high jitter might mean unstable clock or poor network performance)
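
The reach column is easier to read once it is expanded to bits. The sketch below assumes a POSIX shell; reach_bits and reach_successes are hypothetical helper names, not ntp tools. The leftmost bit is the oldest of the last 8 queries and the rightmost is the most recent.

```shell
#!/bin/sh
# Expand an octal reach value into its 8-bit form.
reach_bits() {
    value=$((0$1))   # the leading 0 makes the shell read the digits as octal
    bits=""
    i=7
    while [ "$i" -ge 0 ]; do
        bits="${bits}$(( (value >> i) & 1 ))"
        i=$((i - 1))
    done
    echo "$bits"
}

# Count how many of the last 8 queries succeeded.
reach_successes() {
    reach_bits "$1" | tr -cd '1' | wc -c | tr -d ' '
}

reach_bits 377        # prints 11111111
reach_bits 257        # prints 10101111
reach_successes 257   # prints 6
```

A value that stays below 377 long after startup suggests that some queries to that server are being lost.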

Sources

RHEL 6 Deployment Guide
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/pdf/Deployment_Guide/Red_Hat_Enterprise_Linux-6-Deployment_Guide-en-US.pdf

Everything about NTP

ntp.conf(5)  - contains options for ntp configuration
ntpd(8)  - NTP daemon
