Logs Not Searchable or Not Coming In Nagios Log Server


The "Logs Not Searchable or Not Coming In" problem in Nagios Log Server shows up when we run a query in a dashboard and the logs we expect are missing.

Here at Ibmi Media, as part of our Server Management Services, we regularly help our customers fix Nagios-related errors.

In this context, we shall look into the causes of this Nagios error and how to tackle it.

Nature of the error: Logs Not Searchable or Not Coming In Nagios Log Server

As previously mentioned, logs may not show up when we run a query in a dashboard, even though they should. Here, we will use a scenario of a remote server sending syslogs to provide a clear troubleshooting path:

i. Log Server

Name: nls-c7x-x64

IP: 10.25.5.86

Listening Port: TCP 5544

ii. Remote Server Sending Logs

Name: centos12

IP: 10.25.13.30

Sending Port: TCP 5544

OS: CentOS 6.7 x64

Next, let us go through the methods to fix Logs Not Searchable or Not Coming In Nagios Log Server.

Remote Server – Check Rsyslog Config

This server has already been set up to send logs to nls-c7x-x64 using the setup steps in the Log Server GUI.

To confirm this has been done, we check that the following file exists and contains the forwarding rule:

/etc/rsyslog.d/99-nagioslogserver.conf

### Begin forwarding rule for Nagios Log Server NAGIOSLOGSERVER

$WorkDirectory /var/lib/rsyslog # Where spool files will live NAGIOSLOGSERVER
$ActionQueueFileName nlsFwdRule0 # Unique name prefix for spool files NAGIOSLOGSERVER
$ActionQueueMaxDiskSpace 1g # 1GB space limit (use as much as possible) NAGIOSLOGSERVER
$ActionQueueSaveOnShutdown on # Save messages to disk on shutdown NAGIOSLOGSERVER
$ActionQueueType LinkedList # Use asynchronous processing NAGIOSLOGSERVER
$ActionResumeRetryCount -1 # Infinite retries if host is down NAGIOSLOGSERVER

# Remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional NAGIOSLOGSERVER

*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER

### End of Nagios Log Server forwarding rule NAGIOSLOGSERVER

It is important to note here the following line:

*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER

This assumes that the server centos12 can resolve the address nls-c7x-x64. Otherwise, it will not be able to send its logs.

To confirm this, we execute the following command on centos12:

ping nls-c7x-x64 -c 1

We expect an output similar to this if it can successfully resolve nls-c7x-x64:

PING nls-c7x-x64.box293.local (10.25.5.86) 56(84) bytes of data.
64 bytes from nls-c7x-x64.box293.local (10.25.5.86): icmp_seq=1 ttl=64 time=0.273 ms

--- nls-c7x-x64.box293.local ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 2ms
rtt min/avg/max/mdev = 0.273/0.273/0.273/0.000 ms

On the other hand, we can expect an output similar to this if it cannot resolve nls-c7x-x64:

ping: unknown host nls-c7x-x64
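Because ping also depends on ICMP being allowed, a failed ping does not always point to a name resolution problem. As a minimal sketch, assuming the getent utility is available on the remote server (it normally is on CentOS), name resolution can be checked on its own. The function name resolve_host is hypothetical and only used for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the first address the system resolver
# returns for the host name in $1, or nothing if it cannot be resolved.
resolve_host() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# Example (run on centos12):
#   resolve_host nls-c7x-x64
```

If the function prints nothing, the name cannot be resolved and either DNS or /etc/hosts needs attention before rsyslog can forward anything.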

Going back to that config line:

*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER

The @@ indicates that the logs are sent over TCP, and the number after the colon (5544) is the destination port.

If it were UDP, there would be only one @.
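To make the reading of that line concrete, here is a small sketch that splits a forwarding rule into protocol, host and port. The function name parse_forward_rule is hypothetical, and the fallback to port 514 when no port is given is an assumption based on the standard syslog port mentioned in the config comment:

```shell
#!/bin/sh
# Hypothetical helper: describe an rsyslog forwarding rule such as
#   *.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER
# Prints "<protocol> <host> <port>".
parse_forward_rule() {
    # The target is the second whitespace-separated field of the rule.
    target=$(printf '%s\n' "$1" | awk '{ print $2 }')
    case "$target" in
        @@*) proto=tcp; hostport=${target#@@} ;;   # two @ signs: TCP
        @*)  proto=udp; hostport=${target#@} ;;    # one @ sign: UDP
        *)   echo "not a forwarding rule" >&2; return 1 ;;
    esac
    host=${hostport%%:*}
    case "$hostport" in
        *:*) port=${hostport##*:} ;;
        *)   port=514 ;;   # assumption: standard syslog port when omitted
    esac
    echo "$proto $host $port"
}

parse_forward_rule '*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER'
# prints: tcp nls-c7x-x64 5544
```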

Remote Server – Check Rsyslog Is Running

Assuming the config is correct, we may want to make sure that rsyslogd is running:

service rsyslog status

We can expect an output like this if it is running:

rsyslogd (pid 2098) is running…

We can expect an output similar to this if it is not running:

rsyslogd is stopped

If it is not running, we should start it:

service rsyslog start

Remote Server – Check Firewall Rules

We want to make sure that the iptables firewall allows outbound traffic. By default, there are no restrictions on outbound traffic.

To confirm this, we execute the following command:

iptables --list

We expect an output similar to this:

Chain INPUT (policy ACCEPT)

target prot opt source destination

ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED

ACCEPT icmp -- anywhere anywhere

ACCEPT all -- anywhere anywhere

ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh

REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)

target prot opt source destination

REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)

target prot opt source destination

Specifically, this last output is what we need to look at:

Chain OUTPUT (policy ACCEPT)

target prot opt source destination

The first line shows (policy ACCEPT), which means there is no restriction at the top level (it would say DROP if there was).

The second line is simply headings for all the outbound rules that have been defined. Because there is no third line, there are NO outbound rules defined so the default here is to ACCEPT all outbound traffic (allow it).
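As a rough sketch of the same check done in a script, the function below reads iptables --list output on standard input and prints the default policy of one chain. The name chain_policy is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read "iptables --list" output on stdin and print
# the default policy (ACCEPT, DROP, ...) of the chain named in $1.
chain_policy() {
    awk -v chain="$1" '
        $1 == "Chain" && $2 == chain {
            gsub(/[()]/, "", $4)   # "ACCEPT)" becomes "ACCEPT"
            print $4
            exit
        }'
}

# Example (run as root on the remote server):
#   iptables --list | chain_policy OUTPUT
```

A result of ACCEPT with no rules listed under the chain matches the situation described above, where all outbound traffic is allowed.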

If we had a restricted environment where outbound rules were DROP, we would need to add an outbound firewall rule for TCP port 5544 to nls-c7x-x64 on 10.25.5.86:

iptables -I OUTPUT -p tcp --destination-port 5544 -d 10.25.5.86 -j ACCEPT

Then run:

service iptables save

Remote Server – Watch Outbound Traffic

To confirm that the log traffic is leaving the remote server we can run a tcpdump to watch the traffic.

First, we must install tcpdump:

yum -y install tcpdump

Wait while tcpdump is installed.

Now we execute the following command to watch the traffic:

tcpdump src host 10.25.13.30 and tcp dst port 5544 and dst host 10.25.5.86

We will receive this message first:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

An example of traffic flow is as follows:

16:43:49.001130 IP centos12.box293.local.60907 > nls-c7x-x64.box293.local.5544: Flags [P.],
seq 2751017526:2751017581, ack 431734015, win 115, options [nop,nop,TS val 93111400 ecr 92667575], length 55

If we do not see any traffic, it may just be that nothing is being logged and hence there is nothing to send. We can easily add a test entry to rsyslog which will generate traffic:

Open an additional SSH session to the remote server.

Execute the following command:

logger TroubleshootingTest

In our other SSH session, we should now see a line of traffic that confirms that rsyslog is sending the logs onto nls-c7x-x64.
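When the test is repeated several times, a fixed message such as TroubleshootingTest makes it hard to tell one run from another. One possible variation, shown here only as a sketch, is to append the current epoch time so each test message is unique. The function name make_test_token is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: build a unique, purely alphanumeric test message
# by appending the current epoch seconds to the fixed prefix.
make_test_token() {
    echo "TroubleshootingTest$(date +%s)"
}

token=$(make_test_token)
echo "Test message for this run: $token"
# On the remote server, send it through syslog with:
#   logger "$token"
```

The printed token can then be used in place of TroubleshootingTest when searching later on.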

Press Ctrl+C to stop the tcpdump.

Log Server – Watch Inbound Traffic

To confirm that the log traffic is entering the log server we can run a tcpdump. This is similar to the previous steps except it confirms that the traffic has made it through any routers or firewalls between the remote server and the log server.

First, we must install tcpdump with this command:

RHEL|CentOS

yum install -y tcpdump

Debian|Ubuntu

apt-get install -y tcpdump

Now we execute the following command to watch the traffic:

tcpdump src host 10.25.13.30 and tcp dst port 5544 and dst host 10.25.5.86

We will receive this message first:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

An example of traffic flow is as follows:

16:52:42.509481 IP centos12.box293.local.60907 > nls-c7x-x64.box293.local.5544: Flags [P.],
seq 2751017651:2751017706, ack 431734015, win 115, options [nop,nop,TS val 93644443 ecr 92674681], length 55

If we do not see any traffic, it may just be that nothing is being logged and hence there is nothing to send. We can easily add a test entry to rsyslog which will generate traffic:

Open an additional SSH session to the remote server.

Execute the following command:

logger TroubleshootingTest

In our log server SSH session, we should now see a line of traffic that confirms that the traffic is hitting the log server.

Press Ctrl+C to stop the tcpdump.

If we do not see any traffic, then there may be a firewall or router blocking the traffic.
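The capture on the remote server and the capture on the log server use the same filter expression, so it can help to build it in one place and reuse it on both hosts. A minimal sketch, with build_filter as a hypothetical function name:

```shell
#!/bin/sh
# Hypothetical helper: build the tcpdump filter used in this guide from
# a source IP ($1), a destination IP ($2) and a TCP destination port ($3).
build_filter() {
    echo "src host $1 and tcp dst port $3 and dst host $2"
}

build_filter 10.25.13.30 10.25.5.86 5544
# prints: src host 10.25.13.30 and tcp dst port 5544 and dst host 10.25.5.86

# On either host (as root):
#   tcpdump -n "$(build_filter 10.25.13.30 10.25.5.86 5544)"
```

The -n option only stops tcpdump from resolving names, so addresses are shown as IPs. Seeing packets on the remote server but not on the log server with the same filter points to something in between dropping the traffic.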

Log Server – Check Firewall Rules

We want to make sure that the iptables firewall allows inbound traffic. By default, there are restrictions on inbound traffic; however, Nagios Log Server creates the firewall rules to allow the traffic.

RHEL 6|CentOS 6

There are separate firewall services for IPv4 and IPv6, so each needs its own commands.

First, check the status of the firewall:

IPv4

service iptables status

IPv6

service ip6tables status

If the firewall is running, it should produce output like:

Table: filter

Chain INPUT (policy ACCEPT)

num target prot opt source destination

1 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
2 ACCEPT icmp -- 0.0.0.0/0 0.0.0.0/0
3 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
4 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:22
5 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:2057
6 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:2056
7 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:5544
8 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:3515
9 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpts:9300:9400
10 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:443
11 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:80
12 ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 state NEW udp dpt:5544
13 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)

num target prot opt source destination

1 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)

num target prot opt source destination

Specifically, these lines tell us that the firewall rule exists and is allowing inbound UDP and TCP traffic on port 5544:

7 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:5544

12 ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 state NEW udp dpt:5544
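Picking those two lines out of a long rule list by eye is error-prone, so here is a small sketch that looks for them. The function name has_accept_rule is hypothetical, and the match only covers single-port rules (dpt:), not port ranges (dpts:):

```shell
#!/bin/sh
# Hypothetical helper: read "service iptables status" (or "iptables -L -n")
# output on stdin and succeed if an ACCEPT rule exists for protocol $1
# on destination port $2.
has_accept_rule() {
    proto=$1
    port=$2
    # The port must be followed by whitespace or end of line, so that
    # 5544 does not match a longer number such as 55440.
    grep -Eq "ACCEPT[[:space:]]+${proto}[[:space:]].*dpt:${port}([[:space:]]|\$)"
}

# Example (run as root on the log server):
#   service iptables status | has_accept_rule tcp 5544 && echo "tcp rule present"
#   service iptables status | has_accept_rule udp 5544 && echo "udp rule present"
```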

If these firewall rules do not exist, we add them by executing the following commands:

IPv4

iptables -I INPUT -p udp --dport 5544 -j ACCEPT
iptables -I INPUT -p tcp --dport 5544 -j ACCEPT

service iptables save

IPv6

ip6tables -I INPUT -p udp --dport 5544 -j ACCEPT
ip6tables -I INPUT -p tcp --dport 5544 -j ACCEPT

Then execute:

service ip6tables save

If the firewall is not running, it will produce this output:

iptables: Firewall is not running.

If the firewall is not running, this means that inbound traffic is allowed.

To enable the firewall on boot and to start it, we execute the following commands:

IPv4

chkconfig iptables on
service iptables start

IPv6

chkconfig ip6tables on

service ip6tables start

RHEL 7|CentOS 7

First, check the status of the firewall:

systemctl status firewalld.service

If the firewall is running, it should produce output like:

● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2018-12-13 11:16:59 AEDT; 39min ago
Docs: man:firewalld(1)
Main PID: 670 (firewalld)
CGroup: /system.slice/firewalld.service
└─670 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid

Similarly, if the firewall is not running, it will produce this output:

● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Thu 2018-12-13 11:57:15 AEDT; 1s ago
Docs: man:firewalld(1)
Process: 670 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
Main PID: 670 (code=exited, status=0/SUCCESS)

If the firewall is not running, this means that inbound traffic is allowed.

To enable the firewall on boot and to start it, we execute the following commands:

systemctl enable firewalld.service
systemctl start firewalld.service

To list the firewall rules execute this command:

firewall-cmd --list-all

Which should produce output like:

public (active)
target: default
icmp-block-inversion: no
interfaces: ens32
sources:
services: dhcpv6-client ssh
ports: 80/tcp 443/tcp 9300-9400/tcp 3515/tcp 5544/tcp 2056/tcp 2057/tcp 5544/udp
protocols:
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:

Specifically, the ports line tells us that the firewall rule exists and is allowing inbound UDP and TCP traffic on port 5544:

ports: 80/tcp 443/tcp 9300-9400/tcp 3515/tcp 5544/tcp 2056/tcp 2057/tcp 5544/udp
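The same check can be scripted. The sketch below reads firewall-cmd --list-all output on standard input and looks for one port/protocol pair on the ports line. The function name port_open is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read "firewall-cmd --list-all" output on stdin and
# succeed if the port/protocol pair in $1 (e.g. 5544/tcp) is on the
# "ports:" line.
port_open() {
    awk -v want="$1" '
        $1 == "ports:" {
            for (i = 2; i <= NF; i++)
                if ($i == want) found = 1
        }
        END { exit found ? 0 : 1 }'
}

# Example (run as root on the log server):
#   firewall-cmd --list-all | port_open 5544/tcp && echo "5544/tcp is open"
#   firewall-cmd --list-all | port_open 5544/udp && echo "5544/udp is open"
```

firewall-cmd also offers a --query-port option that answers the same question for a single port, which may be simpler when only one port needs checking.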

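If these firewall rules do not exist, they can be added by executing the following commands (--permanent writes each rule to the saved configuration, which --reload then applies; without it, the reload would discard the newly added ports):

firewall-cmd --zone=public --add-port=5544/udp --permanent
firewall-cmd --zone=public --add-port=5544/tcp --permanent
firewall-cmd --reload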

Debian

Debian has the iptables firewall installed but not enabled by default. The firewall rules are maintained by the netfilter-persistent service.

We can determine if it is installed with the following command:

systemctl status netfilter-persistent.service

If we receive this output then there is no firewall service active on our Debian machine:

Unit netfilter-persistent.service could not be found.

This means all inbound traffic is allowed, so the log server will receive the logs.

If we receive this output then the firewall service is active on our Debian machine:

● netfilter-persistent.service - netfilter persistent configuration
Loaded: loaded (/lib/systemd/system/netfilter-persistent.service; enabled)
Active: active (exited) since Tue 2018-11-27 14:24:11 AEDT; 1min 26s ago
Main PID: 17749 (code=exited, status=0/SUCCESS)

If the netfilter-persistent service is enabled we can now check the status of the firewall:

iptables --list

An open firewall configuration would produce output like:

Chain INPUT (policy ACCEPT)

target prot opt source destination

Chain FORWARD (policy ACCEPT)

target prot opt source destination

Chain OUTPUT (policy ACCEPT)

target prot opt source destination

We can see no rules exist.

For illustration, a rule allowing inbound UDP traffic on port 162 (SNMP traps) would look like this, and a rule for port 5544 would be listed in the same form:

target prot opt source destination

ACCEPT udp -- anywhere anywhere udp dpt:snmp-trap

If these firewall rules do not exist, they can be added by executing the following commands:

iptables -I INPUT -p udp --destination-port 5544 -j ACCEPT
iptables -I INPUT -p tcp --destination-port 5544 -j ACCEPT

Ubuntu

Ubuntu uses the Uncomplicated Firewall (ufw) to manage firewall rules; however, it is not enabled on a default install.

We can check it with the following command:

ufw status

If the firewall is not running, it will produce this output:

Status: inactive

Meanwhile, if the firewall is running, it should produce output like:

Status: active

If the firewall is not running, this means that inbound traffic is allowed (the log server will receive the logs).

To enable the firewall on boot and to start it, we execute the following command:

ufw enable

Be careful when executing this command. The default configuration is to deny all incoming connections, so we may not be able to access the server once the firewall is active (for example, after the next reboot). We will need to add rules for all the different ports that connect to this server.

To list the firewall rules we execute this command:

ufw status verbose

Which should produce output like:

Status: active

Logging: on (low)

Default: deny (incoming), allow (outgoing), disabled (routed)

New profiles: skip

To                            Action               From
--                            ------               ----
5544/udp                      ALLOW IN             Anywhere
5544/tcp                      ALLOW IN             Anywhere
5544/udp (v6)                 ALLOW IN             Anywhere (v6)
5544/tcp (v6)                 ALLOW IN             Anywhere (v6)

We can see from the output that firewall rules exist allowing inbound UDP and TCP traffic on port 5544.

If these firewall rules do not exist, they can be added by executing the following commands:

ufw allow proto udp from any to any port 5544
ufw allow proto tcp from any to any port 5544
ufw reload

Log Server – Check Logstash Is Running

Assuming the config is correct, we may want to make sure that logstash is running:

RHEL 6|CentOS 6|Ubuntu 14

service logstash status

RHEL 7|CentOS 7|Debian|Ubuntu 16/18

systemctl status logstash.service

We can expect an output similar to this if it is running:

Logstash Daemon (pid 1171) is running…

We can expect an output similar to this if it is not running:

Logstash Daemon is stopped

If it is not running, we should start it:

RHEL 6|CentOS 6|Ubuntu 14

service logstash start

RHEL 7|CentOS 7|Debian|Ubuntu 16/18

systemctl start logstash.service

Log Server – Check Log Server Is Listening

We want to make sure that the server is listening to port 5544. To check, we execute the following command:

netstat -nal | grep 5544

We can expect an output similar to:

tcp 0 0 :::5544 :::* LISTEN
tcp 0 0 ::1:56104 ::1:5544 ESTABLISHED
tcp 0 0 ::1:5544 ::1:56104 ESTABLISHED
udp 0 0 :::5544 :::*

If it was not listening, then there would be no output to that command or the TCP ports would not appear.
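Note that the ESTABLISHED lines also contain 5544, so a plain grep can match even when nothing is listening. As a sketch, the function below only accepts a TCP socket in the LISTEN state whose local port matches. The name is_listening is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read "netstat -nal" output on stdin and succeed
# only if a TCP socket is in LISTEN state on local port $1.
is_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            # The local address is field 4; the port follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit found ? 0 : 1 }'
}

# Example (run on the log server):
#   netstat -nal | is_listening 5544 && echo "listening on TCP 5544"
```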

Log Server – Search Log Server Dashboard

To confirm the logs are being received, we can search for the logs in the dashboard.

Initially, we log into the Log Server and click the Dashboards menu.

In the default dashboard we can search for the test logs we generated.

In the Query field type:

TroubleshootingTest

Press Enter and we should see the results in the “Events Over Time” and “All Events” panels.

Log Server – Check Logstash Log

If we are still not seeing anything in the default dashboard, we can check the logstash log file. Usually, nothing is logged here unless something goes wrong.

To check, we execute the following command:

tail -f /var/log/logstash/logstash.log

Log Server – Logs Appear A Few Hours Later

Sometimes the logs do not show up in the default dashboard until a few hours after they were sent. This can happen when the date and time are not set correctly on all the Nagios Log Server nodes.

To ensure that the cluster timezone settings are correct, we follow the steps given below:

i. Log into Nagios Log Server

ii. In the top menu bar click Admin

iii. Under General click Global Settings

iv. Here we can define the Cluster Timezone

v. If it is not correct, select the timezone and click Save Settings
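Before changing the setting, it can help to see how far apart the hosts are. The sketch below assumes the delay comes from a timezone offset mismatch rather than a wrong clock: it converts the output of date +%z into minutes so the values from the remote server and each node can be compared. The function name offset_minutes is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: convert a UTC offset in "date +%z" format
# (e.g. +1100 or -0530) into signed minutes.
offset_minutes() {
    sign=${1%????}        # leading + or -
    digits=${1#?}         # the four digits HHMM
    hours=${digits%??}
    mins=${digits#??}
    # Drop one leading zero so values like 08 are not read as octal.
    hours=${hours#0}
    mins=${mins#0}
    echo "${sign}$(( ${hours:-0} * 60 + ${mins:-0} ))"
}

offset_minutes +1100
# prints: +660

# On each host:
#   offset_minutes "$(date +%z)"
```

A difference of 660 minutes between two hosts, for example, would line up with logs appearing eleven hours away from where they are expected.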

Log Server – Disable Filters

An incorrect filter can prevent Log Server from processing logs. A useful troubleshooting step is to disable any extra filters we have added and see if the logs start appearing:

i. Log into Log Server and click Configure

ii. Under Global (All Instances) click Global Config

iii. On the right side of the screen is the Filters section

iv. The default filter included in Nagios Log Server is Apache

v. Disable any other filters we have added by clicking the Active icon (it will turn into Inactive)

vi. Click the Save & Apply button at the top

Once we have disabled the filters, we go to the Dashboards and see if logs start appearing.

We then re-enable the filters one at a time, clicking Save & Apply each time, until we identify the filter that is causing the issue.

Once we know what filter is causing the issue, we can investigate why there is an issue with this filter.