The "Logs Not Searchable or Not Coming In" problem in Nagios Log Server shows up when we run a query in a dashboard and the expected logs do not appear.
Here at Ibmi Media, as part of our Server Management Services, we regularly help our Customers to fix Nagios related errors.
In this context, we shall look into the causes of this Nagios error and how to tackle it.
As previously mentioned, when we run a query in a dashboard, logs may not show up when they should. Here, we will use a scenario of a remote server sending syslogs to help provide a clear troubleshooting path:
i. Log Server
Name: nls-c7x-x64
IP: 10.25.5.86
Listening Port: TCP 5544
ii. Remote Server Sending Logs
Name: centos12
IP: 10.25.13.30
Sending Port: TCP 5544
OS: CentOS 6.7 x64
Now let us work through the methods to fix Logs Not Searchable or Not Coming In Nagios Log Server, starting on the remote server.
The remote server centos12 has already been set up to send logs to nls-c7x-x64 using the setup steps in the Log Server GUI.
To confirm this has been done, we check that the following file exists and contains the forwarding rule shown below:
/etc/rsyslog.d/99-nagioslogserver.conf
### Begin forwarding rule for Nagios Log Server NAGIOSLOGSERVER
$WorkDirectory /var/lib/rsyslog # Where spool files will live NAGIOSLOGSERVER
$ActionQueueFileName nlsFwdRule0 # Unique name prefix for spool files NAGIOSLOGSERVER
$ActionQueueMaxDiskSpace 1g # 1GB space limit (use as much as possible) NAGIOSLOGSERVER
$ActionQueueSaveOnShutdown on # Save messages to disk on shutdown NAGIOSLOGSERVER
$ActionQueueType LinkedList # Use asynchronous processing NAGIOSLOGSERVER
$ActionResumeRetryCount -1 # Infinite retries if host is down NAGIOSLOGSERVER
# Remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional NAGIOSLOGSERVER
*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER
### End of Nagios Log Server forwarding rule NAGIOSLOGSERVER
It is important to note here the following line:
*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER
It is assumed that the server centos12 can resolve the address nls-c7x-x64; otherwise, it will not be able to send its logs.
To confirm this, we execute the following command on centos12:
ping nls-c7x-x64 -c 1
We expect an output similar to this if it can successfully resolve nls-c7x-x64:
PING nls-c7x-x64.box293.local (10.25.5.86) 56(84) bytes of data.
64 bytes from nls-c7x-x64.box293.local (10.25.5.86): icmp_seq=1 ttl=64 time=0.273 ms
--- nls-c7x-x64.box293.local ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 2ms
rtt min/avg/max/mdev = 0.273/0.273/0.273/0.000 ms
On the other hand, we can expect an output similar to this if it cannot resolve nls-c7x-x64:
ping: unknown host nls-c7x-x64
Going back to that config line:
*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER
The @@ indicates that logs are forwarded over TCP, and 5544 is the destination port.
If it was UDP, there would be only one @.
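The @ and @@ convention described above can be checked mechanically. The following is a minimal sketch, not part of Nagios Log Server or rsyslog: the function name describe_forward_rule is hypothetical, it assumes the simple "selector action" form shown in this article, and it does not handle bracketed IPv6 addresses. When no port is given it assumes the rsyslog default of 514.

```shell
#!/usr/bin/env bash
# Hypothetical helper: describe an rsyslog forwarding rule such as
# "*.* @@nls-c7x-x64:5544". Prints "<protocol> <host> <port>".
describe_forward_rule() {
    local line="$1" target proto hostport host port
    # Drop any trailing comment, then take the second field (the action).
    line="${line%%#*}"
    target="$(awk '{print $2}' <<<"$line")"
    case "$target" in
        "@@"*) proto="tcp"; hostport="${target#@@}" ;;   # two @ signs: TCP
        "@"*)  proto="udp"; hostport="${target#@}" ;;    # one @ sign: UDP
        *)     echo "not a forwarding rule" >&2; return 1 ;;
    esac
    host="${hostport%%:*}"
    # The port is optional; assume 514 when it is omitted.
    if [[ "$hostport" == *:* ]]; then
        port="${hostport##*:}"
    else
        port="514"
    fi
    echo "$proto $host $port"
}

# Example using the rule from this article.
describe_forward_rule '*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER'
```

For the rule in this article the helper reports tcp, nls-c7x-x64 and 5544, which should match the Listening Port of the log server.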
Remote Server – Check Rsyslog Is Running
Assuming the config is correct, we may want to make sure that rsyslogd is running:
service rsyslog status
We can expect an output like this if it is running:
rsyslogd (pid 2098) is running…
Also, we can expect an output similar to this if it is not running:
rsyslogd is stopped
Subsequently, if it is not running, we should start it:
service rsyslog start
Remote Server – Check Firewall Rules
We want to make sure that the iptables firewall allows outbound traffic. By default, there are no restrictions on outbound traffic.
To confirm this, we execute the following command:
iptables --list
We expect an output similar to this:
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED
ACCEPT icmp -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT)
target prot opt source destination
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Specifically, this last output is what we need to look at:
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
The first line has (ACCEPT) which means there is no restriction at the top level (it would say DROP if there was).
The second line is simply headings for all the outbound rules that have been defined. Because there is no third line, there are NO outbound rules defined so the default here is to ACCEPT all outbound traffic (allow it).
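The same reading of the output (chain policy on the first line, headings on the second, rules after that) can be scripted. This is only a sketch under the assumption that the output has the layout shown in this article; chain_summary is a hypothetical name, and the sample text fed to it is abbreviated from the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `iptables --list` output on stdin and print
# "<policy> <rule-count>" for the chain named in $1.
chain_summary() {
    awk -v chain="$1" '
        $1 == "Chain" {
            in_chain = ($2 == chain)
            if (in_chain) { gsub(/[()]/, "", $4); policy = $4 }
            next
        }
        in_chain && ($1 == "target" || $1 == "num") { next }   # column headings
        in_chain && NF > 0 { rules++ }                          # one line per rule
        END { if (policy == "") exit 1; print policy, rules + 0 }
    '
}

# Example with sample text; on a live system: iptables --list | chain_summary OUTPUT
chain_summary OUTPUT <<'EOF'
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
EOF
```

A policy of ACCEPT with a rule count of 0 corresponds to the unrestricted outbound case described above.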
If we had a restricted environment where the outbound policy was DROP, we would need to add an outbound firewall rule for TCP port 5544 to nls-c7x-x64 on 10.25.5.86:
iptables -I OUTPUT -p tcp --destination-port 5544 -d 10.25.5.86 -j ACCEPT
Then save the rule:
service iptables save
Remote Server – Watch Outbound Traffic
To confirm that the log traffic is leaving the remote server we can run a tcpdump to watch the traffic.
First, we must install tcpdump:
yum -y install tcpdump
Wait while tcpdump is installed.
Now we execute the following command to watch the traffic:
tcpdump src host 10.25.13.30 and tcp dst port 5544 and dst host 10.25.5.86
We will receive this message first:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
An example of traffic flow is as follows:
16:43:49.001130 IP centos12.box293.local.60907 > nls-c7x-x64.box293.local.5544: Flags [P.],
seq 2751017526:2751017581, ack 431734015, win 115, options [nop,nop,TS val 93111400 ecr 92667575], length 55
If we do not see any traffic, it may just be that nothing is being logged and hence there is nothing to send. We can easily add a test entry to rsyslog which will generate traffic:
Open an additional SSH session to the remote server
Execute the following command:
logger TroubleshootingTest
In our other SSH session, we should now see a line of traffic that confirms that rsyslog is sending the logs onto nls-c7x-x64.
Press Ctrl+C to stop the tcpdump.
Log Server – Watch Inbound Traffic
To confirm that the log traffic is entering the log server, we can run a tcpdump. This is similar to the previous steps, except it confirms that the traffic has made it through any routers or firewalls between the remote server and the log server.
First, we must install tcpdump with this command:
RHEL|CentOS
yum install -y tcpdump
Debian|Ubuntu
apt-get install -y tcpdump
Now we execute the following command to watch the traffic:
tcpdump src host 10.25.13.30 and tcp dst port 5544 and dst host 10.25.5.86
We will receive this message first:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
An example of traffic flow is as follows:
16:52:42.509481 IP centos12.box293.local.60907 > nls-c7x-x64.box293.local.5544: Flags [P.],
seq 2751017651:2751017706, ack 431734015, win 115, options [nop,nop,TS val 93644443 ecr 92674681], length 55
If we do not see any traffic, it may just be that nothing is being logged and hence there is nothing to send. We can easily add a test entry to rsyslog which will generate traffic:
Open an additional SSH session to the remote server
Execute the following command:
logger TroubleshootingTest
In our log server SSH session, we should now see a line of traffic that confirms that the traffic is hitting the log server.
Press Ctrl+C to stop the tcpdump.
If we do not see any traffic, then there may be a firewall or router blocking the traffic.
Log Server – Check Firewall Rules
We want to make sure that the iptables firewall allows inbound traffic. By default, there are restrictions on inbound traffic; however, Nagios Log Server creates the firewall rules to allow the traffic.
RHEL 6|CentOS 6
There are separate firewall daemons for IPv4 and IPv6 and hence our Support Techs suggest separate commands.
First, check the status of the firewall:
IPv4
service iptables status
IPv6
service ip6tables status
If the firewall is running, it should produce output like:
Table: filter
Chain INPUT (policy ACCEPT)
num target prot opt source destination
1 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
2 ACCEPT icmp -- 0.0.0.0/0 0.0.0.0/0
3 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
4 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:22
5 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:2057
6 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:2056
7 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:5544
8 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:3515
9 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpts:9300:9400
10 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:443
11 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:80
12 ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 state NEW udp dpt:5544
13 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT)
num target prot opt source destination
1 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
Specifically, these lines tell us that the firewall rule exists and is allowing inbound UDP and TCP traffic on port 5544:
7 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:5544
12 ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 state NEW udp dpt:5544
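Rather than scanning the rule list by eye, the check for these two rules can be sketched as a small filter. The function name iptables_accepts_port is hypothetical, and the sketch assumes rule lines end in "dpt:<port>" as in the output above; it does not distinguish between chains, so on a live system it is best fed the INPUT chain only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read iptables rule listings on stdin and succeed if an
# ACCEPT rule for protocol $1 with destination port $2 is present.
iptables_accepts_port() {
    awk -v proto="$1" -v want="dpt:$2" '
        {
            t = ($1 ~ /^[0-9]+$/) ? 2 : 1    # skip an optional rule-number column
            if ($t == "ACCEPT" && $(t + 1) == proto && $NF == want) found = 1
        }
        END { exit !found }
    '
}

# Example with two sample rule lines in the format shown above.
sample_rules='7 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:5544
12 ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 state NEW udp dpt:5544'
for proto in tcp udp; do
    if iptables_accepts_port "$proto" 5544 <<<"$sample_rules"; then
        echo "$proto 5544: rule present"
    else
        echo "$proto 5544: rule missing"
    fi
done
```

Because the port is compared as a whole field, a rule for a different port such as 554 is not mistaken for 5544.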
If these firewall rules do not exist, we add them by executing the following commands:
IPv4
iptables -I INPUT -p udp --dport 5544 -j ACCEPT
iptables -I INPUT -p tcp --dport 5544 -j ACCEPT
service iptables save
IPv6
ip6tables -I INPUT -p udp --dport 5544 -j ACCEPT
ip6tables -I INPUT -p tcp --dport 5544 -j ACCEPT
Then execute:
service ip6tables save
If the firewall is not running, it will produce this output:
iptables: Firewall is not running.
If the firewall is not running, this means that inbound traffic is allowed.
To enable the firewall on boot and to start it, we execute the following commands:
IPv4
chkconfig iptables on
service iptables start
IPv6
chkconfig ip6tables on
service ip6tables start
RHEL 7|CentOS 7
First, check the status of the firewall:
systemctl status firewalld.service
If the firewall is running, it should produce output like:
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2018-12-13 11:16:59 AEDT; 39min ago
Docs: man:firewalld(1)
Main PID: 670 (firewalld)
CGroup: /system.slice/firewalld.service
└─670 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid
Similarly, if the firewall is not running, it will produce this output:
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Thu 2018-12-13 11:57:15 AEDT; 1s ago
Docs: man:firewalld(1)
Process: 670 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
Main PID: 670 (code=exited, status=0/SUCCESS)
If the firewall is not running, this means that inbound traffic is allowed.
To enable the firewall on boot and to start it, we execute the following commands:
systemctl enable firewalld.service
systemctl start firewalld.service
To list the firewall rules execute this command:
firewall-cmd --list-all
Which should produce output like:
public (active)
target: default
icmp-block-inversion: no
interfaces: ens32
sources:
services: dhcpv6-client ssh
ports: 80/tcp 443/tcp 9300-9400/tcp 3515/tcp 5544/tcp 2056/tcp 2057/tcp 5544/udp
protocols:
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
Specifically, the ports line tells us that the firewall rule exists and is allowing inbound UDP and TCP traffic on port 5544:
ports: 80/tcp 443/tcp 9300-9400/tcp 3515/tcp 5544/tcp 2056/tcp 2057/tcp 5544/udp
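Reading the ports line can also be sketched as a small filter. The function name firewalld_port_open is hypothetical, and the sketch only matches exact "port/protocol" entries: it does not expand ranges such as 9300-9400/tcp, and it does not look at services or rich rules.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if "<port>/<proto>" ($1) appears on the
# "ports:" line of `firewall-cmd --list-all` output read from stdin.
firewalld_port_open() {
    awk -v want="$1" '
        $1 == "ports:" { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit !found }
    '
}

# Example with the ports line from this article; on a live system:
#   firewall-cmd --list-all | firewalld_port_open 5544/tcp
sample='  ports: 80/tcp 443/tcp 9300-9400/tcp 3515/tcp 5544/tcp 2056/tcp 2057/tcp 5544/udp'
for p in 5544/tcp 5544/udp; do
    if firewalld_port_open "$p" <<<"$sample"; then
        echo "$p allowed"
    else
        echo "$p missing"
    fi
done
```

Matching on the first field being exactly "ports:" keeps the forward-ports and source-ports lines out of the check.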
If these firewall rules do not exist, they can be added by executing the following commands:
firewall-cmd --zone=public --add-port=5544/udp --permanent
firewall-cmd --zone=public --add-port=5544/tcp --permanent
firewall-cmd --reload
Debian
Debian has the iptables firewall installed but not enabled by default. The firewall rules are maintained by the netfilter-persistent service.
We can determine if it is installed with the following command:
systemctl status netfilter-persistent.service
If we receive this output then there is no firewall service active on our Debian machine:
Unit netfilter-persistent.service could not be found.
This means all inbound traffic is allowed, so the logs sent to port 5544 will be received.
If we receive this output then the firewall service is active on our Debian machine:
● netfilter-persistent.service - netfilter persistent configuration
Loaded: loaded (/lib/systemd/system/netfilter-persistent.service; enabled)
Active: active (exited) since Tue 2018-11-27 14:24:11 AEDT; 1min 26s ago
Main PID: 17749 (code=exited, status=0/SUCCESS)
If the netfilter-persistent service is enabled we can now check the status of the firewall:
iptables --list
An open firewall configuration would produce output like:
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
We can see no rules exist.
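If a rule did exist, it would be listed under the headings. For example, a rule allowing inbound UDP traffic on port 162 (SNMP traps) looks like this; a rule for port 5544 would show dpt:5544 instead:
target prot opt source destination
ACCEPT udp -- anywhere anywhere udp dpt:snmp-trap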
If these firewall rules do not exist, they can be added by executing the following commands:
iptables -I INPUT -p udp --destination-port 5544 -j ACCEPT
iptables -I INPUT -p tcp --destination-port 5544 -j ACCEPT
Ubuntu
Ubuntu uses the Uncomplicated Firewall (ufw) to manage firewall rules; however, it is not enabled on a default install.
We can check it with the following command:
ufw status
If the firewall is not running, it will produce this output:
Status: inactive
Meanwhile, if the firewall is running, it should produce output like:
Status: active
If the firewall is not running, this means that inbound traffic is allowed (logs sent to port 5544 will reach the server).
To enable the firewall on boot and to start it, we execute the following command:
ufw enable
Be careful when executing this command: the default ufw configuration denies all incoming connections, so we may lose access to the server (for example over SSH) unless we add rules for every port that needs to connect to this server.
To list the firewall rules we execute this command:
ufw status verbose
Which should produce output like:
Status: active
Logging: on (low)
Default: deny (incoming), allow (outgoing), disabled (routed)
New profiles: skip
To Action From
-- ------ ----
5544/udp ALLOW IN Anywhere
5544/tcp ALLOW IN Anywhere
5544/udp (v6) ALLOW IN Anywhere (v6)
5544/tcp (v6) ALLOW IN Anywhere (v6)
We can see from the output that firewall rules exist allowing inbound UDP and TCP traffic on port 5544.
If these firewall rules do not exist, they can be added by executing the following commands:
ufw allow proto udp from any to any port 5544
ufw allow proto tcp from any to any port 5544
ufw reload
Log Server – Check Logstash Is Running
Assuming the config is correct, we may want to make sure that logstash is running:
RHEL 6|CentOS 6|Ubuntu 14
service logstash status
RHEL 7|CentOS 7|Debian|Ubuntu 16/18
systemctl status logstash.service
We can expect an output similar to this if it is running:
Logstash Daemon (pid 1171) is running…
We can expect an output similar to this if it is not running:
Logstash Daemon is stopped
If it is not running, we should start it:
RHEL 6|CentOS 6|Ubuntu 14
service logstash start
RHEL 7|CentOS 7|Debian|Ubuntu 16/18
systemctl start logstash.service
Log Server – Check Log Server Is Listening
We want to make sure that the server is listening on port 5544. To check, we execute the following command:
netstat -nal | grep 5544
We can expect an output similar to:
tcp 0 0 :::5544 :::* LISTEN
tcp 0 0 ::1:56104 ::1:5544 ESTABLISHED
tcp 0 0 ::1:5544 ::1:56104 ESTABLISHED
udp 0 0 :::5544 :::*
If it was not listening, then there would be no output to that command or the TCP ports would not appear.
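A plain grep for 5544 also matches ESTABLISHED connections and any other field containing those digits, so a stricter check can help. The following is a sketch only: tcp_listening is a hypothetical name, and it assumes the column layout shown above, with the local address in the fourth column and the state in the last.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `netstat -nal` output on stdin and succeed if a
# TCP socket is in the LISTEN state on port $1.
tcp_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            n = split($4, parts, ":")        # local address is the 4th column
            if (parts[n] == port) found = 1  # the port follows the last colon
        }
        END { exit !found }
    '
}

# Example with sample text; on a live system: netstat -nal | tcp_listening 5544
if tcp_listening 5544 <<'EOF'
tcp 0 0 :::5544 :::* LISTEN
tcp 0 0 ::1:56104 ::1:5544 ESTABLISHED
udp 0 0 :::5544 :::*
EOF
then
    echo "listening on TCP 5544"
else
    echo "not listening on TCP 5544"
fi
```

The function returns a non-zero status when only ESTABLISHED or UDP lines mention the port, which is the case the manual check can miss.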
Log Server – Search Log Server Dashboard
To confirm the logs are being received, we can search for the logs in the dashboard.
Initially, we log into the Log Server and click the Dashboards menu.
In the default dashboard we can search for the test logs we generated.
In the Query field type:
TroubleshootingTest
Press Enter and we should see the results in the “Events Over Time” and “All Events” panels.
Log Server – Check Logstash Log
If we are still not seeing anything in the default dashboard, we can check the logstash log file. Usually, nothing is logged here unless something goes wrong.
To check, we execute the following command:
tail -f /var/log/logstash/logstash.log
Log Server – Logs Appear A Few Hours Later
Sometimes the logs do not appear in the default dashboard until a few hours after they were sent. This can happen when the date and time are not set correctly on all the Nagios Log Server nodes.
To ensure that the cluster timezone settings are correct, we follow the steps given below:
i. Log into Nagios Log Server
ii. In the top menu bar click Admin
iii. Under General click Global Settings
iv. Here we can define the Cluster Timezone
v. If it is not correct, select the timezone and click Save Settings
Log Server – Disable Filters
An incorrect filter can prevent logs from being processed by Log Server. A useful troubleshooting step is to disable any extra filters we have added and see if the logs start appearing:
i. Log into Log Server and click Configure
ii. Under Global (All Instances) click Global Config
iii. On the right side of the screen is the Filters section
iv. The default filter included in Nagios Log Server is Apache
v. Disable any other filters we have added by clicking the Active icon (it will turn into Inactive)
vi. Click the Save & Apply button at the top
Once we have disabled the filters, we go to the Dashboards and see if logs start appearing.
If they do, we re-enable the filters one by one, clicking Save & Apply each time, until we identify the filter that is causing the issue.
Once we know what filter is causing the issue, we can investigate why there is an issue with this filter.
This article covers how to troubleshoot Logs Not Searchable or Not Coming In Nagios Log Server, a problem that shows up when running a query in a Nagios dashboard.