How to fix scheduled backups no longer working in Nagios








Are you experiencing scheduled backup failures in Nagios? This guide will help you fix them.


Here at Ibmi Media, as part of our Server Management Services, we regularly help our customers solve Nagios-related errors.


In this context, we shall look into what causes this error and how to get rid of it.


What causes scheduled backup failures in Nagios?

In some cases, a scheduled backup in Nagios simply fails.

For example, during a recent scheduled backup, we discovered that the .tar.gz archive had not been created successfully.

Furthermore, you might see many folders left behind in "/store/backups/nagiosxi/" instead of being removed according to the backup limit setting.


A major cause of backup failures in Nagios is database corruption.


How to fix scheduled backup failures in Nagios?

To get to the root of why backups are failing, start by performing a manual backup in an SSH session on the Nagios XI server. This generates verbose output that helps determine why the backup is failing.


You can follow the steps below to perform a manual backup:


How to create a Backup From The Command Line

Start by establishing a connection to your Nagios XI server as the root user via a terminal or an SSH tool such as PuTTY.


Next, run the script below to create a backup of your Nagios XI installation:


/usr/local/nagiosxi/scripts/backup_xi.sh


If the backup succeeds, the output ends with the following message:


===============
BACKUP COMPLETE
===============
The backup will be stored at /store/backups/nagiosxi/2xxxxx.tar.gz.
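
After a manual run, it can be useful to confirm that the newest archive is actually readable. The following is a minimal sketch, assuming backups are .tar.gz files directly under the backup directory and that their names contain no newlines; the function names `newest_backup` and `verify_archive` are hypothetical helpers, not part of Nagios XI:

```shell
#!/bin/sh
# newest_backup DIR
# Prints the most recently modified .tar.gz file in DIR
# (defaults to the backup location mentioned in this article).
newest_backup() {
    dir="${1:-/store/backups/nagiosxi}"
    # ls -t sorts by modification time, newest first.
    ls -t "$dir"/*.tar.gz 2>/dev/null | head -n 1
}

# verify_archive FILE
# Succeeds only if tar can list the entire gzip-compressed archive.
verify_archive() {
    tar -tzf "$1" >/dev/null 2>&1
}
```

For example, `verify_archive "$(newest_backup)" && echo "archive is readable"` prints the message only when the latest archive can be listed from start to end.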


 
How to create a manual Backup via The Web Interface

To create a manual backup via the web UI, navigate to "Admin - System Backups - Local Backup Archives".


Next, click the "Create Backup" button to start the backup process. The status of the backup does not appear on the page. However, you will know the backup has completed once the .tar.gz file appears in the list of backups.


As stated earlier, database corruption is one of the main reasons backups fail. Here is the output of a backup that failed due to database corruption, which we came across recently:


Running configuration check...
Stopping nagios: done.
Starting nagios: done.
Backing up Core Config Manager (NagiosQL)...
tar: Removing leading `/' from member names
tar: Removing leading `/' from member names
Backing up Nagios Core...
tar: Removing leading `/' from member names
tar: /usr/local/nagios/var/ndo.sock: socket ignored
tar: /usr/local/nagios/var/rw/nagios.qh: socket ignored
Backing up Nagios XI...
tar: Removing leading `/' from member names
Backing up MRTG...
tar: Removing leading `/' from member names
Backing up NRDP...
tar: Removing leading `/' from member names
Backing up Nagvis...
tar: Removing leading `/' from member names
Backing up MySQL databases...
mysqldump: Got error: 130: Incorrect file format 'nagios_flappinghistory' when using LOCK TABLES
Error backing up MySQL database 'nagios' - check the password in this script!


In the output above, the final message "check the password in this script" is only a generic message and is not the actual cause. The line before it, the mysqldump error, identifies the real problem: the nagios_flappinghistory table has an incorrect file format.
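
When the output of a manual backup has been saved to a file, the lines that matter can be filtered out of the noise. The following is a minimal sketch; the function name `find_backup_errors` is hypothetical, and the two patterns are taken only from the sample output above, so they are an assumption and will not catch every possible failure:

```shell
#!/bin/sh
# find_backup_errors LOGFILE
# Prints lines from saved backup output that look like real failures.
# Patterns are based on the sample output in this article and are
# not exhaustive.
find_backup_errors() {
    grep -E 'mysqldump: Got error|^Error backing up' "$1"
}
```

One way to use it is to save the output with `/usr/local/nagiosxi/scripts/backup_xi.sh 2>&1 | tee /tmp/backup_xi.log` and then run `find_backup_errors /tmp/backup_xi.log`; the log path here is only an example.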


To fix this problem, we repaired the nagios, nagiosql, and nagiosxi databases. To do this, run the following commands as the root user:


/usr/local/nagiosxi/scripts/repairmysql.sh nagios
/usr/local/nagiosxi/scripts/repairmysql.sh nagiosql
/usr/local/nagiosxi/scripts/repairmysql.sh nagiosxi


If you are running Nagios XI 2014 or newer, you can use the following instead:

cd /usr/local/nagiosxi/scripts/
./repair_databases.sh


Next, run a forced repair on the corrupted tables, replacing nagios_<corrupted_table> with the table named in the error. The commands depend on your operating system:


For RHEL 6, CentOS 6, and Oracle Linux 6 systems, run the commands below:


service mysqld stop
cd /var/lib/mysql/nagios
myisamchk -r -f nagios_<corrupted_table>
service mysqld start
rm -f /usr/local/nagiosxi/var/dbmaint.lock
php /usr/local/nagiosxi/cron/dbmaint.php



For RHEL 7, CentOS 7, Oracle Linux 7, and Debian 9 systems, run the commands below:


systemctl stop mariadb.service
cd /var/lib/mysql/nagios
myisamchk -r -f nagios_<corrupted_table>
systemctl start mariadb.service
rm -f /usr/local/nagiosxi/var/dbmaint.lock
php /usr/local/nagiosxi/cron/dbmaint.php



For Ubuntu 14 systems, run the commands below:


service mysql stop
cd /var/lib/mysql/nagios
myisamchk -r -f nagios_<corrupted_table>
service mysql start
rm -f /usr/local/nagiosxi/var/dbmaint.lock
php /usr/local/nagiosxi/cron/dbmaint.php



For Debian 8 and Ubuntu 16/18 systems, run the commands below:


systemctl stop mysql.service
cd /var/lib/mysql/nagios
myisamchk -r -f nagios_<corrupted_table>
systemctl start mysql.service
rm -f /usr/local/nagiosxi/var/dbmaint.lock
php /usr/local/nagiosxi/cron/dbmaint.php
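
The nagios_<corrupted_table> placeholder in the commands above stands for the table named in the mysqldump error. If the backup output was saved to a file, the table names can be pulled out of it. The following is a minimal sketch, assuming the error wording matches the "Incorrect file format" message shown earlier; the function name `corrupted_tables` is hypothetical:

```shell
#!/bin/sh
# corrupted_tables LOGFILE
# Prints the unique table names that mysqldump reported with
# "Incorrect file format". The first bracket expression skips
# whatever quote character precedes the table name.
corrupted_tables() {
    sed -n 's/.*Incorrect file format [^A-Za-z0-9_]*\([A-Za-z0-9_]*\).*/\1/p' "$1" | sort -u
}
```

For the sample output in this article, the function prints nagios_flappinghistory, which is the name to pass to myisamchk while the database service is stopped.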


After repairing the database, you can now run a manual backup to ensure that the issue is solved.


Then remove the directories left behind by the failed backups by executing the following command:


find /store/backups/nagiosxi/ -mindepth 1 -maxdepth 1 -type d -exec rm -rf '{}' \;
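
Because this command deletes directories without confirmation, it can help to preview what it would remove first. The following is a minimal sketch that wraps the same find expression with a dry-run default; the function name `remove_leftover_dirs` and its `--delete` flag are hypothetical and not part of Nagios XI:

```shell
#!/bin/sh
# remove_leftover_dirs DIR [--delete]
# Without --delete, only prints the directories directly under DIR.
# With --delete, removes them; regular files such as .tar.gz
# archives are never touched because of -type d.
remove_leftover_dirs() {
    dir="$1"
    if [ "${2:-}" = "--delete" ]; then
        find "$dir" -mindepth 1 -maxdepth 1 -type d -exec rm -rf '{}' \;
    else
        find "$dir" -mindepth 1 -maxdepth 1 -type d -print
    fi
}
```

Running `remove_leftover_dirs /store/backups/nagiosxi/` lists the candidates, and adding `--delete` as the second argument removes them.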


Inspecting Scheduled Backups

If you wish to see exactly what happens while a scheduled backup is running, you can watch the output live by executing the following command:


tail -f /usr/local/nagiosxi/var/cmdsubsys.log


Press CTRL + C when you have finished.





Conclusion

This article showed how to fix Nagios scheduled backup failures, which occur when the database is corrupted.

