Maintaining a server is a time-consuming task.
You regularly have to deal with upgrades, security patches and the occasional server error.
Here at Ibmi Media, as part of our Server Management Services, we regularly help our customers fix Nginx-related issues.
Nature of the Nginx error "502 Bad Gateway"
Sometimes an Nginx error such as "502 Bad Gateway" affects the web application or website running on it.
When you look at the error log, you will see a detailed error message such as this:
2020/04/04 08:34:43 [error] 949#949: *7 connect() failed (111: Connection refused) while connecting to upstream, client: XXX.XXX.XXX.XXX, server: myserver.com, request: "GET /myurl-this/ HTTP/1.0", subrequest: "/redis-fetch", upstream: "redis://127.0.0.1:6379", host: "refserver.com", referrer: "http://ibmimedia.com/myurl-this/"
Look closely at the message and you will see terms such as "failed" and "refused". They show that Nginx could not connect to a backend (upstream) service; in this example, the Redis server at 127.0.0.1:6379.
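The two most useful pieces of a line like this are the failure reason and the upstream address. The following is a minimal sketch for pulling both out of an error log; the function name `upstream_errors` is hypothetical, and it assumes the log lines follow the layout of the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Nginx error-log lines on standard input and
# print "upstream -> reason" for every failed upstream connection.
upstream_errors() {
    # -n plus the p flag prints only the lines where the pattern matched.
    sed -nE 's/.*\] [0-9]+#[0-9]+: \*[0-9]+ (.*) while connecting to upstream.*upstream: "([^"]+)".*/\2 -> \1/p'
}
```

Fed the example line, this prints the Redis address followed by the "Connection refused" reason. On a live server it could be run as `upstream_errors < /var/log/nginx/error.log | sort | uniq -c`, assuming the log lives at that common default path, to see which backend fails most often.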
How to fix the Nginx error "502 Bad Gateway"?
1. Backend service failed
Nginx depends on backend services like PHP-FPM, database services and cache servers to run web applications.
So, if any of these services crashes or freezes, Nginx won’t get any data from it, resulting in a "502 Bad Gateway" error.
Some known services that tend to fail are:
The reasons for service failure can range from traffic spikes and resource limits to disk errors and DDoS attacks.
If you suspect a backend service is unresponsive or failed, you can try killing all unresponsive processes and restarting the service.
For instance, here are the commands to kill unresponsive PHP-FPM processes and restart the service:
# kill -9 $(pgrep php-fpm)
# /etc/init.d/php-fpm restart
* Restarting PHP FastCGI Process Manager php-fpm [ OK ]
Do not use these commands unless you are sure how they work: kill -9 terminates processes immediately, so any requests they were handling are lost.
If the service restart fails, you may need an Nginx support expert to take a closer look at the health of your server.
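Before killing anything, it can help to confirm that the backend really is refusing connections. The following is a rough sketch under a few assumptions: the function name `port_open` is hypothetical, the backend listens on a TCP port (backends on a Unix socket are not covered), and bash's built-in /dev/tcp redirection and the coreutils `timeout` command are available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if HOST:PORT accepts a TCP connection
# within two seconds, fail otherwise.
port_open() {
    local host=$1 port=$2
    # Inside the child shell, $0 is the host and $1 is the port.
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For the Redis upstream from the log example, a check could look like `port_open 127.0.0.1 6379 || echo "Redis is not accepting connections"`.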
2. High server load
The second most common reason for "502 Bad Gateway" in Nginx is a high load average on the backend servers.
Load spikes can leave services unable to respond in time.
Our support team has noted the following reasons for load spikes:
i. Sudden spike in website traffic (can be seasonal or marketing / promotional).
ii. Malware infection on the server.
iii. Comment spamming or other vulnerability exploits.
iv. Brute-force attacks designed to exploit web apps.
v. Application bugs that cause memory leaks or resource hogging.
To troubleshoot a high load issue, first figure out which resource is being exhausted (I/O, memory, CPU or network).
Then find out which service is consuming that resource and, from there, which user in that service owns the offending script or software.
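The first two steps can be sketched in shell. This is an illustration, not a full diagnosis: the function names `load_per_core` and `top_consumers` are hypothetical, it assumes a Linux server with /proc/loadavg and the procps version of `ps`, and it covers only CPU and memory, not disk I/O or network.

```shell
#!/usr/bin/env bash
# Hypothetical helper: 1-minute load average divided by the number of
# CPU cores. As a rough rule of thumb, a value well above 1.00 means
# more work is queued than the CPUs can handle.
# Optional arguments: $1 = load value, $2 = core count (for testing).
load_per_core() {
    local load=${1:-$(cut -d' ' -f1 /proc/loadavg)}
    local cores=${2:-$(nproc)}
    awk -v l="$load" -v c="$cores" 'BEGIN { printf "%.2f\n", l / c }'
}

# Hypothetical helper: header line plus the five processes using the
# most of one resource. $1 is a ps sort key such as %cpu or %mem.
top_consumers() {
    ps -eo pid,user,%cpu,%mem,comm --sort="-${1:-%cpu}" | head -n 6
}
```

Running `load_per_core` and then `top_consumers %mem` shows the process name and its owning user side by side, which gives a starting point for tracing the load back to a service and a user.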
3. Incorrect service configuration
Your Nginx server and the backend services rely on many sub-systems to work properly.
These include DNS resolution, Apache processes, PHP services, the DB server, etc.
If even one of these services has a wrong config entry, that service will fail to respond, and Nginx will show a "502 Bad Gateway" error.
Some noted configuration issues include:
i. DNS resolver misconfigured in Nginx causing domain lookups to fail.
ii. DB login details set incorrectly after a recent migration, restore or upgrade.
iii. A syntax error in the Apache firewall (mod_security) settings, causing Apache to crash.
iv. Incorrect memory or file limits set for PHP applications.
v. Capacity limits (such as the number of connections per IP) set too restrictively, causing legitimate visits to fail.
There is no easy way to track down a configuration error.
You need to scan the error logs and pay close attention to what each error says.
For instance, the error below says the PHP application reached the maximum number of processes allowed (defined by the pm.max_children setting):
WARNING: [mysite.com] server reached max_children setting (30), consider raising it
ERROR: unable to read what child say: Bad file descriptor (9)
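To see how often a limit like this is reached, the warnings can be counted per site. The following is a minimal sketch; the function name `max_children_hits` is hypothetical, and it assumes the warnings follow the wording of the example above (with or without a "pm." prefix before max_children).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read PHP-FPM log lines on standard input and
# report how many times each pool reached its max_children limit.
max_children_hits() {
    # Keep only the pool name and limit, count duplicates, then tidy
    # the "count text" output of uniq -c into a readable line.
    sed -nE 's/.*\[([^]]+)\] server reached (pm\.)?max_children setting \(([0-9]+)\).*/\1 (limit \3)/p' |
        sort | uniq -c |
        sed -E 's/^ *([0-9]+) (.*)$/\2: \1 hit(s)/'
}
```

A count that keeps growing suggests the limit is too low for current traffic. Whether raising pm.max_children is safe depends on how much memory each PHP process uses on your server.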
If you are not familiar with PHP or web server settings, it is best to ask a server administrator.
How do Ibmi Media server experts prevent Nginx configuration errors?
As a quick aside, here is how to prevent server errors related to config issues.
Configuration errors are generally caused by stale server settings that have not been adjusted for new traffic or site upgrades.
That is why our dedicated server admins audit customer servers at least once a month.
During this audit, we detect possible performance bottlenecks, security loopholes and hardware issues.
This helps us resolve potential issues proactively, rather than reacting to downtime after an error has happened.
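One small piece of a routine check that is easy to automate is a syntax test of each service's configuration. The sketch below is an assumption about what such a check could look like, not a description of the audit above. The function names `run_check` and `check_all` are hypothetical, and the binary names (`nginx`, `apachectl`, `php-fpm`) vary between distributions, so adjust them for your server.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a syntax check if its program is installed.
# $1 is a label; the remaining arguments are the command to run.
run_check() {
    local label=$1
    shift
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$label: skipped ($1 not installed)"
        return 0
    fi
    if "$@" >/dev/null 2>&1; then
        echo "$label: OK"
    else
        echo "$label: FAILED"
        return 1
    fi
}

# Run the built-in configuration tests of the three services.
check_all() {
    local status=0
    run_check "Nginx"   nginx -t             || status=1
    run_check "Apache"  apachectl configtest || status=1
    run_check "PHP-FPM" php-fpm -t           || status=1
    return "$status"
}
```

A syntax test only catches malformed configuration. Wrong values, such as stale database credentials or limits that are too low, still have to be found through the error logs.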
4. Service port blocked in firewall
Firewalls are the bedrock of server security. But if not set up correctly, they can block legitimate requests or cause services to fail.
For instance, on Linux servers that run the Plesk automation suite, Nginx runs on port 80, and Apache runs on port 7080.
But firewalls by default block uncommon ports such as 7080, which leaves Nginx unable to connect to Apache.
Result? A 502 Bad Gateway error.
Such issues often happen when a new service is enabled (e.g. caching server, Ruby, etc.) in the backend, or during a migration, or after a server upgrade.
To fix it, we look at what port each service runs on using a command like this:
# netstat -lpn
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 19785/nginx
tcp6 0 0 :::80 :::* LISTEN 19785/nginx
If we find any service running on a non-standard port, we either reconfigure the service to use a standard port, or edit the firewall configuration to allow the non-standard port.
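On a busy server the netstat output can be long, so it helps to reduce it to one line per port. The following is a minimal sketch; the function name `listening_ports` is hypothetical, and it assumes the column layout shown in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reduce `netstat -lpn` output (on standard input)
# to one "port program" pair per listening TCP socket.
listening_ports() {
    awk '$6 == "LISTEN" {
        n = split($4, addr, ":")   # the port is the last colon-separated part
        split($7, proc, "/")       # column 7 looks like "19785/nginx"
        print addr[n], proc[2]
    }' | sort -u | sort -n
}
```

Run as root so that netstat can show program names: `netstat -lpn | listening_ports`. Any port in the result that is not on your firewall's allow list is a candidate for the problem described above.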
5. Web application bugs
A rarer cause of the "502 Bad Gateway" error is an error in the application code.
If your web server logs show a scary-looking error like this, it is possible that your application code is incompatible with the server version:
[notice] child pid 27831 exit signal Segmentation fault (11)
You will need to inspect the software requirements of your application, and re-configure the services to match the required versions.