

Troubleshoot Azure Cache for Redis timeouts

Are you trying to troubleshoot Azure Cache for Redis timeouts?

This guide will help you.


Azure Cache for Redis provides an in-memory data store based on the Redis software. Redis improves application performance by supporting common application architecture patterns.

However, timeouts can occur for several reasons, including requests that are bound by bandwidth limitations and commands that take a long time to process on the server.

Here at Ibmi Media, as part of our Server Management Services, we regularly help our customers with Azure-related queries.

In this context, we shall look into how to troubleshoot Azure Cache for Redis timeouts.


How to troubleshoot Azure Cache for Redis timeouts?

Here, you will learn ways to troubleshoot the Redis timeout issues.


Timeouts caused by Redis server patching

Azure Cache for Redis regularly updates its server software as part of the managed service functionality.

This patching activity takes place behind the scenes. During the failovers that happen while Redis server nodes are being patched, Redis clients connected to these nodes may experience temporary timeouts as connections are switched between the nodes.
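
Because these timeouts are temporary, one common way to ride through them is to retry the failed call after a short pause. The following is a minimal sketch, not a definitive implementation: the Retry helper, the attempt count and the delays are hypothetical choices for illustration, and it assumes the operation is safe to repeat. It catches TimeoutException, which is the base type of the timeout exception thrown by StackExchange.Redis.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of StackExchange.Redis): runs an operation and
// retries it with exponential backoff when it throws a TimeoutException.
static T Retry<T>(Func<T> operation, int maxAttempts = 3, int baseDelayMs = 200)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            return operation();
        }
        catch (TimeoutException) when (attempt < maxAttempts)
        {
            // Wait 200 ms, then 400 ms, and so on before the next attempt.
            Thread.Sleep(baseDelayMs * (1 << (attempt - 1)));
        }
    }
}

// Stand-in operation that times out twice and then succeeds.
int calls = 0;
string value = Retry(() =>
{
    calls++;
    if (calls < 3) throw new TimeoutException("simulated timeout during failover");
    return "cached-value";
});
Console.WriteLine($"{value} after {calls} attempts");
```

In a real application the lambda would wrap a cache call made through the shared connection. Once the last attempt fails, the exception is rethrown to the caller.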


StackExchange.Redis timeout exceptions for Azure Cache

StackExchange.Redis uses a configuration setting named synctimeout for synchronous operations, with a default value of 5000 ms.

If a synchronous call does not complete within the synctimeout interval, the StackExchange.Redis client throws a timeout error similar to the following example:
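
As a rough illustration of where this setting lives, the sketch below parses a connection string into a ConfigurationOptions object and sets the timeout in code. The host name, the password placeholder and the 10000 ms value are assumptions chosen for illustration only. A larger timeout hides slow operations rather than fixing them, so the root-cause checks in this guide still apply.

```csharp
using System;
using StackExchange.Redis;

// Placeholder host name and password; replace them with the values for your cache.
var options = ConfigurationOptions.Parse(
    "cachename.redis.cache.windows.net:6380,ssl=true,abortConnect=false,password=...");

// Illustrative value only: allow synchronous calls up to 10 seconds.
options.SyncTimeout = 10000;

Console.WriteLine($"syncTimeout = {options.SyncTimeout} ms");
```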

System.TimeoutException: Timeout performing MGET 2728cc84-58ae-406b-8ec8-3f962419f641, inst: 1,mgr: Inactive, queue: 73, qu=6, qs=67, qc=0, wr=1/1, in=0/0 IOCP: (Busy=6, Free=999, Min=2,Max=1000), WORKER (Busy=7,Free=8184,Min=2,Max=8191)

This error message contains metrics that can help point you to the cause and possible resolution of the issue. The table below contains details about the error message metrics.


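Error message metric: Details
inst: The number of commands issued in the last time slice.
mgr: The state of the socket manager. Inactive means it is waiting for the operating system to indicate a socket that has something to do, so the reader is not actively reading from the network.
queue: The total number of in-progress operations (73 in the example).
qu: In-progress operations that are still in the unsent queue and have not yet been written to the outbound network (6 in the example).
qs: In-progress operations that have been sent to the server but for which a response is not yet available (67 in the example).
qc: In-progress operations that have seen replies but are not yet marked as complete because they are waiting on the completion loop (0 in the example).
wr: Bytes/active writers. A value of 1/1 means there is an active writer, so the unsent requests are not being ignored.
in: Bytes/active readers. A value of 0/0 means there are no active readers and no bytes are available to be read on the NIC.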
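
When many of these errors end up in application logs, it can help to pull the metrics out programmatically. The sketch below is one possible approach under stated assumptions: ParseMetrics is a hypothetical helper, and the regular expression covers only the metric names that appear in the example message above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: extracts "name: value" and "name=value" pairs
// such as "queue: 73" or "qs=67" from a timeout message.
static Dictionary<string, string> ParseMetrics(string message)
{
    var metrics = new Dictionary<string, string>();
    var pattern = @"\b(inst|mgr|queue|qu|qs|qc|wr|in)\s*[:=]\s*([^,\s]+)";
    foreach (Match m in Regex.Matches(message, pattern))
    {
        metrics[m.Groups[1].Value] = m.Groups[2].Value;
    }
    return metrics;
}

string error = "Timeout performing MGET 2728cc84-58ae-406b-8ec8-3f962419f641, "
             + "inst: 1,mgr: Inactive, queue: 73, qu=6, qs=67, qc=0, wr=1/1, in=0/0";

var parsed = ParseMetrics(error);
int unsent = int.Parse(parsed["qu"]);
int awaiting = int.Parse(parsed["qs"]);
Console.WriteLine($"unsent={unsent}, awaiting response={awaiting}");
```

In the example message most operations are counted under qs, meaning they were sent to the server and are still waiting for a response.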

Here are the steps we follow to investigate possible root causes.

1. As a best practice, we make sure that the customer is using the following pattern to connect when using the StackExchange.Redis client.

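// Create the ConnectionMultiplexer once, on first use, and share it across the application.
private static readonly Lazy<ConnectionMultiplexer> lazyConnection = new Lazy<ConnectionMultiplexer>(() =>
{
    // abortConnect=false lets the multiplexer keep retrying in the background
    // if the first connection attempt fails.
    return ConnectionMultiplexer.Connect("cachename.redis.cache.windows.net,abortConnect=false,ssl=true,password=...");
});

// Shared connection used by all callers.
public static ConnectionMultiplexer Connection
{
    get
    {
        return lazyConnection.Value;
    }
}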

2. Ensure that the server and the client application are in the same region in Azure.


3. We make sure to use the latest version of the StackExchange.Redis NuGet package.


4. If the requests are bound by bandwidth limitations on the server or client, they take longer to complete and can time out. We check whether the timeout is caused by network bandwidth on the server or on the client, and take the corresponding action.


5. We check whether the server or the client is CPU bound. High CPU can prevent a request from being processed within the synctimeout interval, which causes the request to time out.

On the server side, we check this by monitoring the CPU cache performance metric.


6. Long-running commands that take a long time to process on the Redis server can cause timeouts. We use the SLOWLOG command to identify them.


7. Timeouts can also occur due to high Redis server load. We track this by monitoring the Redis Server Load cache performance metric.
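
For step 6, the slow log can also be read from application code. The sketch below shows one way to do it with StackExchange.Redis, using IServer.SlowlogGet. The connection string is a placeholder, the entry count of 10 is arbitrary, and allowAdmin=true is an assumption: the client may treat SLOWLOG as an admin command and refuse to send it otherwise.

```csharp
using System;
using StackExchange.Redis;

// Placeholder connection string; replace the host name and password with your own.
using var muxer = ConnectionMultiplexer.Connect(
    "cachename.redis.cache.windows.net:6380,ssl=true,abortConnect=false,allowAdmin=true,password=...");

// Ask each endpoint for its ten most recent slow log entries.
foreach (var endpoint in muxer.GetEndPoints())
{
    IServer server = muxer.GetServer(endpoint);
    foreach (CommandTrace entry in server.SlowlogGet(10))
    {
        // Print when the command ran, how long it took, and the command with its arguments.
        Console.WriteLine(
            $"{entry.Time:u}  {entry.Duration.TotalMilliseconds} ms  {string.Join(" ", entry.Arguments)}");
    }
}
```

This sketch opens its own connection only to stay self-contained. An application would normally reuse the shared connection from step 1.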




Conclusion

This article showed how to troubleshoot Azure Cache for Redis timeouts. Azure Cache for Redis regularly updates its server software as part of the managed service functionality that it provides.

Azure Cache for Redis is a fully managed, in-memory cache that enables high-performance and scalable architectures. Use it to create cloud or hybrid deployments that handle millions of requests per second at sub-millisecond latency, all with the configuration, security, and availability benefits of a managed service.

This patching activity takes place largely behind the scenes. During the failovers when Redis server nodes are being patched, Redis clients connected to these nodes may experience temporary timeouts as connections are switched between these nodes.


To help mitigate Azure memory issues:

1. Upgrade the cache to a larger size so that you aren't running against memory limitations on the system.

2. Set expiration times on the keys so that older values are evicted proactively.

3. Monitor the used_memory_rss cache metric. When this value approaches the size of your cache, you're likely to start seeing performance issues. Distribute the data across multiple shards if you're using a premium cache, or upgrade to a larger cache size.
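
As an illustration of point 2, the sketch below writes a value with an expiration time using StackExchange.Redis and then reads back the remaining time to live. The connection string, the key name and the 30-minute lifetime are placeholders chosen for this example, not recommended values.

```csharp
using System;
using StackExchange.Redis;

// Placeholder connection string; replace the host name and password with your own.
using var muxer = ConnectionMultiplexer.Connect(
    "cachename.redis.cache.windows.net:6380,ssl=true,abortConnect=false,password=...");
IDatabase db = muxer.GetDatabase();

// Store the value with a 30-minute lifetime so Redis removes it automatically.
db.StringSet("session:example", "some value", TimeSpan.FromMinutes(30));

// Read back the remaining time to live (null if the key has no expiration).
TimeSpan? ttl = db.KeyTimeToLive("session:example");
Console.WriteLine($"Remaining time to live: {ttl}");
```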


To fix CPU-bound conditions on the server or on the client:

i. Check if you're getting bound by CPU on your client. High CPU could cause the request to not be processed within the synctimeout interval and cause a request to time out. 

ii. Moving to a larger client size or distributing the load can help to control this problem.

iii. Check if you're getting CPU bound on the server by monitoring the CPU cache performance metric. Requests coming in while Redis is CPU bound can cause those requests to time out. To address this condition, you can distribute the load across multiple shards in a premium cache, or upgrade to a larger size or pricing tier.