AWS Redshift Monitoring: The Complete Guide

Would you like to detect problems in your Amazon Redshift environments? Does your team need a high-level overview of what monitoring options they can choose from when they deploy Redshift nodes and clusters?

This post was originally published on Stackify.

First, we'll start with one of the most important components of any monitoring strategy: performance and availability monitoring. Then, we'll continue with monitoring Redshift configuration changes and how to meet compliance requirements with Redshift. At the end of this post, I've got a surprise monitoring challenge for you to take your monitoring game to the next level, so get ready!

To get your feet wet, let's start with the essentials of AWS Redshift monitoring.

What Is AWS Redshift?

AWS Redshift is part of Amazon’s big data ecosystem and is a fully managed data warehouse platform. It stores and analyzes large amounts of data, up to petabyte scale, very quickly. That performance comes from Redshift’s columnar storage and massively parallel processing (MPP) architecture. Redshift supports a wide range of data sources and many business intelligence and reporting applications. This is why Redshift is one of the fastest-growing big data products in the Amazon cloud.

Redshift is based on PostgreSQL and works very much like any other relational database system. This means your team can use their SQL skills to manage and query data with relative ease.

Moving on from this high-level overview of Redshift, let’s now turn our attention to Redshift monitoring. I find monitoring somewhat simplified in the cloud compared to on-prem data warehouse monitoring.

With the latter, your team must keep track of a lot of monitoring metrics all the time. These metrics represent a wide range of technology stacks and sometimes multiple monitoring systems. Does monitoring have to be that complicated in the cloud? Let’s review the key monitoring options you can choose from, starting with availability and performance monitoring.

AWS Redshift Infrastructure Availability and Performance Monitoring

The first tool in your Redshift monitoring toolkit is AWS CloudWatch. CloudWatch collects and analyzes Redshift performance metrics and can send performance and availability alerts for your team to investigate. Think of CloudWatch as your eyes and ears in the cloud. It can generate reports about performance and availability, as well as charts that you can use to gain better insights and spot any trends about the health of your Redshift operations. These reports can even help you to justify your team’s Redshift costs to C-level executives.

CloudWatch monitors Redshift performance and availability metrics at the cluster and node level. Think of a cluster node as a server in the Redshift cluster. For performance, CloudWatch keeps track of various storage, network, and server compute metrics, like CPU and disk utilization, storage read/write IOPS, network throughput, overall health status, and so on. The volume of metrics is manageable, unlike that of on-premises metrics. However, these CloudWatch metrics focus mainly on Redshift cluster infrastructure, not on database and query performance. Let’s see how your team can monitor Redshift database query performance next.
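To make the alerting idea concrete, here is a minimal sketch of a CloudWatch alarm on a Redshift cluster's CPU utilization, written in Python with boto3. The alarm name, the 90 percent threshold, the three five-minute evaluation periods, and the SNS topic ARN are illustrative assumptions, not recommendations from this post. The helper names build_cpu_alarm and create_cpu_alarm are hypothetical.

```python
from typing import Any, Dict


def build_cpu_alarm(cluster_id: str, sns_topic_arn: str,
                    threshold_pct: float = 90.0) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's put_metric_alarm call.

    The threshold and evaluation window are assumptions for illustration.
    """
    return {
        "AlarmName": f"redshift-{cluster_id}-high-cpu",
        "Namespace": "AWS/Redshift",       # namespace Redshift publishes to
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "ClusterIdentifier", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": 300,                     # seconds per datapoint
        "EvaluationPeriods": 3,            # 3 x 5 minutes above threshold
        "Threshold": threshold_pct,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],   # where the alert is sent
    }


def create_cpu_alarm(cluster_id: str, sns_topic_arn: str) -> None:
    """Create the alarm. Needs boto3 and AWS credentials to run."""
    import boto3  # imported lazily so the builder works without the SDK

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**build_cpu_alarm(cluster_id, sns_topic_arn))
```

Keeping the request builder separate from the API call lets your team review or unit test the alarm definition without touching an AWS account.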

Monitor Redshift Database Query Performance

To monitor your Redshift database and query performance, let’s add Amazon Redshift Console to our monitoring toolkit. Your team can access this tool by using the AWS Management Console.

When your team opens the Redshift Console, they’ll gain database query monitoring superpowers, and with these powers, tracking down the longest-running and most resource-hungry queries is going to be a breeze. What’s more, your team can drill down and see the technical details of each query stage during execution using the Workload Execution Breakdown charts. However, bear in mind that to use these charts, you’ll need at least a two-node Redshift cluster.

CloudWatch can also monitor how long database queries run with the QueryDuration metric. Your team can use this metric to detect problematic queries and tackle them head-on. However, long-running queries are not the only thing your team should monitor. Let’s dive into Redshift configuration monitoring next.
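As a rough sketch of how the QueryDuration metric could be pulled and screened, the Python snippet below builds a get_metric_statistics request and filters the returned datapoints. It assumes the metric is reported in microseconds and carries a latency dimension with the value "long", which is my reading of the CloudWatch documentation and worth verifying for your cluster. The ten-minute threshold and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional, Tuple


def build_query_duration_request(cluster_id: str, hours: int = 24,
                                 now: Optional[datetime] = None) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's get_metric_statistics call."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Redshift",
        "MetricName": "QueryDuration",
        "Dimensions": [
            {"Name": "ClusterIdentifier", "Value": cluster_id},
            # Assumption: "long" is the latency bucket for slow queries.
            {"Name": "latency", "Value": "long"},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,
        "Statistics": ["Maximum"],
    }


def slow_datapoints(datapoints: List[Dict[str, Any]],
                    threshold_seconds: float) -> List[Tuple[datetime, float]]:
    """Return (timestamp, seconds) pairs whose maximum exceeds the threshold.

    Assumes the metric values are in microseconds.
    """
    slow = []
    for point in datapoints:
        seconds = point["Maximum"] / 1_000_000
        if seconds > threshold_seconds:
            slow.append((point["Timestamp"], seconds))
    return sorted(slow)


def fetch_slow_datapoints(cluster_id: str, threshold_seconds: float = 600.0):
    """Call CloudWatch. Needs boto3 and AWS credentials to run."""
    import boto3  # imported lazily so the helpers work without the SDK

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **build_query_duration_request(cluster_id))
    return slow_datapoints(response["Datapoints"], threshold_seconds)
```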

AWS Cloud Diary

Sometimes you need to monitor not just Redshift performance and availability, but also other operational changes and activities that may impact your Redshift deployments.

Let me introduce AWS CloudTrail as a foundational component of your Redshift monitoring strategy. CloudTrail is an auditing service that records all actions, API calls, events, and activities in the cloud for every Amazon service, including Redshift. If CloudWatch is your eyes and ears, then CloudTrail is the all-knowing “cloud diary” that keeps track of your Redshift node and cluster configuration changes. When you combine CloudTrail with CloudWatch, your team can monitor Redshift configuration changes, which can help them immensely with regulatory and compliance requirements too.

CloudTrail keeps track of more than 80 Redshift configuration and security-related events. Wouldn’t you like to know when someone creates a snapshot of your most critical databases? Or when someone configures a Redshift security group with unrestricted access to a sensitive Redshift database?

With CloudTrail, you can record all these configuration changes in CloudTrail log repositories. Then, your team can use CloudWatch to monitor those CloudTrail logs and send monitoring alerts for further investigation. With this monitoring duo, your team can’t miss what’s going on with your Redshift clusters anymore. But wait, there’s more.
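One way to wire this monitoring duo together is a CloudWatch Logs metric filter on the log group that receives your CloudTrail events. The Python sketch below assumes CloudTrail is already delivering to a log group. The log group name you pass in, the filter name, and the custom metric name and namespace are all hypothetical. The two event names in the usage note are Redshift API actions that match the snapshot and security group examples above.

```python
from typing import Any, Dict, List


def build_redshift_event_filter(log_group: str,
                                event_names: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch Logs' put_metric_filter call.

    The filter matches CloudTrail records whose event source is Redshift
    and whose event name is one of the given API actions.
    """
    names = " || ".join(f'($.eventName = "{name}")' for name in event_names)
    pattern = (
        '{ ($.eventSource = "redshift.amazonaws.com") && '
        f"({names}) }}"
    )
    return {
        "logGroupName": log_group,
        "filterName": "redshift-sensitive-changes",      # hypothetical name
        "filterPattern": pattern,
        "metricTransformations": [{
            "metricName": "RedshiftSensitiveChangeCount",  # hypothetical
            "metricNamespace": "Custom/RedshiftAudit",     # hypothetical
            "metricValue": "1",  # count one per matching event
        }],
    }


def create_redshift_event_filter(log_group: str, event_names: List[str]) -> None:
    """Create the metric filter. Needs boto3 and AWS credentials to run."""
    import boto3  # imported lazily so the builder works without the SDK

    logs = boto3.client("logs")
    logs.put_metric_filter(**build_redshift_event_filter(log_group, event_names))
```

For example, calling build_redshift_event_filter with ["CreateClusterSnapshot", "AuthorizeClusterSecurityGroupIngress"] produces a filter that counts those two actions. A CloudWatch alarm on the resulting custom metric can then notify your team.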

AWS Redshift Database Audit Logging

Using CloudTrail with CloudWatch gives you immense monitoring powers, but this might still not give you enough visibility. Sometimes, you also want to monitor what’s going on inside your databases. To do that, your team should configure Redshift database audit logging. With this, your team can monitor and detect any configuration changes in Redshift database schemas, database user changes, database connections, authentication attempts, database queries, and so much more.

Redshift can generate and send these log entries to an S3 bucket, and it also logs these activities in database system tables on each Redshift node. If you want to analyze these audit logs from a central location, your team can also consider AWS Redshift Spectrum, which can query the log files in S3 directly.
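Here is a minimal Python sketch of turning on audit logging to S3 with boto3. The bucket name, key prefix, and parameter group name are placeholders you would supply. As far as I know, the user activity log (the one that records queries) additionally requires the enable_user_activity_logging parameter in the cluster's parameter group, so the sketch builds that request too. The function names are hypothetical.

```python
from typing import Any, Dict


def build_enable_logging(cluster_id: str, bucket: str,
                         prefix: str = "redshift-audit/") -> Dict[str, Any]:
    """Build keyword arguments for Redshift's enable_logging call."""
    return {
        "ClusterIdentifier": cluster_id,
        "BucketName": bucket,     # bucket must allow Redshift to write logs
        "S3KeyPrefix": prefix,    # placeholder prefix for the log objects
    }


def build_user_activity_parameter(parameter_group: str) -> Dict[str, Any]:
    """Build keyword arguments for modify_cluster_parameter_group.

    Turns on the parameter that the user activity log depends on.
    """
    return {
        "ParameterGroupName": parameter_group,
        "Parameters": [{
            "ParameterName": "enable_user_activity_logging",
            "ParameterValue": "true",
        }],
    }


def enable_audit_logging(cluster_id: str, bucket: str,
                         parameter_group: str) -> None:
    """Apply both settings. Needs boto3 and AWS credentials to run."""
    import boto3  # imported lazily so the builders work without the SDK

    redshift = boto3.client("redshift")
    redshift.enable_logging(**build_enable_logging(cluster_id, bucket))
    redshift.modify_cluster_parameter_group(
        **build_user_activity_parameter(parameter_group))
```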

AWS Redshift Compliance Monitoring

Last but not least, let’s discuss how you can monitor Redshift configuration drifts and compliance with AWS Config. This service continuously monitors and tracks any configuration changes.

Hold on, haven’t we talked about this earlier? You’re right.

AWS Config uses CloudTrail logs, but with a key difference. It not only monitors the configuration changes, but it also compares and evaluates those changes against your own configuration rules and industry standards. Previously, your team only knew the result of the configuration change, but they didn’t know what the original configuration was unless they spent time investigating it. AWS Config gives you the full picture of those changes and flags your cluster as either compliant or noncompliant. It doesn’t get any easier than that.
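To illustrate the compliant or noncompliant flagging, the Python sketch below registers an AWS managed Config rule for Redshift clusters and extracts the noncompliant cluster IDs from evaluation results. It assumes the managed rule identifier REDSHIFT_CLUSTER_CONFIGURATION_CHECK with the clusterDbEncrypted and loggingEnabled input parameters, so please confirm both against the current list of managed rules. The rule name and function names are hypothetical.

```python
import json
from typing import Any, Dict, List


def build_redshift_config_rule(
        rule_name: str = "redshift-cluster-configuration-check") -> Dict[str, Any]:
    """Build keyword arguments for AWS Config's put_config_rule call."""
    return {
        "ConfigRule": {
            "ConfigRuleName": rule_name,
            "Scope": {"ComplianceResourceTypes": ["AWS::Redshift::Cluster"]},
            "Source": {
                "Owner": "AWS",  # an AWS managed rule, not a custom one
                # Assumption: identifier of the managed Redshift rule.
                "SourceIdentifier": "REDSHIFT_CLUSTER_CONFIGURATION_CHECK",
            },
            # Assumption: parameter names accepted by this managed rule.
            "InputParameters": json.dumps({
                "clusterDbEncrypted": "true",
                "loggingEnabled": "true",
            }),
        }
    }


def noncompliant_clusters(evaluation_results: List[Dict[str, Any]]) -> List[str]:
    """Return the sorted resource IDs flagged NON_COMPLIANT."""
    flagged = []
    for result in evaluation_results:
        if result.get("ComplianceType") == "NON_COMPLIANT":
            qualifier = result["EvaluationResultIdentifier"]["EvaluationResultQualifier"]
            flagged.append(qualifier["ResourceId"])
    return sorted(flagged)


def check_compliance(rule_name: str = "redshift-cluster-configuration-check") -> List[str]:
    """Create the rule and read its results. Needs boto3 and credentials."""
    import boto3  # imported lazily so the helpers work without the SDK

    config = boto3.client("config")
    config.put_config_rule(**build_redshift_config_rule(rule_name))
    response = config.get_compliance_details_by_config_rule(
        ConfigRuleName=rule_name)
    return noncompliant_clusters(response["EvaluationResults"])
```

Note that a freshly created rule needs time to evaluate, so the first call to check_compliance may return an empty list even when noncompliant clusters exist.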

Monitoring Challenge to Take Your Application Monitoring to the Next Level

AWS Redshift monitoring is only the tip of the monitoring iceberg, but now let’s make this more interesting with an application monitoring challenge.

In this challenge, you need to capture and store your application errors and exceptions—even the ones you’re not aware of. Next, you need to correlate, enrich, and analyze that diagnostic data. Finally, you have to turn all that data into meaningful and actionable results that your team can use to fix difficult application issues.

Getting to the actionable results stage is hard; it takes time and skills you and your team might not have at the moment. Your developers and engineers need these results ASAP because they have a new software release coming up and it’s all hands on deck. Retrace can help you get those actionable results and insights about your application with a click of a button.

Go Forth and Monitor

I’ll wrap this up now with three keywords to remember for your Redshift monitoring strategy. These words are CloudWatch, CloudTrail, and AWS Config.

CloudWatch is your eyes and ears that monitor availability and performance metrics. CloudTrail is the all-knowing audit logging service to capture Redshift—and, in fact, all cloud—configuration changes. When you combine CloudWatch and CloudTrail, you’ll get full operational visibility of Redshift. With AWS Config, you can monitor and track configuration drifts and compliance.

To improve your Redshift monitoring game, remember these three monitoring services and tell your team about them. And of course, don’t forget to take the application monitoring challenge with Retrace and go from zero to hero in no time.