

Amazon Redshift - Its features and how to set it up

Are you trying to set up Amazon Redshift?

This guide is for you.


Amazon Redshift is a fully managed, petabyte-scale, cloud-based data warehouse product designed for storing and analyzing large data sets. It is also used to perform large-scale database migrations.

We have a lot of customers using Amazon Redshift since it is a fully managed data warehouse service in the cloud.

To create a data warehouse, we first launch a set of computing resources called nodes, which are organized into a group called a cluster.

Each cluster runs an Amazon Redshift engine and contains one or more databases.

Here at Ibmi Media, as part of our Server Management Services, we regularly help our customers with AWS-related queries.

In this context, we shall look into how to configure Amazon Redshift.


The main features of Amazon Redshift include:

1. Supports VPC − We can launch Redshift within VPC and control access to the cluster through the virtual networking environment.

2. Encryption − We can encrypt the data stored in Redshift. Encryption is a cluster-level setting that we can enable when launching the cluster.

3. SSL − We can use SSL to encrypt connections between clients and Redshift.

4. Scalable − With a few simple clicks, we can scale the number of nodes in the Redshift data warehouse up or down as required.

5. Cost-effective − It is a cost-effective alternative to traditional data warehousing practices.


How to set up Amazon Redshift?

Now, let us look at the steps to set up Amazon Redshift:


1. Sign in and launch a Redshift Cluster

i. Sign in to the AWS Management Console and open the Amazon Redshift console – https://console.aws.amazon.com/redshift/

ii. Select the region where we need to create the cluster using the Region menu.

iii. Then click the Launch Cluster button.

iv. Here, provide the required details and click the Continue button till the review page.

v. On the confirmation page, click the Close button to finish. The cluster is now visible in the Clusters list.

vi. Then select the cluster in the list and review the Cluster Status information.
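The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 library is installed and AWS credentials are already configured. The cluster identifier, user name, password, and the dc2.large node type are placeholder assumptions, not values from this guide.

```python
from typing import Any, Dict


def build_cluster_request(
    cluster_id: str,
    master_username: str,
    master_password: str,
    node_type: str = "dc2.large",  # assumed node type; pick one available in your region
    number_of_nodes: int = 1,
    db_name: str = "dev",
    port: int = 5439,  # default Amazon Redshift port
) -> Dict[str, Any]:
    """Assemble keyword arguments for the boto3 Redshift create_cluster call."""
    if number_of_nodes < 1:
        raise ValueError("number_of_nodes must be at least 1")
    request: Dict[str, Any] = {
        "ClusterIdentifier": cluster_id,
        "NodeType": node_type,
        "MasterUsername": master_username,
        "MasterUserPassword": master_password,
        "DBName": db_name,
        "Port": port,
    }
    if number_of_nodes == 1:
        # A single-node cluster must not specify NumberOfNodes.
        request["ClusterType"] = "single-node"
    else:
        request["ClusterType"] = "multi-node"
        request["NumberOfNodes"] = number_of_nodes
    return request


def launch_cluster(region: str, request: Dict[str, Any]) -> str:
    """Create the cluster, wait until it is available, and return its status.

    Requires boto3 and valid AWS credentials; creating a cluster incurs charges.
    """
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("redshift", region_name=region)
    client.create_cluster(**request)
    cluster_id = request["ClusterIdentifier"]
    client.get_waiter("cluster_available").wait(ClusterIdentifier=cluster_id)
    described = client.describe_clusters(ClusterIdentifier=cluster_id)
    return described["Clusters"][0]["ClusterStatus"]
```

For example, launch_cluster("us-west-2", build_cluster_request("examplecluster", "awsuser", "Example-Passw0rd")) would mirror steps ii to vi, with the returned status corresponding to the Cluster Status shown in the console.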


2. Configure the security group to authorize client connections to the cluster.

Follow these steps to configure the security group on the EC2-VPC platform.

i. Open Amazon Redshift Console and click Clusters on the navigation pane.

ii. When we select the desired Cluster, its Configuration tab opens.

iii. Then click the Security group.

iv. Here, click the Inbound tab.

v. Click the Edit button. Set the fields as below and click the Save button:

Type − Custom TCP Rule
Protocol − TCP
Port Range − Type the same port number used while launching the cluster. By default, the port for Amazon Redshift is 5439.
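Source − Select Custom IP, then type the CIDR range that clients connect from. Typing 0.0.0.0/0 allows connections from any IP address, so use it only for testing.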
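The inbound rule above can be expressed in code as well. This is a sketch only, assuming boto3 and configured AWS credentials. The security group ID and the 203.0.113.0/24 range used in the usage note are hypothetical placeholders.

```python
import ipaddress
from typing import Any, Dict


def build_redshift_ingress_rule(cidr: str, port: int = 5439) -> Dict[str, Any]:
    """Build one inbound TCP permission for the given IPv4 CIDR block and port."""
    # strict=True rejects input such as 203.0.113.5/24 that has host bits set.
    network = ipaddress.ip_network(cidr, strict=True)
    if network.version != 4:
        raise ValueError("this sketch only handles IPv4 CIDR blocks")
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [
            {"CidrIp": str(network), "Description": "Redshift client access"}
        ],
    }


def authorize_clients(
    region: str, security_group_id: str, cidr: str, port: int = 5439
) -> None:
    """Add the inbound rule to the security group. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the rule builder works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    ec2.authorize_security_group_ingress(
        GroupId=security_group_id,
        IpPermissions=[build_redshift_ingress_rule(cidr, port)],
    )
```

Calling authorize_clients("us-west-2", "sg-0123456789abcdef0", "203.0.113.0/24") would open port 5439 to that range only. The port must match the one chosen when launching the cluster.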


3. Connect to Redshift Cluster

Given below are the steps to connect directly to Redshift Cluster:

i. Connect to the cluster by using a SQL client tool. Redshift supports SQL client tools that are compatible with the PostgreSQL JDBC or ODBC drivers.

JDBC: https://jdbc.postgresql.org/download/postgresql-8.4-703.jdbc4.jar
ODBC: https://ftp.postgresql.org/pub/odbc/versions/msi/psqlodbc_08_04_0200.zip or
http://ftp.postgresql.org/pub/odbc/versions/msi/psqlodbc_09_00_0101x64.zip for 64-bit machines

ii. Use the following steps to get the Connection String.

a) Open Amazon Redshift Console and click Clusters in the navigation pane.

b) Select the cluster of choice and click the Configuration tab.

c) A page opens with JDBC URL under Cluster Database Properties. Copy the URL.
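The copied JDBC URL carries the endpoint host, port, and database name, which other tools often ask for separately. The helper below is a hypothetical sketch for splitting it apart. It assumes the URL follows the jdbc:redshift://host:port/database or jdbc:postgresql://host:port/database shape, and the endpoint in the usage note is a made-up example.

```python
import re
from typing import NamedTuple


class RedshiftEndpoint(NamedTuple):
    host: str
    port: int
    database: str


# Accept both the Redshift and the PostgreSQL JDBC prefixes; stop the
# database name at any trailing options introduced by '?' or ';'.
_JDBC_URL = re.compile(
    r"^jdbc:(?:redshift|postgresql)://"
    r"(?P<host>[^:/]+):(?P<port>\d+)/(?P<database>[^?;]+)"
)


def parse_jdbc_url(url: str) -> RedshiftEndpoint:
    """Split a JDBC URL into host, port, and database name."""
    match = _JDBC_URL.match(url.strip())
    if match is None:
        raise ValueError(f"not a recognised Redshift JDBC URL: {url!r}")
    return RedshiftEndpoint(match["host"], int(match["port"]), match["database"])
```

For instance, parse_jdbc_url("jdbc:redshift://examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev") yields the host name, the port 5439, and the database dev as separate fields.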

iii. Use the following steps to connect the Cluster with SQL Workbench/J.

a) Open SQL Workbench/J.

b) Click File, then click Connect window.

c) Select Create a new connection profile and fill in the required details like name, etc.

d) Click Manage Drivers. The Manage Drivers dialog box opens.

e) Click the Create a new entry button and fill in the required details.

f) Click the folder icon, navigate to the driver location, and click the Open button.

g) Leave the Classname box and Sample URL box blank. Then click OK.

h) Choose the driver from the list.

i) In the URL field, paste the JDBC URL.

j) Enter the username and password in their respective fields.

k) Finally, select the Autocommit box and click Save profile list.
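The same connection can also be opened from a script instead of SQL Workbench/J. The sketch below assumes the third-party psycopg2 PostgreSQL driver is installed, which is not something this guide prescribes. The host, user, and password values are placeholders.

```python
from typing import Any, Dict


def build_connection_params(
    host: str,
    database: str,
    user: str,
    password: str,
    port: int = 5439,  # default Amazon Redshift port
    require_ssl: bool = True,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a PostgreSQL-compatible driver."""
    params: Dict[str, Any] = {
        "host": host,
        "port": port,
        "dbname": database,
        "user": user,
        "password": password,
    }
    if require_ssl:
        # Ask the driver to encrypt the client connection with SSL.
        params["sslmode"] = "require"
    return params


def fetch_server_version(params: Dict[str, Any]) -> str:
    """Connect, run a trivial query, and return the version string.

    Requires psycopg2 and network access to the cluster.
    """
    import psycopg2  # imported lazily so the builder works without the driver

    connection = psycopg2.connect(**params)
    try:
        connection.autocommit = True  # mirrors the Autocommit box in the profile
        with connection.cursor() as cursor:
            cursor.execute("SELECT version()")
            return cursor.fetchone()[0]
    finally:
        connection.close()
```

The host, port, and database passed to build_connection_params are the same values contained in the JDBC URL from the cluster's Configuration tab.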




Conclusion

This article covers an effective method to set up Amazon Redshift. Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. It enables you to use your data to acquire new insights for your business and customers. The first step in creating a data warehouse is to launch a set of nodes, called an Amazon Redshift cluster.

Amazon Redshift is a relational database management system (RDBMS), so it is compatible with other RDBMS applications. Amazon Redshift is based on PostgreSQL, but the two have a number of very important differences that you need to take into account as you design and develop your data warehouse applications.

Amazon Redshift is specifically designed for online analytic processing (OLAP) and business intelligence (BI) applications, which require complex queries against large datasets.


What is the difference between Amazon Redshift, Amazon Redshift Spectrum, and Amazon Aurora?

Amazon Simple Storage Service (Amazon S3) is a service for storing objects, and Amazon Redshift Spectrum enables you to run Amazon Redshift SQL queries against exabytes of data in Amazon S3.

Both Amazon Redshift and Amazon RDS enable you to run traditional relational databases in the cloud while offloading database administration. 

Customers use Amazon RDS databases primarily for online transaction processing (OLTP) workloads, while Redshift is used primarily for reporting and analytics.
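The difference between the two workload types can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely as a local stand-in, not Redshift or RDS, and the orders table and its rows are invented for illustration.

```python
import sqlite3
from typing import List, Optional, Tuple


def make_demo_database() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up orders table."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    connection.executemany(
        "INSERT INTO orders (order_id, region, amount) VALUES (?, ?, ?)",
        [(1, "east", 20.0), (2, "west", 35.0), (3, "east", 15.0), (4, "west", 5.0)],
    )
    return connection


def lookup_order(connection: sqlite3.Connection, order_id: int) -> Optional[Tuple]:
    """OLTP-style access: read a single row by its primary key."""
    cursor = connection.execute(
        "SELECT order_id, region, amount FROM orders WHERE order_id = ?",
        (order_id,),
    )
    return cursor.fetchone()


def revenue_by_region(connection: sqlite3.Connection) -> List[Tuple]:
    """OLAP-style access: scan every row and aggregate for a report."""
    cursor = connection.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()
```

The point lookup touches one row and is typical of transactional applications, while the aggregate reads the whole table, which is the access pattern a data warehouse such as Redshift is designed for at much larger scale.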