Tuesday, May 11, 2021

AWS Redshift Architecture and Administrative Tasks

 

1. Architecture/Design

✓  Redshift is hosted on a cluster of Linux nodes; the architecture diagram is shown below.

✓  A Redshift cluster consists of one leader node and one or more compute nodes. In a single-node cluster, the same node performs both roles.

✓  The leader node communicates with end users and client applications through ODBC/JDBC.

✓  Redshift data is stored on the compute nodes. The classic compute node types are Dense Compute (DC) and Dense Storage (DS); newer RA3 nodes use managed storage.

✓  Each compute node is divided into slices. Each slice gets a share of the node's memory and disk and stores a portion of the data, which is mirrored for durability.

✓  Redshift is based on PostgreSQL (it is derived from PostgreSQL 8.0.2), although it differs from it in many ways.

✓  The Redshift version string combines the cluster version and the PostgreSQL version.

✓  Redshift stores data in a columnar fashion, which drastically reduces disk I/O for queries that read only some of a table's columns.

✓  After launching a Redshift cluster, we connect to it through the cluster endpoint, which contains the cluster name. Users cannot access the leader node (LN) or compute node (CN) hosts directly.

✓  AWS charges only for the compute nodes, not for the leader node.
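To make the endpoint point concrete, here is a minimal sketch of splitting an endpoint string into host, port and database and turning it into a JDBC URL. The endpoint value used in the usage note is a made-up placeholder, the helper names are our own, and 5439 is assumed as the port when none is given (it is the usual Redshift default).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int
    database: str


def parse_endpoint(endpoint: str) -> Endpoint:
    """Split a 'host:port/database' endpoint string into its parts."""
    host_port, _, database = endpoint.partition("/")
    host, _, port = host_port.partition(":")
    if not host or not database:
        raise ValueError("expected 'host[:port]/database'")
    # Assumption: fall back to 5439 when the port is omitted.
    return Endpoint(host, int(port) if port else 5439, database)


def jdbc_url(ep: Endpoint) -> str:
    """Build a JDBC URL in the form used by the Redshift JDBC driver."""
    return f"jdbc:redshift://{ep.host}:{ep.port}/{ep.database}"


def cluster_name(ep: Endpoint) -> str:
    """The first label of the endpoint host is the cluster name."""
    return ep.host.split(".", 1)[0]
```

For example, parse_endpoint("examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev") gives a cluster name of "examplecluster" and a JDBC URL that starts with jdbc:redshift://.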


2. DB administration  

✓  Data is loaded into and unloaded from Redshift with the COPY and UNLOAD commands. Both need access to an S3 bucket.

✓  One option is to supply an access_key_ID and secret_access_key for a user that is allowed to load and unload the data.

✓  Alternatively, we can authenticate COPY and UNLOAD against S3 with the ARN of an IAM role attached to the cluster.

✓  Each Redshift cluster comes with a master username/password, which acts as the superuser.

✓  We need to provide a cluster name for each Redshift cluster; it can reflect the name of the application (APP).

✓  We can also use tags to record the APP name.
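As an illustration of the COPY/UNLOAD points, the sketch below builds the two statements as strings using IAM role authentication. The table name, bucket, and role ARN in the usage note are placeholders, the helper names are our own, and CSV is only an assumed default format. The statements would still have to be run through whatever SQL client you use.

```python
def _quote(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def copy_statement(table: str, s3_uri: str, iam_role_arn: str, fmt: str = "CSV") -> str:
    """Build a COPY statement that loads a table from S3.

    Note: the table name and format are inserted as-is, so they must
    come from a trusted source.
    """
    return (
        f"COPY {table} FROM {_quote(s3_uri)} "
        f"IAM_ROLE {_quote(iam_role_arn)} FORMAT AS {fmt};"
    )


def unload_statement(query: str, s3_uri: str, iam_role_arn: str) -> str:
    """Build an UNLOAD statement that writes a query result to S3."""
    return (
        f"UNLOAD ({_quote(query)}) TO {_quote(s3_uri)} "
        f"IAM_ROLE {_quote(iam_role_arn)};"
    )
```

For example, copy_statement("sales", "s3://example-bucket/sales/", "arn:aws:iam::123456789012:role/ExampleRedshiftRole") produces a single COPY line. With key-based authentication, the IAM_ROLE clause would be replaced by ACCESS_KEY_ID and SECRET_ACCESS_KEY clauses.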


3. Backup/Restore methods/Process

 

✓  In Redshift we can take a snapshot of the entire cluster and restore the whole cluster from it.

✓  If needed, we can restore only specific tables from the snapshot.

✓  We can customize the retention period for Redshift snapshots.
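If these steps are scripted rather than done in the console, one possible approach is the boto3 Redshift client (boto3 is our assumption here, it is not mentioned above). The sketch only builds the parameter dictionaries, so it runs without AWS access; the cluster, snapshot, and table names in the usage note are placeholders, and the 0 to 35 day range reflects our understanding of the automated snapshot limit.

```python
def table_restore_params(cluster_id: str, snapshot_id: str,
                         database: str, table: str, new_table: str) -> dict:
    """Parameters for restoring one table from a cluster snapshot."""
    return {
        "ClusterIdentifier": cluster_id,
        "SnapshotIdentifier": snapshot_id,
        "SourceDatabaseName": database,
        "SourceTableName": table,
        "NewTableName": new_table,  # the table is restored under a new name
    }


def retention_params(cluster_id: str, days: int) -> dict:
    """Parameters for changing the automated snapshot retention period."""
    # Assumption: automated snapshot retention accepts 0 to 35 days.
    if not 0 <= days <= 35:
        raise ValueError("retention must be between 0 and 35 days")
    return {
        "ClusterIdentifier": cluster_id,
        "AutomatedSnapshotRetentionPeriod": days,
    }


# With boto3 installed and credentials configured, the dictionaries
# could be passed on like this:
#   client = boto3.client("redshift")
#   client.restore_table_from_cluster_snapshot(**table_restore_params(...))
#   client.modify_cluster(**retention_params(...))
```

For example, retention_params("examplecluster", 7) asks for automated snapshots to be kept for seven days.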


4. Maintenance Tasks needed

✓  We need a statistics collection (ANALYZE) job to keep query performance good.

✓  We need a VACUUM job. It is a kind of defragmentation activity that reclaims space and re-sorts rows; it can run while the cluster is online, or be scheduled for off-peak hours.

✓  Workload management (WLM) should be configured based on how the cluster is used.

✓  Based on storage usage, alerts can be raised on the disk usage of the cluster.

✓  We can set up CloudWatch monitoring for Redshift clusters to receive alarms.

✓  We can create event subscriptions for monitoring, providing an SNS topic ARN and an email address to receive the notifications.
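The VACUUM and statistics jobs mentioned above could be driven by a small script. The sketch below only generates the statements for a list of tables; the helper name and the table names in the usage note are our own placeholders, and scheduling and execution are left to your own tooling.

```python
from typing import Iterable, List

# VACUUM variants accepted by this sketch.
VACUUM_MODES = {"FULL", "SORT ONLY", "DELETE ONLY", "REINDEX"}


def maintenance_statements(tables: Iterable[str], vacuum_mode: str = "FULL") -> List[str]:
    """Return a VACUUM followed by an ANALYZE statement for each table."""
    if vacuum_mode not in VACUUM_MODES:
        raise ValueError(f"unsupported VACUUM mode: {vacuum_mode}")
    statements = []
    for table in tables:
        # VACUUM cannot run inside a transaction block, so the client
        # session should be in autocommit mode when these are executed.
        statements.append(f"VACUUM {vacuum_mode} {table};")
        # ANALYZE refreshes the statistics used by the query planner.
        statements.append(f"ANALYZE {table};")
    return statements
```

For example, maintenance_statements(["sales", "customers"]) returns four statements, a VACUUM and an ANALYZE for each table, in that order.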


5. Encryption

✓  We can encrypt the Redshift cluster using AWS KMS.

 

6. Tools used for administration

✓  We can do the administration work through the AWS console.

✓  The AWS console includes a SQL query editor, which we can use for DB administration.

✓  The query editor has a time limit: it only runs queries that finish in under 10 minutes.

✓  Aginity Workbench for Redshift is a client tool for working with Redshift.

✓  We can install a PostgreSQL client (such as psql) on other hosts and connect to the Redshift database.
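For the PostgreSQL client option, a connection from another host might look like the command built below. This is a sketch only: the host, user, and database in the usage note are placeholders, the helper name is our own, and 5439 is assumed as the port.

```python
import shlex


def psql_command(host: str, user: str, database: str, port: int = 5439) -> str:
    """Build a psql command line for connecting to a Redshift endpoint."""
    args = ["psql", "-h", host, "-p", str(port), "-U", user, "-d", database]
    # shlex.join quotes any argument that would need it in a shell.
    return shlex.join(args)
```

For example, psql_command("examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com", "awsuser", "dev") gives a command that can be pasted into a shell, where psql then prompts for the password.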


