datablogs: Database Administration
Showing posts with label Database Administration. Show all posts
Showing posts with label Database Administration. Show all posts

Tuesday, November 14, 2023

PostgreSQL Table Partition on AWS RDS

If we grow bigger in the business , seamlessly our customer and transaction data also increases . In the meantime performance needs to consider as well 

So in this case of bigger tables indexes will not help us to achieve good performance on peak times . Alternatively we have partitioning to split the tables data into multiple pieces on all the relational database environments 

Like wise we are going to do range partition on sample table in PostgreSQL Database , In PostgreSQL three types of partition methods are available , 


Below are the important concern in the PostgreSQL Partition ,

  • Possible to attach regular table into partitioned one 
  • Not Possible to transform regular table to partitioned one   

So based on the above formula , we have tried to transform regular table into partitioned one for your reference 

Any one can use this example and perform partitioning in AWS PostgreSQL RDS easily 

Click GitHub Link for Code : AWS-PostgreSQL-RDS-Table-Partition

Step 1 : Create base datablogspaycheck table and insert some sample records 

DROP TABLE IF EXISTS datablogspaycheck CASCADE;

DROP SEQUENCE IF EXISTS public.paycheck_id_seq;

CREATE SEQUENCE public.paycheck_id_seq

    START WITH 1

    INCREMENT BY 1

    NO MINVALUE

    NO MAXVALUE

    CACHE 1;

create table datablogspaycheck

(

    payment_id int NOT NULL DEFAULT nextval('public.paycheck_id_seq'::regclass), 

    created timestamptz NOT NULL,

    updated  timestamptz NOT NULL DEFAULT now(),

    amount float,

    status varchar DEFAULT 'new'

);

CREATE INDEX idx_paycheck ON datablogspaycheck (created);

INSERT INTO datablogspaycheck (created) VALUES (

generate_series(timestamp '2023-01-01'

               , now()

               , interval  '5 minutes') ); 

Step 2 : Rename base table with new name

ALTER TABLE datablogspaycheck RENAME TO datablogspaycheck_basetable;

Step 3 : Create Partitioned table 

create table datablogspaycheck

(

    payment_id int NOT NULL DEFAULT nextval('public.paycheck_id_seq'::regclass), 

    created timestamptz NOT NULL,

    updated  timestamptz NOT NULL DEFAULT now(),

    amount float,

    status varchar DEFAULT 'new'

)PARTITION BY RANGE (created);

Step 4 : Create Separate Partition for each create date 

CREATE TABLE datablogspaycheck_202303 PARTITION OF datablogspaycheck

    FOR VALUES FROM ('2023-01-01') TO ('2023-03-01');

   

CREATE TABLE datablogspaycheck_20230304 PARTITION OF datablogspaycheck

    FOR VALUES FROM ('2023-03-01') TO ('2023-04-01');

    

CREATE TABLE datablogspaycheck_202304 PARTITION OF datablogspaycheck

    FOR VALUES FROM ('2023-04-01') TO ('2023-05-01');

    

CREATE TABLE datablogspaycheck_202311 PARTITION OF datablogspaycheck

    FOR VALUES FROM ('2023-05-01') TO ('2023-11-01');

   

CREATE TABLE datablogspaycheck_2024 PARTITION OF datablogspaycheck

    FOR VALUES FROM ('2023-11-01') TO ('2024-01-01');

Step 5 : Migrate the all records

insert into datablogspaycheck (payment_id,created,updated,amount,status) select payment_id,created,updated,amount,status from datablogspaycheck_basetable;

Step 6 : Validate each partition 

select * from datablogspaycheck_202303 order by 2 desc

select * from datablogspaycheck_20230304 order by 2 desc

select * from datablogspaycheck_202311 order by 2 desc

Its done , Easily migrated normal table data into partitioned table 

Thanks for Reading !!!



Wednesday, June 21, 2023

Oracle RDS Audit log enable

Oracle Audit Log : 

Oracle Audit Log refers to the feature in Oracle Database that records and stores information about various database activities and events. It provides a mechanism to track and monitor user activities, system events, and changes made to the database.

  1. User Logins: Recording user login attempts and authentication information.
  2. Database Activities: Logging SQL statements executed by users, including select, insert, update, and delete operations.
  3. Privilege Usage: Monitoring the usage of privileges, such as granting or revoking permissions.
  4. Schema Changes: Tracking modifications to database objects, such as creating or altering tables, views, or indexes.
  5. System Events: Recording system-level events, such as startup and shutdown of the database.
  6. Security Violations: Detecting unauthorized access attempts or suspicious activities.
  7. Administrative Operations: Logging administrative tasks performed by database administrators, such as user management or database configuration changes.

The Oracle Audit Log provides an essential tool for security, compliance, and troubleshooting purposes.

Types of Auditing in Amazon RDS for Oracle : 

  1. Standard Auditing 
  2. Unified Auditing 
  3. Fine-grained Auditing

We are going to see , how do we enable Standard auditing in Oracle RDS 

How to enable Audit Log in Oracle RDS?

Make sure you have enabled custom parameter group for Oracle RDS 

  • Modify below values for Audit_Trail Parameter 

            Audit_Trail - DB, EXTENDED

  • Next ,Just needs to modify below DDL or DML statements to capture the logs from the server 

            AUDIT DELETE ANY TABLE;

            AUDIT DELETE TABLE BY USER_01 BY ACCESS;

            AUDIT DELETE TABLE BY USER_02 BY ACCESS;

            AUDIT ALTER, GRANT, INSERT, UPDATE, DELETE ON DEFAULT;

            AUDIT READ ON DIRECTORY datapump_dir;

Its all done , we have enabled required logs to capture for security purpose 

How to we monitor Audit Logs ? 

We can just run the below command get the captured audit logs in Oracle RDS ,

SELECT * FROM DBA_AUDIT_TRAIL order by 1 desc 

Its just for normal scenario , explained the process . Still we can separate Audit Table space and many further things are available in Oracle . Let see on another blogs 

Happy Auditing !!!


Friday, February 24, 2023

How to Automate MongoDB Database Backups in Linux

We have setup of One Primary with Multiple Secondary

Even if we configured highly available setup and backups , native backup are so special to take it and keep it somewhere in the cloud 

using the below script we can easily schedule backup in Linux environments  

https://github.com/selvackp/MongoNativeBackup-/blob/main/mongo_dump.sh

export PATH=/bin:/usr/bin:/usr/local/bin #Decalre Today Date TODAY=`date +"%d%b%Y"` #Declare Variables Required to pass for mongo dump command DB_BACKUP_PATH='/mnt/mongobackup' MONGO_HOST='localhost' MONGO_PORT='27017' MONGO_USER='xxxxxxxxxxx' MONGO_PASSWD='xxxxxxxxxxxxx' DATABASE_NAMES='ALL' #Remove Old Backup Files find ${DB_BACKUP_PATH} -name "*.zip" -type f -mtime +3 -delete find ${DB_BACKUP_PATH} -type d -mtime +3 -exec rm -rf {} \; #Create Directory for Backup mkdir -p ${DB_BACKUP_PATH}/${TODAY} cd ${DB_BACKUP_PATH}/${TODAY}/ if [ ${DATABASE_NAMES} = "ALL" ]; then echo "You have choose to backup all database" mongodump --uri="mongodb://${MONGO_USER}:${MONGO_PASSWD}@${MONGO_HOST}:${MONGO_PORT}" else echo "Running backup for selected databases" for DB_NAME in ${DATABASE_NAMES} do mongodump --uri="mongodb://${MONGO_USER}:${MONGO_PASSWD}@${MONGO_HOST}:${MONGO_PORT}/${DB_NAME}" done fi #Compress The Backup cd ${DB_BACKUP_PATH}/${TODAY} zip -r ${DB_BACKUP_PATH}_${TODAY}.zip ${DB_BACKUP_PATH}/${TODAY} cd ${DB_BACKUP_PATH}/${TODAY} #Copy the Compressed file into Azure Container using Shared Access Token azcopy cp ${DB_BACKUP_PATH}_${TODAY}.zip "https://xxxxxxxxxxx.blob.core.windows.net/xxxxxxxxxxxx?sp=w&st=xxxxxTxxxxxxxZ&se=xxxxxxZ&spr=https&sv=2021-06-08&sr=c&sig=csdfcdsxxxxxxxxxxxxxxx" --recursive=true #Send Mail with Backup Logs if [ $? -ne 0 ] then echo "Mongo Native backup Failed in $(hostname) $(date). Please contact administrator." | mail -r mail@datablogs.com -s "Mongo Native backup Failed $(hostname)" dbsupport@datablogs.com < /mongodata/cronscripts/mongo_backup_log.log else echo "Mongo Native backup completed in $(hostname)." | mail -r mail@datablogs.com -s "Mongo Native backup completed in $(hostname)" dbsupport@datablogs.com < /mongodata/cronscripts/mongo_backup_log.log fi

Friday, February 17, 2023

How to tune long running Queries on Azure Synapse DW (Dedicated Pool) - Part 2

Its took little long time to derive our approach tunning on azure data factory with Azure Synapse why because we need to run the system atleast two months validate our approach is smooth 

Yes its all running good as expected performance on the ETL loads and Processes 

Here are the major things we need to take care on Azure Synapse Dedicated Pool ,

  1. Dedicated SQL Pool Scaling 
  2. Dedicated SQL Pool Workload Management

Dedicated SQL Pool Scaling : 

We have decide below metrices to optimize the Azure Synapse 

  1. Data Load and Processing timelines
  2. Critical Process running timelines 
  3. Minimal Processing timelines 
  4. Reporting and Visualtions Timelines 
Based on above requirements , we can easily scalable the Dedicated SQL Pool based on the load 

Consider we need to load the data and process with in 2 hours , we can upscale the Dedicated SQL Pool to as like we want 

Key Part to increase the DW is concurrency , Instance Capacity and allocated resource for each queries 


Before start our critical process , we can automate upscale process with ADF Pipelines itself . So many blogs available to configure that 

Best method configure the authentication method with service principle ,


Dedicated SQL Pool Workload Management : 

We have decide below metrices to prepare workload management 

  1. Short Queries
  2. Critical Queries 
  3. High Intensive Queries 
  4. Reporting Queries

Based on the classifications , we have to split workload group for above queries 

Step 1 :

We need to create login and user for workload management in Dedicated SQL Pool 

--CREATE LOGIN [Analyticsprocess] WITH PASSWORD='xxxxxxxxx'

--CREATE USER [Analyticsprocess] FOR LOGIN [Analyticsprocess]

--GRANT CONTROL ON DATABASE::[sql-datablogs-dw] TO 

Analyticsprocess

Step 2 :

Consider you have upscaled instance into DW400c below are the resources allocation for the DW400c instance concurrency requirements 

In the workload group --> New workload group --> Click ELT

Consider analytics process user is used for high intensive queries we have to allocate as much as minimum resource for workload group 

Click Classifiers --> Add Classifiers --> Name it as ELT --> and specify Member should be the login and Label is important to mention 


Once click Add , we will get below concurrency range based on DW400c




By Default , its having system level workload group to handle the queries but its not effective we have to force our workload group 



Step 3 :

This is very important to utilize the workload group properly . We need to specify Label on the heavy processing queries so that it will utilized properly  

CREATE TABLE rpt_datablogs_finalreport.smgreport WITH (HEAP,DISTRIBUTION = REPLICATE) AS select * into rpt_datablogs_finalreport.smgreport_vw from rpt_datablogs_finalreport.vw_smgreport OPTION (LABEL='highintensiveprocess')

 


So these two things given much improvement on performance as well as reduced operational cost below 60% 

Still few more things to learn to optimize it on Azure Data Factory lets see on that next blog 

Happy Troubleshooting !!! 


Monday, January 30, 2023

Update Backup and Restore in Same SQL Server RDS

Update of Previous post ,

As mentioned in below blog , past years we cant restore database into same SQL Server RDS 

https://medium.com/@yogitha.o/backup-and-restore-in-same-sql-server-rds-3e667622f7ff

But now you can able to restore without any issues from recent releases of SQL Server RDS 

Thursday, December 15, 2022

Sunday, August 28, 2022

How to tune long running Queries on Azure Synapse DW (Dedicated Pool) - Part 1

Three hours of long running pipeline is reduced to run in 35 Minutes when we given right kind of top bottom optimization like our Body !!! 

Yes Sometimes if we give proper core and cross workouts you can run 10km in 55 Minutes !!! 

Its happened to me But Not Now 😆😆😆 

Oh Oh we are away from Data ground !!! Lets Move into Azure Synapse tunning,

In the starting phase every piece of code and pipeline process was really trouble us to optimize it , whatever we do on tunning it will run as same time 

Waited long time until each process to complete as its running 3 hours !!!

But Reecha blog helped something to start with basic check and given some hope to fine tune it 

Remember below piece of monitoring code always help us to too dig more 

--Check Long Running Query in Azure Synapse

SELECT *
FROM sys.dm_pdw_exec_requests
WHERE status not in ('Completed','Failed','Cancelled')
  AND session_id <> session_id()
ORDER BY submit_time DESC;
--Find the distributed query plan steps for long running query

SELECT * FROM sys.dm_pdw_request_steps WHERE request_id = 'xxxxxx'  ORDER BY step_index; 

***Replace your request_id from first query result  
Sample Running Query :


Initially we thought data volume is huge , so we are expecting this much delay on complete . But distributed query Plans are given more finishing time for each queries in Azure Synapse

We will go step by step process of tunning , 

Multiple ways we can see the Execution plan of your Azure Synapse Queries

  1. Using Azure Console --> Click you Dedicated Pool --> Click Query Activity --> Each Queries you have Query Plans 

        2.Using SSMS , you can run below query and get your explain plan in XML Format 

                EXPLAIN select * from dbo_datablogs.vw_fact_transferdata (nolock)



So ,We have figured out the execution plan and further need to fix below things to make it faster 

We need to reduce data movement operations ( Shuffle Move , Trim Move , Partition Move ) and needs to have proper indexing on your business tables 

Based on your data distribution on tables and indexing , queries will be executed on SQL Pool . We need to distribute the tables accordingly . So Major part of the tunning on table level only 

Once we done that in proper way we have achieved your milestone ,

How to we distribute table in Azure Synapse ? 

Two types of distribution is available in Azure Synapse . Round Robin and Hash Distribution 

Round Robin Distribution
  • By Default , If you create table it will create in Round Robin Distribution
  • Table rows are distributed in all distribution 
  • Main purpose of this table to improve loading speed we can use round robin distribution
  • If your table is just junk or doesn't have proper any constraints and keys use round robin distribution
CREATE TABLE [dbo_datablogs].[ControlTable_blogs](
                [Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED NOT ENFORCED,
                [SourceObjectSettings] [nvarchar](max) NULL) 
WITH ( CLUSTERED COLUMNSTORE INDEX,  DISTRIBUTION = ROUND_ROBIN )

Hash Distribution
  • If your table is having frequent DML operations ( Insert , Update , Delete ) use hash distribution
  • It will increase performance 200% of your queries 
  • It will distribute the data to nodes based on your distributed column 
CREATE TABLE [dbo_datablogs].[ControlTable_blogs](
                [Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED NOT ENFORCED,
                [SourceObjectSettings] [nvarchar](max) NULL) 
WITH ( HEAP,  DISTRIBUTION = HASH(Id))

Replicate 

This is another way to storing the table in SQL Pool 
  • Full copy of the table will be available in all distribution to avoid data movement 
  • Table is having less size use replicate option
CREATE TABLE [dbo_datablogs].[ControlTable_blogs](
                [Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED NOT ENFORCED,
                [SourceObjectSettings] [nvarchar](max) NULL) 
WITH ( HEAP,  DISTRIBUTION = REPLICATE )

Thumb of Rules of Creating tables 

First Level : Find the absolute unique column or else use partial unique column for creating table as Hash Distribution 

Second Level : If your table doesn't have proper unique columns , try to create the table with used columns in Join , Group By , Distinct , Over and Having Clauses on Procedures or Select Clauses 

Lets examine First Level , 

When creating the table with Round Robin and created table with below script ,

CREATE TABLE dbt_datablogs.fact_transferdata
WITH
  (
    CLUSTERED COLUMNSTORE INDEX,  
    DISTRIBUTION = ROUND_ROBIN   
  )  
AS SELECT * FROM dbt_datablogs.vw_fact_transferdata

Its taken 28 Minutes to complete


So we figured out good candidate key and created table with below script 

CREATE TABLE dbt_datablogs.fact_transferdata
WITH
  (
    CLUSTERED COLUMNSTORE INDEX,  
    DISTRIBUTION = HASH(tranferaddressid)
  )  
AS SELECT * FROM dbt_datablogs.vw_fact_transferdata

Its taken 4 Minutes to complete 


We will see the huge difference when creating proper keys 

Lets examine Second Level ,

When we don't have proper keys , based on your business logics or select clause join conditions we can try to evaluate with multiple columns and create proper hash function to improve performance 

Lets do the smaller table tunning on this blog , you can try the same in your own environments , 

Below table is small but its having shuffle move and its delaying the process 11 Minutes ,







Then based on shuffle columns , add the proper hash function and recreate the table 

CREATE TABLE dbt_datablogs.fact_memberinfo
WITH
  (
    Heap,  
    DISTRIBUTION = HASH(infoid)
  )  
AS SELECT * FROM dbt_datablogs.vw_memberinfo

Once created the table again examined the Execution plan for the same statement and process is finished with in a seconds 



This is not an end , Also we have tunned few things on Data Factory and Azure Synapse Workload Management to get expected performance , we will see further on next blog 

To learn more in-depth , Please use below references from Microsoft Site as well 

References : 


Thursday, August 11, 2022

SQL Server Point in Time Recovery

 Oooooouch !!! I have deleted critical data in the ledger table !!!

When we ran delete or drop Script on any database without double check , Its Really big problem for the end users !!! Also you wont sleep 😁😁😁

If you have GODDBA , you are saved millions dollars of your data also your valuable customers . 

Oh Great 😀😀😀 How we can achieve that ? 

In SQL Server Multiple Database Backup Methods are available to tackle that , 

If SQL Server database is in full recovery model we can take transaction log backups on every hour/Minute/Seconds into local or remote drives . 

So, Yes am taking Log Backup 😕😕😕 How to we restore particular time ? 

Using with transaction logs we can bring back the data as much as you want depends on your log backup strategy . 

To achieve that here is the automated point in time restore script for your valuable environment ,

Scripts are available in https://github.com/selvackp/SQLServerPointInTimeRestore.git

So finally you have saved your millions dollars worth of data in 10 Minutes !!!

USE master
GO

DECLARE 
        @DatabaseOLDName			sysname  = 'test',										
        @DatabaseNewName			sysname  = 'test1',										
        @PrimaryDataFileName		        sysname  = 'test',									
	@SecDataFileName                        sysname  =  NULL,
        @DatabaseLogFileName		        sysname  = 'test_log',								
	@PrimaryDataFileCreatePath              sysname  = 'D:\MISC\Bkp\testdata.mdf',			
        @SecDataFileCreatePath		        sysname  = NULL,									
        @SecDataFileCreatePath1		        sysname  = NULL,									
        @DatabaseLogFileCreatePath              sysname  = 'D:\MISC\Bkp\test_log.ldf',		
        @PITRDateTime				datetime = '2022-08-11T20:44:11';

DECLARE @command				nvarchar(MAX),
        @OldPhysicalPathName		        nvarchar(MAX),
        @FullBackupDateTime			datetime,
        @DiffBackupDateTime			datetime,
        @LogBackupDateTime			datetime,
        @message				nvarchar(MAX);

SET @command = N'RESTORE DATABASE @DatabaseNewName FROM DISK = @OldPhysicalPathName WITH FILE = 1, NORECOVERY, NOUNLOAD, REPLACE, STATS = 5, STOPAT = @PITRDateTime,
     MOVE N''' + @PrimaryDataFileName + N''' TO N''' + @PrimaryDataFileCreatePath + N''','
           + COALESCE('
     MOVE N''' + @SecDataFileName + ''' TO N''' + @SecDataFileCreatePath + ''',', '')
           + N'
     MOVE N''' + @DatabaseLogFileName + N''' TO N''' + @DatabaseLogFileCreatePath + N''';';

SELECT     TOP (1) @OldPhysicalPathName = bmf.physical_device_name,@FullBackupDateTime = bs.backup_start_date
FROM       msdb.dbo.backupset AS bs INNER JOIN msdb.dbo.backupmediafamily AS bmf ON  bmf.media_set_id = bs.media_set_id
WHERE      bs.database_name = @DatabaseOLDName AND  bs.type= 'D' AND  bs.backup_start_date < @PITRDateTime
ORDER BY   bs.backup_start_date DESC;

SET @message = N'Starting restore of full backup file '+ @OldPhysicalPathName + N', taken ' + CONVERT(nvarchar(30), @FullBackupDateTime, 120);

RAISERROR(@message, 0, 1) WITH NOWAIT;

EXEC sys.sp_executesql @command,
                       N'@DatabaseNewName sysname, @OldPhysicalPathName nvarchar(260), @PITRDateTime datetime',
                       @DatabaseNewName,
                       @OldPhysicalPathName,
                       @PITRDateTime;


SET @command = N'RESTORE DATABASE @DatabaseNewName FROM DISK = @OldPhysicalPathName WITH FILE = 1, NORECOVERY, NOUNLOAD, REPLACE, STATS = 5, STOPAT = @PITRDateTime;';

SELECT     TOP (1) @OldPhysicalPathName = bmf.physical_device_name,@DiffBackupDateTime = bs.backup_start_date
FROM       msdb.dbo.backupset   AS bs INNER JOIN msdb.dbo.backupmediafamily AS bmf ON  bmf.media_set_id = bs.media_set_id
WHERE      bs.database_name = @DatabaseOLDName AND  bs.type  = 'I' AND  bs.backup_start_date >= @FullBackupDateTime AND  bs.backup_start_date< @PITRDateTime
ORDER BY   bs.backup_start_date DESC;

IF @@ROWCOUNT > 0
BEGIN;
    SET @message = N'Starting restore of differential backup file ' + @OldPhysicalPathName + N', taken ' + CONVERT(nvarchar(30), @DiffBackupDateTime, 120);

    RAISERROR(@message, 0, 1) WITH NOWAIT;

EXEC sys.sp_executesql @command,
                       N'@DatabaseNewName sysname, @OldPhysicalPathName nvarchar(260), @PITRDateTime datetime',
                       @DatabaseNewName,
                       @OldPhysicalPathName,
                       @PITRDateTime;
END;

SET @command = N'RESTORE LOG @DatabaseNewName
FROM DISK = @OldPhysicalPathName
WITH FILE = 1, NORECOVERY, NOUNLOAD, REPLACE, STATS = 5, STOPAT = @PITRDateTime;';

DECLARE c CURSOR LOCAL FAST_FORWARD READ_ONLY TYPE_WARNING FOR
SELECT     bmf.physical_device_name,
           bs.backup_start_date
FROM       msdb.dbo.backupset         AS bs
INNER JOIN msdb.dbo.backupmediafamily AS bmf
   ON      bmf.media_set_id = bs.media_set_id
WHERE      bs.database_name = @DatabaseOLDName
AND        bs.type                 = 'L'
AND        bs.backup_start_date    >= COALESCE(@DiffBackupDateTime, @FullBackupDateTime)
ORDER BY   bs.backup_start_date ASC;

OPEN c;

FETCH NEXT FROM c
INTO @OldPhysicalPathName,
     @LogBackupDateTime;

WHILE @@FETCH_STATUS = 0
BEGIN;
    SET @message = N'Starting restore of log backup file '
                   + @OldPhysicalPathName + N', taken '
                   + CONVERT(nvarchar(30), @LogBackupDateTime, 120);
    RAISERROR(@message, 0, 1) WITH NOWAIT;
    EXEC sys.sp_executesql @command,
                           N'@DatabaseNewName sysname, @OldPhysicalPathName nvarchar(260), @PITRDateTime datetime',
                           @DatabaseNewName,
                           @OldPhysicalPathName,
                           @PITRDateTime;

    IF @LogBackupDateTime >= @PITRDateTime
        BREAK;

    FETCH NEXT FROM c
    INTO @OldPhysicalPathName,
         @LogBackupDateTime;
END;

CLOSE c;
DEALLOCATE c;

SET @command = N'RESTORE DATABASE @DatabaseNewName
WITH RECOVERY;';

RAISERROR('Starting recovery', 0, 1) WITH NOWAIT;
EXEC sys.sp_executesql @command,
                       N'@DatabaseNewName sysname, @OldPhysicalPathName nvarchar(260), @PITRDateTime datetime',
                       @DatabaseNewName,
                       @OldPhysicalPathName,
                       @PITRDateTime;
GO

Wednesday, February 16, 2022

SQL Server Always On availability group cluster in the AWS Cloud


Microsoft gives HA features like a charm . Lower to higher deployment costs its giving many features as per business requirements . Replication , Mirroring , Log shipping and Always On many features available to build HA Setup in On Premises . 

Like wise , we can setup all the above features in Cloud as well . In that we can see Always on availability group cluster in this blog 




What is Always On Availability Group?


  • An availability group supports a replicated environment for a discrete set of user databases, known as availability databases. 
  • You can create an availability group for high availability (HA) or for read-scale. An HA availability group is a group of databases that fail over together.  


Environment Setup for Always on Availability Group ,


Launched one Active Directory and two SQL Nodes with below range . Detailed setup for environment steps are below ,


Below are the detailed steps for environment steps ,
Step 1 : Create ag-sql-vpc with 10.0.0.0/16 IPv4 CIDR range


Step 2 : Create two private subnets ( 10.0.3.0/24 , 10.0.1.0/24 ) for SQL Nodes and one public subnet ( 10.0.4.0/24 ) for Active Directory  


Step 3 : Launched the windows instances with two secondary ip's for Failover Cluster and Always on Listener
In this POC Setup , Launched windows instance and installed SQL Server Developer edition . Also we can launch Windows with SQL Server 2016 based on your requirements 







Step 4 : Change the computer properties and rename the instance names accordingly 
Step 5 : Completed the AD Server configuration and its named as ag-sql-AD , After that change DNS server address in network properties in ag-sql-node1 and ag-sql-node2 ( 10.0.4.33 is static IP of AD Server )



Step 6 : Once modified the DNS configuration reboot the server and login with AD administrator account 
Step 7 : Once logged in with AD login , Install the failover clustering and below dependent features in ag-sql-node1 and ag-sql-node2


Configuring Shared Drive for Backup and Restore 


Step 8 : Between the ag-sql-node1 and ag-sql-node2 needs to take backup and log backups for Always on background process

Step 9 : Create folder in ag-sql-node2 and share with everyone in AD account
  
Step 10 : Take one time backup of DW_Mart and DataLake in that shared folder . Created Shared drive will be used while always on group creation 

Failover Cluster Configuration 


Step 11 : Open the Failover Cluster Manager console and Create the cluster . Browse and add the both servers 


Step 12 : Once all the steps finished , create the cluster wizard 



Step 13 : Click agsqlprod failover cluster and modify the cluster core resources . In this we need to add secondary IP for both nodes ( 10.0.1.11 and 10.0.3.11 )

Once we added both secondary IP's one of the IP will be come to online 

If we have not added secondary IP , it will show as an error like below 



Configuring SQL Server Services  


Step 14 : Once all the steps are completed on Failover cluster manager , modify the SQL Service Account to AD service account 

Step 15 : Next right click the SQL Server Service in configuration manager and enable the Always on High Availability on ag-sql-node1 and ag-sql-node2 SQL instances 


Create and Configuring the Availability Group 

Step 16 : Right click the always on group wizard and create the availability group as agsqldb


Step 17 : Based on the requirements add the number of replicas ,


Step 18 : Below are the endpoints and make sure allowed below ports between the cluster nodes 


Step 19 : Then create availability group listener with remaining secondary IP ( 10.0.1.12 and 10.0.3.12 )


Step 20 : Once everything is completed click Next to create availability group 



Once its created we can able to see the Cluster Manager Role in the Failover cluster manager console ,




Ready to sync the Data from Primary to Secondary 



After all availability group is healthy and primary and secondary nodes are synchronized 

Thanks for Reading !!! Any corrections or any doubt please contact me directly !!! 

Wednesday, February 2, 2022

Redis to Elasticache Migration in AWS

Comparing Redis , AWS Elasticache giving multiple options to cache the data in cloud . Its enhanced with two ways of Access control options in PaaS Service 

Below are the few mission handling features we loved in elastic cache ,

  • Detecting data failures and recovery the nodes automatically 
  • Multi AZ features with automatic failover to replication nodes 
  • Redis Cluster mode enabled supports data partitioning up to 500 shards in different zones
  • Redis managed backups , automatic failure detection and log files transfer to Kinesis Firehouse and cloud watch 
  • Easy online migration from any Redis source 
  • Encryption at rest and Encryption in transit are supported 
  • Data volume up to peta bytes with micro second response time  
Let start migrating data from Redis to Elasticache ,

Preparing our source and target Redis nodes for migration 


  1. Launched two EC2 machines and installed redis 5.x version 
  2. Configured replication between master and replica
        To avoid downtime , we are doing online migration from replication server . so we have modified below values in replica server . Its based on your source modify the below values

      3.Configured replication between master and replica
  • protected-mode no
  • bind datablogs.xcnphd.ng.0001.use1.cache.amazonaws.com 6379
  • requirepass should be disabled

       Note : Transferring source master or replica should not have AUTH configured and protected mode disabled 

      4.Launch Elastic cache instance with below steps 

Cluster Mode Disabled ( Redis Engine 5.0.5 or Higher ) 


Multi-AZ Should be Enabled 


Encrypt at rest and in-transit should be disabled


If we have not followed above all steps , we will get below popup while we start the migration 
 

Starting Migration 


In Elastic console Click --> Action --> Migrate data from endpoint 



Once clicked , we can type the Source Redis Endpoint ( Private IP ) and Port ( Customized Port ) in below console ,




Then click Start Migration and follow the logs from Events in same console , 




If we get any error , It will be printed in the events log



Once its started , Restart the Master of Elasticache node one time , In our case we need to restart the replica server because its our source



Once restarted verify the info replication in Elasticache nodes 

Once started successfully , we can't perform any other operation in Elasticache nodes ,


Verifying the Data Migration Progress 


We have inserted sample record in the EC2 Master Node and verified the same in the replica and as well as in Elasticache 


Data looks good , once all the data transferred from IaaS to PaaS . Complete the migration in the Elasticache console and point the application endpoints to Elasticache node 


Thanks for Reading !!!