Thursday, December 15, 2016

Use Azure with SQL server on premises

In SQL 2016 you can integrate in many ways with Microsoft Azure. I decided to test some scenarios which can be useful for many customer environments.


Store part of on premises databases in Azure Blobstorage

Specific filegroups in Azure

This is typical aimed for data that is less used. For example, archived data or history data. 

In a scenario, we have a huge table with transactions from several years but we are only interested to query the current year. Then we can partition the table so older data reside on a partition where the file is stored in Azure. In our example FG_ARCHIVE is in Azure (See the URL point to the file), rest of the current year is partitioned in four partitions, one for each quarter.

Setup the filegroups and partitions schema and function. 


ALTER DATABASE [DatabaseName] ADD FILEGROUP [FG_1]
GO
ALTER DATABASE [DatabaseName] ADD FILEGROUP [FG_2]
GO
ALTER DATABASE [DatabaseName] ADD FILEGROUP [FG_3]
GO
ALTER DATABASE [DatabaseName] ADD FILEGROUP [FG_4]
GO
ALTER DATABASE [DatabaseName] ADD FILEGROUP [FG_ARCHIVE]
GO
ALTER DATABASE [DatabaseName] ADD FILE ( NAME = N'DatabaseName_data_FG_1', FILENAME = N'F:\Data1\MSSQL\Data\DatabaseName_data_FG_1.ndf' , SIZE = 8192KB , FILEGROWTH = 65536KB ) TO FILEGROUP [FG_1]
GO
ALTER DATABASE [DatabaseName] ADD FILE ( NAME = N'DatabaseName_data_FG_2', FILENAME = N'F:\Data1\MSSQL\Data\DatabaseName_data_FG_2.ndf' , SIZE = 8192KB , FILEGROWTH = 65536KB ) TO FILEGROUP [FG_2]
GO
ALTER DATABASE [DatabaseName] ADD FILE ( NAME = N'DatabaseName_data_FG_3', FILENAME = N'F:\Data1\MSSQL\Data\DatabaseName_data_FG_3.ndf' , SIZE = 8192KB , FILEGROWTH = 65536KB ) TO FILEGROUP [FG_3]
GO
ALTER DATABASE [DatabaseName] ADD FILE ( NAME = N'DatabaseName_data_FG_4', FILENAME = N'F:\Data1\MSSQL\Data\DatabaseName_data_FG_4.ndf' , SIZE = 8192KB , FILEGROWTH = 65536KB ) TO FILEGROUP [FG_4]
GO
ALTER DATABASE [DatabaseName] ADD FILE ( NAME = N'DatabaseName_data_fg_archive', FILENAME = 'https://ifazurebackuptest.blob.core.windows.net/ifdbdatafiles/aisfhub_financialhub_analysis_data_fg_archive.ndf' , SIZE = 8192KB , FILEGROWTH = 65536KB ) TO FILEGROUP [FG_ARCHIVE]
GO

CREATE PARTITION FUNCTION [DatePartitionFunctionNormal](date) AS RANGE RIGHT FOR VALUES (N'2016-01-01T00:00:00.000'
, N'2016-04-01T00:00:00.000'
, N'2016-07-01T00:00:00.000'
, N'2016-10-01T00:00:00.000')
GO

CREATE PARTITION SCHEME [myDateRange1] AS PARTITION [DatePartitionFunctionNormal] TO (FG_ARCHIVE, FG_1, FG_2, FG_3, FG_4, [Primary])
GO



















In the Azure portal, you can see the files. Before and after we have distribute data to it.












The possibilities to split and merge partitions into Azure is the same procedure like any SQL server.
Before switching the partition the structure for my example look like this:

Object
P#
Filegroup
Rows
Comparison
Date
dbo.Tablename
1
FG_ARCHIVE
497617
less than
2016-01-01
dbo. Tablename
2
FG_1
148045
less than
2016-04-01
dbo. Tablename
3
FG_2
45273344
less than
2016-07-01
dbo. Tablename
4
FG_3
100942096
less than
2016-10-01
dbo. Tablename
5
FG_4
72833665
less than
NULL


We now first split the last partition 4 and add a new one partition 5. Then merge the partition 2 into partition 1 so the data is moved from filegroup FG_1 to the archive filegroup FG_ARCHIVE. 

First add a new partition for new data with split date 2017-01-01.

ALTER DATABASE [SomeDatabaseName] ADD FILEGROUP [FG_5]
GO

ALTER DATABASE [AISFHUB_financialhub_analysis] ADD FILE
            ( NAME = Logical_filename_FG_5',
                     FILENAME = N'F:\xxxx_Data1\MSSQL\Data\XXXXXXXX_FG_5.ndf' ,
                     SIZE = 102400KB , FILEGROWTH = 102400KB ) TO FILEGROUP [FG_5]
GO

ALTER PARTITION SCHEME myDateRange1 NEXT USED [FG_5];
GO
ALTER PARTITION FUNCTION DatePartitionFunctionNormal() SPLIT RANGE (N'2017-01-01T00:00:00.000');
GO

Then merge data from FG_1 into FG_ARCHIVE. Merge date is 2016-01-01 and new date for partition 1 is now 2016-04-01.

ALTER PARTITION SCHEME myDateRange1 NEXT USED [FG_ARCHIVE];
GO
ALTER PARTITION FUNCTION DatePartitionFunctionNormal() MERGE RANGE (N'2016-01-01T00:00:00.000');
GO

The rows from FG_1 is now in FG_ARCHIVE. 497617 + 148045 is 645662

Object
P#
Filegroup
Rows
Comparison
Date
dbo. Tablename
1
FG_ARCHIVE
645662
less than
2016-04-01
dbo. Tablename
2
FG_2
45273344
less than
2016-07-01
dbo. Tablename
3
FG_3
100942096
less than
2016-10-01
dbo. Tablename
4
FG_4
72833665
less than
2017-01-01
dbo. Tablename
5
FG_5
0
less than
NULL


During the operation of migrate the data from one partition to another we could see the activity in Azure portal.














Test between Azure and on premises

Test nr one is just a simple Clustered index scan on this large table hitting only the data stored in filegroup FG_ARCHIVE. The query was forced to physically read the data from disk for each run. This difference is huge, about 200ms on SSD compared to a span between 5-8 seconds from Azure.

The second test was a customer defined query which only get data that is stored on filegoup FG_ARCHIVE. In this case, we were not able to see this differences as the first test did. The times was more or less quite the same. Probably this was because of the query did scan whole clustered index also on the other partitions.