Store part of on premises databases in Azure Blobstorage
Specific filegroups in Azure
This is typical aimed for data that is less used. For example, archived data or history data.
The possibilities to split and merge partitions into Azure is the same procedure like any SQL server.
First add a new partition for new data with split date 2017-01-01.
Test between Azure and on premises
The second test was a customer defined query which only get data that is stored on filegoup FG_ARCHIVE. In this case, we were not able to see this differences as the first test did. The times was more or less quite the same. Probably this was because of the query did scan whole clustered index also on the other partitions.
In a scenario, we have a huge table with transactions from several years but we are only interested to query the current year. Then we can partition the table so older data reside on a partition where the file is stored in Azure. In our example FG_ARCHIVE is in Azure (See the URL point to the file), rest of the current year is partitioned in four partitions, one for each quarter.
Setup the filegroups and partitions schema and function.
Setup the filegroups and partitions schema and function.
ALTER DATABASE [DatabaseName] ADD FILEGROUP [FG_1]
GO
ALTER DATABASE [DatabaseName] ADD FILEGROUP [FG_2]
GO
ALTER DATABASE [DatabaseName] ADD FILEGROUP [FG_3]
GO
ALTER DATABASE [DatabaseName] ADD FILEGROUP [FG_4]
GO
ALTER DATABASE [DatabaseName] ADD FILEGROUP
[FG_ARCHIVE]
GO
ALTER DATABASE [DatabaseName] ADD FILE ( NAME = N'DatabaseName_data_FG_1',
FILENAME = N'F:\Data1\MSSQL\Data\DatabaseName_data_FG_1.ndf' , SIZE = 8192KB , FILEGROWTH =
65536KB ) TO FILEGROUP [FG_1]
GO
ALTER DATABASE [DatabaseName] ADD FILE ( NAME = N'DatabaseName_data_FG_2',
FILENAME = N'F:\Data1\MSSQL\Data\DatabaseName_data_FG_2.ndf' , SIZE = 8192KB , FILEGROWTH =
65536KB ) TO FILEGROUP [FG_2]
GO
ALTER DATABASE [DatabaseName] ADD FILE ( NAME = N'DatabaseName_data_FG_3',
FILENAME = N'F:\Data1\MSSQL\Data\DatabaseName_data_FG_3.ndf' , SIZE = 8192KB , FILEGROWTH =
65536KB ) TO FILEGROUP [FG_3]
GO
ALTER DATABASE [DatabaseName] ADD FILE ( NAME = N'DatabaseName_data_FG_4',
FILENAME = N'F:\Data1\MSSQL\Data\DatabaseName_data_FG_4.ndf' , SIZE = 8192KB , FILEGROWTH =
65536KB ) TO FILEGROUP [FG_4]
GO
ALTER DATABASE [DatabaseName] ADD FILE ( NAME = N'DatabaseName_data_fg_archive', FILENAME = 'https://ifazurebackuptest.blob.core.windows.net/ifdbdatafiles/aisfhub_financialhub_analysis_data_fg_archive.ndf'
, SIZE = 8192KB
, FILEGROWTH =
65536KB ) TO FILEGROUP [FG_ARCHIVE]
GO
CREATE PARTITION FUNCTION [DatePartitionFunctionNormal](date) AS RANGE RIGHT FOR VALUES (N'2016-01-01T00:00:00.000'
, N'2016-04-01T00:00:00.000'
, N'2016-07-01T00:00:00.000'
, N'2016-10-01T00:00:00.000')
GO
CREATE PARTITION SCHEME
[myDateRange1] AS PARTITION
[DatePartitionFunctionNormal] TO (FG_ARCHIVE, FG_1, FG_2, FG_3, FG_4, [Primary])
GO
GO
In the Azure portal, you can see the files. Before and after we have distribute data to it.
The possibilities to split and merge partitions into Azure is the same procedure like any SQL server.
Before switching the partition the structure for my example look like this:
Object
|
P#
|
Filegroup
|
Rows
|
Comparison
|
Date
|
dbo.Tablename
|
1
|
FG_ARCHIVE
|
497617
|
less than
|
2016-01-01
|
dbo. Tablename
|
2
|
FG_1
|
148045
|
less than
|
2016-04-01
|
dbo. Tablename
|
3
|
FG_2
|
45273344
|
less than
|
2016-07-01
|
dbo. Tablename
|
4
|
FG_3
|
100942096
|
less than
|
2016-10-01
|
dbo. Tablename
|
5
|
FG_4
|
72833665
|
less than
|
NULL
|
We now first split the last partition 4 and add a new one partition 5. Then merge the partition 2 into partition 1 so the data is moved from filegroup FG_1 to the archive filegroup FG_ARCHIVE.
First add a new partition for new data with split date 2017-01-01.
ALTER DATABASE [SomeDatabaseName] ADD FILEGROUP [FG_5]
GO
ALTER DATABASE [AISFHUB_financialhub_analysis] ADD FILE
( NAME = Logical_filename_FG_5',
FILENAME = N'F:\xxxx_Data1\MSSQL\Data\XXXXXXXX_FG_5.ndf' ,
SIZE = 102400KB , FILEGROWTH = 102400KB ) TO FILEGROUP [FG_5]
GO
ALTER PARTITION SCHEME myDateRange1 NEXT USED [FG_5];
GO
ALTER PARTITION FUNCTION DatePartitionFunctionNormal() SPLIT RANGE (N'2017-01-01T00:00:00.000');
GO
Then merge data from FG_1 into FG_ARCHIVE. Merge date is 2016-01-01 and new date for partition 1 is now 2016-04-01.
ALTER PARTITION SCHEME myDateRange1 NEXT USED [FG_ARCHIVE];
GO
ALTER PARTITION FUNCTION DatePartitionFunctionNormal() MERGE RANGE (N'2016-01-01T00:00:00.000');
GO
The rows from FG_1 is now in FG_ARCHIVE. 497617 + 148045 is 645662
Object
|
P#
|
Filegroup
|
Rows
|
Comparison
|
Date
|
dbo. Tablename
|
1
|
FG_ARCHIVE
|
645662
|
less than
|
2016-04-01
|
dbo. Tablename
|
2
|
FG_2
|
45273344
|
less than
|
2016-07-01
|
dbo. Tablename
|
3
|
FG_3
|
100942096
|
less than
|
2016-10-01
|
dbo. Tablename
|
4
|
FG_4
|
72833665
|
less than
|
2017-01-01
|
dbo. Tablename
|
5
|
FG_5
|
0
|
less than
|
NULL
|
During the operation of migrate the data from one partition to another we could see the activity in Azure portal.
Test between Azure and on premises
Test nr one is just a simple Clustered index scan on this large table hitting only the data stored in filegroup FG_ARCHIVE. The query was forced to physically read the data from disk for each run. This difference is huge, about 200ms on SSD compared to a span between 5-8 seconds from Azure.
No comments:
Post a Comment