Tuesday, August 4, 2015

Partitioning and simple partitioning switching

Partitioning in SQL Server
In this article I will present a couple of useful SQL Server features. I got a question from a customer about how they could insert data on a daily basis into a heavily used table, so the focus was to minimize the impact of the insert operation. One way to solve this is to combine table partitioning with partition switching: we load the data into a staging table at any time and, since the switch operation only takes milliseconds, switch it into the live table at any time.

CREATE TABLE [dbo].[tblTestPartition](
            [id] [int] NOT NULL,
            [datum] [datetime] NULL,
            [objekt] [varchar](50) NULL
)
GO
CREATE TABLE [dbo].[tblTestPartitionStaging](
            [id] [int] NOT NULL,
            [datum] [datetime] NULL,
            [objekt] [varchar](50) NULL
)
GO
-----------Create the filegroups------------------------
ALTER DATABASE [testDB] ADD FILEGROUP [FGTestPartition1]
GO
ALTER DATABASE [testDB] ADD FILEGROUP [FGTestPartition2]
GO
ALTER DATABASE [testDB] ADD FILEGROUP [FGTestPartition3]
GO
ALTER DATABASE [testDB] ADD FILEGROUP [FGTestPartition4]
GO
------Setup the files, one on each filegroup-----------
ALTER DATABASE [testDB] ADD FILE ( NAME = N'FileTestPartition1',
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\TestPartition1.ndf' ,
SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [FGTestPartition1]
GO
ALTER DATABASE [testDB] ADD FILE ( NAME = N'FileTestPartition2',
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\TestPartition2.ndf' ,
SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [FGTestPartition2]
GO
ALTER DATABASE [testDB] ADD FILE ( NAME = N'FileTestPartition3',
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\TestPartition3.ndf' ,
SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [FGTestPartition3]
GO
ALTER DATABASE [testDB] ADD FILE ( NAME = N'FileTestPartition4',
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\TestPartition4.ndf' ,
SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [FGTestPartition4]
GO
---------Create the Partition functions-----------------
CREATE PARTITION FUNCTION [TestPartitionRange](datetime) AS RANGE RIGHT FOR VALUES ('2013-01-01', '2014-01-01', '2015-01-01'); -- boundary values listed in ascending order
GO
CREATE PARTITION FUNCTION [TestPartitionRangeStaging](datetime) AS RANGE RIGHT FOR VALUES ('2013-01-01', '2014-01-01', '2015-01-01');
GO
--------Create the Partition scheme----------------------
CREATE PARTITION SCHEME [TestPartitionRangeScheme]
AS
PARTITION TestPartitionRange
TO ([FGTestPartition1], [FGTestPartition2], [FGTestPartition3], [FGTestPartition4], [Primary]);
GO
CREATE PARTITION SCHEME [TestPartitionRangeSchemeStaging]
AS
PARTITION TestPartitionRangeStaging
TO ([FGTestPartition1], [FGTestPartition2], [FGTestPartition3], [FGTestPartition4], [Primary]);
GO
------Create the clustered indexes on the partition schemes--------
CREATE CLUSTERED INDEX [CLIX_Datum] ON [dbo].[tblTestPartition]
(
            [datum] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON TestPartitionRangeScheme (datum)
GO
CREATE CLUSTERED INDEX [CLIX_Datum] ON [dbo].[tblTestPartitionStaging]
(
            [datum] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON TestPartitionRangeSchemeStaging (datum)
GO
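Before loading any data we can verify which boundary values were created, with a quick query against the catalog views (a sketch; the function name matches the one created above):

```sql
-- List the boundary values of the partition function
SELECT pf.name AS partition_function,
       prv.boundary_id,
       prv.value AS boundary_value
FROM sys.partition_functions pf
JOIN sys.partition_range_values prv
    ON prv.function_id = pf.function_id
WHERE pf.name = N'TestPartitionRange'
ORDER BY prv.boundary_id;
```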

With the tables and partitioning objects in place we can now add some test data.
INSERT INTO [dbo].[tblTestPartition] ([id],[datum],[objekt]) VALUES (1, '2015-01-31', 'Test'); --Partition 4
INSERT INTO [dbo].[tblTestPartition] ([id],[datum],[objekt]) VALUES (1, '2014-01-31', 'Test'); --Partition 3
INSERT INTO [dbo].[tblTestPartition] ([id],[datum],[objekt]) VALUES (1, '2013-01-31', 'Test'); --Partition 2
INSERT INTO [dbo].[tblTestPartition] ([id],[datum],[objekt]) VALUES (1, '2012-01-31', 'Test'); --Partition 1
INSERT INTO [dbo].[tblTestPartition] ([id],[datum],[objekt]) VALUES (1, '2011-01-31', 'Test'); --Partition 1
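To double-check where each row landed, the $PARTITION function maps a value through the partition function to its partition number:

```sql
-- Show the partition number computed for each row's datum value
SELECT id, datum, objekt,
       $PARTITION.TestPartitionRange(datum) AS partition_number
FROM dbo.tblTestPartition
ORDER BY datum;
```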

Notice that I also added data to the partition holding the most recent data (partition 4). So if we try to do the switch now, it will fail with this error:

Msg 4904, Level 16, State 1, Line 106
ALTER TABLE SWITCH statement failed. The specified partition 4 of target table 'testDB.dbo.tblTestPartition' must be empty.



So let’s remove that record:
DELETE FROM tblTestPartition WHERE datum = '2015-01-31 00:00:00.000'
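The staging table also needs a row in the 2015 range so that its partition 4 has something to switch; a minimal load could look like this (the id and values here are just examples):

```sql
-- Example load of the staging table: one row that lands in partition 4 (datum >= 2015-01-01)
INSERT INTO [dbo].[tblTestPartitionStaging] ([id],[datum],[objekt])
VALUES (2, '2015-01-31', 'Test');
```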

If we do a select against the destination table, we now see those four records.



And the staging table contains one row.


Now it’s time for the magic. We will move the data in partition 4 of tblTestPartitionStaging to the corresponding partition of the destination table, tblTestPartition, using the ALTER TABLE ... SWITCH command.

ALTER TABLE tblTestPartitionStaging SWITCH PARTITION 4 TO tblTestPartition PARTITION 4

Now the destination table contains the records below and the staging table has zero records.
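We can confirm the result by counting the rows per partition in both tables (a sketch using the sys.partitions catalog view):

```sql
-- Row counts per partition for both tables after the switch
SELECT OBJECT_NAME(p.object_id) AS table_name,
       p.partition_number,
       p.rows
FROM sys.partitions p
WHERE p.object_id IN (OBJECT_ID(N'dbo.tblTestPartition'),
                      OBJECT_ID(N'dbo.tblTestPartitionStaging'))
  AND p.index_id IN (0, 1)   -- heap or clustered index only
ORDER BY table_name, p.partition_number;
```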




Friday, May 29, 2015

Get all the databases and their sizes from a server


SELECT    DB_NAME(db.database_id) as DatabaseName,
    CAST((CAST(mfrows.RowSize as decimal(18,2)) * 8) /POWER(1024,2) AS DECIMAL(18,2)) as DataFileSizeGB,
    CAST((CAST(mfrows.RowSize as decimal(18,2)) * 8) /POWER(1024,1) AS DECIMAL(18,2)) as DataFileSizeMB,
    CAST((CAST(mflog.LogSize as decimal(18,2)) * 8) /POWER(1024,2) AS DECIMAL(18,2)) as LogFileSizeGB,
    CAST((CAST(mflog.LogSize as decimal(18,2)) * 8) /POWER(1024,1) AS DECIMAL(18,2)) as LogFileSizeMB
FROM sys.databases db
    LEFT JOIN (
SELECT database_id, SUM(size) RowSize FROM sys.master_files WHERE type = 0 GROUP BY database_id, type
  ) mfrows
  ON mfrows.database_id = db.database_id
    LEFT JOIN (
SELECT database_id, SUM(size) LogSize FROM sys.master_files WHERE type = 1 GROUP BY database_id, type
 ) mflog
 ON mflog.database_id = db.database_id
ORDER BY DB_NAME(db.database_id)
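The size column in sys.master_files is reported in 8 KB pages, which is why the query multiplies by 8 before dividing. A quick sanity check of the conversion:

```sql
-- 1280 pages * 8 KB = 10240 KB = 10 MB
SELECT CAST((1280 * 8) / POWER(1024, 1) AS decimal(18,2)) AS SizeMB;
```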

Friday, January 9, 2015

Filestats IO assessment



In this post I will show how to use sys.dm_io_virtual_file_stats to collect I/O statistics over a certain period of time. A lot of good information has been written about this DMV, but I have not seen an easy walkthrough of how to collect and analyse data from it over time.

I will show one way I often use. The post is written with inspiration from Paul Randal's excellent article, but I have made some changes to fit my needs.


First, create a table to log the information.


CREATE TABLE [dbo].[FileStatsIO](
[database_id] [smallint] NOT NULL,
[file_id] [smallint] NOT NULL,
[num_of_reads] [bigint] NOT NULL,
[io_stall_read_ms] [bigint] NOT NULL,
[num_of_writes] [bigint] NOT NULL,
[io_stall_write_ms] [bigint] NOT NULL,
[io_stall] [bigint] NOT NULL,
[num_of_bytes_read] [bigint] NOT NULL,
[num_of_bytes_written] [bigint] NOT NULL,
[file_handle] [varbinary](8) NOT NULL,
[timestamp] [datetime] NULL
) ON [PRIMARY]


Then create a job that runs once every hour. You can of course change the schedule to suit the timeframe you want to cover.
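Creating the hourly job can be scripted with the msdb job procedures. The sketch below is illustrative: the job name and database are assumptions, and @command should contain the full INSERT statement shown next.

```sql
USE msdb;
GO
-- Create a job with one T-SQL step and an hourly schedule
EXEC dbo.sp_add_job @job_name = N'Collect FileStatsIO';
EXEC dbo.sp_add_jobstep @job_name = N'Collect FileStatsIO',
     @step_name = N'Snapshot dm_io_virtual_file_stats',
     @subsystem = N'TSQL',
     @database_name = N'testDB',                      -- the database holding dbo.FileStatsIO
     @command = N'INSERT INTO dbo.FileStatsIO ... ;'; -- paste the full INSERT here
EXEC dbo.sp_add_jobschedule @job_name = N'Collect FileStatsIO',
     @name = N'Hourly',
     @freq_type = 4,             -- daily
     @freq_interval = 1,
     @freq_subday_type = 8,      -- repeat in units of hours
     @freq_subday_interval = 1;  -- every 1 hour
EXEC dbo.sp_add_jobserver @job_name = N'Collect FileStatsIO', @server_name = N'(local)';
```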

INSERT INTO [dbo].[FileStatsIO]
([database_id]
,[file_id]
,[num_of_reads]
,[io_stall_read_ms]
,[num_of_writes]
,[io_stall_write_ms]
,[io_stall]
,[num_of_bytes_read]
,[num_of_bytes_written]
,[file_handle]
,[Timestamp])
SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
[num_of_writes], [io_stall_write_ms], [io_stall],
[num_of_bytes_read], [num_of_bytes_written], [file_handle], GETDATE()
FROM sys.dm_io_virtual_file_stats (NULL, NULL);


When you have collected some data we come to the interesting part: let's analyse it.

;WITH CTE1 AS
(
SELECT 
a.[timestamp],a.[database_id], a.[file_id], a.[file_handle],
rank()  OVER (PARTITION BY [database_id], [file_id] ORDER BY  [timestamp] ASC) AS Rnk,
a.[num_of_reads] AS [a_num_of_reads],
CASE WHEN [num_of_reads] = 0 THEN 0 
ELSE LAG(num_of_reads, 1) OVER(PARTITION BY [database_id], [file_id]  ORDER BY  [timestamp] ASC) 
END AS [b_num_of_reads],
a.[num_of_bytes_read] AS [a_num_of_bytes_read],
CASE WHEN num_of_bytes_read = 0 THEN 0 
ELSE LAG(num_of_bytes_read, 1) OVER(PARTITION BY [database_id], [file_id]  ORDER BY  [timestamp] ASC) 
END AS [b_num_of_bytes_read],
a.[io_stall_read_ms] AS [a_io_stall_read_ms],
CASE WHEN [io_stall_read_ms] = 0 THEN 0 
ELSE LAG(io_stall_read_ms, 1) OVER(PARTITION BY [database_id], [file_id]  ORDER BY  [timestamp] ASC) 
END AS [b_io_stall_read_ms],
a.[num_of_writes] AS [a_num_of_writes],
CASE WHEN num_of_writes = 0 THEN 0 
ELSE LAG(num_of_writes, 1) OVER(PARTITION BY [database_id], [file_id]  ORDER BY  [timestamp] ASC) 
END AS [b_num_of_writes],
a.[io_stall_write_ms] AS [a_io_stall_write_ms],
CASE WHEN io_stall_write_ms = 0 THEN 0 
ELSE LAG(io_stall_write_ms, 1) OVER(PARTITION BY [database_id], [file_id]  ORDER BY  [timestamp] ASC) 
END AS [b_io_stall_write_ms],
a.[io_stall] AS [a_io_stall],
CASE WHEN io_stall = 0
THEN 0 ELSE LAG(io_stall, 1) OVER(PARTITION BY [database_id], [file_id] ORDER BY  [timestamp] ASC) 
END AS [b_io_stall],
a.[num_of_bytes_written] AS [a_num_of_bytes_written],
CASE WHEN num_of_bytes_written = 0 THEN 0 
ELSE LAG(num_of_bytes_written, 1) OVER(PARTITION BY [database_id], [file_id] ORDER BY  [timestamp] ASC) 
END AS [b_num_of_bytes_written]
FROM FileStatsIO a
--SET FILTER HERE IF YOU WANT--
where a.database_id = 7
)
SELECT database_id, file_id, [timestamp],
--Read stats--
(a_num_of_reads - b_num_of_reads) as reads,
[TotalByteRead] = ((ISNULL(a_num_of_bytes_read,0) - ISNULL(b_num_of_bytes_read,0))),
[AvgBytePerRead] = ((ISNULL(a_num_of_bytes_read,0) - ISNULL(b_num_of_bytes_read,0))) /
(CASE WHEN ISNULL((a_num_of_reads - b_num_of_reads),1) = 0 THEN 1 
ELSE ISNULL((a_num_of_reads - b_num_of_reads),1) END),
a_io_stall_read_ms,
b_io_stall_read_ms,
[ReadLatency(ms)] =
(ISNULL(a_io_stall_read_ms,0) - ISNULL(b_io_stall_read_ms,0)) /
(CASE WHEN ISNULL((a_num_of_reads - b_num_of_reads),1) = 0
THEN 1 ELSE ISNULL((a_num_of_reads - b_num_of_reads),1) END),
--Write stats--
(a_num_of_writes - b_num_of_writes) as Writes,
[TotalByteWrite] = ((ISNULL(a_num_of_bytes_written,0) - ISNULL(b_num_of_bytes_written,0))),
[AvgBytePerWrite] = ((ISNULL(a_num_of_bytes_written,0) - ISNULL(b_num_of_bytes_written,0))) /
(CASE WHEN ISNULL((a_num_of_writes - b_num_of_writes),1) = 0
THEN 1 ELSE ISNULL((a_num_of_writes - b_num_of_writes),1) END),
a_io_stall_write_ms,
b_io_stall_write_ms,
[WriteLatency(ms)] =
(ISNULL(a_io_stall_write_ms,0) - ISNULL(b_io_stall_write_ms,0)) /
(CASE WHEN ISNULL((a_num_of_writes - b_num_of_writes),1) = 0
THEN 1 ELSE ISNULL((a_num_of_writes - b_num_of_writes),1) END)
FROM CTE1
WHERE rnk > 1


Here are some results. Between the snapshots at 13:00 and 14:00, 184705024 bytes were written to data file number one in the database I am looking at. By comparing a_io_stall_write_ms with b_io_stall_write_ms and dividing the difference by the number of writes, we can also see the average write response time during the period. The same goes for reads, of course.
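As a worked example of that calculation: if a file's accumulated write stall went from 12000 ms to 14500 ms across 500 new writes during the hour, the average write latency in that period was 5 ms. The numbers here are made up for illustration:

```sql
-- (14500 - 12000) ms of new stall / 500 new writes = 5 ms average write latency
SELECT (14500 - 12000) / 500 AS AvgWriteLatencyMs;
```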