Friday, January 9, 2015

Filestats IO assessment



In this post I will show how to use sys.dm_io_virtual_file_stats to collect I/O statistics over a sertain time. There are many good information written about this DMV but I have not seen any easy one about how to collect and analyse data from the DMV over time.

I will show one way I often use. The post is written with inspiration from Paul Randals excelent article. but I have done some change to fit my needs.


First creat a table to logg the information.


CREATE TABLE [dbo].[FileStatsIO](
[database_id] [smallint] NOT NULL,
[file_id] [smallint] NOT NULL,
[num_of_reads] [bigint] NOT NULL,
[io_stall_read_ms] [bigint] NOT NULL,
[num_of_writes] [bigint] NOT NULL,
[io_stall_write_ms] [bigint] NOT NULL,
[io_stall] [bigint] NOT NULL,
[num_of_bytes_read] [bigint] NOT NULL,
[num_of_bytes_written] [bigint] NOT NULL,
[file_handle] [varbinary](8) NOT NULL,
[timestamp] [datetime] NULL
) ON [PRIMARY]


Then creat a job which run once every hour. If you want you can of course change the time to suit the timeframe you like to have.

INSERT INTO [dbo].[FileStatsIO]
([database_id]
,[file_id]
,[num_of_reads]
,[io_stall_read_ms]
,[num_of_writes]
,[io_stall_write_ms]
,[io_stall]
,[num_of_bytes_read]
,[num_of_bytes_written]
,[file_handle]
,[Timestamp])
SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
[num_of_writes], [io_stall_write_ms], [io_stall],
[num_of_bytes_read], [num_of_bytes_written], [file_handle], GETDATE()
FROM sys.dm_io_virtual_file_stats (NULL, NULL);


When you have some data we come to the intressting part. Lets analyse it.

;WITH CTE1 AS
(
SELECT 
a.[timestamp],a.[database_id], a.[file_id], a.[file_handle],
rank()  OVER (PARTITION BY [database_id], [file_id] ORDER BY  [timestamp] ASC) AS Rnk,
a.[num_of_reads] AS [a_num_of_reads],
CASE WHEN [num_of_reads] = 0 THEN 0 
ELSE LAG(num_of_reads, 1) OVER(PARTITION BY [database_id], [file_id]  ORDER BY  [timestamp] ASC) 
END AS [b_num_of_reads],
a.[num_of_bytes_read] AS [a_num_of_bytes_read],
CASE WHEN num_of_bytes_read = 0 THEN 0 
ELSE LAG(num_of_bytes_read, 1) OVER(PARTITION BY [database_id], [file_id]  ORDER BY  [timestamp] ASC) 
END AS [b_num_of_bytes_read],
a.[io_stall_read_ms] AS [a_io_stall_read_ms],
CASE WHEN [io_stall_read_ms] = 0 THEN 0 
ELSE LAG(io_stall_read_ms, 1) OVER(PARTITION BY [database_id], [file_id]  ORDER BY  [timestamp] ASC) 
END AS [b_io_stall_read_ms],
a.[num_of_writes] AS [a_num_of_writes],
CASE WHEN num_of_writes = 0 THEN 0 
ELSE LAG(num_of_writes, 1) OVER(PARTITION BY [database_id], [file_id]  ORDER BY  [timestamp] ASC) 
END AS [b_num_of_writes],
a.[io_stall_write_ms] AS [a_io_stall_write_ms],
CASE WHEN io_stall_write_ms = 0 THEN 0 
ELSE LAG(io_stall_write_ms, 1) OVER(PARTITION BY [database_id], [file_id]  ORDER BY  [timestamp] ASC) 
END AS [b_io_stall_write_ms],
a.[io_stall] AS [a_io_stall],
CASE WHEN io_stall = 0
THEN 0 ELSE LAG(io_stall, 1) OVER(PARTITION BY [database_id], [file_id] ORDER BY  [timestamp] ASC) 
END AS [b_io_stall],
a.[num_of_bytes_written] AS [a_num_of_bytes_written],
CASE WHEN num_of_bytes_written = 0 THEN 0 
ELSE LAG(num_of_bytes_written, 1) OVER(PARTITION BY [database_id], [file_id] ORDER BY  [timestamp] ASC) 
END AS [b_num_of_bytes_written]
FROM FileStatsIO a
--SET FILTER HERER  IF YOU WANT--
where a.database_id = 7
)
SELECT database_id, file_id, [timestamp],
--Read stats--
(a_num_of_reads - b_num_of_reads) as reads,
[TotalByteRead] = ((ISNULL(a_num_of_bytes_read,0) - ISNULL(b_num_of_bytes_read,0))),
[AvgBytePerRead] = ((ISNULL(a_num_of_bytes_read,0) - ISNULL(b_num_of_bytes_read,0))) /
(CASE WHEN ISNULL((a_num_of_reads - b_num_of_reads),1) = 0 THEN 1 
ELSE ISNULL((a_num_of_reads - b_num_of_reads),1) END),
a_io_stall_read_ms,
b_io_stall_read_ms,
[ReadLatency(ms)] =
(ISNULL(a_io_stall_read_ms,0) - ISNULL(b_io_stall_read_ms,0)) /
(CASE WHEN ISNULL((a_num_of_reads - b_num_of_reads),1) = 0
THEN 1 ELSE ISNULL((a_num_of_reads - b_num_of_reads),1) END),
--Write stats--
(a_num_of_writes - b_num_of_writes) as Writes,
[TotalByteWrite] = ((ISNULL(a_num_of_bytes_written,0) - ISNULL(b_num_of_bytes_written,0))),
[AvgBytePerWrite] = ((ISNULL(a_num_of_bytes_written,0) - ISNULL(b_num_of_bytes_written,0))) /
(CASE WHEN ISNULL((a_num_of_writes - b_num_of_writes),1) = 0
THEN 1 ELSE ISNULL((a_num_of_writes - b_num_of_writes),1) END),
a_io_stall_write_ms,
b_io_stall_write_ms,
[WriteLatency(ms)] =
(ISNULL(a_io_stall_write_ms,0) - ISNULL(b_io_stall_write_ms,0)) /
(CASE WHEN ISNULL((a_num_of_writes - b_num_of_writes),1) = 0
THEN 1 ELSE ISNULL((a_num_of_writes - b_num_of_writes),1) END)
FROM CTE1
WHERE rnk > 1


Here are some result. Between the snapshot at 13:00 and 14:00 there has been 184705024 bytes written to datafile nr one in the database im looking at. By compare the a_io_stall_write_ms and b_io_stall_write_ms and divide it by num_of_writes we also can se the average write responsetime during the period. Same goes with reads of course.