Friday, March 11, 2016

TempDB

One of the more important tasks to take care of when you settle into maintaining a new instance is the proper allocation of files to tempdb. How large to make them and the count of how many to create, really depends upon how your developers and their applications are using tempdb. I've seen various other posts that presume to have some sort of algorithm to help you determine this, but I've become somewhat wary of just relying on an outside wild arse guess as an authoritative one-shot setup.

You really face a handful of issues with tempdb. The first is it gets shared by all the processes touching all of the databases on any particular instance. This means even though you can access different files (even across different physical devices) to your databases to avoid contention, they will always unavoidably contend for tempdb. The second issue that pops up is tempdb only shrinks on a SQL service restart (or naturally on a server reboot). The third issue is various software packages (or developers) have their own way to use tempdb: OLTP is normally not too hectic on the thing, but some ETL processes and OLAP aggregation processes can by nasty. Then you always get the developer who joins ten tables with three of them doing full cluster index scans, and tempdb balloons to hold all the temporary data.

For say a terabyte of total databases on an instance, I like to initially set up three or four tempdb data files of ten to forty gigabytes size each. After the instance has been running for a few weeks see if you have any IO stalls on tempdb:
WITH IOFORDATABASE AS
(
SELECT
 DB_NAME(VFS.database_id) AS DatabaseName
,CASE WHEN smf.type = 1 THEN 'LOG_FILE' ELSE 'DATA_FILE' END AS DatabaseFile_Type
,SUM(VFS.num_of_bytes_written) AS IO_Write
,SUM(VFS.num_of_bytes_read) AS IO_Read
,SUM(VFS.num_of_bytes_read + VFS.num_of_bytes_written) AS Total_IO
,SUM(VFS.io_stall) AS IO_STALL
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS VFS
JOIN sys.master_files AS smf
 ON VFS.database_id = smf.database_id AND VFS.file_id = smf.file_id
GROUP BY DB_NAME(VFS.database_id), smf.type
)
SELECT
 ROW_NUMBER() OVER(ORDER BY io_stall DESC) AS RowNumber
,DatabaseName, DatabaseFile_Type
,CAST(1.0 * IO_Read/ (1024 * 1024) AS DECIMAL(12, 2)) AS IO_Read_MB
,CAST(1.0 * IO_Write/ (1024 * 1024) AS DECIMAL(12, 2)) AS IO_Write_MB
,CAST(1. * Total_IO / (1024 * 1024) AS DECIMAL(12, 2)) AS IO_TOTAL_MB
,CAST(IO_STALL / 1000. AS DECIMAL(12, 2)) AS IO_STALL_Seconds
,CAST(100. * IO_STALL / SUM(IO_STALL) OVER() AS DECIMAL(10, 2)) AS IO_STALL_Pct
FROM IOFORDATABASE
ORDER BY IO_STALL_Seconds DESC;
If you have stalls higher on tempdb than your other databases, you need to increase the quantity of files preallocated to it. Also check how large it has grown (easiest to see by just opening windows explorer and looking at the file sizes). If the files are a whole lot larger than what you initially allocated then you should adjust your initial allocations up to something like the midpoint of the difference.

Every few months you should go back and review your tempdb sizes and stalls -- developers will have installed new software or even placed further strain on them with additional user databases. Also use a monitoring tool like RedGate to determine which queries are writing heavily to tempdb: you may be surprised to find they may have been constructed incorrectly.