Friday, March 16, 2018

Physical Design

Likely everyone who makes it to the position of a database administrator has been through the ropes of serving as an elementary developer. By now you already know how to design tables, views, relationships between tables, stored procs, on and on. Early in your administrator career though you will run into the underworkings of the database engine, and recognize that it is time to face straight up and address the physical design of your databases. As much as poorly designed normalization can hose your system's queries, the underlying physical design can wreak just as much havoc.

Physical design covers several related aspects of adjustments to the databases, but deep down it's all about disk IO across the various files. Adding server memory nowadays is a fast and inexpensive affair, so the limiting operational performance in the host of most database systems is the IO throughput. Fast IO can be achieved numerous ways, but once you've established a baseline for the physical hardware and make sure you don't have any bottlenecks in disk controllers or bandwidth to your NAS, then what remains is spreading out the active database files across various drives and controllers to maximize the IO parallelism.

I use a couple different management queries to identify places where files should be split. These two queries below tell you where your usage is heaviest on your SQL instance:
 DB_NAME(VFS.database_id) AS DatabaseName
,CASE WHEN smf.type = 1 THEN 'LOG_FILE' ELSE 'DATA_FILE' END AS DatabaseFile_Type
,SUM(VFS.num_of_bytes_written) AS IO_Write, SUM(VFS.num_of_bytes_read) AS IO_Read
,SUM(VFS.num_of_bytes_read + VFS.num_of_bytes_written) AS Total_IO
,SUM(VFS.io_stall) AS IO_STALL
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS VFS
JOIN sys.master_files AS smf   ON VFS.database_id = smf.database_id
 AND VFS.file_id = smf.file_id
GROUP BY  DB_NAME(VFS.database_id), smf.type
,DatabaseName, DatabaseFile_Type
,CAST(1.0 * IO_Read/ (1024 * 1024) AS DECIMAL(12, 2)) AS IO_Read_MB
,CAST(1.0 * IO_Write/ (1024 * 1024) AS DECIMAL(12, 2)) AS IO_Write_MB
,CAST(1. * Total_IO / (1024 * 1024) AS DECIMAL(12, 2)) AS IO_TOTAL_MB
,CAST(IO_STALL / 1000. AS DECIMAL(12, 2)) AS IO_STALL_Seconds

select db_name(mf.database_id) as database_name, mf.physical_name,
left(mf.physical_name, 1) as drive_letter,
vfs.num_of_writes, vfs.num_of_bytes_written, vfs.io_stall_write_ms,
mf.type_desc, vfs.num_of_reads, vfs.num_of_bytes_read, vfs.io_stall_read_ms,
vfs.io_stall, vfs.size_on_disk_bytes
from sys.master_files mf
join sys.dm_io_virtual_file_stats(NULL, NULL) vfs
on mf.database_id=vfs.database_id and mf.file_id=vfs.file_id
order by vfs.num_of_bytes_written desc
The files that pop out from this with the highest stall percentages should likely be split and spread. Of course you need to pay special attention to tempDB as it gets shared across all of the applications running on each particular instance (I'll have more to say about allocating tempDB in a later post).

Once you've split up and created multiple files for your heavily used logs and databases, it's time to dig into some deeper level detail. This query below shows you table IO activity on a particular database; you should select your database connection (or add a USE statement) to see the details of one database at a time.
SELECT TableName = object_name(s.object_id),
       Reads = SUM(user_seeks + user_scans + user_lookups), Writes =  SUM(user_updates)
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i ON s.object_id = i.object_id
AND i.index_id = s.index_id
WHERE objectproperty(s.object_id,'IsUserTable') = 1
AND s.database_id = DB_ID()
GROUP BY object_name(s.object_id)
This will show you which specific tables are being heavily read and written to. Not infrequently I like to take the heavyweights and move them to their own filegroup.

You can do more IO tuning down at the query level, but that's a topic for another post.

Friday, February 16, 2018

The Best Tool

DBAs have all sorts of scripts and tools for managing SQL Server databases. In day to day operations you are down in the weeds with query plans and performance statistics and various monitors. You pop up a level and daily check the database backups and size of the log files. Every once in a while you lift on up to the "instance" level and check out the server logs or disk throughput.

Every other month or so I escape all the way out to the top and review once again what has turned out to be my best tool for managing the shop overall: an Excel spreadsheet. You see new databases seem to materialize on various instances all of the time, and developers even have a habit of spinning up new (poorly configured) instances all by themselves as well. So this spreadsheet helps me keep track of the whole mess and figure out where my DBA love needs to focus next.

Let me go into some detail about what the layout and each column signifies, and why they are important. Each instance has its own section; in the header adjacent to the instance name I show the SQL version, the processor count and speed, and the memory available on the server. Next to that is its backup schedule with an indication of frequency and level of manual intervention. This row of information comes in handy both when the CIO wants to suggest a different configuration on the VM boxes, and when normal backups run out of space and I need to review when they /should/ have happened.

Next comes a subheader row with column labels, and then a row (or more) for each database on that instance. The column next to the database name has the size of the data storage in KB. This is useful as every few months when I return to update the spreadsheet, I can readily tell which databases are most actively growing. Next comes the recovery model, Simple or Full (more about this in this previous post).

Next the location of the mdf database file; if the database has secondary filegroups then I show the location of the secondary files below this (with as many rows as necessary to cover all the secondaries). Then the location of the log file and the location of the backups. Note that to make the document more legible I abbreviate the often long path references with a double-letter pseudo disk designation, that is described in a legend at the bottom of each section.

Listing the file locations serves a couple of useful purposes. When you are first assembling this spreadsheet the discovery and documentation process helps you avoid the high risk of having all the files on a single physical drive (a notoriously dangerous arrangement). Later, when reviewing disk latency, you may find the spreadsheet useful in avoiding disk bottlenecks. Finally in the dreaded clutches of disaster recovery, you will know where your lifesaving logs and backups are located.

Next comes a column I have labeled Write Coupling. This and the next column, Software Volatility, relate to how you design your recovery models. For each database I provide a soft subjective rating from very low to very high of how tightly the dB writes are coupled to intricate processes, and how much the underlying update software is changing (more on this in the later post about recovery models).

Next follows a column labeled SME, for subject matter expert. Typically one of the lead developers at your company knows the most about this particular database, and if you need to migrate tables or take the database offline for maintenance then this is the person with whom you will get clearance. When you are first cobbling together this spreadsheet it will be smart to hold a few meetings with the developers to nail this down.

Finally a column for Notes that mostly alerts me to tasks and technical debt that needs to be worked down, or any special circumstances that excuse a database from normal maintenance processes.

You can find all sorts of costly tools to assist you in managing an array of databases, but far and away the most useful and cost effective is this simple spreadsheet.

Tuesday, January 16, 2018

Table Cleanup

For large databases actively under development, it's not unusual to accrue a fairly large collection of useless and abandoned tables. Here's a query you can run to find them... in this example my cutoff is a year.  The polite thing to do is notify the staff of your intention to archive the large tables (if they so indicate) before you delete them. Script out the definitions of all of those you plan to delete anyhow, just in case somebody suddenly misses them.
SET @DAYSunTouched = 365

;WITH UnUsedTables (TableName , TotalRowCount, CreatedDate , LastModifiedDate )
AS (
  SELECT AS TableName
     ,PS.row_count AS TotalRowCount
     ,DBTable.create_date AS CreatedDate
     ,DBTable.modify_date AS LastModifiedDate
  FROM sys.all_objects  DBTable
     JOIN sys.dm_db_partition_stats PS ON OBJECT_NAME(PS.object_id)
  WHERE DBTable.type ='U'
     AND DBTable.modify_date <= (SELECT sqlserver_start_time FROM sys.dm_os_sys_info)
     and DBTable.modify_date < (getdate() - @DAYSunTouched)
              FROM sys.dm_db_index_usage_stats
              WHERE OBJECT_ID = DBTable.object_id )
SELECT DISTINCT TableName , TotalRowCount, CreatedDate , LastModifiedDate
FROM UnUsedTables
ORDER BY TotalRowCount desc

Thursday, December 14, 2017

Protecting Key Tables

In most production systems you've got a half to a couple dozen tables that contain configuration or system-wide critical information. Things like State abbreviation tables, printer routing tables, user rights tables, hierarchies, workflow parameters, flag mapping key-value pair descriptions, et cetera. Generally you want to lock these tables down pretty tightly so that you don't allow somebody to accidentally update or delete a row, without some kind of more general awareness of what is happening. The way to do this is with a Database Role. At first this concept may seem confusing so I'll try to tease out the particulars in a way that slowly reveals its sense.

You already are familiar with the Instance level Server Roles, things like sysadmin and public. When you click on the Security folder off of the instance level, and then the folder under that called Server Roles, you can see all of the Instance level roles.

You also have seen the Database level Roles, for example after you add a new user login and then set up their access to the databases. You are familiar with db_datareader, db_datawriter, db_ddladmin, db_owner, et cetera. If you want to see these roles from the top down (rather than from the Logins mapping level) then click on a database, expand the Security folder under that database, and then click on the Roles folder, and then double-click the Database Roles folder.

This is where you add a Role to protect the tables within a database. Right click and select New database Role.  Give it a name (I like to use the name KeyTablesReadOnly). In a moment we'll revisit the remainder of this setup, but first a side digression. The point I'm making here though is that data protection roles belong in the database level of Roles, not in the Instance level.

How do you identify what tables to protect? A good clue on a fairly mature production database is to sort the tables in the Object Explorer Details pane by the column titled Data Space Used. Those that are small in size (and non-zero) likely contain the key configuration and mapping information. If you're using something like "Toad" you can select and paste these table names into a spreadsheet (which comes in handy shortly).

You probably will need to confirm these table names with the Development Manger so send that spreadsheet off for approval before proceeding. When you get confirmation, go back and double-click on that KeyTablesReadOnly database Role. The properties pane has two areas: "General" is for assigning the role to specific users, and "Securables" is for setting the protection level on objects within the database. On the tables that you want to protect, check the Deny column next to Delete, Insert, and Update, and then go ahead and add all the users (or you may leave a couple of users with full rights if they specifically are responsible for maintaining these tables).

If somebody needs to do an update, then the fast way to adjust them is to go to the Instance security folder, click on Logins, then their User mapping, highlight the database affected, and then remove them from the KeyTablesReadOnly role.

Tuesday, November 14, 2017

Memory Used

In an earlier post I mentioned some general guidelines for optimizing each instance of the SQL engine. Once you've got the memory set up it is quite interesting however to observe how the engine is actually using it. Run these two queries; they both show pretty much the same thing but you may find one more useful than the other.
(CASE WHEN ([is_modified] = 1) THEN 'Dirty' ELSE 'Clean' END) AS 'Page State',
(CASE WHEN ([database_id] = 32767) THEN 'Resource Database' ELSE DB_NAME (database_id) END) AS 'Database Name',
COUNT (*) AS 'Page Count'
FROM sys.dm_os_buffer_descriptors
GROUP BY [database_id], [is_modified]
ORDER BY [Page Count], [is_modified];

SELECT count(*)AS cached_pages_count, (count(*) * 8)/1024 As Mbytes,  db_name(database_id)
FROM sys.dm_os_buffer_descriptors
GROUP BY db_name(database_id) ,database_id
ORDER BY cached_pages_count DESC;
The amount of memory used naturally varies depending upon which queries are running at any particular time, but by monitoring this when you are under "memory pressure" you can easily identify the culprit database. This can then guide you whether it makes sense to reallocate that database to a different SQL instance, or perhaps allocate a deeper inspection to what queries are running against it.

To that end, you can run this script:
SELECT        session_id, text, requested_memory_kb,
    granted_memory_kb, used_memory_kb
FROM            sys.dm_exec_query_memory_grants MG
CROSS APPLY sys.dm_exec_sql_text(sql_handle) 

that will show you the /current/ allocation for any running queries. If you need greater detail down at the table index level, you can run this script:
       count(*)as cached_pages_count, as objectname, as indexname, obj.index_id as indexid
from sys.dm_os_buffer_descriptors as bd
    inner join
SELECT     p.object_id AS objectid, object_name(p.object_id) AS name, p.index_id, au.allocation_unit_id
FROM       sys.allocation_units au INNER JOIN
sys.partitions p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3)      
        union all
SELECT     p.object_id AS objectid, object_name(p.object_id) AS name, p.index_id, au.allocation_unit_id
FROM       sys.allocation_units au INNER JOIN
sys.partitions p ON au.container_id = p.partition_id AND au.type = 2
    ) as obj
on bd.allocation_unit_id = obj.allocation_unit_id
left outer join sys.indexes ind on  obj.objectid = ind.object_id and  obj.index_id = ind.index_id
where bd.database_id = db_id() and bd.page_type in ('data_page', 'index_page')
group by,, obj.index_id
order by cached_pages_count desc