Tuesday 20 November 2012

When database code just doesn't smell right

Have you ever had a feeling that although your (or your developers') code works fine and passes all the unit / functional tests, perhaps it isn't all that it might be? It may be that you have inherited historic code, the requirements have developed organically, or perhaps that it was simply written in a hurry as a proof of concept that has stayed in production far longer than you would like to admit.

There are best practice tools such as SQL Cop to help identify this in databases, but perhaps you want something a little different. There can be a requirement to do this not only on a compiled database, but also on a series of script files. Perhaps these differentiate your database for different scenarios, or perhaps they're utility scripts you simply want to have a heads up on any hidden nasties that they contain.

This was something that had obviously struck Dave Ballantyne, and so he's created TSQLSmells, an opensource powershell script which analyses SQL script files for common "smells". Whilst it's a work in progress at the moment, Dave is actively developing it, welcoming and responding to feedback.

I've had a working Adventureworks database in a continuous integration job for some time, so this seemed the perfect test implementation to apply to TSQLSmells.



Firstly - the requirements. TSQLSmells needs powershell version 3, so you may need to update your copy to use it. This is a free download from Microsoft, so whilst I'd recommend trying it on a testbed first, it shouldn't cause any issues.

Next, let's look at how to run the script. The command line I use is as follows:

powershell -ExecutionPolicy Bypass -File "c:\cifiles\TSQLSmells.ps1" -FolderPath "%teamcity.build.checkoutDir%\Database" -ConfigFolder "C:\CIfiles" -minSeverity  10  -SupressGUI

Now, you will see I'm calling powershell bypassing the execution policy. You could update your build agent to set this appropriately, or even sign the script if you wanted. The File parameter points to the location of the powershell script. FolderPath is used to denote where the scripts under test are. I'm pointing it at a directory where my source controlled database is located. You can use a FileSpec parameter to denote when your files have a different extension to .sql (the default).

TSQLSmells denotes a "severity" for each error it searches for. These are configurable within the errors.txt file (which is a simple CSV format), and this allows you to exclude certain types of error if you do not want these to be reported upon. This is where the MinSeverity parameter comes in, as this is the threshold for reporting errors - simply set it to be not lower than the value you wish to exclude. You can also exclude specific error numbers with the ExcludeErrors parameter.

The output to the script is put in the config folder, and called Smells.xml. Whilst this isn't ideal from a CI point of view, it's a relatively simple matter to update the output location in the script so that the file is output within the CI job folder. I store the file as an artifact on each build, so that  it can be referred to, and stored with test results.



Although the Smells.xml file is not in a format which the CI system directly understands, it is an easily understandable format and would not be difficult to associate with an XSD file for display. (I am exploring with Dave at the moment whether the output could be altered for better integration with CI systems in the future.)




If you don't want to run the script as part of a CI process, it does have a GUI element to it - you'll note I used the SupressGUI parameter above, if you exclude this parameter then the GUI will be run to return the results to you.

The choice to use powershell rather than SQL to analyse the database scripts has both pros and cons, the trade-off being that whilst you can examine specific scripts outside the database, you don't have the easy extensibility for those not familiar with powershell. This may include many database devs/DBAs, although powershell is becoming more widely adopted. I feel that this is a distinguishing feature which gives TSQLSmells an edge - it allows scripts to be examined outside of a database, and to allow them to be examined before they are put into the final environment. Although I am in this example using a source controlled database, there is no reason why the database need be involved at all. This gives an ability for a sanity check to be light-weight - and therefore run ad-hoc more frequently.

I think that this script shows great promise and will for me form another method of screening code for potential issues. Whilst it's not a cast-iron guarantee of a problem, it's a useful way of highlighting potential problems, and gives a list of files to check in more detail.

No comments:

Post a Comment