Sunday 30 December 2007

New SQL Customer Advisory Team blog

Well more of a fully-blown website than a blog, and their blog wasn't bad (check my links on the left-hand side). Definitely worth keeping tabs on. Thanks to the ever-useful Euan Garden's blog for this tidbit.

Tuesday 18 December 2007

Retain Nulls

This slowed me down for a few minutes today - so might be useful (well fairly anyway).

Importing a text file using SSIS, and writing to a table in a Data Flow Task. Problem was that empty date fields in the text source were causing '1753-01-01 00:00:00.000' to be written to the table - we obviously wanted NULLs. The solution was to right-click on the Flat File Source -> click 'Show Advanced Editor...' click on the 'Component Properties' tab and then set the 'RetainNulls' property to 'True'. Obvious really but I'll add a picture just in case you didn't follow:


The code for doing this programmatically you say? Why here it is:

Dim flatFileSource As IDTSComponentMetaData90 = flow.ComponentMetaDataCollection.New()
Dim srcInstance As CManagedComponentWrapper = flatFileSource.Instantiate()

srcInstance.SetComponentProperty("RetainNulls", True)

Bob's yer Uncle - NULLs instead of 1753.

Don't forget that the Flat File Task and associated Flat File Data Source component need to have date fields configured to the DT_DBTIMESTAMP data type.

Visual Studio 2008 reviewed

Interesting review (well overview anyway) of the new version of Visual Studio on the Reg here. Towards the end of the aticle it mentions that the 'Database version' is integrated into the main edition. This is in fact 'Data Dude' (or 'DB Pro' as apparently it should be known). It strikes me that this is reason enough to get 2008 on its own. The pain of installing another sku over the top of Visual Studio and SQL Server has caused me considerable effing and blinding. Now its just a question of ticking a box on the install.

Monday 10 December 2007

Regular expressions again

Right. Now I understand regular expressions a little better here's an update to the previous post. Rather than use the rather clunky .NET string manipulation of the previous post this is a little neater:

Dim dateRegEx As New RegularExpressions.Regex("\d{4}[-]\d{2}[-]\d{2}[-]\d{2}.\d{2}.\d{2}.\d{6}")

If dateRegEx.IsMatch(StringToFix) Then

outString = dateRegEx.Replace(StringToFix, _
"(?\d{4})-(?\d{2})-(?\d{2})-(?\d{2}).(?\d{2}).(?\d{2}).(?\d{3})(?\d{3})", _
"${year}-${month}-${day} ${hour}:${min}:${sec}.${milli}")

End If

This does everything the previous code fragment does but with less than half the code. The regular expressions work as follows:

\d - in this example has to be used with a repetition count (the number in the {}-brackets). Essentially this matches the number of digits in the curly brackets. This was probably where I was going wrong in the previous post. So that tidies up the match.

The Replace function has quite a large number of overloads but the one being used is here.

the ?<year> notation in the second replace parameter identifies a regular expression group (in this case 'year'), essentially splitting up the string being evaluated, for manipluation in the third parameter.

The ${} notation is explained here, and in this example I'm just using it to concatenate the various bits of string groups created by using ?<group name> notation together.

Thursday 6 December 2007

Regular expressions are your friend

Well apparently there are people reading this - 23 so far if Google Analytics is to be believed. Anyway on with the post...

We're importing some frankly bizarre DB2 flat files using SSIS, one of the features of which is a date format yyyy-mm-dd-hh.mm.ss.mmmmm which SQL obviously won't accept as a datetime format (they're also fixed width and delimited - work that one out). For this and other reasons the easiest thng to do is to parse the file in a Script Task. The RegularExpressions .NET class is dead handy for this (Useful resource here ).

Add a script task do the necessary getting of the file using System.IO class and split the line using the Split method (note that the seperator is a Char data type not string), we need to reference the right RegularExpression class:

Imports System.Text.RegularExpressions

Then create an immutable regular expression:

Dim dateRegEx As New Regex("[0-9][0-9][0-9][0-9][-][0-9][0-9][-][0-9][0-9][-][0-9][0-9][.][0-9][0-9][.][0-9][0-9][.][0-9][0-9][0-9][0-9][0-9][0-9]")

Then match the string you want to fix with the regular expression and do your reformatting work:

If dateRegEx.IsMatch(StringToFix) Then
revString = StrReverse(StringToFix)
outString = Replace(Replace(revString, "-", " ", , 1), ".", ":")
outString = StrReverse(Replace(outString, ":", ".", , 1)).Substring(0, 23)
End If

Not the cleverest regular expression for date-matching but sufficient for our purposes. Incidentally I found that the SSIS version of .NET didn't seem to like using the /d notation instead of [0-9]. Anyway doing this without RegEx would be a real pain.

There is actually a regular expression task at www.sqlis.com but we wanted to parse the file prior to loading into SSIS. Script Task and RegEx to the rescue...

Monday 3 December 2007

Simple custom logging for SSIS

A neat trick found here for an easy way of doing custom logging to SQL Server. Essentially you can hack around with the stored procedure that SSIS calls to log events (sp_dts_addlogentry).

An obvious usage would be to simplify the extremely verbose logging messages provided by SSIS by filtering the majority into a seperate table and 'cherry-picking' the really useful ones and INSERTing them into a 'core' logging table. I'll leave it up to the reader (if there are any!) to work out the 'usefulness' algorithm for now, although I may post on this again if I end up implementing it.

Custom roll-up performance problems

A colleague of mine came across an issue when working with custom roll-ups in SSAS 2005, so I thought I'd post it here for future reference. Apparently since SP2 performance is significantly degraded (in our case by about 10 times) when a custom member formula is used to calculate a measure. MS are quoted here as suggesting that changing a the Cache Policy setting in the msmdsrv.ini to 9 and restarting SSAS solves the problem. This worked for us. Problem is that its a server-wide setting so lets hope it won't cause a problem somewhere else in the future.

Why bother using custom rollups? Well they're particularly useful when creating accounting reports because the aggregations sometimes depend on values of other members, rather than on directly related measures.

Custom roll-up performance problems

Sam - a colleague of mine - came across an issue when working with custom roll-ups in SSAS 2005, so I thought I'd post it here for future reference. Apparently since SP2 performance is significantly degraded (in our case by about 10 times) when a custom member formula is used to calculate a measure. MS are quoted
here
as suggesting that changing a the Cache Policy setting in the msmdsrv.ini to 9 and restarting SSAS solves the problem. This worked for us. Problem is that its a server-wide setting so lets hope it won't cause a problem somewhere else in the future.

Why bother using custom rollups? Well they're particularly useful when creating accounting reports because the aggregations sometimes depend on values of other members, rather than on directly related measures.