Friday, April 11, 2008

Verifying Server Connection Prior to Perform a Task

I recently ran across with the following request:


Create an SSIS package that would grab login related information from each server on our environment, this information would be used by our internal auditors. As you can imagine, this would be impossible to do manually, so I thought about using SSIS to do that.

Someone in my team had previously created an SSIS package that was doing part of what was needed, using a server list (from a table) and a for each loop container. For each server on the list, it would perform a series of tasks that would update a table that was going to be used as report for the auditors, which is not important for this exercise.



When I took the package that my teammate created and ran for the first few times, it kept failing pretty much at the same point, then I went to check the server list to see if there were any servers that I could at least tell that didn't exist, or was decommissioned, or was down for whatever reason and that was true, the list was outdated and some servers on the list we have are no longer online, or were decommissioned or were down for whatever reason, that's when I realized that I had to check for the connectivity to the server prior to perform my task.

What was needed?
A way to check for server.database connectivity prior to perform my audit tasks, otherwise the package would fail and I don't the package to fail in case a server is down, unreachable or I get a time out. In case of a connectivity failure I want the SSIS package to log that failure activity to a table and go to the next server in the list, that's pretty much the only catch, the package can’t failed when a server is down.

My approach:
First, I took a piece of code from Jamie's blog (Link) and modified a little bit for my needs, Jamie's code isn't exactly what I needed but it gave a lot to start with. Keep it in mind that I'm not a .NET guy, so you have any suggestions as far as my code looks like feel free to post.


So, basically, inside of the For Each Loop Container I already had, I added the script as the first component before the other tasks (see code below in blue) and created 3 Package variables:

I created 3 package variables and all 3 of them with a package scope.
User::Server/String
User::Database/String
User::ConnFailed/Boolean, default value of False

The Server variable will hold the server name that I have stored on a table, the same goes for the Database variable, the ConnFailed variable is the variable the script will write the connection result to and this will be used to decide what next step to take. If the connection fails, the script will then set ConnFailed=True.

I tested my code with a test table, I created a table with good/real server names (servers that I knew were up and running) and some bogus ones to see how the script would react when a server was down or unreachable and the variable was being updated successfully. Great, I have almost everything I need to get this whole thing done.

Now I have a way to check if a server is up and running, but wait, I also need a way to check what servers were down/timed out/unreacheble after the package runs, so I can later manually check what servers were down and find out why and troubleshoot any issues from that point on. I can't just assume a server was down and leave that outside my request. Remember, this is an audit request, all the servers must be reported.

To do that, I created a table called SQLConnResults, this table is going to store Server, Database, ConnFailed and ConnDate and of course I'm going to be using the variables that I created previously as insert values to this table, so basically the hard work has been done already, all I need to do is insert the value to the table.

CREATE TABLE [dbo].[SQLConnResults](
[Server] [varchar](50) NULL,
[Database] [varchar](50) NULL,
[ConnFailed] [bit] NULL,
[ConnDate] [smalldatetime])


So, I added an Execute SQL Task right after the script, mapped the package variables into my task (User::Server, User::Database, User::ConnFailed on this order) and created a simple insert statement like the one below.

insert into SQLConnResults
values (?,?,?,getdate())

Great, a tested my package and so far it's working, I have a list of servers, I test connectivity to the each server and my package is logging this activity (Success/Failure) to a table, if a server is down the package is not failing it just goes to the next server in the list.
Notice on the output window on the picture below that it fails to connect to a server called FTLDIGOR01 (bogus server), but the package doesn’t fail.



Cool, now that I have almost everything I need, now I just need a way to have my next tasks, the actual audit tasks to check somewhere when a server is up (OK, run this task) or when a server is down (Don’t run, do something else), and to do that, I used Precedence Constraints which is really simple to use.

After the Execute SQL Task, I added 2 Precedence Constraints, 1 would go to the audit task and the other one would go to another script task that would reset the ConnFailed variable to False.

For the one that's going to the audit task, I set the Evaluation Operation to "Expression" and the Expression to: @ConnFailed==False. For the other one that was going to the reset variable script task, I set the Evaluation Operation to "Expression" and the Expression to: @ConnFailed==True.

This means that, if the ConnFailed variable was set to False, meaning, the first script was able to connect to the give server.database, then go to your next step (audit tasks...), if not, which would mean that the first script wasn't able to the connect to the given server.database, then just reset the variable to false and give me the next server on the list.

Observations:
One thing to keep in mind is that you have to reset the ConnFailed variables as the last step on your For Each Loop, otherwise they can get set to “True” once and you’re going to have wrong connectivity results on your SQLConnResults table and you’re going to mess up your whole process.

The script that I have on this post, will create custom messages on the output window of Visual Studio, so basically it would still tell you what servers it failed to connect to (server name and the whole connection string) so you can troubleshoot connectivity from there.

Hope this helps someone out there.

Regards,
Igor Santos