Scripted Restart of a Hanging Windows Service

I’ve been having trouble lately with Adobe ColdFusion 9, in particular an ODBC connection to an Oracle 10g database. The ColdFusion ODBC Server (the swsoc.exe process) is hanging under load: it not only fails to return some queries, but also permanently hangs one of its threads. Once all the threads are hung, the service doesn’t respond at all, and ColdFusion hangs as well.

Temporary solution (until I rewrite the app to use SQL Server) is to restart the ODBC Server service. I’ve been doing that manually when my monitor informs me the site is not responding, maybe 3-4 times per day. But I want to see how it performs if I proactively restart the service every hour. For that, we require some scripting.

The biggest issue is that Windows’ service start/stop/restart interfaces, whether the traditional “net stop” or the PowerShell “Stop-Service” command, have a very long timeout for stopping a service (it looks to be 60 seconds). In the manual case, if the service doesn’t stop right away, I kill the swsoc.exe process and the service restart continues very quickly. But how to script this?

The trick is to put the restart request in a background job (idea found here), and check on it in the foreground to see if it was successful. After waiting a shorter period (10 seconds), if the service is still in the “stopping” state, we can kill the process outright and let the restart proceed. Then double-check (after 5 more seconds) that the service has started; if it is stopped (the restart failed), start it again explicitly.

# run the restart in a background job so the long stop timeout does not block this script
Start-Job -ScriptBlock { Restart-Service -Name "ColdFusion 9 ODBC Server" -Force }

# give it 10 seconds to stop
Start-Sleep -Seconds 10

# a service that is still stopping reports the status StopPending
$SERVICESTATE = (Get-Service -Name "ColdFusion 9 ODBC Server").Status
if ($SERVICESTATE -eq "StopPending")
{
    # still stopping, so kill the process and let the restart continue
    Stop-Process -Name "swsoc" -Force
}

# give it 5 seconds to start before we try it again
Start-Sleep -Seconds 5

$SERVICESTATE = (Get-Service -Name "ColdFusion 9 ODBC Server").Status
if ($SERVICESTATE -eq "Stopped")
{
    # the restart failed, so start the service explicitly (Start-Service has no -Force parameter)
    Start-Service -Name "ColdFusion 9 ODBC Server"
}

Save it as a .ps1 file. Make sure PowerShell allows execution of local scripts (in PowerShell, run “set-executionpolicy remotesigned”).
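
A possible variation on the fixed 10-second sleep: keep the job object and let Wait-Job do the waiting, so the script moves on as soon as the restart finishes. This is only a sketch and I haven't run it against the hanging service. It assumes that a blocked restart keeps the background job running, and the 15-second second wait is an arbitrary number I picked for illustration.

```powershell
$serviceName = "ColdFusion 9 ODBC Server"

# Same background restart as in the script, but keep a reference to the job.
$job = Start-Job -ScriptBlock {
    param($name)
    Restart-Service -Name $name -Force
} -ArgumentList $serviceName

# Wait-Job returns the job if it finishes within the timeout, and nothing otherwise.
$finished = Wait-Job -Job $job -Timeout 10
if (-not $finished) {
    # The restart is still blocked after 10 seconds: kill the hung process.
    Stop-Process -Name "swsoc" -Force
    # Give the unblocked restart some more time (15 seconds is an assumption).
    Wait-Job -Job $job -Timeout 15 | Out-Null
}

# Discard the job (stopping it if it is somehow still running).
Remove-Job -Job $job -Force

# If the restart failed and left the service stopped, start it explicitly.
if ((Get-Service -Name $serviceName).Status -eq "Stopped") {
    Start-Service -Name $serviceName
}
```

When the service stops cleanly, this version finishes in a second or two instead of always waiting the full 15 seconds.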

To schedule this to run, create a new scheduled task:

  • Triggered daily at a particular time, repeat every hour for 1 day.
  • Action is to run a program:
    “C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe”
  • Arguments contain the script name: “-File c:\Path\To\Script.ps1”
  • Set to run as Administrator, with highest permissions
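
The task could also be created from an elevated PowerShell prompt with the built-in schtasks.exe tool. A rough sketch, with some assumptions that are not in the steps above: the task name is made up, the script path is a placeholder, the schedule is simplified to a plain hourly trigger, and the task runs as SYSTEM rather than Administrator so that no password has to be stored.

```powershell
# Hypothetical task name and placeholder script path; change both to suit.
$taskName   = "Restart CF ODBC Server"
$scriptPath = "C:\Path\To\Script.ps1"   # assumed to contain no spaces

# Command line the task will run.
$taskRun = "powershell.exe -NoProfile -File $scriptPath"

# /SC HOURLY /MO 1 : trigger every hour
# /RU SYSTEM       : run as the local SYSTEM account (an assumption, see above)
# /RL HIGHEST      : run with highest privileges
# /F               : overwrite a task of the same name if it already exists
schtasks.exe /Create /TN $taskName /TR $taskRun /SC HOURLY /MO 1 /RU SYSTEM /RL HIGHEST /F
```

Afterwards, "schtasks /Query /TN" followed by the task name shows whether the task was registered.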

Dirty Solutions to Tricky Problems

So a client has an Exchange 2003 server that routinely gums up after BackupExec does its work. This thing has defied all manner of troubleshooting, with regard to antivirus, disk location, server utilization, etc., so the only remaining solution is to restart the information store service every morning. (Yes, I know, we really should figure out what the problem is.)

Instead of making the IT person get up every morning at 7 am to do it, how ’bout a little scripting magic? Windows is no UNIX, but we can try.

First, some useful commands to stop and start a service:

net stop MSExchangeIS /y
net start MSExchangeIS /y

Works peachy if the service is actually responding. When it’s stuck, it doesn’t stop on request. You have to kill store.exe in Task Manager. But how do you script that? With PsTools, silly!

So in between that stop and start request, we add:

pskill store /accepteula

Make sure pskill is somewhere in the path of the user executing the batch file. The /accepteula switch prevents it from sticking at the EULA prompt, which pops up on first use (and perhaps again later). Since this is automated, you’d never know that it stuck, just that your information store never restarted.

Important here, by the way, is to try to stop the service before you kill it. That way if the thing is responding, we don’t send it the shock of a rough shutdown. pskill will fail gracefully if store.exe has already exited.

Put these bad boys in a batch file and run it after the backup completes. Presto change-o, an information store that is ready for the business day.
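
Assembled, the sequence might look like the sketch below. The three commands are the ones shown above. The comments use PowerShell syntax on the assumption that PowerShell is installed on the server; in a .bat file the same three commands work as-is, with rem in place of #. The exit-code check at the end is an addition for illustration and is not part of the original batch file.

```powershell
# 1. Ask the information store to stop cleanly
#    (/y answers yes to any confirmation prompt).
net stop MSExchangeIS /y

# 2. If store.exe is still around, kill it. When the process has already
#    exited, pskill just reports that and carries on.
pskill store /accepteula

# 3. Bring the information store back up.
net start MSExchangeIS /y

# net.exe returns a non-zero exit code when the start fails.
if ($LASTEXITCODE -ne 0) {
    Write-Warning "MSExchangeIS did not start (net start exit code $LASTEXITCODE)"
}
```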

A side note: It seems that running Exchange on a Domain Controller is a bad idea. But this is Windows Small Business Server, so that’s exactly what we have. One major problem is that shutting the system down takes a full 30 minutes, because Windows kills Active Directory before Exchange and it sits spinning its wheels not knowing AD will never respond. Possible solution (not tested yet) is to script an Exchange shutdown by group policy before Windows itself starts shutdown. This one is for implementation another day…