Debugging startup code of services and COM servers

Updated: 01.08.2006

Introduction
Image File Execution Options (IFEO)
DebugBreak
Tracing
Remote debugging with WinDbg and NTSD
StartDbg
What about COM servers?
Startup timeouts

Introduction

How do we usually debug a service? In most cases, we use the Services applet to start the service process, and then attach the Visual Studio or WinDbg debugger to it. We can suspend the service, check call stacks, set breakpoints, and then let the service continue. The breakpoints will be hit, and we will be able to debug the service just like any other application.

But what if we need to debug the code that the service executes while it is starting up? For example, the service’s main() function, or the constructors of global C++ objects? Attaching a debugger to the service takes time, and by the time it has finally attached, the startup code has most likely already run. So it is too late.

Is it possible to attach a debugger to the service in some other way, so that we can debug the startup code? Yes, it is. In this article we will explore how to debug the startup code of a service, and we will also briefly look at a related problem: debugging the startup code of out-of-process COM servers.

Image File Execution Options (IFEO)

Is it possible to start the service under a debugger, just like a “normal” application? If we could, we would be able to debug the startup code of the service. At first glance it looks impossible, because services are started by a dedicated application, the Service Control Manager (SCM). Fortunately, the operating system provides a way to intercept attempts to launch any Win32 application and start another application instead. This feature can be used to start the service under a debugger.

In the Registry, there is a special key called Image File Execution Options (IFEO for short). The full name of this key is HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options. If we want to start our service (or any other application) under a debugger, we should create a subkey of this key, named after our service’s executable (e.g. if our service’s executable is myservice.exe, the subkey should also be named “myservice.exe”). Under this subkey, we should create an entry called “Debugger” and use it to specify the path to our debugger’s main executable file. After the path, we can provide the command line parameters that the debugger needs in order to start our application.

For example, if we want to start our service under the Visual Studio 2003 debugger, the Registry setup could look like this:

HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\myservice.exe
  Debugger = "c:\progs\msvs\common7\ide\devenv.exe /debugexe" (REG_SZ)

(Note that on some operating system versions there can be a limit on the length of this Registry setting, so it is recommended to keep the path to the debugger's executable as short as possible. In some cases, the 'subst' command can be used as a workaround to shorten the path.)
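As an illustration, the setup above can also be written as a .reg file and imported with regedit from an administrative account. The following is a minimal sketch of a helper script that produces such a file; the function names (reg_escape, ifeo_reg_text) are hypothetical and exist only for this example, and the debugger path is the one from the example above.

```python
# Sketch: build .reg file text for an IFEO "Debugger" entry.
# The helper names are made up for this example.

IFEO_KEY = (r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT"
            r"\CurrentVersion\Image File Execution Options")


def reg_escape(value: str) -> str:
    """Escape backslashes and double quotes for a quoted .reg string value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def ifeo_reg_text(image_name: str, debugger_command: str) -> str:
    """Return .reg text that sets "Debugger" for the given executable name."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{IFEO_KEY}\\{image_name}]",
        f'"Debugger"="{reg_escape(debugger_command)}"',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Same values as in the Visual Studio 2003 example above.
    print(ifeo_reg_text("myservice.exe",
                        r"c:\progs\msvs\common7\ide\devenv.exe /debugexe"))
```

Deleting the "Debugger" entry (or the whole myservice.exe subkey) afterwards restores the normal startup of the service.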

The next time we attempt to start our service, the operating system will start the executable specified in the "Debugger" entry instead, passing it all the command line parameters specified in the same entry, plus the path to the executable file of our service application. This means that if we configure IFEO as shown above, and our service's executable file is c:\myapps\myservice.exe, the operating system will launch the VS2003 debugger (devenv.exe) with the following parameters:

/debugexe c:\myapps\myservice.exe
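The substitution can be pictured as simple string concatenation. This tiny sketch only models the behavior described above (the helper name effective_command_line is hypothetical), assuming the original command line is appended unchanged after the contents of the "Debugger" entry:

```python
def effective_command_line(debugger_value: str, original_command_line: str) -> str:
    """Model of IFEO redirection: the "Debugger" value, then the original command line."""
    return f"{debugger_value} {original_command_line}"


if __name__ == "__main__":
    print(effective_command_line(
        r"c:\progs\msvs\common7\ide\devenv.exe /debugexe",
        r"c:\myapps\myservice.exe"))
```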

After the debugger has started, we can set breakpoints in the necessary places, and then press F5 or F10 to finally start our service executable (the debugger knows which executable to start, because the path to the executable has been passed to the debugger on the command line). Now we can debug any part of our service's code, including the startup.

If you use the Visual Studio 2005 debugger, the “Debugger” setting should be set to “vsjitdebugger.exe”, without a path. If you use the Visual Studio 6 debugger, it should be set to “<path_to_msdev_exe>\msdev.exe”.

The Visual Studio 2005 debugger works a bit differently from the older debuggers. When vsjitdebugger.exe is launched via IFEO, it gives the user the option to debug the service process with an existing instance of Visual Studio or to start a new one. After the user has made the choice, the debugger is attached, but the service process is not suspended and continues running. Therefore we have no opportunity to set breakpoints and then resume the process. If we want to debug the startup code of the service, the only acceptable option is to select an existing instance of Visual Studio, with all the necessary source files already open and all breakpoints already set.

Problem solved? Unfortunately, no. It turns out that we can use this method only to debug services that are allowed to “interact with desktop”, that is, only services that run under the LocalSystem account, and only if the "Allow service to interact with desktop" option is checked in the service's properties.

(Figure: Service properties)

If our service is registered to run under a user account other than LocalSystem, the IFEO approach cannot be used to launch an interactive debugger like Visual Studio. The debugger will start, but we will not be able to see it, because it will be connected to a non-interactive desktop. Therefore we have to look elsewhere if we need to debug the startup code of a service running under a non-LocalSystem account.

DebugBreak

For some services, our old friend DebugBreak can offer a good solution. Just insert a call to DebugBreak (or __debugbreak, or simply __asm int 3) at the beginning of the service's main function. When the service starts, DebugBreak will be called, and it will ask the system to attach the debugger to our service. Or will it?

The problem with the DebugBreak approach is that it relies heavily on the operating system's just-in-time (JIT) debugger feature. When DebugBreak is called, it raises an EXCEPTION_BREAKPOINT exception. When raised in the service's main function, the exception goes unhandled, and thus reaches the system-provided trap for unhandled exceptions, the kernel32!UnhandledExceptionFilter function (here you can find more details about it).

If a JIT debugger is registered in the system, UnhandledExceptionFilter will start it and attach it to our service. The debugger can then dismiss the exception and let the service continue. Therefore, if we want to use the Visual Studio or WinDbg debugger to debug our service, we should register it as the just-in-time debugger. Here is how to do it:

Visual Studio 2003 and 2005:
  Tools | Options | Debugging | Just-In-Time | check "Native"
Visual Studio 6.0:
  Tools | Options | Debug | check "Just-in-time debugging"
WinDbg:
  run "windbg -I" on the command line

Why did I say that DebugBreak's reliance on the JIT debugger is a problem? Because the just-in-time debugging feature in Windows is designed so that JIT debuggers can operate reliably only if they are started by an application running under an administrative user account. If our service is running under a non-administrative user account (e.g. LocalService or NetworkService), DebugBreak will not be able to launch the JIT debugger for it. As a result, the DebugBreak-based approach can be used only with services running under LocalSystem or another user account with administrative rights.

The reason why JIT debuggers cannot work under non-administrative user accounts is that the UnhandledExceptionFilter function forces the JIT debugger to attach to the WinSta0\Default desktop. Non-administrative user accounts do not have enough rights to access this desktop, and, as a result, the JIT debugger usually fails in the early startup phases (during initialization of user32.dll, or console creation). The NTSD debugger with the -noio option is an exception to this rule, as described here.

Tracing

So we need to debug the startup code of a service running under a non-administrative user account (e.g. LocalService or NetworkService), but the IFEO and DebugBreak approaches cannot help us. What can we do? The simplest answer is, of course, tracing. If the service's startup code is peppered with trace statements, we can send the tracing output to a file, or capture it with DebugView, and debug that way. I like tracing, but in some cases it cannot replace live debugging. Let's look for better ways.

Remote debugging with WinDbg and NTSD

You probably know that the NTSD and WinDbg debuggers (both part of Debugging Tools for Windows) form a powerful combination for remote debugging scenarios. You can start NTSD on a remote system, and then connect to it from WinDbg, using sockets or pipes. What is interesting is that the same configuration can be used on a single machine, when we need to debug processes running in non-interactive sessions, such as services. Image File Execution Options can help us here again. Let's assume that the Debugging Tools package is installed on our system in the c:\dbgtools directory. Now we can use IFEO to configure our service to start under the NTSD debugger:

HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\myservice.exe
  Debugger = "c:\dbgtools\ntsd.exe -server tcp:port=5001:6000" (REG_SZ)

Note the “-server tcp:port=5001:6000” option: it asks the NTSD debugger to select any free port in the specified range (5001..6000 in this case) and start listening for a socket connection.

When we try to start our service, NTSD is launched instead. Now we can use WinDbg to connect to NTSD using sockets and debug:

>windbg.exe -remote tcp:server=localhost,port=5001:6000
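Because the server-side and client-side command lines have to agree on the transport settings, it can be convenient to generate both from one place. The sketch below does that with two hypothetical helpers (ntsd_debugger_value and windbg_client_command). As a simplifying assumption it uses a single fixed port instead of a port range, so the client does not need to find out which port was actually chosen.

```python
# Sketch: keep the NTSD server options and the WinDbg client options in sync.
# Helper names are made up for this example; a single fixed port is assumed.

def ntsd_debugger_value(ntsd_path: str, port: int) -> str:
    """IFEO "Debugger" value that starts NTSD as a TCP debugging server."""
    return f"{ntsd_path} -server tcp:port={port}"


def windbg_client_command(port: int, server: str = "localhost") -> str:
    """Command line that connects WinDbg to the NTSD debugging server."""
    return f"windbg.exe -remote tcp:server={server},port={port}"


if __name__ == "__main__":
    port = 5001
    print(ntsd_debugger_value(r"c:\dbgtools\ntsd.exe", port))
    print(windbg_client_command(port))
```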

After WinDbg has connected to NTSD, we can set breakpoints wherever necessary, and let the service process continue. Now we can debug the startup code of our service even if it is running under a non-administrative user account.

So have we finally found the ultimate solution? Almost. I see only two problems with this approach:

  • We have to use WinDbg and cannot use Visual Studio debugger
  • After we have started our service under NTSD, we cannot opt out of debugging; we have to connect WinDbg to NTSD and manually let our service run, even if this time we do not want to debug it

StartDbg

Being a long-time fan of the Visual Studio debugger (mostly for its productive and logical user interface), I wanted to use it to debug services of any kind, including those that run under non-administrative user accounts. Therefore, I wrote a simple tool, StartDbg, which allows me to do it.

The idea behind StartDbg is simple: we start it instead of our target service, and it starts our service in a suspended state, so that we have enough time to attach the Visual Studio debugger before our service's startup code starts executing. Assume that our service's ImagePath entry in the Registry currently contains "c:\myapps\myservice.exe". To use StartDbg, we change it to "c:\startdbg\startdbg.exe c:\myapps\myservice.exe". As a result, when we start our service (for example, using the Services applet), startdbg.exe will be started instead, and will receive "c:\myapps\myservice.exe" as a command line parameter.

Here is the list of actions performed by StartDbg:

  • Start the service using the CreateProcess function, in a suspended state (using the CREATE_SUSPENDED flag).
  • Inject a small piece of code into the service process's address space.
  • Hijack the entry point of the service's main executable and modify it to point to the injected piece of code.
  • Resume the service (it will pass control to the injected piece of code).
  • The injected code will suspend the service again, this time by calling the Sleep function with a predefined timeout. While the service is suspended, we can attach the debugger to the service's process and set breakpoints.
  • After the timeout has expired, the injected code returns from the sleeping state and lets the service continue normally by passing control to the original entry point.

(The real implementation is a bit more complicated than that, to provide better status reporting.)

Here you can find more information about StartDbg, including detailed usage instructions, as well as its source code.

StartDbg is not without limitations, though: it does not let us debug the entry points of the DLLs loaded by the service at startup (we have to fall back to the NTSD/WinDbg combination if we need to do that).

What about COM servers?

While I have been concentrating on services so far, exactly the same can be said about out-of-process COM servers. If we need to debug the startup code of an out-of-process COM server, we can use the same range of techniques (IFEO, DebugBreak, NTSD/WinDbg and StartDbg).

Startup timeouts

If you have already tried one of the approaches described in this article (especially IFEO and DebugBreak), you might have noticed a strange effect: if you do not let the service process continue within 30 seconds of the beginning of its startup, the service process is terminated. Why is that? Because by default the Service Control Manager (SCM) gives a service only 30 seconds to start up and call the StartServiceCtrlDispatcher function. If the service does not call StartServiceCtrlDispatcher within 30 seconds, SCM assumes that the service process is hung, and terminates it. If we need to debug complicated startup code (e.g. if we are working on a framework for rapid development of services, or trying to find a bug in the code called by the constructor of a global object), we might need more than 30 seconds for that.

Fortunately, this timeout can be changed in the Registry:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control
  ServicesPipeTimeout = 1200000 (REG_DWORD, in milliseconds)

Setting this value to, for example, 1200000 will give us 20 minutes of debugging time. A system restart is required for the new value to take effect.

Another possible source of unwanted timeouts when debugging the startup code of a service is the Services applet. There is a simple solution: use the "net start" command to start the service.
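Since the value is a REG_DWORD in milliseconds, it is easy to get the conversion wrong when writing it in a .reg file, where DWORD values are given in hexadecimal. The following sketch (the helper names minutes_to_ms and services_pipe_timeout_reg_text are hypothetical) converts a timeout in minutes and produces the corresponding .reg text.

```python
# Sketch: produce .reg text for ServicesPipeTimeout from a value in minutes.
# Helper names are made up for this example.

CONTROL_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"


def minutes_to_ms(minutes: float) -> int:
    """Convert minutes to whole milliseconds."""
    return round(minutes * 60 * 1000)


def services_pipe_timeout_reg_text(minutes: float) -> str:
    """Return .reg text that sets ServicesPipeTimeout (REG_DWORD, milliseconds)."""
    ms = minutes_to_ms(minutes)
    if not 0 < ms <= 0xFFFFFFFF:
        raise ValueError("timeout does not fit into a REG_DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{CONTROL_KEY}]",
        f'"ServicesPipeTimeout"=dword:{ms:08x}',  # .reg DWORDs are hexadecimal
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(services_pipe_timeout_reg_text(20))  # 20 minutes = 1200000 ms
```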

A similar problem exists for out-of-process COM servers. After the COM server has been launched, it has only 2 minutes to call the CoRegisterClassObject function to report that it has started successfully. If CoRegisterClassObject has not been called within 2 minutes, the server's process will be terminated. Unfortunately, there is no Registry setting that could be used to change this timeout. But with the help of some reverse engineering, it is not difficult to determine that the timeout's value is stored in a variable called rpcss!gServerStartTimeout in the COM Service Control Manager's process (it is one of several processes called "svchost", the one that is started with the "-k rpcss" command line parameters).

After we have determined which process contains the COM SCM (you can use e.g. Process Explorer to find the svchost process started with the "-k rpcss" parameters), we can use the following CDB command to modify the value of this variable:

>cdb -pv -p <pid> -c "ed rpcss!gServerStartTimeout 0n180000;q"

(if necessary, more information about using CDB can be found in this article)

Contact

Have questions or comments? Feel free to contact Oleg Starodumov at firstname@debuginfo.com.