The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

debugging: stepping thru code

How does "stepping through code" work in IDEs? How do they implement this feature?
Tuesday, March 13, 2007
IDEs typically run command-line tools on your behalf and present the results in one or more IDE windows.  So the short answer is: "the same way you do."
JRobert
Tuesday, March 13, 2007
When you compile in "Debug Mode", code is inserted (and I believe a file is generated) that identifies, for each source statement, what machine code was generated.

Then, when you click 'step' on a source statement, the next 'clump' of code is executed until the point marking the end of that statement is reached. Execution stops there and control is returned to the debugger. The display of any variables that were modified is also updated.

Why do you ask?
Tuesday, March 13, 2007
There is a map in the EXE file:
Line Number -> Offset in EXE image

Once the EXE is loaded, the offsets are adjusted to match
the 'real' memory locations. When you step through code, you
usually step over one line or a few lines. Using that map, the
debugger knows the address at which to stop again and show the
variables, memory dumps, etc.
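
A minimal sketch of that map, assuming a made-up line table and load base (LINE_TO_OFFSET, relocate and next_stop_address are hypothetical names, not taken from any real debugger). It ignores branches and simply picks the next line in source order that has code:

```python
# Hypothetical line table: source line number -> offset within the EXE image.
LINE_TO_OFFSET = {10: 0x1000, 11: 0x1008, 12: 0x1015, 14: 0x1020}


def relocate(line_to_offset, load_base):
    """Turn image offsets into real addresses once the load base is known."""
    return {line: load_base + off for line, off in line_to_offset.items()}


def next_stop_address(line_to_addr, current_line):
    """Address of the first line after current_line that has code, or None."""
    later = [line for line in line_to_addr if line > current_line]
    return line_to_addr[min(later)] if later else None


# Assumed load base; a real loader decides this at run time.
table = relocate(LINE_TO_OFFSET, load_base=0x400000)
# Line 13 has no code of its own, so stepping from line 12 stops at line 14.
print(hex(next_stop_address(table, 12)))  # prints 0x401020
```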
asmguru62
Tuesday, March 13, 2007
> Why do you ask?

I just want to understand how the internals of a debugger work. I should perhaps take a look at gdb's source code.
Tuesday, March 13, 2007
Another technique is to implement a C interpreter in the debugger. Stepping through the code is then very easy, but the behavior doesn't necessarily match what the compiler will produce.
Often used on embedded platforms especially where you are cross compiling.
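
As a rough illustration of why stepping is easy in an interpreter, here is a toy sketch that uses Python statements in place of C (a real debugger of this kind would need a full C interpreter). The interpret function and the (line, source) statement format are assumptions made for this example:

```python
def interpret(statements, variables):
    """Run (line, source) statements one at a time, pausing before each one."""
    for line, source in statements:
        yield line                    # hand control back to the "debugger"
        exec(source, {}, variables)   # then execute just this statement


program = [(1, "x = 1"), (2, "y = x + 1")]
variables = {}
stepper = interpret(program, variables)

print(next(stepper), variables)  # paused before line 1, nothing executed yet
print(next(stepper), variables)  # line 1 has run, paused before line 2
```

Each call to next() is one "step": the interpreter already works statement by statement, so no breakpoints or line tables are needed.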
Martin
Tuesday, March 13, 2007
At least on Unix-like systems there is some assistance from the OS for debugging: there are OS APIs that allow one process to control execution of another process, set breakpoints and monitor the CPU and memory state. The task of the debugger then becomes one of translating the CPU and memory state into a position in the source code and values for named variables. These same APIs are used by profilers to measure how much time is spent in different sections of your program.

The translation from machine state to source code state is achieved with some help from the compiler, which can be told to generate a mapping from memory offsets to source code lines and symbol names. This can be generated either as part of the executable or as a separate symbol file, depending on the platform. Most platforms today support debug symbols as part of the executable.
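
A small sketch of that translation, assuming the debug information has already been parsed into plain Python structures. LINE_TABLE, SYMBOLS and the dict used as "memory" are all invented for illustration; real formats are far richer:

```python
import bisect

# Hypothetical debug info: (start address, source line), sorted by address.
LINE_TABLE = [(0x401000, 10), (0x401008, 11), (0x401015, 12), (0x401020, 14)]
# Hypothetical symbol table: variable name -> address of its storage.
SYMBOLS = {"ThisVar": 0x404000}

START_ADDRS = [addr for addr, _ in LINE_TABLE]


def line_for_pc(pc):
    """Source line whose code contains pc, or None if pc is below the table."""
    i = bisect.bisect_right(START_ADDRS, pc) - 1
    return LINE_TABLE[i][1] if i >= 0 else None


def read_variable(memory, name):
    """Look up a named variable in a dict standing in for debuggee memory."""
    return memory[SYMBOLS[name]]


memory = {0x404000: 1}
print(line_for_pc(0x40100A), read_variable(memory, "ThisVar"))
```

The binary search works because each line's code starts where the previous line's code ends, so the last start address at or below the program counter identifies the line.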

There was some reasonable coverage of the OS API for process debugging and monitoring in "The Design and Implementation of the 4.3 BSD UNIX Operating System" by Leffler, McKusick, Karels and Quarterman. There is also a newer edition of the same book for 4.4 BSD that covers this material in Chapter 4, Process Management.
Jeff Dutky
Tuesday, March 13, 2007
> How do they implement this feature?

I think they set a temporary "breakpoint" on the next statement and then run (which hits the breakpoint, which breaks back into the debugger, which clears the breakpoint).

Alternatively, some CPUs may support a single-step "trap flag" (in the CPU flags register), which breaks (signals a trap) after executing one instruction.
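
The trap-flag approach can be sketched with a toy machine in which every "instruction" is tagged with its source line. PROGRAM, single_step and step_line are hypothetical names, and the lambdas merely stand in for machine instructions:

```python
# Hypothetical toy program: (source line, operation on the variables).
PROGRAM = [
    (1, lambda v: v.update(a=1)),
    (1, lambda v: v.update(b=2)),
    (2, lambda v: v.update(c=v["a"] + v["b"])),
    (3, lambda v: v.update(c=v["c"] * 10)),
]


def single_step(pc, variables):
    """Execute exactly one instruction, as a CPU does with the trap flag set."""
    PROGRAM[pc][1](variables)
    return pc + 1


def step_line(pc, variables):
    """Single-step repeatedly until the program counter is on another line."""
    start_line = PROGRAM[pc][0]
    while pc < len(PROGRAM) and PROGRAM[pc][0] == start_line:
        pc = single_step(pc, variables)
    return pc


variables = {}
pc = step_line(0, variables)  # runs both instructions belonging to line 1
print(pc, variables)
```

The debugger regains control after every instruction here, which is simple but slower than planting one breakpoint and letting the program run.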
Christopher Wells
Tuesday, March 13, 2007
Suppose you have two lines of code,

1, ThisVar = 1;
2, print(ThisVar);

The debugger knows how much machine code each line takes. So when you step over line 1, the debugger puts an "int 3" at the very start of line 2 and then resumes the program, which stops at line 2.

This is only one method. Depending on the program type (machine code or byte code) and what information the compiler generates, other methods can be used.
Koms Bomb
Tuesday, March 13, 2007
> I just want to understand how the internals of a debugger work. I should perhaps take a look at gdb's source code.

So maybe you should learn some assembly language.
Koms Bomb
Tuesday, March 13, 2007
Holy cow, a whole discussion on debuggers with no words on injecting software interrupts.  It must be magic.
Tuesday, March 13, 2007
Well, the Wikipedia entry recommends the book: "How Debuggers Work: Algorithms, Data Structures, and Architecture", which is $23 on Amazon. Or, you can dig through the related purchases until you find a text that you'd enjoy reading.
Thursday, March 15, 2007
Hey old.fart, I was just thinking the same thing.

coder, all the pieces are there in the previous responses, more or less. Here's how "step" works in a typical debugger.

When you compile in Debug mode, the compiler generates a table containing the addresses of the machine instructions that represent the start of each line of code.

In order to "step", the program you're debugging needs to have "stopped" already, usually due to hitting a breakpoint or being halted by the debugger.

When you hit the "step" button, the debugger sets a breakpoint at the address which is the beginning of the "next line", and then lets the program being debugged start running again.

When the breakpoint is hit, the debugger automatically removes it, and you're ready to start the process over again.
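
The set, run, remove cycle described above can be sketched as follows. This is a toy model only: LINE_START and PROGRAM_END are invented, and run_until_breakpoint is a stand-in for resuming the real program (it advances a counter rather than executing anything):

```python
# Hypothetical line table: source line -> address of its first instruction.
LINE_START = {1: 0, 2: 2, 3: 3}
PROGRAM_END = 4  # the toy program has instructions at addresses 0..3


def run_until_breakpoint(pc, breakpoints):
    """Stand-in for resuming the debuggee until a breakpoint address is hit."""
    pc += 1  # always execute at least the current instruction
    while pc < PROGRAM_END and pc not in breakpoints:
        pc += 1
    return pc


def step(pc, current_line, breakpoints):
    """Step one line by planting a temporary breakpoint on the next line."""
    next_line = current_line + 1
    if next_line not in LINE_START:
        return run_until_breakpoint(pc, breakpoints), None
    temp = LINE_START[next_line]
    added = temp not in breakpoints   # don't clobber a user breakpoint
    breakpoints.add(temp)
    pc = run_until_breakpoint(pc, breakpoints)
    if added:
        breakpoints.discard(temp)     # the temporary breakpoint is removed
    return pc, next_line


breakpoints = set()
pc, line = step(0, 1, breakpoints)
print(pc, line, breakpoints)
```

Tracking whether the temporary breakpoint was newly added keeps a breakpoint the user set on the same line from being deleted by the step.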

There are some complications. If the instruction stream for the "current line" contains a branch instruction of some sort, then a breakpoint also has to be set at the destination of that branch. This is also where the difference between "step in" and "step over" comes into play. If there's a subroutine call in the "next line", "step in" will set a breakpoint at the start of the called function instead of at the "next line" in the current function.
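
One way to picture those complications is a helper that collects every address where a step has to plant a temporary breakpoint. This is a simplified sketch: the (kind, target) instruction format is invented, it looks at the instructions of the line being stepped, and for "step in" it adds the callee's entry point alongside the next line rather than replacing it:

```python
def temporary_breakpoints(next_line_addr, instructions, step_in):
    """Addresses where a step must plant temporary breakpoints.

    instructions is a hypothetical decoded view of the line being stepped:
    a list of (kind, target) pairs, kind being "plain", "branch" or "call".
    """
    stops = {next_line_addr}
    for kind, target in instructions:
        if kind == "branch":
            stops.add(target)   # the branch may jump past the next line
        elif kind == "call" and step_in:
            stops.add(target)   # stop at the start of the called function
    return stops


line_code = [("plain", None), ("branch", 0x40), ("call", 0x100)]
print(sorted(temporary_breakpoints(0x20, line_code, step_in=False)))
print(sorted(temporary_breakpoints(0x20, line_code, step_in=True)))
```

With "step over" the call target is ignored, so the callee runs to completion before the program stops again.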

As for how "setting a breakpoint" works, usually the debugger simply replaces the instruction at the target address with an instruction that causes the debugger to become active again. When the debugger becomes active, it puts the original instruction back.

Typically, a debugger will use a "software interrupt" instruction for this purpose. Many architectures define an opcode just for this; on x86 processors, it is int 3. On other architectures, an invalid instruction is used, or a simple jump instruction that leads back into the debugger's code.
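
The replace-and-restore trick can be sketched on a bytearray standing in for the debugged program's code memory. The Breakpoints class is a hypothetical helper; the one real detail is that the single-byte encoding of int 3 on x86 is 0xCC:

```python
INT3 = 0xCC  # one-byte x86 breakpoint opcode


class Breakpoints:
    def __init__(self, code):
        self.code = code   # bytearray standing in for the debuggee's code
        self.saved = {}    # address -> original byte that int 3 replaced

    def set(self, addr):
        """Overwrite the byte at addr with int 3, remembering the original."""
        if addr not in self.saved:
            self.saved[addr] = self.code[addr]
            self.code[addr] = INT3

    def clear(self, addr):
        """Put the original byte back so the instruction can run normally."""
        self.code[addr] = self.saved.pop(addr)


code = bytearray(b"\x55\x89\xe5\x90")
bp = Breakpoints(code)
bp.set(1)
print(code.hex())  # the byte at offset 1 is now cc
bp.clear(1)
print(code.hex())  # original bytes restored
```

A real debugger would do the same reads and writes through an OS debugging API rather than on a local buffer.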

Some processors have registers that can be used to implement breakpoints without having to manipulate the instruction stream, or a special mode that causes a software interrupt to fire automatically after every instruction is executed.

In a multitasking environment, you can't often make use of these facilities, though. Having a breakpoint register still active after a task switch occurs is a recipe for chaos, so most OSes just disallow access to those registers from user programs.
Mark Bessey
Friday, March 16, 2007
Thank you all!
Friday, March 16, 2007

This topic is archived. No further replies will be accepted.
