========================================================================
Copyright (c) 2000 Jurriaan Kalkman
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1
or any later version published by the Free Software Foundation;
with no Invariant Sections, with no Front-Cover Texts, and with no
Back-Cover Texts. A copy of the license is available on the Internet
at http://www.gnu.org or is available on demand from the author.
========================================================================
So you've made this great rogue-like, and it seems to crash now and
again. Perhaps you're not as great a coder as you thought :-). Or,
you develop it under GNU/Linux, and most other people run it under
Windows, and it crashes.
If it crashes and you have the debugger GDB handy, you can type 'bt'
to get a nice stack-trace, so you can see what routines called the
one that caused the crash. This is often very useful information.
Consider the situation where you have a routine that changes some part
of the dungeon (lighting a square for example) and you find out that once
in a while it is called with an x-coordinate of 0, and this causes a
crash. It would be very useful to know from where this routine is called,
particularly if it is called from 60 places or so and you cannot check
them all.
There is a solution for that, if you use gcc. It works with any compiler
that is gcc-based, but requires signals to be available. These
environments include any Unix system I know of, DOS (DJGPP) and win32
(cygwin32). OS/2 has a gcc compiler, but I've been told it doesn't
support signals.
1 Introduction
2 Signal handlers
3 What is on the stack
4 When to stop
5 Coding it
5.1 The windows 'get-an-address-off-the-stack macro'
5.2 The GNU/Linux 'get-an-address-off-the-stack macro'
5.2 The decoding routine for the addresses on the stack
5.3 The signal handler
5.4 Compilation
6 Sample output
1 Introduction
--------------
Gcc has certain builtin functions known as __builtin_frame_address
and __builtin_return_address.
These functions allow you to determine which functions were calling
the crashing function.
All examples and files are taken from Angband/64, which can be found
at http://www.xs4all.nl/~thunder7. Added to this article are bits
and pieces of files in the source-archive. If you want to know more,
read the whole source. Compile it, experiment with it, then go code
your own.
The beauty of this solution is that you can let your program read its
own executable-file, and use the debug-information that is in there to
display what was going on at the moment of the crash. This is in contrast
with my earlier attempt at this, where you needed an extra file,
generated at compile time, to translate addresses into function names.
2 Signal handlers
-----------------
First of all, you need a signal-handler for signals like
SIGFPE (floating point error)
SIGINT (^c or something like it)
SIGSEGV (pointer gone wild)
etc. Note that SIGKILL cannot be caught, so installing a handler for it
has no effect.
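As a minimal sketch of what installing such a handler looks like (the
names note_signal, install_crash_handlers and demo_crash_handlers are
made up for this example and do not come from Angband/64), the following
installs one handler for a few signals and then raises one on purpose:

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler. volatile sig_atomic_t is the one object type
   the C standard guarantees can be written safely from a handler. */
static volatile sig_atomic_t last_signal = 0;

/* Hypothetical handler: a real one would dump the stack and exit. */
static void note_signal(int sig)
{
    last_signal = sig;
}

/* Install note_signal for the signals this platform defines.
   Returns the number of handlers that were installed. */
static int install_crash_handlers(void)
{
    int installed = 0;
#ifdef SIGFPE
    if (signal(SIGFPE, note_signal) != SIG_ERR) installed++;
#endif
#ifdef SIGSEGV
    if (signal(SIGSEGV, note_signal) != SIG_ERR) installed++;
#endif
#ifdef SIGILL
    if (signal(SIGILL, note_signal) != SIG_ERR) installed++;
#endif
    return installed;
}

/* Self-check: raise a signal on purpose and see the handler run. */
static void demo_crash_handlers(void)
{
    install_crash_handlers();
    raise(SIGFPE);
    assert(last_signal == SIGFPE);
}
```

The #ifdef guards are there because not every platform defines every
signal. Returning from the handler is only reasonable here because the
signal was raised on purpose; after a real crash the handler should end
the program.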
3 What is on the stack
----------------------
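Then you need to find out which return addresses are on the stack.
This looks simple. Called from within the routine that dumps the stack:
__builtin_return_address(0) is the address in function C which called it
__builtin_return_address(1) is the address in function B which called C
__builtin_return_address(2) is the address in function A which called B
The level must be a literal constant, so you cannot loop over it with a
variable. These functions return pointers (void *), which are 32 bits wide
on most systems (except if you're on Alpha :-) ).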
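To see the two builtins at work in isolation, here is a small sketch.
The function names are invented for the example, and only level 0 is
used, because that level is always safe to ask for:

```c
#include <assert.h>
#include <stddef.h>

/* noinline keeps gcc from merging this function into its caller, which
   would make the frame we want to look at disappear. */
__attribute__((noinline)) static void record_caller(void **slot)
{
    /* Level 0: the address this function returns to, which lies inside
       whoever called it. The level has to be a literal constant. */
    *slot = __builtin_return_address(0);
}

/* Level 0 of the other builtin: the frame of the current function. */
__attribute__((noinline)) static void record_frame(void **slot)
{
    *slot = __builtin_frame_address(0);
}

static void demo_return_address(void)
{
    void *first = NULL, *second = NULL, *frame = NULL;

    record_caller(&first);    /* call site 1 */
    record_caller(&second);   /* call site 2 */
    record_frame(&frame);

    /* Two call sites, so two different return addresses. */
    assert(first != NULL && second != NULL);
    assert(first != second);
    assert(frame != NULL);
}
```

The two recorded addresses differ because each one points just behind
its own call instruction. That is exactly the information that is later
translated into a source file and line number.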
4 When to stop
--------------
Now the problem is when to stop, or: how deep is the stack?
In GNU/Linux and DOS, this is simple: as soon as __builtin_frame_address
returns 0 for a level, the end is reached.
In win32 (or at least the cygwin32 cross-compiler I use here), this is
more of a problem: I've found that there is no 0 at the end, it simply
crashes. So I've used the second builtin function there, called
__builtin_frame_address, and with trial and error found out that it seems
to work quite well if you only follow __builtin_return_address as long as
the upper 2 bytes of __builtin_frame_address for that level match the
upper 2 bytes of __builtin_frame_address(0). In other words, you stop as
soon as the frame pointer leaves the 64 KB region you started in. Now I'm
no expert, and I cannot explain why this is so. It works for me, YMMV.
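The upper-2-bytes comparison can be sketched as two small helpers. The
names upper_half and same_stack_region are hypothetical, and the
addresses in the self-check are made-up 32-bit values, not output from a
real run:

```c
#include <assert.h>

/* Upper 16 bits of a 32-bit address. */
static unsigned long upper_half(unsigned long addr)
{
    return (addr & 0xffff0000UL) >> 16;
}

/* Nonzero when both addresses lie in the same 64 KB region. */
static int same_stack_region(unsigned long frame, unsigned long base)
{
    return upper_half(frame) == upper_half(base);
}

/* Self-check with invented addresses. */
static void demo_same_region(void)
{
    /* Both start with 0x0253: keep following the stack. */
    assert(same_stack_region(0x0253fe10UL, 0x0253f9a0UL));
    /* Different upper half: time to stop. */
    assert(!same_stack_region(0x0253fe10UL, 0x77f1a000UL));
}
```

This is a heuristic, not a rule: a stack that grows across a 64 KB
boundary would end the trace early.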
5 Coding it
-----------
Grabbing these addresses should not be done with subroutines, because
that would introduce another frame on the stack :-). So there are some
huge macro-definitions needed.
Then we borrow (a lot of) code from the addr2line program in the GNU
binutils suite. The addr2line program prints out the name of the source-
file, the exact linenumber and the name of the function, when you supply
the name of the executable and the address. Both are known, so we are
in business.
I suggest first reading the source for the addr2line program. It's only
about 350 lines, and is really easy to understand. Then you'll be able to
see that all I added in files.c is just a simplification: I've deleted
the argument/option parsing, I've deleted the logic that let addr2line
handle files in a non-standard object-format, I've copied together some
procedures, and now there is just something like 85 lines left.
5.1 The windows 'get-an-address-off-the-stack macro'
----------------------------------------------------
#ifdef WINDOWS
#define handle_stack_address(X) \
   if (baseframe == 0L) \
   { \
      baseframe = (unsigned long)__builtin_frame_address(0); \
      baseframe = ((baseframe & 0xffff0000L) >> 16); \
      dlog(DEBUGSAVE, \
           "files.c: handle_signal_abort: baseframe now %08lx\n", \
           baseframe); \
   } \
   if (continue_stack_trace && \
       ((((unsigned long)__builtin_frame_address((X)) & 0xffff0000L) >> 16) \
        == baseframe) && \
       ((X) < MAX_STACK_ADDR)) \
   { \
      stack_addr[(X)] = (unsigned long)__builtin_return_address((X)); \
      dlog(DEBUGSAVE, \
           "files.c: handle_signal_abort: stack %d %08lx frame %d %08lx\n", \
           (X), (unsigned long)__builtin_return_address((X)), \
           (X), (unsigned long)__builtin_frame_address((X))); \
   } \
   else if (continue_stack_trace) \
   { \
      continue_stack_trace = FALSE; \
   }
#endif
Note that we use baseframe to check if we stay in the same frame. This
is based upon experimentation on my side.
5.2 The GNU/Linux 'get-an-address-off-the-stack macro'
------------------------------------------------------
#ifndef WINDOWS
#define handle_stack_address(X) \
   if (continue_stack_trace && \
       ((unsigned long)__builtin_frame_address((X)) != 0L) && \
       ((X) < MAX_STACK_ADDR)) \
   { \
      stack_addr[(X)] = (unsigned long)__builtin_return_address((X)); \
      dlog(DEBUGSAVE, \
           "files.c: handle_signal_abort: stack %d %08lx frame %d %08lx\n", \
           (X), (unsigned long)__builtin_return_address((X)), \
           (X), (unsigned long)__builtin_frame_address((X))); \
   } \
   else if (continue_stack_trace) \
   { \
      continue_stack_trace = FALSE; \
   }
#endif
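The control flow of these macros is easier to follow in a stripped-down
form. The sketch below is not the Angband/64 code: fake_chain is a
hypothetical stand-in for the builtins, so that the example runs
anywhere, and GRAB_LEVEL, grabbed, keep_going and grab_all are invented
names. It shows how one flag makes every later level a no-op once the
end has been seen:

```c
#include <assert.h>

#define MAX_LEVELS 4

/* Hypothetical stand-in for __builtin_return_address(n): a fixed table
   in which the first 0 marks the end of the call chain. */
static const unsigned long fake_chain[MAX_LEVELS + 1] =
    { 0x08117a69UL, 0x0810b4b7UL, 0UL, 0x0badc0deUL, 0UL };

static unsigned long grabbed[MAX_LEVELS];
static int keep_going;

/* One expansion per level, because the real builtins only accept a
   literal constant as their level. */
#define GRAB_LEVEL(X) \
    do { \
        if (keep_going && (X) < MAX_LEVELS && fake_chain[(X)] != 0UL) \
            grabbed[(X)] = fake_chain[(X)]; \
        else \
            keep_going = 0; \
    } while (0)

/* Grab all levels; returns how many addresses were stored. */
static int grab_all(void)
{
    int n;

    keep_going = 1;
    for (n = 0; n < MAX_LEVELS; n++) grabbed[n] = 0UL;

    GRAB_LEVEL(0); GRAB_LEVEL(1); GRAB_LEVEL(2); GRAB_LEVEL(3);

    for (n = 0; n < MAX_LEVELS && grabbed[n] != 0UL; n++) { }
    return n;
}

/* Self-check: the entry behind the first 0 must never be stored. */
static void demo_grab_all(void)
{
    assert(grab_all() == 2);
    assert(grabbed[3] == 0UL);
}
```

Level 3 is never stored, although the table holds a value there, because
level 2 already cleared the flag. With the real builtins this matters a
lot: looking past the end of the stack is what crashes.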
5.2 The decoding routine for the addresses on the stack
-------------------------------------------------------
/* this is an adapted version of addr2line */
/* addr2line.c -- convert addresses to line number and function name
Copyright 1997, 98, 99, 2000 Free Software Foundation, Inc.
Contributed by Ulrich Lauther <Ulrich.Lauther@zfe.siemens.de>
This file is part of GNU Binutils.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
/* Look for an address in a section. This is called via
bfd_map_over_sections. */
static void find_address_in_section (bfd *abfd, asection *section, PTR
data)
{
bfd_vma vma;
bfd_size_type size;
if (dump_found) return;
if ((bfd_get_section_flags (abfd, section) & SEC_ALLOC) == 0)
{
return;
}
vma = bfd_get_section_vma (abfd, section);
if (dump_pc < vma)
{
return;
}
size = bfd_get_section_size_before_reloc (section);
if (dump_pc >= vma + size)
{
return;
}
dump_found = bfd_find_nearest_line (abfd, section, dump_syms, dump_pc -
vma,
&dump_filename, &functionname,
&dump_line);
}
/* Translate a single address into file_name:line_number and function
   name. (The original addr2line read hexadecimal addresses from stdin.)
   The address is an unsigned long so it is not truncated on systems
   where pointers are wider than an int. */
static void translate_address (bfd *abfd, unsigned long address,
                               char *function_name, char *source_name,
                               int *source_line)
{
char addr_hex[100];
sprintf(addr_hex,"%lx", address);
dump_pc = bfd_scan_vma (addr_hex, NULL, 16);
dump_found = false;
bfd_map_over_sections (abfd, find_address_in_section, (PTR) NULL);
if (! dump_found)
{
strcpy(function_name, "??");
strcpy(source_name, "??");
*source_line = 0;
}
else
{
if (functionname == NULL || *functionname == '\0')
{
strcpy(function_name, "??");
if (dump_filename == NULL)
{
strcpy(source_name, "??");
}
else
{
strcpy(source_name, dump_filename);
}
*source_line = dump_line;
}
else
{
strcpy(function_name, functionname);
if (dump_filename == NULL)
{
strcpy(source_name, "??");
}
else
{
strcpy(source_name, dump_filename);
}
*source_line = dump_line;
}
/* fflush() is essential for using this command as a server
child process that reads addresses from a pipe and responds
with line number information, processing one address at a
time. */
}
fflush (stdout);
}
void dump_stack(void)
{
s16b i;
static unsigned long stack_addr[MAX_STACK_ADDR];
bfd *abfd;
char **matching;
long storage;
long symcount;
bfd_init();
/* clean the stack addresses if necessary */
for (i=0; i < MAX_STACK_ADDR; i++)
{
stack_addr[i] = (unsigned long)0;
}
handle_stack_address(0); handle_stack_address(1);
handle_stack_address(2); handle_stack_address(3);
handle_stack_address(4); handle_stack_address(5);
handle_stack_address(6); handle_stack_address(7);
handle_stack_address(8); handle_stack_address(9);
handle_stack_address(10); handle_stack_address(11);
handle_stack_address(12); handle_stack_address(13);
handle_stack_address(14); handle_stack_address(15);
handle_stack_address(16); handle_stack_address(17);
handle_stack_address(18); handle_stack_address(19);
handle_stack_address(20); handle_stack_address(21);
handle_stack_address(22); handle_stack_address(23);
handle_stack_address(24); handle_stack_address(25);
handle_stack_address(26); handle_stack_address(27);
handle_stack_address(28); handle_stack_address(29);
handle_stack_address(30); handle_stack_address(31);
handle_stack_address(32); handle_stack_address(33);
handle_stack_address(34); handle_stack_address(35);
handle_stack_address(36); handle_stack_address(37);
handle_stack_address(38); handle_stack_address(39);
/* dump stack frame */
i = MAX_STACK_ADDR-1;
while ( (i >=0) && (stack_addr[i] == 0)) i--;
if (i < 0)
{
dlog(DEBUGALWAYS,"files.c: dump_stack: unable to get any addresses "
     "off the stack.\n");
return;
}
abfd = bfd_openr (argv0, NULL);
if (abfd == NULL)
{
cptr errmsg = bfd_errmsg( bfd_get_error() );
dlog(DEBUGALWAYS,"files.c: dump_stack: abfd == NULL; bfd error = %s\n",
     errmsg);
return;
}
if (bfd_check_format (abfd, bfd_archive))
{
cptr errmsg = bfd_errmsg( bfd_get_error() );
dlog(DEBUGALWAYS,"files.c: dump_stack: bfd_check_format return-value "
     "!= 0; bfd error = %s\n", errmsg);
return;
}
if (! bfd_check_format_matches (abfd, bfd_object, &matching))
{
cptr errmsg = bfd_errmsg( bfd_get_error() );
dlog(DEBUGALWAYS,"files.c: dump_stack: format doesn't match; bfd "
     "error = %s\n", errmsg);
return;
}
if ((bfd_get_file_flags (abfd) & HAS_SYMS) == 0)
return;
storage = bfd_get_symtab_upper_bound (abfd);
if (storage < 0)
{
cptr errmsg = bfd_errmsg( bfd_get_error() );
dlog(DEBUGALWAYS,"files.c: dump_stack: storage < 0; bfd error = %s\n",
     errmsg);
return;
}
dump_syms = (asymbol **) xmalloc (storage);
symcount = bfd_canonicalize_symtab (abfd, dump_syms);
if (symcount < 0)
{
cptr errmsg = bfd_errmsg( bfd_get_error() );
dlog(DEBUGALWAYS,"files.c: dump_stack: symcount < 0; bfd error = %s\n",
     errmsg);
return;
}
for (; i>=0 ; i--)
{
{
char function_name[2048];
char source_name[2048];
int source_line;
translate_address (abfd, stack_addr[i], function_name,
source_name, &source_line);
dlog(DEBUGALWAYS,"files.c: stack_dump: stack frame %2d address "
     "%08lx = %s (%s %d) \n",
     i, stack_addr[i], function_name, source_name,
     source_line);
}
}
/* and cleaning up afterwards */
if (dump_syms != NULL)
{
free (dump_syms);
dump_syms = NULL;
}
bfd_close (abfd);
}
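The "??" fallback in translate_address is written out twice, and strcpy
trusts that the names fit into the buffers. A possible helper, under the
hypothetical name copy_or_unknown, could do both jobs. This is a sketch
and not part of the Angband/64 source; unlike the original it also
prints "??" for an empty file name:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst, truncated to size-1 characters, or copy "??" when
   src is NULL or empty. */
static void copy_or_unknown(char *dst, size_t size, const char *src)
{
    assert(dst != NULL && size > 0);

    if (src == NULL || *src == '\0')
        src = "??";

    /* snprintf never writes more than size bytes and always
       terminates the string. */
    snprintf(dst, size, "%s", src);
}
```

With such a helper the whole if/else tree could shrink to two calls, one
for function_name and one for source_name, each passing the size of its
buffer.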
5.3 The signal handler
----------------------
Now define some signal handlers like this:
#ifdef SIGFPE
(void)signal(SIGFPE, handle_signal_abort);
#endif
#ifdef SIGILL
(void)signal(SIGILL, handle_signal_abort);
#endif
#ifdef SIGTRAP
(void)signal(SIGTRAP, handle_signal_abort);
#endif
#ifdef SIGIOT
(void)signal(SIGIOT, handle_signal_abort);
#endif
/* SIGKILL cannot be caught, so catch SIGSEGV here instead */
#ifdef SIGSEGV
(void)signal(SIGSEGV, handle_signal_abort);
#endif
and define a function handle_signal_abort like:
static void handle_signal_abort(int sig)
{
bool save_ok = FALSE;
FILE *fff = NULL;
char filename[1024];
s16b i;
bool dump_ok = FALSE;
/* Clear the bottom lines */
Term_erase(0, 20, 80);
Term_erase(0, 21, 80);
Term_erase(0, 22, 80);
/* Give a warning */
Term_putstr(1, 20, -1, TERM_RED,
            "You suddenly see a gruesome SOFTWARE BUG leap for your throat!");
Term_xtra(TERM_XTRA_NOISE, 0);
/* Build the name of the crash report file */
strcpy(filename, ANGBAND_DIR_USER);
strcat(filename, "crash.txt");
#if defined(MACINTOSH) && !defined(applec)
/* Global -- "text file" */
_ftype = 'TEXT';
#endif
/* Drop priv's */
safe_setuid_drop();
/* Create the crash report file */
fff = my_fopen(filename, "w");
/* Grab priv's */
safe_setuid_grab();
/* Only write the report if the file could be opened */
if (fff)
{
fprintf(fff,"Your game has just crashed. Please forward the following\n");
fprintf(fff,"information to the maintainer (email to thunder7@xs4all.nl)\n\n");
fprintf(fff,"\nAlso, please add any information you feel is relevant:\n");
fprintf(fff,"especially, what were you doing at the time this happened?\n\n");
fprintf(fff,"Angband/64 beta %d release %d (%d.%d.%d)\n\n",
VERSION_BETA, VERSION_RELEASE, VERSION_MAJOR,
VERSION_MINOR, VERSION_PATCH);
fprintf(fff,"STACK TRACE:\n\n");
dump_stack();
fprintf(fff,"\nCONFIGURATION:\n\n");
fprintf(fff, "debuglevel 0x%08lx\n", debuglevel);
#ifdef ALLOW_COMPRESSION
fprintf(fff,"compression support compiled in\n");
#endif
and so on. An emergency-save routine is also nice to have here!
At the end, make sure your program ends with something like exit(1).
5.4 Compilation
---------------
You'll need to compile with debug-information (-g), make sure bfd.h can
be found when compiling, and link with -lbfd -liberty. I won't deny that
the last two can be a bit of a bother. They come out of the binutils
suite, which can be found on any GNU mirror. They are, however, *not*
included in binary distributions of binutils; you'll have to build your
own from source. On GNU/Linux building binutils is straightforward
(./configure; make; make install) but I had to go to some lengths to get
those libraries for go32 and cygwin32. Ah well, sensible people use
GNU/Linux anyway.
6 Sample output
---------------
Your game has just crashed. Please forward the following
information to the maintainer (email to thunder7@xs4all.nl)
Also, please add any information you feel is relevant:
especially, what were you doing at the time this happened?
Angband/64 beta 5 release 5 (2.7.10)
STACK TRACE:
stack frame 10 address 0804abb1 = _start (?? 0)
stack frame 9 address 401fb213 = ?? (?? 0)
stack frame 8 address 08117a69 = main
(/home/jurriaan/games/myang/src/main.c 640)
stack frame 7 address 0810b4b7 = play_game
(/home/jurriaan/games/myang/src/dungeon.c 2084)
stack frame 6 address 0810a355 = handle_dungeon
(/home/jurriaan/games/myang/src/dungeon.c 1386)
stack frame 5 address 08109ee7 = process_player
(/home/jurriaan/games/myang/src/dungeon.c 1116)
stack frame 4 address 08109545 = process_command
(/home/jurriaan/games/myang/src/dungeon.c 717)
stack frame 3 address 080f8565 = do_cmd_wizard
(/home/jurriaan/games/myang/src/wizard2.c 3264)
stack frame 2 address 080f72fb = wiz_create_crash
(/home/jurriaan/games/myang/src/wizard2.c 2411)
stack frame 1 address 402019b8 = ?? (?? 0)
stack frame 0 address 080b1b10 = handle_signal_abort
(/home/jurriaan/games/myang/src/files.c 4302)
CONFIGURATION:
debuglevel 0x80000040
compression support compiled in
using other RNG
monster flow support compiled in
DISPLAY MODULES:
main-xaw compiled in
main-x11 compiled in
main-gcu compiled in
OPTIONS SET:
Quick messages
Display coordinates on screen
Print experience needed to advance
Auto open doors when colliding
Pick things up by default
<etc. Angband/64 has a *lot* of options.>
========================================================================