Three functions are provided here:
fd_dump: lists all FDs
fd_dump_conn: lists all FDs holding a connection
fd_dump_listener: lists all FDs holding a listener
They take no argument, and dump some of the known info. E.g. for
a connection, ctrl, xprt, flags, mux, sessions, frontend's name
and session's age are reported. Example:
(gdb) fd_dump_conn
fd 31: rm=0 tm=0x2 um=0 st=0x21 refc=0x1 tkov=0 gen=0 conn=0x7fffe803b600: flg=0x300 err=0 ctrl=0xdf51c0 xprt=0xdf5c80 mux=0xbaeee0 sess=0x7ffff003b570: fe=0x1e45b00 id=foo age=0ms
They are particularly slow because they iterate over all possible FDs,
so better limit them to the desired types.
The thread_dump function dumps the list of known threads and a few info
on them (pointer, current run queue, flags etc). This should help more
easily spot a particular one and find stuck ones.
E.g:
(gdb) thread_dump
Tid 0: pth=0x7ffff7e797c0 mono=2222322327950732 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 1: pth=0x7ffff78d8640 mono=2222322327928085 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 2: pth=0x7ffff6b7e640 mono=2222322327927150 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 3: pth=0x7ffff637d640 mono=2222322327924878 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 4: pth=0x7ffff5b7c640 mono=2222322327925676 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 5: pth=0x7ffff537b640 mono=2222322327929524 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 6: pth=0x7ffff4b7a640 mono=2222322327926817 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
Tid 7: pth=0x7fffdffff640 mono=2222322327947960 now_ms=4294947291 fl=0x38 rq=-1 cq=0 current=(nil)
New functions task_dump_wq and task_dump_rq can be used to dump tasks
in a wait queue or in a run queue respectively. For the wait queue (the
most common usage), one needs to pass either the thread-local's timers,
or the thread group ones for shared tasks:
task_dump_wq &ha_tgroup_ctx[0].timers
task_dump_wq &ha_thread_ctx[0].timers
For the run queue, task_dump_rq will take the thread's rqueue:
task_dump_rq &ha_thread_ctx[0].rqueue
The output is the task pointer and a dump of the task* struct per line,
then a total count at the end.
The ebtree descent functions currently use $arg0 as is and it's up to
the user to manually type the required casts that are never obvious
(particularly when coming from a pointer). Let's put the eb_root* cast
in the function to be more user-friendly.
This utility takes in argument the path to a core dump, and it looks
for the archive signature of libraries embedded with "set-dumpable libs",
and either emits the offset and size of stdout, or directly dumps the
contents so that the tar file can be extracted directly by piping the
output to tar xf.
The pools memory usage calculation was done using ints by default, making
it harder to identify large ones. Let's switch to unsigned long for the
size calculations.
More and more often, core dumps retrieved on systems that build with
-fPIE by default are becoming unexploitable. Even functions and global
symbols get relocated and gdb cannot figure their final position.
Ironically the post_mortem struct lying in its own section that was
meant to ease its finding is not exempt from this problem.
The only remaining way is to inspect the core to search for the
post-mortem magic, figure its offset from the file and look up the
corresponding virtual address with objdump. This is quite a hassle.
This patch implements a simple utility that opens a 64-bit core dump,
scans the program headers looking for a data segment which contains
the post-mortem magic, and prints it on stdout. It also places the
"pm_init" command alone on its own line to ease copy-pasting into the
gdb console. With this, at least the other commands in this directory
work again and allow to inspect the program's state. E.g:
$ ./getpm core.57612
Found post-mortem magic in segment 5:
Core File Offset: 0xfc600 (0xd5000 + 0x27600)
Runtime VAddr: 0x5613e52b6600 (0x5613e528f000 + 0x27600)
Segment Size: 0x28000
In gdb, copy-paste this line:
pm_init 0x5613e52b6600
It's worth noting that the program has so few dependencies that it even
builds with nolibc, allowing to upload a static executable into containers
being debugged and lacking development tools and compilers. The build
procedure is indicated inthe source code.
These is a collection of functions I'm occasionally using to navigate
in core dumps. Only working ones were extracted.
Those requiring knowledge of global variables (e.g. pools, proxy list)
use the one extracted from the post_mortem struct. That one is defined
in post-mortem.gdb and needs to be initialized using "pm_init post_mortem"
or "pm_init <pointer>". From this point a number of global variables are
accessible even if symbols are missing; those ones are then used by other
functions to dump streams, threads, pools, proxies etc.
The files can be sourced or copy-pasted into a gdb session. It's worth
trying to keep them up-to-date, as the old ones used to navigate through
tasks are no longer usable due to massive changes.