GDB - Core Dumps

Outcome

Able to enable core dumps and debug them using GDB.

Introduction

When a program crashes, its memory and some state information at that point can be written to a core dump file. A common cause of a core dump is a segmentation fault, which is triggered by an attempt to access an illegal memory location. Typical culprits are use after free, buffer overflow, and dereferencing the NULL pointer. GDB can read the core dump file and show useful information about the program at the moment it crashed.

Dumping core is also very useful in situations where faults occur intermittently. It allows you to inspect what might have happened even in situations where the fault is difficult to trigger.

Core Dump Settings

To enable core dumps, first check the maximum core dump size:

$ ulimit -c

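If the result is zero (i.e. no core dump will be produced), raise the limit. A specific value also works: for example, ulimit -c 10000 caps the core at roughly 10 MB (bash counts this limit in 1024-byte blocks), and a core larger than that is truncated. To remove the limit entirely: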

$ ulimit -c unlimited

A core dump will now be generated and placed in the location specified by /proc/sys/kernel/core_pattern. Check this location by running:

$ cat /proc/sys/kernel/core_pattern

On CSE systems (and many other systems), the default settings result in the output:

core

This means that any core dumps will be placed in the current directory in a file named core.

You can change this location using:


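$ echo '%e.core.%p.%s.%t' > /proc/sys/kernel/core_pattern

Writing to this file requires root privileges. With this pattern, core files are named <program name>.core.<pid>.<signal number>.<timestamp>, for example:

M2202.core.2040.11.1657722153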

Generating a Core Dump

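Compile the code for use with GDB. You can also compile without -g, but then the binary carries no debugging information: GDB can only tell you which function crashed, not the exact line.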

$ gcc -g <any other flags> -o file_name file_name.c

Run the program as normal

$ ./<file_name>
Segmentation fault (core dumped)

An error message like the one above should appear if the program crashes.

Starting a GDB session

Start a GDB session with the program binary and the core dump file:

$ gdb <binary-file> <core-dump-file>

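GDB is helpful for inspecting the stack frames and the state of variables and registers at the moment the program crashed. Commands such as where, up, down, print, info locals, info args, info registers and list can be helpful in this situation. where is essentially the same as bt (backtrace), so bt can be used instead; up and down move between stack frames, and info locals shows the local variables of the selected frame.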


It is useful to remember that, while debugging a core dump, the program is not actually running, so commands that control execution, such as step, next and continue, are unavailable.

Coredumps and WSL

Core dumps are currently unavailable on WSL (see the GitHub issue for more details). An alternative is to run the program inside gdb and let it crash there, which gives access to similar commands such as where and info locals.


Example

In this example, we will debug a program that creates a linked list and then prints it out. During execution, however, a segmentation fault occurs. We will inspect the resulting core dump to determine the source of the bug.

broken_linked_list.c

//Makes a linked list of length 7 and prints it out
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>


struct node {
    int data;
    struct node *next;
};

struct node *create_node(int data);
struct node *create_list(int length);
void print_list(struct node *list, int length);

int main(void){
    int length1 = 7;
    struct node *list1 = create_list(length1);
    print_list(list1, length1);

    return 0;
}

struct node *create_node(int data){
    struct node *new = malloc(sizeof(struct node));
    assert(new != NULL);
    new->data = data;
    new->next = NULL;
    return new;
}

struct node *create_list(int length) {
    struct node *head = NULL;
    if (length > 0) {
        head = create_node(0);
        int i = 1;
        struct node *curr = head;
        while (i < length) {
            curr->next = create_node(i);
            curr = curr->next;
            i++;
        }
    }
    return head;
}

void print_list(struct node *list, int length){
    struct node *curr = list;
    int i = 0;
    while (i <= length) {
        printf("%d->", curr->data);
        curr = curr->next;
        i++;
    }
    printf("X\n");
}

Note

It is assumed that you have the knowledge introduced in the Basic Use, Breakpoints, Viewing Data and Navigating Your Program modules.

When the program above is compiled and run, the following output is produced:

$ gcc -g -o broken_linked_list broken_linked_list.c
$ ./broken_linked_list
Segmentation fault (core dumped)

This output means that the program crashed because it accessed a part of memory that it is not allowed to access.

First, we want to find the line that it crashed on. There should now be a file called core inside the current directory (if not, see the Core Dump Settings section).

Start a GDB session for the core dump.

$ gdb broken_linked_list core

Immediately, GDB will output the line it crashed on.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055be9593e283 in print_list (list=0x55be96c20260, length=7)
    at broken_linked_list.c:51
51          printf("%d->", curr->data);

We now know from this output that invalid memory was accessed on line 51, so we look at what that line reads. The only memory access on this line is curr->data, so either curr itself is not a valid pointer or the memory it points to cannot be read. Let’s print out the curr pointer.

(gdb) print curr
$1 = (struct node *) 0x0

We know that we are not allowed to dereference the NULL (zero) pointer so we have found why our program has segfaulted. However, we are not sure about why it is dereferencing a NULL pointer. Let’s look at the local variables and see if they hold any clues.

(gdb) info locals
curr = 0x0
i = 7

When the program crashed, i was 7, which means the loop was on its 8th iteration. Our linked list is only 7 nodes long, so the loop should never reach an ‘8th node’. If the linked list was constructed correctly, the next pointer of the 7th node is NULL, which is exactly the value curr holds.

Let’s check some more variables, such as the arguments passed into the function.

(gdb) info args
list = 0x55be96c20260
length = 7

Our linked list is indeed 7 nodes long, and we can check that list is a valid pointer by printing the dereferenced struct.

(gdb) print *list
$2 = {data = 0, next = 0x55be96c20280}

We know the arguments are correct, so the issue must be inside the function.

We can use list to look at the code around the current line.

(gdb) list
46
47  void print_list(struct node *list, int length){
48      struct node *curr = list;
49      int i = 0;
50      while (i <= length) {
51          printf("%d->", curr->data);
52          curr = curr->next;
53          i++;
54      }
55      printf("X\n");

Reproduction without permission is prohibited: Mr.Zhang » GDB - Core Dumps
