CWYAlpha

Just another WordPress.com site

Thought this was cool: Learning C with GDB – Blog – Hacker School

URL: https://www.hackerschool.com/blog/5-learning-c-with-gdb

Learning C with gdb

Aug 27, 2012

Coming from a background in higher-level languages like Ruby, Scheme, or
Haskell, learning C can be challenging. In addition to having to wrestle with
C’s lower-level features like manual memory management and pointers, you have
to make do without a REPL. Once you get used to exploratory programming in a
REPL, having to deal with the write-compile-run loop is a bit of a bummer.

It occurred to me recently that I could use gdb as a pseudo-REPL for C.
I’ve been experimenting with using gdb as a tool for learning C, rather
than merely debugging C, and it’s a lot of fun.

My goal in this post is to show you that gdb is a great tool for learning
C. I’ll introduce you to a few of my favorite gdb commands, and then I’ll
demonstrate how you can use gdb to understand a notoriously tricky part of C:
the difference between arrays and pointers.

An introduction to gdb

Start by creating the following little C program, minimal.c:

int main()
{
 int i = 1337;
 return 0;
}

Note that the program does nothing and has not a single printf statement.
Behold the brave new world of learning C with gdb!

Compile it with the -g flag so that gdb has debug information to work
with, and then feed it to gdb:

$ gcc -g minimal.c -o minimal
$ gdb minimal

You should now find yourself at a rather stark gdb prompt. I promised you
a REPL, so here goes:

(gdb) print 1 + 2
$1 = 3

Amazing! print is a built-in gdb command that prints the evaluation of a
C expression. If you’re unsure of what a gdb command does, try running
help name-of-the-command at the gdb prompt.

Here’s a somewhat more interesting example:

(gdb) print (int) 2147483648
$2 = -2147483648

I’m going to ignore why 2147483648 == -2147483648; the point is that even
arithmetic can be tricky in C, and gdb understands C arithmetic.

Let’s now set a breakpoint in the main function and start the program:

(gdb) break main
(gdb) run

The program is now paused on line 3, just before i gets initialized.
Interestingly, even though i hasn’t been initialized yet, we can still look
at its value using the print command:

(gdb) print i
$3 = 32767

In C, the value of an uninitialized local variable is undefined, so gdb
might print something different for you!

We can execute the current line with the next command:

(gdb) next
(gdb) print i
$4 = 1337

Examining memory with x

Variables in C label contiguous chunks of memory. A variable’s chunk is
characterized by two numbers:

1. The numerical address of the first byte in the chunk.
2. The size of the chunk, measured in bytes. The size of a variable’s
   chunk is determined by the variable’s type.

One of the distinctive features of C is that you have direct access to a
variable’s chunk of memory. The & operator computes a variable’s address,
and the sizeof operator computes a variable’s size in memory.

You can play around with both concepts in gdb:

(gdb) print &i
$5 = (int *) 0x7fff5fbff584
(gdb) print sizeof(i)
$6 = 4

In words, this says that i’s chunk of memory starts at address
0x7fff5fbff584 and takes up four bytes of memory.

I mentioned above that a variable’s size in memory is determined by its
type, and indeed, the sizeof operator can operate directly on types:

(gdb) print sizeof(int)
$7 = 4
(gdb) print sizeof(double)
$8 = 8

This means that, on my machine at least, int variables take up four
bytes of space and double variables take up eight.

Gdb comes with a powerful tool for directly examining memory: the x
command. The x command examines memory, starting at a particular address.
It comes with a number of formatting commands that provide precise control
over how many bytes you’d like to examine and how you’d like to print them;
when in doubt, try running help x at the gdb prompt.

The & operator computes a variable’s address, so that means we can feed
&i to x and thereby take a look at the raw bytes underlying i’s value:

(gdb) x/4xb &i
0x7fff5fbff584: 0x39 0x05 0x00 0x00

The flags indicate that I want to examine 4 values, formatted as hex
numerals, one byte at a time. I’ve chosen to examine four bytes because
i’s size in memory is four bytes; the printout shows i’s raw byte-by-byte
representation in memory.

One subtlety to bear in mind with raw byte-by-byte examinations is that on
Intel machines, bytes are stored in “little-endian” order: unlike human
notation, the least significant bytes of a number come first in memory.

One way to clarify the issue would be to give i a more interesting value
and then re-examine its chunk of memory:

(gdb) set var i = 0x12345678
(gdb) x/4xb &i
0x7fff5fbff584: 0x78 0x56 0x34 0x12

Examining types with ptype

The ptype command might be my favorite command. It tells you the type of
a C expression:

(gdb) ptype i
type = int
(gdb) ptype &i
type = int *
(gdb) ptype main
type = int (void)

Types in C can get complex, but ptype allows you to explore them
interactively.

Pointers and arrays

Arrays are a surprisingly subtle concept in C. The plan for this section
is to write a simple program and then poke it in gdb until arrays start to
make sense.

Code up the following arrays.c program:

int main()
{
 int a[] = {1,2,3};
 return 0;
}

Compile it with the -g flag, run it in gdb, then next over the
initialization line:

$ gcc -g arrays.c -o arrays
$ gdb arrays
(gdb) break main
(gdb) run
(gdb) next

At this point you should be able to print the contents of a and
examine its type:

(gdb) print a
$1 = {1, 2, 3}
(gdb) ptype a
type = int [3]

Now that our program is set up correctly in gdb, the first thing we should
do is use x to see what a looks like under the hood:

(gdb) x/12xb &a
0x7fff5fbff56c: 0x01 0x00 0x00 0x00 0x02 0x00 0x00 0x00
0x7fff5fbff574: 0x03 0x00 0x00 0x00

This means that a’s chunk of memory starts at address
0x7fff5fbff56c. The first four bytes store a[0],
the next four store a[1], and the final four store
a[2]. Indeed, you can check that sizeof knows
that a’s size in memory is twelve bytes:

(gdb) print sizeof(a)
$2 = 12

At this point, arrays seem to be quite array-like. They have their own
array-like types and store their members in a contiguous chunk of memory.
However, in certain situations, arrays act a lot like pointers! For instance,
we can do pointer arithmetic on a:

(gdb) print a + 1
$3 = (int *) 0x7fff5fbff570

In words, this says that a + 1 is a pointer to an int and holds the
address 0x7fff5fbff570. At this point you should be reflexively passing
pointers to the x command, so let’s see what happens:

(gdb) x/4xb a + 1
0x7fff5fbff570: 0x02 0x00 0x00 0x00

Note that 0x7fff5fbff570 is four more than
0x7fff5fbff56c, the address of a’s first byte in
memory. Given that int values take up four bytes, this means
that a + 1 points to a[1].

In fact, array indexing in C is syntactic sugar for pointer arithmetic:
a[i] is equivalent to *(a + i). You can try this
in gdb:

(gdb) print a[0]
$4 = 1
(gdb) print *(a + 0)
$5 = 1
(gdb) print a[1]
$6 = 2
(gdb) print *(a + 1)
$7 = 2
(gdb) print a[2]
$8 = 3
(gdb) print *(a + 2)
$9 = 3

We’ve seen that in some situations a acts like an array and in others it
acts like a pointer to its first element. What’s going on?

The answer is that when an array name is used in a C expression, it
“decays” to a pointer to the array’s first element. There are only two
exceptions to this rule: when the array name is passed to
sizeof and when the array name is passed to the &
operator.

The fact that a doesn’t decay to a pointer when passed to the
& operator brings up an interesting question: is there a
difference between the pointer that a decays to and
&a?

Numerically, they both represent the same address:

(gdb) x/4xb a
0x7fff5fbff56c: 0x01 0x00 0x00 0x00
(gdb) x/4xb &a
0x7fff5fbff56c: 0x01 0x00 0x00 0x00

However, their types are different. We’ve already seen that the decayed
value of a is a pointer to a’s first element;
this must have type int *. As for the type of
&a, we can ask gdb directly:

(gdb) ptype &a
type = int (*)[3]

In words, &a is a pointer to an array of three integers.
This makes sense: a doesn’t decay when passed to
&, and a has type int [3].

You can observe the distinction between a’s decayed value and
&a by checking how they behave with respect to pointer
arithmetic:

(gdb) print a + 1
$10 = (int *) 0x7fff5fbff570
(gdb) print &a + 1
$11 = (int (*)[3]) 0x7fff5fbff578

Note that adding 1 to a adds four to
a’s address, whereas adding 1 to
&a adds twelve!

The pointer that a actually decays to is
&a[0]:

(gdb) print &a[0]
$12 = (int *) 0x7fff5fbff56c

Conclusion

Hopefully I’ve convinced you that gdb is a neat exploratory environment for
learning C. You can print the evaluation of expressions,
examine raw bytes in memory, and tinker with the type system
using ptype.

If you’d like to experiment further with using gdb to learn C, I have a
few suggestions:

- Use gdb to work through the Ksplice pointer challenge.
- Investigate how structs are stored in memory. How do they compare to
  arrays?
- Use gdb’s disassemble command to learn assembly programming! A
  particularly fun exercise is to investigate how the function call stack
  works.
- Check out gdb’s “tui” mode, which provides a graphical ncurses layer on
  top of regular gdb. On OS X, you’ll likely need to install gdb from
  source.


Alan is a facilitator at Hacker School. He’d like to thank David Albert,
Tom Ballinger, Nicholas Bergson-Shilcock, and Amy Dyer for helpful feedback.


Written by cwyalpha

September 5, 2012 at 9:08 AM

Posted in Uncategorized
