Recently (yay, no more school) I've been teaching myself about the ELF file format. I've largely been following the documentation here: http://www.skyfree.org/linux/references/ELF_Format.pdf.
It was all going great, and I wrote this program to give me info about an ELF file's sections:
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <elf.h>

void dumpShdrInfo(Elf32_Shdr elfShdr, const char *sectionName)
{
    printf("Section '%s' starts at 0x%08X and ends at 0x%08X\n",
           sectionName, elfShdr.sh_offset, elfShdr.sh_offset + elfShdr.sh_size);
}

int search(const char *name)
{
    Elf32_Ehdr elfEhdr;
    Elf32_Shdr *elfShdr;
    FILE *targetFile;
    char tempBuf[64];
    int i, ret = -1;

    targetFile = fopen(name, "r+b");
    if(targetFile)
    {
        /* read the ELF header */
        fread(&elfEhdr, sizeof(elfEhdr), 1, targetFile);

        /* Elf32_Ehdr.e_shnum specifies how many sections there are */
        elfShdr = calloc(elfEhdr.e_shnum, sizeof(*elfShdr));
        assert(elfShdr);

        /* set the file pointer to the section header offset and read it */
        fseek(targetFile, elfEhdr.e_shoff, SEEK_SET);
        fread(elfShdr, sizeof(*elfShdr), elfEhdr.e_shnum, targetFile);

        /* loop through every section */
        for(i = 0; (unsigned int)i < elfEhdr.e_shnum; i++)
        {
            /* if Elf32_Shdr.sh_addr isn't 0 the section will appear in memory */
            if(elfShdr[i].sh_addr)
            {
                /* set the file pointer to the location of the section's name and then read the name */
                fseek(targetFile, elfShdr[elfEhdr.e_shstrndx].sh_offset + elfShdr[i].sh_name, SEEK_SET);
                fgets(tempBuf, sizeof(tempBuf), targetFile);
#if defined(DEBUG)
                dumpShdrInfo(elfShdr[i], tempBuf);
#endif
            }
        }

        fclose(targetFile);
        free(elfShdr);
    }

    return ret;
}

int main(int argc, char *argv[])
{
    if(argc > 1)
    {
        search(argv[1]);
    }

    return 0;
}
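As an aside on the name lookup in the loop above: rather than seeking and calling fgets for each section, the section-name string table could be read into memory once and indexed with sh_name. A rough sketch of the lookup part, assuming the table has already been loaded into a buffer (sectionNameAt and its parameters are hypothetical names, not something from the spec or from elf.h):

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the name stored at nameOffset inside an
 * in-memory copy of the section-name string table, or NULL if the offset
 * is out of range or the name is not NUL-terminated within the table. */
const char *sectionNameAt(const char *strtab, size_t strtabSize, Elf32_Word nameOffset)
{
    assert(strtab != NULL);

    if (nameOffset >= strtabSize)
        return NULL;

    /* make sure a terminating NUL exists before the end of the table */
    if (memchr(strtab + nameOffset, '\0', strtabSize - nameOffset) == NULL)
        return NULL;

    return strtab + nameOffset;
}
```

With a table loaded from elfShdr[elfEhdr.e_shstrndx], the call would look something like sectionNameAt(table, tableSize, elfShdr[i].sh_name).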
After running it a few times on a couple of files I noticed something weird. The '.text' section always began at a very low virtual address (we're talking smaller than 1000h). After digging around with gdb for a while, I noticed that for every section, sh_addr was equal to sh_offset.
This is what I'm confused about - Elf32_Shdr.sh_addr is documented as being "the address at which the first byte should reside", while Elf32_Shdr.sh_offset is documented as being "the byte offset from the beginning of the file to the first byte in the section". If those are both the case, it doesn't really make sense to me that the two values are equal. Why is this?
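To make the comparison concrete, here is a small sketch that prints the two fields next to each other for one section header, so it's easy to see whether they really match. The helper names (addrEqualsOffset, dumpAddrAndOffset) are made up for illustration; only the Elf32_Shdr fields come from elf.h.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 when sh_addr and sh_offset hold the same value. */
int addrEqualsOffset(const Elf32_Shdr *shdr)
{
    assert(shdr != NULL);
    return shdr->sh_addr == shdr->sh_offset;
}

/* Hypothetical helper: print both fields side by side for one section. */
void dumpAddrAndOffset(const Elf32_Shdr *shdr, const char *sectionName)
{
    assert(shdr != NULL && sectionName != NULL);
    printf("%-16s sh_addr=0x%08lX sh_offset=0x%08lX%s\n",
           sectionName,
           (unsigned long)shdr->sh_addr,
           (unsigned long)shdr->sh_offset,
           addrEqualsOffset(shdr) ? "  (equal)" : "");
}
```

Calling dumpAddrAndOffset(&elfShdr[i], tempBuf) inside the loop of the program above would give one line per section.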
Now, I know there are sections that contain uninitialized data (.bss I think), so it would make sense that that data would not appear in the file but would appear in the process's memory. This would mean that for every section that comes after the aforementioned one, figuring out its virtual address would be a lot more complicated than reading a single field.
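For what it's worth, the spec I linked describes this case through the section type and flags: a section of type SHT_NOBITS occupies no space in the file even if sh_size is non-zero, and the SHF_ALLOC flag marks sections that occupy memory during execution. A minimal sketch of how I understand that, with hypothetical helper names (sectionFileSize, sectionMemSize):

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>

/* Hypothetical helper: bytes the section takes up in the file.
 * SHT_NOBITS sections (e.g. .bss) take up none. */
Elf32_Word sectionFileSize(const Elf32_Shdr *shdr)
{
    assert(shdr != NULL);
    return (shdr->sh_type == SHT_NOBITS) ? 0 : shdr->sh_size;
}

/* Hypothetical helper: bytes the section takes up in the process image.
 * Only sections flagged SHF_ALLOC occupy memory during execution. */
Elf32_Word sectionMemSize(const Elf32_Shdr *shdr)
{
    assert(shdr != NULL);
    return (shdr->sh_flags & SHF_ALLOC) ? shdr->sh_size : 0;
}
```

This only shows how the file size and memory size of a single section can differ; it doesn't by itself say where later sections end up in memory.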
That being said, is there a way of actually determining a section's virtual address?