1
votes

I am trying to understand the loader of the c++/g++ compilers and the convention it uses.

I have four source files:

Hello.h
Hello.cpp
Hello1.cpp
main.cpp

Hello.h

#include <iostream>

class Hello1
{
public:
int a;
void sayHello();
};

Hello.cpp

#include"Hello.h"

void Hello1::sayHello()
{
std::cout<<this->a;
}

Hello1.cpp

#include"Hello.h"

void Hello1::sayHello()
{
std::cout<<"Hello";
}

main.cpp

#include"Hello.h"

int main()
{
Hello1 hello;
hello.a=5;
hello.sayHello();
return 0;
}

Preprocessing and assembling pass for each file individually, and

c++ -c main.cpp

also produces a main.o. But linking and loading to produce an executable, i.e. c++ main.o, gives an error saying the function definition cannot be found:

main.o: In function `main':
main.cpp:(.text+0x19): undefined reference to `Hello1::sayHello()'
collect2: ld returned 1 exit status

I know that if I name the class Hello and include a corresponding Hello.cpp, the loader will find the function definition and execute the member function. But if I change the name of the class inside the header file Hello.h from Hello to Hello1, the object file is created without a problem, and the compiler knows that a class Hello1 exists and allocates memory for it (judging by the success of the c++ -c command), but the loader can't find the function body of sayHello(). It seems that it's not looking into Hello.cpp or Hello1.cpp, because Hello.h declares a class other than Hello.

So how does the loader load the function definition even in the normal case? Does it take the filename Hello.h and look for a Hello.cpp, or does it take the class name Hello1 and look for a Hello1.cpp? Or does it check whether the .h and class names are the same, and only then look for a .cpp of the same name, ignoring the rest of the classes in the header file?

It would be great if some C++ guru could give me some insight into the basis on which the loader picks up the definitions for what is pulled in via #include in a normal C++ file. Also, in this case, how can I reference the definition of sayHello() while using different names? Is it possible at all, or can a header file only hold the interface of a class with the same name?

2
Could you post the command line that is currently used for the linking step? - Medinoc
"if i change the name of the class inside the header file Hello.h from Hello to Hello1". Your Hello.h file already declares a Hello1 class, which is referenced from your main(). Your question contradicts itself, and makes no sense. - Sam Varshavchik
Wow. I've clearly referenced the standard implementation right before your one line cut and paste, from which I'm deviating. The line clearly is a reference to the standard implementation and not to the current modification which has brought this doubt up ... - Deepak Nair
@Medinoc just a simple 'c++ main.o'; since the input is an object file, gcc/c++ recognizes that and calls ld [...] main.o - Deepak Nair
Shouldn't it be c++ main.o hello.o? - Medinoc

2 Answers

3
votes

Short version: You provide a set of files that provide a list of symbols. You (or the build system) are responsible for providing the "right" list of symbols (and their definitions) by specifying the correct files. It doesn't matter whether those files are called Hello, Hello1, foo or bar (+ the appropriate suffix).


Let's take a look at the result of c++ -c main.cpp via objdump -t -C main.o

SYMBOL TABLE:
00000000 l    df *ABS*  00000000 main.cpp
00000000 l    d  .text  00000000 .text
00000000 l    d  .data  00000000 .data
00000000 l    d  .bss   00000000 .bss
00000000 l     O .bss   00000001 std::__ioinit
00000050 l     F .text  00000042 __static_initialization_and_destruction_0(int, int)
00000092 l     F .text  0000001a _GLOBAL__sub_I_main
00000000 l    d  .init_array    00000000 .init_array
00000000 l    d  .note.GNU-stack    00000000 .note.GNU-stack
00000000 l    d  .eh_frame  00000000 .eh_frame
00000000 l    d  .comment   00000000 .comment
00000000 g     F .text  00000050 main
00000000         *UND*  00000000 Hello1::sayHello()
00000000         *UND*  00000000 __stack_chk_fail
00000000         *UND*  00000000 std::ios_base::Init::Init()
00000000         *UND*  00000000 .hidden __dso_handle
00000000         *UND*  00000000 std::ios_base::Init::~Init()
00000000         *UND*  00000000 __cxa_atexit

There's a symbol main; it's a function (F), and it "needs" some other symbols, marked *UND*, that have not been found in this compilation unit.
To illustrate this, let's modify main.cpp a little:

#include"Hello.h"
#include <iostream>

// noinline, so that the compiler "keeps" this a function + function calls
void __attribute__ ((noinline)) foo() 
{
  std::cout << "ho ho ho" << std::endl;
}

int main()
{
  Hello1 hello;
  hello.a=5;
  foo();
  hello.sayHello();
  return 0;
}

Now the output of objdump... is

SYMBOL TABLE:
00000000 l    df *ABS*  00000000 main.cpp
00000000 l    d  .text  00000000 .text
00000000 l    d  .data  00000000 .data
00000000 l    d  .bss   00000000 .bss
00000000 l     O .bss   00000001 std::__ioinit
00000000 l    d  .rodata    00000000 .rodata
00000084 l     F .text  00000042 __static_initialization_and_destruction_0(int, int)
000000c6 l     F .text  0000001a _GLOBAL__sub_I__Z3foov
00000000 l    d  .init_array    00000000 .init_array
00000000 l    d  .note.GNU-stack    00000000 .note.GNU-stack
00000000 l    d  .eh_frame  00000000 .eh_frame
00000000 l    d  .comment   00000000 .comment
00000000 g     F .text  0000002f foo()
00000000         *UND*  00000000 std::cout
00000000         *UND*  00000000 std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
00000000         *UND*  00000000 std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)
00000000         *UND*  00000000 std::ostream::operator<<(std::ostream& (*)(std::ostream&))
0000002f g     F .text  00000055 main
00000000         *UND*  00000000 Hello1::sayHello()
00000000         *UND*  00000000 __stack_chk_fail
00000000         *UND*  00000000 std::ios_base::Init::Init()
00000000         *UND*  00000000 .hidden __dso_handle
00000000         *UND*  00000000 std::ios_base::Init::~Init()
00000000         *UND*  00000000 __cxa_atexit

As you can see, there's no *UND* foo(): the compiler could resolve that symbol and the call to it on its own.
OK, now what does the linker do? It gets a list of input files and makes a list of all the symbols defined in those files. Then it looks at the dependencies and tries to resolve them. main "needs" a symbol Hello1::sayHello() (the -C option made it look like this, see https://en.wikipedia.org/wiki/Name_mangling).
If there is such a symbol in the linker's symbol list (and it fits), the dependency can be resolved. If there is no such symbol, you get the "undefined reference to" / "unresolved symbol" error message.
I.e. you have to provide an object (file) that defines the needed symbol, or else the linker will fail. What name this file has doesn't matter.
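
To see what the -C option does to a name, here is a minimal sketch that demangles a symbol by hand. It assumes GCC or Clang, whose <cxxabi.h> header provides abi::__cxa_demangle (an extension, not standard C++). The helper name demangle is made up for this illustration, and _ZN6Hello18sayHelloEv is what Hello1::sayHello() should look like under the Itanium C++ ABI mangling that g++ uses.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang extension)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI
// mangled name, or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled)
{
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr)
        return mangled;

    std::string result(text);
    std::free(text);  // the caller owns the malloc'ed buffer
    return result;
}
```

With this, demangle("_Z3foov") gives foo(), the name hiding inside _GLOBAL__sub_I__Z3foov in the table above, and demangle("_ZN6Hello18sayHelloEv") gives Hello1::sayHello(). The linker itself only ever compares the mangled strings.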

Hello.o provides a symbol Hello1::sayHello() and it would satisfy the requirements of the reference in main.o:

...
00000000 g     F .text  0000001f Hello1::sayHello()
00000000         *UND*  00000000 std::cout
00000000         *UND*  00000000 std::ostream::operator<<(int)
00000000         *UND*  00000000 std::ios_base::Init::Init()
00000000         *UND*  00000000 .hidden __dso_handle
00000000         *UND*  00000000 std::ios_base::Init::~Init()
00000000         *UND*  00000000 __cxa_atexit
..

and so does Hello1.o

...
00000000 g     F .text  0000001e Hello1::sayHello()
00000000         *UND*  00000000 std::cout
00000000         *UND*  00000000 std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
00000000         *UND*  00000000 std::ios_base::Init::Init()
00000000         *UND*  00000000 .hidden __dso_handle
00000000         *UND*  00000000 std::ios_base::Init::~Init()
00000000         *UND*  00000000 __cxa_atexit
...

So if you call (or let c++/gcc make that call) ld [...] main.o Hello.o, the definition of the symbol Hello1::sayHello() is taken from Hello.o; if you call ld [...] main.o Hello1.o, Hello1.o's Hello1::sayHello() is used.
Now call c++ main.cpp Hello.cpp Hello1.cpp and you'll get a "multiple definition of `Hello1::sayHello()'" error, because there are two symbols with the same name (and no mechanism to resolve that conflict).
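
The three cases above can be sketched as a toy model. This is not how ld is implemented: ObjectFile and toyLink are hypothetical names, only the user-defined symbols are modelled (the standard library symbols are left out), and the messages merely imitate ld's wording. Note that file names appear only in the messages, never in the lookup.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, heavily simplified view of an object file: a name plus
// the symbols it defines and the symbols it still needs (*UND*).
struct ObjectFile {
    std::string name;
    std::set<std::string> defined;
    std::set<std::string> undefined;
};

// Toy "link" step: returns the error messages; an empty result means
// every needed symbol was defined exactly once.
std::vector<std::string> toyLink(const std::vector<ObjectFile>& inputs)
{
    std::vector<std::string> errors;
    std::map<std::string, std::string> definedIn;  // symbol -> defining file

    // Pass 1: collect every definition from every input file.
    for (const ObjectFile& obj : inputs) {
        for (const std::string& sym : obj.defined) {
            auto result = definedIn.insert({sym, obj.name});
            if (!result.second)  // symbol was already defined elsewhere
                errors.push_back(obj.name + ": multiple definition of `" + sym +
                                 "'; first defined in " + result.first->second);
        }
    }

    // Pass 2: every needed symbol must appear among the definitions.
    for (const ObjectFile& obj : inputs) {
        for (const std::string& sym : obj.undefined) {
            if (definedIn.count(sym) == 0)
                errors.push_back(obj.name + ": undefined reference to `" + sym + "'");
        }
    }
    return errors;
}
```

Feeding it only a main.o that defines main and needs Hello1::sayHello() yields one "undefined reference" message. Adding either a Hello.o or a Hello1.o that defines Hello1::sayHello() yields none, and adding both yields one "multiple definition" message.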

2
votes

You need to tell the linker which object (.o) file to use: Hello.o or Hello1.o. So your command line would look like this:

c++ main.o Hello.o

or

c++ main.o Hello1.o

If you try to use both, you will get an error like this:

$ c++ main.o Hello1.o Hello.o
Hello.o: In function `Hello1::sayHello()':
Hello.cpp:(.text+0x0): multiple definition of `Hello1::sayHello()'
Hello1.o:Hello1.cpp:(.text+0x0): first defined here
collect2: ld returned 1 exit status

In answer to your last question: no, the name of the header file (or of the .cpp file) does not need to match the name of the class defined inside.

So this is legal:

foo.h

class Bar 
{
 public:
 void someFunc();
};