hash tables vs linked lists

Question

So i want to do a football (soccer for americans) games simulator and i want to store a series of games in a data structure, the problem is this i tried to do it with arrays,but since it has fixed sise, i had to resize it and it can be a pain to resize, plus its not that efficient.

Then i saw that there were linked lists and u can add or remove an element very easily making it better than the arrays but the problem is that to access an element u have to do a linear search which is O(N).

But then i saw that u can use a hash table and my teachers (im doing this for a college project) gave an hash table algorithm that uses linked lists, with double hashing, linear probing and external chaining.

The thing is to some degree hash tables can be a great optimiztion for linked lists since u can access its elements in O(1), but the problem is that its size is fixed and because of that it will eventualy need to be resized (with that also needing to spend more memory in expanding it and more time).

So should i choose doing it with linked lists or hash tables?

Thanks for the help.

And in the end I would simply do what seems easiest. Modern computers are really fast. They also have lots of memory, so as I said elsewhere: Simply use a really huge array and write in your manual that the maximum number of games you can accomodate is 100,000 or whatever. — Peter - Reinstate Monica
Is this C or C++? In C++ you have std::vector and other dynamic data structures. You could also use a database library so you have persistent storage of your games. — Dave S
And in this case u dont know the number of games, u just add until u dont want to add more, theres no limit thats the problem — Martim Correia

Costantino Grana Costantino Grana · Accepted Answer · 2020-05-18T11:35:41

[...] i want to store a series of games in a data structure, the problem is this i tried to do it with arrays,but since it has fixed sise,

In C you have pointers, so sequences of objects can be allocated dynamically.

i had to resize it and it can be a pain to resize,

Why is it a pain? You can do it very easily like this:

#include <stdlib.h>
struct game {
    int example_field;
};

int main(void) 
{
    struct game *pgames = NULL;
    size_t ngames = 0;

    for (size_t i = 0; i < 10; ++i) {
        struct game g = { i };

        // Add game
        pgames = realloc(pgames, (ngames + 1)* sizeof(struct game));
        pgames[ngames++] = g;
    }

    free(pgames);
    return EXIT_SUCCESS;
}

plus its not that efficient.

This is your main mistake. Don't talk about efficiency until you measure it. If you have a way of measuring your performance, it's great. If you don't, don't assume. Computational complexity is different from efficiency. And your model for computational complexity is disregarding how things are implemented in hardware: cache, branch prediction, vector instructions.

My code before was as simple as possible, but required one reallocation per element. You can do better like this:

#include <stdlib.h>
struct game {
    int example_field;
};

int main(void) 
{
    struct game *pgames = NULL;
    size_t ngames = 0, cgames = 0; // cgames is the storage capacity

    for (size_t i = 0; i < 10; ++i) {
        struct game g = { i };

        // Add game
        if (ngames == cgames) {
            cgames = cgames * 2 + 1;
            pgames = realloc(pgames, cgames * sizeof(struct game));
        }
        pgames[ngames++] = g;
    }

    free(pgames);
    return EXIT_SUCCESS;
}

Then i saw that there were linked lists and u can add or remove an element very easily making it better than the arrays

"easily" is just calling a function. Not different than using dynamic sequences/arrays. linked lists are not "better" than arrays. Everything depends on the operations you will perform. In my experience, I've never observed lists perform better than dynamic sequences. Measure, then choose.

but the problem is that to access an element u have to do a linear search which is O(N).

No, to search you have to do a linear search. How often will you search?

But then i saw that u can use a hash table and my teachers (im doing this for a college project) gave an hash table algorithm that uses linked lists, with double hashing, linear probing and external chaining.

This looks simple indeed... :-)

The thing is to some degree hash tables can be a great optimiztion for linked lists since u can access its elements in O(1),

Sometimes.

but the problem is that its size is fixed

Not necessary.

and because of that it will eventualy need to be resized (with that also needing to spend more memory in expanding it and more time).

So it's not fixed.

So should i choose doing it with linked lists or hash tables?

My suggestion? Use a dynamically allocated vector. If you observe any point which is too slow (meaning that it makes the program unusable), profile it. Using a profiler. A specific piece of software which is designed to that aim. Not a bunch of time functions and printfs. And be sure to profile the optimized code (Thanks @ZanLynx for pointing out).

Let me use a famous quote:

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” ― Sir Arthur Conan Doyle, Sherlock Holmes

hash tables vs linked lists

3 Answers