0
votes

So i want to do a football (soccer for americans) games simulator and i want to store a series of games in a data structure, the problem is this i tried to do it with arrays,but since it has fixed sise, i had to resize it and it can be a pain to resize, plus its not that efficient.

Then i saw that there were linked lists and u can add or remove an element very easily making it better than the arrays but the problem is that to access an element u have to do a linear search which is O(N).

But then i saw that u can use a hash table and my teachers (im doing this for a college project) gave an hash table algorithm that uses linked lists, with double hashing, linear probing and external chaining.

The thing is to some degree hash tables can be a great optimiztion for linked lists since u can access its elements in O(1), but the problem is that its size is fixed and because of that it will eventualy need to be resized (with that also needing to spend more memory in expanding it and more time).

So should i choose doing it with linked lists or hash tables?

Thanks for the help.

3
Isn't the number of games in a season known? - Peter - Reinstate Monica
And in the end I would simply do what seems easiest. Modern computers are really fast. They also have lots of memory, so as I said elsewhere: Simply use a really huge array and write in your manual that the maximum number of games you can accomodate is 100,000 or whatever. - Peter - Reinstate Monica
Is this C or C++? In C++ you have std::vector and other dynamic data structures. You could also use a database library so you have persistent storage of your games. - Dave S
No this is c, i dont even know how to program in c++ - Martim Correia
And in this case u dont know the number of games, u just add until u dont want to add more, theres no limit thats the problem - Martim Correia

3 Answers

2
votes

[...] i want to store a series of games in a data structure, the problem is this i tried to do it with arrays,but since it has fixed sise,

In C you have pointers, so sequences of objects can be allocated dynamically.

i had to resize it and it can be a pain to resize,

Why is it a pain? You can do it very easily like this:

#include <stdlib.h>
struct game {
    int example_field;
};

int main(void) 
{
    struct game *pgames = NULL;
    size_t ngames = 0;

    for (size_t i = 0; i < 10; ++i) {
        struct game g = { i };

        // Add game
        pgames = realloc(pgames, (ngames + 1)* sizeof(struct game));
        pgames[ngames++] = g;
    }

    free(pgames);
    return EXIT_SUCCESS;
}

plus its not that efficient.

This is your main mistake. Don't talk about efficiency until you measure it. If you have a way of measuring your performance, it's great. If you don't, don't assume. Computational complexity is different from efficiency. And your model for computational complexity is disregarding how things are implemented in hardware: cache, branch prediction, vector instructions.

My code before was as simple as possible, but required one reallocation per element. You can do better like this:

#include <stdlib.h>
struct game {
    int example_field;
};

int main(void) 
{
    struct game *pgames = NULL;
    size_t ngames = 0, cgames = 0; // cgames is the storage capacity

    for (size_t i = 0; i < 10; ++i) {
        struct game g = { i };

        // Add game
        if (ngames == cgames) {
            cgames = cgames * 2 + 1;
            pgames = realloc(pgames, cgames * sizeof(struct game));
        }
        pgames[ngames++] = g;
    }

    free(pgames);
    return EXIT_SUCCESS;
}

Then i saw that there were linked lists and u can add or remove an element very easily making it better than the arrays

"easily" is just calling a function. Not different than using dynamic sequences/arrays. linked lists are not "better" than arrays. Everything depends on the operations you will perform. In my experience, I've never observed lists perform better than dynamic sequences. Measure, then choose.

but the problem is that to access an element u have to do a linear search which is O(N).

No, to search you have to do a linear search. How often will you search?

But then i saw that u can use a hash table and my teachers (im doing this for a college project) gave an hash table algorithm that uses linked lists, with double hashing, linear probing and external chaining.

This looks simple indeed... :-)

The thing is to some degree hash tables can be a great optimiztion for linked lists since u can access its elements in O(1),

Sometimes.

but the problem is that its size is fixed

Not necessary.

and because of that it will eventualy need to be resized (with that also needing to spend more memory in expanding it and more time).

So it's not fixed.

So should i choose doing it with linked lists or hash tables?

My suggestion? Use a dynamically allocated vector. If you observe any point which is too slow (meaning that it makes the program unusable), profile it. Using a profiler. A specific piece of software which is designed to that aim. Not a bunch of time functions and printfs. And be sure to profile the optimized code (Thanks @ZanLynx for pointing out).

Let me use a famous quote:

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” ― Sir Arthur Conan Doyle, Sherlock Holmes

0
votes

enter image description here

Each data structure has it’s own different way, or different algorithm for sorting, inserting, finding, …etc. This is due to the nature of the data structure. There are algorithms used with specific data structure, where some other can’t be used.

The more efficient & suitable the algorithm, the more you will have an optimized data structure.

Chances are, you’re going to rely on the built-in algorithms used with the data structures in your language. These algorithms are very well optimized and battle tested.

Linked List

Pros::

Inserting and deleting: O(1)

Sequential Access: O(N)

Inserting and deleting operations refers to the operation itself, as you might need to sequentially access all the nodes until the node you’are looking for.

Inserting and deleting is much easier with doubly linked list.

Cons

No Direct Access; Only Sequential Access

Searching: O(N)

Sorting: O(NLogN)

Hash Tables:

Pros

Inserting and deleting: O(1) + Hashing & Indexing (amortized).

Direct access: O(1) + Hashing & Indexing.

It takes a little processing for the hashing and indexing. But the good thing about that is it’s the same amount of processing every time, even if the hash table gets very large. When the hash table gets full, it will increase it’s size. And, when the number of filled buckets is much smaller than the size of the hash table, it will then decrease it’s size.

Both operations take a complexity of O(N). That’s why insertion and deletion takes O(1) amortized.

Cons:

Some overhead as require a little more space in memory than arrays.

Retrieval of elements doesn’t guarantee a specific order.

Searching for a value (without knowing it’s key).

You can read it thoroughly and still if get any doubt feel free to ask.

0
votes

Hash table are not automagically O(1) in complexity, if hash collision occurs hash table are usually backed by array (O(log(n))) or linked list(O(n)) (although in those case n is reduced to the number of element having a collision).

Strategies would have to be chosen based on what operation would be more prevalent:

  • You can still work with an array and over-allocate its size, only growing and shrinking when reaching certain threshold, if your array is sorted you can perform binary search which would have a O(log(n)) complexity.
  • You can use a (balanced ?) tree which would be as flexible as a linked list and wold offer a good balance of performance in deletion/insertion/read (O(log(n)))

But if your teacher provided you a hash algorithm, you would be wiser to your teacher's provided materials.