
I am writing an ejabberd module where the user controls when a message is delivered to the recipient instead of it being delivered immediately (for example, sending birthday wishes in advance). This is done by adding a custom XML element to the message stanza, like the following:

<message xmlns="jabber:client" from="test2@ubuntu" to="test1@ubuntu/32375806281445450055240436" type="chat">
  <schedule xmlns="ank" year="2015" month="10" day="19" hour="22" minute="36" second="13"/>
  <body>hi</body>
</message>

Now these scheduled messages have to be stored in the Mnesia database and sent to the recipient when the time arrives.

Approach 1: One approach is to create a table for every user. When a message is received, store it in the user's table and set a timer to process the message and delete it when done, as in the following sample code:

timer:apply_after(SecondsDelay, ?MODULE, post_message_delete, [TableName, RecordUniqueKeyHash, From, To, Packet]). 

The post_message_delete function, called after the timer expires, sends the message using the route function as shown below, then deletes the record from the Mnesia database.

    ejabberd_router:route(From, To, Packet)
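Put together, this might look like the following minimal sketch. The scheduled_msg record, its table layout, and the argument list are assumptions of this example, not code from the question:

```erlang
%% Assumed record and table layout for this sketch.
-record(scheduled_msg, {key, from, to, packet, pid = undefined}).

schedule_message(SecondsDelay, Key, From, To, Packet) ->
    %% Persist first, so the message survives a crash before the timer fires.
    ok = mnesia:dirty_write(#scheduled_msg{key = Key, from = From,
                                           to = To, packet = Packet}),
    timer:apply_after(timer:seconds(SecondsDelay), ?MODULE,
                      post_message_delete, [Key, From, To, Packet]).

post_message_delete(Key, From, To, Packet) ->
    %% Deliver the stanza, then remove the stored copy.
    ejabberd_router:route(From, To, Packet),
    mnesia:dirty_delete(scheduled_msg, Key).
```

Note that timer:apply_after/4 takes milliseconds, hence the timer:seconds/1 conversion.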

Creating a table for every user is not feasible because of Mnesia's limit on the number of tables.

Approach 2: Another approach is to store all users' messages in a single table, set a timer (as above) for every message as it arrives, and delete each message once it has been processed.

The whole idea of using the Mnesia database is to process the messages reliably in case the ejabberd server crashes.

To achieve this, every message record has a pid field holding the pid of the process handling that message. It is initially undefined (when the message arrives at the filter_packet hook), but once the message-processing function is spawned, it updates the pid in the Mnesia record.

So if the server crashes, then on reboot the module's start function iterates over all the messages and checks whether each stored pid is alive (is_process_alive). If it is not, the processing function is spawned on the message; it updates the record with the new pid, processes the message, and deletes it once done.
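The recovery pass described above could be sketched like this, again assuming a scheduled_msg record with key and pid fields; process_message/1 is a hypothetical worker function:

```erlang
recover_pending() ->
    lists:foreach(
      fun(Key) ->
              [Msg] = mnesia:dirty_read(scheduled_msg, Key),
              Pid = Msg#scheduled_msg.pid,
              case is_pid(Pid) andalso is_process_alive(Pid) of
                  true ->
                      ok;  %% a live process already owns this message
                  false ->
                      %% Re-spawn the worker and record the new owner pid.
                      New = spawn(?MODULE, process_message, [Msg]),
                      mnesia:dirty_write(Msg#scheduled_msg{pid = New})
              end
      end,
      mnesia:dirty_all_keys(scheduled_msg)).
```

One caveat: erlang:is_process_alive/1 only accepts pids local to the current node, which holds here as long as all workers are spawned on the node doing the recovery.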

Drawbacks: The drawback of this method is that even if a message is to be delivered far in the future (next month or next year), a process is still running for it, so there are as many running processes as there are messages.

Approach 3:

To overcome the drawbacks of Approach 2, scan the database every hour, accumulate only the messages that have to be delivered in the next hour, and process those.
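One way to sketch that hourly pass, assuming each record carries a due_at delivery timestamp in Unix seconds and a deliver/1 worker (both are assumptions of this example):

```erlang
scan_next_hour() ->
    Now = erlang:system_time(second),
    Until = Now + 3600,
    lists:foreach(
      fun(Key) ->
              [Msg] = mnesia:dirty_read(scheduled_msg, Key),
              DueAt = Msg#scheduled_msg.due_at,
              case DueAt >= Now andalso DueAt =< Until of
                  true ->
                      %% Arm a short timer only for messages due this hour.
                      timer:apply_after(timer:seconds(DueAt - Now),
                                        ?MODULE, deliver, [Key]);
                  false ->
                      ok
              end
      end,
      mnesia:dirty_all_keys(scheduled_msg)),
    %% Re-arm the scanner itself for the next hour.
    erlang:send_after(timer:hours(1), self(), scan).
```

Replacing the full read of every record with mnesia:dirty_select/2 and a match specification on due_at (ideally backed by a secondary index) would reduce the cost of this hourly scan.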

The drawback of this approach is that scanning the database every hour might impact performance.

Approach 4:

To overcome the performance problem of Approach 3, we can create a table for every year_month and spawn the message-processing function only on the current month's table.

What other approach is best suited for this use case using the Mnesia database?

I believe this question would be considered opinion-based on Stack Overflow, but it will fit on programmers.stackexchange.com, as their help center states that "software architecture and design" are on topic there. The question itself is not actually Erlang-specific. – Lol4t0
@Lol4t0 When referring to other sites, it is often helpful to point out that cross-posting is frowned upon. – gnat
@gnat Hey! But this still does not mean questions should not be moved to the right places! If you get a duplicate, the post in the wrong place should be removed. – Lol4t0
Erlang processes are lightweight; depending on your machine's resources, having lots of processes in an idle state doesn't really hurt your system. – Hamidreza Soleimani
Also consider using projects like github.com/erlware/erlcron, which can bring new solutions and ideas to what you are trying to design. – Hamidreza Soleimani

1 Answer


Even though this is an old question, it may one day become an issue for somebody else.

I think Mnesia is the wrong choice for this kind of data-store use case. Redis, from version 2.8.0, has a keyspace event notification feature that fires when certain operations are performed, including key expiration set by EXPIRE, EXPIREAT, and their variants. This information can reach your code through the PUBSUB feature. See Redis Keyspace Notifications for how to get started.

Generate a unique key (K), probably a UUID, for every birthday message. Store the message to send, the entire XML, under the generated key K.

Store this message key as a value under a key called K:timer using the SET command, with the TTL set to the difference in seconds between now and the birthday timestamp, or use EXPIREAT to set the expiration to the Unix timestamp of the birthday itself. When the TTL expires, pubsub clients are notified of the event along with the expired key, K:timer. Extract K from it and fetch the message. Send the message and delete it afterwards.
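Using the eredis client (an assumed dependency; the function name and key layout are illustrative), the write side might look like:

```erlang
store_scheduled(Client, K, Xml, BirthdayUnixTs) ->
    %% K is the generated UUID string; Xml is the serialized stanza.
    {ok, <<"OK">>} = eredis:q(Client, ["SET", K, Xml]),
    {ok, <<"OK">>} = eredis:q(Client, ["SET", K ++ ":timer", K]),
    %% Expire the timer key at the birthday itself.
    {ok, <<"1">>} = eredis:q(Client, ["EXPIREAT", K ++ ":timer",
                                      integer_to_list(BirthdayUnixTs)]).

%% The read side subscribes to expiration events (Redis must have
%% notify-keyspace-events set to include "Ex"). On a message for channel
%% __keyevent@0__:expired, strip the ":timer" suffix from the payload to
%% recover K, GET the XML stored under K, route it, then DEL K.
```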

ISSUES TO CONSIDER:

1: Multiple pubsub clients may be notified of the same expiration event, which may cause the same message to be sent more than once. Implement some sort of locking to prevent this.
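A common way to implement that locking with Redis itself is SET with the NX flag, so that only one client wins the right to send. A sketch with eredis; the lock token and timeout are illustrative:

```erlang
try_claim(Client, K) ->
    Token = atom_to_list(node()),
    %% NX: set only if the lock key does not already exist;
    %% PX 60000: auto-release the lock after 60 seconds.
    case eredis:q(Client, ["SET", K ++ ":lock", Token,
                           "NX", "PX", "60000"]) of
        {ok, <<"OK">>}  -> claimed;  %% we hold the lock; send the message
        {ok, undefined} -> lost      %% another client already claimed it
    end.
```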

2: Redis PUBSUB is a fire-and-forget message-passing construct, so if a client goes down and comes up again, it may have missed event notifications during that window. One way to improve reliability is to store the key K under several key variants, K:timer, K:timer:1, K:timer:2, K:timer:3, ..., with increasing TTL offsets (1, 2, 3 minutes apart), to cover the worst-case window during which the unavailable client may become available again.

3: Redis is in-memory. Storing lots of large messages will cost you RAM. One way to solve this is to store only the message key K in Redis and store the message (the XML) under the same key K in any disk-based key-value store such as Riak or Cassandra.