
I have a requirement to remove a running node from an mnesia cluster. This is a legitimate node that needs to have some maintenance performed, but we want to keep it running and servicing requests. I found this post, which helps remove it from the other nodes. However, once you restart mnesia on the orphan node, it rejoins the other nodes in the cluster.

From each of the non-orphan nodes, I run a script that does the following:

    rpc:call('node_to_be_orphaned', mnesia, stop, []),
    mnesia:del_table_copy(schema, 'node_to_be_orphaned'),

^^ At this point mnesia:system_info(db_nodes) shows that the node has indeed been removed.

    rpc:call('node_to_be_orphaned', mnesia, start, []),

Now it's back. Ugh!

So I then tried to flip it and remove the other nodes from the orphan first, adding the following.

    rpc:call(ThisNode, mnesia, stop, []),
    rpc:call('node_to_be_orphaned', mnesia, del_table_copy, [schema, node()]),
    rpc:call(ThisNode, mnesia, start, []),

This just creates a loop with no difference.

Is there a way to take a node out of mnesia clustering while leaving it up-and-running?

Any and all guidance is greatly appreciated.

Comments:

- Did you try removing the cluster configuration from system.config before restarting? – Arunmu
- Try having the orphan node delete the contents of its schema directory after deleting the table copy. Not sure if that would be effective or not, but I have witnessed this behavior before, and I think that's what I did to solve it. Dirty hack, if it works. – Soup d'Campbells

1 Answer


The schema is what is bothering you. You can add nodes, but removing them while keeping the table copies is, err, difficult. This is what the documentation says happens when a node is connected to a distributed schema, besides receiving a new schema:

Adding a node to the list of nodes where the schema is replicated will affect two things. First it allows other tables to be replicated to this node. Secondly it will cause Mnesia to try to contact the node at start-up of disc-full nodes.

This is what the documentation says about disconnecting a node from a distributed table while still keeping the schema running on the node:

The function call mnesia:del_table_copy(schema, mynode@host) deletes the node 'mynode@host' from the Mnesia system. The call fails if mnesia is running on 'mynode@host'. The other mnesia nodes will never try to connect to that node again. Note, if there is a disc resident schema on the node 'mynode@host', the entire mnesia directory should be deleted. This can be done with mnesia:delete_schema/1. If mnesia is started again on the node 'mynode@host' and the directory has not been cleared, mnesia's behaviour is undefined.
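Putting the documented steps together, the whole detach sequence might look like this sketch, run from one of the remaining cluster nodes ('orphan@host' stands for whatever node you are removing):

    %% Detach Orphan from the cluster while allowing it to keep running
    %% afterwards with a fresh, standalone schema.
    detach_node(Orphan) ->
        %% 1. Stop mnesia on the node being removed (del_table_copy on
        %%    the schema fails while it is running there).
        rpc:call(Orphan, mnesia, stop, []),
        %% 2. Remove it from the shared schema while it is down.
        {atomic, ok} = mnesia:del_table_copy(schema, Orphan),
        %% 3. Erase the orphan's on-disc schema, since restarting with
        %%    the old directory leaves mnesia's behaviour undefined.
        ok = rpc:call(Orphan, mnesia, delete_schema, [[Orphan]]),
        %% 4. Restart mnesia on the orphan; it now starts standalone.
        rpc:call(Orphan, mnesia, start, []).

Step 3 is the part missing from the sequence in the question: without mnesia:delete_schema/1 the orphan's disc copy of the old schema makes it rejoin on restart.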

An existing distributed schema can't be kept on a disconnected node. You have to recreate one, and copy the table info.

If you wish to keep the current schema on your node, you could remove any shared table from it and use purely local tables instead.
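In the Erlang shell, that replacement could look like the following sketch (shared_table and local_table, and the record fields, are hypothetical names):

    %% Drop this node's replica of a shared table...
    {atomic, ok} = mnesia:del_table_copy(shared_table, node()).
    %% ...and create a purely local table that only this node holds.
    {atomic, ok} = mnesia:create_table(local_table,
                       [{attributes, [key, value]},
                        {disc_copies, [node()]}]).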

If you really wish to remove the node from the schema, you could export the data, erase the schema, create a new undistributed one, and import the data back (at least for testing and development).


Here are some useful functions you could use in both cases:

Copying a mnesia table

Mnesia tables can be easily copied, like in this example I just wrote (and tested) for the sheer fun of it:

copy_table(FromTable, ToTable) ->
    %% Create the target table with the source table's attributes and
    %% indexes. record_name keeps the original record tag, so records
    %% can be written over unchanged.
    mnesia:create_table(ToTable, [
                        {attributes, mnesia:table_info(FromTable, attributes)},
                        {index, mnesia:table_info(FromTable, index)},
                        % Add other attributes to be inherited, if present
                        {record_name, FromTable},
                        {access_mode, read_write},
                        {disc_copies, [node()]}
                        ]),

    %% Copy every record inside a single transaction, counting as we go.
    CopyJob = fun(Record, Counter) ->
            mnesia:write(ToTable, Record, write),
            Counter + 1
            end,

    mnesia:transaction(fun() -> mnesia:foldl(CopyJob, 0, FromTable) end).

This function would copy any table (distributed or not) to a merely local one, keeping its attributes and record definitions. Since the record name no longer matches the new table's name, you would have to use mnesia:read/3 and mnesia:write/3 with the table name given explicitly.
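For instance, in the Erlang shell (the tables 'users' and 'users_local' and the key are hypothetical):

    {atomic, _Count} = copy_table(users, users_local).
    %% mnesia:read/1 and mnesia:write/1 derive the table name from the
    %% record tag ('users' here), so reads on the copy must name the
    %% table explicitly:
    {atomic, Recs} = mnesia:transaction(fun() ->
            mnesia:read(users_local, some_key, read)
        end).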


Exporting/importing a mnesia table to/from a file

These other functions export a mnesia table to a file and import it back again. They would need some minor tweaks to import into an arbitrarily named table (you could use mnesia:ets/1 for the sheer experience of it):

export_table(Table) ->
    %% Stage the records in a temporary ets table, then dump it to
    %% "./<Table>.ets" with integrity information.
    Temp = ets:new(ignoreme, [bag, public]),
    Job  = fun(Key) ->
        [Record] = mnesia:dirty_read(Table, Key),
        ets:insert(Temp, Record) end,
    Keys = mnesia:dirty_all_keys(Table),
    [Job(Key) || Key <- Keys],
    Path = lists:concat(["./", atom_to_list(Table), ".ets"]),
    ets:tab2file(Temp, Path, [{extended_info, [md5sum, object_count]}]),
    ets:delete(Temp).

import_table(Table) ->
    %% Load the staged ets file back and write every record into
    %% mnesia inside a single transaction.
    Path = lists:concat(["./", atom_to_list(Table), ".ets"]),
    {ok, Temp} = ets:file2tab(Path, [{verify, true}]),
    {atomic, _Count} = mnesia:transaction(fun() ->
        ets:foldl(fun(Record, I) -> mnesia:write(Record), I + 1 end
             ,0
             ,Temp)
        end),
    ets:delete(Temp).
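A round trip would then look like this sketch ('users' is a hypothetical table; the standalone schema and the table definition have to be recreated between the two calls):

    export_table(users).    % writes ./users.ets
    %% ...stop mnesia, delete the old distributed schema, create a
    %% fresh standalone schema, and recreate the 'users' table...
    import_table(users).    % reads ./users.ets back into mnesia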