3
votes

I have a test which mocks out the app-wide Repo. Most of the time, the tests are green. When I run the test in a loop, using the same seed, maybe 10% of the runs succeed, but with a GenServer terminating message:

15:39:34.632 [error] GenServer ShortTermMessageStore terminating
** (CaseClauseError) no case clause matching: {:error, %Postgrex.Error{connection_id: 11136, message: nil, postgres: %{code: :foreign_key_violation, constraint: "chat_messages_room_id_ref_fkey", detail: "Key (room_id_ref)=(c78940ab-2514-493e-81fe-64efc63c7bb0) is not present in table \"chat_rooms\".", file: "ri_triggers.c", line: "3324", message: "insert or update on table \"chat_messages\" violates foreign key constraint \"chat_messages_room_id_ref_fkey\"", pg_code: "23503", routine: "ri_ReportViolation", schema: "public", severity: "ERROR", table: "chat_messages", unknown: "ERROR"}}}
    (chat_web) lib/chat_web/repo.ex:78: ChatWeb.Repo.append_message_to_store/3
    (chat_web) lib/short_term_message_store.ex:23: ChatWeb.ShortTermMessageStore.handle_cast/2
    (stdlib) gen_server.erl:601: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:667: :gen_server.handle_msg/5
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Last message: {:"$gen_cast", {:published, "USERID", "c78940ab-2514-493e-81fe-64efc63c7bb0", %{"id" => "123", "room_id" => "c78940ab-2514-493e-81fe-64efc63c7bb0"}}}

I believe I tracked the problem down to a single test. When I comment out that test, I can run the tests for minutes at a time without seeing a single warning.

The code in question is:

test_with_mock "publish_message appends message to room", Repo, [], [append_message_to_store: fn(_, _, _) -> true end] do
  room_id = "c78940ab-2514-493e-81fe-64efc63c7bb0"
  {:ok, _room} = open_room room_id
  sock = open_socket()
        |> subscribe_and_join!(RoomChannel, "room:#{room_id}")

  push sock, "publish_message", %{"id" => "123", "room_id" => room_id}

  assert_broadcast "publish_message", %{"id" => "123", "room_id" => room_id, "sequence" => 1}
  {:ok, mailbox} = Rooms.mailbox(room_id)
  assert [%{"id" => "123", "room_id" => "c78940ab-2514-493e-81fe-64efc63c7bb0", "sequence" => 1}] = mailbox
end

defmodule RoomChannel do
  def handle_in("publish_message", payload, socket) do
    {:ok, message} = Rooms.publish(payload["room_id"], payload)
    broadcast(socket, "publish_message", message)

    ShortTermMessageStore.publish(socket.assigns[:user_id], payload["room_id"], payload)

    {:reply, {:ok, %{payload: payload}}, socket}
  end
end

defmodule ShortTermMessageStore do
  def publish(user_id, room_id, message) do
    GenServer.cast(ShortTermMessageStore.address, {:published, user_id, room_id, message})
  end

  def handle_cast({:published, user_id, room_id, message}, state) do
    Logger.debug fn -> "STMS: #{inspect(message)}" end
    Repo.append_message_to_store(user_id, room_id, message)
    {:noreply, state}
  end
end

The flow of code is: RoomChannel's handle_in({:publish_message}) is called. This calls into business logic, then calls the ShortTermMessageStore. This does a GenServer.cast, and then returns. The test runs picks up where it left off, and the test and mock are torn down. I believe that in certain cases, the test and mock are torn down too soon: the cast hasn't started yet, and the real implementation of Repo is run. This generates the message I pasted above, where we try run a SQL INSERT statement, but it fails because the DB hasn't been setup for foreign keys.

I know that by design, casts aren't supposed to support replies. For the benefit of my tests, how can I help? Do I need to sleep before returning? That seems gross and inelegant.

I've found What is the idiomatic testing strategy for GenServers in Elixir? which has good elements, but in my case, I'm not testing the async object. It's just a side-effect of calling into another function.

I also found Testing asynchronous code in Elixir which makes me think I might have to monitor something.

2
Have you confirmed that the tests always pass if you add a long sleep, say 5 seconds?Dogbert
Sleeping 100ms allows my tests to pass, so yes, I'm pretty certain that this test is the problem, and that the order in which the test is torn down impacts the logs.François Beausoleil
Could you try doing this instead of sleep: :sys.get_state(ShortTermMessageStore.address). That should wait until all messages sent to ShortTermMessageStore.address are processed before returning the state. Let me know if it works when you run the tests in a loop.Dogbert
@Dogbert, updated. The test fails for real instead of generating a simple message.François Beausoleil
"module ShortTermMessageStore is not available" that's really strange.. ShortTermMessageStore is the actual complete name of the module right?Dogbert

2 Answers

7
votes

You can wait until the GenServer has processed all pending messages by calling :sys.get_state/1 on the Pid. :sys.get_state/1 makes a synchronous call to the GenServer to get its state, which completes after all the pending messages have been processed by it.

So, if you add the following code to the end of your test, your test will wait until ShortTermMessageStore.address has processed all messages it has already received.

:sys.get_state(ShortTermMessageStore.address)
0
votes

After thinking about this, I mocked the ShortTermMessageStore as well:

test "publish_message appends message to room" do
  with_mocks([
    {ShortTermMessageStore, [], [publish: fn(_, _, _) -> true end]},
    {Repo, [], [is_authorized_to_join: fn(_, _) -> true end, is_authorized_to_publish_message_to_room: fn(_, _) -> true end, append_message_to_store: fn(_, _, _) -> true end]},
  ]) do
    {:ok, _room} = open_room "c78940ab-2514-493e-81fe-64efc63c7bb0"
    sock = open_socket()
          |> subscribe_and_join!(RoomChannel, "room: c78940ab-2514-493e-81fe-64efc63c7bb0")

    push sock, "publish_message", %{"id" => "123", "room_id" => "c78940ab-2514-493e-81fe-64efc63c7bb0"}

    assert_broadcast "publish_message", %{"id" => "123", "room_id" => "c78940ab-2514-493e-81fe-64efc63c7bb0", "sequence" => 1}
    {:ok, mailbox} = Rooms.mailbox("c78940ab-2514-493e-81fe-64efc63c7bb0")
    assert [%{"id" => "123", "room_id" => "c78940ab-2514-493e-81fe-64efc63c7bb0", "sequence" => 1}] = mailbox
  end
end

Using this mock, I was able to run the tests many minutes in a row, without failures or warning messages.