21
votes

I have a very simple scenario involving a database and a JMS broker in an application server (Glassfish):

1. An EJB inserts a row in the database and sends a message.
2. When the message is delivered to an MDB, the row is read and updated.

The problem is that sometimes the message is delivered before the insert has been committed in the database. This is actually understandable if we consider the two-phase commit (2PC) protocol:

1. prepare JMS
2. prepare database
3. commit JMS
4. (tiny gap where the message can be delivered before the insert has been committed)
5. commit database
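
The ordering above can be illustrated with a small, single-threaded simulation. This is only a sketch under stated assumptions: `ToyDatabase` and `ToyBroker` are hypothetical stand-ins (not JTA, JMS, or Glassfish APIs), and the broker is assumed to deliver the message at the very instant its branch commits, which is the worst case for this race.

```java
public class TwoPhaseCommitRace {

    /** Toy database (hypothetical): the inserted row is visible only after commit. */
    static final class ToyDatabase {
        private boolean committed;

        void prepare() { /* vote "yes"; nothing becomes visible yet */ }

        void commit() { committed = true; }

        boolean rowVisible() { return committed; }
    }

    /** Toy broker (hypothetical): delivers the message as soon as its branch commits. */
    static final class ToyBroker {
        private final Runnable onDelivery;

        ToyBroker(Runnable onDelivery) { this.onDelivery = onDelivery; }

        void prepare() { /* vote "yes"; the message is not deliverable yet */ }

        void commit() { onDelivery.run(); }
    }

    /**
     * Runs one simulated 2PC round and reports whether the consumer
     * found the row at the moment the message was delivered.
     */
    static boolean consumerSeesRow(boolean commitJmsFirst) {
        ToyDatabase db = new ToyDatabase();
        boolean[] seen = new boolean[1];
        // The "MDB": on delivery, it checks whether the row is visible.
        ToyBroker broker = new ToyBroker(() -> seen[0] = db.rowVisible());

        // Phase 1: prepare both participants.
        broker.prepare();
        db.prepare();

        // Phase 2: commit in the chosen order.
        if (commitJmsFirst) {
            broker.commit(); // message delivered here, row not yet committed
            db.commit();
        } else {
            db.commit();
            broker.commit(); // message delivered after the row is committed
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("JMS committed first, consumer sees row: " + consumerSeesRow(true));
        System.out.println("DB committed first, consumer sees row: " + consumerSeesRow(false));
    }
}
```

In this model the consumer misses the row only when the JMS branch commits first. A real broker delivers asynchronously, which would explain why the failure shows up only some of the time.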

I've discussed this problem with others, but the answer was always: "Strange, it should work out of the box".

My questions are then:

  • How could it work out-of-the box?
  • My scenario sounds fairly simple, why aren't there more people with similar troubles?
  • Am I doing something wrong? Is there a way to solve this issue correctly?

Here are a bit more details about my understanding of the problem:

This timing issue exists only if the participants are processed in this order. If the 2PC processes the participants in the reverse order (database first, then message broker), it should be fine. The problem occurred intermittently but was completely reproducible.

I found no way to control the order of the participants in a distributed transaction, neither in the JTA, JCA and JPA specifications nor in the Glassfish documentation. We could assume they are enlisted in the distributed transaction in the order in which they are used, but with an ORM such as JPA, it's difficult to know when the data is flushed and when the database connection is really used. Any ideas?

Questions: is the MDB running on the same server? If yes, is the MDB also using JPA to update the record? If yes, are you using a second-level cache (I read in the other post that you are using Hibernate)? And finally, if so, which cache implementation are you using? - Elister
@Elister Everything runs in the same server. We used JPA everywhere. The second-level cache was disabled altogether. (The workaround we found was to use a native query, select * for update, to read the row in the MDB. It then waits until the first transaction is committed.) - ewernli
Could you please show some pseudo code? I'd like to know if you update the db in one EJB method and send the JMS message in another one (and maybe wrap the whole thing in a third method), if you use different EJBs, etc. - Pascal Thivent
@Pascal I had created a reproducing test case. Here are the source code and the instructions: forums.java.net/jive/message.jspa?messageID=353154#391321 - ewernli
WebSphere 7 has added this support. Look at the "Commit priority for transactional resources" section: publib.boulder.ibm.com/infocenter/wasinfo/fep/index.jsp?topic=/… - Aravind Yarram

1 Answer

12
votes

You are experiencing the classic XA 2PC race condition. It does happen in production environments.

Three options come to mind:

  1. Last-agent optimization, where JDBC is the non-XA resource. (Loses recovery semantics.)
  2. Use a JMS time-to-deliver, i.e. a delivery delay. (Deliberately loses real-time delivery.)
  3. Build retries into the JDBC code. (Least effect on functionality.)
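
As a rough illustration of the third option, here is a minimal retry sketch. It assumes the database read is wrapped in a `Supplier` that returns an empty `Optional` while the row is not yet visible. The class name `RowLookupRetry`, the method `retryUntilPresent`, and the simulated lookup in `main` are all hypothetical; a real MDB would plug its JDBC or JPA query in instead.

```java
import java.util.Optional;
import java.util.function.Supplier;

public class RowLookupRetry {

    /**
     * Calls the lookup up to maxAttempts times, sleeping delayMillis between
     * attempts, and returns the first non-empty result (or empty if none).
     */
    static <T> Optional<T> retryUntilPresent(Supplier<Optional<T>> lookup,
                                             int maxAttempts,
                                             long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = lookup.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Simulated lookup (assumption): the row becomes visible on the third read,
        // as if the database branch committed shortly after message delivery.
        int[] calls = {0};
        Supplier<Optional<String>> lookup = () -> {
            calls[0]++;
            return calls[0] < 3 ? Optional.<String>empty() : Optional.of("row-42");
        };

        Optional<String> row = retryUntilPresent(lookup, 5, 10L);
        System.out.println(row.orElse("not found") + " after " + calls[0] + " attempts");
    }
}
```

The gap between the two commits is normally short, so a small bounded number of attempts should be enough. If the row is still missing after the last attempt, one possible fallback is to roll back the MDB transaction and let the broker redeliver the message.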

WebLogic has an LLR (Logging Last Resource) optimization that avoids this problem and still gives you all XA guarantees.