This sounds reasonable, but in our usage, other processes that were unrelated to the ActiveMQ integration messages were being blocked and rolled back. At a high level, the sending system would batch-send messages to integration points A and B, which are of different types. Type A might be a file system write, while B is an ActiveMQ message send.
If the message ordering was as mentioned above, all would have been fine even with every ActiveMQ failover node down. The file for integration A would send, integration B would wait forever, and nothing comes after it, so no harm done. The problem is that the batch-send process for the mixed message types runs again while the thread sending integration B over ActiveMQ is still waiting. The backend of the messaging system is a simple database table that stores the messages to send and clears them afterward. So a second process thread attempts to send the integration B message again, but rolls back because the row is still locked by the initial attempt, and we begin to see a symptom of the original problem: the blocking.
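The interaction can be sketched with a small simulation. This is a toy model, not our actual code: the database row lock on message B is stood in for by a `threading.Lock`, and the never-connecting failover transport by an event that is never set.

```python
import threading

# A minimal simulation (all names hypothetical) of the behavior described
# above: the first batch-send thread locks the row for message B, then
# blocks forever on the ActiveMQ send because every failover node is down.
# A second batch run tries the same row, cannot take the lock, and rolls
# back instead of sending.

row_lock = threading.Lock()            # stands in for the DB row lock on message B
lock_held = threading.Event()          # lets the demo sequence the two threads
broker_available = threading.Event()   # never set: all failover nodes are down
results = []

def first_batch_send():
    with row_lock:                     # lock the message row
        lock_held.set()
        broker_available.wait()        # failover transport blocks forever

def second_batch_send():
    lock_held.wait()
    # second batch run: try the same row, give up if it is still locked
    if row_lock.acquire(timeout=0.2):
        row_lock.release()
        results.append("sent")
    else:
        results.append("rolled back")

threading.Thread(target=first_batch_send, daemon=True).start()
t2 = threading.Thread(target=second_batch_send)
t2.start()
t2.join()
print(results)  # -> ['rolled back']
```

Every subsequent batch run hits the same locked row, so the rollbacks repeat for as long as the first thread stays blocked.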
The original failover string in the ActiveMQ connection factory looked like the following:
failover:(tcp://10.1.1.1:61616,tcp://10.1.1.2:61616)?randomize=false
Notice there is no timeout, resulting in the blocking forever. In our case, we don't want the connection factory to wait forever; we want it to time out, so that the next batch message sending process can attempt the send. Referencing the following ActiveMQ documentation:
Obviously, you might think, "Why would all the nodes in the ActiveMQ failover be down?" The answer is: they were down.
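A sketch of the corrected connection string, using the failover transport's `timeout` option (milliseconds); the 3000 value here is illustrative, not the value we settled on:

```
failover:(tcp://10.1.1.1:61616,tcp://10.1.1.2:61616)?randomize=false&timeout=3000
```

With this option set, a send fails with an exception after the timeout elapses instead of blocking forever, the transaction rolls back and releases its locks, and the next batch run is free to retry.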