Pending Transaction
During regular operation of the HSMs, with all nodes running normally, Pending Transaction situations should be rare. These are replication transactions that started the distributed transaction protocol (Two-Phase Commit) but, for some reason, could not be completed. A pending transaction involves at least two pool nodes, one of which is the coordinator and the other a participant. In most cases, the HSM's own replication mechanism can recover on its own and resolve the issue that was preventing the transaction from completing. In other cases, such as a hardware defect or a communication problem, operator intervention may be required.
Dinamo - Local Management Console
┌──────────────┤  ├──────────────┐
│                                │
│ No pending transaction found.  │
│                                │
│             ┌────┐             │
│             │ OK │             │
│             └────┘             │
│                                │
│                                │
└────────────────────────────────┘
Service running... Replication Domain: <list>
A pending transaction leaves the entire pool locked for new write operations; read operations continue as normal.
When there is a pending transaction, the screen displays the following information about it:
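As a rough illustration of how a transaction ends up pending and why new writes are then refused, the following minimal Python sketch simulates a two-phase commit in which one participant never acknowledges the commit. The names used here (`Participant`, `Pool`, `replicate`, `acks_commit`) are hypothetical and exist only for this example; they are not part of the HSM firmware or of any Dinamo API, and the real protocol is certainly more involved.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Participant:
    name: str
    # Hypothetical switch used to simulate a lost commit acknowledgement.
    acks_commit: bool = True

    def prepare(self) -> bool:
        # Phase 1: vote on whether the change can be applied.
        return True

    def commit(self) -> bool:
        # Phase 2: report whether the acknowledgement reached the coordinator.
        return self.acks_commit


@dataclass
class Pool:
    participants: List[Participant]
    pending_ack: List[str] = field(default_factory=list)

    @property
    def write_locked(self) -> bool:
        # While any acknowledgement is missing, new writes are refused.
        return bool(self.pending_ack)

    def read(self) -> str:
        # Reads do not depend on the pending state.
        return "ok"

    def replicate(self) -> str:
        if self.write_locked:
            return "rejected"
        if not all(p.prepare() for p in self.participants):
            return "aborted"
        # Record every participant that did not acknowledge phase 2.
        self.pending_ack = [p.name for p in self.participants if not p.commit()]
        return "pending" if self.pending_ack else "committed"


pool = Pool([Participant("node-a"), Participant("node-b", acks_commit=False)])
print(pool.replicate())  # the write stalls in phase 2
print(pool.replicate())  # a second write is refused while the first is pending
print(pool.read())       # reads are still served
```

The sketch keeps the lock at pool level on purpose, to mirror the behaviour described above: one unfinished transaction blocks all new writes, not only those touching the same object.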
- Now: current date and time (when the screen was opened), for information purposes only;
- Date/Time: date and time (timestamp) when the transaction was generated;
- GUID: unique transaction identifier; all nodes involved in a replicated transaction recognize the transaction by the same identifier;
- State: the state of the Two-Phase Commit protocol at which the transaction stopped;
- Type: type of transaction, such as key creation or removal, or user creation or removal;
- Source: user responsible for the transaction, the one who made the initial service request;
- Target: user affected by the transaction, can be the same as indicated in Source, or another if the transaction involves operations on a remote user partition (via permissioning);
- Nodes: the list of nodes known to the HSM at the time of the transaction; the HSM will attempt to complete the replication with this list of nodes. The first node in the list is always the coordinator. The node marked PA (Pending Ack) is the one that has not completed the replication protocol communication, and is usually a good starting point for investigating the cause of the problem.
Dinamo - Local Management Console
┌─────────────────────────┤  ├─────────────────────────┐
│                                                      │
│ Now       : 2023-12-17 16:17:00                      │
│ Date/Time : 2023-12-17 16:16:32                      │
│                                                      │
│ GUID      : 0102030405060708                         │
│ State     : Phase 2 - Coordinator                    │
│ Type      : Lock/Probe/Test                          │
│ Source    : ET_NULL_USR                              │
│ Target    : ET_NULL_OBJ                              │
│ Nodes     : 172.17.0.3                               │
│             172.17.0.3 - PA                          │
│                                                      │
│                        ┌────┐                        │
│                        │ OK │                        │
│                        └────┘                        │
│                                                      │
└──────────────────────────────────────────────────────┘
Service running... Replication Domain: <list>
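When transcribing the Nodes field by hand for a support ticket, it can help to separate the coordinator from the nodes still marked PA. The Python sketch below does that for a list of text lines. It assumes the layout seen in the sample screen (one address per line, with a ` - PA` suffix on nodes awaiting acknowledgement); `NodeEntry`, `parse_nodes` and `summarize` are hypothetical helper names, and the addresses in the usage lines are made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class NodeEntry:
    address: str
    pending_ack: bool


def parse_nodes(lines: List[str]) -> List[NodeEntry]:
    """Turn lines such as '10.0.0.2 - PA' into NodeEntry records."""
    entries = []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        # Assumed layout: '<address>' or '<address> - PA'.
        address, _, flag = text.partition(" - ")
        entries.append(NodeEntry(address.strip(), flag.strip() == "PA"))
    return entries


def summarize(entries: List[NodeEntry]) -> Tuple[Optional[str], List[str]]:
    """Return (coordinator, nodes still awaiting acknowledgement)."""
    # The first node in the list is the coordinator.
    coordinator = entries[0].address if entries else None
    waiting = [e.address for e in entries if e.pending_ack]
    return coordinator, waiting


coordinator, waiting = summarize(parse_nodes(["10.0.0.1", "10.0.0.2 - PA"]))
print("coordinator:", coordinator)
print("investigate first:", waiting)
```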
The replication mechanism has an automatic recovery service, which periodically tries measures such as retransmission to clear the blockage and complete the transaction. The interval at which the automatic recovery service runs is defined in the Policy option of the replication menu.
A Pending Transaction is most often caused by something simple, such as a node whose service is stopped or a communication problem with the rest of the pool (network cable, switch port, link down, etc.); occasionally it requires further investigation. All the information shown on the screen is relevant to identifying the cause and solving the problem.
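The general shape of such a periodic recovery can be sketched in Python as a retry loop that retransmits at a fixed interval. This is only an illustration of the idea, not the HSM's actual algorithm: `run_recovery`, `retransmit`, `interval_seconds` and `max_attempts` are hypothetical names, and the real interval is whatever is configured in the Policy option.

```python
import time
from typing import Callable


def run_recovery(
    retransmit: Callable[[], bool],
    interval_seconds: float,
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry until retransmit() succeeds.

    Returns the attempt number that succeeded, or 0 if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if retransmit():
            return attempt
        # Wait for the configured interval before the next attempt.
        if attempt < max_attempts:
            sleep(interval_seconds)
    return 0


# Usage: the third retransmission gets through; sleeping is stubbed out
# so the example finishes immediately.
outcomes = iter([False, False, True])
waits = []
result = run_recovery(lambda: next(outcomes), 30.0, 5, sleep=waits.append)
print(result, waits)
```

If the loop gives up (returns 0 in this sketch), the situation corresponds to the cases where operator intervention is needed.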
Some Pending Transaction information can also be viewed in the HSM Remote Console.