The following error appeared in the /var/mqm/errors/AMQERR*.LOG file:
11/01/11 16:20:23 - Process(1408.3) User(mqm) Program(amqrmppa_nd)
AMQ6118: An internal WebSphere MQ error has occurred (20806013)
EXPLANATION:
An error has been detected, and the MQ error recording routine has been called.
ACTION:
Use the standard facilities supplied with your system to record the problem
identifier, and to save the generated output files. Contact your IBM support
center. Do not discard these files until the problem has been resolved.
An FDC file was also present in the same directory.
It began with the following header:
+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Tuesday November 01 14:52:07 GMT 2011 |
| Host Name :- xxxxxxxx (HP-UX B.11.23) |
| PIDS :- 5724H7208 |
| LVLS :- 6.0.1.0 |
| Product Long Name :- WebSphere MQ for HP-UX (Itanium platform) |
| Vendor :- IBM |
| Probe Id :- XC028018 |
| Application Name :- MQM |
| Component :- xcsReleaseMutexSem |
| SCCS Info :- lib/cs/unix/generic/amqxlfmx.c, 1.147.1.1 |
| Line Number :- 3229 |
| Build Date :- Oct 21 2005 |
| CMVC level :- p600-100-051021 |
| Build Type :- IKAP - (Production) |
| UserID :- 00002440 (mqm) |
| Program Name :- amqrmppa_nd |
| Addressing mode :- 64-bit |
| Process :- 25271 |
| Thread :- 3 |
| QueueManager :- MQINTUX |
| ConnId(1) IPCC :- 26 |
| Major Errorcode :- xecL_E_NOT_OWNER |
| Minor Errorcode :- OK |
| Probe Type :- INCORROUT |
| Probe Severity :- 2 |
| Probe Description :- AMQ6125: An internal WebSphere MQ error has occurred. |
| FDCSequenceNumber :- 0 |
| |
+-----------------------------------------------------------------------------+
Check the “Component” and “Major Errorcode” fields of the header.
In my case the “xcsReleaseMutexSem” component was reporting “xecL_E_NOT_OWNER”, which suggests that the process tried to release a mutex semaphore it did not own.
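When there are several FDC files, it can be quicker to pull those two fields out with a small helper than to open each file. The following is only a sketch: the function name fdc_field is my own invention, and it assumes the header lines look like the ones above ("| Name :- value |").

```shell
# Hypothetical helper (not part of MQ): print the value of one FDC header field.
# Usage: fdc_field 'Major Errorcode' /var/mqm/errors/*.FDC
# With no file arguments it reads the header from stdin.
fdc_field() {
    field="$1"
    shift
    awk -v field="$field" '
        # Match header lines of the form "| <field> ... :- <value> |"
        index($0, "| " field) == 1 && index($0, ":-") > 0 {
            value = substr($0, index($0, ":-") + 2)
            sub(/[ \t]*\|[ \t]*$/, "", value)   # drop the closing box border
            sub(/^[ \t]+/, "", value)           # trim leading whitespace
            sub(/[ \t]+$/, "", value)           # trim trailing whitespace
            print value
        }' "$@"
}
```

Running it once with “Component” and once with “Major Errorcode” gives the two values to compare across files.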
This prompted me to check the semaphores on the HP-UX host with the “ipcs -sa” command.
The output showed some leftover semaphores owned by the “mqm” user (the same user shown in the “UserID” field of the FDC header).
Removing them as root with “ipcrm -s”, followed by each ID taken from the “ipcs -sa” output, fixed the issue without the need for a reboot.
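The clean-up can be sketched as a dry run that only prints the ipcrm commands, so they can be reviewed before anything is removed. Both function names are hypothetical, and the column positions are an assumption: semaphore rows start with "s", with the ID in column 2 and the owner in column 5, as in the classic System V layout. Check this against the real “ipcs -sa” output on your host first.

```shell
# Hypothetical helpers; the column layout is an assumption (T ID KEY MODE OWNER ...).

# Read "ipcs -sa" output on stdin; print the IDs of semaphores owned by user $1.
sem_ids_for_owner() {
    awk -v owner="$1" '$1 == "s" && $5 == owner { print $2 }'
}

# Read semaphore IDs on stdin; print (do not run) one ipcrm command per ID.
print_ipcrm_commands() {
    while read -r id; do
        echo "ipcrm -s $id"
    done
}

# Dry run, as root:
#   ipcs -sa | sem_ids_for_owner mqm | print_ipcrm_commands
```

Printing instead of executing is deliberate: removing a semaphore that a running process still uses can cause further failures, so it is worth reading the list before running any of the commands.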
The root cause was that MQ had been started as the wrong UNIX user: in our environment it was not supposed to run as the mqm user.