We have a process that communicates with an external system via MQ. The external system runs on a mainframe machine (IBM z/OS), while we run our process on a CentOS Linux platform. Until now we have never had any issues.
Recently we started receiving messages from them with non-printable EBCDIC characters embedded in the message. They use these characters as a compressed ID, 8 bytes long. When the message arrives on our queue, it is encoded in UTF-8 (CCSID 1208).
They need the original 8 bytes back in order to identify our response messages. I'm trying to find a solution in Java to convert the ID back from UTF-8 to EBCDIC before sending the response.
I've been playing around with the JTOpen library, using the AS400Text class to do the conversion. Also, the counterparty has sent us a snapshot of the ID in bytes. However, when I compare the bytes after conversion, they are different from the original message.
Has anyone ever encountered this issue? Maybe I'm using the wrong code page?
Thanks for any input you may have.
Bytes from counterparty (positions [5,14]):
00000 F0 40 D9 F0 F3 F0 CB 56--EF 80 04 C9 10 2E C4 D4 |0 R030.....I..DM|
Program output:
UTF String: [R030ôîÕ؜IDMDHP1027W 0510]
EBCDIC String: [R030ôîÃÃÂIDMDHP1027W 0510]
NATIVE CHARSET - HEX: [52303330C3B4C3AEC395C398C29C491006444D44485031303237572030353130]
CP500 CHARSET - HEX: [D9F0F3F066BE66AF663F663F623FC9102EC4D4C4C8D7F1F0F2F7E640F0F5F1F0]
Here is some sample code:
private void readAndPrint(MQMessage mqMessage) throws IOException {
    mqMessage.seek(150);
    byte[] subStringBytes = new byte[32];
    mqMessage.readFully(subStringBytes);
    String msgId = toHexString(mqMessage.messageId).toUpperCase();
    System.out.println("----------------------------------------------------------------");
    System.out.println("MESSAGE_ID: " + msgId);
    String hexString = toHexString(subStringBytes).toUpperCase();
    String subStr = new String(subStringBytes);
    System.out.println("NATIVE CHARSET - HEX: [" + hexString + "] [" + subStr + "]");
    // Transform to EBCDIC
    int codePageNumber = 37;
    String codePage = "CP037";
    AS400Text converter = new AS400Text(subStr.length(), codePageNumber);
    byte[] bytesData = converter.toBytes(subStr);
    String resultedEbcdicText = new String(bytesData, codePage);
    String hexStringEbcdic = toHexString(bytesData).toUpperCase();
    System.out.println("CP500 CHARSET - HEX: [" + hexStringEbcdic + "] [" + resultedEbcdicText + "]");
    System.out.println("----------------------------------------------------------------");
}
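For comparison, here is a minimal sketch of the same round trip using only the JDK, with both charsets named explicitly instead of relying on the platform default. It assumes the sender's code page is CCSID 37 (exposed by Java as "IBM037", which may need the extended charsets module on trimmed-down runtimes) and that the received bytes really are UTF-8. The class and helper names are made up for illustration, and the input is the first 19 bytes of the NATIVE CHARSET hex dump above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EbcdicIdRoundTrip {
    // Assumption: the counterparty encodes with CCSID 37 (Java name "IBM037").
    public static final Charset EBCDIC = Charset.forName("IBM037");

    /** Decodes the bytes as UTF-8 (explicitly), then re-encodes the text as EBCDIC. */
    public static byte[] utf8ToEbcdic(byte[] utf8Bytes) {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        return text.getBytes(EBCDIC);
    }

    /** Parses an even-length hex string into bytes. */
    public static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Formats bytes as upper-case hex without separators. */
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "R030" + the UTF-8 form of the compressed ID + "DM", taken from the dump above.
        byte[] received = fromHex("52303330C3B4C3AEC395C398C29C491006444D");
        System.out.println(toHex(utf8ToEbcdic(received)));
    }
}
```

If the assumptions hold, the printed hex should line up with the counterparty's dump from D9 onwards (D9 F0 F3 F0 CB 56 EF 80 04 C9 10 2E C4 D4). Note that the UTF-8 form is longer than the EBCDIC form (19 bytes versus 14 here), so fixed byte offsets into the converted message will not match the sender's offsets.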
new String(subStringBytes); - this is using your default encoding. Do you know what it is, and do you know that it supports all possible byte combinations that you might get, and do you know if it's reversible? – parsifal
C3E2, which is an invalid UTF-8 sequence: C3 is the start of a two-byte sequence, but E2 is not a valid second byte; it's only valid as the first byte of a 3-byte sequence. – parsifal
new String(subStringBytes) uses your platform default encoding. Maybe that's UTF-8 for you, maybe it isn't. Worse, it might be UTF-8 for you and not UTF-8 on whatever platform you use for deployment. – parsifal
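To make the point about invalid sequences and reversibility concrete, here is a small sketch that decodes strictly, so malformed input is reported rather than silently replaced. It uses only java.nio.charset; the class and method names are hypothetical. The C3 E2 pair is the one mentioned in the comment above, and C3 B4 is a valid pair from the question's hex dump.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StrictUtf8Check {
    /** Returns true only if the bytes form well-formed UTF-8. */
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Decodes leniently with new String(...) and encodes again; true if the bytes survive. */
    public static boolean survivesRoundTrip(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_8);
        return Arrays.equals(bytes, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] valid = {(byte) 0xC3, (byte) 0xB4};   // two-byte sequence for U+00F4
        byte[] invalid = {(byte) 0xC3, (byte) 0xE2}; // lead byte followed by another lead byte

        System.out.println("C3 B4 valid: " + isValidUtf8(valid));
        System.out.println("C3 E2 valid: " + isValidUtf8(invalid));
        // new String(...) substitutes U+FFFD for malformed input, so the original bytes are lost.
        System.out.println("C3 E2 survives round trip: " + survivesRoundTrip(invalid));
    }
}
```

Running a check like this on the raw message bytes before converting is one way to find out whether the data is really UTF-8 or whether it was damaged at an earlier step.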