1
votes

I'm receiving binary data from an IoT device. I'm trying to convert an identifier of up to 13 bytes to the shortest readable string possible.

To do that, I've been Base64-decoding the bytes and then converting the result to hex, so that

byte[] bytes = {0x24, 0x54, 0x4b, 0x00, 0x31, 0x00, 0x0e, 0x50, 0x33, 0x42, 0x58, 0x35};

becomes

4CAD4FDC15F9

[Image: Base64 decoding of normal ASCII bytes]

However, when I receive bytes in the extended ASCII range (in the debugger they appear as negative values), the Base64 decoding returns an empty byte array.

[Image: Base64 decoding of extended ASCII bytes]

I have been using org.apache.tomcat.util.codec.binary.Base64, whose documentation mentions that it does not take extended ASCII characters into account:

Since this class operates directly on byte streams, and not character streams, it is hard-coded to only encode/decode character encodings which are compatible with the lower 127 ASCII chart (ISO-8859-1, Windows-1252, UTF-8, etc).


I also tried java.util.Base64. It works with the first byte array but throws an exception with the second:

public static String getBase64HexDeviceIdFromSerialBytes(byte[] serial) {
    byte[] base64Bytes = java.util.Base64.getMimeDecoder().decode(ArrayUtils.subarray(serial, 0, 12));
    String hex = BytesUtils.bytesToHex(base64Bytes);
    return hex;
}

19:39:31.946 [main] ERROR com.trackener.backend.api.device.service.DeviceService - message processing failed
java.lang.IllegalArgumentException: Last unit does not have enough valid bits
    at java.util.Base64$Decoder.decode0(Base64.java:734)
    at java.util.Base64$Decoder.decode(Base64.java:526)


How can I convert these bytes, including the ones in the extended ASCII range, to a short Base64 string? If another method can produce a unique code with as few characters as possible from a byte array, I'd be happy with that too.

1
I think you're doing it backward. You don't want to decode the input bytes, you want to encode them in base64. - DavidW
I second @DavidW. Base64 is an encoding to represent binary data as printable strings. Like Hex but with 64 as a base instead of 16. So you have bytes and you encode them to a Base64 string. And a Base64 string decodes back to bytes. - vanje
Consider Base85, too. - Tom Blodget
Thank you for your comments, I did get confused. I thought I could get fewer characters with that initial solution, but it doesn't work. Base85 introduces special characters while I wanted only alphanumeric characters, but thanks for the suggestion, I learned something! - Jeremie

1 Answer

3
votes

You've confused encode and decode. If you have binary data (8-bit bytes) that you want in a "safe" form, you want to encode them as base64. And once they are encoded, you don't need to convert them to hex to display them; the point of encoding is that the result is printable ASCII.

public static String getBase64HexDeviceIdFromSerialBytes(byte[] serial) {
    return java.util.Base64.getEncoder().encodeToString(ArrayUtils.subarray(serial, 0, 12));
}
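
As a rough sketch of how this plays out, the class below encodes a whole byte array rather than decoding it, and also shows one possible alphanumeric-only alternative. The class name `DeviceIdEncoding` and the helpers `toBase64`, `fromBase64`, and `toBase36` are hypothetical names made up for this illustration; only the `java.util.Base64` and `java.math.BigInteger` calls are standard JDK API. The base-36 variant is an assumption about what "alphanumeric only" could look like, not something proposed in the question or comments.

```java
import java.math.BigInteger;
import java.util.Base64;
import java.util.Locale;

public class DeviceIdEncoding {

    // Hypothetical helper: URL-safe Base64 without '=' padding.
    // Every 3 input bytes become 4 characters, so 12 bytes give 16 characters
    // (hex would need 24). The alphabet is A-Z, a-z, 0-9, '-' and '_'.
    public static String toBase64(byte[] serial) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(serial);
    }

    // Reverses toBase64; the JDK decoder does not require padding characters.
    public static byte[] fromBase64(String id) {
        return Base64.getUrlDecoder().decode(id);
    }

    // Hypothetical alternative using only digits and upper-case letters.
    // The bytes are read as one unsigned big-endian number (signum = 1).
    // Caveat: leading zero bytes are lost, so this is only unambiguous
    // when every identifier has the same known length.
    public static String toBase36(byte[] serial) {
        return new BigInteger(1, serial).toString(36).toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // The sample array from the question.
        byte[] serial = {0x24, 0x54, 0x4b, 0x00, 0x31, 0x00, 0x0e, 0x50, 0x33, 0x42, 0x58, 0x35};
        // Bytes above 0x7F show up as negative values in Java; encoding handles them fine.
        byte[] highBytes = {(byte) 0xE9, (byte) 0xFF, (byte) 0x80, 0x01};

        System.out.println(toBase64(serial));
        System.out.println(toBase64(highBytes));
        System.out.println(toBase36(serial));
    }
}
```

Base 36 trades length for a stricter alphabet: 12 bytes can need up to 19 characters instead of 16. Also note that the MIME decoder used in the question silently skips bytes outside the Base64 alphabet, so the original hex output dropped part of the identifier and was not guaranteed to be unique.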