In Objective-C, is there a way to convert a byte array of multi-byte Unicode data into an NSString such that the conversion succeeds even if the array is a partial buffer (one that does not end on a complete character boundary)?
The use case is receiving byte buffers from a stream and wanting to parse the string form of the current buffer while more data is still to come, so the buffer may end in the middle of a multi-byte character.
NSString's initWithData:encoding: method does not work for this purpose, as shown here:
Test code:
- (void)test {
    char myArray[] = {'f', 'o', 'o', (char) 0xc3, (char) 0x97, 'b', 'a', 'r'};
    size_t sizeOfMyArray = sizeof(myArray);
    [self dump:myArray sizeOfMyArray:sizeOfMyArray];
    [self dump:myArray sizeOfMyArray:sizeOfMyArray - 1];
    [self dump:myArray sizeOfMyArray:sizeOfMyArray - 2];
    [self dump:myArray sizeOfMyArray:sizeOfMyArray - 3];
    [self dump:myArray sizeOfMyArray:sizeOfMyArray - 4];
    [self dump:myArray sizeOfMyArray:sizeOfMyArray - 5];
}
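- (void)dump:(char[])myArray sizeOfMyArray:(size_t)sourceLength {
    // initWithData:encoding: returns nil if the bytes are not valid in the given encoding.
    NSString *string = [[NSString alloc] initWithData:[NSData dataWithBytes:myArray length:sourceLength] encoding:NSUTF8StringEncoding];
    // Cast to unsigned long so that %lu matches size_t and NSUInteger on every platform.
    NSLog(@"sourceLength: %lu bytes, string.length: %lu bytes, string :'%@'", (unsigned long)sourceLength, (unsigned long)string.length, string);
}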
Output:
sourceLength: 8 bytes, string.length: 7 bytes, string :'foo×bar'
sourceLength: 7 bytes, string.length: 6 bytes, string :'foo×ba'
sourceLength: 6 bytes, string.length: 5 bytes, string :'foo×b'
sourceLength: 5 bytes, string.length: 4 bytes, string :'foo×'
sourceLength: 4 bytes, string.length: 0 bytes, string :'(null)'
sourceLength: 3 bytes, string.length: 3 bytes, string :'foo'
As can be seen, converting the "sourceLength: 4 bytes" byte array fails: the initializer returns nil, which NSLog prints as (null). This is because the UTF-8 encoding of the '×' character (0xc3 0x97) is only partially included.
Ideally there would be a function that returns the correct NSString for the complete characters and tells me how many bytes are "left over".
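To illustrate the behaviour I am after, here is a minimal sketch in plain C (which can be compiled as part of an Objective-C file). The function name utf8_complete_prefix_length is hypothetical and not part of Foundation. It assumes the data is UTF-8, and it only detects an incomplete sequence at the very end of the buffer; it does not validate the rest of the bytes.

```c
#include <stddef.h>

/* Hypothetical helper (not a Foundation API): returns how many leading bytes
 * of `bytes` end on a complete UTF-8 character boundary. Only a truncated
 * sequence at the end of the buffer is trimmed; malformed input elsewhere is
 * left for the real decoder to reject. */
size_t utf8_complete_prefix_length(const unsigned char *bytes, size_t length) {
    size_t i = length;
    size_t continuation = 0;

    /* Walk back over at most 3 trailing continuation bytes (10xxxxxx). */
    while (i > 0 && continuation < 3 && (bytes[i - 1] & 0xC0) == 0x80) {
        i--;
        continuation++;
    }
    if (i == 0) {
        return length; /* empty buffer, or no lead byte found */
    }

    unsigned char lead = bytes[i - 1];
    size_t expected;
    if ((lead & 0x80) == 0x00) {
        expected = 1; /* 0xxxxxxx: ASCII */
    } else if ((lead & 0xE0) == 0xC0) {
        expected = 2; /* 110xxxxx */
    } else if ((lead & 0xF0) == 0xE0) {
        expected = 3; /* 1110xxxx */
    } else if ((lead & 0xF8) == 0xF0) {
        expected = 4; /* 11110xxx */
    } else {
        return length; /* not a valid lead byte; let the decoder fail */
    }

    /* Bytes actually present for the last character: lead + continuations. */
    size_t available = continuation + 1;
    return (available < expected) ? i - 1 : length;
}
```

With the test array above, a length of 4 would give a prefix of 3 bytes ('foo') and 1 byte left over, while a length of 5 would give a prefix of 5. The prefix could then presumably be passed to NSString's initWithBytes:length:encoding:, with the remaining bytes kept and prepended to the next buffer.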