I'm porting some Delphi 7 code to XE4, so Unicode is the subject here.
I have a method that writes a string to a TMemoryStream. According to this Embarcadero article, I should multiply the length of the string (in characters) by the size of the Char type to get the byte count that WriteBuffer expects as its second parameter.
Before:
rawHtml : string; //AnsiString
...
memorystream1.WriteBuffer(Pointer(rawHtml)^, Length(rawHtml));
After:
rawHtml : string; //UnicodeString
...
memorystream1.WriteBuffer(Pointer(rawHtml)^, Length(rawHtml) * SizeOf(Char));
My understanding is that Delphi's UnicodeString type is UTF-16 internally. But my general understanding of Unicode is that not every Unicode character can be represented in 2 bytes; some less common characters take 4 bytes. Another of Embarcadero's articles seems to confirm my suspicions: "In fact, it isn’t even always true that one Char is equal to two bytes!"
So that leaves me wondering whether Length(rawHtml) * SizeOf(Char) is really robust enough to be consistently accurate, or whether there's a better way to determine the size of the string in bytes.
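To make the concern concrete, here is a minimal sketch of how it could be checked. It assumes that Length counts Char elements (UTF-16 code units) rather than whole characters, which is my reading of the documentation. The test string and the program name are made up for illustration; ByteLength is the SysUtils helper that, as far as I can tell, returns Length(S) * SizeOf(Char).

```delphi
program ByteLengthSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  RawHtml: string;        // UnicodeString in Delphi 2009 and later
  Stream: TMemoryStream;
begin
  // Hypothetical test data: 'a' followed by U+1D11E (musical G clef),
  // written explicitly as its UTF-16 surrogate pair.
  RawHtml := 'a' + #$D834#$DD1E;

  Stream := TMemoryStream.Create;
  try
    // Same call as in the question. Note that an empty string would
    // make Pointer(RawHtml) nil, so guard for that in real code.
    Stream.WriteBuffer(Pointer(RawHtml)^, Length(RawHtml) * SizeOf(Char));

    Writeln('Length (Char elements): ', Length(RawHtml));
    Writeln('ByteLength (bytes):     ', ByteLength(RawHtml));
    Writeln('Stream.Size (bytes):    ', Stream.Size);
  finally
    Stream.Free;
  end;
end.
```

If the assumption holds, the surrogate pair contributes two Char elements to Length, so the byte count passed to WriteBuffer should match Stream.Size even though the string holds only two user-visible characters.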
... TStringStream instead of TMemoryStream? – teran
... TStream, which means the internal structure of both works the same; it's just how you interact with it that's different. So even a TFileStream or TResourceStream is applicable in your case, that is, if you were sending files or resources to your browser anyway. – Jerry Dodge