
For serialization of primitive arrays, I'm wondering how to convert a Primitive[] to its corresponding byte[] (i.e. an int[128] to a byte[512], or a ushort[] to a byte[]...). The destination can be a MemoryStream, a network message, a file, anything. The goal is performance (serialization and deserialization time): to be able to write a byte[] to a stream in one shot instead of looping through all the values or allocating through some converter.

Solutions I have already explored:

Regular Loop to write/read

//array = any int[];
myStreamWriter.WriteInt32(array.Length);
for(int i = 0; i < array.Length; ++i)
   myStreamWriter.WriteInt32(array[i]);

This solution works for both serialization and deserialization, and is roughly 100 times faster than using the standard System.Runtime.Serialization combined with a BinaryFormatter to serialize/deserialize a single int, or a couple of them.

But this solution becomes slower once the array holds more than 200 to 300 values (for Int32).

Cast?

It seems C# can't directly cast an int[] to a byte[], or a bool[] to a byte[].

BitConverter.GetBytes()

This solution works, but it allocates a new byte[] on each iteration of the loop over my int[]. Performance is, of course, horrible.
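
To make the cost concrete, here is a minimal sketch of what such a loop might look like. The class and method names (BitConverterLoop, Serialize) are hypothetical and only for illustration; the point is that BitConverter.GetBytes returns a fresh 4-byte array on every call.

```csharp
using System;
using System.IO;

public static class BitConverterLoop
{
    // Hypothetical helper: serializes an int[] one element at a time.
    public static byte[] Serialize(int[] values)
    {
        using (var stream = new MemoryStream(values.Length * sizeof(int)))
        {
            foreach (int value in values)
            {
                // GetBytes allocates a new byte[4] on every iteration.
                byte[] chunk = BitConverter.GetBytes(value);
                stream.Write(chunk, 0, chunk.Length);
            }
            // ToArray copies the stream contents into yet another array.
            return stream.ToArray();
        }
    }
}
```

For an int[128] this means 128 short-lived allocations plus the final copy, which is consistent with the poor performance described above.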

Marshal.Copy

Yes, this solution works too, but it has the same problem as the previous BitConverter one.
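
For reference, one way Marshal.Copy might be used for this is sketched below. This is an assumption about the approach, not necessarily the code that was tried: it pins the int[] with a GCHandle and copies its raw bytes into a newly allocated byte[]. The names MarshalCopyExample and ToBytes are hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;

public static class MarshalCopyExample
{
    // Hypothetical helper: copies the raw bytes of an int[] into a new byte[].
    public static byte[] ToBytes(int[] values)
    {
        // The destination still has to be allocated by the caller.
        var bytes = new byte[values.Length * sizeof(int)];

        // Pin the source so the GC cannot move it while we hold its address.
        GCHandle handle = GCHandle.Alloc(values, GCHandleType.Pinned);
        try
        {
            Marshal.Copy(handle.AddrOfPinnedObject(), bytes, 0, bytes.Length);
        }
        finally
        {
            handle.Free(); // always unpin, even if the copy throws
        }
        return bytes;
    }
}
```

The copy happens in one call rather than per element, but the destination allocation and the pinning overhead remain.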

C++ hack

Because a direct cast is not allowed in C#, I tried a C++ hack after seeing in memory that the array length is stored 4 bytes before the array data starts.

ARRAYCAST_API void Cast(int* input, unsigned char** output)
{
   // get the address of the input (this is a pointer to the data)
   int* count = input;
   // the size of the buffer is located just before the data (4 bytes before as this is an int)
   count--;
   // multiply the number of elements by 4 as an int is 4 bytes
   *count = *count * 4;
   // set the address of the byte array
   *output = (unsigned char*)(input);
}

and the C# that calls it:

byte[] arrayB = null;
int[] arrayI = new int[128];
for (int i = 0; i < 128; ++i)
   arrayI[i] = i;

// delegate call
fptr(arrayI, out arrayB);

I successfully retrieve my int[128] in C++, change the array length, and assign the right address to my 'output' variable, but C# only gets a byte[1] back. It seems I can't hack a managed variable that easily.

So I'm really starting to think that all the casts I want to achieve (int[] -> byte[], bool[] -> byte[], double[] -> byte[]...) are simply impossible in C# without allocating/copying.

What am I missing?

Can you be more specific about what you are trying to do? You are serializing arrays? Serializing to where? The HDD may be your real bottleneck. And perhaps you should use a byte[] to hold your original data (no need to serialize then, but retrieving the data is tricky). - Sinatr
I'm surprised that the regular loop performs so badly. It strikes me that it should do better; merely looping over two or three hundred values doesn't sound like something that should cause performance problems. Perhaps you should investigate this further rather than writing it off as an unavoidable performance impact. - Chris
One note on the "C++ hack": you can work with pointers directly in C# if you want to. I've not done it myself, but msdn.microsoft.com/en-us/library/f58wzh21(VS.80).aspx might be a starting point if you want to look into it. I've come across it in the context of fast manipulation of bitmap data in images, but given that this is also just manipulating arrays, it might work for you if you really need the performance. - Chris
@Sinatr I've edited my top message, but the goal is just to binary-serialize arrays of primitive values (can be byte, sbyte, short, ushort, int, sint, long, slong, double, decimal, DateTime, TimeSpan or bool) as quickly as I can. The destination doesn't matter; it can be a MemoryStream, a network message, or a file. - jlevet
@Chris I forgot to say, but I've already tried fixed blocks without success. I even tried casting directly in IL code, but I always get an InvalidCastException as the result. - jlevet

1 Answer


How about using Buffer.BlockCopy?

// serialize
var intArray = new[] { 1, 2, 3, 4, 5, 6, 7, 8 };
var byteArray = new byte[intArray.Length * 4];
Buffer.BlockCopy(intArray, 0, byteArray, 0, byteArray.Length);

// deserialize and test
var intArray2 = new int[byteArray.Length / 4];
Buffer.BlockCopy(byteArray, 0, intArray2, 0, byteArray.Length);
Console.WriteLine(intArray.SequenceEqual(intArray2));    // true
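
If the allocation is the main concern, the same call can target a buffer that is reused between messages. The following is a minimal sketch under that assumption; PrimitiveArrayCodec, CopyToBytes and CopyFromBytes are hypothetical names. It uses Buffer.ByteLength so the element size does not have to be hard-coded.

```csharp
using System;

public static class PrimitiveArrayCodec
{
    // Copies the raw bytes of a primitive array into a caller-owned buffer.
    // Returns the number of bytes written.
    public static int CopyToBytes(Array source, byte[] destination)
    {
        // ByteLength works for arrays of primitive types (int, ushort, double, bool, ...).
        int byteCount = Buffer.ByteLength(source);
        if (destination.Length < byteCount)
            throw new ArgumentException("Destination buffer is too small.", nameof(destination));

        Buffer.BlockCopy(source, 0, destination, 0, byteCount);
        return byteCount;
    }

    // Copies byteCount bytes from the buffer back into a primitive array.
    public static void CopyFromBytes(byte[] source, int byteCount, Array destination)
    {
        Buffer.BlockCopy(source, 0, destination, 0, byteCount);
    }
}
```

Note that Buffer.BlockCopy only accepts arrays of primitive types, so decimal, DateTime and TimeSpan arrays would need a different path. The bytes are laid out in the machine's native byte order.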

Note that this still allocates a destination array and copies the data into it; BlockCopy itself only does the copy. I'm fairly sure that the copy is unavoidable in managed code, and BlockCopy is probably about as good as it gets for this.
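
On runtimes that provide Span<T> (an assumption about the target platform, since this discussion may predate it), the intermediate byte[] can sometimes be skipped by viewing the array's memory as bytes. This is a different technique from BlockCopy, sketched here with hypothetical names (SpanView, Write, Read); it relies on MemoryMarshal.AsBytes and the span-based Stream overloads.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class SpanView
{
    // Writes the raw bytes of the int[] to the stream without an intermediate byte[].
    public static void Write(Stream stream, int[] values)
    {
        // AsBytes reinterprets the same memory as bytes; no data is copied here.
        Span<byte> view = MemoryMarshal.AsBytes(values.AsSpan());
        stream.Write(view);
    }

    // Fills the int[] directly from the stream.
    public static void Read(Stream stream, int[] destination)
    {
        Span<byte> view = MemoryMarshal.AsBytes(destination.AsSpan());
        while (view.Length > 0)
        {
            int read = stream.Read(view); // may return fewer bytes than requested
            if (read == 0) throw new EndOfStreamException();
            view = view.Slice(read);
        }
    }
}
```

The stream itself may still copy the bytes into its own storage, and the data is written in native byte order, so both ends need to agree on endianness.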