1468
votes

How can you convert a byte array to a hexadecimal string, and vice versa?

30
The accepted answer below appear to allocate a horrible amount of strings in the string to bytes conversion. I'm wondering how this impacts performanceWim Coenen

30 Answers

1484
votes

Either:

public static string ByteArrayToString(byte[] ba)
{
  StringBuilder hex = new StringBuilder(ba.Length * 2);
  foreach (byte b in ba)
    hex.AppendFormat("{0:x2}", b);
  return hex.ToString();
}

or:

public static string ByteArrayToString(byte[] ba)
{
  return BitConverter.ToString(ba).Replace("-","");
}

There are even more variants of doing it, for example here.

The reverse conversion would go like this:

public static byte[] StringToByteArray(String hex)
{
  int NumberChars = hex.Length;
  byte[] bytes = new byte[NumberChars / 2];
  for (int i = 0; i < NumberChars; i += 2)
    bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
  return bytes;
}

Using Substring is the best option in combination with Convert.ToByte. See this answer for more information. If you need better performance, you must avoid Convert.ToByte before you can drop SubString.

514
votes

Performance Analysis

Note: new leader as of 2015-08-20.

I ran each of the various conversion methods through some crude Stopwatch performance testing, a run with a random sentence (n=61, 1000 iterations) and a run with a Project Gutenburg text (n=1,238,957, 150 iterations). Here are the results, roughly from fastest to slowest. All measurements are in ticks (10,000 ticks = 1 ms) and all relative notes are compared to the [slowest] StringBuilder implementation. For the code used, see below or the test framework repo where I now maintain the code for running this.

Disclaimer

WARNING: Do not rely on these stats for anything concrete; they are simply a sample run of sample data. If you really need top-notch performance, please test these methods in an environment representative of your production needs with data representative of what you will use.

Results

Lookup tables have taken the lead over byte manipulation. Basically, there is some form of precomputing what any given nibble or byte will be in hex. Then, as you rip through the data, you simply look up the next portion to see what hex string it would be. That value is then added to the resulting string output in some fashion. For a long time byte manipulation, potentially harder to read by some developers, was the top-performing approach.

Your best bet is still going to be finding some representative data and trying it out in a production-like environment. If you have different memory constraints, you may prefer a method with fewer allocations to one that would be faster but consume more memory.

Testing Code

Feel free to play with the testing code I used. A version is included here but feel free to clone the repo and add your own methods. Please submit a pull request if you find anything interesting or want to help improve the testing framework it uses.

  1. Add the new static method (Func<byte[], string>) to /Tests/ConvertByteArrayToHexString/Test.cs.
  2. Add that method's name to the TestCandidates return value in that same class.
  3. Make sure you are running the input version you want, sentence or text, by toggling the comments in GenerateTestInput in that same class.
  4. Hit F5 and wait for the output (an HTML dump is also generated in the /bin folder).
static string ByteArrayToHexStringViaStringJoinArrayConvertAll(byte[] bytes) {
    return string.Join(string.Empty, Array.ConvertAll(bytes, b => b.ToString("X2")));
}
static string ByteArrayToHexStringViaStringConcatArrayConvertAll(byte[] bytes) {
    return string.Concat(Array.ConvertAll(bytes, b => b.ToString("X2")));
}
static string ByteArrayToHexStringViaBitConverter(byte[] bytes) {
    string hex = BitConverter.ToString(bytes);
    return hex.Replace("-", "");
}
static string ByteArrayToHexStringViaStringBuilderAggregateByteToString(byte[] bytes) {
    return bytes.Aggregate(new StringBuilder(bytes.Length * 2), (sb, b) => sb.Append(b.ToString("X2"))).ToString();
}
static string ByteArrayToHexStringViaStringBuilderForEachByteToString(byte[] bytes) {
    StringBuilder hex = new StringBuilder(bytes.Length * 2);
    foreach (byte b in bytes)
        hex.Append(b.ToString("X2"));
    return hex.ToString();
}
static string ByteArrayToHexStringViaStringBuilderAggregateAppendFormat(byte[] bytes) {
    return bytes.Aggregate(new StringBuilder(bytes.Length * 2), (sb, b) => sb.AppendFormat("{0:X2}", b)).ToString();
}
static string ByteArrayToHexStringViaStringBuilderForEachAppendFormat(byte[] bytes) {
    StringBuilder hex = new StringBuilder(bytes.Length * 2);
    foreach (byte b in bytes)
        hex.AppendFormat("{0:X2}", b);
    return hex.ToString();
}
static string ByteArrayToHexViaByteManipulation(byte[] bytes) {
    char[] c = new char[bytes.Length * 2];
    byte b;
    for (int i = 0; i < bytes.Length; i++) {
        b = ((byte)(bytes[i] >> 4));
        c[i * 2] = (char)(b > 9 ? b + 0x37 : b + 0x30);
        b = ((byte)(bytes[i] & 0xF));
        c[i * 2 + 1] = (char)(b > 9 ? b + 0x37 : b + 0x30);
    }
    return new string(c);
}
static string ByteArrayToHexViaByteManipulation2(byte[] bytes) {
    char[] c = new char[bytes.Length * 2];
    int b;
    for (int i = 0; i < bytes.Length; i++) {
        b = bytes[i] >> 4;
        c[i * 2] = (char)(55 + b + (((b - 10) >> 31) & -7));
        b = bytes[i] & 0xF;
        c[i * 2 + 1] = (char)(55 + b + (((b - 10) >> 31) & -7));
    }
    return new string(c);
}
static string ByteArrayToHexViaSoapHexBinary(byte[] bytes) {
    SoapHexBinary soapHexBinary = new SoapHexBinary(bytes);
    return soapHexBinary.ToString();
}
static string ByteArrayToHexViaLookupAndShift(byte[] bytes) {
    StringBuilder result = new StringBuilder(bytes.Length * 2);
    string hexAlphabet = "0123456789ABCDEF";
    foreach (byte b in bytes) {
        result.Append(hexAlphabet[(int)(b >> 4)]);
        result.Append(hexAlphabet[(int)(b & 0xF)]);
    }
    return result.ToString();
}
static readonly uint* _lookup32UnsafeP = (uint*)GCHandle.Alloc(_Lookup32, GCHandleType.Pinned).AddrOfPinnedObject();
static string ByteArrayToHexViaLookup32UnsafeDirect(byte[] bytes) {
    var lookupP = _lookup32UnsafeP;
    var result = new string((char)0, bytes.Length * 2);
    fixed (byte* bytesP = bytes)
    fixed (char* resultP = result) {
        uint* resultP2 = (uint*)resultP;
        for (int i = 0; i < bytes.Length; i++) {
            resultP2[i] = lookupP[bytesP[i]];
        }
    }
    return result;
}
static uint[] _Lookup32 = Enumerable.Range(0, 255).Select(i => {
    string s = i.ToString("X2");
    return ((uint)s[0]) + ((uint)s[1] << 16);
}).ToArray();
static string ByteArrayToHexViaLookupPerByte(byte[] bytes) {
    var result = new char[bytes.Length * 2];
    for (int i = 0; i < bytes.Length; i++)
    {
        var val = _Lookup32[bytes[i]];
        result[2*i] = (char)val;
        result[2*i + 1] = (char) (val >> 16);
    }
    return new string(result);
}
static string ByteArrayToHexViaLookup(byte[] bytes) {
    string[] hexStringTable = new string[] {
        "00", "01", "02", "03", "04", "05", "06", "07", "08", "09", "0A", "0B", "0C", "0D", "0E", "0F",
        "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "1A", "1B", "1C", "1D", "1E", "1F",
        "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "2A", "2B", "2C", "2D", "2E", "2F",
        "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "3A", "3B", "3C", "3D", "3E", "3F",
        "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "4A", "4B", "4C", "4D", "4E", "4F",
        "50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "5A", "5B", "5C", "5D", "5E", "5F",
        "60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "6A", "6B", "6C", "6D", "6E", "6F",
        "70", "71", "72", "73", "74", "75", "76", "77", "78", "79", "7A", "7B", "7C", "7D", "7E", "7F",
        "80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "8A", "8B", "8C", "8D", "8E", "8F",
        "90", "91", "92", "93", "94", "95", "96", "97", "98", "99", "9A", "9B", "9C", "9D", "9E", "9F",
        "A0", "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "AA", "AB", "AC", "AD", "AE", "AF",
        "B0", "B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9", "BA", "BB", "BC", "BD", "BE", "BF",
        "C0", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "CA", "CB", "CC", "CD", "CE", "CF",
        "D0", "D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8", "D9", "DA", "DB", "DC", "DD", "DE", "DF",
        "E0", "E1", "E2", "E3", "E4", "E5", "E6", "E7", "E8", "E9", "EA", "EB", "EC", "ED", "EE", "EF",
        "F0", "F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8", "F9", "FA", "FB", "FC", "FD", "FE", "FF",
    };
    StringBuilder result = new StringBuilder(bytes.Length * 2);
    foreach (byte b in bytes) {
        result.Append(hexStringTable[b]);
    }
    return result.ToString();
}

Update (2010-01-13)

Added Waleed's answer to analysis. Quite fast.

Update (2011-10-05)

Added string.Concat Array.ConvertAll variant for completeness (requires .NET 4.0). On par with string.Join version.

Update (2012-02-05)

Test repo includes more variants such as StringBuilder.Append(b.ToString("X2")). None upset the results any. foreach is faster than {IEnumerable}.Aggregate, for instance, but BitConverter still wins.

Update (2012-04-03)

Added Mykroft's SoapHexBinary answer to analysis, which took over third place.

Update (2013-01-15)

Added CodesInChaos's byte manipulation answer, which took over first place (by a large margin on large blocks of text).

Update (2013-05-23)

Added Nathan Moinvaziri's lookup answer and the variant from Brian Lambert's blog. Both rather fast, but not taking the lead on the test machine I used (AMD Phenom 9750).

Update (2014-07-31)

Added @CodesInChaos's new byte-based lookup answer. It appears to have taken the lead on both the sentence tests and the full-text tests.

Update (2015-08-20)

Added airbreather's optimizations and unsafe variant to this answer's repo. If you want to play in the unsafe game, you can get some huge performance gains over any of the prior top winners on both short strings and large texts.

251
votes

There's a class called SoapHexBinary that does exactly what you want.

using System.Runtime.Remoting.Metadata.W3cXsd2001;

public static byte[] GetStringToBytes(string value)
{
    SoapHexBinary shb = SoapHexBinary.Parse(value);
    return shb.Value;
}

public static string GetBytesToString(byte[] value)
{
    SoapHexBinary shb = new SoapHexBinary(value);
    return shb.ToString();
}
148
votes

When writing crypto code it's common to avoid data dependent branches and table lookups to ensure the runtime doesn't depend on the data, since data dependent timing can lead to side-channel attacks.

It's also pretty fast.

static string ByteToHexBitFiddle(byte[] bytes)
{
    char[] c = new char[bytes.Length * 2];
    int b;
    for (int i = 0; i < bytes.Length; i++) {
        b = bytes[i] >> 4;
        c[i * 2] = (char)(55 + b + (((b-10)>>31)&-7));
        b = bytes[i] & 0xF;
        c[i * 2 + 1] = (char)(55 + b + (((b-10)>>31)&-7));
    }
    return new string(c);
}

Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn


Abandon all hope, ye who enter here

An explanation of the weird bit fiddling:

  1. bytes[i] >> 4 extracts the high nibble of a byte
    bytes[i] & 0xF extracts the low nibble of a byte
  2. b - 10
    is < 0 for values b < 10, which will become a decimal digit
    is >= 0 for values b > 10, which will become a letter from A to F.
  3. Using i >> 31 on a signed 32 bit integer extracts the sign, thanks to sign extension. It will be -1 for i < 0 and 0 for i >= 0.
  4. Combining 2) and 3), shows that (b-10)>>31 will be 0 for letters and -1 for digits.
  5. Looking at the case for letters, the last summand becomes 0, and b is in the range 10 to 15. We want to map it to A(65) to F(70), which implies adding 55 ('A'-10).
  6. Looking at the case for digits, we want to adapt the last summand so it maps b from the range 0 to 9 to the range 0(48) to 9(57). This means it needs to become -7 ('0' - 55).
    Now we could just multiply with 7. But since -1 is represented by all bits being 1, we can instead use & -7 since (0 & -7) == 0 and (-1 & -7) == -7.

Some further considerations:

  • I didn't use a second loop variable to index into c, since measurement shows that calculating it from i is cheaper.
  • Using exactly i < bytes.Length as upper bound of the loop allows the JITter to eliminate bounds checks on bytes[i], so I chose that variant.
  • Making b an int allows unnecessary conversions from and to byte.
107
votes

If you want more flexibility than BitConverter, but don't want those clunky 1990s-style explicit loops, then you can do:

String.Join(String.Empty, Array.ConvertAll(bytes, x => x.ToString("X2")));

Or, if you're using .NET 4.0:

String.Concat(Array.ConvertAll(bytes, x => x.ToString("X2")));

(The latter from a comment on the original post.)

76
votes

Another lookup table based approach. This one uses only one lookup table for each byte, instead of a lookup table per nibble.

private static readonly uint[] _lookup32 = CreateLookup32();

private static uint[] CreateLookup32()
{
    var result = new uint[256];
    for (int i = 0; i < 256; i++)
    {
        string s=i.ToString("X2");
        result[i] = ((uint)s[0]) + ((uint)s[1] << 16);
    }
    return result;
}

private static string ByteArrayToHexViaLookup32(byte[] bytes)
{
    var lookup32 = _lookup32;
    var result = new char[bytes.Length * 2];
    for (int i = 0; i < bytes.Length; i++)
    {
        var val = lookup32[bytes[i]];
        result[2*i] = (char)val;
        result[2*i + 1] = (char) (val >> 16);
    }
    return new string(result);
}

I also tested variants of this using ushort, struct{char X1, X2}, struct{byte X1, X2} in the lookup table.

Depending on the compilation target (x86, X64) those either had the approximately same performance or were slightly slower than this variant.


And for even higher performance, its unsafe sibling:

private static readonly uint[] _lookup32Unsafe = CreateLookup32Unsafe();
private static readonly uint* _lookup32UnsafeP = (uint*)GCHandle.Alloc(_lookup32Unsafe,GCHandleType.Pinned).AddrOfPinnedObject();

private static uint[] CreateLookup32Unsafe()
{
    var result = new uint[256];
    for (int i = 0; i < 256; i++)
    {
        string s=i.ToString("X2");
        if(BitConverter.IsLittleEndian)
            result[i] = ((uint)s[0]) + ((uint)s[1] << 16);
        else
            result[i] = ((uint)s[1]) + ((uint)s[0] << 16);
    }
    return result;
}

public static string ByteArrayToHexViaLookup32Unsafe(byte[] bytes)
{
    var lookupP = _lookup32UnsafeP;
    var result = new char[bytes.Length * 2];
    fixed(byte* bytesP = bytes)
    fixed (char* resultP = result)
    {
        uint* resultP2 = (uint*)resultP;
        for (int i = 0; i < bytes.Length; i++)
        {
            resultP2[i] = lookupP[bytesP[i]];
        }
    }
    return new string(result);
}

Or if you consider it acceptable to write into the string directly:

public static string ByteArrayToHexViaLookup32UnsafeDirect(byte[] bytes)
{
    var lookupP = _lookup32UnsafeP;
    var result = new string((char)0, bytes.Length * 2);
    fixed (byte* bytesP = bytes)
    fixed (char* resultP = result)
    {
        uint* resultP2 = (uint*)resultP;
        for (int i = 0; i < bytes.Length; i++)
        {
            resultP2[i] = lookupP[bytesP[i]];
        }
    }
    return result;
}
67
votes

You can use the BitConverter.ToString method:

byte[] bytes = {0, 1, 2, 4, 8, 16, 32, 64, 128, 256}
Console.WriteLine( BitConverter.ToString(bytes));

Output:

00-01-02-04-08-10-20-40-80-FF

More information: BitConverter.ToString Method (Byte[])

57
votes

I just encountered the very same problem today, and I came across this code:

private static string ByteArrayToHex(byte[] barray)
{
    char[] c = new char[barray.Length * 2];
    byte b;
    for (int i = 0; i < barray.Length; ++i)
    {
        b = ((byte)(barray[i] >> 4));
        c[i * 2] = (char)(b > 9 ? b + 0x37 : b + 0x30);
        b = ((byte)(barray[i] & 0xF));
        c[i * 2 + 1] = (char)(b > 9 ? b + 0x37 : b + 0x30);
    }
    return new string(c);
}

Source: Forum post byte[] Array to Hex String (see the post by PZahra). I modified the code a little to remove the 0x prefix.

I did some performance testing to the code and it was almost eight times faster than using BitConverter.ToString() (the fastest according to patridge's post).

22
votes

This is an answer to revision 4 of Tomalak's highly popular answer (and subsequent edits).

I'll make the case that this edit is wrong, and explain why it could be reverted. Along the way, you might learn a thing or two about some internals, and see yet another example of what premature optimization really is and how it can bite you.

tl;dr: Just use Convert.ToByte and String.Substring if you're in a hurry ("Original code" below), it's the best combination if you don't want to re-implement Convert.ToByte. Use something more advanced (see other answers) that doesn't use Convert.ToByte if you need performance. Do not use anything else other than String.Substring in combination with Convert.ToByte, unless someone has something interesting to say about this in the comments of this answer.

warning: This answer may become obsolete if a Convert.ToByte(char[], Int32) overload is implemented in the framework. This is unlikely to happen soon.

As a general rule, I don't much like to say "don't optimize prematurely", because nobody knows when "premature" is. The only thing you must consider when deciding whether to optimize or not is: "Do I have the time and resources to investigate optimization approaches properly?". If you don't, then it's too soon, wait until your project is more mature or until you need the performance (if there is a real need, then you will make the time). In the meantime, do the simplest thing that could possibly work instead.

Original code:

    public static byte[] HexadecimalStringToByteArray_Original(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        for (var i = 0; i < outputLength; i++)
            output[i] = Convert.ToByte(input.Substring(i * 2, 2), 16);
        return output;
    }

Revision 4:

    public static byte[] HexadecimalStringToByteArray_Rev4(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        using (var sr = new StringReader(input))
        {
            for (var i = 0; i < outputLength; i++)
                output[i] = Convert.ToByte(new string(new char[2] { (char)sr.Read(), (char)sr.Read() }), 16);
        }
        return output;
    }

The revision avoids String.Substring and uses a StringReader instead. The given reason is:

Edit: you can improve performance for long strings by using a single pass parser, like so:

Well, looking at the reference code for String.Substring, it's clearly "single-pass" already; and why shouldn't it be? It operates at byte-level, not on surrogate pairs.

It does allocate a new string however, but then you need to allocate one to pass to Convert.ToByte anyway. Furthermore, the solution provided in the revision allocates yet another object on every iteration (the two-char array); you can safely put that allocation outside the loop and reuse the array to avoid that.

    public static byte[] HexadecimalStringToByteArray(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        var numeral = new char[2];
        using (var sr = new StringReader(input))
        {
            for (var i = 0; i < outputLength; i++)
            {
                numeral[0] = (char)sr.Read();
                numeral[1] = (char)sr.Read();
                output[i] = Convert.ToByte(new string(numeral), 16);
            }
        }
        return output;
    }

Each hexadecimal numeral represents a single octet using two digits (symbols).

But then, why call StringReader.Read twice? Just call its second overload and ask it to read two characters in the two-char array at once; and reduce the amount of calls by two.

    public static byte[] HexadecimalStringToByteArray(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        var numeral = new char[2];
        using (var sr = new StringReader(input))
        {
            for (var i = 0; i < outputLength; i++)
            {
                var read = sr.Read(numeral, 0, 2);
                Debug.Assert(read == 2);
                output[i] = Convert.ToByte(new string(numeral), 16);
            }
        }
        return output;
    }

What you're left with is a string reader whose only added "value" is a parallel index (internal _pos) which you could have declared yourself (as j for example), a redundant length variable (internal _length), and a redundant reference to the input string (internal _s). In other words, it's useless.

If you wonder how Read "reads", just look at the code, all it does is call String.CopyTo on the input string. The rest is just book-keeping overhead to maintain values we don't need.

So, remove the string reader already, and call CopyTo yourself; it's simpler, clearer, and more efficient.

    public static byte[] HexadecimalStringToByteArray(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        var numeral = new char[2];
        for (int i = 0, j = 0; i < outputLength; i++, j += 2)
        {
            input.CopyTo(j, numeral, 0, 2);
            output[i] = Convert.ToByte(new string(numeral), 16);
        }
        return output;
    }

Do you really need a j index that increments in steps of two parallel to i? Of course not, just multiply i by two (which the compiler should be able to optimize to an addition).

    public static byte[] HexadecimalStringToByteArray_BestEffort(string input)
    {
        var outputLength = input.Length / 2;
        var output = new byte[outputLength];
        var numeral = new char[2];
        for (int i = 0; i < outputLength; i++)
        {
            input.CopyTo(i * 2, numeral, 0, 2);
            output[i] = Convert.ToByte(new string(numeral), 16);
        }
        return output;
    }

What does the solution look like now? Exactly like it was at the beginning, only instead of using String.Substring to allocate the string and copy the data to it, you're using an intermediary array to which you copy the hexadecimal numerals to, then allocate the string yourself and copy the data again from the array and into the string (when you pass it in the string constructor). The second copy might be optimized-out if the string is already in the intern pool, but then String.Substring will also be able to avoid it in these cases.

In fact, if you look at String.Substring again, you see that it uses some low-level internal knowledge of how strings are constructed to allocate the string faster than you could normally do it, and it inlines the same code used by CopyTo directly in there to avoid the call overhead.

String.Substring

  • Worst-case: One fast allocation, one fast copy.
  • Best-case: No allocation, no copy.

Manual method

  • Worst-case: Two normal allocations, one normal copy, one fast copy.
  • Best-case: One normal allocation, one normal copy.

Conclusion? If you want to use Convert.ToByte(String, Int32) (because you don't want to re-implement that functionality yourself), there doesn't seem to be a way to beat String.Substring; all you do is run in circles, re-inventing the wheel (only with sub-optimal materials).

Note that using Convert.ToByte and String.Substring is a perfectly valid choice if you don't need extreme performance. Remember: only opt for an alternative if you have the time and resources to investigate how it works properly.

If there was a Convert.ToByte(char[], Int32), things would be different of course (it would be possible to do what I described above and completely avoid String).

I suspect that people who report better performance by "avoiding String.Substring" also avoid Convert.ToByte(String, Int32), which you should really be doing if you need the performance anyway. Look at the countless other answers to discover all the different approaches to do that.

Disclaimer: I haven't decompiled the latest version of the framework to verify that the reference source is up-to-date, I assume it is.

Now, it all sounds good and logical, hopefully even obvious if you've managed to get so far. But is it true?

Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
    Cores: 8
    Current Clock Speed: 2600
    Max Clock Speed: 2600
--------------------
Parsing hexadecimal string into an array of bytes
--------------------
HexadecimalStringToByteArray_Original: 7,777.09 average ticks (over 10000 runs), 1.2X
HexadecimalStringToByteArray_BestEffort: 8,550.82 average ticks (over 10000 runs), 1.1X
HexadecimalStringToByteArray_Rev4: 9,218.03 average ticks (over 10000 runs), 1.0X

Yes!

Props to Partridge for the bench framework, it's easy to hack. The input used is the following SHA-1 hash repeated 5000 times to make a 100,000 bytes long string.

209113288F93A9AB8E474EA78D899AFDBB874355

Have fun! (But optimize with moderation.)

18
votes

Complement to answer by @CodesInChaos (reversed method)

public static byte[] HexToByteUsingByteManipulation(string s)
{
    byte[] bytes = new byte[s.Length / 2];
    for (int i = 0; i < bytes.Length; i++)
    {
        int hi = s[i*2] - 65;
        hi = hi + 10 + ((hi >> 31) & 7);

        int lo = s[i*2 + 1] - 65;
        lo = lo + 10 + ((lo >> 31) & 7) & 0x0f;

        bytes[i] = (byte) (lo | hi << 4);
    }
    return bytes;
}

Explanation:

& 0x0f is to support also lower case letters

hi = hi + 10 + ((hi >> 31) & 7); is the same as:

hi = ch-65 + 10 + (((ch-65) >> 31) & 7);

For '0'..'9' it is the same as hi = ch - 65 + 10 + 7; which is hi = ch - 48 (this is because of 0xffffffff & 7).

For 'A'..'F' it is hi = ch - 65 + 10; (this is because of 0x00000000 & 7).

For 'a'..'f' we have to big numbers so we must subtract 32 from default version by making some bits 0 by using & 0x0f.

65 is code for 'A'

48 is code for '0'

7 is the number of letters between '9' and 'A' in the ASCII table (...456789:;<=>?@ABCD...).

18
votes

This problem could also be solved using a look-up table. This would require a small amount of static memory for both the encoder and decoder. This method will however be fast:

  • Encoder table 512 bytes or 1024 bytes (twice the size if both upper and lower case is needed)
  • Decoder table 256 bytes or 64 KiB (either a single char look-up or dual char look-up)

My solution uses 1024 bytes for the encoding table, and 256 bytes for decoding.

Decoding

private static readonly byte[] LookupTable = new byte[] {
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF
};

private static byte Lookup(char c)
{
  var b = LookupTable[c];
  if (b == 255)
    throw new IOException("Expected a hex character, got " + c);
  return b;
}

public static byte ToByte(char[] chars, int offset)
{
  return (byte)(Lookup(chars[offset]) << 4 | Lookup(chars[offset + 1]));
}

Encoding

private static readonly char[][] LookupTableUpper;
private static readonly char[][] LookupTableLower;

static Hex()
{
  LookupTableLower = new char[256][];
  LookupTableUpper = new char[256][];
  for (var i = 0; i < 256; i++)
  {
    LookupTableLower[i] = i.ToString("x2").ToCharArray();
    LookupTableUpper[i] = i.ToString("X2").ToCharArray();
  }
}

public static char[] ToCharLower(byte[] b, int bOffset)
{
  return LookupTableLower[b[bOffset]];
}

public static char[] ToCharUpper(byte[] b, int bOffset)
{
  return LookupTableUpper[b[bOffset]];
}

Comparison

StringBuilderToStringFromBytes:   106148
BitConverterToStringFromBytes:     15783
ArrayConvertAllToStringFromBytes:  54290
ByteManipulationToCharArray:        8444
TableBasedToCharArray:              5651 *

* this solution

Note

During decoding IOException and IndexOutOfRangeException could occur (if a character has a too high value > 256). Methods for de/encoding streams or arrays should be implemented, this is just a proof of concept.

18
votes

As of .NET 5 RC2 you can use:

Overloads are available that take span parameters.

11
votes

Why make it complex? This is simple in Visual Studio 2008:

C#:

string hex = BitConverter.ToString(YourByteArray).Replace("-", "");

VB:

Dim hex As String = BitConverter.ToString(YourByteArray).Replace("-", "")
9
votes

This is a great post. I like Waleed's solution. I haven't run it through patridge's test but it seems to be quite fast. I also needed the reverse process, converting a hex string to a byte array, so I wrote it as a reversal of Waleed's solution. Not sure if it's any faster than Tomalak's original solution. Again, I did not run the reverse process through patridge's test either.

private byte[] HexStringToByteArray(string hexString)
{
    int hexStringLength = hexString.Length;
    byte[] b = new byte[hexStringLength / 2];
    for (int i = 0; i < hexStringLength; i += 2)
    {
        int topChar = (hexString[i] > 0x40 ? hexString[i] - 0x37 : hexString[i] - 0x30) << 4;
        int bottomChar = hexString[i + 1] > 0x40 ? hexString[i + 1] - 0x37 : hexString[i + 1] - 0x30;
        b[i / 2] = Convert.ToByte(topChar + bottomChar);
    }
    return b;
}
7
votes

Safe versions:

public static class HexHelper
{
    [System.Diagnostics.Contracts.Pure]
    public static string ToHex(this byte[] value)
    {
        if (value == null)
            throw new ArgumentNullException("value");

        const string hexAlphabet = @"0123456789ABCDEF";

        var chars = new char[checked(value.Length * 2)];
        unchecked
        {
            for (int i = 0; i < value.Length; i++)
            {
                chars[i * 2] = hexAlphabet[value[i] >> 4];
                chars[i * 2 + 1] = hexAlphabet[value[i] & 0xF];
            }
        }
        return new string(chars);
    }

    [System.Diagnostics.Contracts.Pure]
    public static byte[] FromHex(this string value)
    {
        if (value == null)
            throw new ArgumentNullException("value");
        if (value.Length % 2 != 0)
            throw new ArgumentException("Hexadecimal value length must be even.", "value");

        unchecked
        {
            byte[] result = new byte[value.Length / 2];
            for (int i = 0; i < result.Length; i++)
            {
                // 0(48) - 9(57) -> 0 - 9
                // A(65) - F(70) -> 10 - 15
                int b = value[i * 2]; // High 4 bits.
                int val = ((b - '0') + ((('9' - b) >> 31) & -7)) << 4;
                b = value[i * 2 + 1]; // Low 4 bits.
                val += (b - '0') + ((('9' - b) >> 31) & -7);
                result[i] = checked((byte)val);
            }
            return result;
        }
    }
}

Unsafe versions For those who prefer performance and do not afraid of unsafeness. About 35% faster ToHex and 10% faster FromHex.

public static class HexUnsafeHelper
{
    [System.Diagnostics.Contracts.Pure]
    public static unsafe string ToHex(this byte[] value)
    {
        if (value == null)
            throw new ArgumentNullException("value");

        const string alphabet = @"0123456789ABCDEF";

        string result = new string(' ', checked(value.Length * 2));
        fixed (char* alphabetPtr = alphabet)
        fixed (char* resultPtr = result)
        {
            char* ptr = resultPtr;
            unchecked
            {
                for (int i = 0; i < value.Length; i++)
                {
                    *ptr++ = *(alphabetPtr + (value[i] >> 4));
                    *ptr++ = *(alphabetPtr + (value[i] & 0xF));
                }
            }
        }
        return result;
    }

    [System.Diagnostics.Contracts.Pure]
    public static unsafe byte[] FromHex(this string value)
    {
        if (value == null)
            throw new ArgumentNullException("value");
        if (value.Length % 2 != 0)
            throw new ArgumentException("Hexadecimal value length must be even.", "value");

        unchecked
        {
            byte[] result = new byte[value.Length / 2];
            fixed (char* valuePtr = value)
            {
                char* valPtr = valuePtr;
                for (int i = 0; i < result.Length; i++)
                {
                    // 0(48) - 9(57) -> 0 - 9
                    // A(65) - F(70) -> 10 - 15
                    int b = *valPtr++; // High 4 bits.
                    int val = ((b - '0') + ((('9' - b) >> 31) & -7)) << 4;
                    b = *valPtr++; // Low 4 bits.
                    val += (b - '0') + ((('9' - b) >> 31) & -7);
                    result[i] = checked((byte)val);
                }
            }
            return result;
        }
    }
}

BTW For benchmark testing initializing alphabet every time convert function called is wrong, alphabet must be const (for string) or static readonly (for char[]). Then alphabet-based conversion of byte[] to string becomes as fast as byte manipulation versions.

And of course test must be compiled in Release (with optimization) and with debug option "Suppress JIT optimization" turned off (same for "Enable Just My Code" if code must be debuggable).

7
votes

Not to pile on to the many answers here, but I found a fairly optimal (~4.5x better than accepted), straightforward implementation of the hex string parser. First, output from my tests (the first batch is my implementation):

Give me that string:
04c63f7842740c77e545bb0b2ade90b384f119f6ab57b680b7aa575a2f40939f

Time to parse 100,000 times: 50.4192 ms
Result as base64: BMY/eEJ0DHflRbsLKt6Qs4TxGfarV7aAt6pXWi9Ak58=
BitConverter'd: 04-C6-3F-78-42-74-0C-77-E5-45-BB-0B-2A-DE-90-B3-84-F1-19-F6-AB-5
7-B6-80-B7-AA-57-5A-2F-40-93-9F

Accepted answer: (StringToByteArray)
Time to parse 100000 times: 233.1264ms
Result as base64: BMY/eEJ0DHflRbsLKt6Qs4TxGfarV7aAt6pXWi9Ak58=
BitConverter'd: 04-C6-3F-78-42-74-0C-77-E5-45-BB-0B-2A-DE-90-B3-84-F1-19-F6-AB-5
7-B6-80-B7-AA-57-5A-2F-40-93-9F

With Mono's implementation:
Time to parse 100000 times: 777.2544ms
Result as base64: BMY/eEJ0DHflRbsLKt6Qs4TxGfarV7aAt6pXWi9Ak58=
BitConverter'd: 04-C6-3F-78-42-74-0C-77-E5-45-BB-0B-2A-DE-90-B3-84-F1-19-F6-AB-5
7-B6-80-B7-AA-57-5A-2F-40-93-9F

With SoapHexBinary:
Time to parse 100000 times: 845.1456ms
Result as base64: BMY/eEJ0DHflRbsLKt6Qs4TxGfarV7aAt6pXWi9Ak58=
BitConverter'd: 04-C6-3F-78-42-74-0C-77-E5-45-BB-0B-2A-DE-90-B3-84-F1-19-F6-AB-5
7-B6-80-B7-AA-57-5A-2F-40-93-9F

The base64 and 'BitConverter'd' lines are there to test for correctness. Note that they are equal.

The implementation:

public static byte[] ToByteArrayFromHex(string hexString)
{
  if (hexString.Length % 2 != 0) throw new ArgumentException("String must have an even length");
  var array = new byte[hexString.Length / 2];
  for (int i = 0; i < hexString.Length; i += 2)
  {
    array[i/2] = ByteFromTwoChars(hexString[i], hexString[i + 1]);
  }
  return array;
}

private static byte ByteFromTwoChars(char p, char p_2)
{
  byte ret;
  if (p <= '9' && p >= '0')
  {
    ret = (byte) ((p - '0') << 4);
  }
  else if (p <= 'f' && p >= 'a')
  {
    ret = (byte) ((p - 'a' + 10) << 4);
  }
  else if (p <= 'F' && p >= 'A')
  {
    ret = (byte) ((p - 'A' + 10) << 4);
  } else throw new ArgumentException("Char is not a hex digit: " + p,"p");

  if (p_2 <= '9' && p_2 >= '0')
  {
    ret |= (byte) ((p_2 - '0'));
  }
  else if (p_2 <= 'f' && p_2 >= 'a')
  {
    ret |= (byte) ((p_2 - 'a' + 10));
  }
  else if (p_2 <= 'F' && p_2 >= 'A')
  {
    ret |= (byte) ((p_2 - 'A' + 10));
  } else throw new ArgumentException("Char is not a hex digit: " + p_2, "p_2");

  return ret;
}

I tried some stuff with unsafe and moving the (clearly redundant) character-to-nibble if sequence to another method, but this was the fastest it got.

(I concede that this answers half the question. I felt that the string->byte[] conversion was underrepresented, while the byte[]->string angle seems to be well covered. Thus, this answer.)

6
votes

Inverse function for Waleed Eissa code (Hex String To Byte Array):

    public static byte[] HexToBytes(this string hexString)        
    {
        byte[] b = new byte[hexString.Length / 2];            
        char c;
        for (int i = 0; i < hexString.Length / 2; i++)
        {
            c = hexString[i * 2];
            b[i] = (byte)((c < 0x40 ? c - 0x30 : (c < 0x47 ? c - 0x37 : c - 0x57)) << 4);
            c = hexString[i * 2 + 1];
            b[i] += (byte)(c < 0x40 ? c - 0x30 : (c < 0x47 ? c - 0x37 : c - 0x57));
        }

        return b;
    }

Waleed Eissa function with lower case support:

    public static string BytesToHex(this byte[] barray, bool toLowerCase = true)
    {
        byte addByte = 0x37;
        if (toLowerCase) addByte = 0x57;
        char[] c = new char[barray.Length * 2];
        byte b;
        for (int i = 0; i < barray.Length; ++i)
        {
            b = ((byte)(barray[i] >> 4));
            c[i * 2] = (char)(b > 9 ? b + addByte : b + 0x30);
            b = ((byte)(barray[i] & 0xF));
            c[i * 2 + 1] = (char)(b > 9 ? b + addByte : b + 0x30);
        }

        return new string(c);
    }
5
votes

Extension methods (disclaimer: completely untested code, BTW...):

public static class ByteExtensions
{
    public static string ToHexString(this byte[] ba)
    {
        StringBuilder hex = new StringBuilder(ba.Length * 2);

        foreach (byte b in ba)
        {
            hex.AppendFormat("{0:x2}", b);
        }
        return hex.ToString();
    }
}

etc.. Use either of Tomalak's three solutions (with the last one being an extension method on a string).

5
votes

Fastest method for old school people... miss you pointers

    static public byte[] HexStrToByteArray(string str)
    {
        byte[] res = new byte[(str.Length % 2 != 0 ? 0 : str.Length / 2)]; //check and allocate memory
        for (int i = 0, j = 0; j < res.Length; i += 2, j++) //convert loop
            res[j] = (byte)((str[i] % 32 + 9) % 25 * 16 + (str[i + 1] % 32 + 9) % 25);
        return res;
    }
4
votes

From Microsoft's developers, a nice, simple conversion:

public static string ByteArrayToString(byte[] ba) 
{
    // Concatenate the bytes into one long string
    return ba.Aggregate(new StringBuilder(32),
                            (sb, b) => sb.Append(b.ToString("X2"))
                            ).ToString();
}

While the above is clean and compact, performance junkies will scream about it using enumerators. You can get peak performance with an improved version of Tomalak's original answer:

public static string ByteArrayToString(byte[] ba)   
{   
   StringBuilder hex = new StringBuilder(ba.Length * 2);   

   for(int i=0; i < ba.Length; i++)       // <-- Use for loop is faster than foreach   
       hex.Append(ba[i].ToString("X2"));   // <-- ToString is faster than AppendFormat   

   return hex.ToString();   
} 

This is the fastest of all the routines I've seen posted here so far. Don't just take my word for it... performance test each routine and inspect its CIL code for yourself.

4
votes

.NET 5 has added the Convert.ToHexString method.

For those using an older version of .NET

internal static class ByteArrayExtensions
{
    
    public static string ToHexString(this byte[] bytes, Casing casing = Casing.Upper)
    {
        Span<char> result = stackalloc char[0];
        if (bytes.Length > 16)
        {
            var array = new char[bytes.Length * 2];
            result = array.AsSpan();
        }
        else
        {
            result = stackalloc char[bytes.Length * 2];
        }

        int pos = 0;
        foreach (byte b in bytes)
        {
            ToCharsBuffer(b, result, pos, casing);
            pos += 2;
        }

        return result.ToString();
    }

    private static void ToCharsBuffer(byte value, Span<char> buffer, int startingIndex = 0, Casing casing = Casing.Upper)
    {
        uint difference = (((uint)value & 0xF0U) << 4) + ((uint)value & 0x0FU) - 0x8989U;
        uint packedResult = ((((uint)(-(int)difference) & 0x7070U) >> 4) + difference + 0xB9B9U) | (uint)casing;

        buffer[startingIndex + 1] = (char)(packedResult & 0xFF);
        buffer[startingIndex] = (char)(packedResult >> 8);
    }
}

public enum Casing : uint
{
    // Output [ '0' .. '9' ] and [ 'A' .. 'F' ].
    Upper = 0,

    // Output [ '0' .. '9' ] and [ 'a' .. 'f' ].
    Lower = 0x2020U,
}

Adapted from the .NET repository https://github.com/dotnet/runtime/blob/v5.0.3/src/libraries/System.Private.CoreLib/src/System/Convert.cs https://github.com/dotnet/runtime/blob/v5.0.3/src/libraries/Common/src/System/HexConverter.cs

3
votes

I'll enter this bit fiddling competition as I have an answer that also uses bit-fiddling to decode hexadecimals. Note that using character arrays may be even faster as calling StringBuilder methods will take time as well.

public static String ToHex (byte[] data)
{
    int dataLength = data.Length;
    // pre-create the stringbuilder using the length of the data * 2, precisely enough
    StringBuilder sb = new StringBuilder (dataLength * 2);
    for (int i = 0; i < dataLength; i++) {
        int b = data [i];

        // check using calculation over bits to see if first tuple is a letter
        // isLetter is zero if it is a digit, 1 if it is a letter
        int isLetter = (b >> 7) & ((b >> 6) | (b >> 5)) & 1;

        // calculate the code using a multiplication to make up the difference between
        // a digit character and an alphanumerical character
        int code = '0' + ((b >> 4) & 0xF) + isLetter * ('A' - '9' - 1);
        // now append the result, after casting the code point to a character
        sb.Append ((Char)code);

        // do the same with the lower (less significant) tuple
        isLetter = (b >> 3) & ((b >> 2) | (b >> 1)) & 1;
        code = '0' + (b & 0xF) + isLetter * ('A' - '9' - 1);
        sb.Append ((Char)code);
    }
    return sb.ToString ();
}

public static byte[] FromHex (String hex)
{

    // pre-create the array
    int resultLength = hex.Length / 2;
    byte[] result = new byte[resultLength];
    // set validity = 0 (0 = valid, anything else is not valid)
    int validity = 0;
    int c, isLetter, value, validDigitStruct, validDigit, validLetterStruct, validLetter;
    for (int i = 0, hexOffset = 0; i < resultLength; i++, hexOffset += 2) {
        c = hex [hexOffset];

        // check using calculation over bits to see if first char is a letter
        // isLetter is zero if it is a digit, 1 if it is a letter (upper & lowercase)
        isLetter = (c >> 6) & 1;

        // calculate the tuple value using a multiplication to make up the difference between
        // a digit character and an alphanumerical character
        // minus 1 for the fact that the letters are not zero based
        value = ((c & 0xF) + isLetter * (-1 + 10)) << 4;

        // check validity of all the other bits
        validity |= c >> 7; // changed to >>, maybe not OK, use UInt?

        validDigitStruct = (c & 0x30) ^ 0x30;
        validDigit = ((c & 0x8) >> 3) * (c & 0x6);
        validity |= (isLetter ^ 1) * (validDigitStruct | validDigit);

        validLetterStruct = c & 0x18;
        validLetter = (((c - 1) & 0x4) >> 2) * ((c - 1) & 0x2);
        validity |= isLetter * (validLetterStruct | validLetter);

        // do the same with the lower (less significant) tuple
        c = hex [hexOffset + 1];
        isLetter = (c >> 6) & 1;
        value ^= (c & 0xF) + isLetter * (-1 + 10);
        result [i] = (byte)value;

        // check validity of all the other bits
        validity |= c >> 7; // changed to >>, maybe not OK, use UInt?

        validDigitStruct = (c & 0x30) ^ 0x30;
        validDigit = ((c & 0x8) >> 3) * (c & 0x6);
        validity |= (isLetter ^ 1) * (validDigitStruct | validDigit);

        validLetterStruct = c & 0x18;
        validLetter = (((c - 1) & 0x4) >> 2) * ((c - 1) & 0x2);
        validity |= isLetter * (validLetterStruct | validLetter);
    }

    if (validity != 0) {
        throw new ArgumentException ("Hexadecimal encoding incorrect for input " + hex);
    }

    return result;
}

Converted from Java code.

2
votes

And for inserting into an SQL string (if you're not using command parameters):

public static String ByteArrayToSQLHexString(byte[] Source)
{
    return = "0x" + BitConverter.ToString(Source).Replace("-", "");
}
2
votes

In terms of speed, this seems to be better than anything here:

  public static string ToHexString(byte[] data) {
    byte b;
    int i, j, k;
    int l = data.Length;
    char[] r = new char[l * 2];
    for (i = 0, j = 0; i < l; ++i) {
      b = data[i];
      k = b >> 4;
      r[j++] = (char)(k > 9 ? k + 0x37 : k + 0x30);
      k = b & 15;
      r[j++] = (char)(k > 9 ? k + 0x37 : k + 0x30);
    }
    return new string(r);
  }
2
votes

I did not get the code you suggested to work, Olipro. hex[i] + hex[i+1] apparently returned an int.

I did, however have some success by taking some hints from Waleeds code and hammering this together. It's ugly as hell but it seems to work and performs at 1/3 of the time compared to the others according to my tests (using patridges testing mechanism). Depending on input size. Switching around the ?:s to separate out 0-9 first would probably yield a slightly faster result since there are more numbers than letters.

public static byte[] StringToByteArray2(string hex)
{
    byte[] bytes = new byte[hex.Length/2];
    int bl = bytes.Length;
    for (int i = 0; i < bl; ++i)
    {
        bytes[i] = (byte)((hex[2 * i] > 'F' ? hex[2 * i] - 0x57 : hex[2 * i] > '9' ? hex[2 * i] - 0x37 : hex[2 * i] - 0x30) << 4);
        bytes[i] |= (byte)(hex[2 * i + 1] > 'F' ? hex[2 * i + 1] - 0x57 : hex[2 * i + 1] > '9' ? hex[2 * i + 1] - 0x37 : hex[2 * i + 1] - 0x30);
    }
    return bytes;
}
2
votes

This version of ByteArrayToHexViaByteManipulation could be faster.

From my reports:

  • ByteArrayToHexViaByteManipulation3: 1,68 average ticks (over 1000 runs), 17,5X
  • ByteArrayToHexViaByteManipulation2: 1,73 average ticks (over 1000 runs), 16,9X
  • ByteArrayToHexViaByteManipulation: 2,90 average ticks (over 1000 runs), 10,1X
  • ByteArrayToHexViaLookupAndShift: 3,22 average ticks (over 1000 runs), 9,1X
  • ...

    static private readonly char[] hexAlphabet = new char[]
        {'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'};
    static string ByteArrayToHexViaByteManipulation3(byte[] bytes)
    {
        char[] c = new char[bytes.Length * 2];
        byte b;
        for (int i = 0; i < bytes.Length; i++)
        {
            b = ((byte)(bytes[i] >> 4));
            c[i * 2] = hexAlphabet[b];
            b = ((byte)(bytes[i] & 0xF));
            c[i * 2 + 1] = hexAlphabet[b];
        }
        return new string(c);
    }
    

And I think this one is an optimization:

    static private readonly char[] hexAlphabet = new char[]
        {'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'};
    static string ByteArrayToHexViaByteManipulation4(byte[] bytes)
    {
        char[] c = new char[bytes.Length * 2];
        for (int i = 0, ptr = 0; i < bytes.Length; i++, ptr += 2)
        {
            byte b = bytes[i];
            c[ptr] = hexAlphabet[b >> 4];
            c[ptr + 1] = hexAlphabet[b & 0xF];
        }
        return new string(c);
    }
2
votes

Another way is by using stackalloc to reduce GC memory pressure:

static string ByteToHexBitFiddle(byte[] bytes)
{
        var c = stackalloc char[bytes.Length * 2 + 1];
        int b; 
        for (int i = 0; i < bytes.Length; ++i)
        {
            b = bytes[i] >> 4;
            c[i * 2] = (char)(55 + b + (((b - 10) >> 31) & -7));
            b = bytes[i] & 0xF;
            c[i * 2 + 1] = (char)(55 + b + (((b - 10) >> 31) & -7));
        }
        c[bytes.Length * 2 ] = '\0';
        return new string(c);
}
2
votes

For performance I would go with drphrozens solution. A tiny optimization for the decoder could be to use a table for either char to get rid of the "<< 4".

Clearly the two method calls are costly. If some kind of check is made either on input or output data (could be CRC, checksum or whatever) the if (b == 255)... could be skipped and thereby also the method calls altogether.

Using offset++ and offset instead of offset and offset + 1 might give some theoretical benefit but I suspect the compiler handles this better than me.

private static readonly byte[] LookupTableLow = new byte[] {
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF
};

private static readonly byte[] LookupTableHigh = new byte[] {
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0x00, 0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80, 0x90, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xA0, 0xB0, 0xC0, 0xD0, 0xE0, 0xF0, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xA0, 0xB0, 0xC0, 0xD0, 0xE0, 0xF0, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF
};

private static byte LookupLow(char c)
{
  var b = LookupTableLow[c];
  if (b == 255)
    throw new IOException("Expected a hex character, got " + c);
  return b;
}

private static byte LookupHigh(char c)
{
  var b = LookupTableHigh[c];
  if (b == 255)
    throw new IOException("Expected a hex character, got " + c);
  return b;
}

public static byte ToByte(char[] chars, int offset)
{
  return (byte)(LookupHigh(chars[offset++]) | LookupLow(chars[offset]));
}

This is just off the top of my head and has not been tested or benchmarked.

2
votes

Shortest way and .net core supported:

    public static string BytesToString(byte[] ba) =>
        ba.Aggregate(new StringBuilder(32), (sb, b) => sb.Append(b.ToString("X2"))).ToString();
2
votes

There is a simple one-liner solution not yet mentioned that will convert hex strings into byte arrays (we don't care about negative interpretation here as it does not matter):

BigInteger.Parse(str, System.Globalization.NumberStyles.HexNumber).ToByteArray().Reverse().ToArray();