84
votes

I need a script/tool which reads a binary file and outputs a C/C++ source code array (which represents the file content). Are there any?


(This question was deleted earlier. I put it back because it is valuable. I was searching for exactly this on Google and didn't find anything. Of course it is trivial to code myself, but I would have saved some minutes had I found such a simple script. Thus it is valuable.

This question also had a lot of down-votes without much explanation. Before you down-vote, please comment on why you think it has no value or bad value.

This question also caused a lot of confusion about what I am asking about. If something is unclear, please ask. I don't really know how to make it more clear. See the answers for examples.

Also (after putting the question here), I already have several answers. I just want to put/link them here (again) because I think it might be useful for someone else searching for this.)

7
People probably thought you wanted some kind of decompiler or something like that. You could rephrase it as "reads a binary file and outputs a C/C++ declaration of an array initialized to the content of the file" or similar. - Matteo Italia

7 Answers

150
votes

On Debian and other Linux distros, the xxd tool is installed by default (along with vim). Given the -i option, it does what you want:

matteo@teodeb:~/Desktop$ echo Hello World\! > temp
matteo@teodeb:~/Desktop$ xxd -i temp 
unsigned char temp[] = {
  0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x57, 0x6f, 0x72, 0x6c, 0x64, 0x21,
  0x0a
};
unsigned int temp_len = 13;
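
If xxd is not available, the same output shape can be approximated in a few lines of Python. This is only a sketch that mimics the format shown above (12 bytes per line, lowercase hex, a trailing `_len` variable); the function name `xxd_i` is my own and is not part of xxd.

```python
def xxd_i(name, data, per_line=12):
    """Format `data` (bytes) roughly the way `xxd -i` prints it."""
    rows = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        # Two-space indent, values separated by ", " as in the xxd output.
        rows.append("  " + ", ".join("0x%02x" % b for b in chunk))
    # Rows are joined with ",\n" so the last value has no trailing comma.
    body = ",\n".join(rows)
    return ("unsigned char %s[] = {\n%s\n};\n"
            "unsigned int %s_len = %d;\n" % (name, body, name, len(data)))


print(xxd_i("temp", b"Hello World!\n"), end="")
```

The array name is passed in explicitly here, so choose one that is a valid C identifier.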
8
votes

The accepted answer using the xxd tool is nice if you are on a *nix-like system. Here is a "one-liner" for any system that has a Python 3 executable on the path:

python -c "import sys;a=sys.argv;open(a[2],'wb').write(('const unsigned char '+a[3]+'[] = {'+','.join([hex(b) for b in open(a[1],'rb').read()])+'};').encode('utf-8'))" <binary file> <header file> <array name>

<binary file> is the name of the file you want to turn into a C header, <header file> is the name of the header file to write, and <array name> is the name you want the array to have.

The above one-line Python command does approximately the same as the following (much more readable) Python program:

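import sys

# Usage: <script> <binary file> <header file> <array name>
# Requires Python 3.5+ (bytes %-formatting).
with open(sys.argv[1], 'rb') as in_file, open(sys.argv[2], 'wb') as result_file:
  # unsigned char, as in the one-liner, so bytes >= 0x80 are valid initializers
  result_file.write(b'const unsigned char %s[] = {' % sys.argv[3].encode('utf-8'))
  for b in in_file.read():
    result_file.write(b'0x%02X,' % b)
  result_file.write(b'};')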
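
A slightly longer variant of the script above, as a sketch under my own assumptions: it reads the input one row at a time instead of all at once, wraps the output lines, and also emits a length constant. The function name `write_header`, the `_len` suffix, and the 16-bytes-per-line default are illustrative choices, not something the original script does.

```python
def write_header(src_path, dst_path, name, per_line=16):
    """Write the bytes of src_path to dst_path as a C array called `name`.

    Returns the number of bytes converted.
    """
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "w", encoding="ascii") as dst:
        dst.write("const unsigned char %s[] = {\n" % name)
        # Read one output row at a time so a large input is never fully in memory.
        for chunk in iter(lambda: src.read(per_line), b""):
            # A trailing comma after the last element is legal in C and C++.
            dst.write("  " + "".join("0x%02X," % b for b in chunk) + "\n")
            total += len(chunk)
        dst.write("};\n")
        dst.write("const unsigned long %s_len = %d;\n" % (name, total))
    return total
```

Calling `write_header("logo.png", "logo.h", "logo")` (hypothetical file names) would produce a header declaring `logo` and `logo_len`. An empty input yields an empty initializer list, which older C standards reject, so guard against that case if it can occur.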
6
votes

One simple tool can be found here:

#include <stdio.h>
#include <assert.h>

int main(int argc, char** argv) {
    assert(argc == 2);
    char* fn = argv[1];
    FILE* f = fopen(fn, "rb");
    if(f == NULL) { perror(fn); return 1; }
    // unsigned char so that bytes >= 0x80 are valid initializers
    printf("unsigned char a[] = {\n");
    unsigned long n = 0;
    while(!feof(f)) {
        unsigned char c;
        if(fread(&c, 1, 1, f) == 0) break;
        printf("0x%.2X,", (int)c);
        ++n;
        if(n % 10 == 0) printf("\n");
    }
    fclose(f);
    printf("};\n");
    return 0;
}
0
votes

This tool compiles as C in the developer command prompt. It prints to the terminal the same contents that it writes to the "array_name.c" file it creates. Note that some terminals may display the "\b" character.

    #include <stdio.h>
    #include <assert.h>

    int main(int argc, char** argv) {
    assert(argc == 2);
    char* fn = argv[1];

    // Open the input file named on the command line
    FILE* f = fopen(fn, "rb");
    // Create the output file in the current working directory
    FILE* fw = fopen("array_name.c","w");

    // Write the same declaration to the console and to the .c file
    // (unsigned char so that bytes >= 0x80 are valid initializers)
    printf("unsigned char array_name[] = {\n");
    fprintf(fw,"unsigned char array_name[] = {\n");

    // Declare long integer for number of columns in the array being made
    unsigned long n = 0;

    // Loop until end of file
    while((!feof(f))){
        // Declare character that stores the bytes from hex file
        unsigned char c;

        // Ignore failed elements read
        if(fread(&c, 1, 1, f) == 0) break;
        // Prints to console and file, "0x%.2X" ensures format for all
        // read bytes is like "0x00"
        printf("0x%.2X,", (int)c);
        fprintf(fw,"0x%.2X,", (int)c);

        // Increment counter, if 20 columns have been made, begin new line
        ++n;
        if(n % 20 == 0){
            printf("\n");
            fprintf(fw,"\n");
        }
    }

    // fseek places cursor to overwrite extra "," made from previous loop
    // this is for the new .c file. Since "\b" is technically a character
    // to remove the extra "," requires overwriting it.
    fseek(fw, -1, SEEK_CUR);

    // "\b" moves cursor back one in the terminal
    printf("\b};\n");
    fprintf(fw,"};\n");
    fclose(f);
    fclose(fw);
}
0
votes

This is Python source code for a binary-file-to-C-array generator. It behaves identically to the program in Albert's answer.

import sys
from functools import partial

if len(sys.argv) < 2:
  sys.exit('Usage: %s file' % sys.argv[0])
print("char a[] = {")
n = 0
with open(sys.argv[1], "rb") as in_file:
  for c in iter(partial(in_file.read, 1), b''):
    print("0x%02X," % ord(c), end='')
    n += 1
    if n % 16 == 0:
      print("")
print("};")
0
votes

The question is old, but let me suggest a simple tool that can be used as an alternative...

You can use a GUI-based tool called Fluid. It is actually meant for designing interfaces for the FLTK toolkit, but it can also generate an unsigned char array for C++ from a binary file. Download it from muquit.

[Image: Fluid screenshot]

0
votes

I checked all available options and decided to make my own little program to do the conversion:

https://github.com/TheLivingOne/bin2array/blob/master/bin2array.c

It works much faster than bin2c and even xxd, which matters for larger files, especially if you want to embed the conversion into your build system. E.g. for a 50 MB file on my machine:

bin2c.py > 20 sec

Simple Python scripts - about 10 sec

xxd - about 3 sec

bin2array - about 0.4 sec

Also, it produces much more compact output and adds alignment to the array, in case you want to put 32- or 64-bit values there.
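
To illustrate what "compact" and "aligned" could mean for generated output, here is a hypothetical Python sketch. It is not taken from bin2array and does not claim to match what that program emits. It assumes decimal literals without spaces (shorter than `0xNN, `) and a C++11 `alignas` specifier; the function name `compact_aligned_array` and the default alignment of 8 are my own choices.

```python
def compact_aligned_array(name, data, alignment=8):
    """Return a C++11 declaration of `data` (bytes) as an aligned byte array."""
    # Decimal values joined by bare commas take fewer characters than hex.
    body = ",".join(str(b) for b in data)
    # alignas lets the bytes be read back as 32- or 64-bit values when
    # `alignment` is at least the size of those values.
    return "alignas(%d) const unsigned char %s[%d] = {%s};\n" % (
        alignment, name, len(data), body)


print(compact_aligned_array("blob", b"\x01\x02\xff"), end="")
```

In C11, `alignas` needs `<stdalign.h>` (or spell it `_Alignas`); in C++11 and later it is a keyword.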