3
votes

I store my data in a char array, and I need to read float and int variables from there. This code works fine on CPU:

global float *p;
p = (global float*)get_pointer_to_the_field(char_array, index);
*p += 10;

But on the GPU I get error -5: CL_OUT_OF_RESOURCES. Reading through the pointer works, but modifying the value in place (adding 10 in this case) causes the error. How can I fix it?

Update:

This works on GPU:

float f = *p;
f += 10;

However, I still can't write this value back to the array.

Here is the kernel:

// Helper functions return plain void; the address space qualifier belongs on pointers only.
void write_value(global char *data, int tuple_pos, global char *field_value, 
                    int which_field, global int offsets[], global int *num_of_attributes) {

    int tuple_size = offsets[*num_of_attributes];
    global char *offset = data + tuple_pos * tuple_size;
    offset += offsets[which_field];

    memcpy(offset, field_value, (offsets[which_field+1] - offsets[which_field]));
}

global char *read_value(global char *data, int tuple_pos, 
                    int which_field, global int offsets[], global int *num_of_attributes) {
    int tuple_size = offsets[*num_of_attributes];
    global char *offset = data + tuple_pos * tuple_size;
    offset += offsets[which_field];
    return offset;
}

kernel void update_single_value(global char* input_data, global int* pos, global int offsets[], 
                            global int *num_of_attributes, global char* types) {
    int g_id = get_global_id(1);
    int attr_id = get_global_id(0);
    int index = pos[g_id];

    if (types[attr_id] == 'f') { // if float

        global float *p;
        p = (global float*)read_value(input_data, index, attr_id, offsets, num_of_attributes);
        float f = *p;
        f += 10;
        //*p += 10; // not working on GPU
    } 
    else if (types[attr_id] == 'i') { // if int
        global int *p;
        p = (global int*)read_value(input_data, index, attr_id, offsets, num_of_attributes);
        int i = *p;
        i += 10;
        //*p += 10;
    }
    else { // if char
        write_value(input_data, index, read_value(input_data, index, attr_id, offsets, num_of_attributes), attr_id, offsets, num_of_attributes);
    }
}

It updates the values of a table's tuples: int and float fields are increased by 10, and char fields are just replaced with the same content.

2
If you're still having this problem, you should probably post more comprehensive source code. - pmdj
Added the complete kernel's code. - lawful_neutral
What's the alignment situation of this data? Are float and int items aligned to 4-byte boundaries? If not, this could be the source of the problem. - pmdj
I just quickly read about alignment, and I think these variables are not aligned. An integer variable can be stored in the array right after a 9-byte char field, and the size of a tuple % 4 != 0. Could you please explain how that can be a problem? - lawful_neutral
@pmdj I checked it, and it's actually due to the alignment. Thank you for the suggestion, I would never have thought of it. - lawful_neutral

2 Answers

1
votes

Are you enabling the cl_khr_byte_addressable_store extension? As far as I'm aware, bytewise writes to global memory aren't well-defined in OpenCL unless you enable this. (You'll need to check if the extension is supported by your implementation.)
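To illustrate what "going through bytes" could look like, here is a minimal host-side sketch in plain C rather than kernel code. The name add_ten_float_at is made up for this example. It copies the raw bytes of the float into a local variable, updates it, and copies the bytes back, so no float pointer is ever formed at an arbitrary address. Inside a kernel the equivalent would presumably be a byte loop over a global char pointer, which is exactly the kind of write the extension is about; I haven't verified that on a GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: add 10 to the float stored at any byte offset in buf.
 * The value is copied as raw bytes into an aligned local and back again,
 * so a float* never points at a possibly misaligned address. */
void add_ten_float_at(unsigned char *buf, size_t byte_offset)
{
    float tmp;
    assert(buf != NULL);
    memcpy(&tmp, buf + byte_offset, sizeof tmp);  /* bytes -> aligned local */
    tmp += 10.0f;
    memcpy(buf + byte_offset, &tmp, sizeof tmp);  /* aligned local -> bytes */
}
```

Calling add_ten_float_at(buf, 3) updates a float that starts at byte 3 of the buffer, which is the layout the direct pointer cast cannot be relied on to handle.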

You might also want to consider using the "correct" type in the kernel argument - this might help the compiler produce more efficient code. If the type can vary dynamically, you could perhaps try using a union type (or union fields in a struct type), although I haven't tested this with OpenCL myself.
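A rough sketch of the union idea, again as plain host-side C and untested in OpenCL: field_slot and add_ten are hypothetical names, and the sketch assumes float and int are both 4 bytes, as they are in OpenCL C. The type tags 'f' and 'i' are the ones from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 4-byte slot that can hold either numeric field type. */
typedef union {
    float f;
    int   i;
} field_slot;

/* Assumption: float and int are 4 bytes, so every array element
 * starts on a 4-byte boundary relative to the start of the array. */
_Static_assert(sizeof(field_slot) == 4, "slot must be exactly 4 bytes");

/* Add 10 to slot k; type is 'f' for float or 'i' for int. */
void add_ten(field_slot *slots, size_t k, char type)
{
    assert(type == 'f' || type == 'i');
    if (type == 'f')
        slots[k].f += 10.0f;
    else
        slots[k].i += 10;
}
```

With the numeric fields stored this way, the kernel argument could be a pointer to the slot type instead of a char pointer, and the cast disappears. Variable-length char fields would still need their own storage.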

0
votes

It turned out that the problem occurs because the int and float values in the char array aren't 4-byte aligned. When I write to addresses like

offset = data + tuple_pos*4; // or 8, 16 etc

everything works fine. However, the following causes the error:

offset = data + tuple_pos*3; // or any other number not divisible by 4

This means that I should either change the whole design and store the values some other way, or add "empty" padding bytes to the char array so that the int and float values are 4-byte aligned (which isn't a really good solution).
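For the padding option, the offsets table could be computed on the host roughly like this. It is only a sketch in plain C: align_up and padded_offsets are names invented here, and it assumes the buffer itself starts on a 4-byte boundary and that the type tags are the 'f' and 'i' characters from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Fill offsets[0..count] for fields with the given sizes and type tags
 * ('f' float, 'i' int, anything else is char data). Numeric fields are
 * moved up to 4-byte boundaries; offsets[count] is the padded tuple size,
 * so every tuple in the array also starts on a 4-byte boundary. */
void padded_offsets(const size_t *sizes, const char *types,
                    size_t count, size_t *offsets)
{
    size_t pos = 0;
    for (size_t k = 0; k < count; ++k) {
        if (types[k] == 'f' || types[k] == 'i')
            pos = align_up(pos, 4);   /* skip padding bytes before numeric fields */
        offsets[k] = pos;
        pos += sizes[k];
    }
    offsets[count] = align_up(pos, 4);
}
```

For a tuple with a 9-byte char field followed by an int and a float, this gives offsets 0, 12, 16 and a tuple size of 20 instead of 17. One catch: offsets[k+1] - offsets[k] now includes the padding, so code that uses that difference as the field length (like write_value in the question) would copy the padding bytes too.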