Unified memory (i.e. managed memory) is available for dynamically allocated memory. Hence if you change the code to dynamically allocate the arrays and compile with the flag "-gpu=managed", the driver will handle the data movement of the arrays.
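As a rough sketch of the managed-memory route, the snippet below heap-allocates an array and runs an OpenACC loop over it with no data clauses at all. The function name scale_and_sum and its arguments are made up for illustration, and it assumes a build along the lines of "nvc++ -acc -gpu=managed"; without the managed flag the compiler would need explicit data clauses for the pointer.

```cpp
// Hypothetical helper (not from the original code): fills a heap-allocated
// array with 0, 1, 2, ..., scales it on the device, and returns the sum.
double scale_and_sum(int n, double factor) {
    // Dynamic allocation: this is the kind of memory that managed mode covers.
    double *values = new double[n];
    for (int i = 0; i < n; ++i) {
        values[i] = static_cast<double>(i);
    }

    double sum = 0.0;
    // No copyin/copyout clauses: with -gpu=managed the driver is expected to
    // migrate the pages of "values" between host and device on demand.
    #pragma acc parallel loop reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        values[i] *= factor;
        sum += values[i];
    }

    delete[] values;
    return sum;
}
```

Calling scale_and_sum(4, 2.0) sums 0, 2, 4 and 6. A compiler without OpenACC support simply ignores the pragma and runs the loop on the host, which is a handy way to check the logic before moving to the GPU.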
OpenACC data regions perform a shallow copy. Hence for an aggregate data type with dynamic data members, you need to do a manual deep copy, as shown below:
% cat grades.cpp
#include <iostream>
#include <string>
using namespace std;

class Assignment{
public:
    std::string name; //name of assignment
    double *grades; // list of grades where n is the number of students
    Assignment() {};
    Assignment(string name_, double *grades_) {
        name = name_;
        grades = grades_;
    }
};

int main(){
    Assignment assignments[2]; // array of 2 assignments
    double grades1[3] = {90.0, 95.0, 75.0};
    assignments[0] = Assignment("Assignment1", grades1);
    double grades2[3] = {50.0, 65.0, 55.0};
    assignments[1] = Assignment("Assignment2", grades2);

    //Transfer to GPU below...
    #pragma acc enter data copyin(assignments[:2])
    for (int i=0; i < 2; ++i) {
        #pragma acc enter data copyin(assignments[i].grades[:3])
    }

    // changes the grades on the device
    #pragma acc parallel loop present(assignments)
    for (int i=0; i < 2; ++i) {
        for (int j=0; j < 3; ++j) {
            assignments[i].grades[j] += 5.0;
        }
    }

    for (int i=0; i < 2; ++i) {
        #pragma acc update self(assignments[i].grades[:3])
        std::cout << i << " : " << assignments[i].grades[0]
                  << ", " << assignments[i].grades[1]
                  << ", " << assignments[i].grades[2]
                  << std::endl;
    }

    for (int i=0; i < 2; ++i) {
        #pragma acc exit data delete(assignments[i].grades)
    }
    #pragma acc exit data delete(assignments)
}
% nvc++ grades.cpp -w -Minfo=accel -acc; a.out
main:
26, Generating enter data copyin(assignments[:])
32, Generating enter data copyin(assignments.grades[:3])
Generating present(assignments[:])
Generating Tesla code
35, #pragma acc loop gang /* blockIdx.x */
36, #pragma acc loop seq
36, Loop is parallelizable
43, Generating update self(assignments.grades[:3])
52, Generating exit data delete(assignments.grades)
55, Generating exit data delete(assignments[:])
0 : 95, 100, 80
1 : 55, 70, 60
I personally prefer to encapsulate the data management in the class itself. It's a bit trickier here since you have an array of classes where each element is a copy of an anonymous object. It may or may not be useful, but you can take a look at a simple vector-like class, "accList", that I wrote as an example for my chapter on data management in the book Parallel Programming with OpenACC. See: https://github.com/rmfarber/ParallelProgrammingWithOpenACC/tree/master/Chapter05
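To give a flavor of what encapsulating the data management can look like, here is a minimal sketch of a class that owns its grades array and keeps the device copy in step with its own lifetime. This is not the accList code from the book; the class name GradeList and all of its members are hypothetical, and copying is disabled to sidestep the copy-of-an-anonymous-object problem mentioned above.

```cpp
// Hypothetical class for illustration only (not the book's accList).
class GradeList {
public:
    explicit GradeList(int n) : n_(n), data_(new double[n]()) {
        // Shallow-copy the object first, then create the member array on the
        // device; the runtime attaches it to the device copy of data_.
        #pragma acc enter data copyin(this)
        #pragma acc enter data create(data_[0:n_])
    }

    ~GradeList() {
        // Tear down in reverse order: member array first, then the object.
        #pragma acc exit data delete(data_[0:n_])
        #pragma acc exit data delete(this)
        delete[] data_;
    }

    // Copying would leave two host objects sharing one device mapping.
    GradeList(const GradeList &) = delete;
    GradeList &operator=(const GradeList &) = delete;

    double &operator[](int i) { return data_[i]; }
    int size() const { return n_; }

    // Push host values to the device copy.
    void updateDevice() {
        #pragma acc update device(data_[0:n_])
    }

    // Pull device values back to the host copy.
    void updateHost() {
        #pragma acc update self(data_[0:n_])
    }

    // Adds delta to every grade; expects the data to already be on the device.
    void addToAll(double delta) {
        #pragma acc parallel loop default(present)
        for (int i = 0; i < n_; ++i) {
            data_[i] += delta;
        }
    }

private:
    int n_;
    double *data_;
};
```

The intended usage is to fill the list on the host, call updateDevice(), run addToAll(5.0), and then call updateHost() before reading the values back. The caller never writes a data directive; the constructor and destructor do the deep copy and cleanup that the example above performs by hand.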