
I want to use a derived datatype with MPI-3 shared memory. Given the following derived datatype:

type :: pve_data
  type(pve_data), pointer :: next => NULL()
  real*8        , pointer :: array(:) => NULL()
end type pve_data

Initialized with:

type(pve_data) :: pve_grid
allocate(pve_grid%array(10))
pve_grid%array = 0.0d0

After having done some calculations with pve_grid, I want to create a shared memory window with MPI_WIN_ALLOCATE_SHARED whose base should be pve_grid. (MPI_WIN_CREATE_DYNAMIC might be an alternative, but I need the performance of shared memory.)

1) So far I have only used primitive datatypes, or arrays of them, as the base pointer for window creation. Can a derived datatype also be used as the base pointer? Or do I need to create a separate window for every component of the derived datatype that is a primitive variable?

2) Is it possible to use an already "used" variable (pve_grid in this case) as the base pointer? Or do I need to use a new pve_data as the base pointer and copy the values of pve_grid into it?

EDIT I know that it would be easier to use an OpenMP approach instead of MPI shared memory. But I deliberately want to try an MPI-only approach, to improve my MPI abilities.

EDIT2 (05.09.16) I have made some progress and was able to use shared memory where the base pointer was a simple integer variable. But I still have a problem when I want to use a derived datatype as the base pointer for the window creation (for testing purposes I changed its definition - see sharedStructMethod.f90 below). Neither compilation nor execution throws any error... the remote access simply has no effect on the derived datatype components: the WRITE shows the old values, which were initialized by the parent. The code below shows my current state. I used the possibility to spawn new processes at execution time: the parent process creates the window and the child processes modify it. I hope spawning processes does not complicate debugging; I only added it for my project. (And next time I will change real*8 to conform to the standard.)

Declaration of the derived datatype (sharedStructMethod.f90)

module sharedStructMethod

   REAL*8, PARAMETER  :: prec = 1d-13
   INTEGER, PARAMETER :: masterProc = 0
   INTEGER, PARAMETER :: SLAVE_COUNT = 2
   INTEGER, PARAMETER :: CONSTSIZE = 10

   !Struct-Definition
   type :: vertex
      INTEGER, DIMENSION(3) :: coords
   end type vertex

   type :: pve_data
      real(kind(prec)), pointer :: intensity(:) => NULL()
      logical, pointer          :: flag => NULL()
      type(vertex), pointer     :: vertices(:) => NULL()
   end type pve_data

end module sharedStructMethod

Declaration of the parent-process (sharedStruct.f90), which the user executes.

PROGRAM sharedStruct
   USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
   USE mpi
   USE sharedStructMethod
   IMPLICIT NONE
   type(pve_data) :: pve_grid
   integer        :: ierror
   integer        :: myRank, numProcs
   INTEGER        :: childComm
   INTEGER        :: childIntracomm
   integer        :: i
   INTEGER(KIND=MPI_ADDRESS_KIND) :: memSize
   INTEGER                        :: dispUnit
   TYPE(C_PTR)                    :: basePtr
   INTEGER                        :: win
   TYPE(pve_data), POINTER        :: shared_data

   call MPI_INIT(ierror)
   memSize  = sizeof(pve_grid)
   dispUnit = 1
   CALL MPI_COMM_SPAWN("sharedStructWorker.x", MPI_ARGV_NULL, SLAVE_COUNT, MPI_INFO_NULL, &
                       masterProc, MPI_COMM_SELF, childComm, MPI_ERRCODES_IGNORE, ierror)
   CALL MPI_INTERCOMM_MERGE(childComm, .false., childIntracomm, ierror)
   CALL MPI_WIN_ALLOCATE_SHARED(memSize, dispUnit, MPI_INFO_NULL, childIntracomm, &
                                basePtr, win, ierror)
   CALL C_F_POINTER(basePtr, shared_data)
   CALL MPI_WIN_LOCK(MPI_LOCK_EXCLUSIVE, masterProc, 0, win, ierror)

   allocate(shared_data%intensity(CONSTSIZE))
   allocate(shared_data%vertices(CONSTSIZE))
   allocate(shared_data%flag)
   shared_data%intensity = -1.0d0
   DO i = 1, CONSTSIZE
      shared_data%vertices(i)%coords(1) = -1
      shared_data%vertices(i)%coords(2) = -2
      shared_data%vertices(i)%coords(3) = -3
   END DO
   shared_data%flag = .true.

   CALL MPI_WIN_UNLOCK(masterProc, win, ierror)
   CALL MPI_BARRIER(childIntracomm, ierror)
   CALL MPI_BARRIER(childIntracomm, ierror)
   WRITE(*,*) "After: Flag ", shared_data%flag, " intensity(1): ", shared_data%intensity(1)
   call MPI_FINALIZE(ierror)
END PROGRAM sharedStruct

And last but not least: the child process, which is spawned by the parent at runtime and modifies the window content (sharedStructWorker.f90)

PROGRAM sharedStructWorker
   USE mpi
   USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
   USE sharedStructMethod
   IMPLICIT NONE

   INTEGER                 :: ierror
   INTEGER                 :: myRank, numProcs
   INTEGER                 :: parentComm
   INTEGER                 :: parentIntracomm
   TYPE(C_PTR)             :: pveCPtr
   TYPE(pve_data), POINTER :: pve_gridPtr
   INTEGER                 :: win
   INTEGER(KIND=MPI_ADDRESS_KIND) :: sizeOfPve
   INTEGER                 :: dispUnit2

   CALL MPI_INIT(ierror)
   CALL MPI_COMM_GET_PARENT(parentComm, ierror)
   CALL MPI_INTERCOMM_MERGE(parentComm, .true., parentIntracomm, ierror)
   sizeOfPve = 0_MPI_ADDRESS_KIND
   dispUnit2 = 1
   CALL MPI_WIN_ALLOCATE_SHARED(sizeOfPve, dispUnit2, MPI_INFO_NULL, parentIntracomm, &
                                pveCPtr, win, ierror)
   CALL MPI_WIN_SHARED_QUERY(win, masterProc, sizeOfPve, dispUnit2, pveCPtr, ierror)
   CALL C_F_POINTER(pveCPtr, pve_gridPtr)

   CALL MPI_BARRIER(parentIntracomm, ierror)
   CALL MPI_WIN_LOCK(MPI_LOCK_EXCLUSIVE, masterProc, 0, win, ierror)
   pve_gridPtr%flag = .false.
   pve_gridPtr%intensity(1) = 42
   CALL MPI_WIN_UNLOCK(masterProc, win, ierror)
   CALL MPI_BARRIER(parentIntracomm, ierror)
   CALL MPI_FINALIZE(ierror)
END PROGRAM sharedStructWorker

Compilation with:

mpiifort   -c  sharedStructMethod.f90
mpiifort   -o  sharedStructWorker.x sharedStructWorker.f90 sharedStructMethod.o
mpiifort   -o  sharedStruct.x sharedStruct.f90 sharedStructMethod.o

Is this the right approach, or do I need to create a shared memory block with its own window for every pointer component of the derived datatype pve_data? Thank you for your help!

EDIT 10/09/2016, Solution: Explanation in the comments. One way to solve the problem is to create a window for every component on which parent and children work. For more complex derived datatypes this quickly becomes tedious to implement, but it seems there is no other choice.
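For readers who land here later, a minimal sketch of that per-component approach. This is my own illustration, not the asker's spawn setup: it uses MPI_COMM_SPLIT_TYPE on MPI_COMM_WORLD instead of MPI_COMM_SPAWN, the names (intensityWin, verticesWin, nodeComm) are invented, the byte sizes assume 8-byte reals and 4-byte default integers, and the flag component would need a third window (or a slice of one of these) in the same way:

```fortran
PROGRAM perComponentWindows
   ! Sketch: one shared window per pointer component of pve_data.
   USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
   USE mpi
   USE sharedStructMethod          ! pve_data, vertex, CONSTSIZE from the question
   IMPLICIT NONE
   TYPE(pve_data) :: pve_grid
   TYPE(C_PTR)    :: intensityPtr, verticesPtr
   INTEGER        :: intensityWin, verticesWin
   INTEGER        :: nodeComm, myRank, dispUnit, ierror
   INTEGER(KIND=MPI_ADDRESS_KIND) :: intensitySize, verticesSize

   CALL MPI_INIT(ierror)
   ! Group the ranks that can actually share memory (same node).
   CALL MPI_COMM_SPLIT_TYPE(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, &
                            MPI_INFO_NULL, nodeComm, ierror)
   CALL MPI_COMM_RANK(nodeComm, myRank, ierror)

   ! Rank 0 provides the storage; the others pass size 0 and query it.
   intensitySize = 0; verticesSize = 0
   IF (myRank == 0) THEN
      intensitySize = CONSTSIZE * 8_MPI_ADDRESS_KIND       ! real(8) elements
      verticesSize  = CONSTSIZE * 3_MPI_ADDRESS_KIND * 4   ! 3 default integers each
   END IF
   CALL MPI_WIN_ALLOCATE_SHARED(intensitySize, 1, MPI_INFO_NULL, nodeComm, &
                                intensityPtr, intensityWin, ierror)
   CALL MPI_WIN_ALLOCATE_SHARED(verticesSize, 1, MPI_INFO_NULL, nodeComm, &
                                verticesPtr, verticesWin, ierror)
   IF (myRank /= 0) THEN
      CALL MPI_WIN_SHARED_QUERY(intensityWin, 0, intensitySize, dispUnit, &
                                intensityPtr, ierror)
      CALL MPI_WIN_SHARED_QUERY(verticesWin, 0, verticesSize, dispUnit, &
                                verticesPtr, ierror)
   END IF

   ! Associate the pointer components with the window storage: no ALLOCATE.
   CALL C_F_POINTER(intensityPtr, pve_grid%intensity, [CONSTSIZE])
   CALL C_F_POINTER(verticesPtr,  pve_grid%vertices,  [CONSTSIZE])

   IF (myRank == 0) pve_grid%intensity = -1.0d0
   CALL MPI_BARRIER(nodeComm, ierror)
   IF (myRank == 1) pve_grid%intensity(1) = 42.0d0   ! now visible to rank 0
   CALL MPI_BARRIER(nodeComm, ierror)
   IF (myRank == 0) WRITE(*,*) "intensity(1) =", pve_grid%intensity(1)

   CALL MPI_WIN_FREE(intensityWin, ierror)
   CALL MPI_WIN_FREE(verticesWin, ierror)
   CALL MPI_FINALIZE(ierror)
END PROGRAM perComponentWindows
```

The key difference from the question's code: the components are never ALLOCATEd, so there is no heap memory private to the parent; every rank's pve_grid%intensity points into the same window.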

Please note that real*8 is not valid Fortran and was never part of any ISO Fortran standard. – jlokimlin

MPI_WIN_ALLOCATE_SHARED allocates shared memory space for you. In Fortran 2003+, you can associate that space with the pointer in your structure using C_F_POINTER from the ISO_C_BINDING module. – Hristo Iliev

@HristoIliev I tried it. It works for primitive datatypes, but I still have issues when using derived datatypes with pointer components as the base pointer (see EDIT2). – Jannek S.

I would rather associate the allocated memory with the individual members of the structure, i.e., CALL C_F_POINTER(pveCPtr, pve_gridPtr%intensity) and similarly for the vertices member. You could either create two different windows or split a single allocation between the two members. – Hristo Iliev

There is no need to mark everything with EDIT and a date. Stack Overflow has an automatic versioning system. Click on the link "edited ..." below the post: stackoverflow.com/posts/39293904/revisions. It is all there. You can keep your question as a straightforward description of the problem. You can also post your own answer. – Vladimir F

1 Answer


I think this is mostly covered here: MPI Fortran code: how to share data on node via openMP?

So, in terms of 1), you can have base pointers that are Fortran derived types. However, the answer to 2) is that MPI_WIN_ALLOCATE_SHARED returns storage to you - you cannot reuse existing storage. Given that you have a linked list, I don't see how it could be converted to a shared window even in principle. To make use of the returned storage it would be much simpler to have an array of pve_data objects - you are going to have to store them consecutively in the returned array anyway, so linking them doesn't add anything useful.

I might have misunderstood here - if you only want the head of the list to be remotely accessible in the window, then that should be OK.
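To illustrate the point about using the returned storage (and Hristo Iliev's comment about splitting a single allocation), a rough fragment of that variant: one shared block sized for both arrays, with the pointer components associated to slices of it via C_F_POINTER. This is my sketch, not tested against the asker's setup; it assumes basePtr was returned by MPI_WIN_ALLOCATE_SHARED (or obtained by the children via MPI_WIN_SHARED_QUERY), and the byte sizes assume 8-byte reals and 4-byte default integers:

```fortran
! Fragment: split one shared allocation between the two array components.
! Assumes the window was allocated with
!    memSize = CONSTSIZE*8 + CONSTSIZE*3*4   ! intensity bytes + vertices bytes
USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER, C_LOC, C_INT8_T

INTEGER(C_INT8_T), POINTER :: raw(:)   ! byte view of the whole window
TYPE(pve_data)             :: pve_grid

! View the window as bytes, then carve the two components out of it.
CALL C_F_POINTER(basePtr, raw, [CONSTSIZE*8 + CONSTSIZE*3*4])
CALL C_F_POINTER(C_LOC(raw(1)),               pve_grid%intensity, [CONSTSIZE])
CALL C_F_POINTER(C_LOC(raw(CONSTSIZE*8 + 1)), pve_grid%vertices,  [CONSTSIZE])

! No ALLOCATE of the components: every rank that maps the window and runs
! this association sees the same data through pve_grid%intensity/%vertices.
```

The local pve_data variable itself (the descriptors) stays private to each rank; only the array storage it points at lives in the window, which is exactly why a whole linked list cannot be shared this way.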