2
votes

I have a Linux System V IPC shared memory segment that is populated by one process and read by many others. All the processes access the segment through an interface class that takes care of looking up, attaching to, and detaching from the segment in its constructor and destructor.

The problem here is that from time to time I'm seeing that the segment has "split". What I mean here is that looking in the "ipcs -m -s" output I see that I've got two segments listed: one which has been marked for destruction but still has some processes attached to it, and a second which appears to get all new attempts to attach to the segment. However, I'm never actually asking the kernel to destroy the segment. What's happening here?!

One other thing to note is that the system this is running on is, unfortunately, seriously overcommitted in the memory department. There is 1 GB of physical memory, no swap, and Committed_AS in /proc/meminfo reports about 2.5 GB of committed memory. Fortunately the processes are not actually using this much memory; they're just asking for it (I still have about 660 MB of "free" memory as reported by vmstat). While I know this is far from ideal, for the time being there is nothing I can do about the overcommitted memory. However, browsing the kernel/libc source, I don't see anything that would mark a shared memory segment for deletion for any reason other than a user request (but perhaps I've missed it hidden in there somewhere).

For reference here's the shared memory interface class' constructor:

const char* shm_ftok_pathname = "/usr/bin";
int shm_ftok_proj_id = 21;

// creates a key from a file path so different processes will get same key
key_t m_shm_key = ftok(shm_ftok_pathname, shm_ftok_proj_id);

if ( m_shm_key  == -1 )
{
    fprintf(stderr,"Couldn't get the key for the shared memory\n%s\n",strerror(errno));
    exit ( status );
}

m_shm_id = shmget(m_shm_key, sizeof(shm_data_s), (IPC_CREAT | 0666));

if (m_shm_id < 0) 
{
    fprintf(stderr,"Couldn't get the shared memory ID\nerrno = %s  \n",strerror(errno));
    exit ( status );
}

// get a ptr to shared memory, which is a shared mem struct 
// second arg of 0 says let OS choose shm address
m_shm_data_ptr = (shm_data_s *)shmat(m_shm_id, 0, 0);

// shmat returns (void *)-1 on failure; compare pointers rather than casting to int
if ( m_shm_data_ptr == (shm_data_s *)-1 )
{
    fprintf(stderr,"Couldn't get the shared memory pointer\n");
    exit ( status );
}

And here's my uname output: Linux 2.6.18-5-686 #1 SMP Fri Jun 1 00:47:00 UTC 2007 i686 GNU/Linux

2
I wonder if you need to worry about the Linux OOM killer because of the shortage of memory? Google it... - Jonathan Leffler
Thanks for the input, but I don't believe I'm running into an OOM issue. For one, I don't see any evidence of a killed process, nor would I expect to, as I'm never actually running out of memory (i.e. while the aggregate memory requests are indeed for more than the available memory, the processes aren't actually using that memory, so there are plenty of available pages). And even if a process were being killed, I don't see how anything I'm doing in the shared memory setup would cause the memory to "split". - Evan Grim

2 Answers

1
votes

My first guess is that you probably are calling shmctl(..., IPC_RMID, ...) somewhere.

Can you show the shared memory interface class' destructor?
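
For what it's worth, here is a minimal sketch of how an IPC_RMID call would produce exactly the "split" described in the question. On Linux, a segment removed while processes are still attached is only marked for destruction, and a later shmget() on the same key creates a brand new segment. The function name demo_split and the key passed to it are hypothetical, purely for illustration; pick a key that nothing else on the machine uses, since the sketch removes the segments it touches.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper, not from the original post.
 * Returns 1 if a second shmget() on the same key yields a different
 * segment after IPC_RMID, 0 if it yields the same one, -1 on error. */
int demo_split(key_t key, size_t size)
{
    int first_id = shmget(key, size, IPC_CREAT | 0666);
    if (first_id < 0) {
        perror("shmget (first)");
        return -1;
    }

    /* Stay attached, like a long-running reader would. */
    void *addr = shmat(first_id, NULL, 0);
    if (addr == (void *)-1) {
        perror("shmat");
        shmctl(first_id, IPC_RMID, NULL);
        return -1;
    }

    /* Mark the segment for destruction while it is still attached.
     * It lingers (shown as "dest" by ipcs -m) until the last detach. */
    if (shmctl(first_id, IPC_RMID, NULL) < 0) {
        perror("shmctl(IPC_RMID)");
        shmdt(addr);
        return -1;
    }

    /* Same key, same flags as before: this now creates a new segment. */
    int second_id = shmget(key, size, IPC_CREAT | 0666);
    if (second_id < 0) {
        perror("shmget (second)");
        shmdt(addr);
        return -1;
    }

    printf("first id = %d, second id = %d\n", first_id, second_id);
    int split = (second_id != first_id);

    /* Clean up: remove the new segment and drop the old attachment. */
    shmctl(second_id, IPC_RMID, NULL);
    shmdt(addr);
    return split;
}
```

Calling demo_split() with an unused key and a small size (say 4096) should report two different IDs. If you pause it between the IPC_RMID and the cleanup and run ipcs -m, you should see the same two-segment picture as in the question.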

0
votes

The only reason for the kernel to mark the segment for deletion is an explicit user call. Maybe you could try strace (or truss on Solaris) to find out whether any process is making the call mentioned in the first answer above, shmctl(..., IPC_RMID, ...).
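
Alongside tracing, a process that is still attached can ask the kernel about its own segment. The sketch below is only an illustration: segment_marked_for_destruction is a hypothetical helper, and SHM_DEST is a Linux-specific mode bit (the fallback value 01000 is the one glibc uses, assumed here in case the header does not expose it). The PIDs it prints are those of the creator and of the last process to call shmat/shmdt, which may or may not be the process that issued IPC_RMID.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_DEST
#define SHM_DEST 01000 /* Linux: segment will be destroyed on last detach */
#endif

/* Hypothetical helper, not from the original post.
 * Returns 1 if the segment is marked for destruction, 0 if it is not,
 * and -1 if IPC_STAT fails. */
int segment_marked_for_destruction(int shm_id)
{
    struct shmid_ds info;

    if (shmctl(shm_id, IPC_STAT, &info) < 0) {
        perror("shmctl(IPC_STAT)");
        return -1;
    }

    printf("id=%d attached=%lu creator pid=%ld last shmat/shmdt pid=%ld\n",
           shm_id,
           (unsigned long)info.shm_nattch,
           (long)info.shm_cpid,
           (long)info.shm_lpid);

    return (info.shm_perm.mode & SHM_DEST) ? 1 : 0;
}
```

Calling this periodically with the ID the class got from shmget (m_shm_id in the question) from a long-running reader would at least narrow down when the segment gets marked, which makes the strace output easier to search.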

Raman Chalotra