2
votes

I have written a Boost application that runs on an ARMv5 machine alongside a very performance-sensitive application. The performance-sensitive application collects performance metrics about itself while running and, when my application (which uses Boost ASIO) is running, it occasionally reports that it is experiencing poor performance. This application has a main thread that loops repeatedly, and the metric that is occasionally off is the total time of one loop iteration. It goes from taking 2-8ms per loop to 63 seconds for one whole loop, but only for a single loop once in a rare while. This never happens when my application is not running.
That is the basic problem I am seeing. The two programs interact over Unix domain sockets. The sensitive application has additional threads, a send thread and a receive thread, which read from and write to queues that the main thread also reads from and writes to. The queue reads and writes are protected by mutex locks, but only the read and write operations themselves are inside the lock. The actual send and receive in the threads happen outside of the lock for performance reasons. We want the main thread in the performance-sensitive application to block as little as possible and keep the loop time down in the 2-8ms range.
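
To make the locking scheme concrete, here is a minimal sketch of the pattern described above. The names `MessageQueue` and `send_one` are made up for illustration and are not the real code; the point is only that the mutex covers the queue operation and nothing else, while the send itself runs with no lock held.

```cpp
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical queue shared between the main loop and the send/receive threads.
// The mutex is held only for the push or pop itself.
class MessageQueue {
public:
    void push(std::string msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(msg));
    }

    // Returns false immediately if the queue is empty, so the caller never
    // waits for data while holding the lock.
    bool try_pop(std::string& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) {
            return false;
        }
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<std::string> queue_;
};

// One step of a send thread: dequeue under the lock, then send outside it.
// `send` stands in for the real (possibly slow) socket write.
template <typename SendFn>
bool send_one(MessageQueue& outgoing, SendFn&& send) {
    std::string msg;
    if (!outgoing.try_pop(msg)) {  // lock is taken and released inside try_pop
        return false;
    }
    send(msg);  // no lock held here, so the main thread is not blocked by I/O
    return true;
}
```

With this shape, the main thread can only ever be blocked for the duration of a single push or pop, not for a socket operation.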

I have ruled out the domain socket communication as the problem, sadly. The new program has two modes it can run in: domain socket only (UDS only) mode, and UDS + TCP mode. If I run my program in UDS only mode, the problem stops happening, but if I switch back to UDS + TCP mode, it starts again. I am using asynchronous Boost ASIO programming for both the Unix domain socket and TCP communication. The practical differences between these two modes when the errors occur are as follows:

  • I have one class which is repeatedly trying to connect to a host using SSL, fails on the resolve, sleeps the thread for 60 seconds, then tries again.
  • Another class uses Boost timers to do the same thing, but queues up a 15-minute deadline_timer after the connect fails.
  • It does an async_accept on a TCP socket.

The machine does not have a network connection for the entire run of the application, so the resolves always fail. It is possible for the machine to get an internet connection and succeed, but the logs show that this does not happen between the start of the programs and the time the problem occurs. Does anyone have ideas about how any of these operations could cause a problem for another program on the same machine? The machine is a single-core 1.0 GHz ARMv5 with 1GB of RAM. Looking at system load during runtime, the performance-sensitive application uses about 30-60% of the CPU, and my application uses 0-3%. The load average stays below 1, and is usually around 0.60. The program is using one io_service running on 2 threads, with the main thread waiting on those two threads.

EDIT: The performance sensitive application has been running with a niceness of -20, and my application has been running with a niceness of -10.

I'd be glad to clarify anything or provide more information upon request. Thanks in advance!

1
Running another application, especially on a single-core processor, will always have an impact on performance. There's nothing you can do about that. - ssube
Right, but it's strange because I see no performance issues other than this one loop every few hours of runtime. The only thing that is happening is looping over the resolver. I can't think of what could be happening, even from a scheduling standpoint, to cause the loop time to go from a few milliseconds to many seconds. - Jacob Wiltse

1 Answer

3
votes

Boost ASIO is not affecting performance in another process. It cannot.

The relevant things here are

  • operating system
  • scheduling
  • process priorities

One thing that you need to (please!) remember about Linux is that it is not fundamentally designed to be a "real-time" operating system.

The nature of the processes may not even be relevant at all. Consider a single-core processor and a non-realtime, preemptive OS: there's no way the OS could ever faithfully schedule two performance-sensitive processes "into the limelight" at the same time.

So I'd start looking there:

  • can you use a kernel with realtime options
  • can you exercise those options for the processes that require it e.g.

    • chrt - manipulate the real-time attributes of a process
    • nice
  • can you upgrade to hardware that supports the software tasks better
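
The `chrt` and `nice` options above can also be requested from inside a process. The following is only a rough sketch, assuming Linux and the POSIX calls `sched_setscheduler` and `setpriority`; the function names `request_fifo_scheduling` and `set_niceness` are made up for illustration. Moving to `SCHED_FIFO`, or lowering the nice value, normally requires root or `CAP_SYS_NICE`, so both helpers report failure instead of assuming success.

```cpp
#include <sched.h>
#include <sys/resource.h>

#include <cerrno>
#include <cstring>
#include <iostream>

// Ask the kernel to schedule the calling process with the real-time
// SCHED_FIFO policy (roughly what `chrt -f <priority>` does).
// Returns false if the request is refused, e.g. for lack of privileges.
bool request_fifo_scheduling(int priority) {
    sched_param param{};
    param.sched_priority = priority;
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0) {
        std::cerr << "sched_setscheduler failed: " << std::strerror(errno) << '\n';
        return false;
    }
    return true;
}

// Set the nice value of the calling process (roughly what `renice` does).
// Raising the value (lower priority) is allowed for any user; lowering it
// usually needs elevated privileges.
bool set_niceness(int niceness) {
    if (setpriority(PRIO_PROCESS, 0, niceness) != 0) {
        std::cerr << "setpriority failed: " << std::strerror(errno) << '\n';
        return false;
    }
    return true;
}
```

For example, the less important process might call `set_niceness(19)` at startup, while the time-critical one calls `request_fifo_scheduling(sched_get_priority_min(SCHED_FIFO))`. Whether that is enough depends on the kernel configuration, which is why a kernel with realtime options is the first item in the list.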