0
votes

When running the following command

strace -f python3 -c 'import os; print(os.getpid())'

I noticed that strace does not catch the call to the getpid(2) system call. I first suspected that glibc was caching the PID, but libc should have nothing to cache until at least one real system call has been made. Then I suspected the vDSO, but a C program that makes this system call through libc does show a getpid call under strace. I finally gave up and looked up the source of os.getpid, which appears to be defined in Modules/posixmodule.c. To my surprise (and subsequent confusion), it makes a normal call to getpid!

So my question is: how does Python determine the result of os.getpid? And if that value is indeed obtained by a call to getpid, how is that call actually being made?

1
strace -e trace=getpid -f python3 -c 'import os; print(os.getpid())' showed two calls to the getpid syscall when I tested it. - Shawn
@Shawn: Very odd. I just tested strace with the additional trace=getpid on two separate systems (both the same version of Ubuntu, 16.04, unfortunately). Both print the PID but show no call to getpid; only '+++ exited with 0 +++'. - Tenders McChiken
@Shawn: What system are you using? Is the vDSO enabled? - Tenders McChiken
Arch and Ubuntu 18.04. - Shawn
@jww: Perhaps, but from github.com/torvalds/linux/blob/v3.19/include/uapi/linux/…, it doesn't seem that the system exposes the PID to users as an auxiliary vector value. The x86 values don't seem to include the PID either... It also doesn't explain why Ubuntu 18.04 makes the call while 16.04 doesn't :/ - Tenders McChiken

1 Answer

3
votes

The way the vDSO works is, among other things, by mapping kernel-maintained variables into userspace that the vDSO functions know how to read. One of them is the current time data, so gettimeofday doesn't need to make a syscall to access that information.
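
To see that mapping from inside a process, you can look for the special entries in /proc/self/maps. This is only a rough sketch: the helper names are made up for illustration, and it assumes a Linux system where the kernel labels these regions [vdso] and [vvar] (the exact labels can differ between architectures and kernel versions).

```python
def special_mappings(maps_text, names=("[vdso]", "[vvar]")):
    """Return {label: (start, end)} for the labelled regions found in maps_text."""
    found = {}
    for line in maps_text.splitlines():
        fields = line.split()
        # Format: address-range perms offset dev inode [pathname-or-label]
        if len(fields) >= 6 and fields[5] in names:
            start, _, end = fields[0].partition("-")
            found[fields[5]] = (int(start, 16), int(end, 16))
    return found


def current_process_mappings():
    """Read this process's own mappings; empty dict if /proc is unavailable."""
    try:
        with open("/proc/self/maps") as maps:
            return special_mappings(maps.read())
    except OSError:
        return {}


if __name__ == "__main__":
    for name, (start, end) in current_process_mappings().items():
        print(f"{name}: {start:#x}-{end:#x} ({end - start} bytes)")
```

The [vdso] region holds the code for calls such as gettimeofday, and [vvar] holds the data that code reads. Nothing there provides getpid.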

Now, specifically for getpid, it's not actually a vDSO call. In glibc before 2.25, the library cached the process ID, and since part of the Python runtime calls getpid, there wouldn't be further syscalls for it after the first. From 2.25 onward, glibc no longer caches the process ID, so every getpid call results in a syscall.
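
That version cutoff also lines up with the comments: Ubuntu 16.04 ships glibc 2.23 and 18.04 ships 2.27. If you want to check which side of the cutoff a machine is on, something like the following should work. It is only a sketch; the function names are made up for illustration, and it assumes glibc, where os.confstr accepts CS_GNU_LIBC_VERSION.

```python
import os


def parse_glibc_version(text):
    """Turn a string such as 'glibc 2.23' into a tuple such as (2, 23)."""
    name, _, version = text.partition(" ")
    if name != "glibc" or not version:
        raise ValueError(f"not a glibc version string: {text!r}")
    return tuple(int(part) for part in version.split(".")[:2])


def glibc_caches_pid(version):
    """True for glibc releases older than 2.25, which cached the PID."""
    return version < (2, 25)


def describe_running_libc():
    """Say whether repeated getpid calls should show up as syscalls here."""
    try:
        text = os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, ValueError, OSError):
        text = None
    if not text:
        return "not glibc, or version unavailable: no prediction"
    try:
        version = parse_glibc_version(text)
    except ValueError:
        return f"unrecognised libc version string {text!r}: no prediction"
    if glibc_caches_pid(version):
        return f"{text}: PID is cached, repeated getpid calls may not reach the kernel"
    return f"{text}: no PID cache, each getpid call should be a syscall"


if __name__ == "__main__":
    print(describe_running_libc())
```

You can then compare the prediction against strace -e trace=getpid on the same machine.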