I am relatively new to Python and the subprocess module.

I'm trying to get a directory's size with Python using a subprocess on Mac OS X. os.walk takes a long time for large directories, so I am hoping to have subprocess run a shell command and get the result faster. This shell command works for me, but I cannot get it to work from subprocess:

( cd /test_folder_path && ls -nR | grep -v '^d' | awk '{total += $5} END {print total}' )

This is how I am trying to create the subprocess in Python:

import shlex 
import subprocess

target_folder = "/test_folder_path"
command_line = "( cd " + target_folder + " && ls -nR | grep -v '^d' | awk '{total += $5} END {print total}' )"
args = shlex.split(command_line)
print args
folder_size = subprocess.check_output(args)
print str(folder_size)

In Python I get the following error when subprocess.check_output is called:

    folder_size = subprocess.check_output(args)
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 568, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1308, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

When I use the same directory in the shell command, it works and gives me the directory's correct size.

Any help with making this approach work, or a pointer to a better method, would be much appreciated.

What is wrong? Python error? Cmd line syntax error? Wrong answer? - Paul Draper
Try cd /test_folder_path && du -c | tail -n 1 - dawg

1 Answer

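Python's subprocess defaults to shell=False, so the argument list is executed directly, with no shell involved. The first element is treated as the program to run (in the question that is the literal "(", which does not exist, hence the OSError), and the pipes, && and everything else are passed along as plain arguments. To run a command that uses pipes and &&, pass it as a single string with shell=True so that a shell interprets it: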

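import subprocess

target_folder = "/test_folder_path"
# With shell=True the whole string is handed to the shell, which handles cd, && and the pipes.
command_line = "cd " + target_folder + " && ls -nR | grep -v '^d' | awk '{total += $5} END {print total}'"
folder_size = subprocess.check_output(command_line, shell=True)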
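
To see why the list form from the question fails, it can help to look at what shlex.split actually produces. This is a small sketch that only tokenizes a shortened version of the question's command (the awk stage is left out for brevity) and runs nothing:

```python
import shlex

# Shortened version of the command from the question; nothing is executed here.
command_line = "( cd /test_folder_path && ls -nR | grep -v '^d' )"
args = shlex.split(command_line)

# Every whitespace-separated piece becomes its own argument.
print(args)
print(args[0])  # the "program" Popen would try to execute: (
```

The first token is the literal "(", which Popen tries to run as a program, and the |, && and ) tokens are ordinary strings because no shell is there to interpret them. That is why the shell=True call is needed for this command.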

I tried the shell=True approach, only with the command suggested by drewk:

>>> import subprocess
>>> folder_size = subprocess.check_output('cd ~/mydir && du -c | tail -n 1', shell=True)
>>> folder_size
b'113576\ttotal\n'

and all seems to be well.
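
The value that comes back is the raw output line, not a number. A minimal sketch of pulling the count out of it, assuming the output has the "<count><tab>total" shape shown above (parse_du_total is a hypothetical helper name, not part of subprocess):

```python
def parse_du_total(raw):
    """Return the leading count from a du summary line such as '113576\\ttotal\\n'."""
    # check_output returns bytes on Python 3, so decode before splitting.
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    # The first whitespace-separated field is the count; the rest is the label.
    return int(raw.split()[0])


size = parse_du_total(b"113576\ttotal\n")
print(size)
```

Note that the count is in du's block units rather than bytes. The block size depends on the platform and environment, so passing -k to du (kilobytes) is one way to make the unit explicit.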

As noted in the comments, subprocess.Popen (and by extension, check_output) also accepts a cwd argument which is the directory to run your command from. This eliminates the need to do any changing of directory in your command:

>>> import subprocess
>>> result = subprocess.check_output('du -c | tail -n 1', cwd='/path/to/home/mydir', shell=True)
>>> result
'113576\ttotal\n'
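
Once cwd takes care of the directory, the pipe to tail is the only reason left to involve a shell. One possible variation, sketched here under the assumption that a du supporting the standard -s (summary) and -k (kilobytes) flags is on the PATH, passes an argument list and skips the shell entirely. The function name folder_size_kb is made up for this example:

```python
import subprocess


def folder_size_kb(path):
    """Return the size of the tree under path in kilobytes, as reported by du."""
    # 'du -sk .' prints one summary line: "<kilobytes>\t."
    # No shell=True is needed because there are no pipes or && to interpret.
    out = subprocess.check_output(["du", "-sk", "."], cwd=path)
    return int(out.decode("utf-8").split()[0])
```

Calling folder_size_kb("/test_folder_path") would then return an integer directly. Avoiding shell=True also means the folder path never has to be quoted or escaped for the shell.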