I wonder if I could achieve something like the following logic: given a set of fold_num jobs to be done and a limit on the number of worker processes, say work_num, I hope to run work_num processes in parallel until all fold_num jobs are done. Finally, there is some other processing on the results of all these jobs. We can assume fold_num is always several times work_num.
I haven't got the following snippet working so far, even with tips from "How to wait in bash for several subprocesses to finish and return exit code !=0 when any subprocess ends with code !=0?":
#!/bin/bash
worker_num=5
fold_num=10
pids=""
result=0
for fold in $(seq 0 $(( $fold_num-1 ))); do
    pids_idx=$(( $fold % ${worker_num} ))
    echo "pids_idx=${pids_idx}, pids[${pids_idx}]=${pids[${pids_idx}]}"
    wait ${pids[$pids_idx]} || let "result=1"
    if [ "$result" == "1" ]; then
        echo "some job is abnormal, aborting"
        exit
    fi
    cmd="echo fold$fold" # use echo as an example, real command can be time-consuming to run
    $cmd &
    pids[${pids_idx}]="$!"
    echo "pids=${pids[*]}"
done
# when the for-loop completes, do something else...
The output looks like:
pids_idx=0, pids[0]=
pids=5846
pids_idx=1, pids[1]=
fold0
pids=5846 5847
fold1
pids_idx=2, pids[2]=
pids=5846 5847 5848
fold2
pids_idx=3, pids[3]=
pids=5846 5847 5848 5849
fold3
pids_idx=4, pids[4]=
pids=5846 5847 5848 5849 5850
pids_idx=0, pids[0]=5846
fold4
./test_wait.sh: line 12: wait: pid 5846 is not a child of this shell
some job is abnormal, aborting
Questions:
1. The pids array seems to have recorded the correct process IDs, but wait fails on them. Any ideas how to fix this?
2. Do we need to use wait after the for-loop? If so, what should be done after the for-loop?
parallel -j "$work_num" process ::: $(seq 1 "$fold_num")
– Mark Setchell
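On the two questions above, here is a minimal sketch of one possible fix, not a definitive answer. It assumes the failure comes from the first worker_num iterations, where pids[pids_idx] is still empty and the script therefore runs a bare wait: with no operands, wait blocks until every background job has finished and the shell then forgets those PIDs, which would explain why the later wait 5846 reports "not a child of this shell". The sketch only waits on a slot that already holds a PID, and adds a final loop after the for-loop to collect the jobs that are still running. run_fold is a hypothetical stand-in for the real command.

```bash
#!/bin/bash
worker_num=5
fold_num=10
pids=()      # one slot per worker; a slot is empty until a job is started in it
result=0

# Hypothetical stand-in for the real, time-consuming command.
run_fold() {
    echo "fold$1"
}

for (( fold = 0; fold < fold_num; fold++ )); do
    idx=$(( fold % worker_num ))
    # Wait only if this slot already holds a PID. A bare `wait` (no operand)
    # would wait for every child and discard their PIDs.
    if [[ -n "${pids[idx]:-}" ]]; then
        wait "${pids[idx]}" || result=1
        unset 'pids[idx]'            # this PID has been collected
        (( result == 0 )) || break   # stop launching new jobs after a failure
    fi
    run_fold "$fold" &
    pids[idx]=$!
done

# After the loop, up to worker_num jobs may still be running: collect them.
for pid in "${pids[@]}"; do
    wait "$pid" || result=1
done

if (( result != 0 )); then
    echo "some job is abnormal, aborting" >&2
    exit 1
fi
echo "all folds done"   # post-processing of the results would go here
```

Note that the loop blocks on the job in the slot it is about to reuse, so one slow job can leave the other workers idle. On bash 4.3 or newer, wait -n (which returns as soon as any one job finishes) may keep the workers busier.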