
Although this is likely not a Docker issue, for completeness:

I have two Docker containers inside a stack.
One is a PostgreSQL database instance, containing an input table, a parser function, and a table for persisting the data.
The other is an Ubuntu-based worker. It contains a bash script called via crontab every 5 minutes to

  1. download a CSV file from a URL
  2. insert the data into the input table via PGPASSWORD=abc psql -h db_container -U user -d database -c "\copy input_table FROM downloaded/file/location.csv"
  3. call the parser function via a separate psql [...] -c "SELECT parser_function()" call

Note that usernames and database names have been altered. Even though the function takes less than one second to run, the worker container's logs show:

2021-04-19T19:25:00.497643906Z crond: user root: process already running: /bin/bash /scripts.sh
2021-04-19T19:30:00.495587798Z crond: user root: process already running: /bin/bash /scripts.sh
2021-04-19T19:35:00.493496625Z crond: user root: process already running: /bin/bash /scripts.sh
2021-04-19T19:36:00.494264531Z crond: user root: process already running: /bin/bash /scripts.sh
2021-04-19T19:38:00.496474340Z crond: user root: process already running: /bin/bash /scripts.sh

The \copy command (step 2) is executed and terminates correctly. The function call (step 3) is executed and the values are inserted correctly from the input table into the persistent table. However, the psql process in the worker container that calls the parser function does not terminate, and more psql processes keep stacking up.

To rule out the parser function as the point of failure, I wrote a simple database function that only raises a notice containing the current datetime.

CREATE OR REPLACE FUNCTION data_schema.tester()
RETURNS void
LANGUAGE plpgsql
AS
$function$
BEGIN
    RAISE NOTICE 'Test: %', now();
END;
$function$;

The function is called from a new, separate bash script, but the psql process calling it still does not terminate. For testing, I set crontab to execute the script every minute:

SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
* * * * * /bin/bash /scripts.sh

After several minutes, ps aux shows:

46 root 0:00 /bin/bash /scripts.sh
57 root 0:00 /usr/bin/psql -h db_instance -U user -d database -c SELECT data_schema.tester()
58 root 0:00 more
96 root 0:00 /bin/bash \scripts.it.sh
97 root 0:00 /usr/bin/psql -h db_instance -U user -d database -c SELECT data_schema.tester()
98 root 0:00 more
99 root 0:00 /bin/bash \scripts.it.sh
100 root 0:00 /usr/bin/psql -h db_instance -U user -d database -c SELECT data_schema.tester()
101 root 0:00 more
102 root 0:00 ps aux

The processes keep stacking up; roughly every 10 minutes one psql process succeeds, and for the next 9 minutes new runs are again rejected with "process already running".

Executing the script from the command line causes no such behaviour in any configuration: the psql command is executed and terminates correctly.

It looks to me like a crond issue. PGPASSWORD and the other variables are expanded successfully inside the bash script when called by cron, and file and script permissions are set correctly. I tried flock and timeout to prevent the processes from stacking up, to no avail.
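For reference, the usual flock guard pattern (which, as mentioned, did not help in this case) looks roughly like this; the lock-file path and the echoed placeholder are hypothetical examples, not from the original script:

```shell
#!/bin/sh
# Hypothetical guard: flock -n fails immediately (instead of waiting)
# if another instance already holds the lock file, so overlapping cron
# runs exit right away rather than piling up.
flock -n /tmp/scripts.lock -c 'echo "lock acquired, running script body"'
```

In a crontab this would typically wrap the script invocation, e.g. `* * * * * flock -n /tmp/scripts.lock /bin/bash /scripts.sh` — though, as the question notes, that only limits concurrency and does not fix whatever keeps the psql processes alive.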

Thank you in advance!

I appreciate you've already pursued this, but a hung password prompt is still hard to ignore as a potential issue. What user does the parser function execute under, and how is auth achieved in the real-world example? – whoasked
And what does /scripts.sh look like? – tink
Are you running an init daemon inside the cron container to clean up orphaned processes? – jordanm
@whoasked Thank you for your answer. I changed the testing script to a hardcoded PGPASSWORD, same result. To rule out permission issues in the DB, I made the calling user the owner of the function, the input and persistent tables, and any sequences, with the same result. The Docker secrets are read from the docker-compose.yml and expanded in the container as described at medium.com/@basi/… – J.B.
@tink The simplified test version, to rule out other factors (like the CSV download or variable expansion), is simply PGPASSWORD=password psql -h db_instance -U user -d database -c "SELECT parser_function();" and nothing else. – J.B.

1 Answer


As it turns out, the issue is likely that crond in my Alpine-based worker container cannot handle the output returned by psql; presumably the `more` processes in the ps aux output above are a pager blocking on that output.

While the \copy return status ("COPY 60") for some reason did not pose a problem, the output of the SELECT call ("Test: 2020-01-01 00:00:00") was never consumed and the process never terminated.

Simply redirecting the cron or psql output to a file, /dev/null, or the like solves the issue.
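As a sketch, the redirection can go on the crontab entry or on the psql call inside the script; the lines below reuse names from the question, and the exact placement is an assumption rather than the author's verbatim fix:

```shell
# On the crontab line (crontab syntax, shown here as a comment):
#   */5 * * * * /bin/bash /scripts.sh >/dev/null 2>&1

# Or on the psql call itself, inside the script:
#   PGPASSWORD=abc psql -h db_instance -U user -d database \
#     -c "SELECT data_schema.tester();" >/dev/null 2>&1

# Minimal demonstration that the redirection discards all output
# (echo stands in for psql here):
result=$(echo "NOTICE: Test: 2021-04-19" >/dev/null 2>&1; echo "done")
echo "$result"   # prints "done": the NOTICE text was discarded
```

Redirecting both stdout and stderr matters here, because RAISE NOTICE messages arrive on psql's stderr.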

Thank you, everyone, for your responses!