0
votes

From a GCE instance, I am executing the following from command line

mysql -udbase -p -DmyDb -hmyIp < sql.txt

Once it gets running, I monitor the SQL command in another shell by checking the process list. I see the SQL, the processId and the time it is taking to execute.

After this last test, the processId changed from Query to sleep after 1172 seconds.

However, the mysql command does not receive the response, it just stays there, as if still waiting for some response from CloudSql.

After a minute or two, the processID disappears from the process list, and still the mysql command is there, as if waiting.

After a very long period of time (between 30 and 60 minutes), I finally get:

ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query

It's like the connection between GCE and CloudSQL gets severed, just that the client does not realize it.

As per recommendations in another post, I exaggerated the numbers in

sudo /sbin/sysctl -w net.ipv4.tcp_keepalive_time=1800 net.ipv4.tcp_keepalive_intvl=1800 net.ipv4.tcp_keepalive_probes=50

But the only thing this seemed to do, is prolong the time between when the query finishes and when the client finally disconnects.

If net_read_timeout applied, shouldn't the client give the error much earlier?

Thanks for your help.

1

1 Answers

1
votes

The TCP Keepalive needs to be shorter than 10 minutes to make the GCE firewall not forget the connections. Making it larger will defeat that purpose. I would recommend trying the instructions from Connecting to Google Cloud SQL from Google Compute Engine:

# Display the current tcp_keepalive_time value.
$ cat /proc/sys/net/ipv4/tcp_keepalive_time

# Set tcp_keepalive_time to 60 seconds and make it permanent across reboots.
$ echo 'net.ipv4.tcp_keepalive_time = 60' | sudo tee -a /etc/sysctl.conf

# Apply the change.
$ sudo /sbin/sysctl --load=/etc/sysctl.conf

# Display the tcp_keepalive_time value to verify the change was applied.
$ cat /proc/sys/net/ipv4/tcp_keepalive_time