I have ArangoDB 3.1.16 installed on an AWS C4 instance, with a Foxx service running in production. It receives an average of 10 packets of 200 octets per second and returns 20 packets of 200 octets per second.
Each time I start my process, the Foxx service runs with consistent performance for an hour and then suddenly stops. I can no longer reach my Foxx API: all requests get connection timeout errors and nothing is printed in the Foxx logs. I can no longer reach the web interface either: the page just doesn't load.
After a minute or so, the Foxx logs show an error message: 'ArangoError 18: lock timeout'
After another minute, the logs show requests that are usually fast but took a very long time (WARNING {queries} slow query: took: 1770.862498)
Using "journalctl -xe", I learned that after a foreign IP tried to connect, I got: "Job dev-xvdb.device/start timed out"
I managed to restart ArangoDB using:
ps -eaf |grep arangod
sudo kill #
sudo apt-get --reinstall install arangodb3=3.1.16
How can I solve this recurring issue?
"journalctl -xe" gives me:
Apr 04 15:03:10 my-ip systemd[1]: arangodb3.service: Failed with result 'exit-code'.
-- Subject: Unit arangodb3.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit arangodb3.service has begun starting up.
Apr 04 15:03:10 my-ip arangodb3[11481]: * Starting arango database server arangod
Apr 04 15:03:10 my-ip arangodb3[11481]: * database version check failed, maybe you need to run 'upgrade'?
Apr 04 15:03:10 my-ip systemd[1]: arangodb3.service: Control process exited, code=exited status=1
Apr 04 15:03:10 my-ip systemd[1]: Failed to start LSB: arangodb.
-- Subject: Unit arangodb3.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit arangodb3.service has failed.
--
-- The result is failed.
Apr 04 15:03:10 my-ip systemd[1]: arangodb3.service: Unit entered failed state.
Apr 04 15:03:10 my-ip systemd[1]: arangodb3.service: Failed with result 'exit-code'.
Apr 04 15:03:10 my-ip sudo[11346]: pam_unix(sudo:session): session closed for user root
Apr 04 15:03:17 my-ip sshd[11502]: Did not receive identification string from UNKNOWN IP 1
Apr 04 15:03:21 my-ip sshd[11503]: Connection closed by UNKNOWN IP 2 port 54736 [preauth]
Apr 04 15:03:21 my-ip sshd[11507]: Did not receive identification string from UNKNOWN IP 2
Apr 04 15:03:21 my-ip sshd[11506]: fatal: Unable to negotiate with UNKNOWN IP 2 port 54730: no matching host key type found. Their offer: ssh-dss [preauth]
Apr 04 15:03:21 my-ip sshd[11504]: Connection closed by UNKNOWN IP 2 port 54732 [preauth]
Apr 04 15:03:22 my-ip sshd[11505]: Connection closed by UNKNOWN IP 2 port 54734 [preauth]
Apr 04 15:03:40 my-ip systemd[1]: dev-xvdb.device: Job dev-xvdb.device/start timed out.
Apr 04 15:03:40 my-ip systemd[1]: Timed out waiting for device dev-xvdb.device.
-- Subject: Unit dev-xvdb.device has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit dev-xvdb.device has failed.
--
-- The result is timeout.
Apr 04 15:03:40 my-ip systemd[1]: Dependency failed for File System Check on /dev/xvdb.
-- Subject: Unit systemd-fsck@dev-xvdb.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit systemd-fsck@dev-xvdb.service has failed.
--
-- The result is dependency.
Apr 04 15:03:40 my-ip systemd[1]: Dependency failed for /mnt.
-- Subject: Unit mnt.mount has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mnt.mount has failed.
--
-- The result is dependency.
Apr 04 15:03:40 my-ip systemd[1]: mnt.mount: Job mnt.mount/start failed with result 'dependency'.
Apr 04 15:03:40 my-ip systemd[1]: systemd-fsck@dev-xvdb.service: Job systemd-fsck@dev-xvdb.service/start failed with result 'dependency'.
Apr 04 15:03:40 my-ip systemd[1]: dev-xvdb.device: Job dev-xvdb.device/start failed with result 'timeout'.
I tried:
sudo curl --dump - -X GET http://127.0.0.1:8529/_api/version && echo
It gives me:
HTTP/1.1 401 Unauthorized
Www-Authenticate: Bearer token_type="JWT", realm="ArangoDB"
Server: ArangoDB
Connection: Keep-Alive
Content-Type: text/plain; charset=utf-8
Content-Length: 0
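The 401 suggests that authentication is enabled on the server, so the unauthenticated request above does not really tell me whether the API itself is healthy. A minimal sketch of the same check with HTTP basic authentication is below. The function name arango_version and the ARANGO_USER variable are placeholders of my own, and curl will prompt for the password:

```shell
# Hypothetical helper: query the version endpoint with HTTP basic auth.
# ARANGO_USER is an assumed variable name; it defaults to "root".
arango_version() {
  # --max-time keeps the call from hanging forever if the server is stuck.
  curl --show-error --dump-header - \
       --user "${ARANGO_USER:-root}" \
       --max-time 10 \
       "http://127.0.0.1:8529/_api/version"
  echo
}
```

Running arango_version while the service is unresponsive should at least distinguish a rejected login from a request that never gets an answer.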
I tried:
ps auxw | fgrep arangod
It gives me:
root 10439 0.0 0.1 82772 8664 ? Ss 10:09 0:00 /usr/sbin/arangod --uid arangodb --gid arangodb --pid-file /var/run/arangodb/arangod.pid --temp.path /var/tmp/arangod --log.foreground-tty false --supervisor
arangodb 10440 5.7 94.5 12901776 7242340 ? Sl 10:09 16:36 /usr/sbin/arangod --uid arangodb --gid arangodb --pid-file /var/run/arangodb/arangod.pid --temp.path /var/tmp/arangod --log.foreground-tty false --supervisor
ubuntu 11339 0.0 0.0 12916 1000 pts/0 R+ 14:59 0:00 grep -F --color=auto arangod
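To make the memory figures in that output easier to read, the relevant columns can be pulled out with awk. This is only a sketch: arangod_mem is a name I made up, the sample lines are copied from the output above with the command line shortened, and it assumes the usual `ps aux` column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND).

```shell
# Sample input: the three lines from the ps output above (arangod arguments shortened).
sample='root 10439 0.0 0.1 82772 8664 ? Ss 10:09 0:00 /usr/sbin/arangod --supervisor
arangodb 10440 5.7 94.5 12901776 7242340 ? Sl 10:09 16:36 /usr/sbin/arangod --supervisor
ubuntu 11339 0.0 0.0 12916 1000 pts/0 R+ 14:59 0:00 grep -F --color=auto arangod'

# Print "PID %MEM RSS-in-MiB" for each process whose command (field 11) is arangod.
# The grep line is skipped because its command field is "grep".
arangod_mem() {
  awk '$11 ~ /arangod$/ { printf "%s %s %d\n", $2, $4, $6 / 1024 }'
}

printf '%s\n' "$sample" | arangod_mem
```

On the live machine the same filter can be fed directly with: ps auxw | arangod_mem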
'arangod restart' gives me:
2017-04-04T15:01:16Z [11344] INFO ArangoDB 3.1.16 [linux] 64bit, using VPack 0.1.30, ICU 54.1, V8 5.0.71.39, OpenSSL 1.0.2g 1 Mar 2016
2017-04-04T15:01:16Z [11344] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2017-04-04T15:01:16Z [11344] FATAL could not open shutdown file '/var/log/arangodb3/restart/SHUTDOWN': internal error
'service arangodb3 restart' gives me (after a short wait):
Job for arangodb3.service failed because the control process exited with error code. See "systemctl status arangodb3.service" and "journalctl -xe" for details.
'systemctl status arangodb3.service' gives me:
arangodb3.service - LSB: arangodb
Loaded: loaded (/etc/init.d/arangodb3; bad; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2017-04-04 15:03:10 UTC; 34s ago
Docs: man:systemd-sysv-generator(8)
Process: 11352 ExecStop=/etc/init.d/arangodb3 stop (code=exited, status=0/SUCCESS)
Process: 11481 ExecStart=/etc/init.d/arangodb3 start (code=exited, status=1/FAILURE)
Tasks: 83
Memory: 6.5G
CPU: 73ms
CGroup: /system.slice/arangodb3.service
├─10439 /usr/sbin/arangod --uid arangodb --gid arangodb --pid-file /var/run/arangodb/arangod.pid --temp.path /var/tmp/arangod --log.foreground-tty false --supervisor
└─10440 /usr/sbin/arangod --uid arangodb --gid arangodb --pid-file /var/run/arangodb/arangod.pid --temp.path /var/tmp/arangod --log.foreground-tty false --supervisor
Apr 04 15:03:10 my-ip systemd[1]: Starting LSB: arangodb...
Apr 04 15:03:10 my-ip arangodb3[11481]: * Starting arango database server arangod
Apr 04 15:03:10 my-ip arangodb3[11481]: * database version check failed, maybe you need to run 'upgrade'?
Apr 04 15:03:10 my-ip systemd[1]: arangodb3.service: Control process exited, code=exited status=1
Apr 04 15:03:10 my-ip systemd[1]: Failed to start LSB: arangodb.
Apr 04 15:03:10 my-ip systemd[1]: arangodb3.service: Unit entered failed state.