
We have a Jenkins matrix job that tests several variations of our software on many slave nodes in parallel. Sometimes one of these slaves crashes and has to be rebooted. I don't want to skip the run in that case. Instead, I want the underlying script, which detects the crash and then reboots the node, to mark that specific slave node as temporarily unavailable. I've found out that this should be possible through the Jenkins REST API, and I found two Python libraries that should do the job: https://python-jenkins.readthedocs.org/en/latest/index.html and http://pythonhosted.org/jenkinsapi/index.html. But both libraries fail to change anything on my Jenkins 1.580.2 system (fetching information is not a problem) with Python 3.4.3.

JenkinsAPI:

from jenkinsapi.jenkins import Jenkins
from jenkinsapi.utils.requester import Requester

class SSLRequester(Requester):
    """Requester that skips TLS certificate verification."""

    def __init__(self, username=None, password=None):
        super(SSLRequester, self).__init__(username, password)

    def get_request_dict(self, *largs, **kwargs):
        requestKWargs = super(SSLRequester, self).get_request_dict(*largs, **kwargs)
        # Disable certificate checks to avoid CERTIFICATE_VERIFY_FAILED
        requestKWargs['verify'] = False
        return requestKWargs

jenkins = Jenkins(jenkinsurl, username, password, requester=SSLRequester())

I have to use a customized SSLRequester because my Jenkins server uses an https:// connection, and I would otherwise receive the following error:

SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:600)

Well, if I try to fetch some information by using the jenkins object, everything is fine.

node.is_temporarily_offline()
False

But if I try to toggle the node, I receive the following:

node.toggle_temporarily_offline()

JenkinsAPIException: Operation failed. url=https:///computer//toggleOffline?offlineMessage=requested%20from%20jenkinsapi, data={}, headers={'Content-Type': 'application/x-www-form-urlencoded'}, status=403, text=b"%2FtoggleOffline%3FofflineMessage%3Drequested%2520from%2520jenkinsapi'/>window.location.replace('/login?from=%2Fcomputer%2F%2FtoggleOffline%3FofflineMessage%3Drequested%2520from%2520jenkinsapi');\n\n\nAuthentication required\n\n\n

My login credentials seem to be ignored entirely.

python-jenkins:

import jenkins
j = jenkins.Jenkins(jenkinsurl, username, password)
j.disable_node(slavenode)

TypeError: the JSON object must be str, not 'bytes'

After a short Google search, I found out that I have to patch the library, because the json module does not accept the bytes objects that the Jenkins JSON API returns. After inserting several decode('utf-8') calls, I was able to call the following statement:

j.get_node_info(slavenode)
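
The decoding step described above can be sketched as follows. On Python 3.4, json.loads accepts only str, so a bytes response body has to be decoded first (Python 3.6 and later accept bytes directly). The payload and the helper name parse_json_body are made up for illustration and are not part of python-jenkins.

```python
import json

# Hypothetical response body, as bytes, the way an HTTP client returns it.
raw = b'{"displayName": "slave-01", "offline": false}'


def parse_json_body(body, encoding='utf-8'):
    """Decode a bytes body to str before handing it to json.loads."""
    if isinstance(body, (bytes, bytearray)):
        body = body.decode(encoding)
    return json.loads(body)


info = parse_json_body(raw)
print(info['offline'])
```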

But I'm still failing to mark it as offline:

j.disable_node(slavenode)

TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.
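
This second error is the mirror image of the first: on Python 3, urllib expects the POST body as bytes, not str. A minimal sketch of the conversion, assuming a plain form-encoded body; the URL, the form field, and the helper name build_post are placeholders for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post(url, fields):
    """Return a POST Request whose body is bytes, as urllib requires on Python 3."""
    body = urlencode(fields).encode('utf-8')  # str -> bytes
    return Request(url, data=body, method='POST')


# Hypothetical URL and form field, for illustration only.
req = build_post('https://jenkins.example.com/some/endpoint',
                 {'offlineMessage': 'rebooting node'})
print(req.data)
```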

So, to boil it down to a simple question: do you know of some other convenient, scriptable way to mark a node as temporarily offline (and, of course, as online again once the reboot has succeeded)? I would prefer a Python solution, because I trigger the reboot from my Python script, but a Groovy script would also be good enough.

Thanks in advance for your help

1 Answer

You could look at the script console, where you can test scripts. You can also call these scripts using curl or the CLI, and I imagine from a Python library as well.

This is a good example of a Groovy script that inspects the nodes and, in the commented-out block, takes a node offline and deletes it:

for (aSlave in hudson.model.Hudson.instance.slaves) {
  println('====================');
  println('Name: ' + aSlave.name);
  println('getLabelString: ' + aSlave.getLabelString());
  println('getNumExecutors: ' + aSlave.getNumExecutors());
  println('getRemoteFS: ' + aSlave.getRemoteFS());
  println('getMode: ' + aSlave.getMode());
  println('getRootPath: ' + aSlave.getRootPath());
  println('getDescriptor: ' + aSlave.getDescriptor());
  println('getComputer: ' + aSlave.getComputer());
  println('\tcomputer.isAcceptingTasks: ' + aSlave.getComputer().isAcceptingTasks());
  println('\tcomputer.isLaunchSupported: ' + aSlave.getComputer().isLaunchSupported());
  println('\tcomputer.getConnectTime: ' + aSlave.getComputer().getConnectTime());
  println('\tcomputer.getDemandStartMilliseconds: ' + aSlave.getComputer().getDemandStartMilliseconds());
  println('\tcomputer.isOffline: ' + aSlave.getComputer().isOffline());
  println('\tcomputer.countBusy: ' + aSlave.getComputer().countBusy());
  //if (aSlave.name == 'NAME OF NODE TO DELETE') {
  //  println('Shutting down node!!!!');
  //  aSlave.getComputer().setTemporarilyOffline(true,null);
  //  aSlave.getComputer().doDoDelete();
  //}
  println('\tcomputer.getLog: ' + aSlave.getComputer().getLog());
  println('\tcomputer.getBuilds: ' + aSlave.getComputer().getBuilds());
}
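
To drive such a script from Python, one option is to POST it to the script console's /scriptText endpoint. The following is a rough sketch under several assumptions: the helper names offline_script and build_script_request are hypothetical, the account needs permission to run scripts, the server may additionally require a CSRF crumb, and the certificate problem from the question is not handled here. The code only builds the request; sending it would be a separate urllib.request.urlopen(req) call.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def offline_script(node_name, offline):
    """Build a Groovy snippet that sets one node's temporarily-offline flag."""
    # The name ends up inside a Groovy single-quoted string, so reject
    # characters that would break out of it.
    if "'" in node_name or '\\' in node_name:
        raise ValueError('unsupported character in node name')
    flag = 'true' if offline else 'false'
    return (
        "for (aSlave in hudson.model.Hudson.instance.slaves) {\n"
        "  if (aSlave.name == '%s') {\n"
        "    aSlave.getComputer().setTemporarilyOffline(%s, null)\n"
        "  }\n"
        "}\n" % (node_name, flag)
    )


def build_script_request(base_url, username, password, script):
    """Build (but do not send) a POST to the script console endpoint."""
    url = base_url.rstrip('/') + '/scriptText'
    body = urlencode({'script': script}).encode('utf-8')  # POST data must be bytes
    credentials = ('%s:%s' % (username, password)).encode('utf-8')
    req = Request(url, data=body, method='POST')
    # Send HTTP Basic credentials up front instead of waiting for a challenge.
    req.add_header('Authorization',
                   'Basic ' + base64.b64encode(credentials).decode('ascii'))
    return req


# Placeholder server, account, and node name, for illustration only.
script = offline_script('slave-01', True)
req = build_script_request('https://jenkins.example.com/', 'user', 'secret', script)
print(req.full_url)
```

Calling offline_script with False instead of True produces the script that brings the node back online after a successful reboot.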