19
votes

I've moved a few old Jenkins jobs to new ones using the pipeline feature, so that the Jenkins configuration lives inside the git repositories. It's working fine, but I'm wondering whether there is a way to reduce the number of checkouts that happen during a build.

Setup

  • I have a Jenkins multibranch job which is related to my git repository
  • I have a Jenkinsfile in my git repository

    #!groovy
    node {
    
      stage 'Checkout'
      checkout scm
    
      // build project
      stage 'Build'
      ...
    }
    

Problem

When I push to my remote branch BRANCH_1, the multibranch Jenkins job is triggered, and my understanding is that the following steps happen:

  • the multibranch job does a git fetch for the branch indexing and triggers the job corresponding to my remote branch: BRANCH_1_job
  • BRANCH_1_job does a git checkout to retrieve the Jenkinsfile of the triggered branch
  • the Jenkinsfile is executed and runs checkout scm itself. If I leave that out, I cannot build my project because no sources are available.

So to build my branch, I end up with one git fetch and two git checkouts.

Questions

  • Do I understand the process correctly? Or did I miss something?
  • Is there a way to reduce the number of git checkouts? The official examples all run checkout scm as the first step. I would have thought I don't need it, because the Jenkins job already had to do a checkout to retrieve the Jenkinsfile (so my sources have to be there somehow).
  • Don't you think these multiple checkouts can cause bad performance as soon as the git repo contains a big number of refs?

Thank you all

2
If you're not set on using the multi-branch pipeline, you could create a normal (single branch) pipeline job with the Jenkinsfile in Jenkins, which performs a single checkout and then loads your actual build script from the checked-out repository via load 'ci/build-script-in-your-repo.gy'. Unfortunately, you would lose the separate jobs for each branch in that case. - Carsten
I know what you mean, but I would like to keep my CI config in my repository rather than in my Jenkins job configuration, and to use the nice multibranch features (like one job per branch, so one can easily see which branch is failing). - Bastien

2 Answers

9
votes

With plain git, Jenkins has to do two checkouts: one to get the Jenkinsfile, so it knows what to execute in the job, and then another checkout of the actual repository content for building. Technically, Jenkins only needs to load the single Jenkinsfile from the repo, but git doesn't allow checkout of a single file from a remote without fetching the repository. Therefore the double checkout cannot be avoided with plain git using the multibranch plugin.

If you host git on Bitbucket or GitHub, you can avoid the double checkout by using their specific Jenkins plugins instead of the multibranch plugin.

See the Jenkins plugin site for the Bitbucket and GitHub plugins, respectively.

These plugins use the respective Git provider's REST API to load the single Jenkinsfile. So you technically still have two retrievals, but the first one is a simple REST call that downloads a single file, rather than a full native git checkout of the whole repository.
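To give a feel for what "a simple REST call to download a single file" means, here is a minimal sketch in plain Groovy that fetches one file through GitHub's repository contents endpoint. It is an illustration only, not the plugin's actual code; the owner, repository, and branch values are hypothetical placeholders, and a private repository would additionally need an Authorization header.

```groovy
// Illustration only: download a single file via GitHub's REST "contents" endpoint.
// owner, repo and branch are hypothetical placeholders, not values from the question.
def owner  = 'example-org'
def repo   = 'example-repo'
def branch = 'BRANCH_1'

def address = "https://api.github.com/repos/${owner}/${repo}/contents/Jenkinsfile?ref=${branch}"
def conn = new URL(address.toString()).openConnection()

// Ask for the raw file body instead of the default JSON wrapper
conn.setRequestProperty('Accept', 'application/vnd.github.raw+json')

// Only the Jenkinsfile travels over the wire; no clone or fetch of the repository is involved
println conn.inputStream.text
```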

0
votes

I've run into this several times, and the robust solution I came up with was to define a tiny "launcher script" inside the job itself (without an SCM source) that checks out the correct source revision and loads the actual pipeline from the sources.

If you are using the Job DSL plugin to generate your job, you'll define the pipeline this way:

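    pipelineJob("myjob") {
      ...
      definition {
        cps {
          script('''
            node {
              // Note: 'scm' is only predefined for multibranch and
              // "Pipeline script from SCM" jobs; in an inline script,
              // use an explicit checkout step for your repository here.
              checkout scm
              // Evaluate the real pipeline from the checked-out sources
              load("path/to/script.groovy")
            }
          ''')
        }
      }
    }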

If you are configuring the job manually using the Jenkins "Configure" screen, this is equivalent to choosing "Pipeline script" instead of "Pipeline script from SCM" and pasting the small checkout-and-load script into the box.

This decouples the pipeline bootstrap from the actual SCM and lets you check out once and have both the pipeline definition and the sources to be built. Not the most beautiful approach, but it definitely does the job well.
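For completeness, here is a rough sketch of what the loaded file could look like. The load step evaluates the Groovy file and returns whatever the file returns, so a common pattern is to end it with return this and call its methods from the launcher. The method name runBuild and the make command are assumptions made up for illustration; they are not part of the original setup.

```groovy
// path/to/script.groovy (hypothetical content, kept in the repository)

// Hypothetical entry point called by the launcher script
def runBuild() {
    stage('Build') {
        // Placeholder build command; replace with the project's real build
        sh 'make'
    }
}

// Return the script object so the caller can invoke runBuild()
return this
```

The launcher would then capture the result and call into it inside its node block, after its checkout step, for example: def pipeline = load("path/to/script.groovy") followed by pipeline.runBuild().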