I have a YAML build script in an Azure-hosted Git repository that is triggered across 7 build agents running on a local VM. Every run performs a git clean, which takes a significant amount of time because of a large node_modules folder.

The MSDN page here seems to suggest this is configurable but shows no detail of how to configure it. I can't tell whether this is a setting that should be specified on the agent, the YAML script, within DevOps on the pipeline, or where.

Is there any other documentation I'm missing or is this not possible?

Update: The start of the YAML file is here:

variables:
  BUILD_VERSION: 1.0.0.$(Build.BuildId)
  buildConfiguration: 'Release'
  process.clean: false

jobs:
#############################################################
###### 1 - Build and publish .NET
#############################################################

- job: net_build_publish
  displayName: .NET build and publish
  pool:
    name: default
  steps:
  - script: echo $(BUILD_VERSION)

  - task: DotNetCoreCLI@2
    displayName: dotnet build $(buildConfiguration)
    inputs:
      command: 'build'
      projects: |
        myrepo/**/API/*.csproj
      arguments: '-c $(buildConfiguration) /p:Version=$(BUILD_VERSION)'

The complete YAML is a lot longer, but the log for the first job shows this in its Checkout task:

Checkout myrepo@master to s

Starting: Checkout myrepo@master to s
==============================================================================
Task         : Get sources
Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
Version      : 1.0.0
Author       : Microsoft
Help         : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
==============================================================================
Syncing repository: myrepo (Git)
Prepending Path environment variable with directory containing 'git.exe'.
git version
git version 2.26.2.windows.1
git lfs version
git-lfs/2.11.0 (GitHub; windows amd64; go 1.14.2; git 48b28d97)
git config --get remote.origin.url
git clean -ffdx
Removing myrepo/Data/Core/API/bin/
Removing myrepo/Data/Core/API/customersettings.json
Removing myrepo/Data/Core/API/obj/
Removing myrepo/Data/Core/Shared/bin/
Removing myrepo/Data/Core/Shared/obj/
....

We have another job further down which runs npm install and npm build for an Angular project, and every build in the pipeline takes 5 minutes on the npm install step. Is that possibly because of this git clean when retrieving the repository?

Try to add:

steps:
- checkout: self
  clean: false

or:

- job: myJob
  workspace:
    clean: false

(comment by Shayki Abramczyk)
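
A minimal sketch of the first suggestion in that comment, applied to the net_build_publish job from the question. It assumes the explicit checkout step is placed first in the job's steps so that it replaces the implicit checkout; with clean set to false the agent should skip the git clean -ffdx before fetching. For the job-level workspace setting, the documented values of clean are outputs, resources and all, so false may not be accepted there.

```yaml
jobs:
- job: net_build_publish
  displayName: .NET build and publish
  pool:
    name: default
  steps:
  # Explicit checkout step; declaring it lets the checkout options be set.
  - checkout: self
    clean: false   # do not run git clean -ffdx before fetching

  - script: echo $(BUILD_VERSION)
```

Each job checks out the repository on its own, so the same checkout step would need to be repeated in the Angular job if that is where node_modules lives.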

2 Answers


As I mentioned below, you need to calculate a hash of package-lock.json before you run npm install. If it matches the hash stored alongside node_modules, you can skip installing dependencies. This may help you achieve that:

steps:
- task: PowerShell@2
  displayName: 'Calculate and save package-lock.json hash'
  inputs:
    targetType: 'inline'
    pwsh: true
    script: |
      # Hash package-lock.json; keep only the hash string so it can be compared as plain text
      $newHash = (Get-FileHash -Algorithm MD5 -Path package-lock.json).Hash
      $hashPath = "$(System.DefaultWorkingDirectory)/cache-npm/hash.txt"
      if ((Test-Path -Path $hashPath) -and ((Get-Content $hashPath -Raw).Trim() -eq $newHash)) {
        # hashes are the same, so node_modules is up to date and npm install can be skipped
        Write-Host "no need to install node_modules"
        Write-Host "##vso[task.setvariable variable=NodeModulesAreUpToDate;]true"
      } else {
        # first run, or package-lock.json changed: store the new hash and let npm install run
        $newHash > $hashPath
        Write-Host ("Hash File saved to " + $hashPath)
      }
      
      $storedHash = Get-Content $hashPath
      Write-Host $storedHash
    workingDirectory: '$(System.DefaultWorkingDirectory)/cache-npm'

- script: npm install
  workingDirectory: '$(Build.SourcesDirectory)/cache-npm'
  condition: ne(variables['NodeModulesAreUpToDate'], true)

git clean -ffdx removes every file and directory in the working tree that is not tracked by source control, including ignored ones such as node_modules. You could try Pipeline caching, which reduces build time by letting the outputs or downloaded dependencies from one run be reused in later runs, so the same files do not have to be recreated or downloaded again. See the following link:

https://docs.microsoft.com/en-us/azure/devops/pipelines/release/caching?view=azure-devops#nodejsnpm

variables:
  npm_config_cache: $(Pipeline.Workspace)/.npm

steps:
- task: Cache@2
  inputs:
    key: 'npm | "$(Agent.OS)" | package-lock.json'
    restoreKeys: |
       npm | "$(Agent.OS)"
    path: $(npm_config_cache)
  displayName: Cache npm
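
The Cache task above only restores and saves the npm cache folder; an install step still has to follow it. A minimal sketch of that step, assuming package-lock.json is in the directory the step runs in (the displayName is just illustrative):

```yaml
# Appended to the steps list, directly after the Cache@2 task.
- script: npm ci
  displayName: Install dependencies using the restored npm cache
```

npm ci rebuilds node_modules from package-lock.json, but with the cache restored it should be able to take most packages from $(npm_config_cache) instead of downloading them again.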