88
votes

Now that maven-3 did drop support for the <uniqueVersion>false</uniqueVersion> for snapshot artefacts it seems that you really need to use timestamped SNAPSHOTS. Especially m2eclipse, which does use maven 3 internally seems to be affected with it, update-snapshots does not work when the SNAPSHOTS are not unique.

It seemed best practice before to set all snapshots to uniqueVersion=false

Now, it seems no big problem to switch to the timestamped version, after all they are managed by a central nexus repository, which is able to delete old snapshots in regular intervalls.

The problem are the local developer workstations. Their local repository quickly does grow very large with unique snapshots.

How to deal with this problem ?

Right now i see the folowing possible solutions:

  • Ask the developers to purge the repository in regular intervals (which leads to lots of fustration, as it takes long time to delete and even longer to download everything needed)
  • Set up some script which does delete all SNAPSHOT directories from the local repository and ask developers to run that script from time to time (better than the first, but still takes quite some time to run and download current snapshots)
  • use the dependency:purge-local-repository plugin (Does have problems when run from eclipse, due to open files, needs to be run from each project)
  • set up nexus on every workstation and set up a job to clean old snapshots (best result, but I don't want to maintain 50+ nexus servers, plus memory is always tight on developer workstations)
  • stop using SNAPSHOTS at all

What is the best way to keep your local repository from filling up your hard drive space ?

Update:

To verify the beaviour and to give more info i setup a small nexus server, build two projects (a and b) and try:

a:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>de.glauche</groupId>
  <artifactId>a</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <distributionManagement>
    <snapshotRepository>
        <id>nexus</id>
        <name>nexus</name>
        <url>http://server:8081/nexus/content/repositories/snapshots</url>
    </snapshotRepository>
  </distributionManagement>

</project>

b:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>de.glauche</groupId>
  <artifactId>b</artifactId>
  <version>0.0.1-SNAPSHOT</version>
    <distributionManagement>
    <snapshotRepository>
        <id>nexus</id>
        <name>nexus</name>
        <url>http://server:8081/nexus/content/repositories/snapshots/</url>
    </snapshotRepository>
  </distributionManagement>
 <repositories>
    <repository>
        <id>nexus</id>
        <name>nexus</name>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
        <url>http://server:8081/nexus/content/repositories/snapshots/</url>
    </repository>
 </repositories>
  <dependencies>
    <dependency>
        <groupId>de.glauche</groupId>
        <artifactId>a</artifactId>
        <version>0.0.1-SNAPSHOT</version>
    </dependency>
  </dependencies>
</project>

Now, when i use maven and run "deploy" on "a", i'll have

a-0.0.1-SNAPSHOT.jar
a-0.0.1-20101204.150527-6.jar
a-0.0.1-SNAPSHOT.pom
a-0.0.1-20101204.150527-6.pom

in the local repository. With a new timestamp version each time i run the deploy target. The same happens when i try to update Snapshots from the nexus server (close "a" Project, delete it from local repository, build "b")

In an environment where lots of snapshots get build (think hudson server ...), the local reposioty fills up with old versions fast

Update 2:

To test how and why this is failing i did some more tests. Each test is run against clean everything (de/glauche gets delete from both machines and nexus)

  • mvn deploy with maven 2.2.1 :

local repository on machine A does contain snapshot.jar + snapshot-timestamp.jar

BUT: only one timestamped jar in nexus, metadata reads:

<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <groupId>de.glauche</groupId>
  <artifactId>a</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <timestamp>20101206.200039</timestamp>

      <buildNumber>1</buildNumber>
    </snapshot>
    <lastUpdated>20101206200039</lastUpdated>
  </versioning>
</metadata>
  • run update dependencies (on machine B) in m2eclipse (embedded m3 final) -> local repository has snapshot.jar + snapshot-timestamp.jar :(
  • run package goal with external maven 2.2.1 -> local repository has snapshot.jar + snapshot-timestamp.jar :(

Ok, next try with maven 3.0.1 (after removing all traces of project a)

  • local repository on machine A looks better, only one one non-timestamped jar

  • only one timestamped jar in nexus, metadata reads:

    de.glauche a 0.0.1-SNAPSHOT

    <snapshot>
      <timestamp>20101206.201808</timestamp>
      <buildNumber>3</buildNumber>
    </snapshot>
    <lastUpdated>20101206201808</lastUpdated>
    <snapshotVersions>
      <snapshotVersion>
        <extension>jar</extension>
        <value>0.0.1-20101206.201808-3</value>
        <updated>20101206201808</updated>
      </snapshotVersion>
      <snapshotVersion>
        <extension>pom</extension>
        <value>0.0.1-20101206.201808-3</value>
        <updated>20101206201808</updated>
      </snapshotVersion>
    </snapshotVersions>
    

  • run update dependencies (on machine B) in m2eclipse (embedded m3 final) -> local repository has snapshot.jar + snapshot-timestamp.jar :(

  • run package goal with external maven 2.2.1 -> local repository has snapshot.jar + snapshot-timestamp.jar :(

So, to recap: The "deploy" goal in maven3 works better than in 2.2.1, the local repository on the creating machine looks fine. But, the receiver always ends up with lots of timestamed versions ...

What am i doing wrong ?

Update 3

I also did test various other configurations, first replace nexus with artifactory -> same behaviour. Then use linux maven 3 clients to download the snapshots from the repository manager -> local repository still has timestamped snapshots :(

6
Related question, about only the local .m2\repository part, focused on the local repository on a (Jenkins) build server: stackoverflow.com/q/9729076/223837.MarnixKlooster ReinstateMonica
Here is working link to the Apcahe Maven Comptability Notes - cwiki.apache.org/confluence/display/MAVEN/…aka_sh

6 Answers

36
votes

The <uniqueVersion> configuration applied to artifacts that were deployed (via mvn deploy) to a Maven repository such as Nexus.

To remove these from Nexus, you can easily create an automated job to purge the SNAPSHOT repository every day. It can be configured to retain a certain number of shapshots or keep them for a certain period of time. Its super easy and works great.

Artifacts in the local repository on a developer machine get there from the "install" goal and do not use these timestamps...they just keep replacing the one and only SNAPSHOT version unless you are also incrementing the revision number (e.g. 1.0.0-SNAPSHOT to 1.0.1-SNAPSHOT).

14
votes

This plugin removes project's artifacts from local repository. Useful to keep only one copy of large local snapshot.

<plugin>         
    <groupId>org.codehaus.mojo</groupId>         
    <artifactId>build-helper-maven-plugin</artifactId>         
    <version>1.7</version>         
    <executions>           
        <execution>             
            <id>remove-old-artifacts</id>             
            <phase>package</phase>             
            <goals>               
                <goal>remove-project-artifact</goal>             
            </goals>            
            <configuration>  
                <removeAll>true</removeAll><!-- When true, remove all built artifacts including all versions. When false, remove all built artifacts of this project version -->             
            </configuration>          
        </execution>         
    </executions>       
</plugin>
8
votes

Well I didn't like any of proposed solutions. Deleting maven cache often significantly increases network traffic and slows down build process. build-helper-maven-plugin helps only with one artifact, I wanted solution that can purge all outdated timestamped snapshot artifacts from local cache in one simple command. After few days of searching, I gave up and decided to write small program. The final program seems to be working quite well in our environment. So I decided to share it with others who may need such tool. Sources can be pulled from github: https://github.com/nadestin/tools/tree/master/MavenCacheCleanup

2
votes

As far as the remote repository piece of this, I think the previous answers that discuss a purging of SNAPSHOTs on a regular interval will work. But no one has addressed the local-developer workstation synchronization part of your question.

We have not started using Maven3 yet, so we've yet to see SNAPSHOTs starting to build up on local machines.

But we have had different problems with m2eclipse. When we have "Workspace Resolution" enabled and the project exists within our workspace, source updates usually keep us on the bleeding edge. But we've found it's very difficult to get m2eclipse to update itself with recently published artifacts in Nexus. We're experiencing similar problems within our team and it's particularly problematic because we have a very large project graph... there are a lot of dependencies that won't be in your workspace but will be getting SNAPSHOTs published frequently.

I'm pretty sure this boils back to an issue in m2eclipse where it doesn't handle SNAPSHOTs exactly as it should. You can see in the Maven console within eclipse where m2eclipse tells you it's skipping the update of a recently published SNAPSHOT because it's got a cached version. If you do a -U from a run configuration or from the command line, Maven will pick up the metadata change. But an "Update Snapshots..." selection should tell m2eclipse to have Maven expire this cache. It doesn't appear to be getting passed along. There appears to be a bug out there that is filed for this if you're interested in voting for it: https://issues.sonatype.org/browse/MNGECLIPSE-2608

You made mention of this in a comment somewhere.

The best workaround for this problem seems to be having developers purge their local workstations when things start to break down from within m2eclipse. Similar solution to a different problem... Others have reported problems with Maven 2.2.1 and 3 backing m2eclipse, and I've seen the same.

I would hope if you're using Maven3 you can configure it to only pull the latest SNAPSHOT, and cache that for the amount of time the repository says (or until you expire it by hand). Hopefully then you won't need to have a bunch of SNAPSHOTs sitting in your local repository.

That is unless you're talking about a build server that is manually doing a mvn install on them. As far as how to prevent SNAPSHOTs from building up on an environment like a build server, we've kind of dodged that bullet by having each build use its own workspace and local repository (though, in Maven 2.2.1, certain things such as POMs seem to always come out of the ~/.m2/repository) The extra SNAPSHOTs really only stick around for a single build and then they get dropped (and downloaded again from scratch). So we've seen this approach does end up eating up more space to begin with, but it tends to remain more stable than having everything resolved out of a single repository. This option (on Hudson) is called "Use private Maven repository" and is under the Advanced button of the Build section on project configurations when you've selected to build with Maven. Here is the help description for that option:

Normally, Hudson uses the local Maven repository as determined by Maven — the exact process seems to be undocumented, but it's ~/.m2/repository and can be overridden by in ~/.m2/settings.xml (see the reference for more details.) This normally means that all the jobs that are executed on the same node shares a single Maven repository. The upside of this is that you can save the disk space, but the downside of this is that sometimes those builds could interfere with each other. For example, you might end up having builds incorrectly succeed, just because your have all the dependencies in your local repository, despite that fact that none of the repositories in POM might have them.

There are also some reported problems regarding having concurrent Maven processes trying to use the same local repository.

When this option is checked, Hudson will tell Maven to use $WORKSPACE/.repository as the local Maven repository. This means each job will get its own isolated Maven repository just for itself. It fixes the above problems, at the expense of additional disk space consumption.

When using this option, consider setting up a Maven artifact manager so that you don't have to hit remote Maven repositories too often.

If you'd prefer to activate this mode in all the Maven jobs executed on Hudson, refer to the technique described here.

Hope this helps - if it doesn't address your problem please let me know where I missed.

1
votes

In groovy, deleting timestamped files like artifact-0.0.1-20101204.150527-6.jar can be very simple:

root = 'path to your repository'

new File(root).eachFileRecurse {
  if (it.name.matches(/.*\-\d{8}\.\d{6}\-\d+\.[\w\.]+$/)) {
    println 'Deleting ' + it.name
    it.delete()
  }
}

Install Groovy, save the script into a file and schedule the execution at each week, start, logon, whatever suits you.

Or, you can even wire the execution into maven build, using gmavenplus-plugin. Notice, how is the repository location set by maven into the property settings.localRepository and then binded through configuration into variable repository:

  <plugin>
    <groupId>org.codehaus.gmavenplus</groupId>
    <artifactId>gmavenplus-plugin</artifactId>
    <version>1.3</version>
    <executions>
      <execution>
        <phase>install</phase>
        <goals>
          <goal>execute</goal>
        </goals>
      </execution>
    </executions>
    <configuration>
      <properties>
        <property>
          <name>repository</name>
          <value>${settings.localRepository}</value>
        </property>
      </properties>
      <scripts>
        <script><![CDATA[
          new File(repository).eachFileRecurse {
            if (it.name.matches(/.*\-\d{8}\.\d{6}\-\d+\.[\w\.]+$/)) {
              println 'Deleting snapshot ' + it.getAbsolutePath()
              it.delete()
            }
          }
        ]]></script>
      </scripts>
    </configuration>
    <dependencies>
      <dependency>
        <groupId>org.codehaus.groovy</groupId>
        <artifactId>groovy-all</artifactId>
        <version>2.3.7</version>
        <scope>runtime</scope>
      </dependency>
    </dependencies>
  </plugin>  
0
votes

Add following parameter into your POM file

POM

<configuration>
<outputAbsoluteArtifactFilename>true</outputAbsoluteArtifactFilename>
</configuration>

https://maven.apache.org/plugins/maven-dependency-plugin/copy-mojo.html

POM example

<plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
        <version>2.10</version>
        <executions>
          <execution>
            <id>copy</id>
            <phase>package</phase>
            <goals>
              <goal>copy</goal>
            </goals>
            <configuration>
              <artifactItems>
                <artifactItem>
                  <groupId>junit</groupId>
                  <artifactId>junit</artifactId>
                  <version>3.8.1</version>
                  <type>jar</type>
                  <overWrite>false</overWrite>
                  <outputDirectory>${project.build.directory}/alternateLocation</outputDirectory>
                  <destFileName>optional-new-name.jar</destFileName>
                </artifactItem>
              </artifactItems>
              **<outputAbsoluteArtifactFilename>true</outputAbsoluteArtifactFilename>**
              <outputDirectory>${project.build.directory}/wars</outputDirectory>
              <overWriteReleases>false</overWriteReleases>
              <overWriteSnapshots>true</overWriteSnapshots>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

Configure in Jenkins:

// copy artifact 
copyMavenArtifact(artifact: "commons-collections:commons-collections:3.2.2:jar", outputAbsoluteArtifactFilename: "${pwd()}/target/my-folder/commons-collections.jar")