We have a large number of tasks (~30) launched by SCDF on PCF, but we are running into disk space issues on the SCDF server. The issue appears to be caused by SCDF downloading the artifacts each time a task is invoked.

  1. The artifacts in our case are downloaded from a REST endpoint https://service/{artifact-name-version.jar} (which in turn serves them from an S3 repository).
  2. Every time a task is invoked, SCDF appears to download the artifact (to the ~tmp/spring-cloud-deployer directory) and verify the SHA-1 hash to make sure it is the latest before it launches the task on PCF.
  3. The downloaded artifacts never get cleaned up.

Downloading the artifacts on every launch and filling up the disk space in ~tmp/ of the SCDF instance on PCF is not desirable. Is there a way to tell SCDF not to download an artifact if it already exists?

Also, can someone please explain the mechanism for downloading artifacts, comparing the SHA-1 hash, and launching tasks (and the various options around it)?

Thanks !

1 Answer

SCDF downloads artifacts on the server side for the following reasons:

1) Metadata (application properties) retrieval. If you have an explicit metadata resource, then only that is downloaded.
2) The corresponding deployer (local, CF) eventually downloads the artifact before it sends the deployment or launch request.

The hash value is used for unique temp file creation when the artifact is downloaded.
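
To illustrate the idea of a hash-derived temp file name, here is a minimal Python sketch. It is not SCDF's actual code: the function name `temp_path_for`, the choice to hash the artifact URL, and the `<hash>-<file name>` layout are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def temp_path_for(uri: str, cache_dir: Path) -> Path:
    """Map an artifact URI to a stable, unique file name inside cache_dir.

    Hypothetical scheme (an assumption, not SCDF's implementation):
    SHA-1 of the URI string, followed by the last path segment of the URI.
    """
    digest = hashlib.sha1(uri.encode("utf-8")).hexdigest()
    file_name = uri.rsplit("/", 1)[-1]
    return cache_dir / f"{digest}-{file_name}"
```

Under this scheme the same URL always maps to the same temp file, and two different URLs that end in the same file name do not collide.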

Is there a way to tell SCDF not to download artifact if it already exists?

HTTP-based artifacts (or any explicit URL-based artifacts other than Maven or Docker ones) are always downloaded, because the resource at a given URL can be replaced with a different one, so we don't want to rely on the cache in this case.

Also, we recently deprecated the cache cleanup mechanism, as it wasn't being used effectively.

If your use case requires this cache cleanup feature (for example, because your disk space limit can't accommodate caching multiple artifacts), please create a GitHub request here

We were also considering removing HTTP-based artifacts after they are deployed/launched. It looks like that is worth revisiting now.
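
In the meantime, one possible workaround is to prune old files from the download directory yourself. The following Python sketch is only an illustration under stated assumptions: `remove_stale_artifacts` is a hypothetical helper, the directory to clean and the age threshold are yours to choose, and it is not part of SCDF. Deleting a file that an in-flight launch is still reading could break that launch, so the threshold should be comfortably longer than a deployment takes.

```python
import time
from pathlib import Path
from typing import List, Optional


def remove_stale_artifacts(cache_dir: Path,
                           max_age_seconds: float,
                           now: Optional[float] = None) -> List[Path]:
    """Delete regular files in cache_dir whose mtime is older than max_age_seconds.

    Returns the list of removed paths. Subdirectories are left untouched.
    """
    now = time.time() if now is None else now
    removed: List[Path] = []
    if not cache_dir.is_dir():
        return removed
    for entry in cache_dir.iterdir():
        # Only plain files are considered; age is measured from last modification.
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            entry.unlink()
            removed.append(entry)
    return removed
```

For example, `remove_stale_artifacts(Path("/path/to/spring-cloud-deployer"), 3600)` would remove files last modified more than an hour ago; the path shown is a placeholder for wherever your SCDF instance stores its downloads.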