2
votes

I have two questions:

  1. Since storage bucket names are globally unique, how do I keep the bucket name exactly the same in the development and production environments? Or what are the best practices for dev and prod environments in a data-based setup?

  2. How do I copy data from one project to another? I tried searching but could not find an efficient way to copy between two projects.

PS: Storage Transfer allows copying between two buckets within the same project, not across projects. I was not able to find a bucket from a different project even with the search option. I searched using gs://another-project-bucket

3
What do you mean by copying data from one project to another? Do you mean copying data from one GCS bucket to another? - Kolban
You might want to consider splitting this into multiple questions. Stack Overflow posts are usually expected to contain one question each. - Kolban
Personally, I created two projects, myapp-prod and myapp-test. In every file or command, I pass the project name as a variable. I think it is the easiest way if you are just starting. Which GCP products are you using? - ThisIsMyName
@Kolban Resources in GCP are organized into projects. To separate environments (dev and prod), separate projects are created. Both projects have Cloud Storage buckets, but they cannot communicate directly without additional permissions being granted, because the project is the boundary. Buckets, however, are a global resource, so their names must be globally unique, beyond the boundary of the project. - Darpan Patel
@ThisIsMyName Thanks for the recommendation. The major functional components (apart from networking) used in the projects are Storage, BigQuery, Pub/Sub, Cloud Functions, Dataprep, and Scheduler. - Darpan Patel

3 Answers

2
votes

First question

Since storage bucket names are globally unique, how do I keep the bucket name exactly the same in the development and production environments? Or what are the best practices for dev and prod environments in a data-based setup?

You are correct. As far as Google Cloud Storage buckets are concerned, all bucket names reside in a single Cloud Storage namespace. As per the documentation, this means that:

Every bucket name must be unique. Bucket names are publicly visible. If you try to create a bucket with a name that already belongs to an existing bucket, Cloud Storage responds with an error message. However, once you delete a bucket, you or another user can reuse its name for a new bucket.

As for best practices for development and production environments, I would say that "separation of concerns" is the best option here: one project dedicated to development and a separate project dedicated to production. Because bucket names are global, the two projects cannot both own a bucket with the same name, so the names have to differ, for example by an environment suffix. You can also run both environments, dev and prod, within a single project, although this option is not ideal in some cases.
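One way to live with globally unique names is to derive each bucket name from the environment, so that scripts differ between dev and prod by a single variable. The following is only a minimal sketch: the `myapp` prefix, the `ENVIRONMENT` variable, and the `raw` purpose are hypothetical names chosen for illustration, not anything prescribed by Cloud Storage.

```shell
# Hypothetical helper: build a bucket URL from an environment and a purpose.
# "myapp" is an assumed application prefix; replace it with your own.
bucket_name() {
  # $1 = environment (e.g. dev or prod), $2 = purpose (e.g. raw)
  printf 'gs://myapp-%s-%s\n' "$1" "$2"
}

# Default to dev unless the caller exports ENVIRONMENT=prod.
ENVIRONMENT="${ENVIRONMENT:-dev}"
RAW_BUCKET="$(bucket_name "$ENVIRONMENT" raw)"
echo "$RAW_BUCKET"
```

Every command that touches storage can then reference `$RAW_BUCKET` instead of a hard-coded name.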


Second question

How do I copy data from one project to another? I tried searching but could not find an efficient way to copy between two projects.

The answer can vary for this question:

  1. You can copy GCS bucket objects across projects using the gsutil cp command, REST APIs, or GCS Client Libraries (Java, Node.js, Python). More info can be found here.
  2. You can also achieve this using the Storage Transfer Service to move data from one Cloud Storage bucket to another, so that it is available to different groups of users or applications. Check the link for more information.

An example using gsutil cp would be as follows:

    gsutil cp gs://[SOURCE_BUCKET_NAME]/[SOURCE_OBJECT_NAME] gs://[DESTINATION_BUCKET_NAME]/[NAME_OF_COPY]

Where:

[SOURCE_BUCKET_NAME] is the name of the bucket containing the object you want to copy. For example, my-bucket.

[SOURCE_OBJECT_NAME] is the name of the object you want to copy. For example, pets/dog.png.

[DESTINATION_BUCKET_NAME] is the name of the bucket where you want to copy your object. For example, another-bucket.

[NAME_OF_COPY] is the name you want to give the copy of your object. For example, shiba.png.


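IMPORTANT: Make sure that the account running the copy has the correct permissions: read access to the source bucket and write access to the destination bucket, even when the buckets belong to different projects.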

You can also check How can I move data directly from one Google Cloud Storage project to another?.
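The cross-project copy above depends on the same identity being able to read the source bucket and write to the destination bucket. As a rough sketch of how that might be set up, the function below only prints the `gsutil` commands so they can be reviewed before being run. The bucket names and the member are hypothetical placeholders, and the role choices (`objectViewer` on the source, `objectAdmin` on the destination) are one reasonable assumption rather than the only valid combination.

```shell
# Hypothetical placeholders: substitute your own buckets and identity.
SRC_BUCKET="gs://myapp-prod-raw"
DST_BUCKET="gs://myapp-dev-raw"
MEMBER="user:dev-engineer@example.com"

# Print the commands instead of running them, so they can be reviewed first.
plan_cross_project_copy() {
  # $1 = source bucket, $2 = destination bucket, $3 = IAM member
  # Grant read access on the source bucket.
  echo "gsutil iam ch $3:objectViewer $1"
  # Grant write access on the destination bucket.
  echo "gsutil iam ch $3:objectAdmin $2"
  # Copy everything recursively, in parallel.
  echo "gsutil -m rsync -r $1 $2"
}

plan_cross_project_copy "$SRC_BUCKET" "$DST_BUCKET" "$MEMBER"
```

Once the printed commands look right, they can be run one by one; the two `gsutil iam ch` calls need to be issued by someone who is allowed to change IAM policy on the respective bucket.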

1
votes
  1. As a best practice I'd recommend using different buckets for production and development, to avoid potentially having untested dev code impact production data.

  2. Copying is efficient (metadata-only, no data copying) if the source and destination objects have the same location and storage class.

0
votes

How do I copy data from one project to another? I tried searching but could not find an efficient way to copy between two projects.

  1. Create two projects:

    gcloud projects create env-proj
    gcloud projects create env-proj2
    
  2. Set project property to source project:

    gcloud config set project env-proj
    
  3. Create a local file while the source project is active:

    nano file
    cat file
    # This is a file 
    
  4. Create a bucket in source project:

    gsutil mb gs://testbucket-env
    
  5. Copy the file to the bucket created:

    gsutil cp file gs://testbucket-env
    
  6. Set project property to destination project:

    gcloud config set project env-proj2
    
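  7. Fetch the file from the source bucket while the destination project is active. Note that this moves the object to the local working directory, and that gsutil mv deletes the source object (use gsutil cp to keep it):

    gsutil mv gs://testbucket-env/file file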
    
  8. Verify the contents:

    cat file
    # This is a file
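The walkthrough above ends with the file in the local directory rather than in a bucket owned by the destination project. A possible continuation, sketched under the assumption that a second bucket named `testbucket-env2` (a hypothetical name) is acceptable, creates that bucket in `env-proj2` and copies bucket to bucket. The `run` wrapper is an illustrative helper that prints each command and only executes it when `DRY_RUN=0` is set.

```shell
# DRY_RUN defaults to 1, so the script only prints; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Echo the command, then execute it only when dry-run mode is off.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# testbucket-env2 is a hypothetical destination bucket name.
# -p selects the owning project without changing the gcloud config.
run gsutil mb -p env-proj2 gs://testbucket-env2
# Copy directly between the two buckets; no local download is needed.
run gsutil cp gs://testbucket-env/file gs://testbucket-env2/file
# Print the copied object to confirm it arrived.
run gsutil cat gs://testbucket-env2/file
```

Passing `-p` to `gsutil mb` avoids switching the active project back and forth with `gcloud config set project`.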