I am new to Python and Google Cloud Composer. I am trying to read a configuration (.properties) file from Google Cloud Storage in my Python script. The configuration file contains key-value pairs. I tried reading it both with configparser and with a plain with open(...) block. If the file is in the Composer environment's own bucket, we can give a path like '/home/airflow/gcs/dags/config.properties'.

But what path can I give for a file in another bucket?

I am trying to access it through storage_client using the code below:

from google.cloud import storage

separator = "="
keys = {}

def iterate_bucket():
    bucket_name = 'other-bucket'
    storage_client = storage.Client.from_service_account_json(
    '/home/airflow/gcs/data/*************.json')
    bucket = storage_client.get_bucket(bucket_name)
    blobs = bucket.list_blobs()
    return blobs

def read_Prop():
    blobs = iterate_bucket()

    for blob in blobs:
        if "config.properties" in blob.name:
            print("hello : ", blob.name)
            #file_name = "/home/airflow/gcs/" + blob.name
            file_name = blob.name

    with open(file_name, 'r') as f:
        for line in f:
            if separator in line:
                name, value = line.split(separator, 1)
                keys[name.strip()] = value.strip()

print(keys.get('any_key'))

I also tried configparser:

config = configparser.ConfigParser()
config.read(blob.name)

In both cases, the file in the other Cloud Storage bucket is not accessible from my Python script. I get a "No such file or directory" error.

What path can we give, or is there another way to access it?

Example 2 -

import yaml
from google.cloud import storage

def readYML():
    bucket_name = 'external-bucket'
    storage_client = storage.Client.from_service_account_json(
        '/home/airflow/gcs/data/private-key.json')
    bucket = storage_client.get_bucket(bucket_name)
    file_name = ""
    blobs = bucket.list_blobs()
    print("blobs : ", blobs)
    for blob in blobs:
        if "sample_config.yml" in blob.name:
            print("hello : ", blob.name)
            file_name = blob.name
            print("file_name : ", file_name)
    # The open() call below is the one that fails with [Errno 2]
    with open(file_name, 'r') as ymlfile:
        CFG = yaml.safe_load(ymlfile)
        print("inside readYML method : ", CFG['configs']['project_id'])
        return CFG

See the logs below. The print statements show the file name, but reading the file fails with a "No such file or directory" error.

[2020-01-02 12:39:12,853] {base_task_runner.py:101} INFO - Job 11782: Subtask Raw1 [2020-01-02 12:39:12,852] {logging_mixin.py:95} INFO - hello :  sample_config.yml
[2020-01-02 12:39:12,853] {logging_mixin.py:95} INFO - file_name :  sample_config.yml
[2020-01-02 12:39:12,853] {base_task_runner.py:101} INFO - Job 11782: Subtask Raw1 [2020-01-02 12:39:12,853] {logging_mixin.py:95} INFO - file_name :  sample_config.yml
[2020-01-02 12:39:12,855] {models.py:1846} ERROR - [Errno 2] No such file or directory: 'sample_config.yml'

Thanks in advance

Use the Google Cloud Console and look up the path for each object in your bucket. You will see how they are named. However, you cannot use the standard file I/O library to read objects. To read the objects, you need to use the same SDK that you are using to get their names (unless the objects are public, in which case you would use an HTTP request library). - John Hanley
Note: you have some issues with your code, for example referencing blob.name after the for loop (or is the indentation wrong in your code display?). - John Hanley
There is no indentation error; the code display is wrong. Anyway, I changed it. - Bhagesh Arora
From a security point of view, I recommend that you do not store and use a JSON key file. It's better to use the identity of Composer and to authorize this service account on your bucket (as Object Reader only, if no other actions are required). - guillaume blaquiere
Hi @BhageshArora, if you still experience this issue, can you add the complete error trace to your post so that the issue can be investigated further? However, if the issue is resolved by now, can you post the answer and accept it, for better visibility of the resolution? - Pawel Czuczwara
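
A minimal sketch of what the comments above suggest, not a confirmed fix. It assumes the google-cloud-storage library is installed and that the Composer environment's service account has been granted read access to the other bucket, so no JSON key file is needed. The function names fetch_object_text and parse_with_configparser are hypothetical. Because a .properties file has no section header, the text is prefixed with a dummy section so that configparser accepts it.

```python
import configparser


def fetch_object_text(bucket_name, blob_name):
    """Download an object's contents as text through the Cloud Storage client."""
    # Imported here so the parsing helper below also works without the library.
    from google.cloud import storage

    # No key file: the client picks up the environment's default credentials
    # (assumption: that identity can read objects in the bucket).
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    # On older library versions, blob.download_as_string().decode("utf-8")
    # does the same job.
    return blob.download_as_text()


def parse_with_configparser(text, section="root"):
    """Parse section-less key=value text by prepending a dummy section."""
    # interpolation=None keeps literal '%' characters in values from raising.
    config = configparser.ConfigParser(interpolation=None)
    config.read_string("[{}]\n{}".format(section, text))
    return dict(config[section])
```

It could be called as parse_with_configparser(fetch_object_text('other-bucket', 'config.properties')). Note that configparser lowercases keys by default.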

1 Answer

We can't read a configuration file in another bucket directly by path from Composer, the way we can for files in the environment's own bucket. A file in any other bucket can only be accessed through the Cloud Storage API; it can't be opened like a local file.
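
As an illustration of going through the storage API, here is a hedged sketch that first copies the object to a local temporary file and then reads it with the same key/value loop as in the question. The function names download_to_temp_file and read_properties are hypothetical, and the sketch assumes the google-cloud-storage library is installed and that the client's credentials can read the bucket.

```python
import os
import tempfile


def download_to_temp_file(bucket_name, blob_name):
    """Copy an object from a bucket to a local temporary file; return its path."""
    # Imported here so read_properties below works without the library.
    from google.cloud import storage

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    # Create an empty local file, then let the client write the object into it.
    fd, local_path = tempfile.mkstemp(suffix=os.path.basename(blob_name))
    os.close(fd)
    blob.download_to_filename(local_path)
    return local_path


def read_properties(path, separator="="):
    """Read key/value pairs from a local file, as in the question's loop."""
    keys = {}
    with open(path, "r") as f:
        for line in f:
            if separator in line:
                # Split on the first separator only, so values may contain '='.
                name, value = line.split(separator, 1)
                keys[name.strip()] = value.strip()
    return keys
```

With this, read_properties(download_to_temp_file('other-bucket', 'config.properties')) opens a path that actually exists on the worker, rather than the object name.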