
I am trying to define a create-cluster function for Cloud Dataproc. While going through the reference material I came across an idle-delete parameter (idleDeleteTtl) that auto-deletes the cluster after it has been unused for the specified amount of time. When I try to include it in the cluster config, I get: ValueError: Protocol message ClusterConfig has no "lifecycleConfig" field.

The create cluster function for reference:

def create_cluster(dataproc, project, zone, region, cluster_name, pip_packages):
    """Create the cluster."""
    print('Creating cluster...')
    zone_uri = \
        'https://www.googleapis.com/compute/v1/projects/{}/zones/{}'.format(
            project, zone)
    cluster_data = {
        'project_id': project,
        'cluster_name': cluster_name,
        'config': {
            'initialization_actions': [{
                'executable_file': 'gs://<some_path>/python/pip-install.sh'
            }],
            'gce_cluster_config': {
                'zone_uri': zone_uri,
                'metadata': {
                        'PIP_PACKAGES': pip_packages
                    }
            },
            'master_config': {
                'num_instances': 1,
                'machine_type_uri': 'n1-standard-1'
            },
            'worker_config': {
                'num_instances': 2,
                'machine_type_uri': 'n1-standard-1'
            },
            'lifecycleConfig': { #### PROBLEM AREA ####
                'idleDeleteTtl': '30m'
            }
        }
    }

    cluster = dataproc.create_cluster(project, region, cluster_data)
    cluster.add_done_callback(callback)
    global waiting_callback
    waiting_callback = True

I want this auto-delete behavior, ideally in the same function. I already have a manual delete function defined, but I want clusters to be deleted automatically when they are not in use.

Since it's a beta feature still I think your endpoint might need to be v1beta2 instead of plain v1. - Hitobat
Be sure that you're importing the Dataproc client from the v1beta2 package and not the v1 package as well (cloud.google.com/dataproc/docs/reference/rpc/…) It should be OK to leave the zone URI as v1, since beta Dataproc can use v1 compute. - Jerry Ding

1 Answer


You are calling the v1 Dataproc API with a parameter that only exists in the v1beta2 Dataproc API.

The fix is not the compute zone URI in your config (that can stay on compute/v1, since beta Dataproc works with v1 Compute), but the Dataproc client itself: import it from the v1beta2 package instead of the v1 package, e.g.

from google.cloud import dataproc_v1beta2

Also note that the Python client expects the protobuf snake_case field names, so the config key should be lifecycle_config with idle_delete_ttl, not the REST API's camelCase lifecycleConfig / idleDeleteTtl.
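As a sketch, and assuming the v1beta2 Python client from the google-cloud-dataproc package is installed: the helper below builds the cluster dict using the snake_case field names, with the idle-delete TTL expressed as a Duration message ({'seconds': ...}) rather than the '30m' string used by the REST JSON mapping. The function name build_cluster_config and the exact Duration form are illustrative assumptions, not taken from the original post:

```python
# Assumed client import for the beta API (not exercised below):
#   from google.cloud import dataproc_v1beta2
#   dataproc = dataproc_v1beta2.ClusterControllerClient()

def build_cluster_config(project, zone, cluster_name, pip_packages):
    """Build a cluster dict including an idle-delete TTL (v1beta2 only)."""
    zone_uri = (
        'https://www.googleapis.com/compute/v1/projects/{}/zones/{}'
        .format(project, zone))  # the zone URI can stay on compute v1
    return {
        'project_id': project,
        'cluster_name': cluster_name,
        'config': {
            'gce_cluster_config': {
                'zone_uri': zone_uri,
                'metadata': {'PIP_PACKAGES': pip_packages},
            },
            'master_config': {
                'num_instances': 1,
                'machine_type_uri': 'n1-standard-1',
            },
            'worker_config': {
                'num_instances': 2,
                'machine_type_uri': 'n1-standard-1',
            },
            # snake_case field names, and a Duration message for the TTL:
            'lifecycle_config': {
                'idle_delete_ttl': {'seconds': 30 * 60},  # 30 minutes
            },
        },
    }
```

The resulting dict would then be passed to the v1beta2 client as before, e.g. dataproc.create_cluster(project, region, build_cluster_config(...)).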