I'd like to create an EMR cluster programmatically using spot pricing to achieve some cost savings. To do this, I am trying to retrieve EMR spot instance pricing from AWS using boto3, but the only API I'm aware of in boto3 is the EC2 client's describe_spot_price_history call - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#EC2.Client.describe_spot_price_history
The prices returned for EC2 do not match the EMR pricing listed here - https://aws.amazon.com/emr/pricing/. The EC2 values are almost double EMR's.
Is there a way that I can see the spot price history for EMR similar to EC2? I have checked https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html and several other pages of documentation from AWS online about this and have found nothing.
Here's the code snippet I use to estimate a price that I can bid on EMR instances:
    from datetime import datetime, timezone

    import boto3

    # Region inferred from the availability zones used below
    ec2 = boto3.client('ec2', region_name='us-east-1')

    max_bid_price = 0.140
    min_bid_price = max_bid_price
    az_choice = ''

    # With StartTime == EndTime, the call returns the price in effect at that moment
    now = datetime.now(timezone.utc)
    response = ec2.describe_spot_price_history(
        Filters=[
            {
                'Name': 'availability-zone',
                'Values': ['us-east-1a', 'us-east-1c', 'us-east-1d']
            },
            {
                'Name': 'product-description',
                'Values': ['Linux/UNIX (Amazon VPC)']
            }
        ],
        InstanceTypes=['r5.2xlarge'],
        StartTime=now,
        EndTime=now
    )

    # TODO: Add more Subnets in other AZ's if picking from our existing 3 is an issue
    # 'us-east-1b', 'us-east-1e', 'us-east-1f'
    for spot_price_history in response['SpotPriceHistory']:
        print(spot_price_history)
        # SpotPrice comes back as a string, so convert it before comparing
        spot_price = float(spot_price_history['SpotPrice'])
        if spot_price <= min_bid_price:
            min_bid_price = spot_price
            az_choice = spot_price_history['AvailabilityZone']
The above fails to find a price under my maximum bid, since EC2 spot prices are higher than the normal hourly amount Amazon charges for EMR on-demand instances (e.g. on-demand for EMR at that instance size costs only $0.126/hour, but on-demand for EC2 is $0.504/hour and spot instances go for about $0.20/hour).
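To make the failure easier to reproduce without calling AWS, the selection loop from my snippet can be pulled out into a plain function and run against a fake response. This is only a minimal sketch: `pick_cheapest_az` is a hypothetical helper name, and the sample prices are made up to resemble the roughly $0.20/hour spot prices I'm seeing. They are not real API output.

```python
def pick_cheapest_az(spot_price_history, max_bid_price):
    """Return (az, price) for the lowest spot price at or below
    max_bid_price, or (None, None) if every price is above it."""
    best_az, best_price = None, None
    for entry in spot_price_history:
        # SpotPrice is a string in the describe_spot_price_history response
        price = float(entry['SpotPrice'])
        if price > max_bid_price:
            continue  # too expensive for this bid
        if best_price is None or price < best_price:
            best_az, best_price = entry['AvailabilityZone'], price
    return best_az, best_price


# Made-up entries shaped like response['SpotPriceHistory']
sample_history = [
    {'AvailabilityZone': 'us-east-1a', 'SpotPrice': '0.2010'},
    {'AvailabilityZone': 'us-east-1c', 'SpotPrice': '0.1985'},
    {'AvailabilityZone': 'us-east-1d', 'SpotPrice': '0.2043'},
]

# With a 0.140 maximum bid nothing qualifies, which mirrors the problem above
print(pick_cheapest_az(sample_history, 0.140))
# With a higher maximum bid the cheapest AZ is selected
print(pick_cheapest_az(sample_history, 0.250))
```

With the 0.140 bid the function returns no AZ at all, which is the same situation my real snippet ends up in.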