I'm experimenting with cloudformation/autoscaling groups to provision our build agents. These take a loong time to provision, and I'm using Lifecycle hooks to signal readiness after a node is finished installing its dependencies, which may take as long as 90 minutes after it comes online.
My LifeCycleHook is configured with HeartbeatTimeout: 7200, and this seems to work fine. However, I get the impression that Cloudformation will only wait 60 minutes for the ASG to stabilize before it gives up. Are there other timeouts besides the HeartbeatTimeout that I should set in order to allow my nodes enough time to go thru their installation process?
(I realize this is an edge case)
UPDATE: As I worked more with this I realize there are two separate methods of controlling the state of an "Autoscale Instance" - lifecycle hooks and "cloudformation signals". It seems that the problem is that cloudformation is completely unaware of the lifecycle hook being in a wait state. I looked into replacing my logic by using cfn signals, but it seems that on Rolling Updates the max timeout ("PauseTime") setting that can be configured is one hour.
All in all, there doesn't seem to be a way to reliably emit status during instance deployment if the deployment takes more than one hour. I've forwarded this to aws support to see what they say.