1
votes

I am trying to develop a clone workflow for my properly working ARM VM running Ubuntu. This VM was created from the Bitnami LAMP image in the Marketplace.

I am trying to use -CreateOption attach instead of fromImage, which to my knowledge should work. I am aware of the other option (deprovision->capture->-CreateOption fromImage), but that has problems of its own; see "Creating new ARM VM from captured image: Blob name in URL must end with '.vhd' extension error". The workflow I followed matches many descriptions I have found, and I do not understand this login issue. Hopefully I am just missing a simple step.

I've tried this workflow twice with different source ARM VMs and got the very same result: the new machine seems fully operational, but I cannot log in to it via SSH with the known username and password.

Diagnostics:

  • The web server and MySQL work properly on the new machine: after it starts, I can view the web sites it serves.
  • I've omitted the inbound rule configuration from the script below, but I successfully allowed HTTP (see above) and SSH. SSH connects and prompts for the password, then rejects it as wrong.

Here is what I've done:

  • Stopped the fully functional ARM VM (waagent -deprovision was not run)
  • Copied the OS vhd to a new .vhd blob (this succeeded; the copy script is off topic here)
  • Then ran the following script, which completed successfully:

Login-AzureRmAccount
Select-AzureRmSubscription -SubscriptionName "Visual Studio Premium with MSDN"

# Create VM from an existing image

$location = "westeurope"
$vmSize = "Standard_DS2"

#Existing resource name parameters:
$rgName = "rg-wp"
$vnetName= "vn-wp" 
$stName = "mystorage"
#This vhd is a copy of a completely working ARM OS vhd: 
$vhdUri = "https://mystorage.blob.core.windows.net/vhds/disk-wp-01.vhd"

#Newly creatable resource names and other parameters
$vmName = "vm-wp-02"
$nicName="ni-wp-02"
$pipName="pip-wp-02"
$nsgName="nsg-wp-02"
$vhdName = "disk-wp-02"

$vnet = Get-AzureRmVirtualNetwork -Name $vnetName -ResourceGroupName $rgName
$storageAccount = Get-AzureRmStorageAccount -AccountName $stName -ResourceGroupName $rgName 

$pip = New-AzureRmPublicIpAddress -Name $pipName -ResourceGroupName $rgName -Location $location -AllocationMethod Static -DomainNameLabel $pipName 
$nsg = New-AzureRmNetworkSecurityGroup -Name $nsgName -ResourceGroupName $rgName -Location $location
$nic = New-AzureRmNetworkInterface -Name $nicName -ResourceGroupName $rgName -Location $location -SubnetId $vnet.Subnets[0].Id -PublicIpAddressId $pip.Id -NetworkSecurityGroupId $nsg.Id

# Configure VM: 
$vm = New-AzureRmVMConfig -vmName $vmName -vmSize $vmSize
$vm = Add-AzureRmVMNetworkInterface -VM $vm -Id $nic.Id
$vm = Set-AzureRmVMOSDisk -VM $vm -Name $vhdName -VhdUri $vhdUri -Linux -CreateOption attach

New-AzureRmVM -ResourceGroupName $rgName -Location $location -VM $vm
2
I have just tried running this against a Bitnami image, and am getting the error "Creating a virtual machine from Marketplace image requires Plan information in the request". Did you experience this? - Michael B
No. However, I have just isolated and diagnosed that case, and I think you are using -fromImage and not -attach. My script runs like a charm and the VM runs like a charm; "just" the login credentials are somehow ruined. I got the error message you refer to when trying to create a VM from a captured (not copied) image. See that case described, with some added diagnostics, here: stackoverflow.com/q/35871496/1157814 - g.pickardou

2 Answers

1
votes

I had a look through this, and it seems to be something within Bitnami that is preventing the login.

I didn't put a lot of time into trying to work out how everything connects, but there is definitely code there that checks whether a new VM has been deployed from the image, as it always regenerates the sshd keys when that happens.

All of the expected files are identical (I performed a diff between a plain Ubuntu box and the Bitnami box).

The error it produces is:

Mar  8 20:09:19 lamp sshd[2511]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=13.69.83.238  user=ubuntu
Mar  8 20:09:21 lamp sshd[2511]: Failed password for ubuntu from 13.69.83.238 port 32991 ssh2

Yet /etc/shadow has the right salt / hash for the password I'm using. Someone with more Linux / Bitnami experience might be able to pinpoint what the error is. But at first blush, it should be working...

This does work exactly as expected on a plain Ubuntu image, so it might be simpler to build a new machine from scratch.

Otherwise you might get a better answer by posting on the Bitnami forums. (It's very likely just a config setting.)

1
votes

I accidentally obtained some diagnostic results while experimenting with capture->create fromImage (CCFI). Do not confuse this with copy->attach (CA), which is what this question is about.

If you do not care about the details, just jump to the very end and see the "Currently known simplest workaround".

Diagnostics:

  • If a newly created marketplace Bitnami LAMP ARM VM undergoes CA, the original admin user cannot log on anymore.

  • If a newly created marketplace Bitnami LAMP ARM VM undergoes CCFI, the original admin user can log on, and the newly defined credentials (admin2) can also log on.

  • If a VM that was created via CCFI undergoes CA, the original admin user can log on, but admin2 (created in the previous CCFI) cannot.

After this experiment I was able to log on with the original admin user to a CA-d machine that had previously been CCFI-d.

Examining /etc/shadow, it turned out that admin2 was disabled: there was a ! sign before the password hash. The only remaining question is who did this clearly unwanted and unexpected thing. There are two suspects and three scenarios:

  • some bitnami custom script
  • waagent
  • waagent misled by some bitnami custom action or misconfiguration

I am not sure, but my first guess is "pure waagent" and my second is "waagent misled...".

Here is why: I've examined the /var/lib/waagent/ovf-env.xml file, and found that there is a

<ns1:UserPassword>REDACTED</ns1:UserPassword>

entry there for the last provisioned user. This is "admin" on a fresh marketplace machine and admin2 on a CCFI-d machine. When a machine undergoes CA, this user is disabled. As this file belongs to waagent, my guess is that waagent somehow detects the CA operation (why?) and disables the user on first start. The diagnostics are fully consistent with this theory.

One thing is for sure: this issue is an unwanted automatism that happens within the VM itself, and it is either Bitnami specific or Linux specific.

This issue always happens during/after CA. This is clearly unwanted, because CA has nothing to do with provisioning, deprovisioning, generalizing, clearing sensitive information, etc.

Workaround:

Tried and proven: do a CCFI first; you can then CA the CCFI-d VM as many times as you want.

Currently known simplest workaround:

It is worth trying to eliminate the CCFI step from the workaround because of its time cost (not to mention that the generalized original machine can no longer be started, by design in Azure).
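The check described above can be sketched as two small shell helpers. This is only an illustration, not something waagent or Bitnami provides: the function names are made up, and it assumes the user name sits in a `UserName` element next to the `UserPassword` entry, which should be verified against your own ovf-env.xml.

```shell
#!/bin/sh
# Hypothetical helpers, not part of waagent or Bitnami.

# Print the user name recorded in an ovf-env.xml file.
# Assumption: the name is stored in a <...UserName> element.
provisioned_user() {
    sed -n 's|.*<[^>/]*UserName>\([^<]*\)</.*|\1|p' "$1" | head -n 1
}

# Succeed (exit status 0) if the given account's password field in a
# shadow-format file starts with "!", i.e. the password is locked.
# usage: is_locked USER [SHADOW_FILE]
is_locked() {
    awk -F: -v u="$1" \
        '$1 == u && $2 ~ /^!/ { found = 1 } END { exit !found }' \
        "${2:-/etc/shadow}"
}

# Example (run as root, since /etc/shadow is readable only by root):
#   user=$(provisioned_user /var/lib/waagent/ovf-env.xml)
#   is_locked "$user" && echo "$user is locked"
```

If the theory holds, running this on a CA-d machine should report the last provisioned user as locked, while other accounts remain unlocked.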

Do the following on any machine:

sudo adduser dummyadmin
sudo adduser dummyadmin sudo
Then edit the /var/lib/waagent/ovf-env.xml file and replace admin with dummyadmin.

If you do this on any machine, including a freshly created and then customized marketplace machine, you can freely CA it. Of course you cannot log in to the new machine with dummyadmin, but you can log in with admin.
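The manual edit of ovf-env.xml could be scripted roughly as follows. This is a sketch under assumptions: the function name is made up, and it assumes the user name is stored in a `UserName` element and contains no characters that are special to sed. It keeps a .bak copy of the file in case the edit needs to be undone.

```shell
#!/bin/sh
# Hypothetical helper, not part of waagent or Bitnami.
# Replace the user name recorded in an ovf-env.xml file, keeping a
# backup copy next to it with a .bak suffix.
# Assumption: the name is stored in a <...UserName> element.
# usage: swap_provisioned_user FILE OLD_USER NEW_USER
swap_provisioned_user() {
    sed -i.bak "s|\(<[^>/]*UserName>\)$2\(</\)|\1$3\2|" "$1"
}

# Example, following the steps above (run the last line as root):
#   sudo adduser dummyadmin
#   sudo adduser dummyadmin sudo
#   swap_provisioned_user /var/lib/waagent/ovf-env.xml admin dummyadmin
```

Only an exact match of the old name is replaced, so a name such as admin2 is left alone when admin is being swapped out.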