1
votes

I need to mount the azure file storage to Linux-Pools when they are being spun-up.I am following the instructions given here to achieve that: mounting Azure-File Storage to Batch Specically in my Azure CLI script under the Pools start commands I am inserting something which looks like this

--start-task-command-line="apt-get update && apt-get install cifs-utils && mkdir -p {} && mount -t cifs {} {} -o vers=3.0,username={},password={},dir_mode=0777,file_mode=0777,serverino".format(_COMPUTE_NODE_MOUNT_POINT, _STORAGE_ACCOUNT_SHARE_ENDPOINT, _COMPUTE_NODE_MOUNT_POINT, _STORAGE_ACCOUNT_NAME, _STORAGE_ACCOUNT_KEY)

but when I run the tasks with the auto-user that batch uses by default I get an error in the stderr.txt file mentioning that it was unable to create the "/mnt/MyAzureFileshare" directory and so my guess is the mounting didn't occur during the pool creation process.I saw a very similar question to the one I am facing:setting custom user identity for tasks and even the official Microsoft documentation goes over this in detail:Run Tasks under User accounts in Batch but none of them put a light on how to achieve this using Azure CLI.

In order to install specific packages so that Azure File Storage can be mounted requires sudo privileges and I am unable to do that through the Azure-CLI. In order to recreate the error I would recommend having a look at this:app to replicate the issue

What I want to achieve is:

1) Create a Pool with the Azure-File Storage mounted on it and elevate the privileges of the auto-user to the admin level using Azure CLI

2) Run tasks with the same auto-user with Admin Privileges using the azure CLI

Update 1: I was able to mount Azure File Storage with Batch using the Azure CLI. I still am not able to populate the Azure File Storage with the output files of the app that I deployed on Batch Nodes.I have got no error in the stderr.txt files. The output of the stderr.txt file is:

WARNING: In "login" auth mode, the following arguments are ignored: --account-key

Alive[################################################################]  100.0000%
Finished[#############################################################]  100.0000%

pdf--->png:   0%|          | 0/1 [00:00<?, ?it/s]
pdf--->png: 100%|##########| 1/1 [00:00<00:00,  1.16it/s]WARNING: In "login" auth mode, the following arguments are ignored: --account-key
WARNING: uploading /mnt/batch/tasks/workitems/pdf-processing-job-2018-10-29-15-36-15/job-1/mytask-0/wd/png_files-2018-10-29-15-39-25/akronbeaconjournal_20180108_AkronBeaconJournal_0___page---0.png

Alive[################################################################]  100.0000%
Finished[#############################################################]  100.0000%

The Python App that was deployed on the Batch Nodes is:

import os
import fitz
import subprocess
import argparse
import time
from tqdm import tqdm
import sentry_sdk
import sys
import datetime

def azure_active_directory_login(azure_username,azure_password,azure_tenant):
    try:
        azure_login_output=subprocess.check_output(["az","login","--service-principal","--username",azure_username,"--password",azure_password,"--tenant",azure_tenant])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Invalid Azure Login Credentials")
        sys.exit("Invalid Azure Login Credentials")

def download_from_azure_blob(azure_storage_account,azure_storage_account_key,input_azure_container,file_to_process,pdf_docs_path):
    file_to_download=os.path.join(input_azure_container,file_to_process)
    try:
        subprocess.check_output(["az","storage","blob","download","--container-name",input_azure_container,"--file",os.path.join(pdf_docs_path,file_to_process),"--name",file_to_process,"--account-key",azure_storage_account_key,\
        "--account-name",azure_storage_account,"--auth-mode","login"])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("unable to download the pdf file")
        sys.exit("unable to download the pdf file")

def pdf_to_png(input_folder_path,output_folder_path):
    pdf_files=[x for x in os.listdir(input_folder_path) if x.endswith((".pdf",".PDF"))]
    pdf_files.sort()
    for pdf in tqdm(pdf_files,desc="pdf--->png"):
        doc=fitz.open(os.path.join(input_folder_path,pdf))
        page_count=doc.pageCount
        for f in range(page_count):
            page=doc.loadPage(f)
            pix = page.getPixmap()
            if pdf.endswith(".pdf"):
                png_filename=pdf.split(".pdf")[0]+"___"+"page---"+str(f)+".png"
                pix.writePNG(os.path.join(output_folder_path,png_filename))
            elif pdf.endswith(".PDF"):
                png_filename=pdf.split(".PDF")[0]+"___"+"page---"+str(f)+".png"
                pix.writePNG(os.path.join(output_folder_path,png_filename))


def upload_to_azure_blob(azure_storage_account,azure_storage_account_key,output_azure_container,png_docs_path):
    try:
        subprocess.check_output(["az","storage","blob","upload-batch","--destination",output_azure_container,"--source",png_docs_path,"--account-key",azure_storage_account_key,\
        "--account-name",azure_storage_account,"--auth-mode","login"])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Unable to upload file to the container")

def upload_to_fileshare(png_docs_path):
    try:
        subprocess.check_output(["cp","-r",png_docs_path,"/mnt/MyAzureFileShare/"])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("unable to upload to azure file share ")

if __name__=="__main__":
    #Credentials 
    sentry_sdk.init("<Sentry Creds>")
    azure_username=<azure_username>
    azure_password=<azure_password>
    azure_tenant=<azure_tenant>
    azure_storage_account=<azure_storage_account>
    azure_storage_account_key=<azure_account_key>
    try:
        parser = argparse.ArgumentParser()
        parser.add_argument("input_azure_container",type=str,help="Location to download files from")
        parser.add_argument("output_azure_container",type=str,help="Location to upload files to")
        parser.add_argument("file_to_process",type=str,help="file link in azure blob storage")
        args = parser.parse_args()
        timestamp = time.time()
        timestamp_humanreadable= datetime.datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d-%H-%M-%S')
        task_working_dir=os.getcwd()
        file_to_process=args.file_to_process
        input_azure_container=args.input_azure_container
        output_azure_container=args.output_azure_container
        pdf_docs_path=os.path.join(task_working_dir,"pdf_files"+"-"+timestamp_humanreadable)
        png_docs_path=os.path.join(task_working_dir,"png_files"+"-"+timestamp_humanreadable)
        os.mkdir(pdf_docs_path)
        os.mkdir(png_docs_path)
    except Exception as e:
        sentry_sdk.capture_exception(e)
    azure_active_directory_login(azure_username,azure_password,azure_tenant)
    download_from_azure_blob(azure_storage_account,azure_storage_account_key,input_azure_container,file_to_process,pdf_docs_path)
    pdf_to_png(pdf_docs_path,png_docs_path)
    upload_to_azure_blob(azure_storage_account,azure_storage_account_key,output_azure_container,png_docs_path)
    upload_to_fileshare(png_docs_path)

The upload_to_fileshare() in the python app above should initiate the upload but in my case nothing happens and there is no error in the copy operation in the stderr.txt files

Please let me know a way to troubleshoot this issue

1

1 Answers

1
votes

It does not look like the run elevated parameter is exposed via a command line argument through the CLI. You can however specify a JSON file to the --json argument formatted as the REST API object to get all functionalities.