I'm in the process of creating an Azure Logic App to work with Abbyy's OCR REST API.

I use the Create SAS URI by path action, which returns Web URL. Web URL contains the full URL to my blob, including the SAS token.

Web URL is passed to an HTTP action as a query parameter. In code view, the relevant part of the JSON looks like:

"method": "GET",
"uri": "https://cloud.ocrsdk.com/processRemoteImage?source=@{body('Create_SAS_Uri_by_path')?['WebUrl']}&language=English&exportformat=pdfSearchable&description=blah&imageSource=scanner"

The uri resolves thus:

https://cloud.ocrsdk.com/processRemoteImage?source=https://mysaaccount.blob.core.windows.net/inbox/180730110047_0001.pdf?sv=2017-04-17&sr=b&sig=2IGMt1qDZthaBSyvD3WJ6T1zc36Wr%2FNoiB4Wki5Lf28%3D&se=2018-08-16T11%3A16%3A48Z&sp=r&language=English&exportformat=pdfSearchable&description=blah&imageSource=scanner

This results in the error (450):

<?xml version="1.0" encoding="utf-8"?><error><message language="english">Invalid parameter: sr</message></error>

The API is picking up the sr= query parameter from the SAS token and treating it as one of its own. Of course, the API doesn't have an sr argument, and even if it did, the value would be wrong.
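To see why the error names sr, it can help to split the resolved URI the way a generic query-string parser would. This is only a sketch using Python's standard urllib.parse; how the Abbyy endpoint really parses requests is an assumption.

```python
from urllib.parse import urlsplit, parse_qs

# The resolved URI from above, split across lines for readability.
resolved = (
    "https://cloud.ocrsdk.com/processRemoteImage"
    "?source=https://mysaaccount.blob.core.windows.net/inbox/180730110047_0001.pdf"
    "?sv=2017-04-17&sr=b&sig=2IGMt1qDZthaBSyvD3WJ6T1zc36Wr%2FNoiB4Wki5Lf28%3D"
    "&se=2018-08-16T11%3A16%3A48Z&sp=r"
    "&language=English&exportformat=pdfSearchable&description=blah&imageSource=scanner"
)

# Everything after the first "?" is the query; it is then split on every "&".
params = parse_qs(urlsplit(resolved).query)

# "source" stops at the first "&", so it only keeps the sv part of the SAS token.
print(params["source"][0])

# The remaining SAS fields surface as top-level parameters of the OCR request.
leaked = [name for name in ("sr", "sig", "se", "sp") if name in params]
print(leaked)
```

With this parser, sv stays attached to source and sr is the first SAS field to leak out, which would be consistent with the error message complaining about sr.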

I did find this question, and I attempted to percent-escape the ampersands (&) by adjusting my code to use the replace function, thus:

"method": "GET",
"uri": "https://cloud.ocrsdk.com/processRemoteImage?source=@{replace(body('Create_SAS_Uri_by_path')?['WebUrl'],'%26','%2526' )}&language=English&exportformat=pdfSearchable&description=blah&imageSource=scanner"

However, this has no effect, i.e. the resulting URI is the same as above. Interestingly, it appears the SAS token itself already makes use of percent-escaping.
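A quick way to check why the replace changes nothing is to look for the search string in the SAS URL itself. The sketch below uses Python's str.replace as a stand-in for the Logic App replace() function, and assumes Web URL looks like the blob portion of the resolved URI shown earlier.

```python
# Assumed value of WebUrl, taken from the resolved URI shown earlier.
sas_url = (
    "https://mysaaccount.blob.core.windows.net/inbox/180730110047_0001.pdf"
    "?sv=2017-04-17&sr=b&sig=2IGMt1qDZthaBSyvD3WJ6T1zc36Wr%2FNoiB4Wki5Lf28%3D"
    "&se=2018-08-16T11%3A16%3A48Z&sp=r"
)

# Same search and replacement strings as the replace() call above.
attempted = sas_url.replace("%26", "%2526")

# The URL contains literal "&" separators but no "%26", so nothing matches.
unchanged = attempted == sas_url
ampersands = sas_url.count("&")
print(unchanged, ampersands)  # prints: True 4
```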

If anyone has any suggestions on how to resolve or work around this problem, I'd be most grateful if you would share your thoughts.

Does anyone know if the Logic App actions are open source and, if so, what the GitHub link is? I can then raise an issue.

1 Answer

Resolved it.

Basically, I was on the right track, but I searched for %26 when I should have searched for &. Using the code above, it should read:

"method": "GET",
"uri": "https://cloud.ocrsdk.com/processRemoteImage?source=@{replace(body('Create_SAS_Uri_by_path')?['WebUrl'],'&','%26' )}&language=English&exportformat=pdfSearchable&description=blah&imageSource=scanner"

And therefore, the URI reads:

https://cloud.ocrsdk.com/processRemoteImage?source=https://mysaaccount.blob.core.windows.net/intray/180730110047_0001.pdf?st=2018-08-17T10%3A55%3A38Z%26se=2018-08-18T10%3A55%3A38Z%26sp=r%26sv=2018-03-28%26sr=b%26sig=FTRoVgV7MRz5d5gTgrEs6D0QSy3268BqscZX1LHbJYQ%3D&language=English&exportformat=pdfSearchable&description=blah&imageSource=scanner
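As a rough check of what the receiving side sees after this change, the sketch below applies the same replacement in Python and parses the result with urllib.parse. The parser is a stand-in; the real behaviour of the Abbyy endpoint is an assumption.

```python
from urllib.parse import urlsplit, parse_qs

# SAS URL as it appears inside the URI above, before the "&" replacement.
sas_url = (
    "https://mysaaccount.blob.core.windows.net/intray/180730110047_0001.pdf"
    "?st=2018-08-17T10%3A55%3A38Z&se=2018-08-18T10%3A55%3A38Z&sp=r"
    "&sv=2018-03-28&sr=b&sig=FTRoVgV7MRz5d5gTgrEs6D0QSy3268BqscZX1LHbJYQ%3D"
)

# Equivalent of replace(..., '&', '%26') in the Logic App expression.
escaped = sas_url.replace("&", "%26")
request_uri = (
    "https://cloud.ocrsdk.com/processRemoteImage"
    f"?source={escaped}"
    "&language=English&exportformat=pdfSearchable&description=blah&imageSource=scanner"
)

params = parse_qs(urlsplit(request_uri).query)

# Only the API's own parameters remain at the top level.
print(sorted(params))

# One round of decoding restores the "&" separators inside source, but it
# also decodes escapes that were already in the token (%3A, %3D).
source = params["source"][0]
print(source)
```

The SAS fields no longer leak into the outer query. Note that the decoded source value is not byte-for-byte the original SAS URL, because the token's own escapes are decoded as well; whether that matters depends on how the server goes on to use the value.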

Next step: convert XML body into JSON...

Update 1

I have updated (17/08/2018) the replacement value from %2526 to %26. I was clearly getting my knickers in a twist. I must have been trying to double-escape, which isn't needed.

Although Microsoft seem to partially percent-encode the SAS token (I note they encode = as %3D), the Abbyy API doesn't seem to care about = (tested from Postman).

Update 2

Not sure why Update 1 worked. It could be because I had set the access policy to blob (anonymous read access for blobs only) and setting it back to private hadn't taken effect, or it might have been the wrong content-type. Anyway, it worked and then it didn't. The replace was turning & into %26 without issue, and I tried escaping the = too, but the endpoint didn't like it, whether tested via Postman or the Logic App.

My trigger is actually a C# BlobTrigger* Azure Function App, and I have written code to generate the SAS token, so I used dotnet's Uri.EscapeDataString() method:

string fullUrl = cloudBlobContainer.Uri + "/" + blobName + sasToken;

string escapedUri = Uri.EscapeDataString(fullUrl);

log.LogInformation($"Full URL: {fullUrl}");
log.LogInformation($"Escaped URL: {escapedUri}");

return escapedUri;
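For comparison, here is a minimal Python sketch of the same idea, using urllib.parse.quote with safe="" as a stand-in for Uri.EscapeDataString (for ASCII input both should leave only unreserved characters unescaped, as far as I know). The SAS URL is the one from Update 1, and the shortened parameter list is just for illustration.

```python
from urllib.parse import quote, urlsplit, parse_qs

# Stand-in for fullUrl in the C# snippet above.
full_url = (
    "https://mysaaccount.blob.core.windows.net/intray/180730110047_0001.pdf"
    "?st=2018-08-17T10%3A55%3A38Z&se=2018-08-18T10%3A55%3A38Z&sp=r"
    "&sv=2018-03-28&sr=b&sig=FTRoVgV7MRz5d5gTgrEs6D0QSy3268BqscZX1LHbJYQ%3D"
)

# Escape every reserved character, including ":", "/", "?", "&", "=" and "%".
escaped_uri = quote(full_url, safe="")

request_uri = (
    "https://cloud.ocrsdk.com/processRemoteImage"
    f"?source={escaped_uri}&language=English&exportformat=pdfSearchable"
)

# A parser that decodes once gets back exactly the original SAS URL,
# because the token's existing escapes were themselves escaped (%3D -> %253D).
recovered = parse_qs(urlsplit(request_uri).query)["source"][0]
print(recovered == full_url)
```

Unlike replacing only the ampersands, escaping the whole value means the token's own %3A and %3D sequences survive a single round of decoding on the server.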

Update 3

Just seen this line, created by the Logic App designer:

"path": "/datasets/@{encodeURIComponent(encodeURIComponent('https://someaccount.sharepoint.com/sites/eps'))}/tables//items//attachments",

Looks like encodeURIComponent() might be a function that does the same thing as Uri.EscapeDataString, and it could take the place of the replace() function. I haven't tested it yet.
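The designer-generated line nests the call, so the value is encoded twice. The Python sketch below mimics that with urllib.parse.quote; it is only an approximation of encodeURIComponent(), which I believe leaves a few extra characters such as ! * ' ( ) unescaped, none of which appear in this URL.

```python
from urllib.parse import quote, unquote

site = "https://someaccount.sharepoint.com/sites/eps"

# First pass: ":" and "/" become %3A and %2F.
once = quote(site, safe="")

# Second pass: the "%" signs from the first pass become %25.
twice = quote(once, safe="")

print(once)
print(twice)

# It takes two rounds of decoding to get the original value back.
print(unquote(unquote(twice)) == site)
```

This is the same double-escaping that turned %26 into %2526 in my first attempt; it only makes sense when the receiving side decodes twice.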

*The reason I am using a Function App for the trigger is cost. Although there is a Logic App trigger that runs when a new blob is detected, it has to run on a schedule. My plan is Consumption, and I have read that MS charge whenever a Logic App runs, regardless of whether it does anything or not. IMHO, it is inefficient to have a task triggering every 5 seconds when 90% of the time there isn't anything for it to do. The Function App is better suited to my requirements, although there is apparently a ~10 minute "warm-up" period if the app has gone to sleep. I can live with that.