
WHAT I am trying to do:

I am trying to retrieve the sign-in logs for the last 24 hours for all users and save them to blob storage. After the first result set creates the blob, each subsequent result set should update the blob file with the remaining results.

I settled on blob storage and MS Graph because the Graph output contains all the details I want without having to jump through various hoops in PowerShell to expand certain properties, and because the result size is huge (over 1 GB via Export-CSV in PowerShell).

HOW I'm trying to do it

A scheduled run makes an HTTP request with the Graph query filtered to the last 24 hours, then creates a blob with the HTTP body as content. After creation of the blob, I added a (Do) Until control that runs until the HTTP body no longer contains @odata.nextLink, updating the blob file on each iteration.

ISSUES:

  1. The Until loop finishes in 6 seconds.
  2. The blob file only contains the first result set and is usually 9.3 MB in size, which means the next result sets are never fetched and appended to the existing blob file.

My research

I tried with pagination enabled and disabled, various pagination thresholds, and custom functions, but found nothing that would make sense (to me at least), and I'm trying to follow the KISS principle.

I looked over and tried to apply, in one shape or form, the answers from the following S.O. questions: Graph Pagination in Logic Apps | Pagination with oauth azure data factory | Microsoft graph, batch request's nextLink | https://docs.microsoft.com/en-us/graph/paging
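For context, the paging pattern those links describe boils down to following @odata.nextLink until it is absent. A minimal Python sketch of that control flow (the two-page response here is faked, since the real call needs an authenticated HTTP GET):

```python
# Minimal sketch of Microsoft Graph pagination: each response page has a
# "value" array and, while more pages remain, an "@odata.nextLink" URL.
# The page contents below are made up for illustration only.

def collect_pages(fetch, first_url):
    """Follow @odata.nextLink until it disappears, accumulating 'value' items."""
    items = []
    url = first_url
    while url:
        page = fetch(url)                  # would be an HTTP GET in practice
        items.extend(page.get("value", []))
        url = page.get("@odata.nextLink")  # None once the last page is reached
    return items

# Fake two-page response keyed by URL, standing in for the Graph endpoint:
pages = {
    "https://graph.microsoft.com/beta/auditLogs/signIns": {
        "value": [{"id": 1}, {"id": 2}],
        "@odata.nextLink": "https://graph.microsoft.com/beta/auditLogs/signIns?$skiptoken=abc",
    },
    "https://graph.microsoft.com/beta/auditLogs/signIns?$skiptoken=abc": {
        "value": [{"id": 3}],
    },
}

result = collect_pages(pages.get, "https://graph.microsoft.com/beta/auditLogs/signIns")
print(len(result))  # 3
```

This is the loop the Logic Apps paginationPolicy runs for you behind the scenes, which is why the raw @odata.nextLink may never be visible to actions downstream of the HTTP step.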

Code I am trying

{
"definition": {
    "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
    "actions": {
        "Create_blob": {
            "inputs": {
                "body": "@body('fRequest')",
                "host": {
                    "connection": {
                        "name": "@parameters('$connections')['azureblob']['connectionId']"
                    }
                },
                "method": "post",
                "path": "/datasets/default/files",
                "queries": {
                    "folderPath": "/graph",
                    "name": "DoUntil",
                    "queryParametersSingleEncoded": true
                }
            },
            "runAfter": {
                "fRequest": [
                    "Succeeded"
                ]
            },
            "runtimeConfiguration": {
                "contentTransfer": {
                    "transferMode": "Chunked"
                }
            },
            "type": "ApiConnection"
        },
        "Until": {
            "actions": {
                "Update_blob": {
                    "inputs": {
                        "body": "@body('fRequest')",
                        "host": {
                            "connection": {
                                "name": "@parameters('$connections')['azureblob']['connectionId']"
                            }
                        },
                        "method": "put",
                        "path": "/datasets/default/files/@{encodeURIComponent(encodeURIComponent('/graph/DoUntil'))}"
                    },
                    "runAfter": {},
                    "type": "ApiConnection"
                }
            },
            "expression": "@not(contains(body('fRequest'), '@odata.nextLink'))",
            "limit": {
                "count": 60,
                "timeout": "PT1H"
            },
            "runAfter": {
                "Create_blob": [
                    "Succeeded"
                ]
            },
            "type": "Until"
        },
        "fRequest": {
            "inputs": {
                "authentication": {
                    "audience": "https://graph.microsoft.com",
                    "clientId": "registered_app",
                    "secret": "app_secret",
                    "tenant": "tenant_id",
                    "type": "ActiveDirectoryOAuth"
                },
                "method": "GET",
                "uri": "https://graph.microsoft.com/beta/auditLogs/signIns?$filter=createdDateTime gt @{addDays(utcNow(),-1)}"
            },
            "runAfter": {},
            "runtimeConfiguration": {
                "paginationPolicy": {
                    "minimumItemCount": 500
                }
            },
            "type": "Http"
        }
    },
    "contentVersion": "1.0.0.0",
    "outputs": {},
    "parameters": {
        "$connections": {
            "defaultValue": {},
            "type": "Object"
        }
    },
    "triggers": {
        "Recurrence": {
            "recurrence": {
                "frequency": "Week",
                "interval": 7,
                "schedule": {
                    "hours": [
                        "7"
                    ],
                    "minutes": [
                        0
                    ]
                }
            },
            "type": "Recurrence"
        }
    }
},
"parameters": {
    "$connections": {
        "value": {
            "azureblob": {
                "connectionId": "/subscriptions/subscription_id/resourceGroups/Apps/providers/Microsoft.Web/connections/azureblob",
                "connectionName": "azureblob",
                "id": "/subscriptions/subscription_id/providers/Microsoft.Web/locations/eastus/managedApis/azureblob"
            }
        }
    }
}

}

What am I doing wrong or missing? Thanks in advance!


1 Answer


I managed to increase the pagination threshold to 20,000, and now my files are no longer 9 MB; they reach 200 MB in size. I also removed the Do Until loop. Now I only need to create a break to avoid the threshold and resume collecting the remaining pages of results.
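One hedged sketch of that "break and resume" idea: persist the last @odata.nextLink, then have a later run start from it instead of from the first page. The `store` below is any dict-like persistence (in a Logic App this could be a small control blob); all names here are illustrative, not real Logic Apps or Graph APIs:

```python
# Sketch of resumable paging under a per-run page budget.
# store: dict-like persistence for the resume URL (illustrative only).

def collect_with_budget(fetch, start_url, store, max_pages):
    """Fetch up to max_pages pages, saving the resume point in store."""
    url = store.get("resume") or start_url  # resume if a previous run stopped early
    items = []
    for _ in range(max_pages):
        page = fetch(url)
        items.extend(page.get("value", []))
        url = page.get("@odata.nextLink")
        if not url:
            break                           # last page reached within budget
    store["resume"] = url                   # None means the collection is complete
    return items
```

A first run with a budget of two pages would return the first two pages' items and leave the third page's URL in `store["resume"]`; a second run would pick up there and finish, leaving `store["resume"]` set to None.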