
I am writing a query to evaluate a table's schema across multiple workspaces to ensure we aren't encroaching on the 500-column table limit. Once we get close, it would trigger an Azure Alert. Since we can't directly control the schema of the default AzureDiagnostics table, this is an approach to be alerted when we might be crossing the maximum supported threshold.

The query returns results; however, while testing I noticed the query would occasionally fail when Log Analytics was updating the schema of the table. This occurred after I hooked up a different resource type to a workspace, thus feeding it potentially new schema information.

It would throw an error saying it couldn't summarize off of 'columnName'. The 'columnName' value would change over time, leading me to believe it was having issues reading the schema while an update was occurring. After a bit of time the query consistently returned results.

Here is a sample of the query:

workspace("workspace1").AzureDiagnostics
| getschema
| summarize count(ColumnName)
| project TotalColumns = count_ColumnName, Workspace = "workspace1"
| where TotalColumns > 400
| union (
    workspace("workspace2").AzureDiagnostics
    | getschema
    | summarize count(ColumnName)
    | project TotalColumns = count_ColumnName, Workspace = "workspace2"
    | where TotalColumns > 400
)

My question is: is there a way to write a query similar to SQL with NOLOCK, so that it will still return the schema even if it isn't committed? (I've changed the where clause so I can inspect the records that come back.)

1 Answer


If I correctly understand your issue with the query, you need to do two things:

  1. Add IsFuzzy=true to the union - this avoids failures when one of the unioned expressions returns no result.
  2. Add the workspace name via extend rather than project. You can rename the column after the union. Alternatively, you can use extend to create the missing column.
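Applied to the query from the question, the two changes above might look something like this (a sketch only; the workspace names are the placeholders from the question's example):

```kusto
union isfuzzy=true
    (workspace("workspace1").AzureDiagnostics
    | getschema
    | summarize TotalColumns = count()
    | extend Workspace = "workspace1"),
    (workspace("workspace2").AzureDiagnostics
    | getschema
    | summarize TotalColumns = count()
    | extend Workspace = "workspace2")
| where TotalColumns > 400
```

With isfuzzy=true, a leg that fails to resolve (for example, while the table schema is being updated) is skipped with a warning instead of failing the whole query, so the remaining workspaces still return their column counts.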

Regarding the problem you are trying to solve, we are actively working in Azure Log Analytics to avoid it altogether. We are now working to provide dedicated tables for most Azure resources so they don't overpopulate the AzureDiagnostics table. Some Azure resources, like Azure Data Factory, have options to control whether they use the dedicated tables or AzureDiagnostics. See #4 here: https://docs.microsoft.com/en-us/azure/data-factory/monitor-using-azure-monitor#monitor-data-factory-metrics-with-azure-monitor