0
votes

I am running a .set-or-append command to ingest from 1 table into another. I know that the query from the source table is fine and the target table, if it exists, should have the same query but if not the command should just create it. Initially I was not having a problem with this. But a few of my .set-or-append queries have been getting this error:

Invalid query for distributed set / append / replace operation. Error: Query schema does not match table schema. QuerySchema=('long'), TableSchema=('datetime,string,string,string,string,dynamic,dynamic,dynamic'). Query:...'

I know for a fact that the schemas match. I ran the same command again and again and on about the 3rd try the call succeeded. Which makes 0 sense to me. So what is this above error and why did the same command work after failing with no change to the query whatsoever?

The query/command I am running is essentially the following:

.set-or-append async TargetTable <|
SourceTable
| where __id in ("...", "....", ........) // is aproximately 250 distinct ids in "in" operator
2
I would recommend you include the query in your question, to give a better understanding of what you're running. You could also use getschema to validate both the query schema and the target table scheme actually match. Note that if the mismatch is inconsistent, it may suggest you're using a query operator or plugin whose output schema depends on the input data, and can vary between executions (e.g. pivot or bag_unpack) - Yoni L.
updated my question with the query. no it is nothing like that, just a standard IN operator. I verified the schemas are the same. - Jeremy Fisher
are you sure you didn't include some aggregation at the end of the query, like | summarize count()? if you are, please include the client request ID for your failed attempt - Yoni L.
I looked at the .show commands Text column to view my query and I did not see any summarization. The client activity id is KNC.execute;adabeccc-2435-4680-a8be-8f3ac6b1011c. I got around this issue for now but this seems like legitimate bug. My query was very large (it is programatically constructed and includes a large list in the IN operator) so maybe the Text column truncated the query and thus that's why I don't see the summarization? - Jeremy Fisher

2 Answers

1
votes

It appears that the query you're using is extending data with extra column: extend hashBucket = hash(row_number(), ...) | where hashBucket == ... - and thus you're getting schema mismatch.

Perhaps, your intention was to filter based on the hashBucket, and in this case you can just use filtering without extension: where hash(row_number(), ...) == ...

0
votes

Yoni could you explain what you mean by bag_unpack giving an inconsistent mismatch? i aligned my bag_unpack ... project-reorder to match the target table i'm unpacking to, but it just changes around a few variables types in the error message:

    Query schema does not match table schema. 
    QuerySchema=(
    'datetime,long,datetime,string,string,datetime,string,
    long,real,string,bool,guid,guid,string,real'), 
    TableSchema=(
    'datetime,long,datetime,string,string,datetime,string,
    long,real,guid,guid,string,bool,string,real')

really confused what the table schema and query schema even are at this point.


For reference my query is like this:

    .set-or-append async apiV2FormationSearchTransform <|
    //set notruncation;
    apiV2FormationSearchLatest
    | where hash(toguid(fullRecord["id"]), 1) == 0
    | project fullRecord
    | evaluate bag_unpack(fullRecord)
    | extend dateCatalogued = todatetime(column_ifexists("dateCatalogued", ""))
    , simpleId = tolong(column_ifexists("simpleId", ""))
    , dateLastModified = todatetime(column_ifexists("dateLastModified", ""))
    , reportedFormationName = tostring(column_ifexists("reportedFormationName", ""))
    , comments = tostring(column_ifexists("comments", ""))
    , dateCreated = todatetime(column_ifexists("dateCreated", ""))
    , formationName = tostring(column_ifexists("formationName", ""))
    , internalId = tolong(column_ifexists("internalId", ""))
    , topDepth = toreal(column_ifexists("topDepth", ""))
    , wellId = column_ifexists("wellId", toguid(""))
    , id = column_ifexists("id", toguid(""))
    , methodObtained = tostring(column_ifexists("methodObtained", ""))
    , isTarget = tobool(column_ifexists("isTarget", ""))
    , completionId = tostring(column_ifexists("completionId", ""))
    , baseDepth = toreal(column_ifexists("baseDepth", ""))

    | project-reorder dateCatalogued
    , simpleId
    , dateLastModified
    , reportedFormationName
    , comments
    , dateCreated
    , formationName
    , internalId
    , topDepth
    , wellId
    , id
    , methodObtained
    , isTarget
    , completionId
    , baseDepth

and this is the getschema output of my target table:

    dateCatalogued          0   System.DateTime datetime
    simpleId                1   System.Int64    long
    dateLastModified        2   System.DateTime datetime
    reportedFormationName   3   System.String   string
    comments                4   System.String   string
    dateCreated             5   System.DateTime datetime
    formationName           6   System.String   string
    internalId              7   System.Int64    long
    topDepth                8   System.Double   real
    wellId                  9   System.Guid guid
    id                      10  System.Guid guid
    methodObtained          11  System.String   string
    isTarget                12  System.SByte    bool
    completionId            13  System.String   string
    baseDepth               14  System.Double   real