1
votes

I am new to Azure and I have this field in my table in Azure SQL Data Warehouse:

[AnnotationText] varchar(MAX) NULL,

Based on what I read from https://docs.microsoft.com/en-us/sql/t-sql/data-types/char-and-varchar-transact-sql?view=sql-server-2017:

  • varchar [ ( n | max ) ] Variable-size string data. Use n to define the string size in bytes and can be a value from 1 through 8,000 or use max to indicate a column constraint size up to a maximum storage of 2^31-1 bytes (2 GB).

But what I am seeing is the size of 8000 with this type of error message:

Too long string in column [-1]: Actual len = [11054]. MaxLEN=[8000]

I tried to hardcode the size but any number bigger than 8000 isn't working as written in the document.

Also I found some docs saying that HEAP or CLUSTERED INDEX type should be used in the table creation but it didn't help either

Any idea what went wrong?

2
Can you please show the query that gives an error? - Ron Dunn
I used PolyBase via Data Factory so it is not easy to show the query. I used Copy Data activity to copy from a SQL database to a SQL data warehouse. - kee
ADF documents that (max) data types are not supported. docs.microsoft.com/en-us/azure/data-factory/… - Ron Dunn
@RonDunn Please post your comment as an answer. Thanks! - Gaurav Mantri
To be precise, Polybase doesn't support max type. Thanks again @RonDunn and please post it as an answer - kee

2 Answers

1
votes

ADF documents that (max) data types are not supported for Polybase.

One commonly used technique is to split the file in ADF, using Polybase to bulk load the LOB data, then an alternative technique to add the LOB data later. Another technique - faster - is to split the column using ADF, and reassemble it using a view layered over the external table at the point of ingestion.

A better approach may be to question why LOB data is required in the data warehouse. Is there an alternative approach that could be used, depending on the type of the LOB? For example, if the LOB represents a document, could it be externalised to a blob store with a link in the DW table?

1
votes

I'm using Databricks to insert data into Azure DW (Synapse) using Polybase. I have a nvarchar(500) column in my dataset and it failed having the same problem. I found that in my case the breaking point is somewhere between 255 and 300 characters. I ended up just substringing that column to 250 characters, because the content was not important for reporting purposes.