2
votes

I have a file in azure data lake store. I am using polybase to move data from data lake store to data warehouse. I followed all the steps which are mentioned here.

Let's say I have created external table as External_Emp which has 3 columns : ID, Name, Dept.

When I am running following query:

select * from External_Emp;

Then, it is showing me all the records which is right. But when i am selecting a specific column/columns then it shows the column name in a row. For Ex., if i run following query:

select Name from External_Emp;

Then output is :

Name
-----
Name
1
2
3

whereas, the output should be:

Name
------
1
2
3

Similarly , when i run query:

select ID, Name from External_Emp;

Then it shows following output:

ID | Name
---------
ID | Name
1  | abc
2  | xyz
3  | pqr

whereas, the output should be :

ID | Name
--------
1  | abc
2  | pqr
3  | xyz

why is it showing the column names in the separate row? Is that a bug?

I checked the data in csv file in azure data lake multiple times. It doesn't have repetitive column names. Thanks.

1

1 Answers

7
votes

Drop the External Table and the External File Format. Then recreate the External File Format with FIRST_ROW=2 which will skip one row as mentioned in the documentation:

CREATE EXTERNAL FILE FORMAT TextFileFormat
WITH
(   FORMAT_TYPE = DELIMITEDTEXT
,    FORMAT_OPTIONS    (   FIELD_TERMINATOR = '|'
                    ,    STRING_DELIMITER = ''
                    ,    DATE_FORMAT         = 'yyyy-MM-dd HH:mm:ss.fff'
                    ,    USE_TYPE_DEFAULT = FALSE
                    ,    FIRST_ROW = 2
                    )
);