U-SQL Error in Extracting

Question

We have written a U-SQL script to extract a .csv file so that all columns of each row are read as a single string. However, we cannot process all of the files because the job fails with the error message "VERTEX FAILED FAST". If we convert the file to the "CSV (MS-DOS)" format, the job runs successfully. Can someone please help identify the issue and tell us how to solve it? Any other way to extract all columns as a single row would also help. The script is attached below.

$scripts = @"
@rs =
    EXTRACT 
        line string,
        filename string 
    FROM "$filepath/$jobid/{filename}.csv"
    USING Extractors.Text(delimiter:'\n', skipFirstNRows: 1);
@j =
    SELECT *
    FROM @rs;
@rs1 =
    SELECT *
    FROM @j 
    WHERE $output;

@k=
    SELECT filename,COUNT() AS Count1
    FROM @j 
    WHERE $output
    GROUP BY filename;
OUTPUT @rs1 
    TO "$filepath/$jobid/logdata.txt"
    USING Outputters.Text(); 

OUTPUT @k
    TO "$filepath/$jobid/count.txt"
    USING Outputters.Text();

"@

Michael Rys · Accepted Answer · 2017-10-31T09:39:01

First, my apologies that the error message is currently not more visible. "Vertex failed fast" error messages contain more details that should tell you what actually caused the vertex to fail. Do you have that information? Without it, it is hard to answer the question without speculating.

Having said that, the root causes for "Vertex failed fast" often fit into one of the following categories:

1. Your row has a different number of columns than you expect. This is not likely to be the case here.
2. Your row contains data that cannot be cast into the column's specified data type. Again, unlikely in your case.
3. Your row/cell contains data that is too large for the data type. This could be the case for you, although the fact that changing the CSV file encoding makes it work suggests it could also be cause 4. If this is the case, you will have to find a way to either truncate the row or split it over several rows.
4. The encoding of the file is not UTF-8 (as the default setting assumes) but some other encoding, and it leads to an error (invalid encoding at best, or any of the first 3 causes). If this is the case, please specify the right encoding or change the encoding of the file.
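
For the last category (a file that is not UTF-8), one possible fix is to pass the `encoding` parameter to the built-in extractor. The sketch below is only an illustration: it assumes the files are UTF-16 (`System.Text.Encoding.Unicode`), which may not match your data, and the input and output paths are hypothetical placeholders. The other extractor arguments are copied unchanged from the question.

```
// Assumption: the source files are UTF-16 encoded. Replace
// System.Text.Encoding.Unicode with the encoding your files actually use.
@rs =
    EXTRACT line string,
            filename string
    FROM "/input/{filename}.csv"        // hypothetical input path
    USING Extractors.Text(delimiter: '\n',
                          skipFirstNRows: 1,
                          encoding: System.Text.Encoding.Unicode);

// Write the extracted lines back out so the script forms a complete job.
OUTPUT @rs
    TO "/output/logdata.txt"            // hypothetical output path
    USING Outputters.Text();
```

If the files use an encoding that the built-in extractors do not support, converting them to UTF-8 before upload is the alternative mentioned above.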

If this does not help you to resolve the issue, please forward me the job link at usql (at) microsoft dot com.
