0
votes

I'm very new to ssis.

I have this issue similar to a post: Flat File to retain commas from SQL data

But unfortunately I have not gotten any clearance yet.

Basically in SSIS package, I have

  1. OLEDB (source)
  2. Fuzzy lookup
  3. Conditional split to flat files (I'm using these flat files as source in another data flow)

While populating the records to the specified flat files, commas in address are being treated as delimiter hence all the values are populated wrongly.

Questions:

  1. I have gone through some recommendations on quoting the string so the commas within one field can be escaped. If this is a resolution, then on which ssis step should I quote my address string?

  2. How SSIS read vs How SSIS add delimiter (might cause brain damage) SSIS wants us to specify how it should read the output from my conditional split. But think about it, shouldn't it already know since from my steps: OLEDB to FUZZY to conditional split the columns are formatted into columns properly. Shouldn't it be asking how you want your values to be delimited instead? Just like if you have an excel and you want to convert them to flat files, it prompts you whether you wants it to be comma delimited or tab delimited. When you specify CSV, it simply adds comma between all your cell value (regardless if you have comma in your cell).

Please bear with my stupidity.

2
So you are writing comma delimited files? You probably just need to use a text qualifier. That is, all fields should be surround by, for example, double quotes. OR you can use a delimiter that will never have this problem, i.e. TAB delimited. TAB delimited is far more reliable.Nick.McDermaid
Thanks Nick, the source is in SQL table. After the fuzzy and conditional split I'm not sure what format it is in but I do want to write into files regardless of delimiters. When I specify (in my flat file configuration) the delimiter to be comma, the values will not be populated accordingly to the column headers, some runs off the rails because of the commas I have in the string. If I use almost any of other delimiters, I will get the entire row values populated into one column. I hope you can visualise this.Wei Bin Fong
If I need to add quotes around the string where and how should I do this? I definitely cannot change the source sql tables.Wei Bin Fong
Change the delimiter to tab. It just looks like everything is in one column because excel shows it that way (I assume you're opening in excel). Use text to columns in excel to split it into many columns.Nick.McDermaid

2 Answers

0
votes

While populating the records to the specified flat files, commas in address are being treated as delimiter hence all the values are populated wrongly.

For this reason you should avoid using comma as a delimiter and use something that does not normally appear in free text. The next most commonly use delimiter is TAB and you are far less likely to face this issue if you use it

If you really want to use comma then you can use a 'text qualifier' of quotes but that just means you have two characters that will cause problems: commas and quotes!

which ssis step should I quote my address string?

Don't do it - just use a tab delimiter (or something even less common like ~ or `. If you really want to do it, it should be in the output component somewhere

How SSIS read vs How SSIS add delimiter (might cause brain damage) SSIS wants us to specify how it should read the output from my conditional split. But think about it, shouldn't it already know since from my steps: OLEDB to FUZZY to conditional split the columns are formatted into columns properly. Shouldn't it be asking how you want your values to be delimited instead? Just like if you have an excel and you want to convert them to flat files, it prompts you whether you wants it to be comma delimited or tab delimited. When you specify CSV, it simply adds comma between all your cell value (regardless if you have comma in your cell).

I'm not familiar with those components but yes, internally you should never have to worry about delimiters - it already knows which are individual columns

The real misunderstanding here is double clicking a csv file and having it open in Excel. Excel guesses the delimiters. When you open your tab delimited csv file in excel it has guessed the delimiter as comma, opened it, and thinks everything is in one column

If you use "text to columns" in excel and tell it to use tab, it will display the file correctly. Not only that, it will remember next time.

In actual fact the file you have exported is a text file that can be opened in notepad. If you open your tab delimited file in notepad you'll see everything neatly lined up in columns. Of course notepad is not the best data analysis tool.

What purpose do you have for exporting these files? If the purpose is to upload data into other systems, then use a tab delimiter.

If the purpose is to have an end user open the file and inspect / analyze it, then exporting as excel might be a less confusing solution

Just be aware that excel format is a bad choice for uploading into other systems, as the excel driver has some very strange behaviour.

0
votes

Just to reorganize this entire thread:

My issue is when I load data from SQL to flat file and I have comma (,) in certain columns, my values goes off the rail and end up in a wrong column. I kept going back and forth setting the column delimiter and somehow it doesn't save when I reload the data flow.

The mistake is that I assumed that SQL loads data as CSV into flat files, but in fact it doesn't. It relies on the destination file you have created. Correct me if I'm wrong, creating a tab delimited (txt) in excel is different from normal text file created in notepad. My resolution is to create the tab delimited file in excel. SSIS can somehow recognize that and doesn't default loading as comma delimited.

One thing I never tried suggested by Nick is that you don't create any output file and just define it in SSIS and let it manage itself.

I've run through a lot of threads and many are facing this issue but has no answer. I hope this helps!