2 votes

I'm getting the following error on every flat file connector I set up. The flat file connectors are fed from ADO.NET sources in a Data Flow Task.

Error at Data Flow Task [Flat File Destination 11 [1230]]: The number of input columns for Flat File Destination 11.Inputs[Flat File Destination Input] cannot be zero.

Error at Data Flow Task [SSIS.Pipeline]: "Flat File Destination 11" failed validation and returned validation status "VS_ISBROKEN".

Error at Data Flow Task [SSIS.Pipeline]: One or more component failed validation.

Error at Data Flow Task: There were errors during task validation.

I have verified that there are columns in the input, as follows:

The flat file appears to be reading the columns correctly.

My data flow looks like this:

[screenshot of the data flow]

The metadata looks appropriate:

[screenshot of the column metadata]

Column mappings:

[screenshot of the column mappings]

Comments:

  • Does your source file have all those columns in the right format? The other problem might be that you are using SSIS and expecting it to work in anything resembling a sane fashion. – siride
  • Yes, the column mapping appears to be correct. Day/Month/Year and Count are all integers, and Market is a string. Using SSIS is not my choice. – Julia McGuigan
  • One last question before I give up: do you have your columns mapped in the flat file destinations? – siride
  • In your control flow, right-click on this data flow and disable it. Add a new data flow, copy one of these pairs out, and paste it into the new data flow. I'd suggest starting with "Section 1 Guardians by Market by Date". Run that; does it work? The fact that you have so many Flat File Destinations with Xs through them leads me to believe SSIS is just returning the wrong error to you. – billinkc
  • "File in use by another process" – do I understand your comments correctly that some of those Destinations are the same physical file? – billinkc

1 Answer

9 votes

Don't build data flows like this. They will take a while to validate (components get validated one after another), and you'll get a limited number of them running in parallel since they are all within the same data flow. You also might run into locking issues if all of those sources are hitting the same DB - see Too Many Sources in a Data Flow.

If you have confirmed that all of the Flat File Destinations have their input columns mapped and you're still getting this error, it could be that there are simply too many source/destination mappings for the SSIS data flow to validate properly. You might want to try one of the design alternatives below.

Manageability vs. Performance

When you're working with this many destinations, I'd recommend an approach that is manageable. If the metadata is the same for each of those Source -> Destination mappings, you can do this ETL with a single data flow.

  1. Create a DB table with "SourceQuery" and "OutputFileName" columns (a rough SQL sketch follows this list)
  2. Add a row to the table for each source/output mapping
  3. In the control flow, use an Execute SQL Task to select all rows from the table
  4. Set the task's ResultSet property to Full result set
  5. Store the result in an object variable (objResultSet)
  6. Use a Foreach Loop Container with an ADO Enumerator to read each row from objResultSet
  7. Map the columns in the result set to package variables
  8. Map the SourceQuery variable to the ADO.NET Source query using a property expression
  9. Map the OutputFileName variable to the Flat File Connection Manager's file path using a property expression
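
As a rough sketch of steps 1 through 3, the metadata table and the Execute SQL Task query might look like the following. The table, column, and file names here are placeholders, not taken from your package:

    -- Step 1: metadata table that drives the loop (names are hypothetical)
    CREATE TABLE dbo.ExportConfig
    (
        SourceQuery    NVARCHAR(MAX) NOT NULL, -- query for the ADO.NET Source
        OutputFileName NVARCHAR(260) NOT NULL  -- target path for the flat file
    );

    -- Step 2: one row per source/output mapping
    INSERT INTO dbo.ExportConfig (SourceQuery, OutputFileName)
    VALUES (N'SELECT Market, [Year], [Month], [Day], [Count] FROM dbo.SectionOneGuardians',
            N'C:\Exports\Section1GuardiansByMarketByDate.txt');

    -- Step 3: the Execute SQL Task runs this, with ResultSet = Full result set
    SELECT SourceQuery, OutputFileName FROM dbo.ExportConfig;

For steps 8 and 9, the mapping is done with property expressions: on the Data Flow Task, set the [ADO NET Source].[SqlCommand] expression to @[User::SourceQuery], and on the Flat File Connection Manager, set the ConnectionString expression to @[User::OutputFileName]. You'll typically also need DelayValidation = True on the data flow and the connection manager so they validate at run time, after the variables have been populated.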

If you feel performance is more important than manageability, and you want to add more parallelism to your scenario, there are a few different things you can consider. The approach you take will depend on how different those source queries are, and whether you want to do calculations at the DB level or within the SSIS data flow. There are a number of ways you can do this - here are some things to keep in mind.

  • Using multiple Data Flow Tasks allows for more parallelism, and typically performs better than having multiple sources in a single data flow. The only time you should use multiple sources in a data flow is when you are merging/joining rows (and the join can't be done in the source query)
  • If all the data you need can fit into a single source query, use a single source component and Conditional Split transforms to send rows to the appropriate destinations
  • If you need the same rows down multiple paths, use Multicast transforms
  • You can calculate sums/counts using the Aggregate Transform, but it can be faster to push this to the source query instead (see the sketch below)
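
On that last point: using the columns mentioned in the comments (Market, Day/Month/Year, Count), a source query that pushes the aggregation down to the DB might look like this sketch (the table name is hypothetical):

    -- Hypothetical source query: let the DB compute the counts
    -- instead of an SSIS Aggregate transform
    SELECT Market, [Year], [Month], [Day], COUNT(*) AS [Count]
    FROM dbo.Guardians
    GROUP BY Market, [Year], [Month], [Day];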