1
votes

I receive a csv file periodically from a database download and I cannot change the way this csv file is produced. I need to convert it to an Excel file to pass it on to an external partner. There are two fields causing trouble in the csv file when I import into Excel: a permit number which is actually a character string, and a location description which can contain commas and other punctuation.

If I right click on the csv file and select Open With Excel, the location description is imported correctly but some permit numbers are converted to scientific notation.

Alternatively, if I use the text import wizard in Excel, I can import the permit numbers as text, but the location description ends up being split across multiple cells even when I set the text qualifier to ".

Here is a simplified example of the data as it appears in the csv file:

id, "Permit Number", Species, Longitude, "Location Details"
1, 2F66-16, DE, -120.0001, "near a road, in woods (FR16)"
2, 678E-15, DE, -120.0002, "near milepost 65, north side of road"

Is there a solution that will allow me to import both fields correctly? This question is very similar but has no answers: excel text importing: wizard vs opening a file within Explorer Thanks for any help.

Update: I think the Location Description is being split into multiple cells because of line breaks in the csv. I'm not sure how these are being generated or how to remove them or deal with them, but when I open the file in Excel, the line breaks are still there (although the entire location description is in one cell). Unfortunately I can't post the full example data because it is sensitive data not easily converted to dummy data. Any thoughts on how to remove line breaks from a csv file?

3
What version of Excel? How are you "importing"? - Ron Rosenfeld
I'm using Excel 2010 and importing by either right-clicking on the csv and selecting Open with Excel or by using the text import wizard in Excel. - LM6
By using the text import wizard in Excel, I assume you are setting up a connection from the Data tab. Problem with this file is that the separator is comma + space. Excel does not handle two character separators well. But so long as all of the fields with space-separated strings are enclosed in quotes, you can use space for the separator; and then strip off the commas in post-processing. That will help with dealing with empty fields. For your other problems, we'd need to see some data that can reproduce the problem. What you have supplied does not. - Ron Rosenfeld
Given the update to OP that says the problem may be due to line breaks in the csv, the only option I have found to work in such cases is to use PowerQuery. That allows you to use the csv as a data source, define field types to ensure that Permit Number is not mis-interpretted as a number, and also respects multi-line text feilds. - MattClarke
BTW, if you are not familiar with PowerQuery, it is a Microsoft extension to Excel (free with the current versions of Office) and will allow you to do what I described completely within the Excel user interface. - MattClarke

3 Answers

1
votes

For whatever reason, Excel doesn't seem to like spaces between the comma and the quote. If you do a global find and replace on your input data (change comma-space-quote to comma-quote) -- and then parse your data, it will honor the text qualifiers as you need.

id, "Permit Number", Species, Longitude, "Location Details" 1, 2F66-16, DE, -120.0001, "near a road, in woods (FR16)" 2, 678E-15, DE, -120.0002, "near milepost 65, north side of road"

0
votes

You are on the right track using the text import wizard within Excel, that at least gives you more control.

I don't know why the text qualifier isn't working, that's exactly what it is designed to do.

Are you specifying that final column as TEXT instead of leaving it as GENERAL?

Do the sample items in the wizard show correctly?

Are you positive they are " characters, and something similar (like smart quotes)?

It's a pain, but if nothing else works, you can always combine the columns back together at the end:

  1. Insert a column before your broken up details
  2. Add a formula like =E2 & "," & E3 (or longer if it splits into more than two)
  3. Copy the combined column, then paste values (over the same cells)
  4. Delete the now extraneous columns
0
votes

Was able to do it with the text importer:

enter image description here

enter image description here

enter image description here