3
votes

I'm trying to dynamically generate a CSV file in which some cells contain multiple lines. The address field, for example, needs to be grouped into a single "address" cell instead of separate address, city, state etc. columns. All is going well, but for the last two days I've tried inserting \r, \r\n, \n, chr(10), chr(13), as well as a literal carriage return in the code, to create the line break I'm looking for within the cell. All of these fail: they are either printed literally in my CSV as "\r" etc., or, when I type a manual carriage return in the code, a new row is generated. I'm using this to create the breaks in my cells, but it isn't working:

$groupedCell = implode('\r',$data);

I'm pretty sure the code is correct, as it's placing \r where I would like a carriage return, but not the actual return I'm looking for. I've tried some different encodings but still no luck. I am testing in Open Office, which I guess could be the issue, but I would assume it can handle carriage returns within a cell and I haven't seen any documentation to suggest otherwise. Thanks for reading!

3
Collin, have you made any progress on this? Was the issue PHP based (you can't use escape sequences in single quotes in PHP) or CSV spec based (various ways to escape in-cell newlines)? - Rudu
I ended up not breaking the line. I tried all the solutions here and nothing worked. I tried single and double quotes with no luck, although it is good to know that you need to use double quotes for escaping. Perhaps CSVs cannot have this functionality in a universally supported manner? - Collin White
The thing is, it's two different things: double quotes to use escape sequences in PHP, but only sometimes double quotes for escaping in CSV cells. - Rudu
Indeed, I think it has to do with how the CSV is being handled by the software, not how PHP is generating it. - Collin White

3 Answers

8
votes

You need to use "\r". You can't use escape sequences (aside from \' and \\) in single-quoted strings. '\n' and '\r' are a literal backslash followed by an n or r, while "\n" and "\r" are a newline and a carriage return respectively.
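To make the quoting difference concrete, here is a minimal sketch. The $data values are made up for illustration, and "\n" is used as the separator only as an example; which line break your spreadsheet program wants is a separate question.

```php
<?php
// Hypothetical address parts; the values are made up for illustration.
$data = ['123 Main St', 'Springfield', 'IL 62701'];

// Single quotes: \r stays as two characters, a backslash and the letter r.
$literal = implode('\r', $data);

// Double quotes: "\n" is one real line-feed character (byte 10).
$groupedCell = implode("\n", $data);

var_dump(strlen('\r'));                       // int(2)
var_dump(strlen("\r"));                       // int(1)
var_dump(substr_count($groupedCell, "\n"));   // int(2)
```

The first form is what ends up printed as a visible \r in the CSV.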

As for inserting new lines in your CSV file, it's up to your implementation. There is no single enforced standard for CSV (RFC 4180 documents a common format, but many programs deviate from it), so you'll have to figure out what format to use based on the system you're supplying the CSV data to. Some might accept a '\n' sequence and interpret it as a new line, others might allow a literal newline provided the cell is enclosed in quotes, and still others will not accept new lines at all.

18
votes

The CSV spec is one I find implemented in many different ways... it basically seems like it's only half-specified, which is frustrating given its popularity.

To include a newline within a cell in a CSV, the cell may need to be wrapped, or the newline may need to be escaped. You'll notice from the linked doc that there are three ways to do this, and different programmes treat it differently:

  1. Excel wraps the whole cell in double quotes: a cell can have (unescaped) newline characters within it and be considered a single cell, as long as it's wrapped in double quotes (note that you'll also need to use Excel-style double quote escaping within the cell contents, i.e. an embedded " is written as "")
  2. Other programmes insert a single backslash before the character, therefore a line ending in \ is not considered the end of a line, but a newline character within the cell. A cell can have unescaped newline characters within as long as they're preceded by the backslash character.
  3. Others still replace a newline with C-style character escaping, the actual character sequence \n or \r\n. In this case the cell has fully escaped newline characters.

The problem is compounded by the potential need to escape the control characters as well as other content (e.g. " in #1, and \ in #2 and #3), and by different styles of escaping (e.g. an embedded quote could be escaped as a doubled double quote "" or as backslash-double quote \").

My advice: create an OpenOffice document with multi-line cells and the key escape characters, export it, and see how OpenOffice generates the CSV file. From there you can decide which of the above methods to use for newlines within cells, and which escaping method.

example of style-1 (excel):

#num,str,num
1,"Hello
World",1990
2,"Yes",1991
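If you want PHP to produce style-1 for you, a rough sketch along these lines may help. It relies on the built-in fputcsv(), which adds the enclosing double quotes itself when a field needs them; the rows and the php://memory stream are just assumptions for the demo, and whether your target program reads the result correctly still has to be tested.

```php
<?php
// Hypothetical rows; the second column of row 1 holds an embedded line feed.
$rows = [
    ['#num', 'str', 'num'],
    [1, "Hello\nWorld", 1990],
    [2, 'Yes', 1991],
];

// Write to an in-memory stream so the result can be inspected as a string.
$handle = fopen('php://memory', 'w+');
foreach ($rows as $row) {
    // Separator, enclosure and escape character are passed explicitly.
    fputcsv($handle, $row, ',', '"', '\\');
}
rewind($handle);
$csv = stream_get_contents($handle);
fclose($handle);

echo $csv;
```

Note that fputcsv() only quotes fields that need it, so "Yes" comes out unquoted here, unlike in the hand-written example above.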

example of style-2:

#num,str,num
1,Hello \
World,1990
2,Yes,1991

example of style-3:

#num,str,num
1,Hello \nWorld,1990
2,Yes,1991
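
For style-3 there is no built-in that I know of, so you would escape the cell yourself before writing it. A minimal sketch, assuming the consumer treats backslash + n as a line break and a doubled backslash as a literal backslash (escapeNewlines is a made-up helper name, not a PHP function):

```php
<?php
// Hypothetical helper: turns real line breaks into the two-character
// sequence backslash + n. Existing backslashes are doubled so that a
// reader following the same convention can reverse the escaping.
function escapeNewlines(string $cell): string
{
    // strtr() with an array tries the longest key first and never
    // rescans text it has already replaced.
    return strtr($cell, [
        '\\'   => '\\\\',
        "\r\n" => '\n',
        "\r"   => '\n',
        "\n"   => '\n',
    ]);
}

echo escapeNewlines("Hello \nWorld"), PHP_EOL; // Hello \nWorld
```

Commas and quotes inside the cell would still need whatever treatment the receiving program expects; this only deals with the line breaks.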
1
votes
  1. Created an Excel 2010 worksheet with 3 columns.
  2. Added a heading row with literal values: one, two, three
  3. Added 1 data row with literal values: abc, abc, abc except that within the 2nd column I pressed ALT+ENTER after each letter to create a carriage return and line feed.
  4. Did SAVE AS > OTHER and chose CSV while ignoring the warnings.
  5. Examined the CSV data using NOTEPAD++ and clicked the Show All Characters button in toolbar.

One can see the following:

one, two, three[CR][LF]
abc,"a[LF]
b[LF]
c",abc[CR][LF]
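
As a quick sanity check, the same bytes can be fed back through PHP's fgetcsv() to see whether the quoted line feeds stay inside one cell. This is only a sketch: the string below is typed in by hand from the dump above, with [CR] and [LF] written as "\r" and "\n".

```php
<?php
// Bytes copied from the dump above; [CR] and [LF] written as escapes.
$csv = "one, two, three\r\nabc,\"a\nb\nc\",abc\r\n";

// Put the data into an in-memory stream so fgetcsv() can read it.
$handle = fopen('php://memory', 'w+');
fwrite($handle, $csv);
rewind($handle);

$rows = [];
// Length 0 means no line-length limit; delimiter, enclosure and
// escape character are passed explicitly.
while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    $rows[] = $row;
}
fclose($handle);

var_dump(count($rows));   // number of records read
var_dump($rows[1][1]);    // the multi-line cell
```

If the quoting works as expected, this reports two records, with the middle cell of the second record holding a, b and c separated by line feeds.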

Hope this lends more clarity.