How to remove empty cells from csv while parsing csv using PapaParse?

Question

Or, to put the question another way: why is PapaParse's ParseResult.data an empty array when I trim all leading and trailing empty cells in the step callback passed to Papa.parse? EDIT: I can achieve what I want by mapping over the parsed results and trimming, but I would rather not parse and then map; I'd like to do it all in one go.

Example CSV:

Col 1,Col 2,Col 3
1-1,1-2,
,2-2,2-3
3-1,3-2,3-3

Note that row 1 contains headers (Col 1, Col 2, etc). Row 2 col 3 is empty, and row 3 col 1 is empty.

Given that CSV, I want to present this back to the user (as a nicely-formatted table):

|     |     |     |
|-----|-----|-----|
| 1-1 | 1-2 |     |
| 2-2 | 2-3 |     |
| 3-1 | 3-2 | 3-3 |

I want to push all rows as far to the left as they can go, and remove all empty cells from the end of each row.

In other words, I want to trim all empty cells from both the beginning and the end of each row. Below is the code I'm using. I have put debuggers inside of trimEmptyCells and it is doing exactly as expected. However, the ParseResult that parseAndTrim returns contains an empty data array.

export const parseAndTrim = (csv: string): Papa.ParseResult => {
    return Papa.parse(csv, {
        skipEmptyLines: true,
        step: trimEmptyCells,
    })
};

const trimEmptyCells = (results: Papa.ParseResult) => {
    // Note that `_.dropWhile` and `_.dropRightWhile` are [lodash
    // functions](https://lodash.com/docs/4.17.15#dropRight).
    const leftTrimmed = _.dropWhile(results.data, (r) => r === "");
    return _.dropRightWhile(leftTrimmed, (r) => r === "");
};

My first guess was that PapaParse was raising errors on rows of different lengths, but the errors array is also empty. So I tested what I could (without a step function) at https://www.papaparse.com/demo using the example below. Rows with missing cells (not merely empty ones) produce no errors and return a proper data array.

Example test input at https://www.papaparse.com/demo

Col 1,Col 2,Col 3
1-1,1-2
,2-2,2-3

Jason Fry · Accepted Answer · 2020-03-14T19:52:53

Based on this comment from pokoli (the #2 contributor to PapaParse and the #1 contributor since early 2017), I believe this is impossible. pokoli's proposed solution is

You should use Papa.parse to read records as array, filter them and then use Papa.Unparse to write the second file.

I wish I could mutate the data while parsing, since that would be faster, but PapaParse is already very fast. I was able to parse a 36,000-line CSV in under 300ms, and unparsing took twice as long. Parsing a 2,000-line CSV took under 30ms, and unparsing again took twice as long. My use case will involve CSVs under 2,000 lines 99% of the time, so parsing into a 2D array, filtering, unparsing back into CSV, and then parsing again into JSON won't take too long.
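
For anyone who wants to see the filtering step spelled out, here is a minimal sketch of the trimming applied after parsing. It assumes the rows are already available as a 2D array of strings, which is what Papa.parse returns in its data array when no step callback and no header option are used. The names Row, trimEmptyCells and trimAllRows are my own for illustration and are not part of PapaParse, and the sample rows are hard-coded from the question's CSV so the snippet runs without any dependencies.

```typescript
// Hypothetical helper names for illustration; none of these come from PapaParse.
type Row = string[];

// Drop empty cells from both ends of a row. Empty cells in the middle are kept.
function trimEmptyCells(row: Row): Row {
  let start = 0;
  let end = row.length;
  while (start < end && row[start] === "") start++;
  while (end > start && row[end - 1] === "") end--;
  return row.slice(start, end);
}

// Post-process rows that have already been parsed into a 2D array.
function trimAllRows(rows: Row[]): Row[] {
  return rows.map(trimEmptyCells);
}

// Rows as they would look after parsing the example CSV from the question
// (hard-coded here instead of calling the parser).
const parsed: Row[] = [
  ["Col 1", "Col 2", "Col 3"],
  ["1-1", "1-2", ""],
  ["", "2-2", "2-3"],
  ["3-1", "3-2", "3-3"],
];

const trimmed: Row[] = trimAllRows(parsed);
console.log(trimmed);
```

In a real project the parsed array would presumably come from Papa.parse(csv, { skipEmptyLines: true }).data, and the trimmed rows could then be handed to Papa.unparse if a CSV string is needed again.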
