1
votes

I have a spreadsheet where I manually copy and paste some data from pdf tables. I've been using array query split to split that info into different columns and it works flawless in 2 columns (date and amount) and for the other one it works most of the time (reference).

Example that works:

  • PDF DATA: 4500063794 21.07.2020 187.50
  • COLUMNS IN SPREADSHEET (final result after split): Reference:4500063794, Date:21.07.2020, Amount:187.50
  • Formula for retrieving Reference column: =ArrayFormula(QUERY(SPLIT(C3:C7 ;" ");"select Col1";0))
  • This works along the spreadsheet without any problems

Another example that works:

  • PDF DATA: 447/20.6TBOS 04.07.2020 804.00
  • COLUMNS IN SPREADSHEET (final result after split): Reference:447/20.6TBOS, Date:04.07.2020, Amount:804.00
  • Formula for retrieving Reference column: =ArrayFormula(QUERY(SPLIT(C3:C7 ;" ");"select Col1";0))
  • This works along the spreadsheet without any problems

Example that DOES NOT work:

  • If I paste several rows like 1st example and then add several rows as the 2nd example, when I paste afterwards more data like the one in 1st example, split stops retrieving the Reference column for pdf data similar to "447/20.6TBOS 04.07.2020 804.00" (2nd example). It retrieves blank cells for these.

Can anyone shine a light on this? Thanks in advance

Example Spreadsheet

1

1 Answers

2
votes

It will work if you change the query to:

=ArrayFormula(INDEX(SPLIT(REGEXREPLACE(C3:C7; "\s"; "♥");"♥");ROW(C3:C7)-ROW(C3);1))

The formula will replace the spaces by hearts (rare character) and then it will populate the rest.

To change the values of the rows, just change the last character 1 to 2 or 3: )-ROW(C3); ==> 1 ))

enter image description here

You can use the same formula to the G column (don't forget to update the ranges), as the delimiters of both of the 4500063794 21.07.2020 187.50 and the 447/20.6TBOS 04.07.2020 804.00 are the same (whitespaces).