I am trying to parse tab-delimited data that has been saved as a text file along with extraneous data. I would like to end up with an R data.table/data.frame.
The tab-delimited format is the following:
A 1092 - 1093 + 1X
B 1093 HRDCPMRFYT
A 1093 + 1094 - 1X
B 1094 BSZSDFJRVF
A 1094 + 1095 + 1X
B 1095 SSTFCLEPVV
...
There are only two types of rows, A and B. A rows consistently have 5 columns, e.g. for the first row:
1092 - 1093 + 1X
B rows consistently have 2 columns:
1093 HRDCPMRFYT
Question: How do you parse a file with "alternating" rows with different formats?
Let's say the text file contained only this format: alternating A and B rows, with 5 and 2 columns respectively. How do you parse this into an R data.table? My idea would be to create the following format, with each A row joined to the B row that follows it:
1092 - 1093 + 1X 1093 HRDCPMRFYT
1093 + 1094 - 1X 1094 BSZSDFJRVF
1094 + 1095 + 1X 1095 SSTFCLEPVV
...
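To make the target format concrete, here is a minimal sketch of that idea in base R. It assumes the rows strictly alternate A, B, A, B, that every row starts with its type marker followed by a tab, and that any extraneous lines start with something else. The column names (`from`, `from_strand`, `to`, `to_strand`, `tag`, `id`, `seq`) are made up for illustration and are not part of the original data.

```r
# Sample input mirroring the rows above; fields are tab-separated.
# For a real file, something like readLines("data.txt") would replace this
# (the file name is hypothetical).
lines <- c(
  "A\t1092\t-\t1093\t+\t1X",
  "B\t1093\tHRDCPMRFYT",
  "A\t1093\t+\t1094\t-\t1X",
  "B\t1094\tBSZSDFJRVF",
  "A\t1094\t+\t1095\t+\t1X",
  "B\t1095\tSSTFCLEPVV"
)

# Keep only A and B rows; anything else is treated as extraneous and dropped.
a_lines <- lines[startsWith(lines, "A\t")]
b_lines <- lines[startsWith(lines, "B\t")]

# Assumption: every A row is followed by exactly one B row.
stopifnot(length(a_lines) == length(b_lines))

# Parse each row type with its own layout. Explicit colClasses keep "+"/"-"
# and the sequence strings as plain character columns.
a <- read.table(
  text = a_lines, sep = "\t", header = FALSE, quote = "", comment.char = "",
  col.names  = c("type_a", "from", "from_strand", "to", "to_strand", "tag"),
  colClasses = c("character", "integer", "character",
                 "integer", "character", "character")
)
b <- read.table(
  text = b_lines, sep = "\t", header = FALSE, quote = "", comment.char = "",
  col.names  = c("type_b", "id", "seq"),
  colClasses = c("character", "integer", "character")
)

# Drop the type markers and pair the rows by position.
combined <- cbind(a[-1], b[-1])
print(combined)
```

If a data.table is wanted rather than a data.frame, `data.table::as.data.table(combined)` should convert the result. Pairing by position only works if the alternation really holds throughout the file, so the length check is there to catch the simplest way that assumption can fail.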