
Let's say I have two files, file1.txt and file2.txt.

file1.txt is the following:

TITLE   MEARA Repeatv2 Run2 
DATA TYPE       
ORIGIN  JASCO   
OWNER       
DATE    18/03/08    
TIME    22:07:45    
SPECTROMETER/DATA SYSTEM    JASCO Corp., J-715, Rev. 1.00   
RESOLUTION      
DELTAX  -0.1    
XUNITS  NANOMETERS  
YUNITS  CD[mdeg]    
    HT[V]   
FIRSTX  260 
LASTX   200 
NPOINTS 601 
FIRSTY  -4.70495    
MAXY    -4.70277    
MINY    -41.82113   
XYDATA      
260.0   -4.70495    443.669
259.9   -4.70277    443.672
259.8   -4.70929    443.674
259.7   -4.72508    443.681
259.6   -4.72720    443.69

file2.txt is this:

TITLE   MEARA Repeatv2 Run2 
DATA TYPE       
ORIGIN  JASCO   
OWNER       
DATE    18/03/08    
TIME    22:30:34    
SPECTROMETER/DATA SYSTEM    JASCO Corp., J-715, Rev. 1.00   
RESOLUTION      
DELTAX  -0.1    
XUNITS  NANOMETERS  
YUNITS  CD[mdeg]    
    HT[V]   
FIRSTX  260 
LASTX   200 
NPOINTS 601 
FIRSTY  -4.76564    
MAXY    -3.51295    
MINY    -41.95971   
XYDATA      
260 -4.76564    443.152
259.9   -4.77382    443.155
259.8   -4.78663    443.156
259.7   -4.8017 443.162
259.6   -4.83604    443.174

I have written the following Python script to concatenate the two files.

def catFiles(names, outName):
    # Append each input file, one after another, to the output file.
    with open(outName, 'w') as outfile:
        for fname in names:
            with open(fname) as infile:
                outfile.write(infile.read())

While this script does concatenate the two files, it stacks them on top of each other, so that one file comes after the other. How can I modify or rewrite it so that the files are placed next to each other instead, giving the following output?

TITLE   MEARA Repeatv2 Run2     TITLE   MEARA Repeatv2 Run2 
DATA TYPE           DATA TYPE       
ORIGIN  JASCO       ORIGIN  JASCO   
OWNER           OWNER       
DATE    18/03/08        DATE    18/03/08    
TIME    22:07:45        TIME    22:30:34    
SPECTROMETER/DATA SYSTEM    JASCO Corp., J-715, Rev. 1.00       SPECTROMETER/DATA SYSTEM    JASCO Corp., J-715, Rev. 1.00   
RESOLUTION          RESOLUTION      
DELTAX  -0.1        DELTAX  -0.1    
XUNITS  NANOMETERS      XUNITS  NANOMETERS  
YUNITS  CD[mdeg]        YUNITS  CD[mdeg]    
    HT[V]           HT[V]   
FIRSTX  260     FIRSTX  260 
LASTX   200     LASTX   200 
NPOINTS 601     NPOINTS 601 
FIRSTY  -4.70495        FIRSTY  -4.76564    
MAXY    -4.70277        MAXY    -3.51295    
MINY    -41.82113       MINY    -41.95971   
XYDATA          XYDATA      
260.0   -4.70495    443.669 260.0   -4.76564    443.152
259.9   -4.70277    443.672 259.9   -4.77382    443.155
259.8   -4.70929    443.674 259.8   -4.78663    443.156
259.7   -4.72508    443.681 259.7   -4.80170    443.162
259.6   -4.72720    443.690 259.6   -4.83604    443.174

What you want is not called file concatenation. It is a very different procedure. In brief, open both files, read a line from each file, concatenate the lines, write the new line to the output file, and repeat as needed. - DYZ
Do the files always contain the same number of lines? - MaxNoe
What do you want to achieve with this? I don't think this is the best way to get there. So why do you want to have both files in one? - MaxNoe
What MaxNoe is getting at is that there are decent file comparison tools available, if that is your intention. They handle the same text on different line numbers much better. - DoubleDouble
You might want to have a look at the Unix tool paste, which does exactly this. Of course, this is not a solution in Python, though. - Alfe

2 Answers

3
votes
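from itertools import zip_longest

with open('file1.txt') as f1, open('file2.txt') as f2, open('out.txt', 'w') as f:
    # fillvalue='' pads the shorter file with empty lines
    for left, right in zip_longest(f1, f2, fillvalue=''):
        # strip both newlines and add exactly one back, so the output stays
        # line-aligned even if a file lacks a trailing newline; the tab
        # separator is an addition, any delimiter works
        f.write(left.rstrip('\n') + '\t' + right.rstrip('\n') + '\n')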
2
votes

A text file does not actually have two dimensions (width and height) as it may seem when looking at it in a text editor. It actually just has one dimension.

For example, this file:

first line
second line
third line

actually contains a string with two newline (\n) characters:

'first line\nsecond line\nthird line'

Now, let's merge that with another file which has these contents:

blue
cheese

(or: 'blue\ncheese')

The normal way, which you call vertical, simply joins the two strings end to end:

'first line\nsecond line\nthird lineblue\ncheese'

What you want is something more complex, i.e. merge each line (and probably add some spacing as well):

'first line blue\nsecond line cheese\nthird line'

That is awkward to do directly on two big strings, so you want to:

  • split each file into a list of lines (e.g. ['first line', 'second line', 'third line'] and ['blue', 'cheese'])
  • merge each line of the first file with the corresponding line of the second file (e.g. 'first line' + ' ' + 'blue')
  • take care of excess lines, because one file may be longer (e.g. 'third line' + '')
  • join the merged lines back together, separated by newlines
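
The steps above can be tried on in-memory strings before any files are involved. This is only a minimal sketch: the function name merge_side_by_side and its sep parameter are made up for illustration.

```python
from itertools import zip_longest


def merge_side_by_side(left_text, right_text, sep=' '):
    """Merge two multi-line strings line by line (hypothetical helper)."""
    # Step 1: split each string into a list of lines.
    left_lines = left_text.splitlines()
    right_lines = right_text.splitlines()

    merged = []
    # Steps 2 and 3: pair corresponding lines; the shorter input is
    # padded with empty strings so that no line is lost.
    for left, right in zip_longest(left_lines, right_lines, fillvalue=''):
        # Only add the separator when there is something on the right.
        merged.append(left + sep + right if right else left)

    # Step 4: join the merged lines back into one string.
    return '\n'.join(merged)


result = merge_side_by_side('first line\nsecond line\nthird line', 'blue\ncheese')
print(repr(result))  # 'first line blue\nsecond line cheese\nthird line'
```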

Here is how to do that, step by step:

To read a file as lines, you can do f.read().splitlines(), but it is better to use f.readlines() or just iterate over the file object (for line in f: ...).
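
The practical difference between these options is what happens to the line endings, and how much is held in memory at once. A small sketch, using io.StringIO as a stand-in for a real file so that it runs without touching the disk:

```python
import io

text = 'first line\nsecond line\nthird line\n'

# read().splitlines() drops the newline characters.
split_result = io.StringIO(text).read().splitlines()

# readlines() keeps the trailing '\n' on every line.
readlines_result = io.StringIO(text).readlines()

# Iterating over the file object yields the same lines as readlines(),
# but one at a time instead of building the whole list up front.
iterated_result = [line for line in io.StringIO(text)]

print(split_result)
print(readlines_result)
print(iterated_result == readlines_result)
```

This is why the code further down removes the '\n' from each line before merging.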

To match corresponding lines of two files, you can use zip_longest from the itertools module:

for left_line, right_line in zip_longest(left_lines, right_lines):
    ...
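
One detail worth knowing: without a fillvalue, zip_longest pads the shorter input with None, which would show up literally as 'None' after formatting. A short sketch with the example lines from above:

```python
from itertools import zip_longest

left_lines = ['first line', 'second line', 'third line']
right_lines = ['blue', 'cheese']

# Plain zip stops at the shorter input, so 'third line' is dropped.
zipped_pairs = list(zip(left_lines, right_lines))
print(zipped_pairs)

# zip_longest keeps going and pads with None by default.
default_pairs = list(zip_longest(left_lines, right_lines))
print(default_pairs[-1])

# With fillvalue='' the padding is an empty string, which is safe to
# concatenate or format.
padded_pairs = list(zip_longest(left_lines, right_lines, fillvalue=''))
print(padded_pairs[-1])
```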

To concatenate, with padding: '{} {}'.format(left_line, right_line)
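
If the columns should line up regardless of line length, the left line can be padded to a fixed width instead of being followed by a single space. A sketch, where the width is taken from the longest left line and the gap of four spaces is arbitrary (both are assumptions, any fixed width would do):

```python
from itertools import zip_longest

left_lines = ['first line', 'second line', 'third line']
right_lines = ['blue', 'cheese']

# Width of the left column: the length of the longest left line.
width = max(len(line) for line in left_lines)

aligned = []
for left, right in zip_longest(left_lines, right_lines, fillvalue=''):
    # '{:<{w}}' left-justifies the text in a field of w characters;
    # rstrip() removes the padding again when the right side is empty.
    aligned.append('{:<{w}}    {}'.format(left, right, w=width).rstrip())

for line in aligned:
    print(line)
```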

All together, verbose:

from itertools import zip_longest

# left_filename, right_filename and output_filename are placeholders
# for your own paths.
left_lines = []
with open(left_filename, 'rt') as left_file:
    for line in left_file:
        line_without_newline = line.rstrip('\n')
        left_lines.append(line_without_newline)

right_lines = []
with open(right_filename, 'rt') as right_file:
    for line in right_file:
        line_without_newline = line.rstrip('\n')
        right_lines.append(line_without_newline)

# Pair up the lines; the shorter file is padded with empty strings.
merged_lines = []
for left_line, right_line in zip_longest(left_lines, right_lines, fillvalue=''):
    merged_lines.append('{}    {}'.format(left_line, right_line))

with open(output_filename, 'wt') as output_file:
    for merged_line in merged_lines:
        output_file.write(merged_line + '\n')

Now you can skip most of the intermediate steps to make it simpler :)

from itertools import zip_longest

# Iterating over the file objects directly avoids the intermediate lists.
with open(left_filename, 'rt') as left_file,\
     open(right_filename, 'rt') as right_file,\
     open(output_filename, 'wt') as output_file:
    for left_line, right_line in zip_longest(left_file, right_file, fillvalue=''):
        output_file.write('{}    {}\n'.format(left_line.rstrip('\n'),
                                              right_line.rstrip('\n')))
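
The same idea extends to any number of files, which is closer to the catFiles(names, outName) signature from the question. A sketch under some assumptions: the function name, the tab separator and the use of contextlib.ExitStack to open a variable number of files are choices made here for illustration, not part of the answer above.

```python
from contextlib import ExitStack
from itertools import zip_longest


def cat_files_side_by_side(names, out_name, sep='\t'):
    """Write the files in `names` next to each other into `out_name`."""
    with ExitStack() as stack:
        # ExitStack closes every file it opened when the block ends.
        infiles = [stack.enter_context(open(name)) for name in names]
        outfile = stack.enter_context(open(out_name, 'w'))
        # zip_longest(*infiles) yields one tuple per output row, holding
        # the corresponding line of every file ('' once a file runs out).
        for row in zip_longest(*infiles, fillvalue=''):
            outfile.write(sep.join(line.rstrip('\n') for line in row) + '\n')
```

With the files from the question, it could be called as cat_files_side_by_side(['file1.txt', 'file2.txt'], 'out.txt').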