python - How to combine 2 csv files with common column value, but both files have different number of lines

Question

file1.csv contains 2 columns: c11;c12
file2.csv contains 2 columns: c21;c22
The common (key) column is c11 in file1.csv and c21 in file2.csv.

Example:

file1.csv

a;text_a
b;text_b
f;text_f
x;text_x

file2.csv

a;path_a
c;path_c
d;path_d
k;path_k
l;path_l
m;path_m

Expected output (both files combined):

a;text_a;path_a
b;text_b;''
c;'';path_c
d;'';path_d
f;text_f;''
k;'';path_k
l;'';path_l
m;'';path_m
x;text_x;''

How can I do this in Python?

If this is all you need, have a look at the command-line join tool: linux.die.net/man/1/join - eumiro
Thank you for the suggestion, but an example of how to use the join command for this case would be very welcome - user1042891

BrtH · Accepted Answer · 2012-08-27T11:15:41

This is quite easily done with the csv module:

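import csv

# Read each file into a dict mapping the key column to the value column.
with open('file1.csv', newline='') as f:
    r = csv.reader(f, delimiter=';')
    dict1 = {row[0]: row[1] for row in r if row}

with open('file2.csv', newline='') as f:
    r = csv.reader(f, delimiter=';')
    dict2 = {row[0]: row[1] for row in r if row}

# Union of the keys from both files, sorted to match the expected output.
# (dict1.keys() + dict2.keys() only works on Python 2.)
keys = sorted(set(dict1) | set(dict2))

# Python 3: text mode with newline=''. On Python 2, open with 'wb' instead.
with open('output.csv', 'w', newline='') as f:
    w = csv.writer(f, delimiter=';')
    # A key missing from one file gets the placeholder '' in that column.
    w.writerows([key, dict1.get(key, "''"), dict2.get(key, "''")]
                for key in keys)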
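
To check this approach against the sample data from the question without touching the disk, the same dictionary-based merge can be wrapped in a function that works on strings. This is only a sketch for Python 3: the names outer_join, load and missing are made up for illustration, and it assumes every non-blank line has at least two fields.

```python
import csv
import io


def outer_join(left_text, right_text, delimiter=';', missing="''"):
    """Full outer join of two 2-column CSV texts on their first column."""
    def load(text):
        reader = csv.reader(io.StringIO(text), delimiter=delimiter)
        # Skip blank lines and strip stray padding around each field.
        return {row[0].strip(): row[1].strip() for row in reader if row}

    left, right = load(left_text), load(right_text)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator='\n')
    # Sorted union of keys: every key from either input appears exactly once.
    for key in sorted(set(left) | set(right)):
        writer.writerow([key, left.get(key, missing), right.get(key, missing)])
    return out.getvalue()


# Sample data from the question.
f1 = "a;text_a\nb;text_b\nf;text_f\nx;text_x\n"
f2 = "a;path_a\nc;path_c\nd;path_d\nk;path_k\nl;path_l\nm;path_m\n"
print(outer_join(f1, f2))
```

Keys that appear in only one input get the '' placeholder in the other column, so the printed rows should line up with the expected output in the question.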
