0
votes

I have two data files in an unusual format that I need to parse into a decent format for later use. After parsing, I end up with two kinds of records: one contains an ID, and the information belonging to that ID comes from the other file.

Example: from file 1 I get

Name, Position, PropertyID

and from file 2

PropertyID, Property1, Property2

Both files have more columns than shown here.

What is the ideal way to store this information in a flat file that serves as a database? I don't want to use a database (MySQL, MS SQL) for certain reasons.

Initially I thought of using a single comma-separated file, but I'll end up with so many columns that updating the information will become a problem.

I'll be using the parsed data in other applications written in Java and Python. Can anyone suggest a better way to handle this?

Thanks

2
A better way? Use a database. "I don't want to use a database" is not a good reason. - Kayaman
Use an SQLite database. It does not require a server, is stored in a single file, Python has all the drivers, and it will be much better than a flat file. Why reinvent the wheel when you can use something so simple? - mikeb
Use JSON format. - nafas

2 Answers

2
votes

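I would use JSON. JSON can be easily converted to and from objects in either Python or Java. In Python, a JSON object maps directly to a dict. Java has various libraries for the conversion, such as Jackson or Gson (JAXB is primarily an XML binding, although some implementations, such as EclipseLink MOXy, can also handle JSON). That is far less work than doing it all yourself.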

Something like this.

File 1: Map people to propertyID

[
   {"firstName": "John", "lastName": "Smith", "position": "sales", "propertyId": 123},
   {"firstName": "Jane", "lastName": "Doe", "position": "manager", "propertyId": 456}
]

File 2: Map propertyId to list of properties.

{
    "123": [{"address": "123 street", "city": "LA"}, {"address": "456 street", "city": "SF"}],
    "456": [{"address": "123 ave", "city": "XX"}, {"address": "456 ave", "city": "SF"}]
}
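As a rough sketch of how the two files could be read back and combined in Python: this assumes both files are written as valid JSON, with file 1 as a list of people that each carry a propertyId field and file 2 as an object keyed by the property ID as a string. The file contents are inlined as strings so the example runs on its own, and the function name join_people_with_properties is made up for illustration.

```python
import json

# Hypothetical contents of the two files, inlined so the sketch runs as-is.
PEOPLE_JSON = """
[
  {"firstName": "John", "lastName": "Smith", "position": "sales", "propertyId": 123},
  {"firstName": "Jane", "lastName": "Doe", "position": "manager", "propertyId": 456}
]
"""

PROPERTIES_JSON = """
{
  "123": [{"address": "123 street", "city": "LA"}, {"address": "456 street", "city": "SF"}],
  "456": [{"address": "123 ave", "city": "XX"}, {"address": "456 ave", "city": "SF"}]
}
"""


def join_people_with_properties(people_text, properties_text):
    """Attach the matching property list to each person record."""
    people = json.loads(people_text)          # list of dicts
    properties = json.loads(properties_text)  # dict keyed by ID string
    joined = []
    for person in people:
        # JSON object keys are always strings, so convert the numeric ID.
        key = str(person["propertyId"])
        joined.append({**person, "properties": properties.get(key, [])})
    return joined


result = join_people_with_properties(PEOPLE_JSON, PROPERTIES_JSON)
for row in result:
    print(row["firstName"], [p["city"] for p in row["properties"]])
```

With real files you would pass the text read from each file (or use json.load on the open file) instead of the inlined strings.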

p.s. It might make more sense to associate a person with a list of property IDs and have each property have its own ID. That makes it easier to move things around and reassign them. Just my $0.02.

0
votes

Normalize your data around IDs so that a single change does not touch many different columns. For the file 2 you mentioned above, you can reduce the columns to two, a property ID and the property itself: rather than one row associating a property ID with two properties, each row holds one property ID and one property. You then need a third file to relate the two main tables. With the data normalized like this, a change usually means only a small update.

file1:

owner_id | name | position
1 | Jack Ma | CEO

file2:

property_id | property
101 | Hollywood Mansion
102 | Miami Beach House

file3:

owner_id | property_id
1 | 101
1 | 102
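Here is a minimal sketch of reading three such files in Python and following the link table from owners to properties. It assumes the files are stored pipe-delimited with a header row and no padding around the separators; the contents are inlined as strings so the example is self-contained, and the helper names read_table and properties_by_owner are made up for illustration.

```python
import csv
import io

# Hypothetical file contents, inlined so the sketch runs without real files.
OWNERS = "owner_id|name|position\n1|Jack Ma|CEO\n"
PROPERTIES = "property_id|property\n101|Hollywood Mansion\n102|Miami Beach House\n"
OWNER_PROPERTY = "owner_id|property_id\n1|101\n1|102\n"


def read_table(text):
    """Parse pipe-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="|"))


def properties_by_owner(owners_text, properties_text, links_text):
    """Map each owner name to the names of the properties linked to it."""
    owners = {row["owner_id"]: row["name"] for row in read_table(owners_text)}
    props = {row["property_id"]: row["property"] for row in read_table(properties_text)}
    result = {name: [] for name in owners.values()}
    for link in read_table(links_text):
        # Each link row ties exactly one owner to one property.
        result[owners[link["owner_id"]]].append(props[link["property_id"]])
    return result


print(properties_by_owner(OWNERS, PROPERTIES, OWNER_PROPERTY))
# prints {'Jack Ma': ['Hollywood Mansion', 'Miami Beach House']}
```

Reassigning a property to a different owner then only changes one row in the link file, which is the point of the normalization.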