2
votes

I am trying to import a CSV file with one specific column being of type date-time using mongoimport utility

As of MongoDB version 3.4 we can use --columnsHaveTypes which instructs mongoimport that the field list specified in --headerline are of specific type.

My CSV file look like this

TIME,TPS
2006-01-02 07:25:24,1
2006-01-02 07:25:25,2
2006-01-02 07:25:26,2
2006-01-02 07:25:27,2
2006-01-02 07:25:28,2
2006-01-02 07:25:29,2
2006-01-02 07:25:30,1
2006-01-02 07:25:31,3

commands the i have executed but have error-ed out

 >>mongoimport --db test1 --collection tpsformat --type csv --file C:\ARMS\TestTPS.csv --headerline TIME.date\(2006-01-02 15:04:05\),TPS.string\(\) --columnsHaveTypes --maintainInsertionOrder

2017-08-22T16:36:52.114+0530    error validating settings: only one positional argument is allowed
2017-08-22T16:36:52.119+0530    try 'mongoimport --help' for more information

>>mongoimport --db test1 --collection tpsformat --type csv --file C:\ARMS\TestTPS.csv --headerline TIME.string\(\),TPS.string\(\) --columnsHaveTypes --maintainInsertionOrder

2017-08-22T16:36:59.539+0530    error validating settings: incompatible options: --file and positional argument(s)
2017-08-22T16:36:59.543+0530    try 'mongoimport --help' for more information

However a simple import without --columnsHaveTypes executes fine

>>mongoimport --db test1 --collection tpsformat --type csv --file C:\ARMS\TestTPS.csv --headerline --maintainInsertionOrder

2017-08-22T16:37:07.928+0530    connected to: localhost
2017-08-22T16:37:07.936+0530    imported 8 documents

Can someone suggest what I am doing wrong here.

Thanks

1
Read the manual "Field names must be in the form of <colName>.<type>(<arg>). You must backslash-escape the following characters if you wish to include them in an argument: (, ), and \.". Your column names are not following that format. - Neil Lunn
--headerline reads from the actual file. If you don't mean that then you need to remove the line from the file and use --fields instead. But you also have the formats wrong. And you do not want to import either as "strings". Strings for dates in particular is really bad. - Neil Lunn
@NeilLunn Both the fields are in required format TIME.string(),TPS.string() where time = <colName> type = string and both (, ) have been escaped using \ . - Kaushik Bose
I already told you why. And deleting and reposting your comment destroys the natural flow of conversation. So don't do that. Like I said. There are clear examples in the manual, and you are not following them, - Neil Lunn
@NeilLunn have removed the first line of csv and have used --fields as well but that also didn't worked. if you see the format is wrong can you please, suggest the correct format? - Kaushik Bose

1 Answers

4
votes

I know this is an old post but I figured I would provide an answer in case someone else is looking for help with this issue. I too had issues until I read the Go documentation and took everything VERY literally.

First, when using the --headerline and --columnsHaveTypes parameters, the headers and column definitions are expected to be in your CSV file. Not defined on the command line.

Second, when defining a date format you MUST literally use the values used to describe the formats from the Go documentation.

Go Time Format

Ex. Mon Jan 2 15:04:05 MST 2006

It is literally meant that you must use Mon (case specific) to state that you want to indicate that you will be sending a three letter day of the week in your date format. If you were to use "Tue" in your format specification, it will not work. You can see this for yourself in the nextStdChunk code within the Go documentation.

Finally, when defining a date field for Mongo you do not need to use quotes around your format. In fact, when I did use quotes it never worked.

To put this all together for the OP...

Your header line in your CSV file should look like this:

TIME.date(2006-01-02 15:04:05),TPS.string()
2006-01-02 07:25:24,1
2006-01-02 07:25:25,2
2006-01-02 07:25:26,2
2006-01-02 07:25:27,2
2006-01-02 07:25:28,2
2006-01-02 07:25:29,2
2006-01-02 07:25:30,1
2006-01-02 07:25:31,3

NOTE: In the date format given above I used 15 to indicate a zero padded 24 hour time value since I can only guess that you are providing 24 hour times.

The mongoimport command line would look like this:

mongoimport --db test1 --collection tpsformat --type csv --file C:\ARMS\TestTPS.csv --headerline --columnsHaveTypes --maintainInsertionOrder