I'm working on a Drupal project where I have two types of nodes built with CCK content types -- "venues" (parent nodes) and "concerts" (child nodes).

I've imported a bunch of well-known venues from CSV via the node_import module, and now need to import another CSV filled with shows.

How do I:

a. Reference the venues from the concerts CSV -- what I essentially need to do is create a nodereference based on a location.module address (not a CCK field, because of how I imported the venues). I don't know how to import like this, especially when the same address may vary somewhat between the two sheets (in punctuation, etc.).

b. For some concert listings, it's likely my venue datasheet will be incomplete. When importing, how do I create parent nodes when the importer can't find the address referenced by the child node? Note that the concerts CSV has most of the information from the venues CSV in each row.

I'm currently using node_import but am thinking I may need to use the Data API for this. I have no experience with the latter (or a preference for a particular importing method, really) and would be grateful for any help you can give me.

1 Answer

Feeds will be a good solution for you, I think. You'll need to set it up with importers for the two different content types, and depending on how your files are structured you may need to run the same file through both importers.

For the node reference you'll either need to use the node_reference mapper that may only exist as a patch at this time, or write your own (which we ended up doing in an afternoon).

We are currently doing something similar, importing 60k auction listings spread across 30-40 events. Works like a charm.

node reference mapper: http://drupal.org/node/724536

feeds: http://drupal.org/project/feeds

Regarding why you may need to import the same file more than once, here is a simplified example of how we import a CSV file that has info about items and events in the same file. Say you have a file with the following structure:

|item_id|event_id|item_body  |event_body  |
|12231  | 123    | 'price $1'| 'on friday'|
|12232  | 123    | 'price $5'| 'on friday'|

We run this file through both an event content type importer and an item content type importer. For both, the *_id column is the GUID and the *_body column is imported as the body. The event importer runs first. For each item, the event_id is used to find the node that was created for the event, and a node reference is created. The columns for the content type that is not being imported are ignored.
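To make the order of operations concrete, here is a rough sketch of that two-pass logic in plain Python. It is not Feeds code and does not use any Drupal API; the function names and the in-memory `nodes` dict are made up for illustration, and it assumes a comma-separated version of the sample file above.

```python
import csv
import io

# Comma-separated version of the sample file above (illustrative data only).
CSV_TEXT = """item_id,event_id,item_body,event_body
12231,123,price $1,on friday
12232,123,price $5,on friday
"""


def import_events(rows, nodes):
    """First pass: create one event node per distinct event_id (the GUID)."""
    for row in rows:
        guid = ("event", row["event_id"])
        # Assumption: repeated rows for the same event are skipped, not updated.
        if guid not in nodes:
            nodes[guid] = {"type": "event", "body": row["event_body"]}


def import_items(rows, nodes):
    """Second pass: create item nodes that reference the already-imported event."""
    for row in rows:
        parent = ("event", row["event_id"])
        if parent not in nodes:
            raise KeyError("event %s has not been imported yet" % row["event_id"])
        nodes[("item", row["item_id"])] = {
            "type": "item",
            "body": row["item_body"],
            "event_ref": parent,  # stands in for the node reference
        }


def run_import(csv_text):
    """Run the same file through both passes, events first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    nodes = {}
    import_events(rows, nodes)  # item columns are ignored here
    import_items(rows, nodes)   # event_body is ignored here
    return nodes


nodes = run_import(CSV_TEXT)
```

Each pass reads only the columns for its own content type, which is why the same file can go through both importers unchanged.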

Only the first instance of the event_id needs the body or other fields, but some of our providers duplicate the data, which doesn't actually slow things down, from what I remember. We also have providers that send two separate files, one for events and one for items, with the items file containing a column for the event ID.
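For the addresses that differ in punctuation between your two sheets, one option is to clean both CSVs into a normalised key before importing and use that key as the venue GUID. The sketch below is plain Python, not part of Feeds or node_import; the column names `address` and `venue_name` are guesses at your data, and the normalisation only handles case, punctuation and spacing (it will not match "St" against "Street").

```python
import re


def address_key(address):
    """Reduce an address to a lowercase, punctuation-free key for matching."""
    key = address.lower()
    key = re.sub(r"[^\w\s]", " ", key)  # replace punctuation with spaces
    return " ".join(key.split())        # collapse runs of whitespace


def find_or_create_venue(venues, row):
    """Return the venue key for a concert row, creating the venue if missing."""
    key = address_key(row["address"])
    if key not in venues:
        # Venue not in the venues sheet: build it from the concert row.
        venues[key] = {"title": row["venue_name"], "address": row["address"]}
    return key


# Usage with made-up data: one known venue, two concert rows.
venues = {address_key("123 Main St., Springfield"): {"title": "The Hall"}}
known = find_or_create_venue(
    venues, {"address": "123 main st springfield", "venue_name": "The Hall"}
)
new = find_or_create_venue(
    venues, {"address": "9 Elm Rd.", "venue_name": "Elm Club"}
)
```

Running the concerts file through the venue importer first, keyed this way, would also cover your case (b), since any venue missing from the venues sheet gets created from the concert row.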