Migrate and Table Wizard for CSV migration in Drupal

Migrate and Table Wizard for CSV migration in Drupal

Migrate and Table Wizard for CSV migration in Drupal

We have been exploring different ways to import CSV data into Drupal and found a nice workflow with the Migrate and Table Wizard module. There are still a few tricky parts to the process — documentation is often unclear or conflicting – and following is a description of our steps to import a CSV file into Drupal as nodes with CCK fields.

Migrating a CSV file into a Drupal content type

  1. Prune your CSV file so that it plays nice with Table Wizard. There are some tips in the README.txt for the Migrate module and I recommend you read this, especially if you would like to import taxonomy.
  2. Import the CSV file into a database table using 'Upload delimited files' at Administer > Content management > Table Wizard. You will likely get a message telling you that you need to define a primary key for the table:
    data.csv has been imported into new table csv_data. All fields have initially been defined as text, with no indexing - it is important to use your database management software to set appropriate data types for each field, and to define a single (unique, non-null) field as the primary key. After you have done this, reanalyze the table.
  3. A primary key is a unique way to identify a row in your table, just like how Drupal uses a Node ID. There are a number of ways to assign a primary key for your data: If you are using phpMyAdmin and your dataset already has a primary key, launch phpMyAdmin and make the following changes:
    1. go to the Structure view of your uploaded table
    2. change the data type of the column you wish to use as a key from longtext to a type that is finite (I used int(11))
    3. select the column you wish to use as a key and click on the little key icon to make it the primary key
    If you don't have a primary key column in your data set, you can add one easily in phpMyAdmin. Create a new column (field), give it a name (something like 'rid'), make it of type INT, and make it auto increment (A_I). Or you can run an SQL query to do everything for you: ALTER TABLE `csv_data` ADD `rid` INT NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST
  4. Now that your data has a primary key go back to Administer > Content management > Table Wizard and click on the 'Analysis' link for your data. On the Analysis page click 'Reanalyze'. If Table Wizard recognized your key, you will have a checkmark under 'PK' for your primary key's column.
  5. At this point, if you can bring up your data when you click 'View table contents' then Table Wizard understands your data.
  6. Go to Administer > Content management > Migrate > Content sets and add a content set:
    1. Give it a description
    2. choose the view that contains your data. It will probably be something like:tw: csv_data Which was created by the Table Wizard
    3. Set kind to 'node'
    4. choose the destination content type for your data
  7. This will bring up a form to map your data columns to the node fields. I set Published to a default value of 1 and then just mapped the title and body fields. Blank date fields will be populated with the current time and blank author fields will be filled in with the current user account.
  8. Go to Administer > Content management > Migrate > Process and let's create some nodes
    1. Select the checkbox in the Import column for the content set you created
    2. Hit Submit
    If you are curious what each of the options does:
    • Clear - deletes all imported nodes
    • Import - imports once
    • Scan - checks and imports new rows on a continuing basis via cron
  9. Go to Administer > Content management > Content and make sure you data was imported correctly. We're done!

The Migrate and Table Wizard modules provide a powerful way to work with data external to Drupal. Although this is a simple use case, I think it provides a good introduction to how you might utilize these modules. I'm excited to see this project mature and I hope this can help clarify the current process for others.