Migrating Data Using Drupal's Migrate Module
One very common task faced by web developers worldwide is moving data from an old site to a new one. I have frequently faced data migrations where the old and new systems are not even compatible (e.g. another CMS -> Drupal, or better yet, custom ASP + HTML -> Drupal! Fun!). Drupal's Migrate module is an extremely powerful and robust tool for managing this often complicated task. Drupal.org has plenty of documentation on Migrate's basic functionality, so in this article I will discuss some of the module's pitfalls.
There are, as always, multiple options for performing data migrations to Drupal. If you need to import data on a regular basis, such as pulling an RSS feed onto your site, I suggest looking into the Feeds module. Its features are focused more around periodic data imports, although it does do many of the same things as Migrate.
What You Need To Know
Knowledge of PHP is essential to writing good Drupal data migrations. You don't need to be a Drupal expert, but you should be comfortable writing at least basic PHP scripts.
You will also need to understand your source data. If you are importing from a CSV file you will need to know what each of your fields means, which delimiter separates them (commas or colons...) and which character encloses them (single or double quotes). If you are importing from an SQL database you should be familiar with Drupal's Database API.
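To make the delimiter and enclosure question concrete, here is a minimal sketch in plain PHP, outside of Migrate. The function name parse_source_csv() and the sample data are made up for illustration. It assumes one record per line, so it will not cope with quoted fields that contain line breaks.

```php
<?php
/**
 * Parse CSV text into rows keyed by the column names on the first line.
 *
 * Hypothetical helper for inspecting source data; not part of Migrate.
 */
function parse_source_csv($text, $delimiter = ',', $enclosure = '"') {
  // Split into lines. Assumes no line breaks inside quoted fields.
  $lines = preg_split('/\r\n|\r|\n/', trim($text));
  // The first line holds the column names.
  $header = str_getcsv(array_shift($lines), $delimiter, $enclosure, '\\');
  $rows = array();
  foreach ($lines as $line) {
    $values = str_getcsv($line, $delimiter, $enclosure, '\\');
    // Skip blank or malformed lines rather than guessing at their meaning.
    if (count($values) !== count($header)) {
      continue;
    }
    $rows[] = array_combine($header, $values);
  }
  return $rows;
}

// Colon-delimited source with single-quoted values.
$sample = "id:name:phone\n1:'Acme, Inc.':'555-0100'\n2:'Beta Co':'555-0199'";
$rows = parse_source_csv($sample, ':', "'");
print_r($rows[0]);
```

Getting the delimiter or enclosure wrong here usually shows up as shifted or merged columns, which is much easier to spot in a dump like this than halfway through a migration.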
Where To Look
The Migrate documentation on drupal.org does explain much of the module's functionality. Within the Migrate code itself you will find a "migrate_example" folder with two example setups: "beer" and "wine". If you are willing and able to jump into the code of these examples, they are quite well commented. Unfortunately, comments can only go so far in explaining what is happening.
Field Handlers
One annoying aspect of the Migrate module's design is that Field Handlers for special data formats (field types) are provided EITHER by the Migrate Extras module OR by the module that defines the field. For example, at the time of writing, Migrate Extras (7.x-2.3) has the Field Handler for Address Field, but if you want to import a URL using the Link module you need Link 7.x-1.x-dev (7.x-1.0 doesn't have the MigrateLinkFieldHandler yet!).
This complicated my debugging a few times: my code was, in fact, well written and the field names were all correct, but the Migrate code had looked at the destination field type, decided it didn't know how to handle a "link", and quietly skipped that field.
When you are using a special field (such as Link or Phone) and you find that it is not being migrated into, you may need to write your own Field Handler for it. In the case of Phone, the field is a single text string (unlike Address Field, which has many different components), so I was able to take the MigrateEmailFieldHandler from Email 7.x-1.x-dev:
class MigrateEmailFieldHandler extends MigrateFieldHandler {
  public function __construct() {
    $this->registerTypes(array('email'));
  }

  public function prepare(stdClass $entity, array $field_info, array $instance, array $values) {
    // Setup the Field API array for saving.
    $arguments = (isset($values['arguments'])) ? $values['arguments'] : array();
    $language = $this->getFieldLanguage($entity, $field_info, $arguments);
    $delta = 0;
    foreach ($values as $value) {
      $return[$language][$delta]['email'] = $value;
      $delta++;
    }
    return isset($return) ? $return : NULL;
  }
}
-- and change it to MigratePhoneFieldHandler:
class MigratePhoneFieldHandler extends MigrateFieldHandler {
  public function __construct() {
    $this->registerTypes(array('phone'));
  }

  public function prepare(stdClass $entity, array $field_info, array $instance, array $values) {
    // Setup the Field API array for saving.
    $arguments = (isset($values['arguments'])) ? $values['arguments'] : array();
    $language = $this->getFieldLanguage($entity, $field_info, $arguments);
    $delta = 0;
    foreach ($values as $value) {
      $return[$language][$delta]['value'] = $value;
      $delta++;
    }
    return isset($return) ? $return : NULL;
  }
}
Notice how small this change is: only the class name, the registered type and the column name differ. Without a handler class that registers the type "phone", the Migrate module will silently fail to import into phone field destinations, even though there is nothing particularly special about phone fields (they are just text strings at the end of the day). *ARGH*
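To see what such a handler actually hands to the Field API, here is a rough sketch that runs outside Drupal. The MigrateFieldHandler class below is a drastically simplified stand-in that I wrote only so the example is self-contained; the real base class does far more. The sketch also removes the 'arguments' key before looping so that an arguments array cannot end up stored as a phone value. Treat that step as a defensive assumption, not as something the Email module does.

```php
<?php
// Simplified stand-in for Migrate's base class (an assumption for this
// sketch only; inside Drupal the real class is provided by Migrate).
abstract class MigrateFieldHandler {
  protected $types = array();

  public function registerTypes(array $types) {
    $this->types = $types;
  }

  public function getFieldLanguage($entity, array $field_info, array $arguments) {
    // 'und' is the value of Drupal 7's LANGUAGE_NONE constant.
    return isset($arguments['language']) ? $arguments['language'] : 'und';
  }
}

class MigratePhoneFieldHandler extends MigrateFieldHandler {
  public function __construct() {
    $this->registerTypes(array('phone'));
  }

  public function prepare(stdClass $entity, array $field_info, array $instance, array $values) {
    $arguments = isset($values['arguments']) ? $values['arguments'] : array();
    // Defensive: keep the arguments array out of the values loop.
    unset($values['arguments']);
    $language = $this->getFieldLanguage($entity, $field_info, $arguments);
    $return = array();
    foreach (array_values($values) as $delta => $value) {
      // Phone stores its data in a single 'value' column.
      $return[$language][$delta]['value'] = $value;
    }
    return $return ? $return : NULL;
  }
}

$handler = new MigratePhoneFieldHandler();
$field = $handler->prepare(new stdClass(), array(), array(), array('555-0100', '555-0199'));
print_r($field);
```

The result is nested by language, then delta, then column name, which is the shape the Field API expects when the entity is saved.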
Destination Handlers
If you have modified a standard entity type or bundle such as 'user' or 'article' and it has new fields and properties, you are all set to use Migrate. However, if you have created your own entity type(s), you will need to tell your migration code how to import into them by extending MigrateDestinationEntity. I used the code for MigrateDestinationNode as a starting point. This process takes much longer than writing a new field handler, and requires a lot of variable name changes and some function call changes. For example, MigrateDestinationNode uses node_load(), which clearly won't work for my custom entity, so I changed my code to use an EntityFieldQuery object instead.
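As a rough idea of what replacing node_load() can look like, here is a sketch of a lookup helper built on EntityFieldQuery and entity_load(). The function name mymodule_load_existing_entity() is hypothetical, and this is not my actual destination class, only the lookup part. It needs a bootstrapped Drupal 7 site to run.

```php
<?php
/**
 * Load one entity of a custom type by ID, or return NULL if none exists.
 *
 * Hypothetical helper; the name and the way it is used are assumptions.
 */
function mymodule_load_existing_entity($entity_type, $entity_id) {
  $query = new EntityFieldQuery();
  $result = $query
    ->entityCondition('entity_type', $entity_type)
    ->entityCondition('entity_id', $entity_id)
    ->execute();

  // execute() returns stub objects keyed by entity type, then by ID.
  if (empty($result[$entity_type])) {
    return NULL;
  }

  // Turn the stubs into fully loaded entities.
  $entities = entity_load($entity_type, array_keys($result[$entity_type]));
  return $entities ? reset($entities) : NULL;
}
```

A destination class could call a helper like this wherever MigrateDestinationNode calls node_load(), for example when updating previously imported items.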
Field Arguments
Field argument passing has changed in Migrate 7.x-2.4. It should now be possible to migrate data into complex-value fields, such as AddressField, by using a colon to address each sub-field. Here is the database table behind an AddressField field (AddressField 7.x-1.0-beta2):
+-------------------------------------------+------------------+------+-----+---------+-------+
| Field                                     | Type             | Null | Key | Default | Extra |
+-------------------------------------------+------------------+------+-----+---------+-------+
| entity_type                               | varchar(128)     | NO   | PRI |         |       |
| bundle                                    | varchar(128)     | NO   | MUL |         |       |
| deleted                                   | tinyint(4)       | NO   | PRI | 0       |       |
| entity_id                                 | int(10) unsigned | NO   | PRI | NULL    |       |
| revision_id                               | int(10) unsigned | YES  | MUL | NULL    |       |
| language                                  | varchar(32)      | NO   | PRI |         |       |
| delta                                     | int(10) unsigned | NO   | PRI | NULL    |       |
| field_org_address_country                 | varchar(2)       | YES  |     |         |       |
| field_org_address_administrative_area     | varchar(255)     | YES  |     |         |       |
| field_org_address_sub_administrative_area | varchar(255)     | YES  |     |         |       |
| field_org_address_locality                | varchar(255)     | YES  |     |         |       |
| field_org_address_dependent_locality      | varchar(255)     | YES  |     |         |       |
| field_org_address_postal_code             | varchar(255)     | YES  |     |         |       |
| field_org_address_thoroughfare            | varchar(255)     | YES  |     |         |       |
| field_org_address_premise                 | varchar(255)     | YES  |     |         |       |
| field_org_address_sub_premise             | varchar(255)     | YES  |     |         |       |
| field_org_address_organisation_name       | varchar(255)     | YES  |     |         |       |
| field_org_address_name_line               | varchar(255)     | YES  |     |         |       |
| field_org_address_first_name              | varchar(255)     | YES  |     |         |       |
| field_org_address_last_name               | varchar(255)     | YES  |     |         |       |
| field_org_address_data                    | longtext         | YES  |     | NULL    |       |
+-------------------------------------------+------------------+------+-----+---------+-------+
As you can see, it is not made up of a single text field like Phone or Email. Before Migrate 2.4, getting these sub-values into this field was not straightforward. Since 2.4, you should be able to write code for AddressField like this:
$this->addFieldMapping('field_address', 'source_country');
$this->addFieldMapping('field_address:locality', 'source_city');
$this->addFieldMapping('field_address:thoroughfare', 'source_street');
Yay!
In Conclusion
In the interests of brevity I will stop writing here. Suffice it to say Migrate is an extremely powerful module for transferring ANY old site's data into Drupal. Data migration in general is not a straightforward practice - it depends on knowledge of both your source data schema and your destination schema - and specifically with Migrate you must have good working knowledge of PHP as well as familiarity with Drupal's fields and class structure. The API is your friend!