Drupal Migrations in ddev

Examples to make migrations a little bit easier.

Oskar Yildiz, Unsplash

Drupal migrations are among the hardest things to debug. For a start, you usually have to deal with large amounts of data, since migrating 5, 10 or 25 items is best done by hand, and yes, that means copy and paste.

But running a migration of thousands of items and waiting a long time, only to discover that you used the wrong target field, is not very productive. And if you use the ‘limit’ option during a migration, you will miss the content that was added most recently, which is exactly the content that uses the newest features. Luckily there is a solution for this, and for the problem this solution causes.

Another problem when running a local site in ddev is getting debugging to work with PhpStorm for a Drush command. I will show what we did to make sure our breakpoints really, well, break.

While configuring a number of migrations I found that not all documentation found on the web is, let us say, completely up to date or accurate. I don’t think I can solve this, but I will give a number of examples that worked for me.

Lots of data

So what to do if you have lots of data and do not want to watch a command prompt with a slowly increasing percentage for hours? As mentioned before, using the limit option will not only limit the number of items you import but also the variation in the set of imported data. A solution we use here at LimoenGroen is to write a custom source plugin and alter the query used to get the source items.

Basic skeleton

Thanks to object-oriented inheritance this is remarkably simple, as the following code fragment shows. It builds on the core D7 node source plugin. We start with a basic skeleton like this:

<?php

namespace Drupal\test_migrate\Plugin\migrate\source;

use Drupal\migrate\Row;
use Drupal\node\Plugin\migrate\source\d7\Node;

/**
 * Class TestSource.
 *
 * @package Drupal\test_migrate\Plugin\migrate\source
 *
 * @MigrateSource(
 *   id = "d7_node_complete_test",
 *   source_module = "node"
 * )
 */
class TestSource extends Node {

  /**
   * {@inheritdoc}
   */
  public function query() {
    $query = parent::query();
    return $query;
  }

  /**
   * {@inheritdoc}
   */
  public function prepareRow(Row $row) {
    // Allow parent classes to ignore rows.
    if (parent::prepareRow($row) === FALSE) {
      return FALSE;
    }
    return TRUE;
  }

}

Representative subset

First of all we want to ensure we get a limited but representative subset. We achieve this by adding a setting to the Drupal settings file and using it to reduce the number of items. In the settings file we add:

$settings['test_migrate_reduce_factor'] = 50;

And we adjust the query-function like this:

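  // Requires at the top of the file: use Drupal\Core\Site\Settings;
  public function query() {
    $query = parent::query();
    // Optional reduce factor from the settings file, for example 50.
    $reduce_factor = Settings::get('test_migrate_reduce_factor');
    if ($reduce_factor && is_int($reduce_factor)) {
      // Keep only nodes whose ID is a multiple of the reduce factor.
      $query->where('MOD(n.nid, :reduce_factor) = 0', [':reduce_factor' => $reduce_factor]);
    }
    return $query;
  }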

By adding a modulus (‘MOD’) clause to the query, we make sure that only source items whose node ID is a multiple of n (in our example 50), so roughly every n-th item, are included in the migration. Using a setting variable is not mandatory, but it makes it easy to test different subsets of source items.
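
To see which items survive the filter, the MOD condition can be mimicked in plain PHP. This is only an illustrative sketch outside Drupal: the function name select_subset() and the list of node IDs are made up for the example and are not part of the source plugin.

```php
<?php

/**
 * Mimics the SQL condition MOD(nid, :reduce_factor) = 0 on a list of IDs.
 *
 * Hypothetical helper for illustration; it is not part of the source plugin.
 */
function select_subset(array $nids, int $reduce_factor): array {
  // array_values() re-indexes the result, since array_filter() keeps keys.
  return array_values(array_filter(
    $nids,
    fn (int $nid): bool => $nid % $reduce_factor === 0
  ));
}

// With a factor of 50, only node IDs that are multiples of 50 are kept.
print_r(select_subset(range(1, 200), 50));
```

For the IDs 1 to 200 and a factor of 50 this keeps 50, 100, 150 and 200. On a real site the node IDs are rarely contiguous, so the subset will usually be somewhat smaller or larger than exactly one in fifty.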

This however presents us with a new problem, especially when updating content: which content was updated? A quick solution for this is adding a second setting:

$settings['test_migrate_debug_source_field'] = 'title';

and adjust the prepareRow-function like this:

  // Requires at the top of the file: use Drupal\Core\Site\Settings; and use Drush\Drush;
  public function prepareRow(Row $row) {
    // Allow parent classes to ignore rows.
    if (parent::prepareRow($row) === FALSE) {
      return FALSE;
    }
    $debug_property = Settings::get('test_migrate_debug_source_field');
    if (!empty($debug_property)) {
      $field = $row->getSourceProperty($debug_property);
      if (is_array($field)) {
        // Pass TRUE so var_export() returns the string instead of printing it.
        $field = var_export($field, TRUE);
      }
      // Only show output if run from command line.
      if (PHP_SAPI === 'cli') {
        Drush::output()->writeln((string) $field);
      }
    }
    return TRUE;
  }

This little piece of code prints the value of the source property defined in $settings['test_migrate_debug_source_field'] to the shell when running the Drush command. With this it is easy to find the content that was updated in the Drupal admin. And because it is a setting it is also easy to define a logical source value for any migration.
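
One detail is worth checking when the debug property is an array: var_export() prints its argument directly unless the second argument is TRUE, in which case it returns a string. The sketch below shows that conversion in isolation. The helper name format_debug_value() is hypothetical and only serves the example.

```php
<?php

/**
 * Turns a source property value into a single printable string.
 *
 * Hypothetical helper for illustration only.
 */
function format_debug_value($value): string {
  if (is_array($value)) {
    // The second argument makes var_export() return the string
    // instead of printing it immediately.
    return var_export($value, TRUE);
  }
  // Scalars and NULL can simply be cast to a string.
  return (string) $value;
}

echo format_debug_value('My news item'), PHP_EOL;
echo format_debug_value(['value' => 'Body text', 'format' => 'filtered_html']), PHP_EOL;
```

A simple property such as title is printed as is. A field such as body is printed as an exported array, which is more verbose but still makes the item easy to recognise.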

To output the debug information we use a function from Drush. To make sure we are running the migration with Drush, and not via the Migrate UI, we wrap the call in an if statement that checks whether PHP is running from the command line; see the remark about drupal_is_cli() on drupal.org.

Debugging a migration

When using PhpStorm, debugging Drupal code is made extremely easy with the so-called ‘Zero-configuration debugging’. But using Drush in a ddev environment adds some extra layers that make additional configuration necessary.

The first problem was that PhpStorm needs the server name when running Drush, and that this was not added automatically. It can be set by adding the ‘PHP_IDE_CONFIG’ variable to .ddev/docker-compose.environment.yaml:

# Ddev environment variables.

version: '3.6'

services:
  web:
    environment:
      - DRUSH_OPTIONS_URI=$DDEV_PRIMARY_URL
      - PHP_IDE_CONFIG=serverName=<local url>

Here <local url> should be replaced by the URL on which your local development site can be reached.

Running the Drush command from within the ddev-container proved much more reliable than running it from the host.

$ ddev ssh
$ cd sites/dev/
$ /var/www/html/vendor/bin/drush mim my_migration_id --update

This assumes you have installed a site-local version of Drush with composer require drush/drush. That is generally a good idea anyway, especially when the site is not hosted on your own servers and you have little or no influence on the software installed on the server.

If you debug for the first time in a project, PhpStorm will ask you for the path mapping of the Drush executable. Simply add the local path.

Some examples

Body to paragraph

In most migrations we implement at LimoenGroen we have a number of default actions. One of these is migrating all or part of the original body text of a Drupal 7 site to a paragraph in the new Drupal 8 site.

This is a two-step process. First we create the paragraphs with a migration defined as:

langcode: nl
status: true
dependencies: {  }
id: test_d7_paragraphs_news_item
class: Drupal\node\Plugin\migrate\D7NodeTranslation
field_plugin_method: null
cck_plugin_method: null
migration_tags:
  - 'Drupal 7'
  - Content
migration_group: test_drupal_7
label: 'Node News item body -> News paragraph'
source:
  plugin: d7_node_complete_test
  node_type: news_item
process:
  field_text:
    plugin: sub_process
    source: body
    process:
      value:
        plugin: remove_media_tags
        source: value
      format:
        plugin: default_value
        default_value: filtered_html
destination:
  plugin: 'entity_reference_revisions:paragraph'
  default_bundle: text
migration_dependencies: null

First of all we use our extended source plugin d7_node_complete_test. We also use a process plugin (remove_media_tags) to remove all embedded media; this is rather specific to this case, but it shows how such a plugin is used.

In the destination section we set the default bundle to our custom paragraph type text which has a rich text field field_text which is filled in the process.

The next step is to add the created paragraphs to the nodes we have imported. The (partial) configuration for this:

langcode: nl
status: true
dependencies: {  }
id: test_d7_node_complete_news_item
class: Drupal\node\Plugin\migrate\D7NodeTranslation
field_plugin_method: null
cck_plugin_method: null
migration_tags:
  - 'Drupal 7'
  - Content
migration_group: test_drupal_7
label: 'Node complete (News item -> News)'
source:
  plugin: d7_node_complete_test
  node_type: news_item
process:
  (...)
  text_paragraph:
    plugin: migration_lookup
    migration: test_d7_paragraphs_news_item
    source: nid
  field_paragraphs:
    plugin: sub_process
    source:
      - '@text_paragraph'
    process:
      target_id: '0'
      target_revision_id: '1'
destination:
  plugin: 'entity_complete:node'
  translations: true
  default_bundle: article
migration_dependencies:
  required: 
    - test_d7_paragraphs_news_item
  optional: {}

What happens here is that we first use the migration_lookup plugin to look up the paragraph created in the migration with the id test_d7_paragraphs_news_item, and store the result in the pseudo-field text_paragraph.

Next we add this paragraph to the correct field (in our case field_paragraphs). In the process section of this field we also set target_id and target_revision_id, the two internal values that an entity reference revisions field depends on. The keys '0' and '1' refer to the first and second value of the lookup result, which are the ID and the revision ID of the paragraph.
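
The mapping that sub_process performs here can be sketched in plain PHP. This is a simplified stand-in, not the plugin's actual code: it assumes the lookup returns the paragraph ID and revision ID as a two-element array, which is what the '0' and '1' keys in the configuration suggest. The function name map_paragraph_references() and the IDs 12 and 34 are made up for the example.

```php
<?php

/**
 * Simplified stand-in for what sub_process does with the lookup result.
 *
 * Hypothetical helper; the real plugin works on Row objects.
 */
function map_paragraph_references(array $items): array {
  $references = [];
  foreach ($items as $item) {
    $references[] = [
      // Key 0 of each item holds the paragraph ID.
      'target_id' => $item[0],
      // Key 1 holds the paragraph revision ID.
      'target_revision_id' => $item[1],
    ];
  }
  return $references;
}

// Assumed lookup result for one node: paragraph 12, revision 34.
$text_paragraph = [12, 34];
print_r(map_paragraph_references([$text_paragraph]));
```

The source of the sub_process step is a list with one entry, the pseudo-field, so the result is a field with a single paragraph reference.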

List of links

To migrate a multi-value field of links from a Drupal 7 site, use the following snippet:

field_links:
    plugin: sub_process
    source: field_links
    process:
      uri: url
      title: title
      options: attributes

Here the name of the Drupal 7 field is the same as the name of the Drupal 8 field, field_links. Note the small but important difference between the Drupal 7 url and the Drupal 8 uri values.

Language of the node

The above examples were taken from sites in which two languages were defined (English and Dutch), but only Dutch was used.

So it seemed safe to assume that we could simply use the source language from the Drupal 7 node as target language for the Drupal 8 site. At first it looked like this worked fine, until we examined a number of nodes that were previously imported from a Drupal 6 site (yes, the sites have a long Drupal history…).

These nodes did not have language ‘nl’ but ‘und’, which led to a number of problems in the Drupal 8 site with respect to translations and automatic URL aliases. The solution was simple. In the process section we added:

langcode:
    -
      plugin: default_value
      source: language
      default_value: nl
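
One thing to verify against your own data: as far as I know, default_value only substitutes the default when the incoming value is empty. The sketch below mimics that fallback in plain PHP under that assumption, next to a variant that also treats ‘und’ as missing. Both function names are hypothetical and neither is the plugin's actual code.

```php
<?php

/**
 * Mimics a non-strict fallback: the default is used only for empty values.
 *
 * Hypothetical helper for illustration; not the plugin's actual code.
 */
function fallback_langcode(?string $language, string $default = 'nl'): string {
  return $language ?: $default;
}

/**
 * Alternative sketch: treat 'und' like a missing language as well.
 */
function map_langcode(?string $language, string $default = 'nl'): string {
  return ($language === NULL || $language === '' || $language === 'und')
    ? $default
    : $language;
}

// An empty language falls back to the default in both variants.
var_dump(fallback_langcode(''));
// A literal 'und' passes through the plain fallback unchanged.
var_dump(fallback_langcode('und'));
// The mapping variant replaces it.
var_dump(map_langcode('und'));
```

If the source rows contain the literal value und rather than an empty language, an extra mapping step may be needed in addition to default_value, for example with the core static_map process plugin.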

Conclusion

Migrations will always be hard, despite all the effort the Drupal community has put into the migrate classes. So debugging migrations will remain a necessary skill. Maybe this article will help you get the job done.