Drupal 8

Extending Drupal 8 Fields That Contain Data

The Exception

Data protection is one of the primary advantages of Drupal, but sometimes there are exceptions to the rule and you might need to modify a field to account for some change in business needs. There are a few rule bends (read: hacks) that can be done to circumvent Drupal's checks and still maintain data integrity. You should only do this when extending a field or changing something reasonable that is allowed by the database server, for example, expanding a varchar length or changing an unformatted string field into a formatted text field.

For a bit of background, I subscribe to the orthodoxy that every Drupal 8 deployment must run through the following command chain without errors:

drush updatedb -y; drush config-import -y; drush entity-updates -y; drush cache-rebuild

And so, in between each code snippet below, I would perform test deployments with this chain on my local, where a failure during the entity schema update led me to the last "ahHa!" moment.

In this example, I had to convert the Page Header Content field from a plain text area to a WYSIWYG input. This seemed to be an extra simple situation, since the field_page_header_content_value column was already typed as LONGTEXT in the database and the only change needed was a new column for the filter format value in the tables, node__field_page_header_content and node_revision__field_page_header_content. So simple enough that it seemed that migrating to a whole new field would be more excessive than making an exception to Drupal's rules.

Alter the YAML Files & Database Tables

My approach started off with comparing the configuration YAMLs between unformatted and formatted text fields. Simply hand-altering the field's files produced no visible change to the site and the files would revert in a following configuration export. So, I continued with a method that had worked in Drupal 7 with Features, which was forcing the database table and configuration changes during an update script. Adding the new column to the database tables was the easy part and I assumed that this would allow the configuration import to take hold.

use Drupal\Core\Database\Database;

function mycustom_update_8001() {
  // Variables to add the format column and its index
  $table  = 'field_page_header_content';
  $column = 'format';
  $field  = [
    'type'   => 'varchar_ascii',
    'length' => 255,
  ];
  $schema = Database::getConnection()->schema();
 
  // Update the data table
  $schema->addField('node__' . $table, $table . '_' . $column, $field);
  $schema->addIndex('node__' . $table, $table . '_' . $column, [$table . '_' . $column], [
    'fields' => [$table . '_' . $column => $field]
  ]);
  // The revision table
  $schema->addField('node_revision__' . $table, $table . '_' . $column, $field);
  $schema->addIndex('node_revision__' . $table, $table . '_' . $column, [$table . '_' . $column], [
    'fields' => [$table . '_' . $column => $field]
  ]);
  ...
}
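
Because a failed deployment may run this update again, the column addition above can be guarded so that it is safe to repeat. The following is a minimal sketch and not part of the original update: the helper names mycustom_field_tables() and mycustom_add_column_once() are hypothetical, and the table naming assumes the default pattern and ignores the shortening Drupal applies to very long field names.

```php
<?php

use Drupal\Core\Database\Database;

/**
 * Hypothetical helper: dedicated table names for a configurable field.
 *
 * Assumes the default "{entity_type}__{field_name}" naming; Drupal shortens
 * names that exceed the database's identifier limit, which this ignores.
 */
function mycustom_field_tables($entity_type, $field_name) {
  return [
    $entity_type . '__' . $field_name,
    $entity_type . '_revision__' . $field_name,
  ];
}

/**
 * Hypothetical helper: adds a column and its index only where missing.
 */
function mycustom_add_column_once($entity_type, $field_name, $property, array $spec) {
  $schema = Database::getConnection()->schema();
  $column = $field_name . '_' . $property;
  foreach (mycustom_field_tables($entity_type, $field_name) as $table) {
    // Skip the ALTER if a previous run already created the column.
    if (!$schema->fieldExists($table, $column)) {
      $schema->addField($table, $column, $spec);
    }
    // Same guard for the index, which shares the column's name.
    if (!$schema->indexExists($table, $column)) {
      $schema->addIndex($table, $column, [$column], ['fields' => [$column => $spec]]);
    }
  }
}
```

Inside the update function this would be called as mycustom_add_column_once('node', 'field_page_header_content', 'format', ['type' => 'varchar_ascii', 'length' => 255]);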

When testing it, the edit form did display a WYSIWYG instead of a text area, but the format did not save and the field's value was still sanitized through the check plain filter. Again, a following configuration export reverted the YAML, and so it seemed that the import was being silently defiant somewhere. But, I was getting a little closer, so next I tried to force the change to the currently installed configuration during the update, before the import is even run.

Force the Field Configuration

function mycustom_update_8001() {
  ...
  // Force the current configuration to be exactly like the YAML,
  // so that the subsequent import does not detect a change
  $config              = \Drupal::configFactory()
    ->getEditable('field.storage.node.field_page_header_content');
  $depends             = $config->get('dependencies');
  $depends['module'][] = 'text';
  $config->set('dependencies', $depends);
  $config->set('type', 'text_long');
  $config->set('settings', []);
  $config->set('module', 'text');
  $config->save();
  ...
}

After a mock deployment, Drupal now detected that the entity schema needed to be updated, and during the entity-updates command an error was thrown:

The SQL storage cannot change the schema for an existing field (field_page_header_content in node entity) with data

This was interesting to me as I had already modified the database and updated the field configurations, so something else was keeping record and failing the schema update. It turned out that the entity.last_installed_schema.repository service (EntityLastInstalledSchemaRepository) keeps a record of the last installed definitions, which Drupal compares against to find any new changes that need to be made to the database. So, if that record is rewritten to reflect the configuration changes that have just been made, the error should not be thrown.
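
To see the mismatch behind this error, the last installed definitions can be compared with the current ones. This is a minimal sketch, assuming it runs on a bootstrapped site (for example from an update hook); both function names are hypothetical, while the two services are the same ones used in the update code that follows.

```php
<?php

/**
 * Hypothetical helper: names of fields whose type differs between two maps.
 *
 * Both arguments map field name => field type.
 */
function mycustom_changed_field_types(array $installed, array $current) {
  $changed = [];
  foreach ($current as $name => $type) {
    if (isset($installed[$name]) && $installed[$name] !== $type) {
      $changed[] = $name;
    }
  }
  return $changed;
}

/**
 * Hypothetical wrapper: builds both maps for an entity type from Drupal.
 */
function mycustom_stale_field_types($entity_type_id) {
  // Reduce a list of storage definitions to field name => field type.
  $to_types = function (array $definitions) {
    $types = [];
    foreach ($definitions as $name => $definition) {
      $types[$name] = $definition->getType();
    }
    return $types;
  };
  $installed = \Drupal::service('entity.last_installed_schema.repository')
    ->getLastInstalledFieldStorageDefinitions($entity_type_id);
  $current = \Drupal::service('entity_field.manager')
    ->getFieldStorageDefinitions($entity_type_id);
  return mycustom_changed_field_types($to_types($installed), $to_types($current));
}
```

Once the configuration has been forced and the field definition cache cleared, mycustom_stale_field_types('node') would be expected to include field_page_header_content until the last installed definitions are rewritten.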

Rewrite History

function mycustom_update_8001() {
  ...
  // Current node field configurations
  $field_manager = \Drupal::getContainer()->get('entity_field.manager');
  // Because the manager was already loaded before the above config was forced,
  // it will return the old configuration that was cached and so the cache needs clearing
  $field_manager->clearCachedFieldDefinitions();
  $field_storage_configs = $field_manager->getFieldStorageDefinitions('node');
 
  // Get the last installed configuration manager
  // This is the gatekeeper that determines if an update is needed or can be done
  $last_installed_repo = \Drupal::getContainer()->get('entity.last_installed_schema.repository');
 
  // Get the last installed configurations for all node fields
  // These are iterative objects and need to be stored as such, not just native arrays,
  // so reusing the previously set configuration tree is not an option
  $last_installed_configs = $last_installed_repo->getLastInstalledFieldStorageDefinitions('node');
 
  // Force the last installed config to be the current for the field
  $last_installed_configs['field_page_header_content'] = $field_storage_configs['field_page_header_content'];
  $last_installed_repo->setLastInstalledFieldStorageDefinitions('node', $last_installed_configs);
}

At this point, all of the field configurations were in agreement that the field had always been this new configuration and, after an error-free deployment, the field acted entirely as desired by allowing users to format their WYSIWYG content. To reiterate, this only works if you modify a Drupal field entity's storage in a way that your database server allows and data is not lost. Also, note that when making some alters on a very large (gigs) table, the server can still bog down or crash depending on its configurations, so YMMV.


The Whole Code

mycustom.install

use Drupal\Core\Database\Database;

function mycustom_update_8001() {
  // Set up data to add the format column
  $table  = 'field_page_header_content';
  $column = 'format';
  $field  = [
    'type'   => 'varchar_ascii',
    'length' => 255,
  ];
  $schema = Database::getConnection()->schema();
 
  // Update the data table
  $schema->addField('node__' . $table, $table . '_' . $column, $field);
  $schema->addIndex('node__' . $table, $table . '_' . $column, [$table . '_' . $column], [
    'fields' => [$table . '_' . $column => $field]
  ]);
  // The revision table
  $schema->addField('node_revision__' . $table, $table . '_' . $column, $field);
  $schema->addIndex('node_revision__' . $table, $table . '_' . $column, [$table . '_' . $column], [
    'fields' => [$table . '_' . $column => $field]
  ]);
 
  // Force the current configuration to be exactly like the YAML,
  // so that the subsequent import does not detect a change
  $config              = \Drupal::configFactory()
    ->getEditable('field.storage.node.field_page_header_content');
  $depends             = $config->get('dependencies');
  $depends['module'][] = 'text';
  $config->set('dependencies', $depends);
  $config->set('type', 'text_long');
  $config->set('settings', []);
  $config->set('module', 'text');
  $config->save();
 
  // Current node field configurations
  $field_manager = \Drupal::getContainer()->get('entity_field.manager');
  // Because the manager was already loaded before the above config was forced,
  // it will return the old configuration that was cached
  $field_manager->clearCachedFieldDefinitions();
  $field_storage_configs = $field_manager->getFieldStorageDefinitions('node');
 
  // Get the last installed manager, this is the gatekeeper that determines if
  // an update is needed or can be done
  $last_installed_repo = \Drupal::getContainer()->get('entity.last_installed_schema.repository');
 
  // Get the last installed configurations for node fields
  // These are iterative objects and need to be stored as such, not just simple arrays,
  // so reusing the previously set configs is not an option
  $last_installed_configs = $last_installed_repo->getLastInstalledFieldStorageDefinitions('node');
 
  // Force the last installed config to be the current for the field
  $last_installed_configs['field_page_header_content'] = $field_storage_configs['field_page_header_content'];
  $last_installed_repo->setLastInstalledFieldStorageDefinitions('node', $last_installed_configs);
}

Git Diff of Configuration YAMLs

From b5e515e907821a228b16174b98fb488554cc41f6 Mon Sep 17 00:00:00 2001
From: Marcus Bernal
Date: Mon, 19 Dec 2016 15:06:59 -0800
Subject: [PATCH] configs
 
---
 ...tity_form_display.node.landing_page.default.yml |  3 +-
 ...tity_view_display.node.landing_page.default.yml |  3 +-
 ...node.landing_page.field_page_header_content.yml |  4 +-
 ...ield.storage.node.field_page_header_content.yml |  8 ++--
 4 files changed, 11 insertions(+), 7 deletions(-)
 
diff --git a/cim/sync/core.entity_form_display.node.landing_page.default.yml b/cim/sync/core.entity_form_display.node.landing_page.default.yml
index 0455522..90ff8ae 100644
--- a/cim/sync/core.entity_form_display.node.landing_page.default.yml
+++ b/cim/sync/core.entity_form_display.node.landing_page.default.yml
@@ -10,6 +10,7 @@ dependencies:
   module:
     - paragraphs
     - path
+    - text
 id: node.landing_page.default
 targetEntityType: node
 bundle: landing_page
@@ -28,7 +29,7 @@ content:
       rows: 5
       placeholder: ''
     third_party_settings: {  }
-    type: string_textarea
+    type: text_textarea
   field_page_sections:
     type: entity_reference_paragraphs
     weight: 3
diff --git a/cim/sync/core.entity_view_display.node.landing_page.default.yml b/cim/sync/core.entity_view_display.node.landing_page.default.yml
index e650f47..1e69210 100644
--- a/cim/sync/core.entity_view_display.node.landing_page.default.yml
+++ b/cim/sync/core.entity_view_display.node.landing_page.default.yml
@@ -10,6 +10,7 @@ dependencies:
   module:
     - entity_reference_revisions
     - user
+    - text
 id: node.landing_page.default
 targetEntityType: node
 bundle: landing_page
@@ -27,7 +28,7 @@ content:
     label: hidden
     settings: {  }
     third_party_settings: {  }
-    type: basic_string
+    type: text_default
   field_page_sections:
     type: entity_reference_revisions_entity_view
     weight: 1
diff --git a/cim/sync/field.field.node.landing_page.field_page_header_content.yml b/cim/sync/field.field.node.landing_page.field_page_header_content.yml
index 44cf387..6c147c8 100644
--- a/cim/sync/field.field.node.landing_page.field_page_header_content.yml
+++ b/cim/sync/field.field.node.landing_page.field_page_header_content.yml
@@ -5,6 +5,8 @@ dependencies:
   config:
     - field.storage.node.field_page_header_content
     - node.type.landing_page
+  module:
+    - text
 id: node.landing_page.field_page_header_content
 field_name: field_page_header_content
 entity_type: node
@@ -16,4 +18,4 @@ translatable: false
 default_value: {  }
 default_value_callback: ''
 settings: {  }
-field_type: string_long
+field_type: text_long
diff --git a/cim/sync/field.storage.node.field_page_header_content.yml b/cim/sync/field.storage.node.field_page_header_content.yml
index 45b669d..13c1369 100644
--- a/cim/sync/field.storage.node.field_page_header_content.yml
+++ b/cim/sync/field.storage.node.field_page_header_content.yml
@@ -4,13 +4,13 @@ status: true
 dependencies:
   module:
     - node
+    - text
 id: node.field_page_header_content
 field_name: field_page_header_content
 entity_type: node
-type: string_long
-settings:
-  case_sensitive: false
-module: core
+type: text_long
+settings: {  }
+module: text
 locked: false
 cardinality: 1
 translatable: true
-- 
2.8.1

> So simple enough that it seemed that migrating to a whole new field would be more excessive than making an exception to Drupal's rules.

> To reiterate, this only works if you modify a Drupal field entity's storage in a way that your database server allows and data is not lost. Also, note that when making some alters on a very large (gigs) table, the server can still bog down or crash depending on its configurations

Given everything that you went through, and these qualifiers, I think the next time I run into this scenario, I'll just migrate to a new field.

I've been through a similar process, but found another solution: if you programmatically save the field definition, overwriting the 'original' property on it, it means the system won't 'notice' that the definition has changed, and so gets around the error. Here's an adaptation of my code, which I wrote having read the link in its function doxygen:

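<?php
use Drupal\Core\Config\FileStorage;
use Drupal\Core\Config\InstallStorage;
use Drupal\field\Entity\FieldStorageConfig;
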
/**
 * Changing field type carefully.
 *
 * @see https://www.drupal.org/node/2535476
 */
function MYMODULE_update_8113() {
  // Get existing data, ready to be cleared (to allow a column schema change)
  // and later re-inserted.
  $database = \Drupal::database();
  $tables = [
    'paragraph__field_to_be_changed' => [],
    'paragraph_revision__field_to_be_changed' => [],
  ];
  foreach ($tables as $table => $values) {
    $tables[$table] = $database->select($table, 't')
      ->fields('t', array())
      ->execute()
      ->fetchAll(\PDO::FETCH_ASSOC);
 
    $database->truncate($table)->execute();
  }
 
  // Read new field storage data.
  $file_storage = new FileStorage(drupal_get_path('module', 'MYMODULE') . '/' . InstallStorage::CONFIG_INSTALL_DIRECTORY);
  $name = 'field.storage.paragraph.field_to_be_changed';
  $data = $file_storage->read($name);
 
  // Create a new field storage config entity, based on the new data, merged
  // with the old one (just to ensure the UUID is the same really).
  $definition = $data + FieldStorageConfig::load('paragraph.field_to_be_changed')->toArray();
  $definition = FieldStorageConfig::create($definition);
  // Set that the 'original' definition was no different, so that the system
  // does not notice the type has changed.
  $definition->original = $definition;
  $definition->enforceIsNew(FALSE);
  // Save the field storage definition.
  $definition->save();
 
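  // Put the values back in the table.
  foreach ($tables as $table => $values) {
    if (empty($values)) {
      continue;
    }
    // The rows are associative, so the insert needs the column names first;
    // without fields() the values are dropped, and without execute() nothing
    // is written back.
    $query = $database->insert($table)->fields(array_keys(reset($values)));
    foreach ($values as $row) {
      $query->values($row);
    }
    $query->execute();
  }
}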
?>

Very good that you had the time and patience to achieve the result.
In Drupal 7, there used to be the helper module (helper) that had an abstraction level over this and was very helpful in such situations.
It is a handy script to have in the pocket. Thanks for sharing.

Based on the great code snippet by James Williams I was trying to find a generic solution for this. James uses a fixed table name ("paragraph_...") in his code. But for general field types we can't know in which entities the field type is used.

So my starting point was
<?php
\Drupal::entityManager()->getFieldMap();
?>
to find out where the field type is in use (the ['type'] key). It also tells us about the field names, but not about the table names. I think there must be a clever way through TableMappingInterface (https://api.drupal.org/api/drupal/core!lib!Drupal!Core!Entity!Sql!Table…) or something like that to get all tables representing this field. Does someone have an idea?

If we could replace
<?php
$tables = [
'paragraph__field_to_be_changed' => [],
'paragraph_revision__field_to_be_changed' => [],
];
?>
by just setting the field type as parameter this would be the perfect helper function for all cases where a field type schema changes. Or is there already such a helper function in core?
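
One possible direction for the question above, as a rough sketch only: filter the field map by type, then ask each entity type's table mapping for the dedicated table names. The function names are hypothetical, and this assumes SQL storage using core's default table mapping. Core's field manager also appears to offer getFieldMapByFieldType(), which could stand in for the first helper.

```php
<?php

use Drupal\Core\Entity\Sql\SqlEntityStorageInterface;

/**
 * Hypothetical helper: picks [entity type, field name] pairs of one type.
 *
 * $field_map is shaped like the return value of getFieldMap():
 * entity type ID => field name => ['type' => ..., 'bundles' => [...]].
 */
function mymodule_fields_of_type(array $field_map, $field_type) {
  $matches = [];
  foreach ($field_map as $entity_type_id => $fields) {
    foreach ($fields as $field_name => $info) {
      if ($info['type'] === $field_type) {
        $matches[] = [$entity_type_id, $field_name];
      }
    }
  }
  return $matches;
}

/**
 * Hypothetical helper: dedicated tables that store fields of a given type.
 */
function mymodule_tables_for_field_type($field_type) {
  $field_manager = \Drupal::service('entity_field.manager');
  $tables = [];
  foreach (mymodule_fields_of_type($field_manager->getFieldMap(), $field_type) as list($entity_type_id, $field_name)) {
    $storage = \Drupal::entityTypeManager()->getStorage($entity_type_id);
    $definitions = $field_manager->getFieldStorageDefinitions($entity_type_id);
    if (!$storage instanceof SqlEntityStorageInterface || !isset($definitions[$field_name])) {
      continue;
    }
    $mapping = $storage->getTableMapping();
    // Base fields live in shared tables; only dedicated tables are listed.
    if ($mapping->requiresDedicatedTableStorage($definitions[$field_name])) {
      $tables[] = $mapping->getDedicatedDataTableName($definitions[$field_name]);
      if ($storage->getEntityType()->isRevisionable()) {
        $tables[] = $mapping->getDedicatedRevisionTableName($definitions[$field_name]);
      }
    }
  }
  return $tables;
}
```

The result of mymodule_tables_for_field_type() could then seed the $tables array in the update function instead of the hard-coded names.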
