Illegal mix of collations exception caused by Drupal 8 Autocomplete with Entity Reference View

Posted by patrick on Fri, 09/08/2017 - 11:48

Looking at a Recent log messages report of one of our Drupal 8 instaces, I came across this error message:

Drupal\Core\Database\DatabaseExceptionWrapper: Exception in Media Autocomplete[media_autocomplete]: SQLSTATE[HY000]: General error: 1267 Illegal mix of collations (ascii_general_ci,IMPLICIT) and (utf8mb4_general_ci,COERCIBLE) for operation 'like': SELECT media_field_data.langcode AS media_field_data_langcode, media_field_data.mid AS mid, AS media_field_data_name, media_field_data.bundle AS media_field_data_bundle FROM {media_field_data} media_field_data WHERE ((media_field_data.bundle = :db_condition_placeholder_0) AND (( LIKE :db_condition_placeholder_1 ESCAPE '\\') OR (media_field_data.bundle LIKE :db_condition_placeholder_2 ESCAPE '\\'))) AND ((media_field_data.status = :db_condition_placeholder_3) AND (media_field_data.langcode IN (:db_condition_placeholder_4, :db_condition_placeholder_5, :db_condition_placeholder_6))) LIMIT 10 OFFSET 0; Array ( [:db_condition_placeholder_0] => image [:db_condition_placeholder_1] => %gläsernes labor% [:db_condition_placeholder_2] => %gläsernes labor% [:db_condition_placeholder_3] => 1 [:db_condition_placeholder_4] => de [:db_condition_placeholder_5] => und [:db_condition_placeholder_6] => zxx ) in Drupal\views\Plugin\views\query\Sql->execute() (line 1488 of core/modules/views/src/Plugin/views/query/Sql.php).

Clearly, an editor was trying to use the autocomplete functionality of an entity reference field searching for media entities matching the search string "gläsernes labor" (in English: transparent laboratory). This would actually happen for any search string entered into the autocomplete that didn't contain ascii characters only, so in this case the "ä" in "gläsernes" was causing the exception. Looking closer at the query, we can see that the MySQL table is "media_field_data" and the search string is in two placeholders: "db_condition_placeholder_1" and "db_condition_placeholder_2". These are used to check for matches in "" or "media_field_data.bundle" columns. So I asked the database for the collation of these fields:

MariaDB [my_drupal_8_database]> SELECT TABLE_NAME, COLUMN_NAME, COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA='my_drupal_8_database' AND TABLE_NAME='media_field_data' AND COLUMN_NAME IN ('name', 'bundle');
| media_field_data | bundle      | ascii_general_ci   |
| media_field_data | name        | utf8mb4_general_ci |

Tada! The bundle column of the table uses ascii collation. For our autocomplete search to work, those two columns would need to have the same collation or we would need to change the query to specify the collation to use for each like condition separately. Another option would be to just query the name field or the bundle field.

But why was the bundle field searched in the first place? We set up an entity reference view for this entity reference field (after applying a patch, mind you ;). We did this because we wanted to show the bundle and some other stuff along the label of the entity in the results returned by the autocomplete. When we set up the view we selected "Name" and "Bundle" for the "Search fields" setting of the "Entity reference list" style plugin.

Our client didn't really care that much about searching the bundle field, so I just removed it from the search fields. And voilá, you could enter strings with umlauts into the autocomplete again.

Drupal 8