Database: Account for `utf8` being renamed to `utf8mb3` in newer MariaDB and MySQL versions.

From [https://mariadb.com/kb/en/mariadb-1061-release-notes/ MariaDB 10.6.1 release notes]:
> The `utf8` [https://mariadb.com/kb/en/character-sets/ character set] (and related collations) is now by default an alias for `utf8mb3` rather than the other way around. It can be set to imply `utf8mb4` by changing the value of the [https://mariadb.com/kb/en/server-system-variables/#old_mode old_mode] system variable ([https://jira.mariadb.org/browse/MDEV-8334 MDEV-8334]).
From [https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-30.html#mysqld-8-0-30-charset MySQL 8.0.30 release notes]:
> **Important Change:** A previous change renamed character sets having deprecated names prefixed with `utf8_` to use `utf8mb3_` instead. In this release, we rename the `utf8_` collations as well, using the `utf8mb3_` prefix; this is to make the collation names consistent with those of the character sets, not to rely any longer on the deprecated collation names, and to clarify the distinction between `utf8mb3` and `utf8mb4`. The names using the `utf8mb3_` prefix are now used exclusively for these collations in the output of `SHOW` statements such as `SHOW CREATE TABLE`, as well as in the values displayed in the columns of Information Schema tables including the `COLLATIONS` and `COLUMNS` tables.

This commit adds `utf8mb3_bin` and `utf8mb3_general_ci` to the list of safe collations recognized by `wpdb::check_safe_collation()`. The full list is now as follows:
* `utf8_bin`
* `utf8_general_ci`
* `utf8mb3_bin`
* `utf8mb3_general_ci`
* `utf8mb4_bin`
* `utf8mb4_general_ci`

The change is covered by existing database charset unit tests: six tests which previously failed on MariaDB 10.6.1+ or MySQL 8.0.30+ now pass.

Includes:
* Adjusting the expected test results based on MariaDB and MySQL version.
* Using named data providers for the affected tests to make test output more descriptive.
* Adding a failure message to each assertion when multiple assertions are used in the test.

References:
* [https://mariadb.com/kb/en/mariadb-1061-release-notes/ MariaDB 10.6.1 release notes]
* [https://jira.mariadb.org/browse/MDEV-8334 MDEV-8334 Rename utf8 to utf8mb3]
* [https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-30.html#mysqld-8-0-30-charset MySQL 8.0.30 release notes]
* [https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb3.html The utf8mb3 Character Set (3-Byte UTF-8 Unicode Encoding)]

Follow-up to [30345], [32162], [37320].

Props skithund, ayeshrajans, JavierCasares, SergeyBiryukov.
Fixes #53623.
Built from https://develop.svn.wordpress.org/trunk@53918


git-svn-id: http://core.svn.wordpress.org/trunk@53477 1a063a9b-81f0-0310-95a4-ce76da25c4cd
This commit is contained in:
Sergey Biryukov 2022-08-22 15:39:13 +00:00
parent eb822d7fef
commit fab7d60c64
2 changed files with 11 additions and 2 deletions

View File

@ -3376,12 +3376,21 @@ class wpdb {
}
// If any of the columns don't have one of these collations, it needs more sanity checking.
$safe_collations = array(
'utf8_bin',
'utf8_general_ci',
'utf8mb3_bin',
'utf8mb3_general_ci',
'utf8mb4_bin',
'utf8mb4_general_ci',
);
foreach ( $this->col_meta[ $table ] as $col ) {
if ( empty( $col->Collation ) ) {
continue;
}
if ( ! in_array( $col->Collation, array( 'utf8_general_ci', 'utf8_bin', 'utf8mb4_general_ci', 'utf8mb4_bin' ), true ) ) {
if ( ! in_array( $col->Collation, $safe_collations, true ) ) {
return false;
}
}

View File

@ -16,7 +16,7 @@
*
* @global string $wp_version
*/
$wp_version = '6.1-alpha-53917';
$wp_version = '6.1-alpha-53918';
/**
* Holds the WordPress DB revision, increments when changes are made to the WordPress DB schema.