MySQL 数据库字符集
前言一、MySQL 中的字符集二、字符校对规则的含义三、设置 MySQL 数据库中的字符集
前言
本环境是基于 Centos 7.8 系统构建MySQL-5.7.14 具体构建,请参考 MySQL-5.7.14 环境构建
一、MySQL 中的字符集
MySQL 支持的字符集
mysql
> show character set;
+
| Charset | Description
| Default collation
| Maxlen
|
+
| big5
| Big5 Traditional Chinese
| big5_chinese_ci
| 2 |
| dec8
| DEC West European
| dec8_swedish_ci
| 1 |
| cp850
| DOS West European
| cp850_general_ci
| 1 |
| hp8
| HP West European
| hp8_english_ci
| 1 |
| koi8r
| KOI8
-R Relcom Russian
| koi8r_general_ci
| 1 |
| latin1
| cp1252 West European
| latin1_swedish_ci
| 1 |
| latin2
| ISO
8859-2 Central European
| latin2_general_ci
| 1 |
| swe7
| 7bit Swedish
| swe7_swedish_ci
| 1 |
| ascii
| US ASCII
| ascii_general_ci
| 1 |
| ujis
| EUC
-JP Japanese
| ujis_japanese_ci
| 3 |
| sjis
| Shift
-JIS Japanese
| sjis_japanese_ci
| 2 |
| hebrew
| ISO
8859-8 Hebrew
| hebrew_general_ci
| 1 |
| tis620
| TIS620 Thai
| tis620_thai_ci
| 1 |
| euckr
| EUC
-KR Korean
| euckr_korean_ci
| 2 |
| koi8u
| KOI8
-U Ukrainian
| koi8u_general_ci
| 1 |
| gb2312
| GB2312 Simplified Chinese
| gb2312_chinese_ci
| 2 |
| greek
| ISO
8859-7 Greek
| greek_general_ci
| 1 |
| cp1250
| Windows Central European
| cp1250_general_ci
| 1 |
| gbk
| GBK Simplified Chinese
| gbk_chinese_ci
| 2 |
| latin5
| ISO
8859-9 Turkish
| latin5_turkish_ci
| 1 |
| armscii8
| ARMSCII
-8 Armenian
| armscii8_general_ci
| 1 |
| utf8
| UTF
-8 Unicode
| utf8_general_ci
| 3 |
| ucs2
| UCS
-2 Unicode
| ucs2_general_ci
| 2 |
| cp866
| DOS Russian
| cp866_general_ci
| 1 |
| keybcs2
| DOS Kamenicky Czech
-Slovak
| keybcs2_general_ci
| 1 |
| macce
| Mac Central European
| macce_general_ci
| 1 |
| macroman
| Mac West European
| macroman_general_ci
| 1 |
| cp852
| DOS Central European
| cp852_general_ci
| 1 |
| latin7
| ISO
8859-13 Baltic
| latin7_general_ci
| 1 |
| utf8mb4
| UTF
-8 Unicode
| utf8mb4_general_ci
| 4 |
| cp1251
| Windows Cyrillic
| cp1251_general_ci
| 1 |
| utf16
| UTF
-16 Unicode
| utf16_general_ci
| 4 |
| utf16le
| UTF
-16LE Unicode
| utf16le_general_ci
| 4 |
| cp1256
| Windows Arabic
| cp1256_general_ci
| 1 |
| cp1257
| Windows Baltic
| cp1257_general_ci
| 1 |
| utf32
| UTF
-32 Unicode
| utf32_general_ci
| 4 |
| binary | Binary pseudo
charset | binary | 1 |
| geostd8
| GEOSTD8 Georgian
| geostd8_general_ci
| 1 |
| cp932
| SJIS
for Windows Japanese
| cp932_japanese_ci
| 2 |
| eucjpms
| UJIS
for Windows Japanese
| eucjpms_japanese_ci
| 3 |
| gb18030
| China
National Standard GB18030
| gb18030_chinese_ci
| 4 |
+
41 rows in set (0.01 sec
)
MySQL 中字符集 utf8 支持的字符校对规则
mysql
> show collation
like 'utf8%';
+
| Collation
| Charset | Id
| Default | Compiled
| Sortlen
|
+
| utf8_general_ci
| utf8
| 33 | Yes
| Yes
| 1 |
| utf8_bin
| utf8
| 83 | | Yes
| 1 |
| utf8_unicode_ci
| utf8
| 192 | | Yes
| 8 |
| utf8_icelandic_ci
| utf8
| 193 | | Yes
| 8 |
| utf8_latvian_ci
| utf8
| 194 | | Yes
| 8 |
| utf8_romanian_ci
| utf8
| 195 | | Yes
| 8 |
| utf8_slovenian_ci
| utf8
| 196 | | Yes
| 8 |
| utf8_polish_ci
| utf8
| 197 | | Yes
| 8 |
| utf8_estonian_ci
| utf8
| 198 | | Yes
| 8 |
| utf8_spanish_ci
| utf8
| 199 | | Yes
| 8 |
| utf8_swedish_ci
| utf8
| 200 | | Yes
| 8 |
| utf8_turkish_ci
| utf8
| 201 | | Yes
| 8 |
| utf8_czech_ci
| utf8
| 202 | | Yes
| 8 |
| utf8_danish_ci
| utf8
| 203 | | Yes
| 8 |
| utf8_lithuanian_ci
| utf8
| 204 | | Yes
| 8 |
| utf8_slovak_ci
| utf8
| 205 | | Yes
| 8 |
| utf8_spanish2_ci
| utf8
| 206 | | Yes
| 8 |
| utf8_roman_ci
| utf8
| 207 | | Yes
| 8 |
| utf8_persian_ci
| utf8
| 208 | | Yes
| 8 |
| utf8_esperanto_ci
| utf8
| 209 | | Yes
| 8 |
| utf8_hungarian_ci
| utf8
| 210 | | Yes
| 8 |
| utf8_sinhala_ci
| utf8
| 211 | | Yes
| 8 |
| utf8_german2_ci
| utf8
| 212 | | Yes
| 8 |
| utf8_croatian_ci
| utf8
| 213 | | Yes
| 8 |
| utf8_unicode_520_ci
| utf8
| 214 | | Yes
| 8 |
| utf8_vietnamese_ci
| utf8
| 215 | | Yes
| 8 |
| utf8_general_mysql500_ci
| utf8
| 223 | | Yes
| 1 |
| utf8mb4_general_ci
| utf8mb4
| 45 | Yes
| Yes
| 1 |
| utf8mb4_bin
| utf8mb4
| 46 | | Yes
| 1 |
| utf8mb4_unicode_ci
| utf8mb4
| 224 | | Yes
| 8 |
| utf8mb4_icelandic_ci
| utf8mb4
| 225 | | Yes
| 8 |
| utf8mb4_latvian_ci
| utf8mb4
| 226 | | Yes
| 8 |
| utf8mb4_romanian_ci
| utf8mb4
| 227 | | Yes
| 8 |
| utf8mb4_slovenian_ci
| utf8mb4
| 228 | | Yes
| 8 |
| utf8mb4_polish_ci
| utf8mb4
| 229 | | Yes
| 8 |
| utf8mb4_estonian_ci
| utf8mb4
| 230 | | Yes
| 8 |
| utf8mb4_spanish_ci
| utf8mb4
| 231 | | Yes
| 8 |
| utf8mb4_swedish_ci
| utf8mb4
| 232 | | Yes
| 8 |
| utf8mb4_turkish_ci
| utf8mb4
| 233 | | Yes
| 8 |
| utf8mb4_czech_ci
| utf8mb4
| 234 | | Yes
| 8 |
| utf8mb4_danish_ci
| utf8mb4
| 235 | | Yes
| 8 |
| utf8mb4_lithuanian_ci
| utf8mb4
| 236 | | Yes
| 8 |
| utf8mb4_slovak_ci
| utf8mb4
| 237 | | Yes
| 8 |
| utf8mb4_spanish2_ci
| utf8mb4
| 238 | | Yes
| 8 |
| utf8mb4_roman_ci
| utf8mb4
| 239 | | Yes
| 8 |
| utf8mb4_persian_ci
| utf8mb4
| 240 | | Yes
| 8 |
| utf8mb4_esperanto_ci
| utf8mb4
| 241 | | Yes
| 8 |
| utf8mb4_hungarian_ci
| utf8mb4
| 242 | | Yes
| 8 |
| utf8mb4_sinhala_ci
| utf8mb4
| 243 | | Yes
| 8 |
| utf8mb4_german2_ci
| utf8mb4
| 244 | | Yes
| 8 |
| utf8mb4_croatian_ci
| utf8mb4
| 245 | | Yes
| 8 |
| utf8mb4_unicode_520_ci
| utf8mb4
| 246 | | Yes
| 8 |
| utf8mb4_vietnamese_ci
| utf8mb4
| 247 | | Yes
| 8 |
+
53 rows in set (0.00 sec
)
二、字符校对规则的含义
latin1校对规则有下面的含义
校对规则一般有这些特征:
· 两个不同的字符集不能有相同的校对规则。
· 每个字符集有一个默认校对规则。例如,latin1默认校对规则是>latin1_swedish_ci。
· 存在校对规则命名约定:它们以其相关的字符集名开始,通常包括一个语言名,并且以_ci(大小写不敏感)、_cs(大小写敏感)或_bin(二元)结束。
三、设置 MySQL 数据库中的字符集
我们通过修改MySQL服务的配置文件,修改数据库的字符集,全局生效 查看当前数据库的字符集
mysql
> show variables
like '%character%';
+
| Variable_name
| Value |
+
| character_set_client
| utf8
|
| character_set_connection
| utf8
|
| character_set_database
| latin1
|
| character_set_filesystem
| binary |
| character_set_results
| utf8
|
| character_set_server
| latin1
|
| character_set_system
| utf8
|
| character_sets_dir
| /usr
/share/mysql
/charsets
/ |
+
8 rows in set (0.00 sec
)
修改数据库字符集
[root@mysql-server ~
]
[mysqld
]
character-set-server
=utf8
[client
]
default-character-set
=utf8
退出数据库,重新登录,再次查看数据库的字符集
mysql
> show variables
like '%character%';
+
| Variable_name
| Value |
+
| character_set_client
| utf8
|
| character_set_connection
| utf8
|
| character_set_database
| utf8
|
| character_set_filesystem
| binary |
| character_set_results
| utf8
|
| character_set_server
| utf8
|
| character_set_system
| utf8
|
| character_sets_dir
| /usr
/share/mysql
/charsets
/ |
+
8 rows in set (0.00 sec
)
注:character_set_filesystem 默认字符集为binary,为固定设置,不可修改