测试 utf8 utf8mb4 占用体积的对比
- 2019 年 10 月 4 日
- 笔记
测试对比 , utf8 和 utf8mb4 在空间占用上的区别
版本:mysql5.7.25官方社区版,默认utf8字符集
mysql [localhost:5725] {root} ((none)) > status ; -------------- /root/opt/mysql/5.7.25/bin/mysql Ver 14.14 Distrib 5.7.25, for linux-glibc2.12 (x86_64) using EditLine wrapper Connection id:6 Current database: Current user:root@localhost SSL:Not in use Current pager:stdout Using outfile:'' Using delimiter:; Server version:5.7.25-log MySQL Community Server (GPL) Protocol version:10 Connection:Localhost via UNIX socket Server characterset:utf8 Db characterset:utf8 Client characterset:utf8 Conn. characterset:utf8 UNIX socket:/tmp/mysql_sandbox5725.sock Uptime:4 hours 58 min 57 sec Threads: 1 Questions: 131 Slow queries: 0 Opens: 121 Flush tables: 1 Open tables: 107 Queries per second avg: 0.007 -------------- mysql [localhost:5725] {root} ((none)) > show global variables like 'char%'; +--------------------------+----------------------------------------+ | Variable_name | Value | +--------------------------+----------------------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /root/opt/mysql/5.7.25/share/charsets/ | +--------------------------+----------------------------------------+ 8 rows in set (0.00 sec)
# 创建2张测试表并造些测试数据
mysql [localhost:5725] {root} (test) > CREATE TABLE `t1` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'pk', `name` varchar(10) NOT NULL DEFAULT '' COMMENT 'name', `age` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'age', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; mysql [localhost:5725] {root} (test) > CREATE TABLE `t2` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'pk', `name` varchar(10) NOT NULL DEFAULT '' COMMENT 'name', `age` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'age', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
insert into t1 (name,age) values('abcde12345',25); insert into t1 (name,age) select name,age from t1 ; 用脚本多执行几次这个sql insert into t2 (name,age) values('abcde12345',25); insert into t2 (name,age) select name,age from t2 ; 用脚本多执行几次这个sql
select max(id) from t1; +----------+ | max(id) | +----------+ | 17301353 | +----------+ select max(id) from t2; +----------+ | max(id) | +----------+ | 17301353 | +----------+
# 加个索引
alter table t1 add index idx_name (name); alter table t2 add index idx_name (name);
# 查看2张表的体积
mysql [localhost:5725] {root} (test) > show table status like 't1' G *************************** 1. row *************************** Name: t1 Engine: InnoDB Version: 10 Row_format: Dynamic Rows: 16349147 Avg_row_length: 38 Data_length: 631242752 Max_data_length: 0 Index_length: 0 Data_free: 4194304 Auto_increment: 17366761 Create_time: 2019-05-07 10:26:57 Update_time: 2019-05-07 10:22:35 Check_time: NULL Collation: utf8_general_ci Checksum: NULL Create_options: Comment: 1 row in set (0.00 sec) mysql [localhost:5725] {root} (test) > show table status like 't2' G *************************** 1. row *************************** Name: t2 Engine: InnoDB Version: 10 Row_format: Dynamic Rows: 16357733 Avg_row_length: 37 Data_length: 618659840 Max_data_length: 0 Index_length: 0 Data_free: 4194304 Auto_increment: 17366761 Create_time: 2019-05-07 10:29:16 Update_time: 2019-05-07 10:24:09 Check_time: NULL Collation: utf8mb4_general_ci Checksum: NULL Create_options: Comment: 1 row in set (0.00 sec) # 查看表物理文件占用的大小 [root@Dev-K8s-N1 /root/sandboxes/msb_5_7_25/data/test ] # ll *.ibd -rw-r----- 1 root root 1098907648 May 7 10:26 t1.ibd -rw-r----- 1 root root 1098907648 May 7 10:29 t2.ibd
可以看到存相同的的数据时候, utf8mb4 并没有比 utf8 多占空间。 同时,utf8mb4 能支持emoji表情字符,因此强烈推荐建表时候使用utf8mb4字符集(注意代码连接数据库的时候也要改成utf8mb4的才行)
另外, MySQL8.0中,字符集默认就是 utf8mb4了。 大势所趋。新上的数据库,赶紧都用 utf8mb4 吧。