關於MySQL中insert ignore,insert on duplicate和replace into，你可能沒想過區別

2020 年 2 月 13 日
筆記

讀完需要10分鐘

速讀僅需5分鐘

在數據流轉中或者日常的數據操作中，勢必會有數據寫入的過程，如果把一些數據寫入一張數據庫表中，如果寫入量有100萬，而重複的數據有90萬，那麼如何讓這10%的數據能夠更高更高效的寫入。

在MySQL方向提供了Insert ignore into,insert into on duplicate,replace into這幾種寫入的方式，看起來好像都差不多，但是實際上在一些場景下的差異還比較大，如果使用不當，恰恰是性能的瓶頸。

整體上我分為兩個大的部分，會分別測試這三種數據寫入場景。

第一部分基於id,name的數據列，其中id為主鍵，自增

第二部分基於id,xid,name的數據列，其中id為主鍵，自增，xid為唯一性索引

至於為什麼要這麼分，我們可以先看結果再做討論。

基於id,name的數據列，其中id為主鍵，自增

為了三種測試場景的基準對等，數據初始化會按照如下的三種方式來進行。

數據初始化

create table test_data(id int primary key auto_increment,name varchar(30)) engine=innodb;
insert into test_data values(1,'aa'),(2,'bb'),(3,'cc');
show create table test_dataG Table: test_dataCreate Table: CREATE TABLE `test_data` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8

insert ignore

insert ignore into test_data values(1,'aa');Query OK, 0 rows affected, 1 warning (0.00 sec)>>show warnings;+———+——+—————————————+\| Level \| Code \| Message \|+———+——+—————————————+\| Warning \| 1062 \| Duplicate entry '1' for key 'PRIMARY' \|+———+——+—————————————+1 row in set (0.00 sec)
show create table test_dataG Table: test_dataCreate Table: CREATE TABLE `test_data` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8
insert ignore into test_data values(1,'aaa');Query OK, 0 rows affected, 1 warning (0.01 sec)>>show warnings;+———+——+—————————————+\| Level \| Code \| Message \|+———+——+—————————————+\| Warning \| 1062 \| Duplicate entry '1' for key 'PRIMARY' \|+———+——+—————————————+1 row in set (0.00 sec)
insert ignore into test_data values(4,'cc');Query OK, 1 row affected (0.01 sec)
select * from test_data;+—-+——+\| id \| name \|+—-+——+\| 1 \| aa \|\| 2 \| bb \|\| 3 \| cc \|\| 4 \| cc \|+—-+——+4 rows in set (0.00 sec)

replace into場景

>>replace into test_data values(1,'aa');Query OK, 1 row affected (0.01 sec)
show create table test_dataG Table: test_dataCreate Table: CREATE TABLE `test_data` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8
replace into test_data values(1,'aaa');Query OK, 2 rows affected (0.00 sec)
replace into test_data values(4,'cc');Query OK, 1 row affected (0.00 sec)
select *from test_data;+—-+——+\| id \| name \|+—-+——+\| 1 \| aaa \|\| 2 \| bb \|\| 3 \| cc \|\| 4 \| cc \|+—-+——+4 rows in set (0.00 sec)

insert into on duplicate場景

insert into test_data values(1,'aa') on duplicate key update id=id;Query OK, 0 rows affected (0.00 sec)insert into test_data values(1,'aa') on duplicate key update id=id, name=name;Query OK, 0 rows affected (0.00 sec)
show create table test_dataG Table: test_dataCreate Table: CREATE TABLE `test_data` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8
insert into test_data values(1,'aaa') on duplicate key update id=id;Query OK, 0 rows affected (0.00 sec)insert into test_data values(1,'aaa') on duplicate key update id=id,name=name;Query OK, 0 rows affected (0.00 sec)
insert into test_data values(4,'cc') on duplicate key update id=id;Query OK, 1 row affected (0.01 sec)insert into test_data values(4,'ccc') on duplicate key update id=id, name=name;Query OK, 0 rows affected (0.00 sec)
select * from test_data;+—-+——+\| id \| name \|+—-+——+\| 1 \| aa \|\| 2 \| bb \|\| 3 \| cc \|\| 4 \| cc \|+—-+——+4 rows in set (0.00 sec)

小結：這三種場景的結果從自增列的處理方式來看是完全對等的，但是對於重複數據的處理方式還是存在差異。

相比而言，replace into和insert into on duplicate存在本質的區別，replace into是覆蓋寫，即刪除原來的，寫入新的。不光是主鍵列，其他列也會保持一致

insert into on duplicate則可以根據自己的需求來定製重複數據的處理策略，不會主動改變數據。

insert ignore into 在這種場景下最為通用，而且對於數據的侵入性最小。

所以如果要保證源端的數據基於主鍵完全一致，不管非主鍵列的數據是否一致，都需要完全覆蓋，選擇replace into是一種好的方法。

否則採用insert into on duplcate或者insert ignore into

基於id,xid,name的數據列，其中id為主鍵，自增,xid為唯一性索引

為了三種測試場景的基準對等，數據初始化會按照如下的三種方式來進行。

數據初始化

create table test_data(id int primary key auto_increment,xid int unique key,name varchar(30)) engine=innodb;
insert into test_data(xid,name) values(1,'aa'),(2,'bb'),(3,'cc');Query OK, 3 rows affected (0.01 sec)Records: 3 Duplicates: 0 Warnings: 0
select *from test_data;+—-+——+——+\| id \| xid \| name \|+—-+——+——+\| 1 \| 1 \| aa \|\| 2 \| 2 \| bb \|\| 3 \| 3 \| cc \|+—-+——+——+3 rows in set (0.00 sec)

insert ignore into

insert ignore into test_data(xid,name) values(1,'aa');Query OK, 0 rows affected, 1 warning
CREATE TABLE `test_data` ( `id` int(11) NOT NULL AUTO_INCREMENT, `xid` int(11) DEFAULT NULL, `name` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `xid` (`xid`)) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8
insert ignore into test_data(xid,name) values(1,'aaa');Query OK, 0 rows affected, 1 warning (0.01 sec)mysql–root@localhost:test 18:58:13>>show warnings;+———+——+———————————–+\| Level \| Code \| Message \|+———+——+———————————–+\| Warning \| 1062 \| Duplicate entry '1' for key 'xid' \|+———+——+———————————–+
insert ignore into test_data(xid,name) values(4,'dd');Query OK, 1 row affected (0.00 sec)
Create Table: CREATE TABLE `test_data` ( `id` int(11) NOT NULL AUTO_INCREMENT, `xid` int(11) DEFAULT NULL, `name` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `xid` (`xid`)) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=utf8
>select * from test_data;+—-+——+——+\| id \| xid \| name \|+—-+——+——+\| 1 \| 1 \| aa \|\| 2 \| 2 \| bb \|\| 3 \| 3 \| cc \|\| 6 \| 4 \| dd \|+—-+——+——+4 rows in set (0.00 sec)

replace into

replace into test_data(xid,name) values(1,'aa');Query OK, 2 rows affected (0.00 sec)
+—-+——+——+\| id \| xid \| name \|+—-+——+——+\| 2 \| 2 \| bb \|\| 3 \| 3 \| cc \|\| 4 \| 1 \| aa \|+—-+——+——+3 rows in set (0.00 sec)
replace into test_data(xid,name) values(1,'aaa');Query OK, 2 rows affected (0.01 sec)
select *from test_data;+—-+——+——+\| id \| xid \| name \|+—-+——+——+\| 2 \| 2 \| bb \|\| 3 \| 3 \| cc \|\| 5 \| 1 \| aaa \|+—-+——+——+
replace into test_data(xid,name) values(4,'cc');Query OK, 1 row affected (0.00 sec)
select *from test_data;+—-+——+——+\| id \| xid \| name \|+—-+——+——+\| 2 \| 2 \| bb \|\| 3 \| 3 \| cc \|\| 5 \| 1 \| aaa \|\| 6 \| 4 \| dd \|+—-+——+——+4 rows in set (0.00 sec)

insert into on duplicate

insert into test_data(xid,name) values(1,'aa') on duplicate key update xid=xid;Query OK, 0 rows affected (0.00 sec)insert into test_data(xid,name) values(1,'aa') on duplicate key update xid=xid, name=name;Query OK, 0 rows affected (0.01 sec)
+—-+——+——+\| id \| xid \| name \|+—-+——+——+\| 1 \| 1 \| aa \|\| 2 \| 2 \| bb \|\| 3 \| 3 \| cc \|+—-+——+——+3 rows in set (0.00 sec)
insert into test_data(xid,name) values(1,'aaa') on duplicate key update xid=xid;Query OK, 0 rows affected (0.01 sec)insert into test_data(xid,name) values(1,'aaa') on duplicate key update xid=xid,name=name;Query OK, 0 rows affected (0.00 sec)
insert into test_data(xid,name) values(4,'cc') on duplicate key update xid=xid;Query OK, 1 row affected (0.01 sec)insert into test_data(xid,name) values(4,'ccc') on duplicate key update xid=xid, name=name;Query OK, 0 rows affected (0.00 sec)
select * from test_data;+—-+——+——+\| id \| xid \| name \|+—-+——+——+\| 1 \| 1 \| aa \|\| 2 \| 2 \| bb \|\| 3 \| 3 \| cc \|\| 8 \| 4 \| cc \|+—-+——+——+4 rows in set (0.00 sec)

小結：在這個場景裏面，可以看到三種場景的變化真是很大，而且區別也很明顯。

insert ignore into如果不指定自增列，儘管沒有寫入數據，但是自增列依然會自增

replace into如果不指定自增列，會看到數據重新寫入的效果已經非常明顯，而且自增列始終會自動維護。

insert into on duplicate對於重複數據依然會消耗自增列值，實現相對更加靈活。

關於MySQL中insert ignore,insert on duplicate和replace into，你可能沒想過區別

VirMach 便宜 VPS

QNews

關於MySQL中insert ignore,insert on duplicate和replace into，你可能沒想過區別

分享此文：

Related Posts

小白學 Python（4）：變量基礎操作

淺析流量洪峰下的雲開發高可用架構設計

CSS-T | Mysql Client 任意文件讀取攻擊鏈拓展

IDEA主題美化

VirMach 便宜 VPS

QNews

熱門文章

熱門搜尋