DBLE分库分表实战

  • 2019 年 10 月 4 日
  • 筆記

环境: DBLE 2.19.03.0

OS版本: CentOS Linux release 7.6.1810 (Core) 

IP:  192.168.20.10/24

MySQL版本: MySQL-社区版-5.7.26

添加2个账号授权:

create user 'rw'@'%' identified by 'rw123456';  create user 'rd'@'%' identified by 'rd123456';  GRANT SELECT, INSERT, UPDATE, DELETE, CREATE,REFERENCES,CREATE TEMPORARY TABLES,INDEX ON *.* TO  rw@'%' ;  GRANT SELECT ON *.* TO 'rd'@'%' ;

连接方式:

读写:

mysql -urw -prw123456 –port 8066 -h 192.168.20.10 testdb 

只读:

mysql -urd -prd123456 –port 8066 -h 192.168.20.10 testdb 

ddl专用:

mysql -uop -p123456 –port 8066 -h 192.168.20.10 testdb 

管理账号:

mysql -uman1 -p654321 –port 9066 -h 192.168.20.10 

解压DBLE:

tar xf dble-2.19.03.tar.gz  /usr/local/

cd /usr/local

ln -s dble-2.19.03 dble

cd conf/

vim schema.xml   修改后的如下:

<?xml version="1.0"?>  <!DOCTYPE dble:schema SYSTEM "schema.dtd">  <dble:schema xmlns:dble="http://dble.cloud/" version="2.19.03.0">        <schema name="testdb">          <!-- 全局表 -->          <table name="company" primaryKey="id" type="global" dataNode="dn1,dn2,dn3"/>            <!-- range分区2 -->          <table name="travelrecord" primaryKey="id" dataNode="dn1,dn2,dn3" rule="sharding-by-range_t"/>            <!-- hash mod 3 分区 -->          <table name="hotnews" primaryKey="id" dataNode="dn1,dn2,dn3" rule="id-sharding-by-mod3"/>            <!-- hashStringmod3 分区 -->          <table name="user_auth" primaryKey="open_id" dataNode="dn1,dn2,dn3" rule="user-auth-sharding-by-open_id" />            <!-- ER 分区 -->          <table name="order1" dataNode="dn1,dn2,dn3" rule="id-sharding-by-mod3">               <childTable name="order_detail" primaryKey="id" joinKey="order_id" parentKey="id" />           </table>      </schema>            <dataNode name="dn1" dataHost="192.168.20.10" database="db1"/>      <dataNode name="dn2" dataHost="192.168.20.10" database="db2"/>      <dataNode name="dn3" dataHost="192.168.20.10" database="db3"/>            <dataHost name="192.168.20.10" maxCon="500" minCon="10" balance="0" switchType="-1" slaveThreshold="100">          <heartbeat>select user()</heartbeat>          <writeHost host="hostM" url="192.168.20.10:3306" user="rw" password="rw123456">              <readHost host="hostS" url="192.168.20.10:3306" user="rd" password="rd123456"/>          </writeHost>      </dataHost>        </dble:schema>

vim rule.xml 修改后的内容如下:

    <tableRule name="sharding-by-range_t">          <rule>              <columns>id</columns>              <algorithm>rangeLong2</algorithm>          </rule>      </tableRule>        <tableRule name="id-sharding-by-mod3">          <rule>              <columns>id</columns>              <algorithm>hashmod3</algorithm>          </rule>      </tableRule>        <tableRule name="user-auth-sharding-by-open_id">          <rule>                                                                                                                                    <columns>open_id</columns>              <algorithm>hashStringmod3</algorithm>          </rule>      </tableRule>        <function name="rangeLong2" class="NumberRange">          <property name="mapFile">autopartition-long_t.txt</property>          <property name="defaultNode">0</property><!-- 不符合条件的插入到第一个分区去 -->      </function>            <function name="hashmod3" class="Hash">          <property name="partitionCount">3</property>          <property name="partitionLength">1</property>      </function>            <function name="hashStringmod3" class="StringHash">          <property name="partitionCount">3</property>          <property name="partitionLength">1</property>          <property name="hashSlice">0:20</property>  <!-- 表示取前20位进行hash取模后再决定数据落在那个分片上 -->      </function>

[root@centos7 /usr/local/dble/conf ]#  vim autopartition-long_t.txt  # 增加一个路由规则文件

# range start-end ,data node index  # K=1000,M=10000.  # 范围:前开后闭 (开区间,闭区间]  0-1M=0  1M-2M=1  2M-3M=2

vim server.xml 内容如下:

修改user部分为如下:       <user name="man1">          <property name="password">654321</property>          <property name="manager">true</property>          <!-- manager user can't set schema-->      </user>      <user name="op">          <property name="password">123456</property>          <property name="schemas">testdb</property>      </user>          <!-- table's DML privileges  INSERT/UPDATE/SELECT/DELETE -->  <!--          <privileges check="true">              <schema name="testdb" dml="0110" >                  <table name="employee" dml="1111"></table>              </schema>          </privileges>  -->      <user name="rw">          <property name="password">rw123456</property>          <property name="schemas">testdb</property>      </user>      <user name="rd">          <property name="password">rd123456</property>          <property name="schemas">testdb</property>          <property name="readOnly">true</property>      </user>

然后, reload 下 dble , 进行测试

ddl专用:

mysql -uop -p123456 –port 8066 -h 192.168.20.10 testdb 

去创建符合上面的要求的几个表,并写入数据测试:

## 测试range分区  (testdb) > create table travelrecord (  id bigint not null primary key,  user_id varchar(100),  traveldate DATE,   fee decimal(10,2),  days int  ) ENGINE=InnoDB DEFAULT CHARSET=utf8;    (testdb) > insert into travelrecord (id,user_id,traveldate,fee,days) values(10,'wang','2014-01-05',510,3);  (testdb) > insert into travelrecord (id,user_id,traveldate,fee,days) values(13000,'lee','2011-01-05',26.5,3);  (testdb) > insert into travelrecord (id,user_id,traveldate,fee,days) values(29800,'zhang','2018-01-05',23.3,3);    (testdb) > select * from travelrecord ;  +-------+---------+------------+--------+------+  | id    | user_id | traveldate | fee    | days |  +-------+---------+------------+--------+------+  |    10 | wang    | 2014-01-05 | 510.00 |    3 |  | 13000 | lee     | 2011-01-05 |  26.50 |    3 |  | 29800 | zhang   | 2018-01-05 |  23.30 |    3 |  +-------+---------+------------+--------+------+
## 测试全局表  (testdb) > create table company(id int not null primary key,name varchar(100));     (testdb) > insert into company(id,name) values(1,'hp');  (testdb) > insert into company(id,name) values(2,'ibm');  (testdb) > insert into company(id,name) values(3,'oracle');    (testdb) > select * from company ;  +----+--------+  | id | name   |  +----+--------+  |  1 | hp     |  |  2 | ibm    |  |  3 | oracle |  +----+--------+  3 rows in set (0.01 sec)    多执行几次,你会看到三个分片上都插入了3条数据,因为company定义为全局表。    (testdb) > explain insert into company(id,name) values(1,'hp');  +-----------+----------+---------------------------------------------+  | DATA_NODE | TYPE     | SQL/REF                    |  +-----------+----------+---------------------------------------------+  | dn1       | BASE SQL | insert into company(id,name) values(1,'hp') |  | dn2       | BASE SQL | insert into company(id,name) values(1,'hp') |  | dn3       | BASE SQL | insert into company(id,name) values(1,'hp') |  +-----------+----------+---------------------------------------------+  3 rows in set (0.00 sec)    使用 explain select * from company ;   命令也可以看到随机分发到3个节点的。
## 测试hashmod分区  create table hotnews (id bigint unsigned not null primary key ,title varchar(400) ,created_time datetime) ENGINE=InnoDB DEFAULT CHARSET=utf8;    然后, 我们写个脚本,批量插入些数据,看看情况:    for i in {1..1000}; do     mysql -uop -p123456 --port 8066 -h 192.168.20.10 testdb  -e "insert into hotnews(id,title,created_time) values($i,'one',now());"  done    然后,到后端的3个分片上看下数据量,大致如下,还是比较均匀的:  (db1) > select count(*)  from db1.hotnews;  +----------+  | count(*) |  +----------+  |      333 |  +----------+  1 row in set (0.00 sec)    (db1) > select count(*)  from db2.hotnews;  +----------+  | count(*) |  +----------+  |      334 |  +----------+  1 row in set (0.00 sec)    (db1) > select count(*)  from db3.hotnews;  +----------+  | count(*) |  +----------+  |      333 |  +----------+  1 row in set (0.00 sec)
## hashStringmod分区  CREATE TABLE `user_auth` (    `id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT '主键id',    `open_id` varchar(100) NOT NULL DEFAULT '' COMMENT '第三方授权id',    `union_id` varchar(100) NOT NULL DEFAULT '' COMMENT '授权的关联id',    PRIMARY KEY (`id`)  ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='用户AUTH信息表' ;    #### 注意:实际生产环境的主键id需要由程序去保证唯一性(例如使用雪花算法)    (testdb) > insert into user_auth (id,open_id,union_id) values(1,'331116828422393856','oy0IAj9mdPUr7bLMl879Jp37eV3Y');  (testdb) > insert into user_auth (id,open_id,union_id) values(2,'341170994247204864','oy0IA3Yj9mdPUr7bLMl879Jp37eV');  (testdb) > insert into user_auth (id,open_id,union_id) values(3,'330414325695332352','oy0IAj9mdPU3Yr7bLMl879Jp37eV');  (testdb) > insert into user_auth (id,open_id,union_id) values(4,'328588424011591680','oy0IAj9mdPUr7bLMl8Jp37e79V');  (testdb) > insert into user_auth (id,open_id,union_id) values(5,'330414325695332352','oy0IA3Yj9mdPUr7p37ebLMl879JV3Y');  (testdb) > insert into user_auth (id,open_id,union_id) values(6,'341172222247211111','oy0IAj9bLMl879Jp37eV3YmdPUr7');  (testdb) > insert into user_auth (id,open_id,union_id) values(7,'341173334247755464','Jp37eoy0IAj9mdPUr73YbLMl879V');    (testdb) > select id,open_id,union_id from user_auth order by id asc ;  +----+--------------------+--------------------------------+  | id | open_id            | union_id                       |  +----+--------------------+--------------------------------+  |  1 | 331116828422393856 | oy0IAj9mdPUr7bLMl879Jp37eV3Y   |  |  2 | 341170994247204864 | oy0IA3Yj9mdPUr7bLMl879Jp37eV   |  |  3 | 330414325695332352 | oy0IAj9mdPU3Yr7bLMl879Jp37eV   |  |  4 | 328588424011591680 | oy0IAj9mdPUr7bLMl8Jp37e79V     |  |  5 | 330414325695332352 | oy0IA3Yj9mdPUr7p37ebLMl879JV3Y |  |  6 | 341172222247211111 | oy0IAj9bLMl879Jp37eV3YmdPUr7   |  |  7 | 341173334247755464 | Jp37eoy0IAj9mdPUr73YbLMl879V   |  +----+--------------------+--------------------------------+  7 rows in set (0.00 sec)    (testdb) > explain select id,open_id,union_id from user_auth where open_id = '341173334247755464' ;  +-----------+----------+--------------------------------------------------------------------------------+  | DATA_NODE | TYPE     | SQL/REF                                                                        |  +-----------+----------+--------------------------------------------------------------------------------+  | dn2       | BASE SQL | select id,open_id,union_id from user_auth where open_id = '341173334247755464' |  +-----------+----------+--------------------------------------------------------------------------------+  1 row in set (0.00 sec)    (testdb) > explain select id,open_id,union_id from user_auth where open_id = '331116828422393856' ;  +-----------+----------+--------------------------------------------------------------------------------+  | DATA_NODE | TYPE     | SQL/REF                                                                        |  +-----------+----------+--------------------------------------------------------------------------------+  | dn1       | BASE SQL | select id,open_id,union_id from user_auth where open_id = '331116828422393856' |  +-----------+----------+--------------------------------------------------------------------------------+  1 row in set (0.00 sec)    (testdb) > explain select id,open_id,union_id from user_auth where open_id = '328588424011591680' ;  +-----------+----------+--------------------------------------------------------------------------------+  | DATA_NODE | TYPE     | SQL/REF                                                                        |  +-----------+----------+--------------------------------------------------------------------------------+  | dn3       | BASE SQL | select id,open_id,union_id from user_auth where open_id = '328588424011591680' |  +-----------+----------+--------------------------------------------------------------------------------+  1 row in set (0.00 sec)

上面就是几种常用的分区了, 另外还有种 date类型按时间分区的可能在日志表的场景下也常用些。

date类型分区的实验:

先去后端的db上创建物理的库:

create database userdb1 ;  create database userdb2 ;  create database userdb3 ;  create database userdb4 ;  create database userdb5 ;  create database userdb6 ;  create database userdb7 ;  create database userdb8 ;  create database userdb9 ;  create database userdb10 ;  create database userdb11 ;  create database userdb12 ;  create database userdb13 ;

修改后的 schema.xml 类似如下:

<?xml version="1.0"?>  <!DOCTYPE dble:schema SYSTEM "schema.dtd">    <dble:schema xmlns:dble="http://dble.cloud/" version="2.19.03.0">        <schema name="testdb">          <!-- 按月分片 -->          <table name="user" dataNode="user_dn$1-13" rule="sharding-by-month-user"/>      </schema>            <dataNode name="user_dn1" dataHost="192.168.20.10" database="userdb1"/>      <dataNode name="user_dn2" dataHost="192.168.20.10" database="userdb2"/>      <dataNode name="user_dn3" dataHost="192.168.20.10" database="userdb3"/>      <dataNode name="user_dn4" dataHost="192.168.20.10" database="userdb4"/>      <dataNode name="user_dn5" dataHost="192.168.20.10" database="userdb5"/>      <dataNode name="user_dn6" dataHost="192.168.20.10" database="userdb6"/>      <dataNode name="user_dn7" dataHost="192.168.20.10" database="userdb7"/>      <dataNode name="user_dn8" dataHost="192.168.20.10" database="userdb8"/>      <dataNode name="user_dn9" dataHost="192.168.20.10" database="userdb9"/>      <dataNode name="user_dn10" dataHost="192.168.20.10" database="userdb10"/>      <dataNode name="user_dn11" dataHost="192.168.20.10" database="userdb11"/>      <dataNode name="user_dn12" dataHost="192.168.20.10" database="userdb12"/>      <dataNode name="user_dn13" dataHost="192.168.20.10" database="userdb13"/>            <dataHost name="192.168.20.10" maxCon="500" minCon="10" balance="0" switchType="-1" slaveThreshold="100">          <heartbeat>select user()</heartbeat>          <writeHost host="hostM" url="192.168.20.10:3306" user="rw" password="rw123456">              <readHost host="hostS" url="192.168.20.10:3306" user="rd" password="rd123456"/>          </writeHost>      </dataHost>        </dble:schema>

然后,到 rule.xml中添加规则:

<tableRule name="sharding-by-month-user">      <rule>          <columns>addData</columns>          <algorithm>partbymonth-user</algorithm>      </rule>  </tableRule>    <!-- 加的基于月份的分片规则, 注意如果数量超了 会插入报错 -->      <function name="partbymonth-user" class="Date">          <property name="dateFormat">yyyy-MM-dd</property>              <property name="sBeginDate">2018-01-01</property>           <!--  <property name="sEndDate">2019-02-31</property> -->          <property name="sPartionDay">30</property>  <!-- 默认是每10天一个分片。我这里改成每30天一个分片,另外注意并不按照固定的月来写入 -->          <property name="defaultNode">0</property><!-- 默认小于 2018-01-01 的数据插入到dn1去 -->      </function>
(testdb) > create table if not exists user (addData date, dbname varchar(32),username varchar(32),province varchar(16),age int(3));    (testdb) > insert into user (addData,dbname,username,age) values ('2015-01-01',database(),'user1',12);    (testdb) > insert into user (addData,dbname,username,age) values ('2016-02-01',database(),'user1',12);    (testdb) > explain  insert into user (addData,dbname,username,age) values ('2017-03-01',database(),'user1',12);  +-----------+----------+--------------------------------------------------------------------------------------------------+  | DATA_NODE | TYPE     | SQL/REF                                                                                          |  +-----------+----------+--------------------------------------------------------------------------------------------------+  | user_dn1  | BASE SQL | INSERT INTO user (addData, dbname, username, age) VALUES ('2017-03-01', DATABASE(), 'user1', 12) |  +-----------+----------+--------------------------------------------------------------------------------------------------+  (testdb) > insert into user (addData,dbname,username,age) values ('2017-03-01',database(),'user1',12);      (testdb) > insert into user (addData,dbname,username,age) values ('2018-04-01',database(),'user1',12);  (testdb) > insert into user (addData,dbname,username,age) values ('2018-04-11',database(),'user1',12);  (testdb) > insert into user (addData,dbname,username,age) values ('2018-04-21',database(),'user1',12);  (testdb) > insert into user (addData,dbname,username,age) values ('2018-04-25',database(),'user1',12);  (testdb) > insert into user (addData,dbname,username,age) values ('2018-04-30',database(),'user1',12);  (testdb) > insert into user (addData,dbname,username,age) values ('2018-05-01',database(),'user1',12);  (testdb) > insert into user (addData,dbname,username,age) values ('2018-05-03',database(),'user1',12);  (testdb) > insert into user (addData,dbname,username,age) values ('2018-05-05',database(),'user1',12);  (testdb) > insert into user (addData,dbname,username,age) values ('2018-06-21',database(),'user1',12);  (testdb) > insert into user (addData,dbname,username,age) values ('2018-07-30',database(),'user1',12);  (testdb) > insert into user (addData,dbname,username,age) values ('2019-01-01',database(),'user1',12);    (testdb) > insert into user (addData,dbname,username,age) values ('2019-06-01',database(),'user1',12);  ERROR 1064 (HY000): can't find any valid data node :user -> ADDDATA -> 2019-06-01      因此,我们需要提前人工把分片加好 并做好可用分区的监控,不然会造成无法写入数据的事故出现。      (testdb) > select * from user order by addData asc ;  +------------+----------+----------+----------+------+  | addData    | dbname   | username | province | age  |  +------------+----------+----------+----------+------+  | 2015-01-01 | userdb1  | user1    | NULL     |   12 |  | 2016-02-01 | userdb1  | user1    | NULL     |   12 |  | 2017-03-01 | userdb1  | user1    | NULL     |   12 |  | 2018-04-01 | userdb4  | user1    | NULL     |   12 |  | 2018-04-11 | userdb4  | user1    | NULL     |   12 |  | 2018-04-21 | userdb4  | user1    | NULL     |   12 |  | 2018-04-25 | userdb4  | user1    | NULL     |   12 |  | 2018-04-30 | userdb4  | user1    | NULL     |   12 |  | 2018-05-01 | userdb5  | user1    | NULL     |   12 |  | 2018-05-03 | userdb5  | user1    | NULL     |   12 |  | 2018-05-05 | userdb5  | user1    | NULL     |   12 |  | 2018-06-21 | userdb6  | user1    | NULL     |   12 |  | 2018-07-30 | userdb8  | user1    | NULL     |   12 |  | 2019-01-01 | userdb13 | user1    | NULL     |   12 |  +------------+----------+----------+----------+------+  14 rows in set (0.02 sec)      查询测试:  (testdb) > explain select * from user where addData between '2018-04-01' and '2018-04-30' ;  +-----------+----------+------------------------------------------------------------------------+  | DATA_NODE | TYPE     | SQL/REF                                                                |  +-----------+----------+------------------------------------------------------------------------+  | user_dn4  | BASE SQL | select * from user where addData between '2018-04-01' and '2018-04-30' |  +-----------+----------+------------------------------------------------------------------------+  1 row in set (0.00 sec)      (testdb) > select * from user where addData between '2018-04-01' and '2018-04-30' ;  +------------+---------+----------+----------+------+  | addData    | dbname  | username | province | age  |  +------------+---------+----------+----------+------+  | 2018-04-01 | userdb4 | user1    | NULL     |   12 |  | 2018-04-11 | userdb4 | user1    | NULL     |   12 |  | 2018-04-21 | userdb4 | user1    | NULL     |   12 |  | 2018-04-25 | userdb4 | user1    | NULL     |   12 |  | 2018-04-30 | userdb4 | user1    | NULL     |   12 |  +------------+---------+----------+----------+------+  5 rows in set (0.01 sec)      (testdb) > explain select * from user where addData between '2018-04-01' and '2018-05-30' order by addData asc ;  +-----------------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+  | DATA_NODE       | TYPE          | SQL/REF                                                                                                                                                                                 |  +-----------------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+  | user_dn4_0      | BASE SQL      | select `user`.`addData`,`user`.`dbname`,`user`.`username`,`user`.`province`,`user`.`age` from  `user` where addData BETWEEN '2018-04-01' AND '2018-05-30' ORDER BY `user`.`addData` ASC |  | user_dn5_0      | BASE SQL      | select `user`.`addData`,`user`.`dbname`,`user`.`username`,`user`.`province`,`user`.`age` from  `user` where addData BETWEEN '2018-04-01' AND '2018-05-30' ORDER BY `user`.`addData` ASC |  | merge_1         | MERGE         | user_dn4_0; user_dn5_0                                                                                                                                                                  |  | shuffle_field_1 | SHUFFLE_FIELD | merge_1                                                                                                                                                                                 |  +-----------------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+  4 rows in set (0.00 sec)      (testdb) > select * from user where addData between '2018-04-01' and '2018-05-30' order by addData asc ;  +------------+---------+----------+----------+------+  | addData    | dbname  | username | province | age  |  +------------+---------+----------+----------+------+  | 2018-04-01 | userdb4 | user1    | NULL     |   12 |  | 2018-04-11 | userdb4 | user1    | NULL     |   12 |  | 2018-04-21 | userdb4 | user1    | NULL     |   12 |  | 2018-04-25 | userdb4 | user1    | NULL     |   12 |  | 2018-04-30 | userdb4 | user1    | NULL     |   12 |  | 2018-05-01 | userdb5 | user1    | NULL     |   12 |  | 2018-05-03 | userdb5 | user1    | NULL     |   12 |  | 2018-05-05 | userdb5 | user1    | NULL     |   12 |  +------------+---------+----------+----------+------+  8 rows in set (0.01 sec)

date类型的可用分区的监控(脚本的原理同样适用于其他类型的分区):

简单的做法就是定期执行一个explain的insert插入测试, 如果有ERROR关键字就告警出来

一个简单的脚本如下:  # 提前60天预警    DAYS=$(date -d 60days  +%F)  echo $DAYS    if mysql -urw -prw123456 --port 8066 -h 192.168.20.10 testdb 2>/dev/null -e "explain insert into user (addData,dbname,username,age) values ("$DAYS",database(),'user1',12);" ; then       echo "当前可用分片数量处于安全状态"  else      echo "需要加新的分片了"  fi

date类型加新的分片的方法: 

1、修改schema.xml 加上新的分片的配置信息,修改后大致这样:    <?xml version="1.0"?>  <!DOCTYPE dble:schema SYSTEM "schema.dtd">  <dble:schema xmlns:dble="http://dble.cloud/" version="2.19.03.0">        <schema name="testdb">          <!-- 按月分片 -->          <table name="user" dataNode="user_dn$1-23" rule="sharding-by-month-user"/>      </schema>        <dataNode name="user_dn1" dataHost="192.168.20.10" database="userdb1"/>      <dataNode name="user_dn2" dataHost="192.168.20.10" database="userdb2"/>      <dataNode name="user_dn3" dataHost="192.168.20.10" database="userdb3"/>      <dataNode name="user_dn4" dataHost="192.168.20.10" database="userdb4"/>      <dataNode name="user_dn5" dataHost="192.168.20.10" database="userdb5"/>      <dataNode name="user_dn6" dataHost="192.168.20.10" database="userdb6"/>      <dataNode name="user_dn7" dataHost="192.168.20.10" database="userdb7"/>      <dataNode name="user_dn8" dataHost="192.168.20.10" database="userdb8"/>      <dataNode name="user_dn9" dataHost="192.168.20.10" database="userdb9"/>      <dataNode name="user_dn10" dataHost="192.168.20.10" database="userdb10"/>      <dataNode name="user_dn11" dataHost="192.168.20.10" database="userdb11"/>      <dataNode name="user_dn12" dataHost="192.168.20.10" database="userdb12"/>      <dataNode name="user_dn13" dataHost="192.168.20.10" database="userdb13"/>      <dataNode name="user_dn14" dataHost="192.168.20.10" database="userdb14"/>      <dataNode name="user_dn15" dataHost="192.168.20.10" database="userdb15"/>      <dataNode name="user_dn16" dataHost="192.168.20.10" database="userdb16"/>      <dataNode name="user_dn17" dataHost="192.168.20.10" database="userdb17"/>      <dataNode name="user_dn18" dataHost="192.168.20.10" database="userdb18"/>      <dataNode name="user_dn19" dataHost="192.168.20.10" database="userdb19"/>      <dataNode name="user_dn20" dataHost="192.168.20.10" database="userdb20"/>      <dataNode name="user_dn21" dataHost="192.168.20.10" database="userdb21"/>      <dataNode name="user_dn22" dataHost="192.168.20.10" database="userdb22"/>      <dataNode name="user_dn23" dataHost="192.168.20.10" database="userdb23"/>        <dataHost name="192.168.20.10" maxCon="500" minCon="10" balance="0" switchType="-1" slaveThreshold="100">          <heartbeat>select user()</heartbeat>          <writeHost host="hostM" url="192.168.20.10:3306" user="rw" password="rw123456">              <readHost host="hostS" url="192.168.20.10:3306" user="rd" password="rd123456"/>          </writeHost>      </dataHost>  </dble:schema>      2、重载配置文件  reload @@config_all ;      3、去后端创建对应的物理库    create database userdb14;  .....这里省略其它的建库语句.......  create database userdb23;    4、通过dble再次下发下建表命令  create table if not exists user (addData date, dbname varchar(32),username varchar(32),province varchar(16),age int(3));      5、插入数据测试  (testdb) > explain insert into user (addData,dbname,username,age) values ('2019-11-01',database(),'user1',12);  +-----------+----------+--------------------------------------------------------------------------------------------------+  | DATA_NODE | TYPE     | SQL/REF                                                                                          |  +-----------+----------+--------------------------------------------------------------------------------------------------+  | user_dn23 | BASE SQL | INSERT INTO user (addData, dbname, username, age) VALUES ('2019-11-01', DATABASE(), 'user1', 12) |  +-----------+----------+--------------------------------------------------------------------------------------------------+  1 row in set (0.00 sec)    (testdb) > explain insert into user (addData,dbname,username,age) values ('2019-12-01',database(),'user1',12);  ERROR 1064 (HY000): can't find any valid data node :user -> ADDDATA -> 2019-12-01

ER 表 (互联网场景下用多表JOIN的不多,因此ER分片规则不太常用到,但是需要大致的了解):

下面的内容大篇幅参考: https://blog.csdn.net/zhanglei_16/article/details/50779929

1:ER分片关系简介

有一类业务,例如订单(ORDER)跟订单明细表(ORDER_DETAIL),明细表会依赖订单单,就是该会存在表的主从关系,

这类似业务的切分可以抽象出合适的切分规则,比如根据用户ID切分,其它相关的表都依赖于用户ID,再或者根据订单ID进行切分,

总之部分业务总会可以抽象出父子关系的表。这类表适用于ER分片表,子表的记录与所关联的父表记录存放在同一个数据分片上,

避免数据Join跨库操作,以order与order_detail例子为例,schema.xml中定义合适的分片配置,order,order_detail 根据order_id

迕行数据切分,保证相同order_id的数据分到同一个分片上,在进行数据插入操作时,Mycat会获取order所在的分片,

然后将order_detail也插入到order所在的分片

2:父表按照主键ID分片,字表的分片字段与主表ID关联,配置为ER分片

2.1:在schema.xml添加如下配置配置文件修改

<!-- ER 分区 -->  <table name="order1" dataNode="dn1,dn2,dn3" rule="id-sharding-by-mod3">    <childTable name="order_detail" primaryKey="id" joinKey="order_id" parentKey="id" />   </table>

在rule.xml里面设定分片规则:

    <tableRule name="id-sharding-by-mod3">          <rule>                                                                                                                                                               <columns>id</columns>              <algorithm>hashmod3</algorithm>          </rule>      </tableRule>      <!-- mod 3 -->      <function name="hashmod3" class="Hash">                                                                                                                              <property name="partitionCount">3</property>          <property name="partitionLength">1</property>      </function>

然后, reload 下 dble 

2.2 先建表, order 和 order_detail 表,有主外键关系

mysql> explain CREATE TABLE order1 (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,sn VARCHAR(64),create_time DATETIME) ENGINE=InnoDB DEFAULT CHARSET=utf8;  +-----------+-----------------------------------------------------------------------------------------------------+  | DATA_NODE | SQL                                                                                                 |  +-----------+-----------------------------------------------------------------------------------------------------+  | dn1       | CREATE TABLE order1(id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,sn VARCHAR(64),create_time DATETIME) |  | dn2       | CREATE TABLE order1(id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,sn VARCHAR(64),create_time DATETIME) |  | dn3       | CREATE TABLE order1(id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,sn VARCHAR(64),create_time DATETIME) |  +-----------+-----------------------------------------------------------------------------------------------------+  3 rows in set (0.02 sec)  mysql> CREATE TABLE order1(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,sn VARCHAR(64),create_time DATETIME) ENGINE=InnoDB DEFAULT CHARSET=utf8;  Query OK, 0 rows affected (0.35 sec)  mysql> CREATE TABLE order_detail(id INT AUTO_INCREMENT PRIMARY KEY, order_id INT,ord_status CHAR(1),address VARCHAR(128),create_time DATETIME,CONSTRAINT FK_ORDid FOREIGN KEY (order_id) REFERENCES order1 (id)) ENGINE=InnoDB DEFAULT CHARSET=utf8;  Query OK, 0 rows affected (0.44 sec)

3.3 录入数据:

mysql> explain INSERT INTO order1(id,sn,create_time) VALUES(1,'BJ0001',NOW());  +-----------+----------------------------------------------------------------+  | DATA_NODE | SQL                                                            |  +-----------+----------------------------------------------------------------+  | dn2       | INSERT INTO order1(id,sn,create_time) VALUES(1,'BJ0001',NOW()) |  +-----------+----------------------------------------------------------------+  1 row in set (0.03 sec)

录入数据,一组组录入,涉及到外键关系: 

第一组北京的订单

mysql> INSERT INTO order1(id,sn,create_time) VALUES(1,'BJ0001',NOW());  Query OK, 1 row affected (0.05 sec)  mysql> INSERT INTO ORDER_DETAIL(id,order_id,ord_status,address,create_time) VALUES (1,1,'1','test data  of order1(id=1,BJ001) ',NOW());

第二组上海的订单:

mysql> explain INSERT INTO order1(id,sn,create_time) VALUES(3,'SHH001',NOW());  +-----------+----------------------------------------------------------------+  | DATA_NODE | SQL                                                            |  +-----------+----------------------------------------------------------------+  | dn1       | INSERT INTO order1(id,sn,create_time) VALUES(3,'SHH001',NOW()) |  +-----------+----------------------------------------------------------------+  1 row in set (0.02 sec)  mysql> INSERT INTO order1(id,sn,create_time) VALUES(3,'SHH001',NOW());  Query OK, 1 row affected (0.04 sec)  mysql> INSERT INTO ORDER_DETAIL(id,order_id,ord_status,address,create_time) VALUES (3,3,'1','test data of order1(id=3,SHH001)',NOW());  Query OK, 1 row affected (0.06 sec)

第三组广州的订单:

mysql> explain INSERT INTO order1(id,sn,create_time) VALUES(4,'GZH004',NOW());  +-----------+----------------------------------------------------------------+  | DATA_NODE | SQL                                                            |  +-----------+----------------------------------------------------------------+  | dn2       | INSERT INTO order1(id,sn,create_time) VALUES(4,'GZH004',NOW()) |  +-----------+----------------------------------------------------------------+  1 row in set (0.00 sec)  mysql> INSERT INTO order1(id,sn,create_time) VALUES(4,'GZH004',NOW());  Query OK, 1 row affected (0.06 sec)  mysql> INSERT INTO ORDER_DETAIL(id,order_id,ord_status,address,create_time) VALUES (4,4,'1','test data  of order1(id=4,GZH004) ',NOW());  Query OK, 1 row affected (0.05 sec)

第四组 武汉的订单,这里故意将order_id设置成4,看看效果,是否随id为4的广州的那组分片:

mysql> explain INSERT INTO order1(id,sn,create_time) VALUES(5,'WUHAN005',NOW());  +-----------+------------------------------------------------------------------+  | DATA_NODE | SQL                                                              |  +-----------+------------------------------------------------------------------+  | dn3       | INSERT INTO order1(id,sn,create_time) VALUES(5,'WUHAN005',NOW()) |  +-----------+------------------------------------------------------------------+  1 row in set (0.01 sec)  mysql> explain INSERT INTO order1(id,sn,create_time) VALUES(6,'WUHAN006',NOW());  Query OK, 1 row affected (0.03 sec)  mysql> INSERT INTO ORDER_DETAIL(id,order_id,ord_status,address,create_time) VALUES (6,4,'1','test data  of order1(id=6,WUHAN006) ',NOW());  Query OK, 1 row affected (0.05 sec)

通过DBLE,查看下数据写入的情况:

(testdb) > select * from order1;  +----+--------+---------------------+  | id | sn     | create_time         |  +----+--------+---------------------+  |  1 | BJ0001 | 2019-08-31 23:05:36 |  |  4 | GZH004 | 2019-08-31 23:06:57 |  |  3 | SHH001 | 2019-08-31 23:06:43 |  +----+--------+---------------------+  3 rows in set (0.01 sec)  (testdb) > select * from order_detail ;  +----+----------+------------+--------------------------------------+---------------------+  | id | order_id | ord_status | address                              | create_time         |  +----+----------+------------+--------------------------------------+---------------------+  |  1 |        1 | 1          | test data  of ORDER1(ID=1,BJ001)     | 2019-08-31 23:06:17 |  |  4 |        4 | 1          | test data  of ORDER1(ID=4,GZH004)    | 2019-08-31 23:07:01 |  |  6 |        4 | 1          | test data  of ORDER1(ID=6,WUHAN006)  | 2019-08-31 23:07:23 |  |  3 |        3 | 1          | test data of ORDER1(ID=3,SHH001)     | 2019-08-31 23:06:47 |  +----+----------+------------+--------------------------------------+---------------------+  4 rows in set (0.01 sec)

直连后端的db1,看下数据情况 (db2 和 db3 上面的数据查看,使用同样的方法);

((none)) > select * from db1.order1;  +----+--------+---------------------+  | id | sn     | create_time         |  +----+--------+---------------------+  |  3 | SHH001 | 2019-08-31 23:06:43 |  +----+--------+---------------------+  1 row in set (0.00 sec)  ((none)) > select * from db1.order_detail;  +----+----------+------------+----------------------------------+---------------------+  | id | order_id | ord_status | address                          | create_time         |  +----+----------+------------+----------------------------------+---------------------+  |  3 |        3 | 1          | test data of ORDER1(ID=3,SHH001) | 2019-08-31 23:06:47 |  +----+----------+------------+----------------------------------+---------------------+  1 row in set (0.00 sec)

2.6 走DBLE,模拟下业务的查询:

(testdb) > explain select t1.*,t2.* from order1 t1,order_detail t2 where t2.ord_status='1' and t2.id=1 and t1.id=t2.order_id;  +-----------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+  | DATA_NODE       | TYPE          | SQL/REF                                                                                                                                                                                                                                                      |  +-----------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+  | dn1_0           | BASE SQL      | select `t2`.`id`,`t2`.`order_id`,`t2`.`ord_status`,`t2`.`address`,`t2`.`create_time`,`t1`.`id`,`t1`.`sn`,`t1`.`create_time` from  `order1` `t1` join  `order_detail` `t2` on `t1`.`id` = `t2`.`order_id` where (`t2`.`ord_status` = '1') AND (`t2`.`id` = 1) |  | dn2_0           | BASE SQL      | select `t2`.`id`,`t2`.`order_id`,`t2`.`ord_status`,`t2`.`address`,`t2`.`create_time`,`t1`.`id`,`t1`.`sn`,`t1`.`create_time` from  `order1` `t1` join  `order_detail` `t2` on `t1`.`id` = `t2`.`order_id` where (`t2`.`ord_status` = '1') AND (`t2`.`id` = 1) |  | dn3_0           | BASE SQL      | select `t2`.`id`,`t2`.`order_id`,`t2`.`ord_status`,`t2`.`address`,`t2`.`create_time`,`t1`.`id`,`t1`.`sn`,`t1`.`create_time` from  `order1` `t1` join  `order_detail` `t2` on `t1`.`id` = `t2`.`order_id` where (`t2`.`ord_status` = '1') AND (`t2`.`id` = 1) |  | merge_1         | MERGE         | dn1_0; dn2_0; dn3_0                                                                                                                                                                                                                                          |  | shuffle_field_1 | SHUFFLE_FIELD | merge_1                                                                                                                                                                                                                                                      |  +-----------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+  5 rows in set (0.00 sec)  (testdb) > SELECT    t1.*,    t2.*  FROM    order1 t1,    order_detail t2  WHERE t2.ord_status = '1'    AND t2.id = 1    AND t1.id = t2.order_id ;  +----+--------+---------------------+----+----------+------------+-----------------------------------+---------------------+  | id | sn     | create_time         | id | order_id | ord_status | address                           | create_time         |  +----+--------+---------------------+----+----------+------------+-----------------------------------+---------------------+  |  1 | BJ0001 | 2019-08-31 23:05:36 |  1 |        1 | 1          | test data  of ORDER1(ID=1,BJ001)  | 2019-08-31 23:06:17 |  +----+--------+---------------------+----+----------+------------+-----------------------------------+---------------------+  1 row in set (0.00 sec)

2.7 总结:当子表与父表的关联字段正好是父表的分片字段时,子表直接根据父表规则进行分片,在数据录入的时候子表直接放在父表的分片上面,在进行关联查询join的时候,走的是父表的路由。

【重要】其它的总结:

当子表与父表的关联字段不是父表的分片字段时,必须通过查找对应的父表记录来确认子表所在分片,如果找不到则会抛出错误,在join查询的时候,路由走的是所有分片节点。