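MongoDB as a Distributed Document Store: Backup and Restore

  The previous post covered MongoDB access control, user creation, and role assignment; for a recap see //www.cnblogs.com/qiuhom-1874/p/13974656.html. This post looks at backing up and restoring MongoDB.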

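  Why back up?

  A backup is a redundant copy of the data that lets us keep data loss to a minimum when something goes wrong. The replica sets we built earlier also provide redundancy, but only against non-human failures such as system crashes or service faults, where they keep the data service available. They cannot protect against operator mistakes, because a mistaken write or delete is replicated to every member. To keep the data safe and minimize loss, the database must be backed up regularly.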

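  Common backup methods

  Note: the figure above outlines the backup strategies commonly used with MongoDB. A logical backup reads the data out through the database interface and writes it to files, normally using dedicated export and import tools for the backup and the restore. A physical backup, put simply, archives the database files themselves; restoring means unpacking those files back into place. A faster form of physical backup is a snapshot, which preserves the data as it was at the moment the snapshot was taken; restoring means reverting to that snapshot.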

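  Logical vs. physical backup in MongoDB

  Note: as the figure above shows, physical backup is generally faster than logical backup, for both backup and restore. A logical backup has to read the data through the database interface and write it out in the target format, whereas a physical backup just archives the data files as they are, with no need to read documents one by one and rewrite them elsewhere. Skipping that per-document read and write is what makes it faster, and the same holds for restore.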

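  Logical backup tools in MongoDB

  MongoDB has two pairs of logical backup tools. The first is mongodump/mongorestore, which stores data as BSON. BSON is a binary format, so it usually cannot be opened and read in a text editor, but the files are smaller. Restoring such a dump depends on the MongoDB version: the BSON produced by different versions differs slightly, so a restore across versions may fail. The second pair is mongoexport/mongoimport, which stores data as JSON. JSON can be read in a text editor but is larger than the equivalent BSON, and restoring it does not depend on the version. For a cross-version backup, check the compatibility of the two versions first: if they are compatible use mongodump/mongorestore, otherwise mongoexport/mongoimport is the safer choice. Note that JSON, readable and portable as it is, carries only the data; indexes, accounts, and other metadata are not included, so keep that in mind when using it.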

  Backing up data with mongodump

  Insert test data

> use testdb
switched to db testdb
> for(i=1;i<=1000;i++) db.test.insert({id:i,name:"test"+i,age:(i%120),classes:(i%25)})
WriteResult({ "nInserted" : 1 })
> show tables
test
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> db.test.count()
1000
> 

  Back up all databases

[[email protected] ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -o ./node12_mongodb_full_backup
2020-11-15T21:47:45.439+0800    writing admin.system.users to node12_mongodb_full_backup/admin/system.users.bson
2020-11-15T21:47:45.442+0800    done dumping admin.system.users (4 documents)
2020-11-15T21:47:45.443+0800    writing admin.system.version to node12_mongodb_full_backup/admin/system.version.bson
2020-11-15T21:47:45.447+0800    done dumping admin.system.version (2 documents)
2020-11-15T21:47:45.448+0800    writing testdb.test to node12_mongodb_full_backup/testdb/test.bson
2020-11-15T21:47:45.454+0800    done dumping testdb.test (1000 documents)
[[email protected] ~]# ls
node12_mongodb_full_backup
[[email protected] ~]# ll node12_mongodb_full_backup/
total 0
drwxr-xr-x 2 root root 128 Nov 15 21:47 admin
drwxr-xr-x 2 root root  49 Nov 15 21:47 testdb
[[email protected] ~]# tree node12_mongodb_full_backup/
node12_mongodb_full_backup/
├── admin
│   ├── system.users.bson
│   ├── system.users.metadata.json
│   ├── system.version.bson
│   └── system.version.metadata.json
└── testdb
    ├── test.bson
    └── test.metadata.json

2 directories, 6 files
[[email protected] ~]# 

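  Note: -u specifies the user, -p the user's password, -h the database address, --authenticationDatabase the database against which the user and password are authenticated, and -o the directory in which to store the backup files.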
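
  Since backups are meant to be taken regularly, the full dump above can be wrapped so that each run lands in its own dated directory. The following is only a minimal sketch: the function names backup_dir_name and full_backup are hypothetical, the host, user, and directory prefix are the ones used above, and -p is left out on purpose so that mongodump prompts for the password instead of taking it from the command line. An unattended job would need another way to supply credentials.

```shell
# Sketch only: backup_dir_name and full_backup are hypothetical helper names;
# host, user and directory prefix are the values used in this article.

# Print a directory name such as node12_mongodb_full_backup_20201115.
# The second argument (YYYYMMDD) defaults to today's date.
backup_dir_name() {
    local prefix="$1"
    local day="${2:-$(date +%Y%m%d)}"
    printf '%s_%s\n' "$prefix" "$day"
}

# Dump all databases into a dated directory. Because -p is omitted,
# mongodump asks for the password interactively.
full_backup() {
    local out
    out="$(backup_dir_name node12_mongodb_full_backup)"
    mongodump -u tom -h 192.168.0.52:27017 --authenticationDatabase admin -o "./$out"
}

# Usage:
#   full_backup
```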

  Back up only the testdb database

[[email protected] ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -o ./node12_testdb
2020-11-15T21:53:36.523+0800    writing testdb.test to node12_testdb/testdb/test.bson
2020-11-15T21:53:36.526+0800    done dumping testdb.test (1000 documents)
[[email protected] ~]# tree ./node12_testdb
./node12_testdb
└── testdb
    ├── test.bson
    └── test.metadata.json

1 directory, 2 files
[[email protected] ~]# 

  Note: -d specifies the name of the database to back up.

  Back up only the test collection in testdb

[[email protected] ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test -o ./node12_testdb_test-collection
2020-11-15T21:55:48.217+0800    writing testdb.test to node12_testdb_test-collection/testdb/test.bson
2020-11-15T21:55:48.219+0800    done dumping testdb.test (1000 documents)
[[email protected] ~]# tree ./node12_testdb_test-collection
./node12_testdb_test-collection
└── testdb
    ├── test.bson
    └── test.metadata.json

1 directory, 2 files
[[email protected] ~]# 

  Note: -c specifies the name of the collection to back up.

  Back up testdb with compression

[[email protected] ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --gzip -o ./node12_mongodb_testdb-gzip 
2020-11-15T22:00:52.268+0800    writing testdb.test to node12_mongodb_testdb-gzip/testdb/test.bson.gz
2020-11-15T22:00:52.273+0800    done dumping testdb.test (1000 documents)
[[email protected] ~]# tree ./node12_mongodb_testdb-gzip
./node12_mongodb_testdb-gzip
└── testdb
    ├── test.bson.gz
    └── test.metadata.json.gz

1 directory, 2 files
[[email protected] ~]# 

  Note: enabling compression only takes the --gzip option; the backup files are then written as compressed files with a .gz suffix.
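
  A related option that this article does not use is --archive, which writes the whole dump to a single file instead of a directory tree; combined with --gzip it yields one compressed file. The sketch below is an illustration under assumptions: the function names dump_archive and restore_archive and the archive file name are made up for the example, while the host and user are the ones used above.

```shell
# Sketch only: dump_archive, restore_archive and the file name are illustrative.

# Dump one database into a single gzip-compressed archive file.
dump_archive() {
    local db="$1"
    local file="$2"
    mongodump -u tom -h 192.168.0.52:27017 --authenticationDatabase admin \
        -d "$db" --gzip --archive="$file"
}

# Restore from that archive. --gzip is given again because the archive is compressed.
restore_archive() {
    local file="$1"
    mongorestore -u tom -h 192.168.0.52:27017 --authenticationDatabase admin \
        --gzip --archive="$file" --drop
}

# Usage:
#   dump_archive testdb ./node12_testdb.archive.gz
#   restore_archive ./node12_testdb.archive.gz
```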

  Back up the test collection in testdb with compression

[[email protected] ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test --gzip -o ./node12_mongodb_testdb-test-gzip 
2020-11-15T22:01:31.492+0800    writing testdb.test to node12_mongodb_testdb-test-gzip/testdb/test.bson.gz
2020-11-15T22:01:31.500+0800    done dumping testdb.test (1000 documents)
[[email protected] ~]# tree ./node12_mongodb_testdb-test-gzip
./node12_mongodb_testdb-test-gzip
└── testdb
    ├── test.bson.gz
    └── test.metadata.json.gz

1 directory, 2 files
[[email protected] ~]# 

  Restoring data with mongorestore

  Drop testdb on node12

> db
testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
> 

  Restore all databases from the full backup

[[email protected] ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --drop ./node12_mongodb_full_backup
2020-11-15T22:07:35.465+0800    preparing collections to restore from
2020-11-15T22:07:35.467+0800    reading metadata for testdb.test from node12_mongodb_full_backup/testdb/test.metadata.json
2020-11-15T22:07:35.475+0800    restoring testdb.test from node12_mongodb_full_backup/testdb/test.bson
2020-11-15T22:07:35.486+0800    no indexes to restore
2020-11-15T22:07:35.486+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:07:35.486+0800    restoring users from node12_mongodb_full_backup/admin/system.users.bson
2020-11-15T22:07:35.528+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[[email protected] ~]#

  Note: --drop tells mongorestore to drop each collection that already exists on the target before restoring it, so that the restored data matches the backup.

  Verify: log in to 192.168.0.52:27017 and check whether testdb has been restored.

[[email protected] ~]# mongo -utom -p123456 192.168.0.52:27017/admin 
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("af96cb64-a2a4-4d59-b60a-86ccbbe77e3e") }
MongoDB server version: 4.4.1
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
        https://docs.mongodb.com/
Questions? Try the MongoDB Developer Community Forums
        https://community.mongodb.com
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

  Restore a single database

  Drop the testdb database

> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
> 

  Restore testdb with mongorestore

[[email protected] ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --drop ./node12_testdb/testdb/
2020-11-15T22:29:03.718+0800    The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2020-11-15T22:29:03.718+0800    building a list of collections to restore from node12_testdb/testdb dir
2020-11-15T22:29:03.719+0800    reading metadata for testdb.test from node12_testdb/testdb/test.metadata.json
2020-11-15T22:29:03.736+0800    restoring testdb.test from node12_testdb/testdb/test.bson
2020-11-15T22:29:03.755+0800    no indexes to restore
2020-11-15T22:29:03.755+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:29:03.755+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[[email protected] ~]# mongo -utom -p123456 192.168.0.52:27017/admin 
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("f5e73939-bb87-4d45-bf80-9ff1e7f6f15d") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show tables
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

  Restore a single collection

  Drop the test collection in testdb

> db
testdb
> show collections
test
> db.test.drop()
true
> show collections
> 

  Restore the test collection in testdb with mongorestore

[[email protected] ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test --drop ./node12_testdb_test-collection/testdb/test.bson 
2020-11-15T22:36:15.615+0800    checking for collection data in node12_testdb_test-collection/testdb/test.bson
2020-11-15T22:36:15.616+0800    reading metadata for testdb.test from node12_testdb_test-collection/testdb/test.metadata.json
2020-11-15T22:36:15.625+0800    restoring testdb.test from node12_testdb_test-collection/testdb/test.bson
2020-11-15T22:36:15.669+0800    no indexes to restore
2020-11-15T22:36:15.669+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:36:15.669+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[[email protected] ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("27d15d9e-3fdf-4efc-b871-1ec6716e51e3") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

  Restore a database from compressed files

  Drop the testdb database

> db
testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
> 

  Restore the database from the compressed files with mongorestore

[[email protected] ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --gzip --drop ./node12_mongodb_testdb-gzip/testdb/
2020-11-15T22:39:55.313+0800    The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2020-11-15T22:39:55.313+0800    building a list of collections to restore from node12_mongodb_testdb-gzip/testdb dir
2020-11-15T22:39:55.314+0800    reading metadata for testdb.test from node12_mongodb_testdb-gzip/testdb/test.metadata.json.gz
2020-11-15T22:39:55.321+0800    restoring testdb.test from node12_mongodb_testdb-gzip/testdb/test.bson.gz
2020-11-15T22:39:55.332+0800    no indexes to restore
2020-11-15T22:39:55.332+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:39:55.332+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[[email protected] ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("73d98c33-f8f7-40e3-89bd-fda8c702e407") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

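  Note: to restore a single database, name it with -d; to restore a single collection, add -c with the collection name; to restore from compressed files, add --gzip. In short, the options used for the backup should be matched at restore time, and they are essentially the same as the ones mongodump takes.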
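
  The restore logs above warn that --db and --collection are deprecated for this use and point to --nsInclude instead. A minimal sketch of that form follows; restore_namespace is a hypothetical wrapper name, and the host and user are the ones used above. With --nsInclude the path argument is the top of the dump directory rather than the per-database subdirectory.

```shell
# Sketch only: restore_namespace is a hypothetical wrapper name.
# $1 is a namespace pattern such as 'testdb.*' or 'testdb.test';
# $2 is the top-level dump directory written by mongodump -o.
restore_namespace() {
    local ns="$1"
    local dump_dir="$2"
    mongorestore -u tom -h 192.168.0.52:27017 --authenticationDatabase admin \
        --nsInclude "$ns" --drop "$dump_dir"
}

# Usage (quote the pattern so the shell does not expand the asterisk):
#   restore_namespace 'testdb.*'    ./node12_mongodb_full_backup
#   restore_namespace 'testdb.test' ./node12_mongodb_full_backup
```

  For a dump taken with --gzip, the same option would have to be added to the mongorestore call.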

  Backing up data with mongoexport

  Create a peoples database and insert data into its peoples_info collection

> use peoples
switched to db peoples
> for(i=1;i<=10000;i++) db.peoples_info.insert({id:i,name:"peoples"+i,age:(i%120),classes:(i%25)})
WriteResult({ "nInserted" : 1 })
> show dbs
admin    0.000GB
config   0.000GB
local    0.000GB
peoples  0.000GB
testdb   0.000GB
> db.peoples_info.count()
10000
> db.peoples_info.findOne()
{
        "_id" : ObjectId("5fb13f35012870b3c8e3c895"),
        "id" : 1,
        "name" : "peoples1",
        "age" : 1,
        "classes" : 1
}
> 

  Export the peoples_info collection of the peoples database with mongoexport

[[email protected] ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type json -o ./peoples-peopels_info.json
2020-11-15T22:54:18.287+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:54:18.370+0800    exported 10000 records
[[email protected] ~]# ll
total 1004
-rw-r--r-- 1 root root 1024609 Nov 15 22:54 peoples-peopels_info.json
[[email protected] ~]# head -n 1 peoples-peopels_info.json 
{"_id":{"$oid":"5fb13f35012870b3c8e3c895"},"id":1.0,"name":"peoples1","age":1.0,"classes":1.0}
[[email protected] ~]# 

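  Note: --type sets the format of the exported file; the default is json, and csv is also available. Also note that mongoexport requires both a database and a collection: it cannot export every collection of a database in one run, only one collection at a time.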
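
  Because mongoexport handles one collection per run, exporting several collections comes down to a loop. The sketch below is one way to write it, not part of the original walkthrough: export_collections is a hypothetical helper, the collection names are passed in explicitly, and the output file naming is an assumption modeled on the file name used above.

```shell
# Sketch only: export_collections is a hypothetical wrapper name.
# $1 = database; remaining arguments = the collections to export.
# Each collection is written to ./<database>-<collection>.json.
export_collections() {
    local db="$1"
    shift
    local coll
    for coll in "$@"; do
        mongoexport -u tom -h 192.168.0.52:27017 --authenticationDatabase admin \
            -d "$db" -c "$coll" --type json -o "./${db}-${coll}.json"
    done
}

# Usage:
#   export_collections testdb test test1
```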

  Export the data as CSV

[[email protected] ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type csv -o ./peoples-peopels_info.csv
2020-11-15T22:58:30.495+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:58:30.498+0800    Failed: CSV mode requires a field list
[[email protected] ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type csv -f id,name,age -o ./peoples-peopels_info.csv  
2020-11-15T22:59:26.090+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:59:26.143+0800    exported 10000 records
[[email protected] ~]# head -n 1 ./peoples-peopels_info.csv
id,name,age
[[email protected] ~]# head  ./peoples-peopels_info.csv    
id,name,age
1,peoples1,1
2,peoples2,2
3,peoples3,3
4,peoples4,4
5,peoples5,5
6,peoples6,6
7,peoples7,7
8,peoples8,8
9,peoples9,9
[[email protected] ~]# 

  Note: when the export type is csv, the fields to export must be listed with -f, separated by commas.

  Importing the data into MongoDB on node11

  Import the JSON data

[[email protected] ~]# systemctl start mongod.service 
[[email protected] ~]# ss -tnl
State      Recv-Q Send-Q         Local Address:Port                        Peer Address:Port              
LISTEN     0      128                        *:22                                     *:*                  
LISTEN     0      100                127.0.0.1:25                                     *:*                  
LISTEN     0      128                127.0.0.1:27017                                  *:*                  
LISTEN     0      128                       :::22                                    :::*                  
LISTEN     0      100                      ::1:25                                    :::*                  
[[email protected] ~]# ll
total 1200
-rw-r--r-- 1 root root  198621 Nov 15 22:59 peoples-peopels_info.csv
-rw-r--r-- 1 root root 1024609 Nov 15 22:54 peoples-peopels_info.json
[[email protected] ~]# mongoimport  -d testdb -c peoples_info --drop peoples-peopels_info.json 
2020-11-15T23:05:03.004+0800    connected to: mongodb://localhost/
2020-11-15T23:05:03.005+0800    dropping: testdb.peoples_info
2020-11-15T23:05:03.186+0800    10000 document(s) imported successfully. 0 document(s) failed to import.
[[email protected] ~]#

  Note: on import, any database and collection name can be given as the target.

  Verify: does testdb on node11 now contain a peoples_info collection, and does it hold data?

[[email protected] ~]# mongo
MongoDB shell version v4.4.1
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("4e3a00b0-8367-4b3a-9a77-e61d03bb1b3d") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T23:03:39.669+08:00: ***** SERVER RESTARTED *****
        2020-11-15T23:03:40.681+08:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
        2020-11-15T23:03:40.681+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T23:03:40.681+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
peoples_info
> db.peoples_info.count()
10000
> db.peoples_info.findOne()
{
        "_id" : ObjectId("5fb13f35012870b3c8e3c895"),
        "id" : 1,
        "name" : "peoples1",
        "age" : 1,
        "classes" : 1
}
> 

  Import the CSV data into the test1 collection of testdb on node12

[[email protected] ~]# mongoimport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test1 --type csv --headerline --file ./peoples-peopels_info.csv 
2020-11-15T23:11:42.595+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T23:11:42.692+0800    10000 document(s) imported successfully. 0 document(s) failed to import.
[[email protected] ~]#

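  Note: importing CSV requires setting the type to csv explicitly; --headerline tells mongoimport to use the first line as the field names instead of importing it as data, and --file names the CSV file.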

  Verify: log in to MongoDB on node12 and check whether testdb contains a test1 collection and whether it holds data.

[[email protected] ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("72a07318-ac04-46f9-a310-13b1241d2f77") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin    0.000GB
config   0.000GB
local    0.000GB
peoples  0.000GB
testdb   0.000GB
> use testdb
switched to db testdb
> show collections
test
test1
> db.test1.count()
10000
> db.test1.findOne()
{
        "_id" : ObjectId("5fb1452ef09b563b65405f7c"),
        "id" : 1,
        "name" : "peoples1",
        "age" : 1
}
> 

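  Note: the test1 collection in testdb has no classes field, because classes was not among the fields listed at export time, so the imported data does not have it either. That covers the use and testing of mongodump/mongorestore and mongoexport/mongoimport.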
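
  To carry classes across as well, every wanted field has to be listed at export time. The sketch below wraps the CSV export so that the field list is built from the arguments; export_csv is a hypothetical helper name and the output file naming is an assumption.

```shell
# Sketch only: export_csv is a hypothetical wrapper name.
# $1 = database, $2 = collection, remaining arguments = field names.
export_csv() {
    local db="$1"
    local coll="$2"
    shift 2
    local fields
    # Join the remaining arguments with commas, e.g. id,name,age,classes.
    fields="$(IFS=,; printf '%s' "$*")"
    mongoexport -u tom -h 192.168.0.52:27017 --authenticationDatabase admin \
        -d "$db" -c "$coll" --type csv -f "$fields" -o "./${db}-${coll}.csv"
}

# Usage:
#   export_csv peoples peoples_info id name age classes
```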

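  Full backup plus oplog: restoring MongoDB to a point in time

  When backing up with mongodump, the --oplog option records the oplog entries written between the start and the end of the dump. The oplog is the log of write operations on MongoDB collections, similar to the binlog in MySQL. Replaying it on restore brings back the changes made while the dump was running, so the restored data reflects the state at the moment the dump finished rather than a mix of intermediate states.

  Simulate inserting data while a backup is running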

test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> for(i=1;i<=1000000;i++) db.test3.insert({id:i,name:"test3-oplog"+i,commit:"test3"+i})

  

  Meanwhile, back up the data from another terminal

[[email protected] ~]# rm -rf *
[[email protected] ~]# ll
total 0
[[email protected] ~]#  mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --oplog -o ./alldatabase
2020-11-15T23:51:40.606+0800    writing admin.system.users to alldatabase/admin/system.users.bson
2020-11-15T23:51:40.606+0800    done dumping admin.system.users (4 documents)
2020-11-15T23:51:40.607+0800    writing admin.system.version to alldatabase/admin/system.version.bson
2020-11-15T23:51:40.608+0800    done dumping admin.system.version (2 documents)
2020-11-15T23:51:40.609+0800    writing testdb.test1 to alldatabase/testdb/test1.bson
2020-11-15T23:51:40.611+0800    writing testdb.test3 to alldatabase/testdb/test3.bson
2020-11-15T23:51:40.612+0800    writing testdb.test to alldatabase/testdb/test.bson
2020-11-15T23:51:40.612+0800    writing peoples.peoples_info to alldatabase/peoples/peoples_info.bson
2020-11-15T23:51:40.696+0800    done dumping peoples.peoples_info (10000 documents)
2020-11-15T23:51:40.761+0800    done dumping testdb.test3 (54167 documents)
2020-11-15T23:51:40.803+0800    done dumping testdb.test (31571 documents)
2020-11-15T23:51:40.966+0800    done dumping testdb.test1 (79830 documents)
2020-11-15T23:51:40.972+0800    writing captured oplog to 
2020-11-15T23:51:40.980+0800            dumped 916 oplog entries
[[email protected] ~]# ll
total 0
drwxr-xr-x 5 root root 66 Nov 15 23:51 alldatabase
[[email protected] ~]# tree alldatabase/
alldatabase/
├── admin
│   ├── system.users.bson
│   ├── system.users.metadata.json
│   ├── system.version.bson
│   └── system.version.metadata.json
├── oplog.bson
├── peoples
│   ├── peoples_info.bson
│   └── peoples_info.metadata.json
└── testdb
    ├── test1.bson
    ├── test1.metadata.json
    ├── test3.bson
    ├── test3.metadata.json
    ├── test.bson
    └── test.metadata.json

3 directories, 13 files
[[email protected] ~]# 

  Note: the backup now contains an additional oplog.bson file.

  Look at the first and the last record in oplog.bson

[[email protected] ~]# ls
alldatabase
[[email protected] ~]# cd alldatabase/
[[email protected] alldatabase]# ls
admin  oplog.bson  peoples  testdb
[[email protected] alldatabase]# bsondump oplog.bson > /tmp/oplog.bson.tmp
2020-11-15T23:55:04.801+0800    916 objects found
[[email protected] alldatabase]# head -n 1 /tmp/oplog.bson.tmp
{"op":"i","ns":"testdb.test3","ui":{"$binary":{"base64":"7PmE47CASOiQZt5sMGDZKw==","subType":"04"}},"o":{"_id":{"$oid":"5fb14e8c01fff06b2b50a2ac"},"id":{"$numberDouble":"54101.0"},"name":"test3-oplog54101","commit":"test354101"},"ts":{"$timestamp":{"t":1605455500,"i":1880}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605455500608"}},"v":{"$numberLong":"2"}}
[[email protected] alldatabase]# tail -n 1 /tmp/oplog.bson.tmp
{"op":"i","ns":"testdb.test3","ui":{"$binary":{"base64":"7PmE47CASOiQZt5sMGDZKw==","subType":"04"}},"o":{"_id":{"$oid":"5fb14e8c01fff06b2b50a63f"},"id":{"$numberDouble":"55016.0"},"name":"test3-oplog55016","commit":"test355016"},"ts":{"$timestamp":{"t":1605455500,"i":2795}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605455500961"}},"v":{"$numberLong":"2"}}
[[email protected] alldatabase]# 

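  Note: the oplog holds the inserts with id 54101 through 55016, which shows that the data kept changing from the start of the dump to its end, so the dumped collection files alone capture an intermediate state. Also note that --oplog cannot be combined with a database selection: the oplog covers all databases rather than a single one, so --oplog only works when backing up all databases.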
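
  Each oplog record also carries a ts timestamp made of a seconds value t and an ordinal i, which gives the time window the captured oplog covers. As a small sketch, the hypothetical helper oplog_ts below reads bsondump JSON lines on standard input and prints the timestamp as seconds:ordinal, the same form that the --oplogLimit option of mongorestore takes. It assumes the compact JSON layout shown in the output above.

```shell
# Sketch only: oplog_ts is a hypothetical helper name.
# Reads bsondump JSON lines on stdin and prints each "ts" as seconds:ordinal.
# Lines without a "ts" timestamp produce no output.
oplog_ts() {
    sed -n 's/.*"ts":{"\$timestamp":{"t":\([0-9]*\),"i":\([0-9]*\)}}.*/\1:\2/p'
}

# Usage:
#   head -n 1 /tmp/oplog.bson.tmp | oplog_ts
#   tail -n 1 /tmp/oplog.bson.tmp | oplog_ts
```

  Applied to the two records shown above, this prints 1605455500:1880 for the first and 1605455500:2795 for the last.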

  Drop the testdb database, then restore from the dump just taken

test_replset:PRIMARY> show dbs
admin    0.000GB
config   0.000GB
local    0.014GB
peoples  0.000GB
testdb   0.019GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> db.dropDatabase()
{
        "dropped" : "testdb",
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1605456134, 4),
                "signature" : {
                        "hash" : BinData(0,"cRAdXcUj5c48Q77rCJ1DeeF10u8="),
                        "keyId" : NumberLong("6895378399531892740")
                }
        },
        "operationTime" : Timestamp(1605456134, 4)
}
test_replset:PRIMARY> show dbs
admin    0.000GB
config   0.000GB
local    0.014GB
peoples  0.000GB
test_replset:PRIMARY> 

  Restore the data with mongorestore

[[email protected] ~]# ls
alldatabase
[[email protected] ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --oplogReplay --drop ./alldatabase/
2020-11-16T00:06:32.049+0800    preparing collections to restore from
2020-11-16T00:06:32.053+0800    reading metadata for testdb.test1 from alldatabase/testdb/test1.metadata.json
2020-11-16T00:06:32.060+0800    reading metadata for testdb.test3 from alldatabase/testdb/test3.metadata.json
2020-11-16T00:06:32.064+0800    reading metadata for testdb.test from alldatabase/testdb/test.metadata.json
2020-11-16T00:06:32.064+0800    restoring testdb.test1 from alldatabase/testdb/test1.bson
2020-11-16T00:06:32.074+0800    restoring testdb.test3 from alldatabase/testdb/test3.bson
2020-11-16T00:06:32.093+0800    restoring testdb.test from alldatabase/testdb/test.bson
2020-11-16T00:06:32.098+0800    reading metadata for peoples.peoples_info from alldatabase/peoples/peoples_info.metadata.json
2020-11-16T00:06:32.110+0800    restoring peoples.peoples_info from alldatabase/peoples/peoples_info.bson
2020-11-16T00:06:32.333+0800    no indexes to restore
2020-11-16T00:06:32.333+0800    finished restoring peoples.peoples_info (10000 documents, 0 failures)
2020-11-16T00:06:32.766+0800    no indexes to restore
2020-11-16T00:06:32.766+0800    finished restoring testdb.test (31571 documents, 0 failures)
2020-11-16T00:06:33.023+0800    no indexes to restore
2020-11-16T00:06:33.023+0800    finished restoring testdb.test3 (54167 documents, 0 failures)
2020-11-16T00:06:33.370+0800    no indexes to restore
2020-11-16T00:06:33.370+0800    finished restoring testdb.test1 (79830 documents, 0 failures)
2020-11-16T00:06:33.370+0800    restoring users from alldatabase/admin/system.users.bson
2020-11-16T00:06:33.416+0800    replaying oplog
2020-11-16T00:06:33.850+0800    applied 916 oplog entries
2020-11-16T00:06:33.850+0800    175568 document(s) restored successfully. 0 document(s) failed to restore.
[[email protected] ~]# 

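  Note: the restore needs --oplogReplay to replay the contents of oplog.bson. The restore log shows that 916 oplog entries were applied; in other words, 916 write operations were recorded between the start and the end of the dump.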

  Verify: connect to the database and check how many documents were restored into the test3 collection of testdb.

test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> show tables
test
test1
test3
test_replset:PRIMARY> db.test3.count()
55016
test_replset:PRIMARY> 

  Note: test3 was restored with 55016 documents, which matches the id of the last record in oplog.bson.
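
  The counts fit together as follows, sketched here as shell arithmetic. This rests on two assumptions that go beyond what the logs state: that the dumped test3.bson holds the contiguous ids 1 through 54167, and that replaying an insert for a document already restored from the dump leaves a single copy of it.

```shell
# Sketch only: assumes the dump of test3 holds the contiguous ids 1..54167.
dumped=54167            # documents written to test3.bson
first_oplog_id=54101    # id in the first oplog.bson record
last_oplog_id=55016     # id in the last oplog.bson record

oplog_entries=$(( last_oplog_id - first_oplog_id + 1 ))   # 916 captured entries
overlap=$(( dumped - first_oplog_id + 1 ))                # 67 entries already in the dump
restored=$(( dumped + oplog_entries - overlap ))          # 55016 documents after replay

echo "oplog entries: $oplog_entries, overlap: $overlap, restored: $restored"
```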

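  Backing up oplog.rs to restore to a chosen point in time

  To make the effect easier to see, I emptied the database again and turned off authentication.

  Insert data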

test_replset:PRIMARY> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> for(i=1;i<=100000;i++) db.test.insert({id:(i+10000),name:"test-oplog"+i,commit:"test"+i})

  Back up the data at the same time, this time without --oplog

[[email protected] ~]# ll
total 0
[[email protected] ~]# mongodump -h node12:27017  -o ./alldatabase
2020-11-16T09:38:00.921+0800	writing admin.system.version to alldatabase/admin/system.version.bson
2020-11-16T09:38:00.923+0800	done dumping admin.system.version (1 document)
2020-11-16T09:38:00.924+0800	writing testdb.test to alldatabase/testdb/test.bson
2020-11-16T09:38:00.960+0800	done dumping testdb.test (16377 documents)
[[email protected] ~]# ll 
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
[[email protected] ~]# tree ./alldatabase
./alldatabase
├── admin
│   ├── system.version.bson
│   └── system.version.metadata.json
└── testdb
    ├── test.bson
    └── test.metadata.json

2 directories, 4 files
[[email protected] ~]# 

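  Note: the backup ran while data was still being inserted. The backup log above shows that only 16377 documents of testdb.test were dumped, which is clearly not the whole collection, only part of it. Once the insert loop finishes, testdb.test should contain 100000 documents.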

  Verify: does testdb.test contain 100000 documents?

test_replset:PRIMARY> db
testdb
test_replset:PRIMARY> show collections
test
test_replset:PRIMARY> db.test.count()
100000
test_replset:PRIMARY> 

  Simulate an accidental deletion of every document in testdb.test

test_replset:PRIMARY> db
testdb
test_replset:PRIMARY> show collections
test
test_replset:PRIMARY> db.test.remove({})
WriteResult({ "nRemoved" : 100000 })
test_replset:PRIMARY> 

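  Note: we have now accidentally deleted every document in testdb.test, and the earlier backup can only bring back part of the data. What can we do? We can export the oplog.rs collection, which is the collection where the oplog is stored; it lives in the local database.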

  Back up the oplog.rs collection in the local database

[[email protected] ~]# ll
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
[[email protected] ~]# mongodump -h node12:27017 -d local -c oplog.rs -o ./oplog-rs
2020-11-16T09:43:38.594+0800	writing local.oplog.rs to oplog-rs/local/oplog.rs.bson
2020-11-16T09:43:38.932+0800	done dumping local.oplog.rs (200039 documents)
[[email protected] ~]# ll
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
drwxr-xr-x 3 root root 19 Nov 16 09:43 oplog-rs
[[email protected] ~]# tree ./oplog-rs
./oplog-rs
└── local
    ├── oplog.rs.bson
    └── oplog.rs.metadata.json

1 directory, 2 files
[[email protected] ~]# 

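  Note: the oplog is stored in the oplog.rs collection of the local database, and the command above backs up all of it. We now have an oplog ready, but we cannot replay it as is: doing so would replay the accidental delete as well, which would be pointless. We first need to find the point in time of the mistake and then restore up to it.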
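  To make the role of such a cut-off concrete, here is a minimal sketch in plain JavaScript (runnable with Node.js) of how a replay limit behaves. It assumes the documented behaviour of mongorestore's --oplogLimit option, which skips oplog entries whose timestamp is equal to or newer than the limit. The helper names compareTs and entriesBeforeLimit and the sample entries are made up for illustration and are not part of any MongoDB tool.

```javascript
// Simplified model of an oplog cut-off. Each entry carries a timestamp
// { t: seconds since epoch, i: ordinal within that second }.
// compareTs and entriesBeforeLimit are hypothetical helper names.

// Order two timestamps: first by seconds, then by ordinal.
function compareTs(a, b) {
  return a.t !== b.t ? a.t - b.t : a.i - b.i;
}

// Keep only entries strictly older than the limit (assumed to mirror
// how --oplogLimit excludes entries at or after the given timestamp).
function entriesBeforeLimit(oplog, limit) {
  return oplog.filter((entry) => compareTs(entry.ts, limit) < 0);
}

// Made-up sample: two inserts followed by two accidental deletes.
const sampleOplog = [
  { op: "i", ts: { t: 1605490700, i: 1 } },
  { op: "i", ts: { t: 1605490700, i: 2 } },
  { op: "d", ts: { t: 1605490915, i: 1 } },
  { op: "d", ts: { t: 1605490915, i: 2 } },
];

const replayed = entriesBeforeLimit(sampleOplog, { t: 1605490915, i: 1 });
console.log(replayed.map((entry) => entry.op).join(","));
```

  With the limit set to the timestamp of the first delete, only the two inserts survive the filter, which is the effect we are after: everything before the mistake is replayed and the mistake itself is not.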

  Find the time of the accidental delete in the oplog

[[email protected] ~]# bsondump oplog-rs/local/oplog.rs.bson |egrep "\"op\":\"d\""|head -n 3
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa146"}},"ts":{"$timestamp":{"t":1605490915,"i":1}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915399"}},"v":{"$numberLong":"2"}}
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa147"}},"ts":{"$timestamp":{"t":1605490915,"i":2}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915400"}},"v":{"$numberLong":"2"}}
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa148"}},"ts":{"$timestamp":{"t":1605490915,"i":3}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915400"}},"v":{"$numberLong":"2"}}
2020-11-16T09:46:20.363+0800	100074 objects found
2020-11-16T09:46:20.363+0800	write /dev/stdout: broken pipe
[[email protected] ~]# 

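  Note: we want to restore the data as it was just before the first delete, so we take the $timestamp field of the first entry, {"t":1605490915,"i":1}; this is the time of the first delete. The "broken pipe" message at the end only means that head closed the pipe after reading three lines.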
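  The step from that $timestamp value to the string passed on the command line can be sketched in plain JavaScript (runnable with Node.js). The sample line is abridged from the bsondump output above to the fields that matter, and toOplogLimit is a hypothetical helper name chosen for this example.

```javascript
// One line of bsondump output for the first delete (abridged from the
// output above to the fields that matter here).
const firstDelete =
  '{"op":"d","ns":"testdb.test","ts":{"$timestamp":{"t":1605490915,"i":1}}}';

// toOplogLimit is a hypothetical helper: it turns the entry's timestamp
// into the "<seconds>:<ordinal>" string form accepted by --oplogLimit.
function toOplogLimit(jsonLine) {
  const ts = JSON.parse(jsonLine).ts.$timestamp;
  return `${ts.t}:${ts.i}`;
}

const limit = toOplogLimit(firstDelete);
console.log(limit);

// The seconds part is a Unix epoch value, so it can be turned into a
// wall-clock time to sanity-check against when the mistake happened.
const when = new Date(JSON.parse(firstDelete).ts.$timestamp.t * 1000).toISOString();
console.log(when);
```

  This prints 1605490915:1 and 2020-11-16T01:41:55.000Z, that is 09:41:55 in the +0800 time zone of the logs, which falls between the dump at 09:38 and the oplog export at 09:43 as expected.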

  Copy oplog.rs.bson into the backup directory as oplog.bson, reproducing the layout of a backup taken with the --oplog option

[[email protected] ~]# cp ./oplog-rs/local/oplog.rs.bson ./alldatabase/oplog.bson
[[email protected] ~]# 

  Restore with mongorestore, replaying the oplog up to the point just before the first delete

[[email protected] ~]# mongorestore -h node12:27017 --oplogReplay  --oplogLimit "1605490915:1" --drop ./alldatabase/
2020-11-16T09:51:19.658+0800	preparing collections to restore from
2020-11-16T09:51:19.668+0800	reading metadata for testdb.test from alldatabase/testdb/test.metadata.json
2020-11-16T09:51:19.693+0800	restoring testdb.test from alldatabase/testdb/test.bson
2020-11-16T09:51:19.983+0800	no indexes to restore
2020-11-16T09:51:19.983+0800	finished restoring testdb.test (16377 documents, 0 failures)
2020-11-16T09:51:19.983+0800	replaying oplog
2020-11-16T09:51:22.657+0800	oplog  537KB
2020-11-16T09:51:25.657+0800	oplog  1.12MB
2020-11-16T09:51:28.657+0800	oplog  1.72MB
2020-11-16T09:51:31.657+0800	oplog  2.32MB
2020-11-16T09:51:34.657+0800	oplog  2.92MB
2020-11-16T09:51:37.657+0800	oplog  3.51MB
2020-11-16T09:51:40.657+0800	oplog  4.11MB
2020-11-16T09:51:43.657+0800	oplog  4.71MB
2020-11-16T09:51:46.657+0800	oplog  5.30MB
2020-11-16T09:51:49.657+0800	oplog  5.90MB
2020-11-16T09:51:52.657+0800	oplog  6.46MB
2020-11-16T09:51:55.657+0800	oplog  7.04MB
2020-11-16T09:51:58.657+0800	oplog  7.61MB
2020-11-16T09:52:01.657+0800	oplog  8.20MB
2020-11-16T09:52:04.657+0800	oplog  8.77MB
2020-11-16T09:52:07.657+0800	oplog  9.36MB
2020-11-16T09:52:10.657+0800	oplog  9.96MB
2020-11-16T09:52:13.657+0800	oplog  10.6MB
2020-11-16T09:52:16.656+0800	oplog  11.2MB
2020-11-16T09:52:19.657+0800	oplog  11.8MB
2020-11-16T09:52:22.657+0800	oplog  12.4MB
2020-11-16T09:52:25.657+0800	oplog  13.0MB
2020-11-16T09:52:28.657+0800	oplog  13.6MB
2020-11-16T09:52:31.657+0800	oplog  14.2MB
2020-11-16T09:52:34.657+0800	oplog  14.8MB
2020-11-16T09:52:37.657+0800	oplog  15.4MB
2020-11-16T09:52:40.657+0800	oplog  16.0MB
2020-11-16T09:52:43.657+0800	oplog  16.6MB
2020-11-16T09:52:46.657+0800	oplog  17.2MB
2020-11-16T09:52:49.657+0800	oplog  17.8MB
2020-11-16T09:52:52.433+0800	skipping applying the config.system.sessions namespace in applyOps
2020-11-16T09:52:52.433+0800	applied 100008 oplog entries
2020-11-16T09:52:52.433+0800	oplog  18.4MB
2020-11-16T09:52:52.433+0800	16377 document(s) restored successfully. 0 document(s) failed to restore.
[[email protected] ~]# 

  Note: the restore log above shows that 100008 oplog entries were applied and that the 16377 documents from the backup were restored successfully.

  Verify: was testdb.test restored, and how many documents came back?

test_replset:PRIMARY> show dbs
admin   0.000GB
config  0.000GB
local   0.010GB
testdb  0.004GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> show tables
test
test_replset:PRIMARY> db.test.count()
100000
test_replset:PRIMARY> 

  Note: testdb.test is back to 100000 documents.
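  One way to see why the count ends at 100000 rather than 16377 plus the replayed inserts: replayed inserts for documents that the dump already contains do not create duplicates. The sketch below is a simplified model in plain JavaScript (runnable with Node.js). It assumes that replayed inserts behave like upserts keyed by _id, which is my understanding of oplog replay rather than something the logs above prove. replayOplog and the tiny data set are invented for illustration.

```javascript
// Simplified model: a collection is a Map keyed by _id.
// replayOplog is a hypothetical helper, not a MongoDB API.
function replayOplog(collection, oplog) {
  for (const entry of oplog) {
    if (entry.op === "i") {
      // Insert treated as an upsert: an existing _id is overwritten.
      collection.set(entry.o._id, entry.o);
    } else if (entry.op === "d") {
      collection.delete(entry.o._id);
    }
  }
  return collection;
}

// Partial dump: only documents 1..3 made it into the backup.
const restored = new Map([1, 2, 3].map((id) => [id, { _id: id }]));

// Oplog up to the limit: inserts for all ten documents, including
// the three that the dump already contains.
const oplog = [];
for (let id = 1; id <= 10; id++) {
  oplog.push({ op: "i", o: { _id: id } });
}

replayOplog(restored, oplog);
console.log(restored.size);
```

  The model ends with ten documents, not thirteen, in the same way that the real restore ends with 100000 documents after loading 16377 from the dump and replaying the inserts recorded in the oplog.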

  That wraps up this hands-on look at backing up and restoring mongodb.