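MongoDB, the Distributed Document Store: Backup and Restore

  In the previous post we covered MongoDB access control, user creation and role assignment; for a refresher see //www.cnblogs.com/qiuhom-1874/p/13974656.html. Today we look at backing up and restoring MongoDB.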

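  Why back up?

  A backup is a form of data redundancy that keeps data loss to a minimum when something goes wrong. The replica sets we built earlier are also data redundancy, but they only keep the data service available through non-human failures such as system faults or service crashes. They cannot protect against operator mistakes, because an erroneous write or delete is replicated to every member. To keep data safe and minimize loss, the database must be backed up periodically.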

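  Common backup methods

  Note: the figure in the original post outlines the common backup strategies for MongoDB. A logical backup exports the data through the database interface into files, normally with a dedicated pair of export and import tools that handle the backup and the restore. A physical backup, put simply, archives the database files themselves; to restore, you unpack the archived files back into place. A faster form of physical backup is a snapshot, which preserves the data exactly as it was at the moment the snapshot was taken; to restore, you revert to the snapshot.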

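  Logical vs. physical backup in MongoDB

  Note: the comparison in the original figure shows that, overall, physical backup is faster than logical backup both when backing up and when restoring. The main reason is that a logical backup reads the data out through the database interface and then writes it to files in the target format, whereas a physical backup only archives the data files as they are. It does not read documents one by one and rewrite them elsewhere, so that read-and-rewrite step is skipped and the backup is faster. Restore works the same way: a physical restore also skips the per-document read and write.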

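  MongoDB logical backup tools

  MongoDB has two pairs of logical backup tools. The first pair is mongodump/mongorestore. It writes the data in BSON, a binary format that usually cannot be opened and read in a text editor, so it is not human-readable, but the files are smaller. Restoring data exported by this pair depends on the MongoDB version: the BSON produced by different versions differs slightly, so a restore may fail when the versions do not match. The second pair is mongoexport/mongoimport. It exports JSON, which can be opened directly in a text editor and is easy to read, but the files are larger than the BSON equivalent; restoring does not depend on the version. For a cross-version backup, therefore, check the compatibility of the versions first: if they are compatible use mongodump/mongorestore, otherwise mongoexport/mongoimport is the safer choice. One caveat: JSON is readable and widely supported, but it carries only the data itself, not indexes, accounts or other metadata, so keep that in mind when using it.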

  Backing up data with mongodump

  Insert data

> use testdb
switched to db testdb
> for(i=1;i<=1000;i++) db.test.insert({id:i,name:"test"+i,age:(i%120),classes:(i%25)})
WriteResult({ "nInserted" : 1 })
> show tables
test
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> db.test.count()
1000
> 

  Back up all databases

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -o ./node12_mongodb_full_backup
2020-11-15T21:47:45.439+0800    writing admin.system.users to node12_mongodb_full_backup/admin/system.users.bson
2020-11-15T21:47:45.442+0800    done dumping admin.system.users (4 documents)
2020-11-15T21:47:45.443+0800    writing admin.system.version to node12_mongodb_full_backup/admin/system.version.bson
2020-11-15T21:47:45.447+0800    done dumping admin.system.version (2 documents)
2020-11-15T21:47:45.448+0800    writing testdb.test to node12_mongodb_full_backup/testdb/test.bson
2020-11-15T21:47:45.454+0800    done dumping testdb.test (1000 documents)
[root@node11 ~]# ls
node12_mongodb_full_backup
[root@node11 ~]# ll node12_mongodb_full_backup/
total 0
drwxr-xr-x 2 root root 128 Nov 15 21:47 admin
drwxr-xr-x 2 root root  49 Nov 15 21:47 testdb
[root@node11 ~]# tree node12_mongodb_full_backup/
node12_mongodb_full_backup/
├── admin
│   ├── system.users.bson
│   ├── system.users.metadata.json
│   ├── system.version.bson
│   └── system.version.metadata.json
└── testdb
    ├── test.bson
    └── test.metadata.json

2 directories, 6 files
[root@node11 ~]# 

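  Note: -u specifies the user, -p that user's password, -h the database address, --authenticationDatabase the database against which the user and password are verified, and -o the name of the directory in which to store the backup files.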
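  The options above can be combined into a small wrapper for routine full backups. The sketch below is only an illustration: the function name build_dump_cmd and the dated directory name are assumptions, not part of the setup in this post. It prints the command instead of running it, and it leaves out -p so that mongodump prompts for the password rather than leaving it in the shell history.

```shell
#!/bin/sh
# Sketch: print a full-backup mongodump command with a dated output directory.
# build_dump_cmd and the ./full_backup_<date> naming are illustrative assumptions.
build_dump_cmd() {
    host="$1"     # e.g. 192.168.0.52:27017
    user="$2"     # e.g. tom
    stamp="$3"    # e.g. the output of: date +%F
    # No --password here: mongodump then asks for it interactively.
    printf 'mongodump --host %s --username %s --authenticationDatabase admin --out ./full_backup_%s\n' \
        "$host" "$user" "$stamp"
}

# Dry run: show the command for today's backup without executing it.
build_dump_cmd 192.168.0.52:27017 tom "$(date +%F)"
```

  Printing the command first makes it easy to review before it is placed in a cron job.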

  Back up only the testdb database

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -o ./node12_testdb
2020-11-15T21:53:36.523+0800    writing testdb.test to node12_testdb/testdb/test.bson
2020-11-15T21:53:36.526+0800    done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_testdb
./node12_testdb
└── testdb
    ├── test.bson
    └── test.metadata.json

1 directory, 2 files
[root@node11 ~]# 

  Note: -d specifies the name of the database to back up.

  Back up only the test collection in testdb

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test -o ./node12_testdb_test-collection
2020-11-15T21:55:48.217+0800    writing testdb.test to node12_testdb_test-collection/testdb/test.bson
2020-11-15T21:55:48.219+0800    done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_testdb_test-collection
./node12_testdb_test-collection
└── testdb
    ├── test.bson
    └── test.metadata.json

1 directory, 2 files
[root@node11 ~]# 

  Note: -c specifies the name of the collection to back up.

  Compressed backup of the testdb database

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --gzip -o ./node12_mongodb_testdb-gzip 
2020-11-15T22:00:52.268+0800    writing testdb.test to node12_mongodb_testdb-gzip/testdb/test.bson.gz
2020-11-15T22:00:52.273+0800    done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_mongodb_testdb-gzip
./node12_mongodb_testdb-gzip
└── testdb
    ├── test.bson.gz
    └── test.metadata.json.gz

1 directory, 2 files
[root@node11 ~]# 

  Note: compression only needs the extra --gzip option; the backup files are then written as compressed files with a .gz suffix.
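  Since a compressed backup is only useful if the files can be read back, it can be worth checking them right after the dump. The following is a minimal sketch, assuming standard find and gzip are available; verify_gzip_dump is a hypothetical helper name, and gzip -t only tests the compression layer, not the BSON content.

```shell
#!/bin/sh
# Sketch: test every .gz file under a dump directory with `gzip -t`.
# verify_gzip_dump is a hypothetical helper name.
verify_gzip_dump() {
    # gzip -t checks integrity without extracting anything.
    # find returns non-zero if any gzip invocation fails.
    # A directory without .gz files passes trivially.
    find "$1" -type f -name '*.gz' -exec gzip -t {} +
}

# Only runs when the dump directory from the example above exists.
if [ -d ./node12_mongodb_testdb-gzip ]; then
    verify_gzip_dump ./node12_mongodb_testdb-gzip && echo "all .gz files passed gzip -t"
fi
```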

  Compressed backup of the test collection in testdb

[root@node11 ~]# mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test --gzip -o ./node12_mongodb_testdb-test-gzip 
2020-11-15T22:01:31.492+0800    writing testdb.test to node12_mongodb_testdb-test-gzip/testdb/test.bson.gz
2020-11-15T22:01:31.500+0800    done dumping testdb.test (1000 documents)
[root@node11 ~]# tree ./node12_mongodb_testdb-test-gzip
./node12_mongodb_testdb-test-gzip
└── testdb
    ├── test.bson.gz
    └── test.metadata.json.gz

1 directory, 2 files
[root@node11 ~]# 

  Restoring data with mongorestore

  Drop testdb on node12

> db
testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
> 

  Full restore of all databases

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --drop ./node12_mongodb_full_backup
2020-11-15T22:07:35.465+0800    preparing collections to restore from
2020-11-15T22:07:35.467+0800    reading metadata for testdb.test from node12_mongodb_full_backup/testdb/test.metadata.json
2020-11-15T22:07:35.475+0800    restoring testdb.test from node12_mongodb_full_backup/testdb/test.bson
2020-11-15T22:07:35.486+0800    no indexes to restore
2020-11-15T22:07:35.486+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:07:35.486+0800    restoring users from node12_mongodb_full_backup/admin/system.users.bson
2020-11-15T22:07:35.528+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]#

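  Note: with --drop, mongorestore drops each collection that already exists on the target before restoring it from the backup, so that the restored data matches the backup.

  Verify: log in to 192.168.0.52:27017 and check whether the testdb database has been restored.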

[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin 
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("af96cb64-a2a4-4d59-b60a-86ccbbe77e3e") }
MongoDB server version: 4.4.1
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
        //docs.mongodb.com/
Questions? Try the MongoDB Developer Community Forums
        //community.mongodb.com
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

  Restore a single database

  Drop the testdb database

> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
> 

  Restore the testdb database with mongorestore

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --drop ./node12_testdb/testdb/
2020-11-15T22:29:03.718+0800    The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2020-11-15T22:29:03.718+0800    building a list of collections to restore from node12_testdb/testdb dir
2020-11-15T22:29:03.719+0800    reading metadata for testdb.test from node12_testdb/testdb/test.metadata.json
2020-11-15T22:29:03.736+0800    restoring testdb.test from node12_testdb/testdb/test.bson
2020-11-15T22:29:03.755+0800    no indexes to restore
2020-11-15T22:29:03.755+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:29:03.755+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin 
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("f5e73939-bb87-4d45-bf80-9ff1e7f6f15d") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show tables
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

  Restore a single collection

  Drop the test collection in testdb

> db
testdb
> show collections
test
> db.test.drop()
true
> show collections
> 

  Restore the test collection in testdb with mongorestore

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test --drop ./node12_testdb_test-collection/testdb/test.bson 
2020-11-15T22:36:15.615+0800    checking for collection data in node12_testdb_test-collection/testdb/test.bson
2020-11-15T22:36:15.616+0800    reading metadata for testdb.test from node12_testdb_test-collection/testdb/test.metadata.json
2020-11-15T22:36:15.625+0800    restoring testdb.test from node12_testdb_test-collection/testdb/test.bson
2020-11-15T22:36:15.669+0800    no indexes to restore
2020-11-15T22:36:15.669+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:36:15.669+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("27d15d9e-3fdf-4efc-b871-1ec6716e51e3") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

  Restore a database from compressed files

  Drop the testdb database

> db
testdb
> db.dropDatabase()
{ "dropped" : "testdb", "ok" : 1 }
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
> 

  Restore the database from the compressed files with mongorestore

[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb --gzip --drop ./node12_mongodb_testdb-gzip/testdb/
2020-11-15T22:39:55.313+0800    The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2020-11-15T22:39:55.313+0800    building a list of collections to restore from node12_mongodb_testdb-gzip/testdb dir
2020-11-15T22:39:55.314+0800    reading metadata for testdb.test from node12_mongodb_testdb-gzip/testdb/test.metadata.json.gz
2020-11-15T22:39:55.321+0800    restoring testdb.test from node12_mongodb_testdb-gzip/testdb/test.bson.gz
2020-11-15T22:39:55.332+0800    no indexes to restore
2020-11-15T22:39:55.332+0800    finished restoring testdb.test (1000 documents, 0 failures)
2020-11-15T22:39:55.332+0800    1000 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("73d98c33-f8f7-40e3-89bd-fda8c702e407") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
test
> db.test.count()
1000
> db.test.findOne()
{
        "_id" : ObjectId("5fb130da012870b3c8e3c4ad"),
        "id" : 1,
        "name" : "test1",
        "age" : 1,
        "classes" : 1
}
> 

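  Note: to restore a single database with mongorestore, name it with the -d option; to restore a single collection, add -c with the collection name; to restore from compressed files, add the --gzip option. In short, whatever options were used for the backup should be matched at restore time, and they are essentially the same options that mongodump takes.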
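  The restore logs above warn that -d and -c are deprecated for this use and point to --nsInclude instead. The sketch below prints the equivalent commands in that form; build_restore_cmd is a hypothetical helper name, and the commands are printed rather than executed. Note that in this form mongorestore is pointed at the root of the dump directory, not at the per-database subdirectory.

```shell
#!/bin/sh
# Sketch: print mongorestore commands that select namespaces with --nsInclude
# instead of the deprecated -d/-c form. build_restore_cmd is a hypothetical name.
build_restore_cmd() {
    host="$1"       # e.g. 192.168.0.52:27017
    user="$2"       # e.g. tom
    ns="$3"         # namespace pattern: 'testdb.*' or 'testdb.test'
    dumpdir="$4"    # root of the dump, e.g. ./node12_testdb
    printf "mongorestore --host %s --username %s --authenticationDatabase admin --nsInclude '%s' --drop %s\n" \
        "$host" "$user" "$ns" "$dumpdir"
}

# Whole database, then a single collection.
build_restore_cmd 192.168.0.52:27017 tom 'testdb.*' ./node12_testdb
build_restore_cmd 192.168.0.52:27017 tom 'testdb.test' ./node12_testdb_test-collection
```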

  Exporting data with mongoexport

  Create a peoples database and insert data into its peoples_info collection

> use peoples
switched to db peoples
> for(i=1;i<=10000;i++) db.peoples_info.insert({id:i,name:"peoples"+i,age:(i%120),classes:(i%25)})
WriteResult({ "nInserted" : 1 })
> show dbs
admin    0.000GB
config   0.000GB
local    0.000GB
peoples  0.000GB
testdb   0.000GB
> db.peoples_info.count()
10000
> db.peoples_info.findOne()
{
        "_id" : ObjectId("5fb13f35012870b3c8e3c895"),
        "id" : 1,
        "name" : "peoples1",
        "age" : 1,
        "classes" : 1
}
> 

  Export the peoples_info collection of the peoples database with mongoexport

[root@node11 ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type json -o ./peoples-peopels_info.json
2020-11-15T22:54:18.287+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:54:18.370+0800    exported 10000 records
[root@node11 ~]# ll
total 1004
-rw-r--r-- 1 root root 1024609 Nov 15 22:54 peoples-peopels_info.json
[root@node11 ~]# head -n 1 peoples-peopels_info.json 
{"_id":{"$oid":"5fb13f35012870b3c8e3c895"},"id":1.0,"name":"peoples1","age":1.0,"classes":1.0}
[root@node11 ~]# 

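  Note: --type sets the format of the exported file; the default is json, and csv can be specified as well. Note also that mongoexport requires both a database and a collection: it cannot export all collections of a database in one run, only one collection at a time.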
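  Because mongoexport handles one collection per run, exporting a whole database means one command per collection. A minimal sketch follows, assuming the collection names are already known and passed in by hand; export_cmds is a hypothetical helper, the file naming is an assumption, and the commands are only printed.

```shell
#!/bin/sh
# Sketch: print one mongoexport command per collection of a database.
# export_cmds and the ./<db>-<collection>.json naming are illustrative assumptions.
export_cmds() {
    host="$1"
    db="$2"
    shift 2
    # Remaining arguments are collection names.
    for coll in "$@"; do
        printf 'mongoexport --host %s --username tom --authenticationDatabase admin --db %s --collection %s --type json --out ./%s-%s.json\n' \
            "$host" "$db" "$coll" "$db" "$coll"
    done
}

# One command for each collection named on the command line.
export_cmds 192.168.0.52:27017 peoples peoples_info
```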

  Export a CSV data file

[root@node11 ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type csv -o ./peoples-peopels_info.csv
2020-11-15T22:58:30.495+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:58:30.498+0800    Failed: CSV mode requires a field list
[root@node11 ~]# mongoexport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d peoples -c peoples_info --type csv -f id,name,age -o ./peoples-peopels_info.csv  
2020-11-15T22:59:26.090+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T22:59:26.143+0800    exported 10000 records
[root@node11 ~]# head -n 1 ./peoples-peopels_info.csv
id,name,age
[root@node11 ~]# head  ./peoples-peopels_info.csv    
id,name,age
1,peoples1,1
2,peoples2,2
3,peoples3,3
4,peoples4,4
5,peoples5,5
6,peoples6,6
7,peoples7,7
8,peoples8,8
9,peoples9,9
[root@node11 ~]# 

  Note: when the export format is csv, the fields to export must be listed with the -f option, separated by commas.
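  A quick sanity check after a CSV export is to compare the number of data rows with the "exported N records" line in the log. The sketch below counts rows with the header excluded; csv_row_count is a hypothetical helper, and it assumes no field contains an embedded newline.

```shell
#!/bin/sh
# Sketch: count the data rows of a CSV file written by mongoexport.
# csv_row_count is a hypothetical helper name.
csv_row_count() {
    # tail -n +2 skips the header line; tr strips padding some wc versions add.
    tail -n +2 "$1" | wc -l | tr -d ' '
}

# Small sample in the same shape as the export above.
sample=$(mktemp)
printf 'id,name,age\n1,peoples1,1\n2,peoples2,2\n3,peoples3,3\n' > "$sample"
csv_row_count "$sample"    # prints 3
rm -f "$sample"
```

  Run against ./peoples-peopels_info.csv, the result should match the record count reported by mongoexport.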

  Import the data into MongoDB on node11

  Import JSON data

[root@node11 ~]# systemctl start mongod.service 
[root@node11 ~]# ss -tnl
State      Recv-Q Send-Q         Local Address:Port                        Peer Address:Port              
LISTEN     0      128                        *:22                                     *:*                  
LISTEN     0      100                127.0.0.1:25                                     *:*                  
LISTEN     0      128                127.0.0.1:27017                                  *:*                  
LISTEN     0      128                       :::22                                    :::*                  
LISTEN     0      100                      ::1:25                                    :::*                  
[root@node11 ~]# ll
total 1200
-rw-r--r-- 1 root root  198621 Nov 15 22:59 peoples-peopels_info.csv
-rw-r--r-- 1 root root 1024609 Nov 15 22:54 peoples-peopels_info.json
[root@node11 ~]# mongoimport  -d testdb -c peoples_info --drop peoples-peopels_info.json 
2020-11-15T23:05:03.004+0800    connected to: mongodb://localhost/
2020-11-15T23:05:03.005+0800    dropping: testdb.peoples_info
2020-11-15T23:05:03.186+0800    10000 document(s) imported successfully. 0 document(s) failed to import.
[root@node11 ~]#

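  Note: on import, any target database and collection name can be chosen.

  Verify: does testdb on node11 now contain a peoples_info collection, and does the collection hold data?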

[root@node11 ~]# mongo
MongoDB shell version v4.4.1
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("4e3a00b0-8367-4b3a-9a77-e61d03bb1b3d") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T23:03:39.669+08:00: ***** SERVER RESTARTED *****
        2020-11-15T23:03:40.681+08:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
        2020-11-15T23:03:40.681+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T23:03:40.681+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
testdb  0.000GB
> use testdb
switched to db testdb
> show collections
peoples_info
> db.peoples_info.count()
10000
> db.peoples_info.findOne()
{
        "_id" : ObjectId("5fb13f35012870b3c8e3c895"),
        "id" : 1,
        "name" : "peoples1",
        "age" : 1,
        "classes" : 1
}
> 

  Import the CSV data into the test1 collection of testdb on node12

[root@node11 ~]# mongoimport -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin -d testdb -c test1 --type csv --headerline --file ./peoples-peopels_info.csv 
2020-11-15T23:11:42.595+0800    connected to: mongodb://192.168.0.52:27017/
2020-11-15T23:11:42.692+0800    10000 document(s) imported successfully. 0 document(s) failed to import.
[root@node11 ~]#

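  Note: importing CSV data requires the type to be given explicitly as csv; --headerline tells mongoimport to use the first line as the field names instead of importing it as a document; --file names the CSV file to import.

  Verify: log in to MongoDB on node12 and check whether testdb contains a test1 collection and whether that collection holds data.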

[root@node11 ~]# mongo -utom -p123456 192.168.0.52:27017/admin
MongoDB shell version v4.4.1
connecting to: mongodb://192.168.0.52:27017/admin?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("72a07318-ac04-46f9-a310-13b1241d2f77") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-11-15T20:42:23.774+08:00: ***** SERVER RESTARTED *****
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-11-15T20:42:29.198+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> show dbs
admin    0.000GB
config   0.000GB
local    0.000GB
peoples  0.000GB
testdb   0.000GB
> use testdb
switched to db testdb
> show collections
test
test1
> db.test1.count()
10000
> db.test1.findOne()
{
        "_id" : ObjectId("5fb1452ef09b563b65405f7c"),
        "id" : 1,
        "name" : "peoples1",
        "age" : 1
}
> 

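  Note: the documents in the test1 collection of testdb have no classes field. That is because classes was not listed among the exported fields, so the imported data naturally lacks it as well. That covers the use and testing of mongodump/mongorestore and mongoexport/mongoimport.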

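  Full backup plus oplog: restoring MongoDB to a point in time

  When backing up with mongodump, the --oplog option captures the oplog entries for the changes made between the start and the end of the dump. The oplog is the log of write operations on MongoDB collections, similar to the binlog in MySQL; it exists only on replica set members. Replaying it restores the changes made during the backup as well, so the restored data covers everything up to the moment the dump finished.

  Simulate a backup taken while data is being inserted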

test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> for(i=1;i<=1000000;i++) db.test3.insert({id:i,name:"test3-oplog"+i,commit:"test3"+i})

  

  Meanwhile, back up the data from another terminal

[root@node11 ~]# rm -rf *
[root@node11 ~]# ll
total 0
[root@node11 ~]#  mongodump -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --oplog -o ./alldatabase
2020-11-15T23:51:40.606+0800    writing admin.system.users to alldatabase/admin/system.users.bson
2020-11-15T23:51:40.606+0800    done dumping admin.system.users (4 documents)
2020-11-15T23:51:40.607+0800    writing admin.system.version to alldatabase/admin/system.version.bson
2020-11-15T23:51:40.608+0800    done dumping admin.system.version (2 documents)
2020-11-15T23:51:40.609+0800    writing testdb.test1 to alldatabase/testdb/test1.bson
2020-11-15T23:51:40.611+0800    writing testdb.test3 to alldatabase/testdb/test3.bson
2020-11-15T23:51:40.612+0800    writing testdb.test to alldatabase/testdb/test.bson
2020-11-15T23:51:40.612+0800    writing peoples.peoples_info to alldatabase/peoples/peoples_info.bson
2020-11-15T23:51:40.696+0800    done dumping peoples.peoples_info (10000 documents)
2020-11-15T23:51:40.761+0800    done dumping testdb.test3 (54167 documents)
2020-11-15T23:51:40.803+0800    done dumping testdb.test (31571 documents)
2020-11-15T23:51:40.966+0800    done dumping testdb.test1 (79830 documents)
2020-11-15T23:51:40.972+0800    writing captured oplog to 
2020-11-15T23:51:40.980+0800            dumped 916 oplog entries
[root@node11 ~]# ll
total 0
drwxr-xr-x 5 root root 66 Nov 15 23:51 alldatabase
[root@node11 ~]# tree alldatabase/
alldatabase/
├── admin
│   ├── system.users.bson
│   ├── system.users.metadata.json
│   ├── system.version.bson
│   └── system.version.metadata.json
├── oplog.bson
├── peoples
│   ├── peoples_info.bson
│   └── peoples_info.metadata.json
└── testdb
    ├── test1.bson
    ├── test1.metadata.json
    ├── test3.bson
    ├── test3.metadata.json
    ├── test.bson
    └── test.metadata.json

3 directories, 13 files
[root@node11 ~]# 

  Note: the backup now contains an additional oplog.bson file.

  Inspect the first and the last record in oplog.bson

[root@node11 ~]# ls
alldatabase
[root@node11 ~]# cd alldatabase/
[root@node11 alldatabase]# ls
admin  oplog.bson  peoples  testdb
[root@node11 alldatabase]# bsondump oplog.bson > /tmp/oplog.bson.tmp
2020-11-15T23:55:04.801+0800    916 objects found
[root@node11 alldatabase]# head -n 1 /tmp/oplog.bson.tmp
{"op":"i","ns":"testdb.test3","ui":{"$binary":{"base64":"7PmE47CASOiQZt5sMGDZKw==","subType":"04"}},"o":{"_id":{"$oid":"5fb14e8c01fff06b2b50a2ac"},"id":{"$numberDouble":"54101.0"},"name":"test3-oplog54101","commit":"test354101"},"ts":{"$timestamp":{"t":1605455500,"i":1880}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605455500608"}},"v":{"$numberLong":"2"}}
[root@node11 alldatabase]# tail -n 1 /tmp/oplog.bson.tmp
{"op":"i","ns":"testdb.test3","ui":{"$binary":{"base64":"7PmE47CASOiQZt5sMGDZKw==","subType":"04"}},"o":{"_id":{"$oid":"5fb14e8c01fff06b2b50a63f"},"id":{"$numberDouble":"55016.0"},"name":"test3-oplog55016","commit":"test355016"},"ts":{"$timestamp":{"t":1605455500,"i":2795}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605455500961"}},"v":{"$numberLong":"2"}}
[root@node11 alldatabase]# 

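  Note: the oplog holds the inserts for ids 54101 through 55016, which shows that the data kept changing from the start of the dump until its end, so the dumped collection files on their own are an intermediate state. Also note that mongodump --oplog cannot be combined with a database selection: the oplog covers all databases rather than a single one, so --oplog only works when all databases are backed up.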
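  The id range and the "dumped 916 oplog entries" line can be cross-checked with a little shell arithmetic. This is only a sketch: oplog_doc_id is a hypothetical helper, and the two sample lines are abbreviated versions of the first and last records shown above, keeping only the fields the pattern needs.

```shell
#!/bin/sh
# Sketch: pull the inserted document's "id" out of a bsondump line and check
# the id range against the number of dumped oplog entries.
# oplog_doc_id is a hypothetical helper name.
oplog_doc_id() {
    # Matches "id":{"$numberDouble":"N.0"}; the "_id" field does not match.
    sed -E 's/.*"id":\{"\$numberDouble":"([0-9]+)\.0"\}.*/\1/'
}

# Abbreviated first and last oplog records.
first='{"op":"i","ns":"testdb.test3","o":{"id":{"$numberDouble":"54101.0"},"name":"test3-oplog54101"}}'
last='{"op":"i","ns":"testdb.test3","o":{"id":{"$numberDouble":"55016.0"},"name":"test3-oplog55016"}}'

a=$(printf '%s\n' "$first" | oplog_doc_id)
b=$(printf '%s\n' "$last" | oplog_doc_id)
echo "$(( b - a + 1 )) oplog entries expected"    # prints: 916 oplog entries expected
```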

  Drop the testdb database, then restore from the dump we just took

test_replset:PRIMARY> show dbs
admin    0.000GB
config   0.000GB
local    0.014GB
peoples  0.000GB
testdb   0.019GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> db.dropDatabase()
{
        "dropped" : "testdb",
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1605456134, 4),
                "signature" : {
                        "hash" : BinData(0,"cRAdXcUj5c48Q77rCJ1DeeF10u8="),
                        "keyId" : NumberLong("6895378399531892740")
                }
        },
        "operationTime" : Timestamp(1605456134, 4)
}
test_replset:PRIMARY> show dbs
admin    0.000GB
config   0.000GB
local    0.014GB
peoples  0.000GB
test_replset:PRIMARY> 

  Restore the data with mongorestore

[root@node11 ~]# ls
alldatabase
[root@node11 ~]# mongorestore -utom -p123456 -h 192.168.0.52:27017 --authenticationDatabase admin --oplogReplay --drop ./alldatabase/
2020-11-16T00:06:32.049+0800    preparing collections to restore from
2020-11-16T00:06:32.053+0800    reading metadata for testdb.test1 from alldatabase/testdb/test1.metadata.json
2020-11-16T00:06:32.060+0800    reading metadata for testdb.test3 from alldatabase/testdb/test3.metadata.json
2020-11-16T00:06:32.064+0800    reading metadata for testdb.test from alldatabase/testdb/test.metadata.json
2020-11-16T00:06:32.064+0800    restoring testdb.test1 from alldatabase/testdb/test1.bson
2020-11-16T00:06:32.074+0800    restoring testdb.test3 from alldatabase/testdb/test3.bson
2020-11-16T00:06:32.093+0800    restoring testdb.test from alldatabase/testdb/test.bson
2020-11-16T00:06:32.098+0800    reading metadata for peoples.peoples_info from alldatabase/peoples/peoples_info.metadata.json
2020-11-16T00:06:32.110+0800    restoring peoples.peoples_info from alldatabase/peoples/peoples_info.bson
2020-11-16T00:06:32.333+0800    no indexes to restore
2020-11-16T00:06:32.333+0800    finished restoring peoples.peoples_info (10000 documents, 0 failures)
2020-11-16T00:06:32.766+0800    no indexes to restore
2020-11-16T00:06:32.766+0800    finished restoring testdb.test (31571 documents, 0 failures)
2020-11-16T00:06:33.023+0800    no indexes to restore
2020-11-16T00:06:33.023+0800    finished restoring testdb.test3 (54167 documents, 0 failures)
2020-11-16T00:06:33.370+0800    no indexes to restore
2020-11-16T00:06:33.370+0800    finished restoring testdb.test1 (79830 documents, 0 failures)
2020-11-16T00:06:33.370+0800    restoring users from alldatabase/admin/system.users.bson
2020-11-16T00:06:33.416+0800    replaying oplog
2020-11-16T00:06:33.850+0800    applied 916 oplog entries
2020-11-16T00:06:33.850+0800    175568 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# 

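  Note: the restore needs the --oplogReplay option to replay the contents of oplog.bson. The restore log shows that 916 oplog entries were applied; in other words, 916 write operations took place between the start and the end of the dump.

  Verify: connect to the database and check how many documents were restored into the test3 collection of testdb.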

test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> show tables
test
test1
test3
test_replset:PRIMARY> db.test3.count()
55016
test_replset:PRIMARY> 

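  Note: the test3 collection now holds 55016 documents, which matches the id of the last record in oplog.bson.

  Backing up oplog.rs to restore to a specific point in time

  To make the effect easy to see, I emptied the database again and disabled authentication.

  Insert data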

test_replset:PRIMARY> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> for(i=1;i<=100000;i++) db.test.insert({id:(i+10000),name:"test-oplog"+i,commit:"test"+i})

  Back up at the same time, this time without the --oplog option

[root@node11 ~]# ll
total 0
[root@node11 ~]# mongodump -h node12:27017  -o ./alldatabase
2020-11-16T09:38:00.921+0800	writing admin.system.version to alldatabase/admin/system.version.bson
2020-11-16T09:38:00.923+0800	done dumping admin.system.version (1 document)
2020-11-16T09:38:00.924+0800	writing testdb.test to alldatabase/testdb/test.bson
2020-11-16T09:38:00.960+0800	done dumping testdb.test (16377 documents)
[root@node11 ~]# ll 
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
[root@node11 ~]# tree ./alldatabase
./alldatabase
├── admin
│   ├── system.version.bson
│   └── system.version.metadata.json
└── testdb
    ├── test.bson
    └── test.metadata.json

2 directories, 4 files
[root@node11 ~]# 

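  Note: the dump ran while documents were still being inserted. The dump log above shows that only 16377 documents of testdb.test were backed up, which is clearly just part of the collection; once the insert loop finishes, testdb.test should contain 100000 documents.

  Verify: does testdb.test contain 100000 documents?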

test_replset:PRIMARY> db
testdb
test_replset:PRIMARY> show collections
test
test_replset:PRIMARY> db.test.count()
100000
test_replset:PRIMARY> 

  Simulate an accidental deletion of every document in testdb.test

test_replset:PRIMARY> db
testdb
test_replset:PRIMARY> show collections
test
test_replset:PRIMARY> db.test.remove({})
WriteResult({ "nRemoved" : 100000 })
test_replset:PRIMARY> 

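  Note: we have now accidentally deleted every document in testdb.test, and the earlier dump can bring back only part of the data. What can we do? We can dump the oplog.rs collection, which lives in the local database and is where the oplog is stored.

  Back up the oplog.rs collection in the local database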

[root@node11 ~]# ll
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
[root@node11 ~]# mongodump -h node12:27017 -d local -c oplog.rs -o ./oplog-rs
2020-11-16T09:43:38.594+0800	writing local.oplog.rs to oplog-rs/local/oplog.rs.bson
2020-11-16T09:43:38.932+0800	done dumping local.oplog.rs (200039 documents)
[root@node11 ~]# ll
total 0
drwxr-xr-x 4 root root 33 Nov 16 09:38 alldatabase
drwxr-xr-x 3 root root 19 Nov 16 09:43 oplog-rs
[root@node11 ~]# tree ./oplog-rs
./oplog-rs
└── local
    ├── oplog.rs.bson
    └── oplog.rs.metadata.json

1 directory, 2 files
[root@node11 ~]# 

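  Note: the oplog is stored in the oplog.rs collection of the local database, and the command above dumps all of it. We now have an oplog to work with, but we cannot replay it as is: the accidental deletes would be replayed too, which would defeat the purpose. We first need to find the point in time of the mistake and then restore up to it.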
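  Before picking a cut-off, it can help to see what the dumped oplog contains. The following is a minimal sketch and not part of the original walkthrough: the function name oplog_op_counts is hypothetical, and it assumes bsondump prints one compact JSON document per line starting with {"op":"..." (the format shown in the next step); other bsondump versions may lay out the lines differently.

```shell
# Count oplog entries per operation type for one namespace.
# Reads bsondump JSON lines on stdin; $1 is the namespace, e.g. testdb.test.
# Assumption: each line starts with {"op":"<type>" (compact bsondump output).
oplog_op_counts() {
    grep -F "\"ns\":\"$1\"" |
        sed -n 's/^{"op":"\([a-z]*\)".*/\1/p' |
        sort | uniq -c | awk '{print $2 "=" $1}'
}

# Usage (needs the oplog dump made above):
#   bsondump oplog-rs/local/oplog.rs.bson 2>/dev/null | oplog_op_counts testdb.test
```

  Each output line has the form type=count, where i stands for inserts and d for deletes; a large block of d entries on the affected namespace is what we are looking for.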

  Find the time of the accidental delete in the oplog

[root@node11 ~]# bsondump oplog-rs/local/oplog.rs.bson |egrep "\"op\":\"d\""|head -n 3
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa146"}},"ts":{"$timestamp":{"t":1605490915,"i":1}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915399"}},"v":{"$numberLong":"2"}}
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa147"}},"ts":{"$timestamp":{"t":1605490915,"i":2}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915400"}},"v":{"$numberLong":"2"}}
{"op":"d","ns":"testdb.test","ui":{"$binary":{"base64":"C3FH7g1eSWWwHd2AZEJhiw==","subType":"04"}},"o":{"_id":{"$oid":"5fb1d7eb25343e833cbaa148"}},"ts":{"$timestamp":{"t":1605490915,"i":3}},"t":{"$numberLong":"1"},"wall":{"$date":{"$numberLong":"1605490915400"}},"v":{"$numberLong":"2"}}
2020-11-16T09:46:20.363+0800	100074 objects found
2020-11-16T09:46:20.363+0800	write /dev/stdout: broken pipe
[root@node11 ~]# 

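  Note: to get back to the state just before the first delete, take the $timestamp value from the first entry, {"t":1605490915,"i":1}; this is the timestamp of the first delete operation. Passed to --oplogLimit as 1605490915:1, it makes mongorestore skip every oplog entry at or after that timestamp.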
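  Reading the timestamp off the screen works, but it can also be extracted mechanically. This is a hedged sketch only: first_delete_limit is a hypothetical helper name, and the sed pattern assumes the compact "ts":{"$timestamp":{"t":...,"i":...}} layout seen in the bsondump output above.

```shell
# Print the "t:i" timestamp of the first delete on a namespace,
# in the form expected by mongorestore --oplogLimit.
# Reads bsondump JSON lines on stdin; $1 is the namespace, e.g. testdb.test.
first_delete_limit() {
    grep -F '"op":"d"' | grep -F "\"ns\":\"$1\"" | head -n 1 |
        sed -n 's/.*"ts":{"\$timestamp":{"t":\([0-9]*\),"i":\([0-9]*\)}}.*/\1:\2/p'
}

# Usage (stderr is silenced because head closes the pipe early):
#   bsondump oplog-rs/local/oplog.rs.bson 2>/dev/null | first_delete_limit testdb.test
```

  For the three entries shown above, the first match is the one with "t":1605490915 and "i":1, so the helper prints 1605490915:1.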

  Copy oplog.rs.bson into the dump directory as oplog.bson, mimicking a dump taken with the --oplog option

[root@node11 ~]# cp ./oplog-rs/local/oplog.rs.bson ./alldatabase/oplog.bson
[root@node11 ~]# 

  Then restore with mongorestore, replaying the oplog only up to the point just before the first delete

[root@node11 ~]# mongorestore -h node12:27017 --oplogReplay  --oplogLimit "1605490915:1" --drop ./alldatabase/
2020-11-16T09:51:19.658+0800	preparing collections to restore from
2020-11-16T09:51:19.668+0800	reading metadata for testdb.test from alldatabase/testdb/test.metadata.json
2020-11-16T09:51:19.693+0800	restoring testdb.test from alldatabase/testdb/test.bson
2020-11-16T09:51:19.983+0800	no indexes to restore
2020-11-16T09:51:19.983+0800	finished restoring testdb.test (16377 documents, 0 failures)
2020-11-16T09:51:19.983+0800	replaying oplog
2020-11-16T09:51:22.657+0800	oplog  537KB
2020-11-16T09:51:25.657+0800	oplog  1.12MB
2020-11-16T09:51:28.657+0800	oplog  1.72MB
2020-11-16T09:51:31.657+0800	oplog  2.32MB
2020-11-16T09:51:34.657+0800	oplog  2.92MB
2020-11-16T09:51:37.657+0800	oplog  3.51MB
2020-11-16T09:51:40.657+0800	oplog  4.11MB
2020-11-16T09:51:43.657+0800	oplog  4.71MB
2020-11-16T09:51:46.657+0800	oplog  5.30MB
2020-11-16T09:51:49.657+0800	oplog  5.90MB
2020-11-16T09:51:52.657+0800	oplog  6.46MB
2020-11-16T09:51:55.657+0800	oplog  7.04MB
2020-11-16T09:51:58.657+0800	oplog  7.61MB
2020-11-16T09:52:01.657+0800	oplog  8.20MB
2020-11-16T09:52:04.657+0800	oplog  8.77MB
2020-11-16T09:52:07.657+0800	oplog  9.36MB
2020-11-16T09:52:10.657+0800	oplog  9.96MB
2020-11-16T09:52:13.657+0800	oplog  10.6MB
2020-11-16T09:52:16.656+0800	oplog  11.2MB
2020-11-16T09:52:19.657+0800	oplog  11.8MB
2020-11-16T09:52:22.657+0800	oplog  12.4MB
2020-11-16T09:52:25.657+0800	oplog  13.0MB
2020-11-16T09:52:28.657+0800	oplog  13.6MB
2020-11-16T09:52:31.657+0800	oplog  14.2MB
2020-11-16T09:52:34.657+0800	oplog  14.8MB
2020-11-16T09:52:37.657+0800	oplog  15.4MB
2020-11-16T09:52:40.657+0800	oplog  16.0MB
2020-11-16T09:52:43.657+0800	oplog  16.6MB
2020-11-16T09:52:46.657+0800	oplog  17.2MB
2020-11-16T09:52:49.657+0800	oplog  17.8MB
2020-11-16T09:52:52.433+0800	skipping applying the config.system.sessions namespace in applyOps
2020-11-16T09:52:52.433+0800	applied 100008 oplog entries
2020-11-16T09:52:52.433+0800	oplog  18.4MB
2020-11-16T09:52:52.433+0800	16377 document(s) restored successfully. 0 document(s) failed to restore.
[root@node11 ~]# 

  Note: the restore log above shows that 100008 oplog entries were applied and that the 16377 dumped documents were restored successfully.
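  The two manual steps (copying the oplog into place, then restoring with a cut-off) can be summarized in one helper. This is a sketch under stated assumptions: pitr_plan is a hypothetical name, and it deliberately prints the commands instead of running them, because --drop is destructive and the paths should be reviewed first.

```shell
# Print (not run) the two commands for a point-in-time restore.
# $1 dump directory, $2 dumped oplog.rs.bson, $3 host:port, $4 cut-off as "t:i".
pitr_plan() {
    # Step 1: place the oplog where mongorestore --oplogReplay looks for it.
    printf 'cp %s %s/oplog.bson\n' "$2" "$1"
    # Step 2: restore the dump, then replay the oplog up to the cut-off.
    printf 'mongorestore -h %s --oplogReplay --oplogLimit "%s" --drop %s\n' "$3" "$4" "$1"
}

pitr_plan ./alldatabase ./oplog-rs/local/oplog.rs.bson node12:27017 1605490915:1
```

  Once the printed commands look right, they can be pasted into the shell or piped to sh.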

  Verify: has testdb.test been restored, and how many documents does it contain?

test_replset:PRIMARY> show dbs
admin   0.000GB
config  0.000GB
local   0.010GB
testdb  0.004GB
test_replset:PRIMARY> use testdb
switched to db testdb
test_replset:PRIMARY> show tables
test
test_replset:PRIMARY> db.test.count()
100000
test_replset:PRIMARY> 

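  Note: testdb.test is back to 100000 documents. Replaying the oplog did not duplicate the 16377 documents already restored from the dump, since oplog entries are designed to be idempotent.

  That concludes this hands-on look at MongoDB backup and restore.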