This post was last edited by desehawk on 2015-3-13 18:18
Reading guide
1. How does MongoDB implement incremental backup?
2. How do you start mongobackup's streaming backup?
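mongobackup is an incremental backup and restore tool for MongoDB. For a restore, it has to be used together with a full backup and restore. It comes from the duoyun MongoDB forum (http://duoyun.org/topic/52a91d844888b88743179c5e) and has not yet been open-sourced. The tool reads the oplog of the target mongo instance in real time and stores it in files in BSON format; to restore data, it replays the oplog entries in those BSON files. This is similar to mongodump and mongorestore, the backup and restore tools that ship with MongoDB, but mongobackup accepts timestamps for both backup and restore, so it can back up and restore only the data from a specified time range, which is what makes incremental backup possible.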
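Since mongobackup is not open source, its internals are unknown; the following Python sketch only illustrates the general idea of selecting oplog entries by a timestamp window. The `OplogTs` type, the `select_window` helper, the sample entries, and the inclusive bounds are all assumptions made for illustration. The only thing taken from the post is the `seconds,increment` timestamp form used with `-s` and `-t`.

```python
from typing import Iterable, Iterator, NamedTuple


class OplogTs(NamedTuple):
    """A BSON-style timestamp: epoch seconds plus an ordinal within that second."""
    seconds: int
    increment: int


def parse_ts(text: str) -> OplogTs:
    """Parse the 'seconds,increment' form, e.g. '1420446193,1'."""
    seconds, increment = text.split(",")
    return OplogTs(int(seconds), int(increment))


def select_window(entries: Iterable[dict], start: OplogTs, end: OplogTs) -> Iterator[dict]:
    """Yield entries with start <= ts <= end (inclusive bounds are an assumption)."""
    for entry in entries:
        # NamedTuples compare field by field: seconds first, then increment.
        if start <= entry["ts"] <= end:
            yield entry


# Hypothetical oplog entries; real ones carry more fields (ns, h, v, ...).
oplog = [
    {"ts": OplogTs(1420446192, 7), "op": "i", "o": {"_id": "user1"}},
    {"ts": OplogTs(1420446193, 1), "op": "i", "o": {"_id": "user2"}},
    {"ts": OplogTs(1420446500, 3), "op": "i", "o": {"_id": "user3"}},
    {"ts": OplogTs(1420446841, 2), "op": "i", "o": {"_id": "user4"}},
]

window = list(select_window(oplog, parse_ts("1420446193,1"), parse_ts("1420446841,1")))
print([entry["o"]["_id"] for entry in window])
```

Comparing the pair (seconds, increment) rather than seconds alone matters because several operations can share the same second and are ordered by the increment.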
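mongobackup does not yet have complete usage documentation, so this trial is meant to work out how the tool is used and to verify that it behaves correctly.
The procedure is as follows: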
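1) Use YCSB to load 10 million (1000W) records into the mongo instance;
2) While the load is running, run mongodump to back up the data loaded so far; before running mongodump, start mongobackup so that it records the instance's oplog in real time;
3) After the load finishes, stop mongobackup's real-time oplog recording;
4) Use mongorestore to restore the partial set of records that mongodump backed up;
5) Use mongobackup to restore the data written between the start of mongodump and the end of the load, adding what mongodump did not back up.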
Each step was run according to this procedure; the results of the whole test are recorded below:
1. Start YCSB and begin loading data
- ./bin/ycsb load mongodb -threads 100 -P workloads/readandupdateandinsert1 > load.result
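2. When about half the data has been loaded, start mongobackup's streaming backup to begin recording the mongo instance's oplog, with a starting timestamp of 1420446193,1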
- [frankey@mongo-server3 ~]$ ./mongobackup --port 27017 --host mongo-server1 --backup --stream -s 1420446193,1
- connected to: mongo-server1:27017
- Mon Jan 5 16:24:41.188 local.oplog.$main to backup/oplog.bson
- Mon Jan 5 16:24:44.003 Backup Progress: 221800/197606 112% (objects)
- ...
3. With mongobackup recording the oplog, start mongodump to back up part of the data. The output shows that mongodump backed up 5224511 records in total
- [frankey@mongo-server3]$ mongodump --host mongo-server1 --port 27017 -o /data/mongobackup/mongo-server1/27017
- connected to: mongo-server1:27017
- Mon Jan 5 16:25:15.168 all dbs
- Mon Jan 5 16:25:15.169 DATABASE: ycsb to /data/mongobackup/mongo-server1/27017/ycsb
- Mon Jan 5 16:25:15.170 ycsb.system.indexes to /data/mongobackup/mongo-server1/27017/ycsb/system.indexes.bson
- Mon Jan 5 16:25:15.172 1 objects
- Mon Jan 5 16:25:15.172 ycsb.usertable to /data/mongobackup/mongo-server1/27017/ycsb/usertable.bson
- Mon Jan 5 16:25:18.005 Collection File Writing Progress: 469900/5169224 9% (objects)
- Mon Jan 5 16:25:21.038 Collection File Writing Progress: 1409300/5169224 27% (objects)
- Mon Jan 5 16:25:24.030 Collection File Writing Progress: 2348700/5169224 45% (objects)
- Mon Jan 5 16:25:27.036 Collection File Writing Progress: 3272500/5169224 63% (objects)
- Mon Jan 5 16:25:30.016 Collection File Writing Progress: 4650300/5169224 89% (objects)
- Mon Jan 5 16:25:31.764 5224511 objects
- Mon Jan 5 16:25:31.764 Metadata for ycsb.usertable to /data/mongobackup/mongo-server1/27017/ycsb/usertable.metadata.json
- Mon Jan 5 16:25:31.764 DATABASE: admin to /data/mongobackup/mongo-server1/27017/admin
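4. After the load finishes, stop mongobackup's oplog recording. The output shows that 5360433 oplog entries were recorded in total. At this point the backup files produced by mongodump contain part of the data, and the oplog backup files produced by mongobackup contain the incremental part; the two must be combined to obtain the full data set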
- Mon Jan 5 16:33:38.001 Backup Progress: 5268700/197606 2666% (objects)
- Mon Jan 5 16:33:41.005 Backup Progress: 5298600/197606 2681% (objects)
- Mon Jan 5 16:33:44.001 Backup Progress: 5327700/197606 2696% (objects)
- Mon Jan 5 16:33:47.001 Backup Progress: 5353900/197606 2709% (objects)
- Mon Jan 5 16:33:56.289 waiting for new ops ^_^
- ^CMon Jan 5 16:34:05.969 Received signal 2.
- Mon Jan 5 16:34:05.969 Will exit soon.
- Mon Jan 5 16:34:06.289 waiting for new ops ^_^
- Mon Jan 5 16:34:06.290 5360433 objects
- Mon Jan 5 16:34:06.290 Backuped up to ts: Timestamp 1420446841000|1
- Mon Jan 5 16:34:06.290 Use -s 1420446841,1 to resume.
- Mon Jan 5 16:34:06.290 Metadata for oplog000001 to backup/oplog.metadata.json
- Mon Jan 5 16:34:06.290 5360433 objects
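The `-s` values above are `seconds,increment` pairs, and the seconds part can be read as a Unix epoch time. As a quick sanity check, the sketch below converts the start and resume timestamps to wall-clock time. The `ts_to_datetime` helper is hypothetical, and the UTC+8 offset is an assumption inferred from the log times, not something the post states.

```python
from datetime import datetime, timedelta, timezone


def ts_to_datetime(ts: str, utc_offset_hours: int = 0) -> datetime:
    """Convert a 'seconds,increment' timestamp to a timezone-aware datetime."""
    seconds, _increment = ts.split(",")
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(int(seconds), tz)


# Assumption: the servers log in UTC+8.
start = ts_to_datetime("1420446193,1", utc_offset_hours=8)
end = ts_to_datetime("1420446841,1", utc_offset_hours=8)

print(start.isoformat())              # 2015-01-05T16:23:13+08:00
print(end.isoformat())                # 2015-01-05T16:34:01+08:00
print((end - start).total_seconds())  # 648.0
```

Under that assumption, the resume timestamp falls a few seconds before the 16:34:06 shutdown lines in the log, which is consistent with it marking the last oplog entry that was backed up.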
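5. Suppose the mongo instance now fails and all data is lost. Use mongorestore to import the most recent full backup into the database. The tool restores 5224511 records, and connecting to the mongo instance confirms that the collection contains 5224511 records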
- [frankey@mongo-server3]$ mongorestore --host mongo-server1 --port 27017 --drop /data/mongobackup/mongo-server1/27017
- connected to: mongo-server1:27017
- Mon Jan 5 16:36:01.022 /data/mongobackup/mongo-server1/27017/ycsb/usertable.bson
- Mon Jan 5 16:36:01.022 going into namespace [ycsb.usertable]
- Mon Jan 5 16:36:01.022 dropping
- Mon Jan 5 16:36:04.000 Progress: 18993257/1399539468 1% (bytes)
- Mon Jan 5 16:36:07.003 Progress: 37049030/1399539468 2% (bytes)
- Mon Jan 5 16:36:10.002 Progress: 55961917/1399539468 3% (bytes)
- ...
- Mon Jan 5 16:39:37.000 Progress: 1352035352/1399539468 96% (bytes)
- Mon Jan 5 16:39:40.004 Progress: 1371778875/1399539468 98% (bytes)
- Mon Jan 5 16:39:43.003 Progress: 1391583177/1399539468 99% (bytes)
- 5224511 objects found
- Mon Jan 5 16:39:44.213 Creating index: { name: "_id_", key: { _id: 1 }, ns: "ycsb.usertable" }
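6. Use mongobackup to replay the oplog and restore the incremental data written after the mongodump backup. The output shows that 3346277 records were replayed from oplog000000.bson and 2014156 from oplog000001.bson. Connecting to the mongo instance again shows 10000000 records in the collection, the same as the number actually loaded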
- [frankey@mongo-server3 ~]$ ./mongobackup --port 27017 --host mongo-server1 --recovery -s 1420446193,1 -t 1420446841,1
- connected to: mongo-server1:27017
- Mon Jan 5 16:43:40.903 Replaying file:oplog000000.bson
- Mon Jan 5 16:43:43.006 Progress: 9241298/1073742107 0% (bytes)
- Mon Jan 5 16:43:46.004 Progress: 22140667/1073742107 2% (bytes)
- Mon Jan 5 16:43:49.005 Progress: 35328590/1073742107 3% (bytes)
- ...
- Mon Jan 5 16:46:31.000 Progress: 1047213189/1073742107 97% (bytes)
- Mon Jan 5 16:46:34.003 Progress: 1068134387/1073742107 99% (bytes)
- 3346277 objects found
- Mon Jan 5 16:46:34.792 Replaying file:oplog000001.bson
- Mon Jan 5 16:46:37.003 Progress: 15658789/646295159 2% (bytes)
- Mon Jan 5 16:46:40.004 Progress: 37414205/646295159 5% (bytes)
- ...
- Mon Jan 5 16:48:07.005 Progress: 614799792/646295159 95% (bytes)
- Mon Jan 5 16:48:10.006 Progress: 635047026/646295159 98% (bytes)
- 2014156 objects found
- Mon Jan 5 16:48:11.677 Successfully Recovered.
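The dump holds 5224511 records and the oplog files hold 3346277 + 2014156 = 5360433 entries, which adds up to 10584944, more than the 10000000 records loaded. If every recorded entry were an insert from the load phase (an assumption), about 584944 entries would overlap with the dump. The final count can still be exact if replay is idempotent, meaning that re-applying an insert for an `_id` that already exists leaves the collection unchanged. The scaled-down Python sketch below illustrates that idea with a hypothetical `replay` function; it is not mongobackup's actual implementation.

```python
def replay(collection: dict, oplog: list) -> int:
    """Apply insert ops as upserts keyed by _id; return how many documents were new."""
    new_docs = 0
    for op in oplog:
        if op["op"] != "i":
            continue  # this sketch only models inserts
        doc = op["o"]
        if doc["_id"] not in collection:
            new_docs += 1
        collection[doc["_id"]] = doc  # existing _id: overwritten, count unchanged
    return new_docs


# Scaled-down stand-in for the test: 1000 documents loaded in _id order.
total = 1000
all_inserts = [{"op": "i", "o": {"_id": i}} for i in range(total)]

oplog = all_inserts[450:]                       # oplog capture starts mid-load
restored = {i: {"_id": i} for i in range(520)}  # dump taken slightly later

overlap = len(restored) + len(oplog) - total
new_docs = replay(restored, oplog)
print(overlap, new_docs, len(restored))
```

Here 70 oplog entries duplicate documents already in the dump, and the collection still ends up with exactly 1000 documents after replay.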
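7. Run a further YCSB stress test to verify the integrity of the restored data. The test executed 10 million (1000W) operations: 90% read, 5% update and 5% insert. The results show that all update and insert operations succeeded, while 12 read operations failed. Whether those read failures were caused by a problem with the restore cannot be determined for now, and YCSB gives no more detailed reason for them, so whether mongodump + mongorestore + mongobackup is a reliable way to do incremental backup still needs further confirmation.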
- [frankey@mongo-server3 ycsb-0.1.4]$ ./bin/ycsb run mongodb -threads 100 -P workloads/readandupdateandinsert1 > run.result
- [UPDATE], Operations, 499944
- [UPDATE], AverageLatency(us), 20879.322764149583
- [UPDATE], MinLatency(us), 150
- [UPDATE], MaxLatency(us), 1428679
- [UPDATE], 95thPercentileLatency(ms), 64
- [UPDATE], 99thPercentileLatency(ms), 97
- [UPDATE], Return=0, 499944
- [INSERT], Operations, 499827
- [INSERT], AverageLatency(us), 21033.383328631706
- [INSERT], MinLatency(us), 172
- [INSERT], MaxLatency(us), 1485376
- [INSERT], 95thPercentileLatency(ms), 64
- [INSERT], 99thPercentileLatency(ms), 97
- [INSERT], Return=0, 499827
- [READ], Operations, 9000229
- [READ], AverageLatency(us), 4457.124366946663
- [READ], MinLatency(us), 83
- [READ], MaxLatency(us), 1346130
- [READ], 95thPercentileLatency(ms), 15
- [READ], 99thPercentileLatency(ms), 36
- [READ], Return=0, 9000217
- [READ], Return=1, 12
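To put the 12 failed reads in proportion, the figures in the YCSB output can be tallied with a few lines of Python. The `parse_ycsb` helper is a hypothetical sketch that handles only the three-field `[SECTION], metric, value` lines shown above; the input is a subset of the lines from the run.

```python
def parse_ycsb(lines):
    """Parse '[SECTION], metric, value' lines into a nested dict of floats."""
    results = {}
    for line in lines:
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 3 or not parts[0].startswith("["):
            continue  # skip anything that is not a three-field metric line
        section = parts[0].strip("[]")
        results.setdefault(section, {})[parts[1]] = float(parts[2])
    return results


output = """\
[UPDATE], Operations, 499944
[UPDATE], Return=0, 499944
[INSERT], Operations, 499827
[INSERT], Return=0, 499827
[READ], Operations, 9000229
[READ], Return=0, 9000217
[READ], Return=1, 12
"""

results = parse_ycsb(output.splitlines())
total_ops = int(sum(section["Operations"] for section in results.values()))
failed_reads = int(results["READ"]["Return=1"])
failure_rate = failed_reads / results["READ"]["Operations"]

print(total_ops)      # 10000000
print(failed_reads)   # 12
print(f"{failure_rate:.2e}")
```

The three operation counts add up to exactly 10000000, and the failed reads are roughly one in 750000, so the failures are rare but still unexplained.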
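From the process above, the implementation of mongobackup appears to be based on the source code of mongodump --oplog and mongorestore --oplogReplay. When mongobackup is used for incremental backup and restore, its restore speed is similar to that of mongorestore, roughly 20000 records per second.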
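The rough throughput figure can be checked against the log timestamps from steps 5 and 6. The sketch below is a back-of-the-envelope calculation, not a benchmark; the `rate` helper is hypothetical, and it assumes that the first and last timestamped lines of each run bracket the actual work.

```python
from datetime import datetime


def rate(start: str, end: str, count: int) -> float:
    """Records per second between two same-day 'HH:MM:SS.mmm' log timestamps."""
    fmt = "%H:%M:%S.%f"
    elapsed = (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()
    return count / elapsed


# mongobackup replay: "Replaying file:oplog000000.bson" to "Successfully Recovered."
replay_rate = rate("16:43:40.903", "16:48:11.677", 3346277 + 2014156)

# mongorestore: first usertable.bson line to the "Creating index" line
restore_rate = rate("16:36:01.022", "16:39:44.213", 5224511)

print(round(replay_rate), round(restore_rate))
```

Both values land in the neighborhood of 20000 records per second, with mongorestore somewhat faster in this particular run.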