[HBase] Backing Up Tables with CopyTable

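Unless otherwise noted, all articles on this blog are original. If you repost, please credit the source: Big data enthusiast ( http://www.lubinsu.com/ )

Permalink: [HBase] Backing Up Tables with CopyTable ( http://www.lubinsu.com/hbase-copytable/ )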

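CopyTable usage:

The target table must already exist before you run the command.

CopyTable can restrict the copy to a time range or a row range, rename the table, rename column families, and optionally copy deleted data. For example: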

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=dstClusterZK:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable

1. Copying to a different table name in the same cluster

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=tableCopy srcTable
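Because the target table has to exist before the job runs, it usually needs to be created first in the HBase shell. The following is a minimal sketch that only prints a create statement; the helper name create_stmt and the column families cf1 and cf2 are assumptions for illustration, so substitute the families of your own source table.

```shell
#!/bin/sh
# Hypothetical helper: prints an HBase shell "create" statement for a target
# table with the given column families. Nothing is executed against HBase.
create_stmt() {
    table=$1
    shift
    stmt="create '$table'"
    for cf in "$@"; do
        stmt="$stmt, '$cf'"
    done
    printf '%s\n' "$stmt"
}

# tableCopy is the target name from the example above; cf1 and cf2 are
# placeholder column families (assumed, not taken from a real table).
create_stmt tableCopy cf1 cf2

# To actually create the table, the statement could be piped into the shell:
#   create_stmt tableCopy cf1 cf2 | hbase shell
```

The target should contain the column families that CopyTable will write to, otherwise the copy is likely to fail.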

2. Copying a table across clusters

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=dstClusterZK:2181:/hbase srcTable

With this approach, the source table and the target table have the same name.
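The peer.adr value is made of three parts joined by colons: the ZooKeeper quorum of the destination cluster, the ZooKeeper client port, and the parent znode. As a small sketch, the helper below (peer_adr is a hypothetical name, and zk1, zk2, zk3 are placeholder hosts) assembles the value and prints the resulting command without running it.

```shell
#!/bin/sh
# Hypothetical helper: joins quorum, client port, and parent znode into the
# format expected by --peer.adr.
peer_adr() {
    # $1: comma-separated ZooKeeper quorum hosts of the destination cluster
    # $2: ZooKeeper client port
    # $3: parent znode
    printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# zk1,zk2,zk3 are placeholder host names (assumed for illustration).
PEER=$(peer_adr "zk1,zk2,zk3" 2181 /hbase)

# Dry run: print the CopyTable command instead of executing it.
echo "hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=$PEER srcTable"
```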

References:

http://hbase.apache.org/1.1/book.html#copy.table

https://blog.cloudera.com/blog/2012/06/online-hbase-backups-with-copytable-2/

CopyTable is a utility that can copy part or all of a table, either to the same cluster or to another cluster. The target table must first exist. The usage is as follows:

$ ./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --help
Usage: CopyTable [general options] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>
Options:
 rs.class     hbase.regionserver.class of the peer cluster,
              specify if different from current cluster
 rs.impl      hbase.regionserver.impl of the peer cluster,
 startrow     the start row
 stoprow      the stop row
 starttime    beginning of the time range (unixtime in millis)
              without endtime means from starttime to forever
 endtime      end of the time range. Ignored if no starttime specified.
 versions     number of cell versions to copy
 new.name     new table's name
 peer.adr     Address of the peer cluster given in the format
              hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
 families     comma-separated list of families to copy
              To copy from cf1 to cf2, give sourceCfName:destCfName.
              To keep the same name, just give "cfName"
 all.cells    also copy delete markers and deleted cells
Args:
 tablename    Name of the table to copy
Examples:
 To copy 'TestTable' to a cluster that uses replication for a 1 hour window:
 $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=server1,server2,server3:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable
For performance consider the following general options:
 It is recommended that you set the following to >=100. A higher value uses more memory but
 decreases the round trip time to the server and may increase performance.
   -Dhbase.client.scanner.caching=100
 The following should always be set to false, to prevent writing data twice, which may produce
 inaccurate results.
   -Dmapred.map.tasks.speculative.execution=false
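
According to the usage line, general options such as the two -D settings go before the CopyTable-specific options. The sketch below shows one way to put them together; copytable_cmd is a hypothetical helper that only prints the command line, and tableCopy and srcTable are the placeholder names used earlier in this post.

```shell
#!/bin/sh
# Hypothetical helper: prints a CopyTable command line with the two
# performance-related general options placed before the CopyTable options.
copytable_cmd() {
    # $1: scanner caching value; remaining args: CopyTable options and table name
    caching=$1
    shift
    printf 'hbase org.apache.hadoop.hbase.mapreduce.CopyTable'
    printf ' -Dhbase.client.scanner.caching=%s' "$caching"
    printf ' -Dmapred.map.tasks.speculative.execution=false'
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Dry run: print the command for a same-cluster copy with caching set to 100.
copytable_cmd 100 --new.name=tableCopy srcTable
```

Review the printed command before running it, since a larger caching value increases memory use on the client side.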

Examples:

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1478448000000 --endtime=1478591994506 --peer.adr=192.168.0.113,192.168.0.114,192.168.0.115:2181:/hbase --families=txjl --new.name=hy_membercontacts_bk hy_membercontacts

# Back up by time range

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1478448000000 --endtime=1478591994506 --new.name=hy_membercontacts_bk hy_membercontacts

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1477929600000 --endtime=1478591994506 --new.name=hy_linkman_tmp hy_linkman
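
The starttime and endtime values are Unix timestamps in milliseconds. A minimal sketch for deriving them from readable dates follows; it assumes GNU date (the -d option), the helper name to_millis is hypothetical, and the end date is chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: converts a timestamp string to epoch milliseconds.
# Assumes GNU date, which accepts -d with an explicit UTC offset.
to_millis() {
    secs=$(date -d "$1" +%s) || return 1
    printf '%s\n' "$(( secs * 1000 ))"
}

# 2016-11-07 00:00:00 at UTC+8 corresponds to the starttime used above.
START=$(to_millis '2016-11-07 00:00:00 +0800')
# Illustrative end of the window (assumed, not from the original examples).
END=$(to_millis '2016-11-09 00:00:00 +0800')

# Dry run: print the CopyTable command instead of executing it.
echo "hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=$START --endtime=$END --new.name=hy_membercontacts_bk hy_membercontacts"
```

Writing the UTC offset into the date string keeps the result independent of the time zone of the machine that runs the script.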

# Back up the entire table

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=hy_mobileblacklist_bk_before_del hy_mobileblacklist

# Extra: querying by time range (in the HBase shell)

scan 'hy_linkman', {COLUMNS => 'lxr:sguid', TIMERANGE => [1478966400000, 1479052799000]}
scan 'hy_mobileblacklist', {COLUMNS => 'mobhmd:sguid', TIMERANGE => [1468719824000, 1468809824000]}

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=hy_mobileblacklist_bk_before_del_20161228 hy_mobileblacklist

