Working with an HBase database from the shell command line
Shell control
From the operating system shell, run the hbase command with the shell keyword appended:
[grid@hdnode3 ~]$ hbase shell
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011

hbase(main):001:0>
We are logged in, but we do not yet know what we can do here or which commands the shell offers. At this point we can either read the official documentation or simply type help and see what comes back.
hbase(main):002:0> help
..................
..................
COMMAND GROUPS:
  Group name: general
  Commands: status, version

  Group name: ddl
  Commands: alter, create, describe, disable, drop, enable, exists, is_disabled, is_enabled, list

  Group name: dml
  Commands: count, delete, deleteall, get, get_counter, incr, put, scan, truncate

  Group name: tools
  Commands: assign, balance_switch, balancer, close_region, compact, flush, major_compact, move, split, unassign, zk_dump

  Group name: replication
  Commands: add_peer, disable_peer, enable_peer, remove_peer, start_replication, stop_replication
..................
..................
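The help output really is helpful: it gives us a rough idea of what we can do. HBase groups its commands into DDL and DML as well, and there are also replication commands, administration tools, and so on.
Let's start with the general commands. Query the cluster status: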
hbase(main):003:0> status
5 servers, 0 dead, 0.4000 average load
Query the version:
hbase(main):004:0> version
0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011
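Next come the main items, DDL and DML (who would have thought HBase also distinguishes DML and DDL statements). The HBase version used here has no concept of a database. As a BigTable clone it did not copy the name, but it did copy the essence: by design it does not need to split data across databases, or even across tables. Putting all the data into a single table is fine, and that is what a real "big table" means.
Create a table: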
hbase(main):005:0> create 't','t_id','t_vl'
0 row(s) in 2.3490 seconds
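The syntax for creating objects in HBase is fairly flexible. The example above uses the shorthand form, which is equivalent to the full form "hbase> create 't', {NAME => 't_id'}, {NAME => 't_vl'}". The first argument is the table name, and every argument after it names a column family. A table's column families need to be defined when the table is created (they can be changed later, but it is best to define them up front). From this angle, objects in HBase are structured.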
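Because the HBase shell is built on JRuby, the arguments after the table name are ordinary Ruby strings and hashes. The plain-Ruby sketch below illustrates why the shorthand and full forms describe the same thing. It is only an illustration: `normalize_families` is a hypothetical helper written for this article, not HBase's actual implementation, and `NAME` is defined here as a stand-in for the constant the shell provides.

```ruby
# Stand-in for the constant the shell provides (assumption for this sketch).
NAME = 'NAME'

# Hypothetical helper: expand each column-family argument to the full hash form.
def normalize_families(*args)
  args.map do |arg|
    case arg
    when String then { NAME => arg }   # shorthand: just the family name
    when Hash   then arg               # full form: already a dictionary
    else raise ArgumentError, "unsupported column family spec: #{arg.inspect}"
    end
  end
end

shorthand = normalize_families('t_id', 't_vl')
full_form = normalize_families({ NAME => 't_id' }, { NAME => 't_vl' })
puts shorthand == full_form   # both forms expand to the same list of hashes
```

The full form becomes necessary once a family needs extra attributes such as VERSIONS, since a bare string has nowhere to carry them.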
List and describe the table:
hbase(main):006:0> list
TABLE
t
1 row(s) in 0.0080 seconds

hbase(main):018:0> describe 't'
DESCRIPTION                                                        ENABLED
 {NAME => 't', FAMILIES => [                                       true
  {NAME => 't_id', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
   COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647',
   BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
  {NAME => 't_vl', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
   COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647',
   BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
1 row(s) in 0.0100 seconds
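The output is in a JSON-like form. It shows the number of versions kept, the TTL (Time to Live, the retention time), the column family definitions, the block size, and so on.
To modify a table (or drop it), you must first disable it. Once the modification has succeeded, enable it again.
Disable the table: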
hbase(main):004:0> disable 't'
0 row(s) in 2.0430 seconds
Check whether the table is currently enabled or disabled:
hbase(main):007:0> is_enabled 't'
false
0 row(s) in 0.0040 seconds

hbase(main):008:0> is_disabled 't'
true
0 row(s) in 0.0040 seconds
Modify the table by adding a column family, then enable the table again:
hbase(main):021:0> alter 't', {NAME => 't_info', VERSIONS => 3}
0 row(s) in 0.0360 seconds

hbase(main):023:0> enable 't'
0 row(s) in 2.0250 seconds
Insert records:
hbase(main):025:0> put 't','10001','t_vl:name','jss'
0 row(s) in 0.0060 seconds

hbase(main):026:0> put 't','10001','t_vl:age','99'
0 row(s) in 0.0070 seconds

hbase(main):027:0> put 't','10001','t_info:general','his fullname is junsanis!'
0 row(s) in 0.0040 seconds
Get a record:
hbase(main):028:0> get 't','10001'
COLUMN                CELL
 t_info:general       timestamp=1365670813664, value=his fullname is junsanis!
 t_vl:age             timestamp=1365670733223, value=99
 t_vl:name            timestamp=1365670723056, value=jss
3 row(s) in 0.0450 seconds
Get the data of one column family in a given record:
hbase(main):029:0> get 't','10001','t_vl'
COLUMN                CELL
 t_vl:age             timestamp=1365670733223, value=99
 t_vl:name            timestamp=1365670723056, value=jss
2 row(s) in 0.0070 seconds
Get the data of one column within a column family in a given record:
hbase(main):030:0> get 't','10001','t_vl:age'
COLUMN                CELL
 t_vl:age             timestamp=1365670733223, value=99
1 row(s) in 0.0070 seconds
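The three get calls above differ only in how narrowly the third argument selects cells: nothing, a column family, or a family:qualifier pair. The following plain-Ruby sketch models that selection on the sample row. It is a simplified illustration, not HBase code; `ROW_10001` and `select_cells` are names invented for this example.

```ruby
# Cells of row '10001' as inserted earlier, keyed by "family:qualifier".
ROW_10001 = {
  't_info:general' => 'his fullname is junsanis!',
  't_vl:age'       => '99',
  't_vl:name'      => 'jss'
}.freeze

# Hypothetical helper: pick the cells matching a column spec.
#   nil           -> every cell in the row
#   'family'      -> every cell in that column family
#   'family:qual' -> exactly one column
def select_cells(row, spec = nil)
  return row.dup if spec.nil?

  family, qualifier = spec.split(':', 2)
  row.select do |column, _value|
    cell_family, cell_qualifier = column.split(':', 2)
    cell_family == family && (qualifier.nil? || cell_qualifier == qualifier)
  end
end

puts select_cells(ROW_10001).keys.inspect            # all three columns
puts select_cells(ROW_10001, 't_vl').keys.inspect    # only the t_vl family
puts select_cells(ROW_10001, 't_vl:age').inspect     # a single column
```

The family name is compared as a whole rather than as a string prefix, so a spec of 't_vl' would not accidentally match a family named 't_vl2'.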
Update a record (this is no different from an insert):
hbase(main):031:0> put 't','10001','t_vl:age','10'
0 row(s) in 0.0050 seconds

hbase(main):032:0> get 't','10001','t_vl:age'
COLUMN                CELL
 t_vl:age             timestamp=1365670912700, value=10
1 row(s) in 0.0080 seconds
Full table scan:
hbase(main):033:0> scan 't'
ROW                   COLUMN+CELL
 10001                column=t_info:general, timestamp=1365670813664, value=his fullname is junsanis!
 10001                column=t_vl:age, timestamp=1365670912700, value=10
 10001                column=t_vl:name, timestamp=1365670723056, value=jss
1 row(s) in 0.0370 seconds
Scan the whole table for a single column family:
hbase(main):036:0> scan 't', {COLUMNS => 't_vl'}
ROW                   COLUMN+CELL
 10001                column=t_vl:age, timestamp=1365670912700, value=10
 10001                column=t_vl:name, timestamp=1365670723056, value=jss
1 row(s) in 0.0080 seconds
Delete a column from a record (to remove the entire row, use deleteall instead):
hbase(main):043:0> delete 't','10001','t_vl:age'
0 row(s) in 0.0050 seconds

hbase(main):045:0> get 't','10001'
COLUMN                CELL
 t_info:general       timestamp=1365670813664, value=his fullname is junsanis!
 t_vl:name            timestamp=1365670723056, value=jss
2 row(s) in 0.0070 seconds
Drop the table (it must be disabled first):
hbase(main):047:0> disable 't'
0 row(s) in 2.0230 seconds

hbase(main):048:0> drop 't'
0 row(s) in 1.1170 seconds
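After going through the examples above, do you have any questions, or did anything occur to you? A question mark certainly popped up in my head: HBase has no UPDATE operation, only inserts, yet every put of a new value replaced the old one. So how do we keep many records? Can each column under a row key really hold only one value? That can't be right, and it is clearly not the behavior anyone would hope for.
The answer is that the number of versions kept per cell (VERSIONS) and the retention time (TTL, Time to Live) are at work here. A put does not overwrite the old value; it adds a new version, and get returns only the newest version by default.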
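For example, suppose we want to track the most recent n pages a user has viewed. We could create the HBase table like this:
hbase> create 'rlog','userid',{NAME=>'article',VERSIONS=>100}
With this setting, the 100 most recent versions are kept. Each time a user views a post, we insert a record into the rlog table, in this form:
hbase> put 'rlog',$userid,'article:id',$aid
Here we record only the ID of the user and the ID of the page viewed. Depending on your needs, you could also store the page URL, the article title, and other details. Within a column family HBase is unstructured, so columns can be added freely as requirements change.
So how do we fetch a user's recent viewing history? For example, to get the 10 most recently viewed records:
hbase> get 'rlog',$userid,{COLUMN=>'article:id', VERSIONS=>10}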
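To make the versioning behavior concrete, here is a toy model in plain Ruby of a single cell that keeps a bounded number of versions. It only illustrates the idea of "put adds a version, get returns the newest ones"; it is not how HBase stores data internally, and the `VersionedCell` class and the sample article IDs are invented for this sketch.

```ruby
# Toy model of one cell that keeps a bounded number of versions.
# Illustration only; not how HBase stores data.
class VersionedCell
  def initialize(max_versions)
    @max_versions = max_versions
    @versions = []            # [timestamp, value] pairs, newest first
  end

  # Every put adds a new version; versions beyond the limit are dropped,
  # oldest first. A put with an existing timestamp replaces that version.
  def put(value, timestamp)
    @versions.reject! { |ts, _| ts == timestamp }
    @versions << [timestamp, value]
    @versions = @versions.sort_by { |ts, _| -ts }.first(@max_versions)
    self
  end

  # Return up to `versions` values, newest first (default: only the latest).
  def get(versions = 1)
    @versions.first(versions).map { |_, value| value }
  end
end

cell = VersionedCell.new(3)                        # comparable to VERSIONS => 3
%w[a1 a2 a3 a4].each_with_index { |aid, i| cell.put(aid, i + 1) }
puts cell.get.inspect        # only the newest value
puts cell.get(10).inspect    # at most 3 values survive, newest first
```

Asking for more versions than the cell retains simply returns what is left, which is why the table-level VERSIONS setting bounds how far back a get can reach.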
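Besides VERSIONS, retention can also be controlled through the TTL of each version. TTL is expressed in seconds. As the describe output above shows, the default is 2147483647, which effectively means "keep forever"; to keep data for 30 days, for example, set the TTL to 2592000.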
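Since TTL values are plain second counts, a little arithmetic helps when choosing one. The plain-Ruby sketch below converts days to seconds and puts the value 2147483647 from the earlier describe output into perspective. The helper `ttl_seconds` and the constant names are made up for this illustration.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical helper: convert a retention period in days to a TTL in seconds.
def ttl_seconds(days)
  days * SECONDS_PER_DAY
end

# TTL value shown for both column families in the describe output.
DESCRIBED_TTL = 2_147_483_647

thirty_days = ttl_seconds(30)
described_in_years = (DESCRIBED_TTL / (365.0 * SECONDS_PER_DAY)).round(1)

puts thirty_days                    # TTL for a 30-day retention, in seconds
puts DESCRIBED_TTL == 2**31 - 1     # the largest 32-bit signed integer
puts described_in_years             # the described TTL expressed in years
```

A value computed this way could then be supplied as the TTL attribute of a column family, in the same dictionary form used for VERSIONS in the examples above.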
Original link: http://blog.itpub.net/7607759/viewspace-759609/