
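HBase Study Notes
I. HBase Installation
To be written when I find the time... 😫
II. HBase Shell
HBase ships with a shell that you can use to interact with HBase.
1. Starting the HBase Shell
cd /opt/hbase-2.4.17/bin/
: Change into the bin directory under the HBase home directory (skip this step if the environment variables are already set so that hbase is on the PATH)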
[root@node1 ~]# cd /opt/hbase-2.4.17/bin/
[root@node1 bin]#
hbase shell
: Start the HBase shell
[root@node1 bin]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop-3.3.6/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase-2.4.17/lib/client-facing-thirdparty/slf4j-reload4j-1.7.33.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.4.17, r7fd096f39b4284da9a71da3ce67c48d259ffa79a, Fri Mar 31 18:10:45 UTC 2023
Took 0.0028 seconds
hbase:001:0>
2. General Commands
status
: Shows the status of the HBase cluster, for example the number of servers
hbase:001:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 5.0000 average load
Took 0.4039 seconds
hbase:002:0>
version
: Shows the version of HBase in use
hbase:002:0> version
2.4.17, r7fd096f39b4284da9a71da3ce67c48d259ffa79a, Fri Mar 31 18:10:45 UTC 2023
Took 0.0028 seconds
hbase:003:0>
table_help
: Shows help for the table-reference commands
hbase:003:0> table_help
Help for table-reference commands.
You can either create a table via 'create' and then manipulate the table via commands like 'put', 'get', etc.
See the standard help information for how to use each of these commands.
However, as of 0.96, you can also get a reference to a table, on which you can invoke commands.
For instance, you can get create a table and keep around a reference to it via:
hbase> t = create 't', 'cf'
Or, if you have already created the table, you can get a reference to it:
hbase> t = get_table 't'
You can do things like call 'put' on the table:
hbase> t.put 'r', 'cf:q', 'v'
which puts a row 'r' with column family 'cf', qualifier 'q' and value 'v' into table t.
To read the data out, you can scan the table:
hbase> t.scan
which will read all the rows in table 't'.
Essentially, any command that takes a table name can also be done via table reference.
Other commands include things like: get, delete, deleteall,
get_all_columns, get_counter, count, incr. These functions, along with
the standard JRuby object methods are also available via tab completion.
For more information on how to use each of these commands, you can also just type:
hbase> t.help 'scan'
which will output more information on how to use that command.
You can also do general admin actions directly on a table; things like enable, disable,
flush and drop just by typing:
hbase> t.enable
hbase> t.flush
hbase> t.disable
hbase> t.drop
Note that after dropping a table, your reference to it becomes useless and further usage
is undefined (and not recommended).
Took 0.0020 seconds
hbase:004:0>
whoami
: Shows information about the current user
hbase:004:0> whoami
root (auth:SIMPLE)
groups: root
Took 0.0168 seconds
hbase:005:0>
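3. Data Definition Language (DDL)
create '<table name>','<column family 1>','<column family 2>', ...
: Creates a table with the given name and column families (a column family is a set of related columns and is the basic unit HBase uses to store and manage data)
In HBase, a column family is the unit that groups related columns so they are managed together. Data is sorted by row key (RowKey) and then stored per column family.
Every table must have at least one column family. Column families are part of the table schema and should be decided up front; they can still be added, modified, or deleted later with alter, whereas the columns inside a family can be added freely at write time.
hbase:001:0> create 'students','info','score' # Create a students table with two column families, info and score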
Created table students
Took 1.6486 seconds
=> Hbase::Table - students
hbase:002:0>
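The note above says data is sorted by row key and stored per column family. The following is a minimal plain-Ruby sketch of that layout, not HBase code: the sample rows (Bob, Carol, the math score) are made up for illustration, and the only behaviour it relies on is that row keys are compared as byte strings, so '10' sorts before '2'.

```ruby
# Plain-Ruby model, not the HBase API: each cell is [row key, family, qualifier, value].
# All sample values here are hypothetical.
cells = [
  ['2',  'info',  'name', 'Bob'],
  ['10', 'info',  'name', 'Carol'],
  ['1',  'score', 'math', '90'],
  ['1',  'info',  'name', 'Ezekielx']
]

# Row keys are byte strings, so they compare lexicographically, not numerically.
sorted_row_keys = cells.map { |row, _, _, _| row }.uniq.sort

# Group cells by column family, keeping each group ordered by row key, then qualifier.
by_family = cells.group_by { |_, family, _, _| family }
                 .transform_values do |group|
                   group.sort_by { |row, _, qualifier, _| [row, qualifier] }
                 end

puts sorted_row_keys.inspect
by_family.each do |family, group|
  puts "#{family}: #{group.map { |row, _, q, v| "#{row}/#{q}=#{v}" }.join(', ')}"
end
```

If numeric ordering matters, a common workaround is to zero-pad row keys (for example '01', '02', '10') so that lexicographic and numeric order agree.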
list
: Lists all tables in HBase
hbase:002:0> list
TABLE
students
1 row(s)
Took 0.0194 seconds
=> ["students"]
hbase:003:0>
disable '<table name>'
: Disables a table
hbase:004:0> disable 'students'
Took 0.4334 seconds
hbase:005:0>
is_disabled '<table name>'
: Checks whether a table is disabled
hbase:005:0> is_disabled 'students'
true
Took 0.0195 seconds
=> true
hbase:006:0>
enable '<table name>'
: Enables a table
hbase:006:0> enable 'students'
Took 0.6480 seconds
hbase:007:0>
is_enabled '<table name>'
: Checks whether a table is enabled
hbase:008:0> is_enabled 'students'
true
Took 0.0216 seconds
=> true
hbase:009:0>
describe '<table name>'
: Shows the description of a table
hbase:010:0> describe 'students'
Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'score', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s)
Quota is disabled
Took 0.0443 seconds
hbase:011:0>
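Each column family line printed by describe is a list of KEY => 'value' pairs. As a rough sketch of how such a line could be read programmatically, the plain-Ruby snippet below pulls the pairs into a Hash. The helper name parse_family is hypothetical, and the sample line is a shortened copy of the info line from the output above.

```ruby
# Shortened copy of one line from the describe output above.
line = "{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', " \
       "VERSIONS => '1', BLOCKSIZE => '65536'}"

# Hypothetical helper: extract KEY => 'value' pairs from one describe line.
def parse_family(line)
  line.scan(/(\w+)\s*=>\s*'([^']*)'/).to_h
end

family = parse_family(line)
puts family['NAME']
puts family['VERSIONS'].to_i
puts family['BLOCKSIZE'].to_i / 1024 # block size expressed in KiB
```

All values come back as strings, just as describe prints them, so numeric attributes need an explicit conversion.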
alter
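: Modifies the column family schema. Pass the table name and a dictionary describing the new column family schema. The dictionary format is described in the output of the main help command, and the dictionary must include the name of the column family to change.
alter can also change table-level attributes (such as MAX_FILESIZE, READONLY, MEMSTORE_FLUSHSIZE, and DEFERRED_LOG_FLUSH).
See the blog post linked from the official HBase documentation: HBase shell commands | Learn HBase
hbase:014:0> alter 'students', NAME => 'info', VERSIONS => 5 # Keep at most 5 versions in the info column family of the students table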
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.1117 seconds
hbase:015:0> describe 'students' # View the students table after the change
Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '5', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'score', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s)
Quota is disabled
Took 0.0284 seconds
hbase:016:0>
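To make the effect of VERSIONS => 5 more concrete, here is a small plain-Ruby model of one cell that keeps at most a fixed number of versions. It is only a sketch: the class VersionedCell is hypothetical, timestamps are plain integers, and it prunes old versions immediately on every write, which is a simplification of how a real store cleans up.

```ruby
# Plain-Ruby model of a single cell's version history; not HBase code.
class VersionedCell
  def initialize(max_versions)
    @max_versions = max_versions
    @versions = {} # timestamp => value
  end

  # Store a value at a timestamp, then keep only the newest max_versions entries.
  def put(timestamp, value)
    @versions[timestamp] = value
    newest = @versions.keys.sort.last(@max_versions)
    @versions.select! { |ts, _| newest.include?(ts) }
    self
  end

  # Return up to `limit` values, newest first.
  def get(limit = 1)
    @versions.sort_by { |ts, _| -ts }.first(limit).map { |_, value| value }
  end
end

cell = VersionedCell.new(5)
(1..7).each { |ts| cell.put(ts, "v#{ts}") }
puts cell.get.inspect     # newest value only
puts cell.get(5).inspect  # the five retained versions, newest first
```

After seven writes only the five newest values remain, and a read without a limit returns just the latest one.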
hbase:031:0> alter 'students', {NAME => 'info', IN_MEMORY => true}, {NAME => 'score', VERSIONS => 5} # Modify the info and score column families of the students table in one command
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.9704 seconds
hbase:032:0> describe 'students' # View the students table after the change
Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'true', VERSIONS => '5', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'score', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '5', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s)
Quota is disabled
Took 0.0341 seconds
hbase:033:0>
hbase:033:0> alter 'students', NAME => 'score', METHOD => 'delete' # Delete the score column family from the students table
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.9858 seconds
hbase:034:0> describe 'students' # View the students table after the change
Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'true', VERSIONS => '5', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s)
Quota is disabled
Took 0.0216 seconds
hbase:035:0>
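hbase:035:0> alter 'teachers', READONLY # Intended to make teachers read-only, but a bare READONLY is taken as a column family name, so this adds a column family called READONLY (see the describe output below); the table-attribute form is alter 'teachers', READONLY => 'true'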
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.9486 seconds
hbase:036:0> describe 'teachers' # View the teachers table after the change
Table teachers is ENABLED
teachers
COLUMN FAMILIES DESCRIPTION
{NAME => 'READONLY', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'students', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
3 row(s)
Quota is disabled
Took 0.0186 seconds
hbase:037:0>
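The describe output above lists a column family named READONLY, which suggests that alter 'teachers', READONLY did not set a table attribute. One plausible explanation, sketched below in plain Ruby, is that the shell exposes READONLY as a constant holding the string 'READONLY' and treats any bare string argument as a column family name. Both points are assumptions modelled by hand here, and classify_alter_arg is a hypothetical helper, not a shell function.

```ruby
# Assumption: attribute names such as NAME and READONLY are Ruby constants
# that hold their own name as a string. Defined by hand for this sketch.
NAME = 'NAME'
READONLY = 'READONLY'

# Hypothetical helper: classify one alter argument.
def classify_alter_arg(arg)
  arg = { NAME => arg } if arg.is_a?(String) # a bare string becomes a column family name
  if arg.key?(NAME)
    [:column_family, arg[NAME]]
  else
    [:table_attributes, arg]
  end
end

p classify_alter_arg(READONLY)           # bare constant: classified as a column family
p classify_alter_arg(READONLY => 'true') # hash without NAME: classified as table attributes
```

Under these assumptions, only the hash form READONLY => 'true' is read as a table-level attribute, which matches the form the alter help text uses for attributes such as MAX_FILESIZE.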
exists '<table name>'
: Checks whether a table exists
hbase:037:0> exists 'students' # Check whether the students table exists
Table students does exist
Took 0.0074 seconds
=> true
hbase:038:0>
drop '<table name>'
: Deletes a table from HBase (the table must be disabled first)
hbase:039:0> disable 'teachers' # Disable the teachers table
Took 0.3183 seconds
hbase:040:0> drop 'teachers' # Drop the teachers table
Took 0.3483 seconds
hbase:041:0>
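The session above disables teachers before dropping it, because HBase refuses to drop a table that is still enabled. The plain-Ruby sketch below models that lifecycle as a tiny state machine. The class and error names (TableCatalog, TableNotDisabled) are hypothetical and stand in for the real server-side checks.

```ruby
# Plain-Ruby model of the enable/disable/drop lifecycle; not HBase code.
class TableNotDisabled < StandardError; end

class TableCatalog
  def initialize
    @state = {} # table name => :enabled or :disabled
  end

  def create(name)
    @state[name] = :enabled # new tables start out enabled
  end

  def disable(name)
    @state[name] = :disabled if @state.key?(name)
  end

  def enable(name)
    @state[name] = :enabled if @state.key?(name)
  end

  def enabled?(name)
    @state[name] == :enabled
  end

  def exists?(name)
    @state.key?(name)
  end

  # Dropping is only allowed once the table has been disabled.
  def drop(name)
    raise TableNotDisabled, name if enabled?(name)
    @state.delete(name)
  end
end

catalog = TableCatalog.new
catalog.create('teachers')
begin
  catalog.drop('teachers')
rescue TableNotDisabled => e
  puts "cannot drop enabled table: #{e.message}"
end
catalog.disable('teachers')
catalog.drop('teachers')
puts catalog.exists?('teachers')
```

The first drop attempt is rejected and the second succeeds, mirroring the disable-then-drop order used in the shell session.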
drop_all '<regex>'
: Drops all tables whose names match the given regular expression
hbase:042:0> list # List the existing tables
TABLE
students
teacher1
teacher2
teacher3
4 row(s)
Took 0.0102 seconds
=> ["students", "teacher1", "teacher2", "teacher3"]
hbase:043:0> disable_all "teacher.*" # Disable the tables whose names start with teacher
teacher1
teacher2
teacher3
Disable the above 3 tables (y/n)?
y
3 tables successfully disabled
Took 2.3826 seconds
hbase:044:0> drop_all "teacher.*" # Drop the tables whose names start with teacher
teacher1
teacher2
teacher3
Drop the above 3 tables (y/n)?
y
3 tables successfully dropped
Took 1.9565 seconds
hbase:045:0>
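Because drop_all is destructive, it can help to check a pattern against the table list before running it. The plain-Ruby sketch below does such a dry run on the table names from the listing above. The helper matching_tables is hypothetical, and it assumes the pattern has to match the whole table name; the shell's confirmation prompt remains the authoritative list.

```ruby
tables = %w[students teacher1 teacher2 teacher3]

# Hypothetical helper; assumes the pattern must match the entire table name.
def matching_tables(tables, pattern)
  regex = Regexp.new("\\A(?:#{pattern})\\z")
  tables.select { |name| regex.match?(name) }
end

puts matching_tables(tables, 'teacher.*').inspect
puts matching_tables(tables, '.*').inspect # a broad pattern also catches students
```

A pattern as broad as '.*' selects every table, so reading the list printed before the (y/n) prompt is worth the extra second.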
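4. Data Manipulation Language (DML)
put '<table name>','<row key>','<column family:column>','<value>'
: Puts a cell value at the specified column in the specified row of a table
hbase:001:0> put 'students','1','info:name','Ezekielx' # Insert Ezekielx into the name column of the info column family, in the row with row key 1 of the students table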
Took 0.7762 seconds
hbase:002:0> scan 'students'
ROW COLUMN+CELL
1 column=info:name, timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0517 seconds
hbase:003:0>
get '<table name>','<row key>'
: Gets the contents of a row or cell
hbase:003:0> get 'students','1' # View row 1 of the students table
COLUMN CELL
info:name timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0703 seconds
hbase:004:0>
delete '<table name>','<row key>','<column family:column>'
: Deletes a cell value from a table
hbase:005:0> get 'students','1' # View row 1 of the students table
COLUMN CELL
info:age timestamp=2025-04-23T22:50:57.865, value=18
info:name timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0115 seconds
hbase:006:0> delete 'students','1','info:age' # Delete the age column of the info column family from row 1 of the students table
Took 0.0174 seconds
hbase:007:0> get 'students','1' # View row 1 of the students table
COLUMN CELL
info:name timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0102 seconds
hbase:008:0>
deleteall '<table name>','<row key>'
: Deletes all cells in the given row
hbase:011:0> scan 'students' # Read the data in the students table
ROW COLUMN+CELL
1 column=info:name, timestamp=2025-04-23T22:31:19.677, value=Ezekielx
2 column=info:name, timestamp=2025-04-23T22:55:13.593, value=Bob
2 row(s)
Took 0.0184 seconds
hbase:012:0> deleteall 'students','2' # Delete row 2 of the students table
Took 0.0079 seconds
hbase:013:0> scan 'students' # Read the data in the students table
ROW COLUMN+CELL
1 column=info:name, timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0080 seconds
hbase:014:0>
scan '<table name>'
: Scans a table and returns its data
hbase:014:0> scan 'students' # Read the data in the students table
ROW COLUMN+CELL
1 column=info:name, timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0172 seconds
hbase:015:0>
count '<table name>'
: Counts and returns the number of rows in a table
hbase:015:0> count 'students' # Count the rows in the students table
1 row(s)
Took 0.0494 seconds
=> 1
hbase:016:0>
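The DML commands in this section can be summarised with a small in-memory model. The plain-Ruby sketch below treats a table as a map from row key to 'family:qualifier' cells and replays the puts and deletes from the sessions above. The class TinyTable is hypothetical and deliberately ignores timestamps and versions.

```ruby
# Plain-Ruby model: row key => { 'family:qualifier' => value }. Not HBase code.
class TinyTable
  def initialize
    @rows = Hash.new { |hash, key| hash[key] = {} }
  end

  def put(row, column, value)
    @rows[row][column] = value
  end

  # Return a copy of one row's cells ({} if the row has none).
  def get(row)
    @rows.key?(row) ? @rows[row].dup : {}
  end

  # Remove a single cell; a row with no cells left disappears.
  def delete(row, column)
    return unless @rows.key?(row)

    @rows[row].delete(column)
    @rows.delete(row) if @rows[row].empty?
  end

  # Remove every cell in the row.
  def deleteall(row)
    @rows.delete(row)
  end

  # Rows in row-key order.
  def scan
    @rows.sort_by { |row, _| row }.to_h
  end

  def count
    @rows.size
  end
end

t = TinyTable.new
t.put('1', 'info:name', 'Ezekielx')
t.put('1', 'info:age', '18')
t.delete('1', 'info:age')
t.put('2', 'info:name', 'Bob')
t.deleteall('2')
puts t.get('1').inspect
puts t.count
```

As in the shell sessions, only row 1 with its info:name cell is left at the end, so the count is 1.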