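I. HBase Installation

To be written when I have time... 😫

II. HBase Shell

HBase provides a shell that you can use to communicate with HBase.

1. Starting the HBase Shell

cd /opt/hbase-2.4.17/bin/ : change into the bin directory under the HBase home directory (skip this step if the environment variables are already set so that hbase is on your PATH)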

[root@node1 ~]# cd /opt/hbase-2.4.17/bin/

[root@node1 bin]#

hbase shell : start the HBase shell

[root@node1 bin]# hbase shell

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop-3.3.6/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase-2.4.17/lib/client-facing-thirdparty/slf4j-reload4j-1.7.33.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.4.17, r7fd096f39b4284da9a71da3ce67c48d259ffa79a, Fri Mar 31 18:10:45 UTC 2023
Took 0.0028 seconds
hbase:001:0>

2. General Commands

status : shows the status of the HBase cluster, for example the number of servers

hbase:001:0> status

1 active master, 0 backup masters, 1 servers, 0 dead, 5.0000 average load
Took 0.4039 seconds
hbase:002:0>

version : shows the version of HBase in use

hbase:002:0> version

2.4.17, r7fd096f39b4284da9a71da3ce67c48d259ffa79a, Fri Mar 31 18:10:45 UTC 2023
Took 0.0028 seconds
hbase:003:0>

table_help : shows help for the table-reference commands

hbase:003:0> table_help

Help for table-reference commands.

You can either create a table via 'create' and then manipulate the table via commands like 'put', 'get', etc.
See the standard help information for how to use each of these commands.

However, as of 0.96, you can also get a reference to a table, on which you can invoke commands.
For instance, you can get create a table and keep around a reference to it via:

   hbase> t = create 't', 'cf'

Or, if you have already created the table, you can get a reference to it:

   hbase> t = get_table 't'

You can do things like call 'put' on the table:

  hbase> t.put 'r', 'cf:q', 'v'

which puts a row 'r' with column family 'cf', qualifier 'q' and value 'v' into table t.

To read the data out, you can scan the table:

  hbase> t.scan

which will read all the rows in table 't'.

Essentially, any command that takes a table name can also be done via table reference.
Other commands include things like: get, delete, deleteall,
get_all_columns, get_counter, count, incr. These functions, along with
the standard JRuby object methods are also available via tab completion.

For more information on how to use each of these commands, you can also just type:

   hbase> t.help 'scan'

which will output more information on how to use that command.

You can also do general admin actions directly on a table; things like enable, disable,
flush and drop just by typing:

   hbase> t.enable
   hbase> t.flush
   hbase> t.disable
   hbase> t.drop

Note that after dropping a table, your reference to it becomes useless and further usage
is undefined (and not recommended).
Took 0.0020 seconds
hbase:004:0>

whoami : shows information about the current user

hbase:004:0> whoami

root (auth:SIMPLE)
    groups: root
Took 0.0168 seconds
hbase:005:0>

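3. Data Definition Language (DDL)

create '<table name>','<column family 1>','<column family 2>', ... : creates a table with the given name and column families (a column family is a set of related columns and is the basic unit by which HBase stores and manages data)

In HBase, a column family is the unit that groups a set of related columns so they can be managed together. Data is sorted by row key first and then stored per column family.
Every table must have at least one column family. Column families are part of the table schema: they are not created on the fly by writes, and changing them afterwards requires the alter command described later in this section.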
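To make the storage model described above more concrete, here is a toy Python model. It is not HBase code: the function name `store_layout` is hypothetical, and the sample cells for Carol and `score:math` are made up for illustration. It assumes row keys are compared as raw bytes, which is why the key '10' sorts before '2'.

```python
from collections import defaultdict

def store_layout(cells):
    """Group (row, 'family:qualifier', value) cells per column family,
    with rows ordered by the raw bytes of the row key."""
    layout = defaultdict(list)
    for row, column, value in cells:
        family, qualifier = column.split(":", 1)
        layout[family].append((row.encode(), qualifier, value))
    # Within each family, order by row key bytes, then by qualifier.
    return {family: sorted(entries) for family, entries in layout.items()}

# Made-up sample data (only Ezekielx and Bob appear in the sessions here).
cells = [
    ("2", "info:name", "Bob"),
    ("10", "score:math", "90"),
    ("10", "info:name", "Carol"),
    ("1", "info:name", "Ezekielx"),
]
layout = store_layout(cells)
for family, entries in layout.items():
    print(family, [(row.decode(), q, v) for row, q, v in entries])
```

Because keys are ordered as bytes rather than as numbers, numeric row keys are often zero-padded (for example '01', '02', '10') when numeric order matters.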

hbase:001:0> create 'students','info','score'	# create a students table with two column families, info and score

Created table students
Took 1.6486 seconds
=> Hbase::Table - students
hbase:002:0>

list : lists all tables in HBase

hbase:002:0> list

TABLE
students
1 row(s)
Took 0.0194 seconds
=> ["students"]
hbase:003:0>

disable '<table name>' : disables a table

hbase:004:0> disable 'students'

Took 0.4334 seconds
hbase:005:0>

is_disabled '<table name>' : checks whether a table is disabled

hbase:005:0> is_disabled 'students'

true
Took 0.0195 seconds
=> true
hbase:006:0>

enable '<table name>' : enables a table

hbase:006:0> enable 'students'

Took 0.6480 seconds
hbase:007:0>

is_enabled '<table name>' : checks whether a table is enabled

hbase:008:0> is_enabled 'students'

true
Took 0.0216 seconds
=> true
hbase:009:0>

describe '<table name>' : shows the description of a table

hbase:010:0> describe 'students'

Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

{NAME => 'score', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1',KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

2 row(s)
Quota is disabled
Took 0.0443 seconds
hbase:011:0>

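alter : modifies the column family schema. Pass the table name and a dictionary that specifies the new column family schema. The dictionary format is described in the output of the main help command, and the dictionary must include the name of the column family to modify.

alter can also change table-level attributes (such as MAX_FILESIZE, READONLY, MEMSTORE_FLUSHSIZE, DEFERRED_LOG_FLUSH).

Reference: the blog post linked from the official HBase documentation, "HBase shell commands | Learn HBase"

hbase:014:0> alter 'students', NAME => 'info', VERSIONS => 5	# change the info column family of the students table to keep at most 5 versions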

Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.1117 seconds

hbase:015:0> describe 'students'	# view the altered students table

Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '5', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

{NAME => 'score', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

2 row(s)
Quota is disabled
Took 0.0284 seconds
hbase:016:0>
hbase:031:0> alter 'students', {NAME => 'info', IN_MEMORY => true}, {NAME => 'score', VERSIONS => 5}	# alter the info and score column families of the students table in one command

Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.9704 seconds

hbase:032:0> describe 'students'	# view the altered students table

Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'true', VERSIONS => '5', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

{NAME => 'score', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '5', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

2 row(s)
Quota is disabled
Took 0.0341 seconds
hbase:033:0>
hbase:033:0> alter 'students', NAME => 'score', METHOD => 'delete'	# delete the score column family from the students table

Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.9858 seconds

hbase:034:0> describe 'students'	# view the altered students table

Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'true', VERSIONS => '5', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

1 row(s)
Quota is disabled
Took 0.0216 seconds
hbase:035:0>
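hbase:035:0> alter 'teachers', READONLY	# intended to make teachers read-only, but a bare READONLY is parsed as a column family name, so this adds a column family called READONLY; table-level attributes are passed as key/value pairs, e.g. READONLY => 'true'

Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.9486 seconds

hbase:036:0> describe 'teachers'	# view the altered teachers table (note the new READONLY column family)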

Table teachers is ENABLED
teachers
COLUMN FAMILIES DESCRIPTION
{NAME => 'READONLY', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

{NAME => 'students', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

3 row(s)
Quota is disabled
Took 0.0186 seconds
hbase:037:0>
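The VERSIONS attribute changed in the alter examples above caps how many timestamped versions of a cell a column family keeps. The retention rule can be sketched with a toy Python model. This is not the HBase implementation, and `VersionedCell` and its methods are hypothetical names chosen for illustration.

```python
class VersionedCell:
    """Toy model of one cell that keeps at most max_versions values."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self._versions = []  # list of (timestamp, value), newest first

    def put(self, timestamp, value):
        # A write with an existing timestamp replaces that version.
        self._versions = [tv for tv in self._versions if tv[0] != timestamp]
        self._versions.append((timestamp, value))
        # Keep newest first, then drop everything beyond the limit.
        self._versions.sort(key=lambda tv: tv[0], reverse=True)
        del self._versions[self.max_versions:]

    def get(self, versions=1):
        """Return up to `versions` newest (timestamp, value) pairs."""
        return self._versions[:versions]

cell = VersionedCell(max_versions=5)
for ts in range(1, 8):          # write 7 versions with timestamps 1..7
    cell.put(ts, f"v{ts}")
print(cell.get())               # newest version only
print(cell.get(versions=10))    # at most 5 versions are retained
```

The model discards surplus versions immediately to stay short. In HBase itself, as far as I understand it, reads already hide versions beyond the limit, while the physical cleanup happens later during compaction.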

exists '<table name>' : checks whether a table exists

hbase:037:0> exists 'students'	# check whether the students table exists

Table students does exist
Took 0.0074 seconds
=> true
hbase:038:0>

drop '<table name>' : removes a table from HBase (the table must be disabled first)

hbase:039:0> disable 'teachers'	# disable the teachers table

Took 0.3183 seconds

hbase:040:0> drop 'teachers'	# drop the teachers table

Took 0.3483 seconds
hbase:041:0>

drop_all '<regex>' : drops every table whose name matches the given regular expression

hbase:042:0> list	# list the existing tables

TABLE
students
teacher1
teacher2
teacher3
4 row(s)
Took 0.0102 seconds
=> ["students", "teacher1", "teacher2", "teacher3"]

hbase:043:0> disable_all "teacher.*"	# disable the tables whose names start with teacher

teacher1
teacher2
teacher3

Disable the above 3 tables (y/n)?
y
3 tables successfully disabled
Took 2.3826 seconds

hbase:044:0> drop_all "teacher.*"	# drop the tables whose names start with teacher

teacher1
teacher2
teacher3

Drop the above 3 tables (y/n)?
y
3 tables successfully dropped
Took 1.9565 seconds
hbase:045:0>
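Before running disable_all or drop_all, it can help to preview which names a pattern selects. The following is a minimal Python sketch, assuming the pattern has to match the whole table name. `matching_tables` is a hypothetical helper, and HBase evaluates patterns with Java's regex engine, whose syntax differs from Python's `re` in some details.

```python
import re

def matching_tables(tables, regex):
    """Return the table names that the regex matches in full."""
    pattern = re.compile(regex)
    return [name for name in tables if pattern.fullmatch(name)]

# Table names taken from the list output in the session above.
tables = ["students", "teacher1", "teacher2", "teacher3"]
print(matching_tables(tables, "teacher.*"))  # the three teacher tables
print(matching_tables(tables, "teacher"))    # no table is named exactly 'teacher'
```

Under that whole-name assumption, the trailing `.*` is what makes the pattern cover teacher1 through teacher3. The shell also lists the matched tables and asks for confirmation, as shown above, which is a last chance to catch a pattern that is too broad.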

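4. Data Manipulation Language (DML)

put '<table name>','<row key>','<column family:column qualifier>','<value>' : writes a cell value to the given column of the given row in the given table

hbase:001:0> put 'students','1','info:name','Ezekielx'	# write Ezekielx to the name column of the info column family, in the row with key 1 of the students table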

Took 0.7762 seconds

hbase:002:0> scan 'students'

ROW                  COLUMN+CELL
 1                   column=info:name, timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0517 seconds
hbase:003:0>

get '<table name>','<row key>' : gets the contents of a row or cell

hbase:003:0> get 'students','1'	# view the row with key 1 in the students table

COLUMN                                        CELL
 info:name                                    timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0703 seconds
hbase:004:0>

delete '<table name>','<row key>','<column family:column qualifier>' : deletes a cell value from a table

hbase:005:0> get 'students','1'	# view the row with key 1 in the students table

COLUMN                                        CELL
 info:age                                     timestamp=2025-04-23T22:50:57.865, value=18
 info:name                                    timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0115 seconds

hbase:006:0> delete 'students','1','info:age'	# delete the age column of the info column family from the row with key 1 of the students table

Took 0.0174 seconds

hbase:007:0> get 'students','1'	# view the row with key 1 in the students table

COLUMN                                        CELL
 info:name                                    timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0102 seconds
hbase:008:0>

deleteall '<table name>','<row key>' : deletes all cells in the given row

hbase:011:0> scan 'students'	# read the data in the students table

ROW                                           COLUMN+CELL
 1                                            column=info:name, timestamp=2025-04-23T22:31:19.677, value=Ezekielx
 2                                            column=info:name, timestamp=2025-04-23T22:55:13.593, value=Bob
2 row(s)
Took 0.0184 seconds

hbase:012:0> deleteall 'students','2'	# delete the row with key 2 from the students table

Took 0.0079 seconds

hbase:013:0> scan 'students'	# read the data in the students table

ROW                                           COLUMN+CELL
 1                                            column=info:name, timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0080 seconds
hbase:014:0>

scan '<table name>' : scans the table and returns its data

hbase:014:0> scan 'students'	# read the data in the students table

ROW                                           COLUMN+CELL
 1                                            column=info:name, timestamp=2025-04-23T22:31:19.677, value=Ezekielx
1 row(s)
Took 0.0172 seconds
hbase:015:0>

count '<table name>' : counts and returns the number of rows in a table

hbase:015:0> count 'students'	# count the rows in the students table

1 row(s)
Took 0.0494 seconds
=> 1
hbase:016:0>
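As a recap of the DML commands in this section, the sketch below replays the same sequence of operations against a small in-memory Python class. `ToyTable` is a hypothetical stand-in written for illustration. It is not the HBase client API and it ignores timestamps and versions, but it mirrors what put, get, delete, deleteall, scan and count did in the sessions above.

```python
class ToyTable:
    """In-memory stand-in for a table: row key -> {'family:qualifier': value}."""

    def __init__(self, families):
        self.families = set(families)
        self.rows = {}

    def put(self, row, column, value):
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row, {})[column] = value

    def get(self, row):
        return dict(self.rows.get(row, {}))

    def delete(self, row, column):
        cells = self.rows.get(row, {})
        cells.pop(column, None)
        if not cells:
            self.rows.pop(row, None)  # a row with no cells no longer exists

    def deleteall(self, row):
        self.rows.pop(row, None)

    def scan(self):
        # Rows come back in row-key order (string order in this toy model).
        return [(row, dict(self.rows[row])) for row in sorted(self.rows)]

    def count(self):
        return len(self.rows)

t = ToyTable(["info"])
t.put("1", "info:name", "Ezekielx")
t.put("1", "info:age", "18")
t.put("2", "info:name", "Bob")
t.delete("1", "info:age")   # removes one cell
t.deleteall("2")            # removes the whole row
print(t.scan())
print(t.count())
```

One design point worth noticing: a row exists only as long as it has at least one cell, so deleting the last cell of a row has the same visible effect as deleteall.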

III. HBase Java API