HBase is a distributed, column-oriented storage system built on HDFS, mainly used to store massive amounts of structured data. The goal here is only to provide a basic environment for accessing HBase from Python, so we download the binary package and install it on a single machine. After downloading, extract the archive, modify the configuration files, and start HBase. The system used is Ubuntu 14.04.
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/1.2.4/hbase-1.2.4-bin.tar.gz
tar zxvf hbase-1.2.4-bin.tar.gz
Modify hbase-env.sh and set JAVA_HOME.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
Modify hbase-site.xml and set the root directory for storing data.
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/mi/work/hbase/data</value>
  </property>
</configuration>
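Before starting HBase, it can help to confirm that the edited file is well-formed XML and that the property is set as intended. The following is a minimal sketch using only the Python standard library; the helper name read_hbase_properties is made up for illustration, and the sample string simply mirrors the configuration shown above.

```python
import xml.etree.ElementTree as ET


def read_hbase_properties(xml_text):
    """Return the <property> entries of an hbase-site.xml document as a dict.

    read_hbase_properties is a hypothetical helper, not part of HBase.
    """
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name] = value
    return props


# Same content as the hbase-site.xml shown in this article.
SAMPLE = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/mi/work/hbase/data</value>
  </property>
</configuration>"""

print(read_hbase_properties(SAMPLE)["hbase.rootdir"])
```

To check a real installation, read the contents of conf/hbase-site.xml from disk and pass them to the same function.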
bin/start-hbase.sh # start HBase
bin/hbase shell # enter the HBase interactive shell
After installing HBase, install Thrift: programs written in other languages connect to HBase through Thrift.
sudo apt-get install automake bison flex g++ git libboost1.55 libevent-dev libssl-dev libtool make pkg-config
PS: installing libboost1.55-all-dev failed on my Ubuntu 14.04 machine, so I installed libboost1.55 instead.
Download the Thrift source code from the Thrift download page, extract it, then compile and install it.
tar zxf thrift-0.10.0.tar.gz
cd thrift-0.10.0/
./configure --with-cpp --with-boost --with-python --without-csharp --with-java --without-erlang --without-perl --with-php --without-php_extension --without-ruby --without-haskell --without-go
make # compilation takes a long time
sudo make install
bin/hbase-daemon.sh start thrift
Check the running Java processes with jps:
~/work/hbase/hbase-1.2.4/conf$ jps
3009 ThriftServer
4184 HMaster
5932 Jps
733 Main
You can see that ThriftServer has started successfully, so HBase can now be accessed from multiple languages through Thrift.
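Besides jps, one quick way to confirm that the Thrift server is reachable is to try a TCP connection to its port. The sketch below assumes the server listens on localhost:9090, the address the Python client in this article connects to; is_port_open is a hypothetical helper name, and a successful connection only shows that something is listening, not that it is HBase.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, or timeout.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Assumption: the Thrift server uses port 9090, as in the client code
    # of this article.
    print(is_port_open("localhost", 9090))
```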
The following uses Python as an example to demonstrate how to access HBase.
sudo pip install thrift
sudo pip install hbase-thrift
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
from hbase.ttypes import *

# Note: this script uses Python 2 print statements.

# Connect to the HBase Thrift server.
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()

# Create a table with one column family.
contents = ColumnDescriptor(name='cf:', maxVersions=1)
# client.deleteTable('test')
client.createTable('test', [contents])
print client.getTableNames()

# insert data (the transport is already open, so it is not opened again)
row = 'row-key1'
mutations = [Mutation(column="cf:a", value="1")]
client.mutateRow('test', row, mutations)

# get one row
tableName = 'test'
rowKey = 'row-key1'
result = client.getRow(tableName, rowKey)
print result
for r in result:
    print 'the row is ', r.row
    print 'the values is ', r.columns.get('cf:a').value

# Close the connection when done.
transport.close()
['test']
[TRowResult(columns={'cf:a': TCell(timestamp=1488617173254, value='1')}, row='row-key1')]
the row is row-key1
the values is 1
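As the output shows, getRow returns a list of TRowResult objects whose columns attribute maps column names to TCell objects. For further processing it is often convenient to flatten this into plain dictionaries. The sketch below is one possible way to do that; rows_to_dicts is a hypothetical helper, and the two namedtuples are stand-ins for the Thrift-generated classes (with only the fields visible in the output above) so that the example runs without an HBase connection.

```python
from collections import namedtuple

# Stand-ins for the Thrift-generated classes, limited to the fields that
# appear in the printed result. With a live connection, the objects
# returned by client.getRow would be used instead.
TCell = namedtuple("TCell", ["value", "timestamp"])
TRowResult = namedtuple("TRowResult", ["row", "columns"])


def rows_to_dicts(results):
    """Map each row key to a {column name: cell value} dictionary."""
    return {
        r.row: {col: cell.value for col, cell in r.columns.items()}
        for r in results
    }


sample = [
    TRowResult(
        row="row-key1",
        columns={"cf:a": TCell(value="1", timestamp=1488617173254)},
    )
]
print(rows_to_dicts(sample))  # {'row-key1': {'cf:a': '1'}}
```

This drops the cell timestamps, which is fine when only the latest value matters; keep the TCell objects if the timestamps are needed.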