HBase Architecture: HBase Data Model & HBase Read/Write Mechanism

HBase Architecture

In my previous blog on HBase Tutorial, I explained what HBase is and its features. I also covered the Facebook Messenger case study to help you connect better. Moving ahead in our Hadoop Tutorial Series, I will now explain the data model of HBase and the HBase architecture, so you can understand what makes HBase so popular.

The important topics that I will be taking you through in this HBase architecture blog are:

HBase Data Model
HBase Architecture and its Components
HBase Write Mechanism
HBase Read Mechanism
HBase Performance Optimization Mechanisms

Let us first understand the data model of HBase. It helps HBase achieve faster reads, writes, and searches.

HBase Architecture: HBase Data Model

As we know, HBase is a column-oriented NoSQL database. Although it looks similar to a relational database, with rows and columns, it is not a relational database. Relational databases are row-oriented, while HBase is column-oriented. So, let us first understand the difference between column-oriented and row-oriented databases:

Row-oriented vs column-oriented Databases:

Row-oriented databases store table records in a sequence of rows, whereas column-oriented databases store table records in a sequence of columns, i.e. the entries in a column are stored in contiguous locations on disk.

To better understand it, let us take an example and consider the table below.

[Table image: two sample records with five fields each]

If this table is stored in a row-oriented database, the records are stored as shown below:

1, Paul Walker, US, 231, Gallardo

2, Vin Diesel, Brazil, 520, Mustang

In row-oriented databases, data is stored on the basis of rows or tuples, as you can see above.

Column-oriented databases, in contrast, store this data as:

1, 2, Paul Walker, Vin Diesel, US, Brazil, 231, 520, Gallardo, Mustang

In a column-oriented database, all the values of a column are stored together: the first column’s values are stored together, then the second column’s values, and the remaining columns follow in the same manner.
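To make the two layouts concrete, here is a minimal Python sketch that flattens the two sample records both ways. The function names `row_oriented` and `column_oriented` are made up for this illustration, and a real database stores bytes on disk rather than Python lists, so treat it only as a picture of the ordering.

```python
# The two sample records from the example above.
records = [
    (1, "Paul Walker", "US", 231, "Gallardo"),
    (2, "Vin Diesel", "Brazil", 520, "Mustang"),
]

def row_oriented(rows):
    """Flatten record by record: every field of row 1, then every field of row 2."""
    return [value for row in rows for value in row]

def column_oriented(rows):
    """Flatten column by column: all first fields, then all second fields, and so on."""
    return [value for column in zip(*rows) for value in column]

print(row_oriented(records))
# [1, 'Paul Walker', 'US', 231, 'Gallardo', 2, 'Vin Diesel', 'Brazil', 520, 'Mustang']
print(column_oriented(records))
# [1, 2, 'Paul Walker', 'Vin Diesel', 'US', 'Brazil', 231, 520, 'Gallardo', 'Mustang']
```

The second output matches the column-oriented sequence listed above: values that belong to the same column end up next to each other.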

When the amount of data is huge, in the range of petabytes or exabytes, we use the column-oriented approach, because the data of a single column is stored together and can be accessed faster. The row-oriented approach, by comparison, efficiently handles a smaller number of rows and columns, as a row-oriented database stores data in a structured format. When we need to process and analyze a large set of semi-structured or unstructured data, we use the column-oriented approach, for example in Online Analytical Processing applications such as data mining, data warehousing, and analytics. Online Transactional Processing, on the other hand, as found in the banking and finance domains, handles structured data, requires transactional (ACID) properties, and uses the row-oriented approach.
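A rough way to see why analytical queries favor the column-oriented layout is to look at where the values of one column sit in each flattened sequence. The sketch below is only an illustration in Python: the helper names are hypothetical, and list positions stand in for disk locations.

```python
# The two sample records used earlier in this blog.
records = [
    (1, "Paul Walker", "US", 231, "Gallardo"),
    (2, "Vin Diesel", "Brazil", 520, "Mustang"),
]
NUM_ROWS = len(records)
NUM_COLS = len(records[0])

row_layout = [value for row in records for value in row]
col_layout = [value for column in zip(*records) for value in column]

def positions_in_row_layout(col_index):
    """Where one column's values sit when records are stored row by row."""
    return [r * NUM_COLS + col_index for r in range(NUM_ROWS)]

def positions_in_col_layout(col_index):
    """Where one column's values sit when records are stored column by column."""
    start = col_index * NUM_ROWS
    return list(range(start, start + NUM_ROWS))

# Sum the fourth field (index 3), the only numeric non-id column in the sample.
print(positions_in_row_layout(3))  # [3, 8]  scattered, one jump per record
print(positions_in_col_layout(3))  # [6, 7]  one contiguous run
print(sum(col_layout[i] for i in positions_in_col_layout(3)))  # 751
```

With two records the gap is tiny, but in the row layout the distance between consecutive values of the same column grows with the number of columns, while in the column layout they always stay adjacent.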

HBase tables have the following components, shown in the image below:

Tables: Data is stored in a table format in HBase, but here the tables are in a column-oriented format.
Row Key: Row keys are used to search records, which makes searches fast. You may be curious to know how; I will explain it in the architecture part later in this blog.
Column Families: Various columns are combined in a column family. The columns of a column family are stored together, which makes searching faster because data belonging to the same column family can be accessed together in a single seek.
Column Qualifiers: Each column’s name is known as its column qualifier.
Cell: Data is stored in cells. Each cell is uniquely identified by its row key, column family, and column qualifier.
Timestamp: A timestamp is a combination of date and time. Whenever data is stored, it is stored with its timestamp, which makes it easy to search for a particular version of the data.

Put more simply, we can say HBase consists of:

A set of tables
Each table with column families and rows
A row key that acts as the primary key in HBase; any access to HBase tables uses this primary key
Column qualifiers, each of which denotes an attribute of the object that resides in the cell
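One way to picture this data model is as a nested map: row key, then column family and qualifier, then timestamp, then value. The Python sketch below is a toy in-memory stand-in and not the HBase client API. The class `ToyTable`, its methods, the family names `personal` and `car`, and the integer timestamps are all assumptions made for illustration.

```python
from collections import defaultdict

class ToyTable:
    """Toy in-memory model of an HBase-style table (hypothetical, not the HBase API)."""

    def __init__(self, column_families):
        # Column families are fixed up front; qualifiers can be added freely.
        self.column_families = set(column_families)
        # row key -> (family, qualifier) -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, qualifier, value, timestamp):
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row_key][(family, qualifier)][timestamp] = value

    def get(self, row_key, family, qualifier, timestamp=None):
        """Return the newest version, or the newest one at or before `timestamp`."""
        versions = self._rows.get(row_key, {}).get((family, qualifier), {})
        candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
        if not candidates:
            return None
        return versions[max(candidates)]

    def row_keys(self):
        """Row keys in sorted order."""
        return sorted(self._rows)

table = ToyTable(["personal", "car"])
table.put("2", "personal", "name", "Vin Diesel", timestamp=100)
table.put("1", "personal", "name", "Paul Walker", timestamp=100)
table.put("1", "car", "model", "Gallardo", timestamp=100)
table.put("1", "car", "model", "Mustang", timestamp=200)  # a newer version of the same cell

print(table.get("1", "car", "model"))                 # Mustang
print(table.get("1", "car", "model", timestamp=150))  # Gallardo
print(table.row_keys())                               # ['1', '2']
```

Every lookup starts from the row key, and the timestamp picks which version of a cell comes back, which mirrors the roles of the row key and timestamp described above.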

Now that you know about the HBase data model, let us see how this data model fits in with the HBase architecture and makes it suitable for large-scale storage and faster processing.


HBase Architecture: Components of HBase Architecture

HBase has three major server components, i.e., HMaster Server, HBase Region Server, and Zookeeper, which together manage Regions.

The figure below explains the hierarchy of the HBase architecture. We will talk about each of its components individually.

Before going to the HMaster, we will first understand Regions, as all these servers (HMaster, Region Server, Zookeeper) are there to coordinate and manage Regions and to perform various operations inside them. So you may be curious to know what Regions are and why they are so important.

HBase Architecture: Region

A region contains all the rows between the start key and the end key assigned to that region. HBase tables can be divided into a number of regions in such a way that all the columns of a column family are stored in one region. Each region contains its rows in sorted order.

Many regions are assigned to a Region Server, which is responsible for handling, managing, and executing read and write operations on that set of regions.
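Because each region covers a sorted range of row keys, finding the region for a given row key is essentially a binary search over the region start keys. The Python sketch below illustrates that idea only: `ToyRegionMap`, the start keys, and the server names `rs1` and `rs2` are hypothetical, and it assumes a region includes its start key and excludes its end key.

```python
import bisect

class ToyRegionMap:
    """Toy lookup from row key to (region start key, region server). Not the HBase API."""

    def __init__(self, regions):
        # regions: list of (start_key, server_name); "" marks the start of the table.
        self._regions = sorted(regions)
        self._start_keys = [start for start, _ in self._regions]
        if not self._start_keys or self._start_keys[0] != "":
            raise ValueError("the first region must start at the empty key")

    def locate(self, row_key):
        """Return the region whose key range contains row_key."""
        # Index of the last start key that is <= row_key.
        index = bisect.bisect_right(self._start_keys, row_key) - 1
        return self._regions[index]

# Four regions spread over two region servers (all names are made up).
region_map = ToyRegionMap([("", "rs1"), ("g", "rs2"), ("n", "rs1"), ("t", "rs2")])

print(region_map.locate("apple"))  # ('', 'rs1')
print(region_map.locate("g"))      # ('g', 'rs2')  start key is inclusive
print(region_map.locate("zebra"))  # ('t', 'rs2')
```

One region server appears more than once in the map, matching the point that many regions are assigned to a single Region Server.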

So, concluding in a simpler way:

A table can be divided into a number of regions.
A Region is a sorted range of rows storing data between a start key and an end key.
A Region has a default size of 256MB which can be configured accor
