Apache HBase is free, open source, platform-independent software designed for situations where you need random, real-time read and write access to big data. Its main goal is to host very large tables (billions of rows by millions of columns) on clusters of commodity hardware.
Features at a glance
This project provides a distributed, versioned, scalable and column-oriented store, modeled after Google's Bigtable. Key features include linear and modular scalability, strictly consistent reads and writes, automatic and configurable sharding of tables, automatic failover support between RegionServers, an extensible JRuby-based (JIRB) shell, a RESTful web service, a Thrift gateway, and an easy-to-use Java API for client access.
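The "versioned, column-oriented" description can be made concrete with a toy model. The sketch below is not the HBase client API: `VersionedTableSketch` and all of its method names are hypothetical, chosen only for illustration. It treats a table as a sorted map from row key to column to timestamp to value, and it ignores distribution, persistence and sharding entirely.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy in-memory model of a versioned, column-oriented table.
// Hypothetical illustration only; this is NOT the HBase client API.
class VersionedTableSketch {
    // row key -> "family:qualifier" -> timestamp (newest first) -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // Store one version of a cell; older versions of the same cell are kept.
    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<>(Comparator.<Long>reverseOrder()))
            .put(timestamp, value);
    }

    // Newest version of a cell, or null if the cell does not exist.
    String get(String row, String column) {
        NavigableMap<Long, String> versions = versions(row, column);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    // Newest version written at or before the given timestamp, or null.
    String getAsOf(String row, String column, long timestamp) {
        NavigableMap<Long, String> versions = versions(row, column);
        if (versions == null) {
            return null;
        }
        // Keys are sorted newest first, so ceilingEntry yields the
        // largest timestamp that is <= the requested one.
        Map.Entry<Long, String> entry = versions.ceilingEntry(timestamp);
        return entry == null ? null : entry.getValue();
    }

    // Row keys in sorted order from startRow (inclusive) to stopRow (exclusive).
    List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }

    private NavigableMap<Long, String> versions(String row, String column) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }
}
```

Because rows are kept sorted by key, a range scan is a cheap sub-map view, and because each cell keeps its timestamps, a read can ask for the latest value or for the value as of an earlier point in time. A sparse table costs nothing for missing cells, since absent columns are simply not stored.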
In addition, the application supports query predicate pushdown through server-side filters, provides Bloom filters and a block cache to speed up real-time queries, offers convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables, and can export metrics via JMX (Java Management Extensions) or the Hadoop metrics subsystem.
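To show why a Bloom filter helps real-time reads, here is a minimal sketch of the general technique, not HBase's own implementation. The class name `BloomFilterSketch`, its parameters, and the choice of hash functions (Java's `String.hashCode` combined with FNV-1a) are all assumptions made for this example. The key property is that a "no" answer is always correct, so a store can skip data files that certainly do not contain a requested key.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Minimal Bloom filter sketch. Hypothetical illustration of the general
// technique; it does not reproduce HBase's internal Bloom filter code.
class BloomFilterSketch {
    private final BitSet bits;
    private final int size;       // number of bits in the filter
    private final int hashCount;  // number of probes per key

    BloomFilterSketch(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.size = size;
        this.hashCount = hashCount;
        this.bits = new BitSet(size);
    }

    // Set every bit position derived from the key.
    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    // false: the key was definitely never added.
    // true:  the key was probably added (false positives are possible).
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Derive the i-th probe position from two base hashes (double hashing).
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = fnv1a(key.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(h1 + i * h2, size);
    }

    // 32-bit FNV-1a hash, used here as the second, independent hash.
    private static int fnv1a(byte[] data) {
        int hash = 0x811C9DC5;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 16777619;
        }
        return hash;
    }
}
```

A caller would add every row key written to a file and, on a read, consult `mightContain` first: a `false` result means the file can be skipped without touching disk, while a `true` result still requires a real lookup. Larger `size` and a suitable `hashCount` lower the false-positive rate at the cost of memory.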
Part of the Apache Software Foundation
Apache HBase is designed to provide Bigtable-like capabilities on top of Apache Hadoop and HDFS (Hadoop Distributed File System). It is distributed as a standalone application and is developed as a project of the Apache Software Foundation.
Modeled after Google's Bigtable
As mentioned, Apache HBase is modeled after Google's Bigtable, a distributed storage system for structured data described by Chang et al.
Under the hood and supported OSes
The application is written in the Java programming language and is distributed as both a universal source archive and a pre-built binary package. At the moment, it has been tested on several GNU/Linux operating systems and supports both 32-bit and 64-bit hardware platforms.