HBase Writer

Writes to an HBase database.

The following describes use of HBaseWriter with an input stream of a user-defined type. For use with a CDC reader or DatabaseReader input stream of type WAEvent, see Replicating Oracle data to HBase.

For information on which firewall ports must be open, see the hbase-site.xml file specified by HBaseConfigurationPath.

HBase Writer properties

| property | type | default value | notes |
|---|---|---|---|
| Authentication Policy | String | | If the target HBase instance is unsecured, leave this blank. If it uses Kerberos authentication, provide credentials in the format Kerberos, Principal:<Kerberos principal name>, KeytabPath:<fully qualified keytab file name>. For example: authenticationpolicy:'Kerberos, Principal:nn/ironman@EXAMPLE.COM, KeytabPath:/etc/security/keytabs/nn.service.keytab' |
| Batch Policy | String | eventCount:1000, Interval:30 | The batch policy includes eventCount and interval (see Setting output names and rollover / upload policies for syntax). Events are buffered locally on the Striim server and sent as a batch to the target every time either of the specified values is exceeded. When the app is stopped, any remaining data in the buffer is discarded. To disable batching, set to EventCount:1,Interval:0. With the default setting, data is written every 30 seconds, or sooner if the buffer accumulates 1000 events. |
| HBase Configuration Path | String | | Fully qualified name of the hbase-site.xml file. If necessary, contact your HBase administrator to obtain a valid copy of the file, or to mark the host as a Hadoop client so that the file is distributed automatically. |
| PK Update Handling Mode | String | ERROR | With the default value, when the input stream contains an update to a primary key, the application stops and its status is ERROR. With the value IGNORE, primary key update events are ignored and the application continues. To support primary key updates, set to DELETEANDINSERT. The Compression property in the CDC source reader must be set to False. |
| Parallel Threads | Integer | | See Creating multiple writer instances. |
| Tables | String | | The name of the HBase table to write to and its column family, in the format <table name>.<column family name> (case-sensitive; multiple tables are supported only as described in Replicating Oracle data to HBase). The input stream's type must define a key field. |
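
As a rough sketch, the properties above might be combined in a single target definition as follows. The property names AuthenticationPolicy and BatchPolicy are assumptions: they are the property labels with spaces removed, following the way HBaseConfigurationPath and authenticationpolicy are written elsewhere on this page. The target name WriteToSecureHBase and the batch values are hypothetical; the path, table, Kerberos values, and stream name are borrowed from the examples on this page.

```sql
-- Sketch only: AuthenticationPolicy and BatchPolicy are assumed property
-- names (the property labels with spaces removed); verify before use.
CREATE TARGET WriteToSecureHBase USING HBaseWriter (
  HBaseConfigurationPath:'/usr/local/HBase/conf/hbase-site.xml',
  Tables:'posdata.striim',
  -- Kerberos credentials in the documented format; omit for an unsecured instance.
  AuthenticationPolicy:'Kerberos, Principal:nn/ironman@EXAMPLE.COM, KeytabPath:/etc/security/keytabs/nn.service.keytab',
  -- Hypothetical values: send a batch when 500 events or 10 seconds is exceeded.
  BatchPolicy:'EventCount:500, Interval:10'
)
INPUT FROM PosDataStream;
```

Smaller batch values shorten the delay before data is sent to HBase, at the cost of more frequent writes. Because buffered events are discarded when the app is stopped, a smaller batch also limits how much data can be lost at that point.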

HBase Writer sample application

The following TQL writes to the HBase table posdata, using the column family striim:

CREATE SOURCE PosSource USING FileReader (
  directory:'Samples/PosApp/AppData',
  wildcard:'PosDataPreview.csv',
  positionByEOF:false
)
PARSE USING DSVParser (
  header:yes
)
OUTPUT TO RawStream;

CREATE TYPE PosData(
  merchantId String KEY, 
  dateTime DateTime, 
  amount Double, 
  zip String
);
CREATE STREAM PosDataStream OF PosData;

CREATE CQ CsvToPosData
INSERT INTO PosDataStream 
SELECT TO_STRING(data[1]), 
       TO_DATEF(data[4],'yyyyMMddHHmmss'),
       TO_DOUBLE(data[7]),
       TO_STRING(data[9])
FROM RawStream;

CREATE TARGET WriteToHBase USING HBaseWriter(
  HBaseConfigurationPath:"/usr/local/HBase/conf/hbase-site.xml",
  Tables: 'posdata.striim'
)
INPUT FROM PosDataStream;

In HBase, the merchantId, dateTime, amount, and zip columns are created automatically under the striim column family if they do not already exist.