Replicating Oracle data to HBase

Striim provides a template for creating applications that read from Oracle and write to HBase. See Creating an application using a template for details.

The following sample application continuously replicates changes in the Oracle table MYSCHEMA.MYTABLE to the HBase table mytable, in the column family oracle_data:

CREATE SOURCE OracleCDCSource USING OracleReader ( 
  Username:'striim',
  Password:'passwd',
  ConnectionURL:'203.0.113.49:1521:orcl',
  Tables:'MYSCHEMA.MYTABLE',
  ReaderType: 'LogMiner',
  CommittedTransactions: false
)
OUTPUT TO DataStream;

CREATE TARGET HBaseTarget USING HBaseWriter(
  HBaseConfigurationPath:"/usr/local/HBase/conf/hbase-site.xml",
  Tables: 'MYSCHEMA.MYTABLE,mytable.oracle_data'
)
INPUT FROM DataStream;

Notes:

  • INSERT, UPDATE, and DELETE are supported.

  • UPDATE does not support changing a row's primary key.

  • If the Oracle table's primary key is a single column, the value of that column is used as the HBase rowkey. If the primary key is composed of multiple columns, their values are concatenated and used as the HBase rowkey.

  • Inserting a row with the same primary key as an existing row is treated as an update.

  • The Tables property values are case-sensitive.

The Tables value may map Oracle tables to HBase tables and column families in various ways:

  • one to one: "MYSCHEMA.MYTABLE,mytable.oracle_data"

  • many Oracle tables to one HBase table: "MYSCHEMA.MYTABLE1,mytable.oracle_data;MYSCHEMA.MYTABLE2,mytable.oracle_data"

  • many Oracle tables to one HBase table in different column families: "MYSCHEMA.MYTABLE1,mytable.family1;MYSCHEMA.MYTABLE2,mytable.family2"

  • many Oracle tables to many HBase tables: "MYSCHEMA.MYTABLE1,mytable1.oracle_data;MYSCHEMA.MYTABLE2,mytable2.oracle_data"
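
As a sketch of how one of these mappings fits into a complete target definition, the following uses the many-to-many form. It reuses the HBaseConfigurationPath and the DataStream input from the sample application; the target name HBaseMultiTarget is only illustrative, and it assumes the source's Tables property has also been extended to read both Oracle tables:

```
CREATE TARGET HBaseMultiTarget USING HBaseWriter(
  HBaseConfigurationPath:"/usr/local/HBase/conf/hbase-site.xml",
  Tables: "MYSCHEMA.MYTABLE1,mytable1.oracle_data;MYSCHEMA.MYTABLE2,mytable2.oracle_data"
)
INPUT FROM DataStream;
```

Each semicolon-separated entry pairs one Oracle table with an HBase table and column family, so the other mappings differ only in the Tables value.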