Striim for HDInsight
To set up Striim for Real-Time Data Integration to HDInsight:
- Go to the Azure Marketplace and search for Striim.
- Click Striim for Real-Time Data Integration to HDInsight.
- Click Get It Now.
- Enter the required Azure profile information and click Continue.
- At the bottom of the page, click Create.
- By default, the HDInsight cluster type is set to Hadoop. If you want to create an HDInsight Kafka cluster, click Cluster type, change the type from Hadoop to Kafka, and click Select. (Other cluster types are not supported.)
- Enter the cluster name, cluster admin password, and resource group name. Make note of these settings as they are required to manage the HDInsight cluster and Striim.
- Optionally, change the location, then click Next.
- Select an existing storage account or enter a name to create a storage account, then click Next.
- Click Striim for Real-time Data Integration > Review legal terms. Optionally, change the name, preferred email address, or preferred telephone number, then click Create > OK > Next.
- Optionally, edit the cluster configuration. Reducing the number of nodes will reduce the hourly Azure charges but may decrease performance. When done, click Create (for Hadoop) or Next > Next > Create (for Kafka). Deployment should begin within a few seconds.
You can follow the progress of deployment on the Notifications menu.
- When deployment completes successfully, click Applications.
- Click Portal.
- Click Continue.
- Enter your first and last name, your email address, a name and password for the Striim cluster (not the same as the HDInsight cluster), and a password for the Striim admin (not the same as the HDInsight admin). Make a note of the passwords, as you will need them to access Striim. Click Save and Continue.
- Leave the license key field blank and click Save and Continue.
- Click Launch. It will take a minute or so for Striim to start.
- When you see Log in, click it.
- Bookmark the URL for the login page, then enter admin as the username, provide the password you specified for the Striim admin (not the HDInsight admin), and click Log In.
Setup is complete.
At this point, if you are new to Striim, we recommend you open the documentation from the Help menu and follow the Quick Start, beginning with "Viewing dashboards."
The trial license will allow you to evaluate Striim for 30 days. If you wish to keep using your cluster after that time, contact Striim support.
To access the Striim server via ssh, go to the HDInsight cluster's Applications page, click Striim, and copy the ssh endpoint minus the final colon and port number. Then run
ssh sshuser@<ssh endpoint>
and, when prompted, provide the password you specified in step 6 above.
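The endpoint trimming described above can be sketched in Python. This is only an illustration, not part of Striim or Azure tooling: the function names and the sample endpoint are hypothetical, and the sketch assumes the copied endpoint is a host name followed by a colon and a numeric port.

```python
def strip_port(endpoint: str) -> str:
    """Return the endpoint without a trailing ':<port>' suffix, if present."""
    endpoint = endpoint.strip()
    host, sep, port = endpoint.rpartition(":")
    # Drop the suffix only when it really looks like a port number.
    if sep and port.isdigit():
        return host
    return endpoint


def ssh_command(endpoint: str, user: str = "sshuser") -> str:
    """Build the ssh command line used to reach the Striim server."""
    return f"ssh {user}@{strip_port(endpoint)}"


if __name__ == "__main__":
    # Hypothetical endpoint value, for illustration only.
    print(ssh_command("striim.example.net:22"))
```

Paste the printed command into a terminal; the password prompt comes from ssh itself, so the sketch never handles the password.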
Striim includes eight templates for HDInsight applications that read from files or from MySQL, Oracle, or SQL Server CDC logs and write to Hadoop or Kafka. Use only the templates that are compatible with your HDInsight cluster type (that is, do not try to use the Kafka templates with a Hadoop cluster or the Hadoop templates with a Kafka cluster). See "Creating apps using templates" in the online help or documentation PDF for detailed instructions.
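The compatibility rule above can be illustrated with a short Python sketch. It assumes that the eight templates are the four sources paired with each of the two targets; the labels it produces are illustrative and are not the template names shown in the Striim UI.

```python
from __future__ import annotations

# Sources and targets named in the text; pairing every source with every
# target (4 x 2 = 8) is an assumption made for this sketch.
SOURCES = ["files", "MySQL", "Oracle", "SQL Server"]
TARGETS = ["Hadoop", "Kafka"]


def compatible_templates(cluster_type: str) -> list[str]:
    """List illustrative template labels usable with the given cluster type."""
    if cluster_type not in TARGETS:
        raise ValueError(f"unsupported HDInsight cluster type: {cluster_type!r}")
    # Only templates that write to the cluster's own type are compatible.
    return [f"{source} to {cluster_type}" for source in SOURCES]


if __name__ == "__main__":
    for label in compatible_templates("Kafka"):
        print(label)
```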
For "to Hadoop" templates
- Authentication Policy: leave blank (will be handled automatically by Striim)
- File Name: in this release, the file name must start with a slash, for example, /myfile.csv (known issue DEV-13594)
- Flush Policy: may be left at the default value
- Hadoop Configuration Path: do not change
- Hadoop URL: leave blank (will be handled automatically by Striim)
- Rollover Policy: may be left at the default value
See "HDFSWriter" in the online help or documentation PDF for more information on these and other properties.
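If you generate File Name values in a script before entering them in a template, the leading-slash requirement noted above can be applied with a small helper. This is a minimal sketch; the function name is hypothetical and is not part of Striim.

```python
def normalize_file_name(name: str) -> str:
    """Return the name with a leading slash, as the File Name property requires."""
    name = name.strip()
    if not name:
        raise ValueError("file name must not be empty")
    # Add the slash only when it is missing, so valid names pass through unchanged.
    return name if name.startswith("/") else "/" + name


if __name__ == "__main__":
    print(normalize_file_name("myfile.csv"))
```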
For "to Kafka" templates
See "KafkaWriter" in the online help or documentation PDF for more information on these and other properties. See "Produce and consume records" on microsoft.com for instructions on reading records from the topic so that you can verify that your application is working.