Striim 3.9.7 documentation

AVROFormatter

Formats a writer's output for use by Apache Avro and generates an Avro schema file. See Supported writer-formatter combinations.

| property | type | default value | notes |
|---|---|---|---|
| formatAs | java.lang.String | default | Do not change the default value unless you are using a schema registry (see Using the Confluent or Hortonworks schema registry). |
| schemaFileName | java.lang.String | | A string specifying the path and name of the Avro schema file Striim will create based on the type of the target's input stream. (Be sure Striim has write permission for the specified directory.) If no path is specified, the file will be created in the Striim program directory. To generate the schema file, deploy the application. Then compile the schema file as directed in the Avro documentation for use in your application. |
| schemaregistryurl | java.lang.String | | Leave blank unless you are using a schema registry (see Using the Confluent or Hortonworks schema registry). |

The following sample application filters and parses part of PosApp's data and writes it to a file using AvroFormatter:

CREATE SOURCE CsvDataSource USING FileReader (
  directory:'Samples/PosApp/appData',
  wildcard:'PosDataPreview.csv',
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
) OUTPUT TO CsvStream;

CREATE CQ CsvToPosData
INSERT INTO PosDataStream
SELECT TO_STRING(data[1]) as merchantId,
  TO_DATEF(data[4],'yyyyMMddHHmmss') as dateTime,
  DHOURS(TO_DATEF(data[4],'yyyyMMddHHmmss')) as hourValue,
  TO_DOUBLE(data[7]) as amount,
  TO_STRING(data[9]) as zip
FROM CsvStream;

CREATE TARGET AvroFileOut USING FileWriter(
  filename:'AvroTestOutput'
)
FORMAT USING AvroFormatter (
  schemaFileName:'AvroTestParsed.avsc'
)
INPUT FROM PosDataStream;

If you deploy the above application in the namespace avrons, AvroTestParsed.avsc is created with the following contents:

{"namespace": "PosDataStream_Type.avrons",
  "type" : "record",
  "name": "Typed_Record",
  "fields": [
{"name" : "merchantId", "type" : "string"},
{"name" : "dateTime", "type" : "string"},
{"name" : "hourValue", "type" : "int"},
{"name" : "amount", "type" : "double"},
{"name" : "zip", "type" : "string"}
 ]
}
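Before compiling the generated schema, it can help to confirm that it contains the fields you expect. The following is a minimal sketch using only Python's standard json module, not part of Striim or Avro tooling. The schema text is copied from the listing above, and the helper name summarize_schema is hypothetical. It relies on the Avro convention that a record's full name is its namespace followed by a dot and its name.

```python
import json

# Schema text copied from the AvroTestParsed.avsc listing above.
# In practice you would read it from the file Striim created.
SCHEMA_TEXT = """
{"namespace": "PosDataStream_Type.avrons",
  "type" : "record",
  "name": "Typed_Record",
  "fields": [
{"name" : "merchantId", "type" : "string"},
{"name" : "dateTime", "type" : "string"},
{"name" : "hourValue", "type" : "int"},
{"name" : "amount", "type" : "double"},
{"name" : "zip", "type" : "string"}
 ]
}
"""


def summarize_schema(schema_text):
    """Return (full record name, {field name: Avro type}) for a record schema.

    Hypothetical helper for illustration only.
    """
    schema = json.loads(schema_text)
    # Avro full name: namespace, a dot, then the record name.
    full_name = f"{schema['namespace']}.{schema['name']}"
    fields = {field["name"]: field["type"] for field in schema["fields"]}
    return full_name, fields


full_name, fields = summarize_schema(SCHEMA_TEXT)
print(full_name)
for name, avro_type in fields.items():
    print(f"{name}: {avro_type}")
```

Note that in the generated schema the dateTime field, produced by TO_DATEF in the query, appears with the Avro type string.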

The following application simply writes the raw data in WAEvent format:

CREATE SOURCE CsvDataSource USING FileReader (
  directory:'Samples/PosApp/appData',
  wildcard:'PosDataPreview.csv',
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
) OUTPUT TO CsvStream;

CREATE TARGET AvroFileOut USING FileWriter(
  filename:'AvroTestOutput'
)
FORMAT USING AvroFormatter (
  schemaFileName:'AvroTestRaw.avsc'
)
INPUT FROM CsvStream;

If you deploy the above application in the namespace avrons, AvroTestRaw.avsc is created with the following contents:

{"namespace": "WAEvent.avrons",
  "type" : "record",
  "name": "WAEvent_Record",
  "fields": [
    {"name" : "data",
    "type" : ["null" , { "type": "map","values":[ "null" , "string"] }]
     },
    {"name" : "before",
    "type" : ["null" , { "type": "map","values":[ "null" , "string"] }]
    },
    {"name" : "metadata",
     "type" : { "type": "map","values":"string" }
    }
  ]
}

See Parsing the data field of WAEvent and Using the META() function for information about this format.
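In the WAEvent schema above, the data and before fields are declared as unions that include "null", while metadata is a plain map. In Avro, a union containing "null" means the field's value may be absent. The following is a minimal sketch, using only Python's standard json module, that reports which fields a consumer should expect to be nullable. The schema text is copied from the AvroTestRaw.avsc listing, and the helper names is_nullable and describe_fields are hypothetical.

```python
import json

# Schema text copied from the AvroTestRaw.avsc listing above.
SCHEMA_TEXT = """
{"namespace": "WAEvent.avrons",
  "type" : "record",
  "name": "WAEvent_Record",
  "fields": [
    {"name" : "data",
    "type" : ["null" , { "type": "map","values":[ "null" , "string"] }]
     },
    {"name" : "before",
    "type" : ["null" , { "type": "map","values":[ "null" , "string"] }]
    },
    {"name" : "metadata",
     "type" : { "type": "map","values":"string" }
    }
  ]
}
"""


def is_nullable(avro_type):
    """True when the type is a union (a JSON list) that includes "null"."""
    return isinstance(avro_type, list) and "null" in avro_type


def describe_fields(schema_text):
    """Map each field name to whether its declared type allows null.

    Hypothetical helper for illustration only.
    """
    schema = json.loads(schema_text)
    return {field["name"]: is_nullable(field["type"]) for field in schema["fields"]}


nullable = describe_fields(SCHEMA_TEXT)
for name, allows_null in nullable.items():
    print(f"{name}: {'nullable' if allows_null else 'required'}")
```

Code that reads these records should therefore check data and before for null before looking up keys in them.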