AVRO Formatter
Formats a writer's output for use by Apache Avro and generates an Avro schema file. See Supported writer-formatter combinations.
property | type | default value | notes |
---|---|---|---|
Format As | String | default | Do not change the default value unless you are using a schema registry; see Using the Confluent or Hortonworks schema registry. |
Schema File Name | String | | The path and name of the Avro schema file Striim will create based on the type of the target's input stream. Be sure Striim has write permission for the specified directory. If no path is specified, the file is created in the Striim program directory. To generate the schema file, deploy the application. Then compile the schema file as directed in the Avro documentation for use in your application. |
Schema Registry URL | String | | Leave blank unless you are using a schema registry; see Using the Confluent or Hortonworks schema registry. |
The following sample application filters and parses part of PosApp's data and writes it to a file using AvroFormatter:
CREATE SOURCE CsvDataSource USING FileReader (
  directory:'Samples/PosApp/appData',
  wildcard:'PosDataPreview.csv',
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
)
OUTPUT TO CsvStream;

CREATE CQ CsvToPosData
INSERT INTO PosDataStream
SELECT TO_STRING(data[1]) as merchantId,
  TO_DATEF(data[4],'yyyyMMddHHmmss') as dateTime,
  DHOURS(TO_DATEF(data[4],'yyyyMMddHHmmss')) as hourValue,
  TO_DOUBLE(data[7]) as amount,
  TO_STRING(data[9]) as zip
FROM CsvStream;

CREATE TARGET AvroFileOut USING FileWriter(
  filename:'AvroTestOutput'
)
FORMAT USING AvroFormatter (
  schemaFileName:'AvroTestParsed.avsc'
)
INPUT FROM PosDataStream;
If you deploy the above application in the namespace avrons, AvroTestParsed.avsc is created with the following contents:
{"namespace": "PosDataStream_Type.avrons",
 "type" : "record",
 "name": "Typed_Record",
 "fields": [
  {"name" : "merchantId", "type" : "string"},
  {"name" : "dateTime", "type" : "string"},
  {"name" : "hourValue", "type" : "int"},
  {"name" : "amount", "type" : "double"},
  {"name" : "zip", "type" : "string"}
 ]
}
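Before compiling a generated schema, it can help to confirm that its fields match the target's input stream. The following Python sketch is not part of Striim; it uses only the standard json module, embeds the schema text shown above so that it is self-contained, and the helper name summarize_schema is hypothetical. In practice you would read the file from the path given in schemaFileName.

```python
import json

# Schema text copied from the AvroTestParsed.avsc contents shown above.
# Assumption: in a real check you would read this from the schema file instead.
SCHEMA_TEXT = """
{"namespace": "PosDataStream_Type.avrons",
 "type": "record",
 "name": "Typed_Record",
 "fields": [
  {"name": "merchantId", "type": "string"},
  {"name": "dateTime", "type": "string"},
  {"name": "hourValue", "type": "int"},
  {"name": "amount", "type": "double"},
  {"name": "zip", "type": "string"}
 ]
}
"""


def summarize_schema(schema_text):
    """Return the record's full name and a list of (field name, type) pairs."""
    schema = json.loads(schema_text)
    # In Avro, a record's full name is its namespace joined to its name.
    full_name = f"{schema['namespace']}.{schema['name']}"
    fields = [(field["name"], field["type"]) for field in schema["fields"]]
    return full_name, fields


full_name, fields = summarize_schema(SCHEMA_TEXT)
print(full_name)
for name, avro_type in fields:
    print(f"{name}: {avro_type}")
```

The field names and types printed should line up with the SELECT list of the CsvToPosData query.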
The following application simply writes the raw data in WAEvent format:
CREATE SOURCE CsvDataSource USING FileReader (
  directory:'Samples/PosApp/appData',
  wildcard:'PosDataPreview.csv',
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
)
OUTPUT TO CsvStream;

CREATE TARGET AvroFileOut USING FileWriter(
  filename:'AvroTestOutput'
)
FORMAT USING AvroFormatter (
  schemaFileName:'AvroTestRaw.avsc'
)
INPUT FROM CsvStream;
If you deploy the above application in the namespace avrons, AvroTestRaw.avsc is created with the following contents:
{"namespace": "WAEvent.avrons",
 "type" : "record",
 "name": "WAEvent_Record",
 "fields": [
  {"name" : "data",
   "type" : ["null" , { "type": "map","values":[ "null" , "string"] }] },
  {"name" : "before",
   "type" : ["null" , { "type": "map","values":[ "null" , "string"] }] },
  {"name" : "metadata",
   "type" : { "type": "map","values":"string" } }
 ]
}
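In the WAEvent schema, a field whose type is a JSON list is an Avro union, and a union that includes "null" means the field may be absent. The following Python sketch, which is not part of Striim and uses hypothetical helper names (is_nullable, describe_fields), reports which of the three fields are nullable. It embeds the schema text shown above rather than reading AvroTestRaw.avsc, so it runs on its own.

```python
import json

# Schema text copied from the AvroTestRaw.avsc contents shown above.
WAEVENT_SCHEMA_TEXT = """
{"namespace": "WAEvent.avrons",
 "type": "record",
 "name": "WAEvent_Record",
 "fields": [
  {"name": "data",
   "type": ["null", {"type": "map", "values": ["null", "string"]}]},
  {"name": "before",
   "type": ["null", {"type": "map", "values": ["null", "string"]}]},
  {"name": "metadata",
   "type": {"type": "map", "values": "string"}}
 ]
}
"""


def is_nullable(avro_type):
    """An Avro union is written as a JSON list; it is nullable if it lists "null"."""
    return isinstance(avro_type, list) and "null" in avro_type


def describe_fields(schema_text):
    """Map each top-level field name to whether its type allows null."""
    schema = json.loads(schema_text)
    return {field["name"]: is_nullable(field["type"]) for field in schema["fields"]}


nullable_by_field = describe_fields(WAEVENT_SCHEMA_TEXT)
for name, nullable in nullable_by_field.items():
    print(f"{name}: {'nullable' if nullable else 'required'}")
```

A consumer reading this output should therefore be prepared for data and before to be null, and for individual values inside those maps to be null as well, while metadata is always a map of strings.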
See Parsing the data field of WAEvent and Using the META() function for information about this format.