Using pattern matching

You can use pattern matching in your queries to specify multiple combinations of events in a single MATCH_PATTERN expression. Pattern expressions consist of references to stream and timer events, according to this syntax:

CREATE CQ <CQ name>
INSERT INTO <output stream>
SELECT <list of expressions>
FROM <data source> [ <data source alias>] [, <data source> [ <data source alias> ], ...]
MATCH_PATTERN <pattern>
DEFINE <pattern variable> = [<data source or alias name>] '(' <predicate> ')'
[ PARTITION BY <data source field name> ];

In this syntax, <data source> is a stream or timer event.

The MATCH_PATTERN clause represents groups of events along with condition variables. New events are referenced in the predicates, where event variables are defined. Condition variables are expressions used to check additional conditions, such as timer events, before continuing to match the pattern.

The pattern matching expression syntax uses the following notation:

Notation    Description
*           0 or more occurrences (quantifier)
?           0 or 1 occurrence (quantifier)
+           1 or more occurrences (quantifier)
{m,n}       m to n occurrences (quantifier)
|           logical OR (alternative)
&           logical AND (permutation)
( )         grouping
#           restart marker for matching overlapping patterns

For example, the pattern A B* C+ would contain exactly 1 new event matching A, 0 or more new events matching B, and 1 or more new events matching C.
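The quantifier semantics can be illustrated outside the engine. The following Python sketch is only an analogy, not the engine's implementation: it assumes every incoming event has already been classified as A, B, or C, encodes the sequence as a string with one letter per event, and uses a regular expression in place of the pattern. The name matches_pattern is hypothetical.

```python
import re

# Analogy only: one letter per event, in arrival order.
PATTERN = re.compile(r"AB*C+")


def matches_pattern(labels: str) -> bool:
    """Return True if the whole label sequence fits A B* C+."""
    return PATTERN.fullmatch(labels) is not None
```

Under this analogy, the sequences "AC" and "ABBBCC" fit the pattern, while "AB" (no C) and "AAC" (two A events) do not.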

NOTE: You cannot refer to new events in conditions or expressions contained in the SELECT clause.

The DEFINE clause defines a list of event variables and conditions. The conditions are described in the predicates associated with the event variables.

For example, A = S(sensor < 100) matches events on stream S where the sensor attribute is less than 100.

The optional PARTITION BY clause partitions the stream by key on sub-streams, and pattern matching is performed independently for every partition.
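As a rough illustration of independent matching per partition, the Python sketch below splits an interleaved event list by a key and checks a pattern against each sub-stream separately. It is an analogy under stated assumptions, not the engine's implementation: events are plain dictionaries, the label function and its "L"/"H" classes are hypothetical, and a regular expression stands in for the pattern.

```python
import re
from collections import defaultdict


def label(event):
    # Hypothetical classification: "L" for sensor < 100, "H" otherwise.
    return "L" if event["sensor"] < 100 else "H"


def match_per_partition(events, key, pattern):
    """Build one label sequence per key value and test each independently."""
    sequences = defaultdict(str)
    for event in events:
        sequences[event[key]] += label(event)
    return {k: re.search(pattern, seq) is not None
            for k, seq in sequences.items()}


# Interleaved events from two partitions (eventId 1 and 2).
events = [
    {"eventId": 1, "sensor": 50},
    {"eventId": 2, "sensor": 200},
    {"eventId": 1, "sensor": 150},
    {"eventId": 2, "sensor": 210},
    {"eventId": 1, "sensor": 60},
]
result = match_per_partition(events, "eventId", r"LH+L")
```

Here partition 1 produces the sequence low, high, low and matches, while partition 2 contains only high readings and does not, regardless of how the two partitions are interleaved in the input.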

Let's examine this example:

CREATE CQ 
SELECT LIST(B.id) as eventValues, count(B.id) as eventCount, A.id as eventTime 
FROM eventStream S
MATCH_PATTERN T A W B* P C
DEFINE
  A = S(sensor < 100), 
  B = S(sensor >= 100),
  C = S(sensor < 100),
  T = TIMER(interval 60 second),
  W = STOP(T), 
  P  = (count(B) > 5)
PARTITION BY eventId;

The pattern of event variables in the MATCH_PATTERN clause (T A W B* P C) represents the following:

  • T A W: Event A is matched within 60 seconds. Since this pattern ends with W, subsequent events (B, P, C) can only be matched after the first 60 seconds have elapsed.

  • B* P C: After the first 60 seconds, the match pattern consists of 0 or more instances of event B, followed by event P, followed by event C.

Event Variables

Once you have established event variables, you can refer to groups of matched events using 0-based array index notation.

For example, B[3].sensor refers to the 4th event in group B. If you omit the index (for example, B.sensor), the last event in the group is returned.

If you pass event variables to built-in or user-defined functions, the compiler will generate code differently depending on whether a scalar or aggregating function is used.

In this example, the expression aggregates all sensor attributes for all matched events in event variable B.

avg(B.sensor)

In this example, the expression evaluates to the absolute value of the sensor attribute of the last matched event in event variable B:

abs(B.sensor)
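The difference between indexed access, aggregate functions, and scalar functions can be sketched in Python. This is an illustration only: the EventGroup class is a hypothetical stand-in for the events matched by one event variable, and the sensor values are made up.

```python
from statistics import mean


class EventGroup:
    """Hypothetical stand-in for the events matched by one event variable."""

    def __init__(self, events):
        self.events = list(events)

    def __getitem__(self, index):
        # 0-based access, like B[3]
        return self.events[index]

    def last(self):
        # What an un-indexed reference such as B.sensor resolves to
        return self.events[-1]


B = EventGroup([{"sensor": 100}, {"sensor": 110},
                {"sensor": 120}, {"sensor": -130}])

fourth = B[3]["sensor"]                          # like B[3].sensor
avg_all = mean(e["sensor"] for e in B.events)    # like avg(B.sensor): all events
abs_last = abs(B.last()["sensor"])               # like abs(B.sensor): last event only
```

The aggregate call consumes every event in the group, whereas the scalar call sees only the last matched event.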

Referring to Past Events

If a match on the next event depends on the value of an attribute in a previous event, use the PREV() built-in function. Its only parameter is an integer constant indicating how far back to look in the event pattern; the default is 1. PREV() or PREV(1) refers to the immediately preceding event, and PREV(2) refers to the event before that one. For example:

DEFINE
-- Compare the new event's sensor with the previous event's sensor.
-- The default index value of 1 is used.
A=streamx(sensor < PREV().sensor),
-- Compare the new event's sensor with the sensor of the event before the previous event.
B=streamx(sensor < PREV(2).sensor)
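The look-back behavior can be modeled with a small Python sketch. It is an illustration under assumptions, not the engine's implementation: history is a list of events in arrival order whose last element is the new event, and the helper names prev and falls_below_prev are hypothetical.

```python
def prev(history, n=1):
    """Return the event n positions before the newest one, or None."""
    index = len(history) - 1 - n
    return history[index] if index >= 0 else None


def falls_below_prev(history, n=1):
    """True if the newest sensor value is below the one n events back."""
    earlier = prev(history, n)
    return earlier is not None and history[-1]["sensor"] < earlier["sensor"]


history = [{"sensor": 30}, {"sensor": 20}, {"sensor": 25}]
```

With this history, the newest reading (25) is not below the immediately preceding reading (20), but it is below the reading two events back (30). How the engine treats a look-back that reaches before the first event is not covered here; the sketch simply treats it as no match.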

Timer Events

You can set timer events using the following functions:

Function                                      Description
timer(interval <int> {second|minute|hour})    A condition that does not wait for a new event from any stream. When the timer expires, it sends a signal as an event from the timer stream.
signal( variable )                            The next event is expected from the timer defined by the specified variable.
stop( variable )                              Stops the timer and cancels sending the timer event.

For example, this pattern matches no events for a period of 50 seconds:

PATTERN T W
DEFINE
T=timer(interval 50 second),
W=signal(T)

This pattern matches all events from streamA for 30 seconds (until event W is received from the timer):

PATTERN T A* W -- matching all events from streamA for 30 seconds
DEFINE
T=timer(interval 30 second),
A=streamA,
W=signal(T)

This pattern matches an event from streamA within 30 seconds. If an event from streamA does not occur within 30 seconds, an event from the timer will be received causing the pattern matching to fail:

PATTERN T A -- matching event from streamA within 30 seconds
DEFINE
T=timer(interval 30 second),
A=streamA
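The two timer patterns above (T A* W and T A) can be approximated in Python with timestamped events. This is a simplified model, not the engine's implementation: events are assumed to be dictionaries with a numeric ts field in seconds, sorted by arrival time, and an event arriving exactly when the timer fires is treated as too late. The function names are hypothetical.

```python
def collect_until_timer(events, start, interval):
    """Like T A* W: every event that arrives before the timer fires."""
    deadline = start + interval
    return [e for e in events if start <= e["ts"] < deadline]


def first_within_timer(events, start, interval):
    """Like T A: the first event before the timer fires, else None."""
    window = collect_until_timer(events, start, interval)
    return window[0] if window else None


stream_a = [{"ts": 5}, {"ts": 12}, {"ts": 31}, {"ts": 40}]
in_window = collect_until_timer(stream_a, start=0, interval=30)
first = first_within_timer(stream_a, start=0, interval=30)
late_only = first_within_timer([{"ts": 31}], start=0, interval=30)
```

In this model a None result corresponds to the timer event arriving first, which is the case where the T A pattern fails to match.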

This pattern matches an event from streamA within 30 seconds and stops the first timer. It then starts a second timer and matches an event from streamB within the following 30 seconds:

PATTERN T A C T2 B -- matching event A for 30 seconds and then event B within 30 seconds
DEFINE
T=timer(interval 30 second),
A=streamA,
C=stop(T),
T2=timer(interval 30 second),
B=streamB

Alternation ( | )

To specify several variations of an event sequence, use the alternation operator ( | ). If the alternation expression contains equivalent variations, the first (leftmost) variation is matched. For example, the following pattern matches A but never AA, because AA has the same definition as A:

PATTERN (A|AA|B|C)
DEFINE
A=streamA(sensor between 10 and 20),
AA=streamA(sensor between 10 and 20), -- same as A, so it will never be matched
B=streamA(sensor > 20),
C=streamA(sensor < 10)
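The leftmost-first rule can be sketched in Python as an ordered list of predicates. This is an illustration only, not the engine's implementation; the ALTERNATIVES list and the first_alternative helper are hypothetical, and the predicates mirror the DEFINE clause above on a bare sensor value.

```python
# Alternatives in the order they appear in the pattern (A|AA|B|C).
ALTERNATIVES = [
    ("A", lambda sensor: 10 <= sensor <= 20),
    ("AA", lambda sensor: 10 <= sensor <= 20),  # same predicate as A
    ("B", lambda sensor: sensor > 20),
    ("C", lambda sensor: sensor < 10),
]


def first_alternative(sensor):
    """Return the name of the leftmost alternative whose predicate holds."""
    for name, predicate in ALTERNATIVES:
        if predicate(sensor):
            return name
    return None
```

Because A is tested before AA and accepts exactly the same values, no sensor value can ever select AA in this model.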

Matching overlapping patterns ( # )

Suppose you would like to match the sequence ABA and the stream contains events ABABA. Normally the first instance (ABA)BA will be matched, but the second instance AB(ABA) will not be matched. If you would like both instances to be matched, include a ( # ) operator in the pattern wherever you would like the engine to restart its matching (for example, AB#ABA). If the pattern contains multiple # operators, the matching restarts from the earliest occurrence of the last successful match.

For example, consider this pattern:

PATTERN A (B # A | C D # A)

If the actual event sequence is ABACDABACDA, the following subsequences will be matched:

  • ABA

  • ACDA

  • ABA

  • ACDA
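The restart behavior can be approximated with regular expressions in Python. The sketch below is an analogy under assumptions, not the engine's implementation: events are encoded as one letter each, each # marker is modeled as an empty capture group "()", and after a match the search resumes at the earliest marker that took part in it (one reading of the rule above). The function name find_with_restart is hypothetical.

```python
import re


def find_with_restart(pattern, events):
    """Collect matches, resuming at the restart marker of each match.

    Restart markers are modeled as empty capture groups "()" in the regex.
    Without a marker, matching resumes after the end of the match.
    """
    regex = re.compile(pattern)
    found = []
    pos = 0
    while pos < len(events):
        m = regex.search(events, pos)
        if m is None:
            break
        found.append(m.group(0))
        # Positions of the markers that took part in this match.
        markers = [m.start(i) for i in range(1, regex.groups + 1)
                   if m.start(i) != -1]
        resume = min(markers) if markers else m.end()
        # Always move forward so the loop terminates.
        pos = max(resume, m.start() + 1)
    return found


without_marker = find_with_restart(r"ABA", "ABABA")
with_marker = find_with_restart(r"AB()A", "ABABA")
combined = find_with_restart(r"A(?:B()A|CD()A)", "ABACDABACDA")
```

Without a marker only the first ABA is found in ABABA; with the marker both overlapping instances are found. The combined pattern, which mirrors A (B # A | C D # A), yields ABA, ACDA, ABA, ACDA for the sequence ABACDABACDA, in line with the list above.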