MultiLogApp
This sample application shows how Striim could be used to monitor and correlate logs from web and application server logs from the same web application. The following is a relatively high-level explanation. For a more in-depth examination of a sample application with more detail about the components and how they interact, see PosApp.
MultiLogApp contains 12 flows that analyze the data from one or both logs and take appropriate actions:
MonitorLogs parses the log files to create two event streams (AccessStream for access log events and Log4JStream for application log events) used by the other flows. See the detailed discussion below.
ErrorsAndWarnings selects application log error and warning messages for use by the ErrorHandling and WarningHandling flows, and creates a sliding window containing the 300 most recent errors and warnings for use by the LargeRTCheck and ZeroContentCheck flows, which join it with web server data.
The following flows send alerts regarding the web server logs and populate the dashboard's Overview page world map and the Detail - UnusualActivity page:
HackerCheck sends an alert when an access log
srcIp
value is on a blacklist.LargeRTCheck sends an alert when an access log
responseTime
value exceeds 2000 microseconds.ProxyCheck sends an alert when an access log
srcIP
value is on a list of suspicious proxies.ZeroContentCheck sends an alert when an access log entry's
code
value is200
(that is, the HTTP request succeeded) but thesize
value is0
(the return had no content).
The following flows send alerts regarding the application server log and populate the dashboard's Overview page pie chart and API detail pages:
ErrorHandling sends an alert when an error message appears in the application server log.
WarningHandling sends an alert once an hour with the count of warnings for each API call for which there has been at least one alert.
InfoFlow joins application log events with user information from the MLogUserLookup cache to create the ApiEnrichedStream used by ApiFlow, CompanyApiFlow, and UserApiFlow.
ApiFlow populates the Detail - ApiActivity page.
CompanyApiFlow populates the Detail - CompanyApiActivity page and the bar chart on the Overview page. It also sends an alert when an API call is used by a company more than 1500 times during the flow's one-hour jumping window.
UserApiFlow populates the dashboard's Detail - UserApiActivity page and the US map on the Overview page. It also sends an alert when an API call is used by a user more than 125 times during the flow's one-hour window.
MonitorLogs: web server log data
The web server logs are in Apache NCSA extended/combined log format plus response time:
"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %D"
(See apache.org for more information.) Here are four sample log entries:
216.103.201.86 - EHernandez [10/Feb/2014:12:13:51.037 -0800] "GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 21560 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 1606 216.103.201.86 - EHernandez [10/Feb/2014:12:13:52.487 -0800] "GET http://cloud.saas.me/create?type=Partner&id=01e3928f-e05a-9be1-bdc5-14109fcf2383&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 63523 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 1113 216.103.201.86 - EHernandez [10/Feb/2014:12:13:52.543 -0800] "GET http://cloud.saas.me/query?type=ChatterMessage&id=01e3928f-e05a-9be2-bdc5-14109fcf2383&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 46556 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 1516 216.103.201.86 - EHernandez [10/Feb/2014:12:13:52.578 -0800] "GET http://cloud.saas.me/retrieve?type=ContractHistory&id=01e3928f-e05a-9be3-bdc5-14109fcf2383&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 44556 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 39

In MultiLogApp, these logs are read by AccessLogSource:
CREATE SOURCE AccessLogSource USING FileReader ( directory:'Samples/MultiLogApp/appData', wildcard:'access_log', blocksize: 10240, positionByEOF:false ) PARSE USING DSVParser ( columndelimiter:' ', ignoreemptycolumn:'Yes', quoteset:'[]~"', separator:'~' ) OUTPUT TO RawAccessStream;
The log format is space-delimited, so the columndelimiter value is one space. With these quoteset and separator values, both square brackets and double quotes are recognized as delimiting strings that may contain spaces. With these settings, the first log entry above is output as a WAEvent data
array with the following values:
"216.103.201.86", "-", "EHernandez", "10/Feb/2014:12:13:51.037 -0800", "GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1", "200", "21560", "-", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)", "1606"
This in turn is processed by the ParseAccessLog CQ:
CREATE CQ ParseAccessLog INSERT INTO AccessStream SELECT data[0], data[2], MATCH(data[4], ".*jsessionId=(.*) "), TO_DATE(data[3], "dd/MMM/yyyy:HH:mm:ss.SSS Z"), data[4], TO_INT(data[5]), TO_INT(data[6]), data[7], data[8], TO_INT(data[9]) FROM RawAccessStream;
After the AccessLogEntry type is applied, the event looks like this:
srcIp: "216.103.201.86" userId: "EHernandez" sessionId: "01e3928f-e059-6361-bdc5-14109fcf2383" accessTime: 1392063231037 request: "GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" code: 200 size: 21560 referrer: "-" userAgent: "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" responseTime: 1606
The web server log data is now in a format that Striim can analyze. AccessStream is used by the HackerCheck, LargeRTCheck, ProxyCheck, and ZeroContentCheck flows.
MonitorLogs: application server log data
The application server logs are in Apache's Log4J format. Log4J is a standard Java logging framework used by many web-based applications. In a real-world implementation, this application could be reading many log files on many hosts. Here is a sample message:
<log4j:event logger="SaasApplication" timestamp="1392067731765" level="ERROR" thread="main"> <log4j:message><![CDATA[Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]]]></log4j:message> <log4j:throwable><![CDATA[com.me.saas.SaasMultiApplication$SaasException: Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User] at com.me.saas.SaasMultiApplication.login(SaasMultiApplication.java:1253) at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.me.saas.SaasMultiApplication$UserApiCall.invoke(SaasMultiApplication.java:360) at com.me.saas.SaasMultiApplication$Session.login(SaasMultiApplication.java:1447) at com.me.saas.SaasMultiApplication.main(SaasMultiApplication.java:1587) ]]></log4j:throwable> <log4j:locationInfo class="com.me.saas.SaasMultiApplication" method="login" file="SaasMultiApplication.java" line="1256"/> </log4j:event>

Log4JSource retrieves data from …/Striim/Samples/MultiLOgApp/appData/log4jLog
. This file contains around 1.45 million errors, warnings, and informational messages. The XMLParser portion of Log4JSource specifies the portions of the message that will be used by this application:
CREATE SOURCE Log4JSource USING FileReader ( directory:'Samples/MultiLogApp/appData', wildcard:'log4jLog.xml', positionByEOF:false ) PARSE USING XMLParser( rootnode:'/log4j:event', columnlist:'log4j:event/@timestamp, log4j:event/@level, log4j:event/log4j:message, log4j:event/log4j:throwable, log4j:event/log4j:locationInfo/@class, log4j:event/log4j:locationInfo/@method, log4j:event/log4j:locationInfo/@file, log4j:event/log4j:locationInfo/@line' ) OUTPUT TO RawXMLStream;
For example, for the sample log message above, log4j:event/@level returns WARN and log4j:event/log4j:locationInfo/@line returns 1133. These elements are output as a WAEvent data
array with the following values:
"1392067731765", "ERROR", "Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]","com.me.saas.SaasMultiApplication$SaasException: Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]\n\tat com.me.saas.SaasMultiApplication.login(SaasMultiApplication.java:1253)\n\tat sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:606)\n\tat com.me.saas.SaasMultiApplication$UserApiCall.invoke(SaasMultiApplication.java:360)\n\tat com.me.saas.SaasMultiApplication$Session.login(SaasMultiApplication.java:1447)\n\tat com.me.saas.SaasMultiApplication.main(SaasMultiApplication.java:1587)", "com.me.saas.SaasMultiApplication", "login", "SaasMultiApplication.java", "1256"
This array in turn is processed by the ParseLog4J CQ:
CREATE CQ ParseLog4J INSERT INTO Log4JStream SELECT TO_DATE(TO_LONG(data[0])), data[1], data[2], MATCH(data[2], '\\\\[api=([a-zA-Z0-9]*)\\\\]'), MATCH(data[2], '\\\\[session=([a-zA-Z0-9\\-]*)\\\\]'), MATCH(data[2], '\\\\[user=([a-zA-Z0-9\\-]*)\\\\]'), MATCH(data[2], '\\\\[sobject=([a-zA-Z0-9]*)\\\\]'), data[3], data[4], data[5], data[6], data[7] FROM RawXMLStream;
The elements in the array are numbered from zero, so data[0] returns the timestamp, data[1] returns the level, and so on. The MATCH functions use regular expressions to return the api, session, user, and sobject portions of the message string. (See Using regular expressions (regex) for discussion of the multiple escapes for [ and ] in the regular expressions.) After processing by the CQ, the event looks like this:
logTime: 1392067731765 level: "ERROR" message: "Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]" api: "login" sessionId: "01e3928f-e975-ffd4-bdc5-14109fcf2383" userId: "HGonzalez" sobject: "User" xception: "com.me.saas.SaasMultiApplication$SaasException: Problem in API call [api=login] [session=01e3928f-e975-ffd4-bdc5-14109fcf2383] [user=HGonzalez] [sobject=User]\n\tat com.me.saas.SaasMultiApplication.login(SaasMultiApplication.java:1253)\n\tat sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:606)\n\tat com.me.saas.SaasMultiApplication$UserApiCall.invoke(SaasMultiApplication.java:360)\n\tat com.me.saas.SaasMultiApplication$Session.login(SaasMultiApplication.java:1447)\n\tat com.me.saas.SaasMultiApplication.main(SaasMultiApplication.java:1587)" className: "com.me.saas.SaasMultiApplication" method: "login" fileName: "SaasMultiApplication.java" lineNum: "1256"
The application server log data is now in a format that Striim can analyze. Log4JStream is used by the ErrorsAndWarnings and InfoFlow flows.
ErrorsAndWarnings

The CQ GetLog4JErrorWarning
filters Log4JStream, selecting only WARN and ERROR messages and discarding all others.
CREATE CQ GetLog4JErrorWarning INSERT INTO Log4ErrorWarningStream SELECT l FROM Log4JStream l WHERE l.level = 'ERROR' OR l.level = 'WARN';
Log4ErrorWarningStream is used by the ErrorHandling and WarningHandling flows and by the Log4JErrorWarningActivity sliding window, which contains the most recent 300 events:
CREATE WINDOW Log4JErrorWarningActivity OVER Log4ErrorWarningStream KEEP 300 ROWS;
This window is used by the LargeRTCheck and ZeroContentCheck flows.
HackerCheck

This flow sends an alert when an access log srcIp
value is on a blacklist. The BlackListLookup
cache contains the blacklist:
CREATE CACHE BlackListLookup using FileReader ( directory: 'Samples/MultiLogApp/appData', wildcard: 'multiLogBlackList.txt' ) PARSE USING DSVParser ( ) QUERY (keytomap:'ip') OF IPEntry;
The CQ FindHackers
selects access log events that match a blacklist entry:
CREATE CQ FindHackers INSERT INTO HackerStream SELECT ale FROM AccessStream ale, BlackListLookup bll WHERE ale.srcIp = bll.ip;
The CQ SendHackingAlerts
sends an alert for each such event:
CREATE CQ SendHackingAlerts INSERT INTO HackingAlertStream SELECT 'HackingAlert', ''+accessTime, 'warning', 'raise', 'Possible Hacking Attempt from ' + srcIp + ' in ' + IP_COUNTRY(srcIp) FROM HackerStream; CREATE SUBSCRIPTION HackingAlertSub USING WebAlertAdapter( ) INPUT FROM HackingAlertStream;
This flow also creates the UnusualActivity WActionStore that populates various charts and tables on the dashboard:
CREATE TYPE UnusualContext ( typeOfActivity String, accessTime DateTime, accessSessionId String, srcIp String KEY, userId String, country String, city String, lat double, lon double ); CREATE WACTIONSTORE UnusualActivity CONTEXT OF UnusualContext ...
The CQ GenerateHackerContext
populates UnusualActivity:
CREATE CQ GenerateHackerContext INSERT INTO UnusualActivity SELECT 'HackAttempt', accessTime, sessionId, srcIp, userId, IP_COUNTRY(srcIp), IP_CITY(srcIP), IP_LAT(srcIP), IP_LON(srcIP) FROM HackerStream LINK SOURCE EVENT;
HackAttempt
is a literal string that identifies the type of activity. That will distinguish events from this flow from those from the three other flows that populate UnusualActivity.
LargeRTCheck

LargeRTCheck sends an alert whenever an access log responseTime value exceeds 2000 microseconds.
CREATE CQ FindLargeRT INSERT INTO LargeRTStream SELECT ale FROM AccessStream ale WHERE ale.responseTime > 2000;
The alert code is similar to HackerCheck's.
The typeOfActivity string for events written to the UnusualActivity WActionStore is LargeResponseTime
.
ProxyCheck

ProxyCheck sends an alert when an access log srcIP
value is on a list of suspicious proxies. This works exactly like HackerCheck but with a different list. The typeOfActivity string for events written to the UnusualActivity WActionStore is ProxyAccess
.
ZeroContentCheck

ZeroContentCheck sends an alert when an access log entry's code
value is 200
(that is, the HTTP request succeeded) but the size
value is 0
(the return had no content).
CREATE CQ FindZeroContent INSERT INTO ZeroContentStream SELECT ale FROM AccessStream ale WHERE ale.code = 200 AND ale.size = 0;
The alert code is similar to HackerCheck's.
The typeOfActivity string for events written to the UnusualActivity WActionStore is ZeroContent
).
ErrorHandling

This flow sends an alert immediately when an error appears in Log4ErrorWarningStream
.
CREATE CQ GetErrors INSERT INTO ErrorStream SELECT log4j FROM Log4ErrorWarningStream log4j WHERE log4j.level = 'ERROR'; CREATE CQ SendErrorAlerts INSERT INTO ErrorAlertStream SELECT 'ErrorAlert', ''+logTime, 'error', 'raise', 'Error in log ' + message FROM ErrorStream;
CQ GetErrors
discards all WARN messages and passes only ERROR messages. In CQ SendErrorAlerts
, since the key value is logTime
(which is different for every event) and the flag is raise
(see Sending alerts from applications), an alert will be sent for every message in ErrorStream.
WarningHandling

This flow sends an alert once an hour with the count of warnings for each API call for which there has been at least one warning. The following code creates a one-hour jumping window of application log warning messages:
CREATE CQ GetWarnings INSERT INTO WarningStream SELECT log4j FROM Log4ErrorWarningStream log4j WHERE log4j.level = 'WARN'; CREATE JUMPING WINDOW WarningWindow OVER WarningStream KEEP WITHIN 60 MINUTE ON logTime;
The HAVING
clause in the CQ SendWarningAlerts
filters out API calls that have had no warnings.
CREATE CQ SendWarningAlerts INSERT INTO WarningAlertStream SELECT 'WarningAlert', ''+logTime, 'warning', 'raise', COUNT(logTime) + ' Warnings in log for api ' + api FROM WarningWindow GROUP BY api HAVING count(logTime) > 1;
InfoFlow, APIFlow, CompanyApiFlow, and UserApiFlow
InfoFlow joins application log INFO events with user information from the MLogUserLookup cache:
CREATE CQ GetInfo INSERT INTO InfoStream SELECT log4j FROM Log4JStream log4j WHERE log4j.level = 'INFO'; CREATE CQ GetUserDetails INSERT INTO ApiEnrichedStream SELECT a.userId, a.api, a.sobject, a.logTime, u.userName, u.company, u.userZip, u.companyZip FROM InfoStream a, MLogUserLookup u WHERE a.userId = u.userId;
Otherwise this portion of the application is generally similar to PosApp. APIFlow, CompanyApiFlow, and UserAPIFlow populate dashboard charts and send alerts as described in the summary above.