Striim 3.9.7 documentation

Monitoring using the system health REST API

This REST API endpoint allows you to retrieve various statistics about a Striim cluster. The basic URI syntax (see Getting a REST API authentication token) is:

  http://<IP address>:<port>/health?token=<token>

For example:

  http://localhost:9080/health?token=01e56161-9e42-3811-8157-685b3587069e
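As a minimal sketch, the URI above could be built and requested from Python using only the standard library. The function names health_url and fetch_health are hypothetical helpers introduced for illustration, not part of Striim, and fetch_health assumes the server is reachable and returns a JSON body.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def health_url(host, port, token):
    """Build http://<host>:<port>/health?token=<token> (hypothetical helper)."""
    return f"http://{host}:{port}/health?{urlencode({'token': token})}"


def fetch_health(host, port, token, timeout=10):
    """GET the health endpoint and decode the JSON response.

    Assumes a reachable Striim server; raises urllib.error.URLError otherwise.
    """
    with urlopen(health_url(host, port, token), timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network access.
print(health_url("localhost", 9080, "01e56161-9e42-3811-8157-685b3587069e"))
```

Calling fetch_health("localhost", 9080, token) against a running server should return the parsed document as a Python dictionary.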

If you pretty-print the response, it will look something like this:

{
    "healthRecords": [
        {
            "kafkaHealthMap": {},
            "waStoreHealthMap": {
                "Samples.UnusualActivity": {
                    "fqWAStoreName": "Samples.UnusualActivity",
                    "writeRate": 0,
                    "lastWriteTime": 1508429845353
                } ...
            },
            "cacheHealthMap": {
                "Samples.MLogZipLookup": {
                    "size": 87130,
                    "lastRefresh": 1508429842638,
                    "fqCacheName": "Samples.MLogZipLookup"
                } ...
            },
            "clusterSize": 1,
            "appHealthMap": {
                "ns3.ProxyCheck": {
                    "lastModifiedTime": 1508371187853,
                    "fqAppName": "ns3.ProxyCheck",
                    "status": "CREATED"
                } ...
            },
            "serverHealthMap": {
                "Global.S192_168_1_14": {
                    "memory": 3693349800,
                    "cpu": "10.6%",
                    "elasticsearchFree": "56GB",
                    "fqServerName": "Global.S192_168_1_14",
                    "diskFree": "/: 56GB"
                }
            },
            "sourceHealthMap": {
                "Samples.AccessLogSource": {
                    "eventRate": 0,
                    "lastEventTime": 1508429845353,
                    "fqSourceName": "Samples.AccessLogSource"
                } ...
            },
            "elasticSearch": true,
            "targetHealthMap": {
                "Samples.CompanyAlertSub": {
                    "eventRate": 0,
                    "fqTargetName": "Samples.CompanyAlertSub",
                    "lastWriteTime": 1508429850358
                } ...
            },
            "stateChangeList": [
                {
                    "currentStatus": "CREATED",
                    "type": "APPLICATION",
                    "fqName": "Samples.MultiLogApp",
                    "previousStatus": "UNKNOWN",
                    "timestamp": 1508429825651
                } ...
            ],
            "issuesList": [],
            "startTime": 1508429825345,
            "id": "01e7b4e8-f298-76e2-ade3-685b3587069e",
            "endTime": 1508429855366,
            "derbyAlive": true,
            "agentCount": 0
        }
    ],
    "next": "/healthRecords?size=1&from=1",
    "prev": "/healthRecords?size=1&from=0"
}
  • Times are in milliseconds.

  • In cacheHealthMap, size is the amount of memory used, in bytes.

  • In serverHealthMap, cpu is the percentage of CPU used by the Java virtual machine at the time the server health was recorded, and memory is the amount of free memory available to the server, in bytes.

  • issuesList contains any log entries of level ERROR.
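The units described above can be made easier to read with a few conversions. The following is a sketch, assuming the millisecond timestamps are counted from the Unix epoch (the sample values are consistent with that, but the text above does not state it). The helpers ms_to_utc and summarize_server are hypothetical names, and the record is a trimmed copy of the sample output.

```python
from datetime import datetime, timezone


def ms_to_utc(ms):
    """Convert a millisecond timestamp to an aware UTC datetime.

    Assumption: times are milliseconds since the Unix epoch.
    """
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def summarize_server(server):
    """Convert one serverHealthMap entry to friendlier units."""
    return {
        "name": server["fqServerName"],
        # memory is free memory in bytes; report it in GiB.
        "free_memory_gib": round(server["memory"] / 1024**3, 2),
        # cpu arrives as a string such as "10.6%".
        "cpu_percent": float(server["cpu"].rstrip("%")),
    }


# Trimmed copy of the sample health record shown earlier.
record = {
    "serverHealthMap": {
        "Global.S192_168_1_14": {
            "memory": 3693349800,
            "cpu": "10.6%",
            "fqServerName": "Global.S192_168_1_14",
        }
    },
    "issuesList": [],
    "startTime": 1508429825345,
    "endTime": 1508429855366,
}

for server in record["serverHealthMap"].values():
    print(summarize_server(server))
print("window start (UTC):", ms_to_utc(record["startTime"]))
print("window length (ms):", record["endTime"] - record["startTime"])
print("errors logged:", bool(record["issuesList"]))
```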

You can use the start and end parameters to return records from a specific time range, for example:

  http://<IP address>:<port>/health?start=<start time in milliseconds>&end=<end time in milliseconds>&token=<token>
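A time-range request could be assembled as in the sketch below. It assumes that start and end are milliseconds since the Unix epoch, which the text above does not state explicitly. The helper names to_ms and health_range_url are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ms(dt):
    """Whole milliseconds since the Unix epoch for a timezone-aware datetime."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def health_range_url(host, port, token, start, end):
    """Build a /health URL restricted to the window [start, end]."""
    query = urlencode({"start": to_ms(start), "end": to_ms(end), "token": token})
    return f"http://{host}:{port}/health?{query}"


start = datetime(2017, 10, 19, 16, 17, 5, tzinfo=timezone.utc)
end = start + timedelta(seconds=30)
print(health_range_url("localhost", 9080, "<token>", start, end))
```

Note that urlencode percent-encodes the angle brackets of the placeholder token; a real token made of hexadecimal digits and hyphens passes through unchanged.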

You can use the id value from the summary to return a subset of the data using the following syntax:

http://<IP address>:<port>/health/<id>/{agents|apps|caches|clustersize|derby|es|issues|servers|sources|statechanges|targets|wastores}?token=<token>

For example, curl -X GET http://<IP address>:<port>/health/<id>/apps?token=<token> will return only the appHealthMap portion of the data.
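The subset syntax can be wrapped in a small helper that rejects section names not listed above. This is a sketch only: health_subset_url and SECTIONS are hypothetical names, and the set of sections is copied from the syntax line rather than queried from the server.

```python
from urllib.parse import quote, urlencode

# Section names copied from the subset URI syntax.
SECTIONS = {
    "agents", "apps", "caches", "clustersize", "derby", "es",
    "issues", "servers", "sources", "statechanges", "targets", "wastores",
}


def health_subset_url(host, port, record_id, section, token):
    """Build http://<host>:<port>/health/<id>/<section>?token=<token>."""
    if section not in SECTIONS:
        raise ValueError(f"unknown health section: {section!r}")
    path = f"/health/{quote(record_id, safe='')}/{section}"
    return f"http://{host}:{port}{path}?{urlencode({'token': token})}"


# The id comes from a health record such as the sample shown earlier.
print(health_subset_url(
    "localhost", 9080,
    "01e7b4e8-f298-76e2-ade3-685b3587069e",
    "apps",
    "01e56161-9e42-3811-8157-685b3587069e",
))
```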