
Running Conversion and Platform Health Checks

This document explains how to run health checks on the conversion service and ADx. You can use any of the methods below, whichever you find most convenient.

For checks on legacy features, see Running Deep Health Checks on Legacy Endpoints.

From User Interface

From Landing Page

The quickest way to get a health check from the interface is to use the landing page.

  1. Log in to ADx. The landing page opens.
  2. Under Operations/Health, select the appropriate health check. After a while, the health check report is generated and displayed. All items in the report should have the OK status. If this is not the case, contact your support person.

From Conversion Access

  1. Log in to ADx. The landing page opens.

  2. Under Operations, select Conversion/Control Center. Conversion Access opens.

  3. Find the Health section in the left-hand menu. It provides you with the following options:

    Health check | Description
    Platform | Runs the health check on the conversion service platform (Tribefire). When ready, the report is printed out in the interface.
    Conversion | Runs the health check on the conversion service itself. When ready, the report is printed out in the interface.
    Conversion Health Check | Runs the health check on available converters. When ready, the report is printed out in the interface as a list, in the standard Tribefire format.
    Conversion Health Check REST | Runs the health check on available converters. When ready, the report is printed out in the browser as a REST response, in JSON.

From Terminal

As an alternative to the above methods, you can run health checks from terminal:

Basic Health Check

To run the basic health check:

  1. In the folder where you unzipped the deployment package, open the terminal and run the ./check-health.sh --url http://localhost:8080 command.

If you are using a different server, port, or transfer protocol (http/https), you must change the command accordingly. You can reference the environment variable script when running the health check by adding --environment ../environment.sh to the command (adapt the path if you saved your environment script elsewhere).

After the health check has finished (it may take a few minutes), there are two possible outcomes:

  • Successfully completed health check. Deployed services are available.
  • Health check failed! For more information see health-check-result.json and/or run health check directly via http://localhost:8080/tribefire-services/healthz
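
If you want to trigger the basic health check from a script, the following is a minimal Python sketch. It only uses the --url and --environment options and the success message quoted above. It assumes that check-health.sh prints the outcome message to standard output or standard error; the script's exit code is not documented here, so the sketch does not rely on it. The function names are illustrative only.

```python
import subprocess

# Beginning of the success message quoted in this document.
SUCCESS_MESSAGE = "Successfully completed health check"


def build_command(url, environment=None):
    """Build the check-health.sh command line as an argument list."""
    command = ["./check-health.sh", "--url", url]
    if environment:
        # Optional path to the environment variable script.
        command += ["--environment", environment]
    return command


def check_succeeded(output):
    """Return True if the script output contains the success message."""
    return SUCCESS_MESSAGE in output


def run_basic_health_check(url, environment=None, cwd="."):
    """Run the script in the deployment package folder (cwd).

    Assumption: the outcome message is printed to stdout or stderr.
    """
    completed = subprocess.run(
        build_command(url, environment),
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    return check_succeeded(completed.stdout + completed.stderr)
```

For example, run_basic_health_check("http://localhost:8080", "../environment.sh") would run the same command as described above with the environment script attached.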

Health Check with JSON/HTML Response

This check returns a full response, either in JSON, HTML or plain text.

In both local and distributed setups, the status of an instance must be known in order to decide whether the instance is still alive or must be deactivated or replaced. The check API provides a standardized way to define and implement checks that report the status of a tribefire instance. Such checks may include:

  • Availability of database connections
  • Reachability of extensions
  • Thread Pools
  • Status of deployables, especially accesses

The platform provides the /healthz endpoint, which is responsible for executing the checks. The full URL is https://[host]:[port]/tribefire-services/healthz. By default, the endpoint returns an HTML report, which is human-readable when rendered in a browser.

You can change the output to either JSON or plain text by passing an HTTP Accept header, as shown in the sample call below.

Sample Call using cURL

curl --request GET https://hostname:port/tribefire-services/healthz --header "Accept: application/json"

You might want to use --insecure when using HTTPS with self-signed certificates. Keep in mind that this option disables certificate validation.

HTTP Accept Header | Description
application/json | Prints out the report in JSON
text/html | Prints out the report in HTML
text/plain | Prints out the report in plain text
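
The same call can be made from a script. The following Python sketch uses only the standard library and the three Accept header values listed above. The function names and the short format keys ("json", "html", "plain") are illustrative assumptions. Note that urllib raises an HTTPError for non-2xx responses such as 503, so the sketch catches it and still returns the status code together with whatever body was sent.

```python
import urllib.error
import urllib.request

# Accept header values from the table above, keyed by a short format name.
ACCEPT_BY_FORMAT = {
    "json": "application/json",
    "html": "text/html",
    "plain": "text/plain",
}


def build_healthz_request(base_url, fmt="json"):
    """Create a GET request for the /healthz endpoint with the Accept header set."""
    url = base_url.rstrip("/") + "/tribefire-services/healthz"
    return urllib.request.Request(
        url, headers={"Accept": ACCEPT_BY_FORMAT[fmt]}, method="GET"
    )


def fetch_report(base_url, fmt="json", timeout=60):
    """Return (http_status, body_text) for the health check report."""
    request = build_healthz_request(base_url, fmt)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # Non-2xx status (for example 503): return the code and the body as sent.
        return error.code, error.read().decode("utf-8")
```

For example, fetch_report("https://hostname:8443", "json") would correspond to the cURL call above, with hostname and 8443 replaced by your own host and port.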

JSON Response

If the Accept header is set to application/json, the endpoint responds with JSON structured as check results per node.

Sample output:

{"_type": "flatmap", "value":[
 {"_type": "com.braintribe.model.service.api.InstanceId", "_id": "0",
   "applicationId": "master",
   "nodeId": "tf@NB-VIE01-CWI02#200128101854615f41f7f263064628ab"
  },[
   {"_type": "com.braintribe.model.check.service.CheckResult", "_id": "1",
    "entries": [
     {"_type": "com.braintribe.model.check.service.CheckResultEntry", "_id": "2",
      "checkStatus": "ok",
      "details": "Check infrastructure is ok",
      "name": "Base Check"
     },
     {"_type": "com.braintribe.model.check.service.CheckResultEntry", "_id": "3",
      "checkStatus": "ok",
      "details": "Active Threads: 0, Total Executions: 1, Average Execution Time: 182 ms, Pool Size: 0, Core Pool Size: 5",
      "name": "Thread Pool: Activation"
     },
     ...
     ...
    ]
   }
  ]
 ]
}

The map's key is of type InstanceId and identifies the node (applicationId@nodeId) that executed the checks.
The map's value is a list of CheckResults. Each CheckResult contains a list of CheckResultEntry items.

A CheckResultEntry is qualified by:

Value | Description
status | Set to one of the following values: ok, warn, or fail. It appears as checkStatus in the sample output above.
name | The name of the executed check, e.g. "DB Connectivity Check"
message | The summarized check result message
details | Contains check result details, such as an exception stack trace or further information about the check result
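
To illustrate how this structure could be processed, here is a minimal Python sketch that walks the JSON response and lists every entry that is not ok. It assumes that the "value" array of the flatmap alternates keys and values (an InstanceId followed by its list of CheckResults), as the sample output suggests, and that the status is stored under checkStatus as in that sample. The sample data below is shortened; the node ID and the second entry are hypothetical and only serve the illustration.

```python
import json

# Shortened sample modeled on the output above; the warn entry is hypothetical.
SAMPLE = json.loads("""
{"_type": "flatmap", "value": [
  {"_type": "com.braintribe.model.service.api.InstanceId", "_id": "0",
   "applicationId": "master", "nodeId": "node-1"},
  [
    {"_type": "com.braintribe.model.check.service.CheckResult", "_id": "1",
     "entries": [
       {"_type": "com.braintribe.model.check.service.CheckResultEntry", "_id": "2",
        "checkStatus": "ok", "details": "Check infrastructure is ok",
        "name": "Base Check"},
       {"_type": "com.braintribe.model.check.service.CheckResultEntry", "_id": "3",
        "checkStatus": "warn", "details": "hypothetical warning",
        "name": "Hypothetical Check"}
     ]}
  ]
]}
""")


def iter_entries(report):
    """Yield (node, name, status, details) for every CheckResultEntry."""
    flat = report.get("value", [])
    # Assumption: keys and values alternate: [key0, value0, key1, value1, ...]
    for instance, results in zip(flat[0::2], flat[1::2]):
        node = f'{instance.get("applicationId")}@{instance.get("nodeId")}'
        for result in results:
            for entry in result.get("entries", []):
                yield (node, entry.get("name"),
                       entry.get("checkStatus"), entry.get("details"))


def failing_entries(report):
    """Return all entries whose status is not ok (that is, warn or fail)."""
    return [entry for entry in iter_entries(report) if entry[2] != "ok"]


for node, name, status, details in failing_entries(SAMPLE):
    print(f"{node}: {name} -> {status} ({details})")
```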

HTTP Status Codes

If you're using a monitoring system, you might be interested in the different HTTP status codes returned.

A 200 OK is returned when all checks have passed. By default, a 503 Service Unavailable is returned when the check result entries contain at least one warn or fail. For example, if you have 4 checks:

  • [ok, ok, ok, ok] the status is 200 OK
  • [ok, ok, ok, warn] the status is 503 Service Unavailable
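
The default rule can be written down as a small Python sketch. It only restates the behavior described above and is not the platform's implementation; the function name is illustrative.

```python
def default_http_status(check_statuses):
    """Return the default HTTP status for a list of check statuses.

    200 if every status is "ok", otherwise 503 (at least one warn or fail).
    """
    if all(status == "ok" for status in check_statuses):
        return 200
    return 503


print(default_http_status(["ok", "ok", "ok", "ok"]))    # 200
print(default_http_status(["ok", "ok", "ok", "warn"]))  # 503
```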

Custom HTTP Status Codes

You can define a custom status code to be returned when a check results in a warn. To do so, add the warnStatusCode parameter to the URL when calling the endpoint, for example warnStatusCode=123. Sample call:

https://hostname:port/tribefire-services/healthz?warnStatusCode=123

If the parameter is set, the defined status code is returned in case a CheckResult results in a warn. If this parameter is not defined, the default HTTP status code 503 is used.
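
A monitoring script could build the URL and interpret the returned status code roughly as in the following Python sketch. The function names and the returned labels are illustrative assumptions. The sketch also assumes that, once a custom warn code is set, a 503 indicates a failed check; how the endpoint responds when warn and fail occur together is not described here, so treat that case with care.

```python
from urllib.parse import urlencode


def build_healthz_url(base_url, warn_status_code=None):
    """Build the /healthz URL, optionally with the warnStatusCode parameter."""
    url = base_url.rstrip("/") + "/tribefire-services/healthz"
    if warn_status_code is not None:
        url += "?" + urlencode({"warnStatusCode": warn_status_code})
    return url


def interpret_status(http_status, warn_status_code=None):
    """Map an HTTP status code to a coarse health label."""
    # A custom code only helps if it differs from the default 503.
    custom = warn_status_code not in (None, 503)
    if http_status == 200:
        return "ok"
    if custom and http_status == warn_status_code:
        return "warn"
    if http_status == 503:
        # Without a custom warn code, 503 covers both warn and fail.
        return "fail" if custom else "warn or fail"
    return "unexpected"


print(build_healthz_url("https://hostname:8443", 123))
```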