Monitor health of non-Prod services with Jenkins & Slack

The product I currently work on depends on 3 different vendor APIs and, as you would expect, their non-Prod environments go down fairly regularly.

I found it very frustrating that I never knew when their services came back online. So I decided to build a healthcheck endpoint which integrated with Jenkins and Slack and notified me, via Slackbot, once their services came back up. This meant I could switch back and start testing tickets again ASAP. When you have tight deadlines, and services regularly go down, every minute counts!

Overview

The following tutorial will ping our health check endpoint every 15 minutes and send a message, via Slackbot, only if the endpoint’s status has changed, i.e. the service went from down to up or from up to down. There are 2 prerequisites for getting this solution to work for you.

  1. You will need permissions to create and configure Jenkins jobs
  2. Your instance of Jenkins will need the Slack plugin
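
In this tutorial the Slack plugin takes care of the "only notify when the status changes" part, so you don't need to write it yourself. Purely to illustrate the idea, here is a minimal sketch that remembers the last status in a file and only reports when it differs. The function name notify_if_changed and the STATE_FILE location are hypothetical and not part of the Jenkins job.

```shell
#!/bin/bash
# Illustration only: report a status only when it differs from the last one seen.
# STATE_FILE is a hypothetical location for remembering the previous status.
STATE_FILE="${STATE_FILE:-/tmp/vendor_1.status}"

# Prints a message and returns 0 if the status changed; returns 1 otherwise.
notify_if_changed() {
  local current="$1"
  local previous=""
  # Read the previous status, if one has been recorded
  if [[ -f "$STATE_FILE" ]]; then
    previous=$(<"$STATE_FILE")
  fi
  # Remember the current status for the next run
  printf '%s\n' "$current" > "$STATE_FILE"
  if [[ "$current" != "$previous" ]]; then
    echo "Status changed: ${previous:-unknown} -> ${current}"
    return 0
  fi
  return 1
}
```

Calling notify_if_changed "UP" twice in a row would only print a message the first time, which is the behaviour we want from the notifications.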

Choose endpoint to monitor

First off, find an endpoint that you want to monitor. This tutorial works with JSON responses but you can easily tweak it for XML data.
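
For an XML endpoint, the only part that should need to change is how the status value is pulled out of the response. Here is a rough sketch, assuming a hypothetical flat response such as <health><status>UP</status></health>. The xml_status function name is made up for illustration, and the sed expression is a shortcut that only suits simple responses like this one. For anything more complex, a real XML parser would be the safer choice.

```shell
#!/bin/bash
# Hypothetical XML response: <health><status>UP</status></health>
# Prints the text inside the first <status> element read from stdin.
xml_status() {
  sed -n 's:.*<status>\([^<]*\)</status>.*:\1:p' | head -n 1
}

# Usage sketch (the URL is the sample endpoint used in this tutorial):
# HEALTH=$(curl -s "https://my.sample.api.com/health/vendor_1" | xml_status)
```

The rest of the job stays the same, because it only cares about the extracted value.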

Creating a Jenkins Job

There are 3 stages within the Jenkins job that need to be configured.

  1. Build Triggers – I used a cron job to kick off this health check every 15 minutes (I would recommend using Crontab.guru to make your cron expressions easier to read). Click the “Build periodically” checkbox and paste the below cron details into the “Schedule” text field.
*/15 08-18 * * 1-5

FYI, the above job will run “at every 15th minute past every hour from 8 through 18 on every day-of-week from Monday through Friday”, so the last run of the day is at 18:45.

2. Build Stage – you will want to create a simple “Execute shell” build step for the “Build Stage”.

Our sample healthcheck endpoint returns the following JSON response…

{
    "status": "UP"
}

… so our sample Shell script now looks like…

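#!/bin/bash
# Call the healthcheck endpoint and extract the "status" value
HEALTH=$(curl -s "https://my.sample.api.com/health/vendor_1" | jq -r '.status')

# Pass the Jenkins job if the service is up, fail it otherwise
if [[ $HEALTH == "UP" ]]; then
  exit 0
else
  exit 1
fi

For anyone else who isn’t that familiar with Unix syntax, I’ve tried to explain each step in a bit more detail.

| Shell script component | Description | Resource |
| --- | --- | --- |
| #!/bin/bash | Run the script with Bash, which the [[ ]] test needs | |
| curl -s "https://my.sample.api.com/health/vendor_1" | Make a network call to the healthcheck endpoint (-s hides the progress output) | cURL tutorial |
| jq -r '.status' | Read the JSON response passed in through the pipe and return the value of “status”, i.e. return the word UP (-r drops the surrounding quotes) | jq tutorial, jq cheatsheet |
| HEALTH=$( ….. ) | Set the return value of everything inside ( …. ) to a variable called “HEALTH” | |
| if [[ $HEALTH == "UP" ]]; then | Check if HEALTH equals the string “UP” | Unix “if statement” tutorial |
| exit 0 | “0” makes the Jenkins job pass | |
| exit 1 | “1” (or any other non-zero value) makes the Jenkins job fail | |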
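
If you want the check to be a bit more defensive, the steps in the table above can be wrapped into small functions. This is only a sketch under my own assumptions: the function names (fetch_status, status_is_up, check_health) and the 10-second timeout are illustrative choices, not something Jenkins or the vendor API requires.

```shell
#!/bin/bash
# Sketch of a slightly more defensive healthcheck. The function names and the
# 10-second timeout are illustrative choices, not part of the original job.

# Fetch the endpoint and print the value of "status" (empty if the call fails).
fetch_status() {
  local url="$1"
  # --fail: treat HTTP errors as failures; --max-time: give up after 10 seconds
  curl --silent --fail --max-time 10 "$url" | jq -r '.status'
}

# Return 0 only when the reported status is exactly "UP".
status_is_up() {
  [[ "$1" == "UP" ]]
}

# Combine the two steps; the return code decides whether the Jenkins job passes.
check_health() {
  local status
  status=$(fetch_status "$1") || status=""
  if status_is_up "$status"; then
    echo "Service is UP"
    return 0
  fi
  echo "Service is DOWN (status: ${status:-no response})"
  return 1
}
```

In the “Execute shell” step you would then finish the script with a single call such as check_health "https://my.sample.api.com/health/vendor_1", so that its return code becomes the result of the job. The echoed line also shows up in the Jenkins console output, which makes failed builds easier to read.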

3. Post-build Actions – as I mentioned previously, you will need the Slack plugin for Jenkins. Click the “Add Post-Build Action” button and select “Slack notifications” from the dropdown, then configure it as described in the next section.

Slack integration

The final piece of the puzzle is to connect Jenkins with your Slack workspace. The next few steps are a bit painful but bear with me!

Firstly, in the “Slack notifications” section of the “Post-build Actions” stage, enter the workspace as well as the channel and/or memberId.

Next, go to your browser and log in to your Slack workspace. Go to “https://your_workspace_name.slack.com/apps/manage” (swap the workspace name for yours), search for the Jenkins CI app and click on it. Once that page loads, click the green “Add to Slack” button and follow the prompts.

Finally, return to Jenkins, navigate to the “Slack notifications” section again and click the “Test Connection” button near the bottom of the page. You should now see a message from Jenkins appear in your nominated Slack channel or in your personal Slackbot.

Slackbot setup

If you would prefer to receive messages directly in your Slackbot, add your memberId to the “Channel / memberId” text field within the “Slack notifications” section. To find your memberId, go to “Profile & account” -> click the vertical three-dot button -> your memberId should now be displayed.

Unexpected benefits

This simple Slack/Jenkins solution brought huge benefits to everyone on our team, not just the devs and testers. We were able to build reports (in Splunk) off the back of these Jenkins build results, so our delivery team could confidently communicate outages and downtime to senior management and our vendors.

The final result

You will end up with a Slack message whenever the service goes down or comes back up. The Slack app for Jenkins also gives you a nice summary of how long the outage lasted.
