Shell scripting with AWS Lambda: The Function

In the previous article, I detailed how AWS Lambda can be used to act as a scripting control tool for other AWS services. The fact that it is focused on running individual functions, contains the AWS SDK by default, and only acrues costs when running create a perfect situation for the use of administrative scripting. In this article, I detail the use of Lambda functions to perform the cleaning itself.

Lambda functions are single JavaScript Node.js functions that are called by the Lambda engine. They take two parameters that provide information about the event that triggered the function call and the context the function is running under. It is important that these functions run as stateless services that do not depend on the underlying compute infrastructure. In addition, it is helpful to keep the function free of excessive setup code and dependencies to minimize the overhead of running functions.

The Code

Imports

var aws = require('aws-sdk');
var async = require('async');
var moment = require('moment');
 
var ec2 = new aws.EC2({apiVersion: '2014-10-01'});

The AWS SDK is available to all Lambda functions and we import and configure it for use with EC2 in this example. You can also include any Javascript library that you would with Node. I have included both the Async module and the Moment.js library for time.

Core Logic

var defaultTimeToLive = moment.duration(4, 'hours');

function shouldStop(instance) {
    var timeToLive = moment.duration(defaultTimeToLive.asMilliseconds());
    instance.Tags.forEach(function (tag) {
      if (tag.Key == 'permanent') {
        return false;
      } else if (tag.Key == "ttl-hours") {
        timeToLive = moment.duration(tag.Value, 'hours');
      }
    });

    var upTime = new Date().getTime() - instance.LaunchTime.getTime();

    if (upTime < timeToLive.asMilliseconds()) {
      timeToLive.subtract(upTime);
      console.log("Instance (" + instance.InstanceId + ") has " + timeToLive.humanize() + " remaining.");
      return false;
    }
  return true;
}

I use the AWS tagging mechanism to drive the decision if an EC2 instance should be stopped. If the instance is tagged as 'permanent' or with sepecific 'ttl-hours' tag, then the function knows that it should be kept alive and for how long. If a tag wasn't added, we want to terminate those instance after a default time period. It might be helpful to have this externalized to an AWS configuration store such as SimpleDB, but I leave that as an exercise for the reader. Finally, it is helpful to log the amount of time the instances have left on their TTL.

Searching the instances

async.waterfall([
    function fetchEC2Instances(next) {
      var ec2Params = {
        Filters: [
          {Name: 'instance-state-name', Values: ['running']}
        ]
      };

      ec2.describeInstances(ec2Params, function (err, data) {
        next(null, err, data)
      });
    },
    function filterInstances(err, data, next) {
      var stopList = [];

      data.Reservations.forEach(function (res) {
        res.Instances.forEach(function (instance) {
          if (shouldStop(instance)) {
            stopList.push(instance.InstanceId);
          }
        });
      });
      next(null, stopList);
    },
    function stopInstances(stopList, next) {
      if (stopList.length > 0) {
        ec2.stopInstances({InstanceIds: stopList}, function (err, data) {
          if (err) {
            nex(err);
          }
          else {
            console.log(data);
            next(null);
          }
        });
      }
      else {
        console.log("No instances need to be stopped");
        next(null);
      }
    }
  ],
  function (err) {
    if (err) {
      console.error('Failed to clean EC2 instances: ', err);
    } else {
      console.log('Successfully cleaned all unused EC2 instances.');
    }
    context.done(err);
  });

This should look familiar to everyone that has done Javascript AWS SDK work.  We use the Async library to query for running instances.  We then run the returned instance data through our helper method as a filter.  Finally, we take all of the identified instances and stop them. 

This code works well for a moderate number of running instances.  If you need to handle thousands of instances in your organization, you will need to adjust the fetch and stop processes to handle AWS SDK paging.  

You can find this code in our Github repository here: https://github.com/dev9com/lambda-cleanup/blob/master/lambda-node/CostWatch.js.

Next Steps

The final piece of the puzzle for our Lambda scripting is deployment and scheduling.  In my final article on this, I will cover both how to deploy a Lambda function and the current, kludgy, method for job scheduling using EC2 autoscaling.