How to continuously assess the security of your AMIs

Jawad Seddar 23rd February 2018
continuous monitoring

Jawad Seddar, Cloud Systems Developer at Cloudreach, gives insight on a continuous AMI assessment process using Amazon Inspector, Lambda and CloudWatch.

 

Vulnerability assessments

 

Whether you deploy resources in the cloud or in your own data centers, you should make sure they are not affected by known vulnerabilities, as these can easily be leveraged to gain access to your data or that of your users.

Vulnerability assessments therefore need to run on a regular basis and cover as many of the deployed systems as possible.

 

Amazon Inspector

 

Amazon Inspector is an AWS managed service that enables you to run vulnerability assessments on your instances with little to no configuration. It supports the most common flavors of Linux as well as Windows Server, and is built around rules packages, which define what types of vulnerabilities to check for (the most common one being the CVE rules package).

To use Amazon Inspector, you first define an assessment target: a set of tags that determines which resources to include in the assessment. You then define an assessment template, which points to an assessment target and to one or more rules packages.

Once these are defined, you simply need to have resources with the appropriate tag running and start an assessment run. This will trigger the agent running on those instances to send telemetry back to Inspector based on the rules packages defined in the template.

At the end of the assessment, Inspector will analyze the telemetry and generate findings. The findings can then be reviewed and/or downloaded as part of an assessment report.
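
The steps above can be sketched with the boto3 `inspector` client (the Inspector Classic API). This is a minimal illustration rather than the setup used in the project: the function name, the resource names and the one-hour duration are placeholders, and it simply selects every rules package available in the region instead of a curated list.

```python
def create_assessment(inspector, tag_key, tag_value, name, duration_seconds=3600):
    """Create an assessment target and template, then start a run.

    `inspector` is expected to be boto3.client("inspector"); it is passed in
    so the function can also be exercised with a stub.
    """
    # 1. The resource group holds the tags that select the instances.
    group_arn = inspector.create_resource_group(
        resourceGroupTags=[{"key": tag_key, "value": tag_value}]
    )["resourceGroupArn"]

    # 2. The assessment target points at that resource group.
    target_arn = inspector.create_assessment_target(
        assessmentTargetName=name + "-target",
        resourceGroupArn=group_arn,
    )["assessmentTargetArn"]

    # 3. The template ties the target to rules packages and a duration.
    #    Assumption: use every rules package available in the region.
    rules_arns = inspector.list_rules_packages()["rulesPackageArns"]
    template_arn = inspector.create_assessment_template(
        assessmentTargetArn=target_arn,
        assessmentTemplateName=name + "-template",
        durationInSeconds=duration_seconds,
        rulesPackageArns=rules_arns,
    )["assessmentTemplateArn"]

    # 4. Start a run for the instances matching the target's tags.
    run_arn = inspector.start_assessment_run(
        assessmentTemplateArn=template_arn
    )["assessmentRunArn"]
    return {"target": target_arn, "template": template_arn, "run": run_arn}
```

With real credentials this would be called as `create_assessment(boto3.client("inspector"), "SomeTagKey", "some-value", "demo")`; the tag key and value here are made up.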

 

Continuous assessment of AMIs

 

As part of a project, we have been working on setting up an AMI Factory for a client. This factory builds AMIs for different operating systems and versions using Packer.

Those AMIs are not built on a regular basis but are used throughout the client’s accounts for different applications. It is therefore essential that these AMIs are properly secured. We run a number of hardening jobs on those but we still need to make sure that we ship them without known vulnerabilities.

To achieve this, we used Amazon Inspector, CloudWatch Events, SNS topics and Lambda functions to react to events and notify users when needed. Below is the complete solution.

[Figure: Cloudreach solution]

Events

 

There are two types of triggers for the assessment pipeline:

  • CloudWatch schedule
  • CloudWatch event

The schedule is set to run on a daily basis, while the event rule responds to a build event from the AMI factory pipeline. This ensures that the latest AMI is checked for security vulnerabilities right after it’s built and throughout its lifetime. It also enables us to check that after patching an AMI and running a new build, previous vulnerabilities are indeed fixed.
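
As a rough sketch, the two triggers could be wired up with the boto3 `events` client as shown below. The rule names, the OS codes and the `source` / `detail-type` values in the event pattern are hypothetical, since the events emitted by the AMI factory are not shown here; only the `Assessments` / `OS` payload shape is taken from the Lambda function in the next section.

```python
import json


def build_payload(os_codes):
    """Build the input the Start EC2 Lambda reads: one entry per OS code."""
    return {"Assessments": [{"OS": code} for code in os_codes]}


def create_triggers(events, lambda_arn, os_codes):
    """Create a daily schedule rule and a build-event rule for the Lambda.

    `events` is expected to be boto3.client("events").
    """
    target = {
        "Id": "start-ec2-lambda",
        "Arn": lambda_arn,
        # A constant JSON input replaces the raw event passed to the Lambda.
        "Input": json.dumps(build_payload(os_codes)),
    }

    # Trigger 1: run once a day.
    events.put_rule(Name="ami-assessment-daily", ScheduleExpression="rate(1 day)")
    events.put_targets(Rule="ami-assessment-daily", Targets=[target])

    # Trigger 2: react to a build event (hypothetical event pattern).
    pattern = {"source": ["custom.ami-factory"], "detail-type": ["AMI Build Complete"]}
    events.put_rule(Name="ami-assessment-on-build", EventPattern=json.dumps(pattern))
    events.put_targets(Rule="ami-assessment-on-build", Targets=[target])
```

The Lambda function also needs a resource-based permission allowing `events.amazonaws.com` to invoke it, which is omitted here. In practice the build-event rule would more likely pass along only the OS that was just built rather than a constant list.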

 

Lambda – Start EC2

 

This Lambda function takes a parameter from the CloudWatch event (namely the OS code). It then pulls some parameters from SSM, such as the AMI ID, the Inspector assessment template ARN and some tags, and starts an EC2 instance with those tags. The instance’s user data downloads and installs the Inspector agent, then starts an assessment run with the correct template.

import os
import json
import boto3


def handler(event, context):
    print(event)

    assessments = event.get("Assessments", [])

    session = boto3.session.Session()
    region = session.region_name

    ec2 = boto3.client('ec2')
    ssm = boto3.client('ssm')

    instance_type = os.environ['INSTANCE_TYPE']
    # Per-OS settings are stored in SSM as a JSON document keyed by OS code
    inspector_parameters = ssm.get_parameter(Name="{{SSMParameterName}}")['Parameter']['Value']
    inspector_parameters = json.loads(inspector_parameters)

    for assessment in assessments:
        os_code = assessment.get("OS")

        # Start ec2 instance from latest AMI for given OS and apply tags
        assessment_parameters = inspector_parameters.get(os_code)
        latest_version_id = assessment_parameters.get("AMI_ID")
        tags = [
            {
                'Key': assessment_parameters.get('TAG_KEY'),
                'Value': assessment_parameters.get('TAG_VALUE')
            }
        ]
        assessment_template = assessment_parameters.get('INSPECTOR_ASSESSMENT_TEMPLATE')
        response = ec2.run_instances(
            ImageId=latest_version_id,
            InstanceType=instance_type,
            MaxCount=1,
            MinCount=1,
            TagSpecifications=[
                {
                    'ResourceType': 'instance',
                    'Tags': tags
                }
            ],
            SubnetId=os.environ['SUBNET_ID'],
            # User data installs the Inspector agent, then starts the run
            UserData="\n".join(
                [
                    "#!/bin/bash",
                    "yum install yum-utils -y",
                    "package-cleanup --oldkernels --count=1 -y",
                    "wget https://d1wk0tztpsntt1.cloudfront.net/linux/latest/install",
                    "bash install",
                    "/etc/init.d/awsagent start",
                    "aws inspector start-assessment-run --assessment-template-arn {} --region {}"
                ]
            ).format(assessment_template, region),
            IamInstanceProfile={
                'Arn': os.environ['INSTANCE_PROFILE']
            }
        )
        instance_id = response.get("Instances")[0].get("InstanceId")
        print("Instance {} has been started...".format(instance_id))
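
The function above expects the SSM parameter to hold a JSON document keyed by OS code. The sketch below shows one shape that would satisfy it, inferred from the keys the handler reads. Every value (OS code, AMI ID, tag, ARN) is a made-up placeholder, and `validate_parameters` is a hypothetical helper, not part of the original solution.

```python
import json

REQUIRED_KEYS = ("AMI_ID", "TAG_KEY", "TAG_VALUE", "INSPECTOR_ASSESSMENT_TEMPLATE")

# Placeholder values for illustration only.
EXAMPLE_PARAMETER = json.dumps({
    "amzn2": {
        "AMI_ID": "ami-0123456789abcdef0",
        "TAG_KEY": "InspectorTarget",
        "TAG_VALUE": "amzn2",
        "INSPECTOR_ASSESSMENT_TEMPLATE": "arn:aws:inspector:example:template",
    }
})


def validate_parameters(raw_value, os_code):
    """Return the settings for one OS, failing early if anything is missing."""
    parameters = json.loads(raw_value)
    if os_code not in parameters:
        raise KeyError("No Inspector parameters for OS {!r}".format(os_code))
    settings = parameters[os_code]
    missing = [key for key in REQUIRED_KEYS if not settings.get(key)]
    if missing:
        raise ValueError(
            "Missing keys for {}: {}".format(os_code, ", ".join(missing))
        )
    return settings
```

Checking the document up front means an unknown OS code or an incomplete entry fails with a clear message instead of passing `None` to `run_instances`.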

The policy associated with the role attached to the EC2 instance and giving it rights to start an Inspector assessment run is the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "inspector:StartAssessmentRun"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}

 

Inspector Configuration

 

Inspector is set up to have one assessment target (set of tags) per OS that we maintain. We also deployed assessment templates that specify the assessment target, duration, rules packages and SNS topic to send notifications to at certain steps.

After the EC2 instance starts, it starts an assessment run based on one of those pre-deployed assessment templates. During the run, the agent installed on the instance will send telemetry to Inspector.

After the assessment run is over, Inspector will analyze all the telemetry and compare it to the rules in the different rules packages. Once this is done, Inspector will generate findings and publish a message on the SNS topic containing the assessment run ARN among other fields.
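
A small sketch of how a subscriber might pull the run ARN out of such a notification follows. The `run` field matches what the Lambda function in the next section reads; the `event` field and its `ASSESSMENT_RUN_COMPLETED` value are an assumption about the message format and should be checked against a real notification. The function name is hypothetical.

```python
import json


def completed_run_arn(lambda_event):
    """Return the assessment run ARN if the SNS message reports a finished run.

    Returns None for any other notification so the caller can ignore it.
    """
    message = json.loads(lambda_event["Records"][0]["Sns"]["Message"])
    # Assumption: Inspector sets "event" to ASSESSMENT_RUN_COMPLETED at the end.
    if message.get("event") != "ASSESSMENT_RUN_COMPLETED":
        return None
    return message.get("run")
```

Filtering on the event type is only useful if the topic also receives other Inspector notifications; if the template publishes on completion only, reading `run` directly is enough.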

 

Lambda – Gather findings

 

This Lambda function is triggered by the SNS topic on which Amazon Inspector publishes after the assessment run is finished.

It retrieves the assessment run ARN from the SNS message and loops through all the findings, retrieving the following fields:

  • Title
  • Severity
  • Description
  • Recommendation

It then creates a message with all the findings in an HTML table so it can be sent via email to admins using SES.

It also terminates the EC2 instances that were launched as part of the security assessment run.

import json
import os
import boto3


def handler(event, context):
    message = event['Records'][0]['Sns']['Message']
    data = json.loads(message)
    print(data)

    # Retrieve assessment run ARN from message
    assessment_run = data['run']

    ec2 = boto3.client('ec2')
    inspector = boto3.client('inspector')
    ses = boto3.client("ses")

    # Get a list of all finding ARNs. list_findings is paginated,
    # so collect every page rather than relying on one large request.
    finding_arns = []
    paginator = inspector.get_paginator('list_findings')
    for page in paginator.paginate(assessmentRunArns=[assessment_run]):
        finding_arns.extend(page['findingArns'])

    finding_messages = []
    ami_names = set()
    instance_ids = set()
    finding_message_template = "<tr>" + "<td>{}</td>" * 4 + "</tr>"
    finding_message_header = "<tr>" + "<th>{}</th>" * 4 + "</tr>"
    finding_message_header = finding_message_header.format(
        "Title",
        "Severity",
        "Description",
        "Recommendation"
    )
    ami_link = os.environ['AMI_LINK_TEMPLATE']

    for finding_arn in finding_arns:
        finding = inspector.describe_findings(findingArns=[finding_arn])
        for result in finding['findings']:
            # The agent ID is the ID of the assessed EC2 instance
            instance_id = result['assetAttributes']['agentId']
            instance_ids.add(instance_id)
            ami_id = ec2.describe_instances(InstanceIds=[instance_id]).\
                get("Reservations")[0].get("Instances")[0].get("ImageId")
            ami_name = ec2.describe_images(ImageIds=[ami_id]).\
                get("Images")[0].get("Name")
            ami_names.add("{} ({})".format(
                ami_link.format(ami_name, ami_name),
                ami_link.format(ami_id, ami_id)
            ))
            finding_message = finding_message_template.format(
                result.get('title'),
                result.get('severity'),
                result.get('description'),
                result.get('recommendation')
            )
            finding_messages.append(finding_message)

    results = "".join([
        "<table>",
        finding_message_header,
        "".join(finding_messages),
        "</table>"
    ])

    # No findings: look up the assessed instances through the run's agents
    if not instance_ids:
        response = inspector.list_assessment_run_agents(
            assessmentRunArn=assessment_run
        )
        for agent in response.get("assessmentRunAgents"):
            instance_id = agent.get("agentId")
            instance_ids.add(instance_id)
            ami_id = ec2.describe_instances(InstanceIds=[instance_id]).\
                get("Reservations")[0].get("Instances")[0].get("ImageId")
            ami_name = ec2.describe_images(ImageIds=[ami_id]).\
                get("Images")[0].get("Name")
            ami_names.add("{} ({})".format(
                ami_link.format(ami_name, ami_name),
                ami_link.format(ami_id, ami_id)
            ))

        results = "<p>There were no issues found on this AMI.</p>"

    for instance_id in instance_ids:
        print('Terminating instance {}'.format(instance_id))
        ec2.terminate_instances(InstanceIds=[instance_id])

    from_address = os.environ['SOURCE_EMAIL']
    subject = "Inspector - Assessment run report"
    body = "\n".join([
        "<p>The findings from your assessment run can be found below.</p>",
        "<p>This assessment ran for the following AMIs: {}</p>".format(
            ", ".join(ami_names)
        ),
        results
    ])

    ses.send_email(
        Source=from_address,
        Destination={
            # Placeholder: replace with the list of recipient addresses
            "ToAddresses": [<LIST_OF_ADDRESSES>]
        },
        Message={
            "Subject": {
                "Data": subject
            },
            "Body": {
                "Html": {
                    "Data": body
                }
            }
        }
    )
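
Finding titles and descriptions are free text, so it may be worth escaping them before embedding them in the HTML email. The sketch below is one possible variation on the table-building step, written for Python 3 (`html.escape`). `build_findings_table` is a hypothetical helper, and it assumes each finding is a dict with the same four keys used above.

```python
from html import escape

COLUMNS = ("title", "severity", "description", "recommendation")


def build_findings_table(findings):
    """Render findings as an HTML table, escaping every cell."""
    if not findings:
        return "<p>There were no issues found on this AMI.</p>"
    header = "<tr>" + "".join(
        "<th>{}</th>".format(name.capitalize()) for name in COLUMNS
    ) + "</tr>"
    rows = []
    for finding in findings:
        cells = "".join(
            "<td>{}</td>".format(escape(str(finding.get(name, ""))))
            for name in COLUMNS
        )
        rows.append("<tr>" + cells + "</tr>")
    return "<table>" + header + "".join(rows) + "</table>"
```

Keeping the rendering in a pure function like this also makes it easy to test without calling AWS.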

Here is the policy in the service role associated with that Lambda function:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ec2:TerminateInstances",
                "ec2:DescribeImages",
                "ec2:DescribeInstances"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "inspector:DescribeFindings",
                "inspector:ListFindings",
                "inspector:ListAssessmentRunAgents"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "ses:SendEmail"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}

 

Closing words

 

As you can see from the implementation above, we now have a fully functional and automated system that checks the AMIs we deploy for known vulnerabilities. It runs continuously and doesn’t need much upkeep.

Such a system gives confidence to our client that the AMIs they use are safe. It’s very similar to CI/CD for AMI deployment.