ElasticCleaner for AWS Elasticsearch
Amazon Elasticsearch Service (referred to here as AWS Elasticsearch) is a managed service that makes it easy to deploy, operate, and scale Elasticsearch in the AWS cloud. Here at Cloudreach, we value open source, so we have developed a guide to tackling one of the limitations of AWS Elasticsearch.
Why use AWS Elasticsearch?
AWS Elasticsearch is a managed service in which failed nodes are automatically replaced. It is fully integrated with the AWS ecosystem, so you can control and manage the cluster size through the AWS API, and your data can be backed up directly to Amazon S3. It also publishes CloudWatch metrics, which let you monitor the status of your cluster. AWS Elasticsearch can be deployed through CloudFormation, which lets you deploy to multiple accounts and multiple locations with the same structure and configuration. Finally, if these benefits were not enough to entice you, multiple engine versions are available (currently 1.5, 2.3 and 5.1).
When should you use AWS Elasticsearch?
Now that you know why you should use AWS Elasticsearch, you may be wondering when best to do so. AWS Elasticsearch is well suited to the following scenarios:
- Organising CloudWatch Logs
- Ingesting Kinesis Firehose
- Ingesting S3 log files, for example CloudTrail
- Ingesting EC2 log files using Filebeat/Fluentd
- Advanced VPC Flow Audit
- ELB/ELBv2 log parsing
This sounds fantastic, but it can’t be perfect…
You’d be correct in thinking this. Currently, AWS Elasticsearch has some limitations. From a security perspective, the infrastructure is outside of your VPC. In addition, access to Elasticsearch is only possible via IAM or an IP whitelist, and it does not support HTTP Basic Authentication.
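To make the IP whitelist option concrete, here is a minimal sketch of a resource-based access policy built in Python. The helper name `build_ip_access_policy`, the domain ARN, the account ID and the CIDR range are all hypothetical placeholders, not values from our setup; only the overall IAM policy shape (`es:*` actions restricted by an `aws:SourceIp` condition) is standard.

```python
import json


def build_ip_access_policy(domain_arn, allowed_cidrs):
    """Build an access policy document that allows requests only from the given CIDR ranges.

    Hypothetical helper for illustration; the result is a plain dict that can be
    serialised to JSON and attached to the domain as its access policy.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "es:*",
                # Apply to every index and API path under the domain.
                "Resource": f"{domain_arn}/*",
                # Without this condition the domain would be open to anyone.
                "Condition": {"IpAddress": {"aws:SourceIp": list(allowed_cidrs)}},
            }
        ],
    }


# Placeholder ARN and documentation-only CIDR range (assumptions).
policy = build_ip_access_policy(
    "arn:aws:es:eu-west-1:123456789012:domain/example-domain",
    ["192.0.2.0/24"],
)
policy_json = json.dumps(policy, indent=2)
```

An IAM-based policy would instead name specific roles or users in `Principal` and drop the IP condition, at the price of having to sign every request.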
Additionally, an ELK installation usually needs a dedicated tool (Curator) to maintain the cluster. Because AWS Elasticsearch gives you no access to the underlying instances, Curator cannot be run on the cluster nodes themselves. It needs a dedicated EC2 instance instead, with the related cost involved.
Cloudreach designed a dedicated AWS Lambda function to clean up old indexes based on a specified regular expression. We chose AWS Lambda to simplify the management of AWS Elasticsearch rather than running another EC2 instance. As an additional benefit, the cost of Lambda is considerably lower than that of a dedicated EC2 instance.
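The core of such a cleanup can be sketched in a few lines of Python. This is not the ElasticCleaner source, only an illustration under stated assumptions: indexes are named with a daily date suffix in the Logstash style (`logstash-YYYY.MM.DD`), the regular expression exposes that suffix through a named group called `date`, and the function name `select_expired_indices` and the retention value are hypothetical.

```python
import re
from datetime import date, datetime, timedelta

# Assumed naming convention: one index per day, e.g. "logstash-2017.03.01".
DAILY_LOGSTASH_PATTERN = r"logstash-(?P<date>\d{4}\.\d{2}\.\d{2})"


def select_expired_indices(index_names, pattern, retention_days, today):
    """Return the index names that match `pattern` and are older than the retention window."""
    matcher = re.compile(pattern)
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        match = matcher.fullmatch(name)
        if match is None:
            # Not managed by this rule (for example ".kibana"): never delete it.
            continue
        try:
            index_day = datetime.strptime(match.group("date"), "%Y.%m.%d").date()
        except ValueError:
            # Looks like a date but is not a valid one: leave it alone.
            continue
        if index_day < cutoff:
            expired.append(name)
    return expired


example_names = ["logstash-2017.01.01", "logstash-2017.03.01", ".kibana"]
to_delete = select_expired_indices(
    example_names, DAILY_LOGSTASH_PATTERN, retention_days=30, today=date(2017, 3, 10)
)
```

In a Lambda handler, the names returned here would then be removed with the Elasticsearch delete index API (`DELETE /<index>`), using requests signed with the function's IAM role. Keeping the selection logic as a pure function makes it easy to test without a cluster.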
If you have implementation or troubleshooting questions, please open an issue in our repository.
For more open source projects, please check out the work we are doing with Cloudreach Sceptre.