Automate Data Gathering and Evaluation with AWS

Jay Almers | Senior Cloud Solutions Architect


Whether you are creating usage and billing reports or taking inventory of your AWS account(s) in preparation for an application or service migration, data gathering and evaluation are imperative to making actionable, informed decisions. AWS provides some very powerful tools in its management console to help users capture the information necessary to make these kinds of decisions; however, searching for and obtaining data from multiple tools and services and combining it into an aggregate report can often be a tedious task.

To make things more difficult, let’s complicate that process by capturing this same information from multiple accounts. Depending on the volume of data, the number of sources, and the complexity of the manipulation you need to perform, the effort needed to gather and evaluate this information could be quite significant. To make things even more difficult, what happens to the data (and the resulting reports) if human error unknowingly shifts a number by one decimal place?


“Anything you have to do more than twice has to be automated.”

-Adam Stone, CEO of D-Tools


Automation is used in countless industries, from automotive manufacturing to zoological research, to perform repetitive tasks and decrease the possibility of error due to human intervention. The world of Information Technology is no different. Considering the scenario described above, it is clear that collecting large amounts of data from multiple sources in a single account would benefit from automation, if for no other reason than to reduce the time required to generate reports. In a multi-account situation, the time savings are even greater.

Another benefit of automation is its inherent ability to decrease the chances of introducing errors due to human intervention. That’s all well and good, but you may be wondering how we can automate the process of collecting this data from AWS without using the management console, ultimately making your life easier. Enter the AWS Command Line Interface (CLI), an amazing tool that can be used for everything from service provisioning to data collection. With a little bit of programming and knowledge of how the AWS CLI works, we can automate the process of collecting information from multiple sources and data points in an efficient and repeatable manner without the need for tedious and error-prone human intervention.
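To make this a little more concrete, here is a minimal Python sketch of how a script might call the AWS CLI and parse its JSON output. It is an illustration only, not code from any particular project: the helper names `build_aws_command` and `run_aws` are hypothetical, and it assumes the `aws` executable is installed and already configured with credentials. The `--output`, `--region`, `--profile`, and `--query` options are standard AWS CLI global options.

```python
import json
import subprocess


def build_aws_command(service, operation, region=None, profile=None, extra_args=()):
    """Assemble an AWS CLI command line that returns JSON (hypothetical helper)."""
    cmd = ["aws", service, operation, "--output", "json"]
    if region:
        cmd += ["--region", region]
    if profile:
        cmd += ["--profile", profile]
    cmd += list(extra_args)
    return cmd


def run_aws(cmd, runner=subprocess.run):
    """Run the command and parse its JSON output.

    `runner` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without a real AWS account.
    """
    result = runner(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# Example usage (requires a configured AWS CLI, so it is left commented out):
# cmd = build_aws_command("ec2", "describe-regions",
#                         extra_args=["--query", "Regions[].RegionName"])
# region_names = run_aws(cmd)  # a list of region name strings
```

Because the CLI emits structured JSON, the script never has to copy values by hand, which is where the decimal-place kind of error tends to creep in.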

To provide a real-world illustration of how this type of automation can be used, I’ll provide some details from a recent engagement TekStream was a part of. Our client, a large organization with many different internal departments, wanted to migrate a large number of non-enterprise AWS accounts into their controlled Enterprise Master Account. In order to migrate these accounts into the correct organizational structure with the necessary services, controls, and policies applied, they needed to determine the service and application requirements as well as gain an understanding of other operational considerations such as dependencies, availability, security, and fault tolerance.

Capturing this amount of information would have taken a tremendous amount of time if done manually, time that could be better spent on other areas of the business. We developed a series of scripts and utilities that used the AWS CLI and a few supporting libraries for data parsing and manipulation. The scripts collected this invaluable information and created aggregate reports that simplified the process of comparing accounts and assisted the client in making actionable, data-driven decisions in preparation for migration.
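As a rough sketch of the aggregation step (not the actual report format used in the engagement), collected records can be flattened into a single CSV file that is easy to compare across accounts. The function name `report_as_csv`, the column names, and the account numbers below are placeholders chosen for illustration; only Python's standard `csv` and `io` modules are used.

```python
import csv
import io


def report_as_csv(rows):
    """Combine a list of dict records into one CSV document (hypothetical helper).

    The header is the union of all keys, in first-seen order, so records
    from different accounts can carry different fields.
    """
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    # restval fills in an empty cell when a record lacks a column.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder records; real ones would come from the data collection scripts.
sample = [
    {"account": "111111111111", "vpc_count": 2},
    {"account": "222222222222", "vpc_count": 0, "note": "no VPCs found"},
]
csv_text = report_as_csv(sample)
```

Writing `csv_text` to a file gives a report that opens directly in a spreadsheet, with one row per record and blank cells where an account had nothing to report.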

Two of the main benefits of programmatic data collection, aggregation, and parsing are native iteration in the form of loops and conditional logic such as if/then/else statements. For example, we needed to capture information about all the configured Virtual Private Clouds (VPCs) in all available AWS regions. Manually, we would have needed to log into the console, change into each region, navigate to the VPC service, then copy and paste each VPC ID into the report. Then, for each VPC ID, we’d need to navigate to the Gateways, Subnets, Route Tables, Network ACL, DHCP Options, Security Groups, and VPC Peering sub-sections to collect all the information needed to complete the report.

Instead, once the script was developed, we could simply supply a list of account numbers and let the scripts iterate through all regions and service endpoints, building the reports for us and trimming the required effort from multiple hours of manual data gathering and evaluation to approximately five minutes.
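The nested iteration described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the scripts we delivered: it assumes each account is reachable through a named AWS CLI profile (the profile names are hypothetical), it covers only the four VPC sub-resources that accept a plain `vpc-id` filter, and the helper names `aws_ec2` and `collect_vpc_inventory` are made up for the example. The `ec2 describe-*` operations and the response keys are standard AWS CLI ones.

```python
import json
import subprocess

# VPC sub-resources that accept a "vpc-id" filter: CLI operation -> response key.
VPC_RESOURCES = {
    "describe-subnets": "Subnets",
    "describe-route-tables": "RouteTables",
    "describe-network-acls": "NetworkAcls",
    "describe-security-groups": "SecurityGroups",
}


def aws_ec2(operation, profile, region, *extra):
    """Run one `aws ec2` operation and return its parsed JSON output."""
    cmd = ["aws", "ec2", operation, "--profile", profile,
           "--region", region, "--output", "json", *extra]
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(done.stdout)


def collect_vpc_inventory(profiles, home_region="us-east-1", fetch=aws_ec2):
    """Loop over accounts, regions, and VPCs, counting each VPC's sub-resources.

    `fetch` is injectable so the loops can be tried without AWS access.
    """
    rows = []
    for profile in profiles:
        regions = fetch("describe-regions", profile, home_region)["Regions"]
        for region in (r["RegionName"] for r in regions):
            for vpc in fetch("describe-vpcs", profile, region)["Vpcs"]:
                vpc_id = vpc["VpcId"]
                row = {
                    "profile": profile,
                    "region": region,
                    "vpc_id": vpc_id,
                    "is_default": vpc.get("IsDefault", False),
                }
                for operation, key in VPC_RESOURCES.items():
                    found = fetch(operation, profile, region,
                                  "--filters", f"Name=vpc-id,Values={vpc_id}")
                    row[key] = len(found[key])
                rows.append(row)
    return rows


# Example usage (needs configured profiles, so it is left commented out):
# inventory = collect_vpc_inventory(["dept-a", "dept-b"])
```

Each returned row describes one VPC in one region of one account, which is the same information that would otherwise be copied out of the console by hand. Gateways, DHCP options, and peering connections use different filter names, so a fuller version would need a small amount of per-resource conditional logic.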

This is just one example of a countless number of use cases for employing automation to simplify your current business processes.

If you would like more information on automation and how you may be able to leverage it to make your life a little easier, contact TekStream today!