Splunk, AWS, and the Battle for Burst Balance

By: Karl Cepull | Senior Director, Operational Intelligence

 

Splunk and AWS are two of the most widely adopted tools of our time. Splunk provides fantastic insight into your company’s data at incredible speed, and AWS offers an affordable alternative to on-premises or even other cloud environments. Together, they make one of the best combinations for demonstrating the value in your data. But there are many systems that need to work together to make all of this happen.

In AWS, you have multiple types of storage available for your Splunk servers through the Elastic Block Store (EBS) offering. There are multiple volume types that you can use – e.g. “io1”, “gp2”, and others. The “gp2” volume type is perhaps the most common one, particularly because it is usually the cheapest. However, when using this volume type, you need to be aware of Burst Balance.

Burst Balance can be a wonderful system. At its core, Burst Balance allows your volume’s disk IOPS to burst higher when needed, without you paying for guaranteed IOPS all of the time (as you do with the “io1” volume type). What are IOPS? The term stands for Input/Output Operations Per Second and represents the number of reads and writes that can occur over time. Allowing the IOPS to burst can come in handy when there is a spike in traffic to your Splunk Heavy Forwarder or Indexer, for example. However, this system does have a downside that can actually cause the volume to stop completely!

The way Burst Balance works is on a ‘credit’ system. Every second, the volume earns 3 ‘credits’ for every GB of configured size. For example, if the volume is 100GB, you would earn 300 credits every second. These credits are then used for reads and writes – 1 credit for each read or write. When the volume isn’t being used heavily, it will store up these credits (up to a cap of 5.4 million), and when the volume gets a spike of traffic, the credits are then used to handle the spike.

However, if your volume is constantly busy, or sees a lot of frequent spikes, you may not earn credits at a quick enough rate to keep up with the number of reads and writes. Using our above example, if you had an average of more than 300 reads and writes per second, you wouldn’t earn credits fast enough to keep up. What happens when you run out of credits? The volume stops. Period. No reads or writes occur until you earn more credits (again 3/GB/sec). So, all you can do is wait. That can be a very bad thing, so it is something you need to avoid!
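To make the arithmetic concrete, here is a minimal sketch in Python (the volume size, load, and credit cap are the only inputs, and the numbers are hypothetical) that estimates how long a full credit bucket would last under a constant load:

# Rough estimate of gp2 burst duration, based on the crediting model described
# above: credits accrue at 3 per GB per second, the bucket caps at 5.4 million
# credits, and each read or write consumes one credit.
def burst_duration_seconds(volume_gb, sustained_iops, starting_credits=5_400_000):
    earn_rate = 3 * volume_gb       # credits earned per second
    burn_rate = sustained_iops      # credits spent per second
    if burn_rate <= earn_rate:
        return None                 # the load is sustainable; credits never run out
    return starting_credits / (burn_rate - earn_rate)

# Example: a 100GB volume (300 IOPS baseline) under a steady 1,000 IOPS load
seconds = burst_duration_seconds(volume_gb=100, sustained_iops=1000)
print(f"Credits exhausted after roughly {seconds / 3600:.1f} hours")   # ~2.1 hours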

The good news is that AWS has tools that you can use to monitor and alert if your Burst Balance gets low. You can use CloudWatch to monitor the Burst Balance percentage, and also set up an alert if it gets low. To view the Burst Balance percentage, one way is to click on the Volume in the AWS console, then go to the Monitoring tab. One of the metrics is the Burst Balance Percentage, and you can click to view it in a bigger view:

As you can see in the above example, the Burst Balance has been at 100% for most of the last 24 hours, with the exception of around 9pm on 3/19, where it dropped briefly to about 95% before returning to 100%. You can also set up an alarm to alert you if the Burst Balance percentage drops below a certain threshold.
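If you prefer to script the alerting rather than set it up in the console, the following is a minimal sketch using boto3 (the volume ID and SNS topic ARN are placeholders) that raises a CloudWatch alarm when a volume’s BurstBalance metric drops below 20%:

import boto3

# Alarm when an EBS volume's BurstBalance drops below 20%.
# Replace the volume ID and SNS topic ARN with your own values.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="splunk-indexer-burst-balance-low",
    Namespace="AWS/EBS",
    MetricName="BurstBalance",
    Dimensions=[{"Name": "VolumeId", "Value": "vol-0123456789abcdef0"}],
    Statistic="Average",
    Period=300,                  # evaluate 5-minute windows
    EvaluationPeriods=3,         # require three consecutive breaches
    Threshold=20.0,              # percent of burst credits remaining
    ComparisonOperator="LessThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
)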

So, what can you do if the Burst Balance is constantly dipping dangerously low (or running out!)? There are three main solutions:

  1. You can switch to another volume type that doesn’t use the Burst Balance mechanism, such as the “io1” volume type. That volume type has guaranteed, consistent IOPS, so you don’t need to worry about “running out”. However, it is around twice the cost of the “gp2” volume type, so your storage costs could double.
  2. Since the rate at which you earn Burst Balance credits is based on the size of the volume (3 credits/GB/second), increasing the size of the volume means you earn credits faster. For example, if you increase the size of the volume by 20%, you will earn credits 20% faster. If you are coming up short, but only by a little, this may be the easiest and most cost-effective option, even if you don’t actually need the additional storage space. A scripted example of this resize appears after this list.
  3. You can modify your volume usage patterns to either reduce the number of reads and writes, or perhaps reduce the spikes and spread out the traffic more evenly throughout the day. That way, you have a better chance that you will have enough credits when needed. This may not be an easy thing to do, however.
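Following on from option 2 above, here is a minimal boto3 sketch (the volume ID and sizes are placeholders) of resizing a gp2 volume so it earns credits faster; note that the filesystem still needs to be extended inside the operating system after the EBS resize completes:

import boto3

# Grow a gp2 volume from 100GB to 120GB so it earns burst credits 20% faster.
# The volume ID is a placeholder.
ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.modify_volume(VolumeId="vol-0123456789abcdef0", Size=120)

# Optionally poll the modification state until the resize completes
resp = ec2.describe_volumes_modifications(VolumeIds=["vol-0123456789abcdef0"])
print(resp["VolumesModifications"][0]["ModificationState"])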

In summary, AWS’s Burst Balance mechanism is a very creative and useful way to give you performance when you need it, without having to pay for it when you don’t. However, if you are not aware of how it works and how it could impact your environment, it can suddenly become a crippling issue. It pays to understand how this works, how to monitor and alert on it, and options to avoid the problem. This will help to ensure your Splunk environment stays running even in peak periods.

Want to learn more? Contact us today!

Textract – The Key to Better Solutions

By: Troy Allen | Vice President of Emerging Technologies

 

Businesses thrive on information, but good data can be difficult to collect, sort, and utilize due to the vast variety of sources and forms by which information is created and disseminated.  As organizations are inundated with documents, forms, data streams, and more, it’s becoming harder to extract meaningful information efficiently and funnel that information into the systems that need it, or to present it in a fashion that drives better business decisions.  Textract, part of AWS’s ever-growing set of Machine Learning solutions, can play a critical part in how businesses process documents and collect vital data for use in their critical solutions and operations.

While Optical Character Recognition (OCR) has been around for many years, many organizations tend to overlook its strengths and ability to improve data processing.  Textract, while it does provide OCR functionality as a Cloud-based service, is much more thorough in its ability to bring Machine Learning based models to your business applications.  In order for data to be useful, it must first be collected; Textract provides OCR capabilities to ensure text is recognized from paper-scanned documents to electronic forms.

For data to be really useful, it needs to have organization and structure; Textract provides the ability to automatically detect content layout and recognize key elements and the relationship of the text and the elements it discovers.  And finally, for data to not only be useful, but actually utilized, it needs to be accessed; Textract can easily share the data, in its context, with other applications and data stores through well-formatted data streams to applications, databases, and other services.  Textract is designed to collect and filter data from documents and files so that you don’t have to.  Solutions utilizing Textract naturally benefit from an automated flow of information from capture to storage, to retrieval.

Textract is more than just OCR

In 1914, Emanuel Goldberg developed a machine that could read characters and convert them into telegraph code. Goldberg also applied for a patent in 1927 for his “Statistical Machine”.  Goldberg’s statistical machine was designed to retrieve individual records from spools of microfilm by using a movie projector and a photoelectric cell to do pattern recognition and find the right record on microfilm. In many ways, Goldberg’s inventions are credited as the beginning of Optical Character Recognition (OCR) technology.  In the decades since, OCR has become one of the most critical, if least recognized, elements in building business solutions.

OCR moved beyond the business world to enabling sight-impaired people to read printed materials.  Ray Kurzweil and the National Federation of the Blind announced a new product in 1976, based on newly developed charged-coupled device (CCD) flatbed scanners and text-to-speech synthesizers, which has fundamentally changed the way we work with information.  It was no longer about reports, statistics, or data; it was about sharing information with anyone, in a format that could easily be accessible.  By 1978, OCR had moved into the digital world as a computer program.

Like all new technologies, OCR has had its issues and limitations.  In the beginning, text had to be very clear and set in certain fonts to be recognized.  The scan quality of physical pages is also a major factor in how well OCR engines extract text; in most cases, only a portion of the text is captured from poor scans.  Even today, with so many advancements in OCR, there are challenges to accurately collecting and organizing data from images.

Most OCR engines collect all the text from documents and make the words available for search engines, but very few OCR engines take it any further without requiring additional tools and applications.  Textract by Amazon Web Services goes beyond OCR by not only collecting the content but understanding where the content came from.

Textract not only performs standard character recognition but is also designed to understand the formatting and how content is aligned within a page.  This is accomplished by recognizing and creating Bounding Boxes around key information and text areas to support content, table, and form extraction.

Item Location on a Document Page

The example image displays content that is separated by columns and has header information.

Figure 1 – Two Column Document Example

Most OCR applications will collect all the words on the page, but do not provide a reference to lines of text or location.  Amazon’s Textract retrieves multiple blocks of information from each page of the image it investigates:

  • The lines and words of detected text
  • The relationships between the lines and words of detected text
  • The page that the detected text appears on
  • The location of the lines and words of text on the document page

As the following illustration demonstrates, Textract is able to identify that there are two columns of information on the page.  It then recognizes that for each column, there are multiple lines of text which are made up of multiple words.

Figure 2 – Textract Line and Word Recognition

Textract outputs its findings in standard JSON files so that they can be utilized easily by other services or applications.  The example above would be represented in the JSON as follows:

Figure 3 – Sample JSON
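While the sample JSON itself is not reproduced here, a minimal boto3 sketch (the S3 bucket and file name are placeholders) shows how that output is obtained and what the LINE/WORD block structure looks like when you walk it:

import boto3

# Run Textract's synchronous text detection against an image stored in S3 and
# print each detected line with its confidence and bounding-box position.
textract = boto3.client("textract", region_name="us-east-1")

response = textract.detect_document_text(
    Document={"S3Object": {"Bucket": "my-example-bucket", "Name": "two-column-page.png"}}
)

for block in response["Blocks"]:
    if block["BlockType"] == "LINE":
        box = block["Geometry"]["BoundingBox"]
        print(f'{block["Confidence"]:.1f}%  (left={box["Left"]:.2f}, top={box["Top"]:.2f})  {block["Text"]}')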

Table Extraction

Amazon’s Textract is well equipped to locate table data within documents as well.  Textract recognizes the table construct and can establish key-value pairs with the cells by referencing the row and column information.  The following table represents 20 distinct cells, including the header row that will be evaluated by Textract:

Figure 4 – Sample table data

The output JSON from the Textract service creates a mapping between the rows and columns and intelligently identifies the key-value pairs in the table.  This recognition can also be performed against vertical table data rather than horizontal tables.  The following illustrates the key-value pair matching:

Figure 5 – Table Key-Value Pair

In addition to detecting text, Textract has the ability to recognize selection elements such as checkboxes and radio buttons.  A checkbox or radio button that has not been selected is represented with a status of NOT_SELECTED, whereas selected elements are represented as SELECTED, and either can be tied to a key-value pair as well.  This can be extremely helpful in finding values in both tables and forms.

 

Form Extraction

Businesses have been interacting with their clients and vendors for decades through forms.  Textract provides the ability to read form data and clearly define key-value pairs of information from them.  Many organizations struggle with the fact that forms change over time, and it can be difficult to train tools to find data when those tools were built for one particular form layout.  Textract removes that complexity by reading the actual text, rather than a fixed location on a form, and analyzing documents and forms for relationships between the detected text.

Figure 6 – Sample form image

In the example above, Textract will create the following Key-value pairs:

Traditional OCR tools will extract all the available text from an image or document, but gathering key-value pairs from forms and tables, recognizing text as words and lines, and understanding how content is blocked together all require additional tools.  Textract does all of this for you, providing data that can then be further analyzed as needed.
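For form and table analysis specifically, Textract exposes a separate AnalyzeDocument operation. The sketch below (boto3, with a placeholder bucket and object key) requests both feature types; fully pairing KEY and VALUE blocks back together requires walking the Relationships arrays in the response, which is omitted here for brevity:

import boto3
from collections import Counter

# Ask Textract to analyze a form image for both tables and key-value pairs.
# Bucket and object key are placeholders.
textract = boto3.client("textract", region_name="us-east-1")

response = textract.analyze_document(
    Document={"S3Object": {"Bucket": "my-example-bucket", "Name": "hr-form.png"}},
    FeatureTypes=["FORMS", "TABLES"],
)

# KEY_VALUE_SET, TABLE, and CELL blocks carry the form and table structure,
# while LINE and WORD blocks carry the raw text.
print(Counter(block["BlockType"] for block in response["Blocks"]))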

Textract Considerations

Textract is specifically designed to perform OCR against image files such as JPG and PNG, as well as PDF files.  Most text-based document formats created electronically today do not require additional OCR, since they already carry an embedded text index that is accessible to search engines.  With the proliferation of mobile device and tablet use, however, there are still many times that images are created with no inherent index available.  We use our phones to take pictures of everything, including people, scenes, receipts, presentations, and much more.  It is quick and easy to capture the world around us, but it is more difficult for a computer application to capture the important information that may be held in those photographs.  Textract enables the extraction of data from those images so that you don’t have to do it by hand.

As with all technologies, there are limits to what Textract can do, and these should be recognized before introducing it into a solution.  AWS maintains detailed information about the Amazon Textract service and its limitations, which can be found here: https://docs.aws.amazon.com/textract/latest/dg/limits.html.

Putting Textract to Work

While OCR is important and can be a critical part of any business process, it is an engine that retrieves information from sources that could not be accessed except through human intervention.  In many ways, it is like an important element within a car’s engine.  A fuel injector is critical for a car to run, but may not have much value as an entity unto itself.  It’s when you bring various parts together that your car takes you where you need to go or your application drives your business.

To create a basic OCR application with Textract, you will need:

  • A place to store the images that need to be processed. In many situations this will be Amazon S3 (Simple Storage Service), Amazon WorkDocs (a secure content creation, storage, and collaboration service), or even a relational database like Amazon Aurora.
  • An application or service to call the Textract services. Many organizations are creating Cloud-first applications and may choose to use AWS Lambda to run their code without having to worry about the servers where the code runs.
  • A place to store the results of the Textract services. The options for where to store the text and details uncovered by Textract are nearly limitless: back into an Amazon S3 bucket, a database like Aurora, or even a data warehouse like Amazon Redshift.
  • And finally, you need to do something with the information you have collected. This all depends on your goals for the information, but at a minimum, most people want to search it.  Amazon Elasticsearch Service is an easy way to let people find the new information Textract gathered for you.

The following outlines this simple Textract solution:

Figure 7 – Simple Textract solution with Amazon Elasticsearch Service
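As a rough sketch of how those pieces could fit together, the AWS Lambda handler below (bucket names are placeholders, and the final step is reduced to writing JSON back to S3) would be triggered by an S3 upload, run Textract on the new object, and store the extracted text where a downstream service such as Amazon Elasticsearch Service could pick it up:

import json
import boto3

textract = boto3.client("textract")
s3 = boto3.client("s3")

RESULTS_BUCKET = "textract-results-example"   # placeholder destination bucket

def handler(event, context):
    # Triggered by an S3 PUT event: OCR the uploaded image with Textract and
    # write the extracted lines to a results bucket, from which they could be
    # indexed into Amazon Elasticsearch Service or loaded into a database.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    lines = [b["Text"] for b in response["Blocks"] if b["BlockType"] == "LINE"]

    s3.put_object(
        Bucket=RESULTS_BUCKET,
        Key=f"{key}.json",
        Body=json.dumps({"source": f"s3://{bucket}/{key}", "lines": lines}),
        ContentType="application/json",
    )
    return {"lines_detected": len(lines)}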

Practical Applications for Textract

While being able to search for information that was extracted from images is useful, it isn’t all that compelling from a business perspective.  Information needs to be meaningful and applied to a task so that its value can be recognized.  The following examples illustrate common business processes and the role that Textract can play in them.

Human Resource Document Management

Every organization has employees and/or volunteers to support their efforts.  There are many state, county, and country regulations that drive what information we need to keep about our employees as well as operational documents about the employees that help us to keep our businesses running.  The following are some examples of common documents that most organizations need to collect and retain:

  • Employment applications
  • Employee resumes
  • Interview notes, references, and background information
  • Employee offer letters
  • Benefit elections
  • Employee appraisals
  • Wage garnishments
  • State and Federal Employee documentation
  • Employee disciplinary actions
  • Termination decisions and disclosures
  • Promotion recommendations
  • Employee complaints and investigations
  • Leave request documentation

While there are many applications and services available on the market today which will help organizations capture, index, and retain this information, they can sometimes be costly and may not be able to completely capture information held in non-text-based file formats.  As discussed earlier, more and more people are using mobile and tablet technologies because of their accessibility and ease of use.  In many cases, an employee may use their phone to take a picture of a signed employee document and send it in to the company.  This photographed document can cause issues in capturing the information in it, or even classifying it properly in an automated fashion.  This is where Textract can easily be integrated into an existing solution, or incorporated as part of a newly constructed solution, to ensure vital information isn’t missed.

The following illustrates how a Cloud solution based on Amazon services can facilitate common Human Resource document management activities:

 

Amazon Services Utilized:

Amazon S3|Amazon WorkDocs|AWS Lambda|Amazon Textract|Amazon API Gateway

Non-Amazon Application examples:

Workday|Oracle Human Capital Management|Oracle PeopleSoft

 

In this example, a newly hired employee is granted access to the company’s Amazon WorkDocs environment to upload documentation that will be required during the hiring process.  While most of the documents being uploaded will be easily indexed and searchable through the Amazon WorkDocs service, the employee has been asked to upload a copy of their driver’s license.  The employee uses the Amazon WorkDocs mobile application to take a picture of the license and uploads it to the appropriate folder from their phone.  Behind the scenes, the company has configured a workflow in Amazon WorkDocs to inform HR managers when new documents have been submitted, and a Human Resources representative reviews the uploaded driver’s license.  The representative then launches an action in Amazon WorkDocs (a special feature provided by the company’s IT department) that triggers an operation running on AWS Lambda, which uses Textract to capture OCR information from the driver’s license and create key-value information.  That information is sent to the company’s ERP system (such as Workday, Oracle Human Capital Management, Oracle PeopleSoft, or a similar application), along with the Amazon WorkDocs reference for where the actual image is stored.

This illustrates a very simple method of directly engaging with employees to capture critical HR information through a combination of out-of-the-box Amazon services and some lightweight customizations, creating a streamlined process for document storage and data capture.  It only took the employee a few seconds to take the picture of the driver’s license and upload it, and the HR representative a few seconds to review and process the new document.  In fact, the solution could be configured to automatically extract the required details and send them to the ERP without the HR representative being involved at all, for a truly automated solution.  Imagine each new hire having ten to twenty documents they need to upload and how much time HR spends processing each document manually for every new employee.  Automating this process can amount to several hours a month of time savings, especially when dealing with non-text-based file formats that otherwise require someone to manually read the documents and key in the information contained in them.  By introducing Amazon Textract into the overall solution, data can be collected, stored, processed, and shared easily and more efficiently.

Business Document Processing and Information Automation

While the Human Resource Document Management example above focused on capturing documents individually as they come in, there are many situations where companies need to process documents in bulk.  Using similar AWS services as the previous example, solutions can be designed to allow for batch uploading of documents for processing.  As an example, procurement procedures for large purchases can incorporate a wide variety of documentation which may have vastly different processes associated with them.  By providing a simple way for files to be uploaded in bulk, AWS services can be utilized to sort through the file formats for processing.  Non-text-based image files like JPG, PNG, and PDF files can then be automatically processed by Amazon Textract to capture OCR information, Table data, and Key-value information from forms and then shared with back-office applications, stored in data warehouse services, and/or shared with Amazon Elasticsearch services.  Processing hundreds or even thousands of documents and images a month becomes much easier through automation.  Incorporating Textract into business process work streams ensures that critical information is identified and captured from structured and semi-structured documents reducing the need for manual classification of information to facilitate business operations like Insurance Claims, Legal Processes, Partner Management, Purchasing, and more.

Litigation is disruptive to normal business operations for any company.  Thousands of documents, images, and artifacts have to be reviewed and collected to share with attorneys and the courts during a legal process, which can be time-consuming.  While there are many discovery tools available on the market today to help speed up the process of finding the desired information, they are reliant on information being in a format that the discovery services can handle.  In many cases, important information is stored in pictures and scanned documents that these discovery services cannot easily process.  Amazon’s Textract becomes a valuable tool in the discovery process by allowing organizations to quickly filter through image files and capture OCR information so that it can be indexed and searched.

Litigation isn’t only a headache for companies; it is a headache for the legal teams associated with the litigation process as well.  Imagine a law firm receiving millions of electronic files from a company and having to read through each document to find pertinent information regarding the case they are working on.  This can take months and many resources, time that most lawyers don’t have during a case.  Files may be images, documents, spreadsheets, audio files, and even video files, all of which need to be processed so that key information can be selected to support the case.  The expense of a large legal process can be staggering due to the sheer amount of manual labor required to gather information.  In the following example, Amazon’s Artificial Intelligence and Machine Learning services, including Textract, are used to greatly reduce the processing time for legal discovery.

Amazon Services Utilized:

AWS Transfer for SFTP|Amazon S3|AWS Lambda|Amazon Textract|Amazon API Gateway|Amazon Rekognition|Amazon Comprehend|Amazon Transcribe|Amazon Elasticsearch Service

In this example, a legal firm utilizes AWS Transfer for SFTP to allow clients and opposing counsel to quickly upload all of their discovery files and documents, which are then automatically stored in Amazon S3.  Files are then sorted based on file type for processing.  Amazon Textract captures OCR information from image files, including table and form data, while Amazon Rekognition analyzes photos and videos to identify objects, people, text, scenes, and activities, perform facial recognition, and detect any inappropriate content.  Audio and video files are processed through Amazon Transcribe to capture speech-to-text information.  As files are processed, the information is captured and indexed in Amazon Elasticsearch Service to enable rich search functionality for the litigators, and it is also processed by Amazon Comprehend to quickly find relationships and insights across all the data collected.
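A minimal sketch of that sorting step might look like the following (the bucket name, prefixes, and extension mapping are hypothetical); each destination prefix would then feed the matching service, with images and scanned documents going to Textract, photos and video also fanning out to Rekognition, and audio and video going to Transcribe:

import boto3

s3 = boto3.client("s3")

DISCOVERY_BUCKET = "legal-discovery-intake-example"   # placeholder bucket
ROUTES = {                                            # extension -> processing prefix
    ".jpg": "for-textract/", ".png": "for-textract/", ".pdf": "for-textract/",
    ".mp4": "for-transcribe/", ".wav": "for-transcribe/",
}

def route_new_object(key):
    # Copy an uploaded discovery file into the prefix watched by the
    # appropriate downstream processor, based on its file extension.
    extension = key[key.rfind("."):].lower() if "." in key else ""
    prefix = ROUTES.get(extension, "for-manual-review/")
    s3.copy_object(
        Bucket=DISCOVERY_BUCKET,
        Key=prefix + key,
        CopySource={"Bucket": DISCOVERY_BUCKET, "Key": key},
    )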

What would have taken months to sort through and comprehend becomes manageable information in hours or days, providing more time for the legal team to focus on winning their case while saving thousands of dollars on the personnel required to manually process all the discovery information.

The tool you didn’t know you needed

Technology is advancing at incredible speed, and new solutions and services become available every day.  Services like Amazon Textract are rarely thought about, yet they are critical to document processing and imperative for success.  Of all the services Amazon provides, Amazon Textract is one of the hidden gems that can be easily overlooked but deserves to be part of your processing arsenal.

You are not alone

Business solutions can be complex, but making them work for your requirements doesn’t have to be.  Clearly defining your goals and objectives is half of the battle; the other half is knowing what tools will help you achieve those goals.  Are your off-the-shelf solutions and applications collecting all the information you have?  Do you need a business solution to manage all of your documents and data, but don’t know where to start?  Are you looking to move off of an outdated legacy application that no longer supports your business direction?  You are not alone.  Thousands of companies are facing the same questions and are finding the best answers by engaging with experts from Amazon and from solution service providers.  TekStream Solutions, along with AWS, is excited to speak with you about your information processing needs and how the right tools and solutions can have a positive impact on how you conduct business.  TekStream Solutions is offering a free Digital Transformation assessment where we will work with you to identify your document processing needs and provide process and technology recommendations to help you transform your business with ease.

 

Want to learn more about Textract? Contact us today!

Using and Understanding Basic Subsearches in Splunk

By: Brent Mckinney | Splunk Consultant

A subsearch in Splunk is a unique way to stitch together results from your data. Simply put, a subsearch is a way to use the results of one search as the input to another. Subsearches contain an inner search, whose results are then used as input to filter the results of an outer search. The inner search always runs first, and it’s important to note that by default subsearches return a maximum of 10,000 results and will only run for up to 60 seconds.

First, it’s good to understand when to use subsearches and when NOT to use them. Generally, you want to avoid using subsearches when working with large result sets. If your inner search produces a lot of results, applying them as input to your outer search can be inefficient. When working with large result sets, it will likely be more efficient to create fields using the eval command and compute statistics using the stats command. Because subsearches are computationally more expensive than most search types, it is ideal to have an inner search that produces a small set of results and use that to filter a bigger outer search.

Example:

Suppose we have a network that should only be accessed from those local to the United States. We’re interested in seeing a list of users who’ve successfully accessed our network from outside of the United States. We could build one search to give us a list of IP addresses from outside of the U.S., and another search could be used to give a list of all accepted connections. A subsearch could then be used to stitch these results together and help us obtain a comprehensive list.

First, we’d need to decide what our inner results should be, a list of all accepted connections, or a list of all non-U.S. IPs? Using the latter as an inner search would probably work best, as it should return a much smaller set of results.

Our inner search would look something like this, using the iplocation command to give us more info on the IP address field.

index=security sourcetype=linux_secure | stats count by ip_address | iplocation ip_address | search Country!="United States" | fields ip_address

This essentially results in a list of IP addresses that are not from the U.S.

From here, we want to create another search to return a list of all accepted connections. This will be our outer search, and look something like this:

index=security sourcetype=linux_secure connection_status=accepted | dedup ip_address | table ip_address, Country

To Combine these, we can use the following subsearch format. Inner searches are always surrounded by square brackets, and begin with the search keyword. Here’s what our final search would look like:

index=security sourcetype=linux_secure connection_status=accepted
[ search index=security sourcetype=linux_secure | stats count by ip_address | iplocation ip_address | search Country!="United States" | fields ip_address ]
| dedup ip_address
| table ip_address, Country

Here, our inner search (enclosed in square brackets) runs first and returns IP addresses that do not belong to the U.S. Those results are then used to filter the outer search, which returns results for connections that were accepted by the network. Finally, the end of the outer search produces a table with the IP address and country for each result.

We have now obtained a list of IP addresses that have successfully accessed our network, along with the country that it was accessed from, all through the power of a Splunk subsearch!

Tips for troubleshooting if your subsearch is not producing desired results:

  1. Ensure that the syntax is correct. Make sure that the entire inner search is enclosed in square brackets, and that it is placed in the appropriate place of the outer search.
  2. Run both searches by themselves to ensure that they return the expected results independent of each other. Each search may need to be tuned a bit before combining them into a subsearch. Keep in mind that the results of the inner search are used as a filter for the outer search.
  3. You can check the Splunk Job Inspector to see if anything looks out of the ordinary. The normalizedSearch property is helpful here because it shows how the subsearch results were expanded into the outer search.

This covers the basics of subsearches within Splunk. It’s worth noting, however, that there are advanced commands available to use with subsearches to achieve specific results. These include append, which can be used to combine searches that run over different time periods, and join, which can take a field from an inner search and correlate it with events from an outer search. These use similar syntax and are worth trying out once you have the basics down!

Want to learn more about basic subsearches in Splunk? Contact us today!

3 Ways to Migrate Custom Oracle Middleware Applications to the Cloud

Understanding and classifying middleware applications is one of the most critical and complex tasks of any Cloud adoption process. No doubt, your company has several diverse applications integrated into your system. Off-the-shelf applications for sure, but custom-built and legacy applications as well.

Whether you are considering migrating your Oracle solution to Amazon Web Services (AWS), Oracle Cloud Infrastructure (OCI), or another Cloud platform, each of your legacy applications will have its own migration requirements that need to be accounted for during your migration efforts.

3 Methods for Migrating Your Middleware Applications to a Cloud Environment

Re-Hosting: Lift & Shift Migrations

The first method for migrating your middleware applications to the Cloud pertains to your applications that rely on traditional server/compute technologies or applications that, based on their complexity, won’t benefit from re-factoring to utilize newer technologies like serverless or micro-services.

For these applications, we recommend leveraging the Infrastructure as a Service (IaaS) offerings provided by AWS and OCI (depending on your preferred platform). With these IaaS offerings, you can re-create the compute/servers required to run these applications just like you would with a traditional datacenter. You can also layer on new Cloud tools and concepts like:

  •  – On-demand pricing
  •  – Next-generation networking and securities
  •  – Additional service integrations like Content Delivery Networks or API gateways

As a note, many Oracle middleware applications will potentially fall under this category. Most of the WebLogic off-the-shelf applications use stateful sessions for clustering and will require additional effort to be able to integrate with newer Cloud concepts like auto-scaling.

Re-Platform: Migrating Applications to a Managed Platform

For this next method, you’re going to focus on applications that can (and should) be moved to a managed platform. AWS has several services available to support the deployment of custom applications that use various tech stacks (Java, PHP, .Net, etc.).

In these instances, AWS takes over the provisioning, management, and autoscaling of compute/service and network compliances. This can significantly reduce operational costs as companies no longer need to maintain servers, operating systems, networks, etc. It also eases the migration tasks by removing infrastructure components from the mix.

Re-Architect: Recreating an Application for the Cloud

While many applications can be migrated via a “Lift-and-Shift” approach or through a managed platform, others may need to be completely overhauled to function correctly in the Cloud. “Re-thinking” or “re-architecting” these types of applications allows your team to ensure they reach their full potential and realize the benefits of being deployed on the Cloud.

For example, you can explore opportunities to break down monolithic apps into smaller “micro” services and utilize serverless technologies like Lambdas, Amazon Simple Notification Services (SNS), or Amazon Simple Queue Service (SQS) to improve performance. At the same time, you can replace the traditional Oracle RDBMS data sources with new concepts like Data Lakes, Object Storage, or NoSQL Databases.

The Migration Support You Need

Regardless of your Cloud platform of choice, careful consideration needs to be given for how you are going to migrate your legacy middleware applications. You can also use your upcoming migration as an opportunity to audit your applications and determine if there are any that can be sunset or rolled into a new system or application to drive further efficiency.

Have questions? TekStream has deep experience deploying enterprise-grade Oracle middleware applications both on traditional data centers as well as cloud environments like AWS or OCI. As part of any migration, we utilize that experience to help classify applications and apply best practices to the deployment of those applications in the Cloud.

Are you looking for more insight, tips, and tactics for how best to migrate your legacy Oracle solution to the Cloud? Download our free eBook, “Taking Oracle to the Cloud: Key Considerations and Benefits for Migrating Your Enterprise Oracle Database to the Cloud.”

 

If you’d like to talk to someone from our team, fill out the form below.

Solution-Driven CMMC Implementation – Solve First, Ask Questions Later

We’re halfway through 2020 and we’re seeing customers begin to implement and level up within the Cybersecurity Maturity Model Certification (CMMC) framework. Offering a cyber framework for contractors doing business with the DoD, CMMC will eventually become the singular standard for Controlled Unclassified Information (CUI) cybersecurity.

An answer to limitations of NIST 800-171, CMMC requires attestation by a Certified Third-Party Assessor Organization (C3PAO). Once CMMC is in full effect, every company in the Department of Defense’s (DoD’s) supply chain, including Defense Industrial Base (DIB) contractors, will need to be certified to work with the Department of Defense.

As such, DIB contractors and members of the larger DoD supply chain find themselves asking: when should my organization start the compliance process, and what is the best path to achieving CMMC compliance?

First, it is important to start working toward compliance now. Why?

  • – Contracts requiring CMMC certification are expected as early as October, and if you wait to certify until you see an eligible contract, it’s too late.
  • – You can currently treat CMMC compliance as an “allowable cost.” The cost of becoming compliant (tools, remediation, preparation) can be expensed back to the DoD. The amount of funding allocated to defray these expenses and the allowable thresholds are unclear, but the overall cost is likely to exceed initial estimates, and as with any federal program, going back for additional appropriations can be challenging.

As far as the best path to achieving CMMC goes – the more direct, the better.

Understanding Current Approaches to CMMC Compliance

CMMC is new enough that many organizations have yet to go through the compliance process. Broadly, we’ve seen a range of recommendations, most of which start with a heavy upfront lift of comprehensive analysis.

The general process is as follows:

  1. Assess current operations for compliance with CMMC, especially as it relates to its extension of NIST 800-171 standards.
  2. Document your System Security Plan (SSP) to identify what makes up the CUI environment. The plans should describe system boundaries, operation environments, the process by which security requirements are implemented, and the relationship with and/or connections to other systems.
  3. Create a logical network diagram of your network(s), including third-party services, remote access methods, and cloud instances.
  4. List an inventory of all systems, applications, and services: servers, workstations, network devices, mobile devices, databases, third-party service providers, cloud instances, major applications, and others.
  5. Document Plans of Action and Milestones (POAMs). The POAMs should spell out how system vulnerabilities will be solved for and existing deficiencies corrected.
  6. Execute POAMs to achieve full compliance through appropriate security technologies and tools.

This assessment-first approach, while functional, is not ideal.

In taking the traditional approach to becoming CMMC compliant, the emphasis is put on analysis and process first; the tools and technologies to satisfy those processes are secondary. By beginning with a full compliance assessment, you spend time guessing where your compliance issues and gaps are. And by deprioritizing technology selection, potentially relying upon multiple tools, you risk ending up with granular processes that worsen the problem of swivel-chair compliance (i.e., having to go to multiple tools and interfaces to establish, monitor, and maintain compliance and the required underlying cybersecurity). This actually creates more work for your compliance and security team when you have to architect an integrated, cohesive compliance solution.

Then, the whole process has to be redone every time a contractor’s compliance certification is up.

Big picture, having to guess at your compliance gaps upfront can lead to analysis paralysis. By trying to analyze so many different pieces of the process and make sure they’re compliant, it is easy to become overwhelmed and feel defeated before even starting.

With NIST 800-171, even though it has been in effect since January 1, 2018, compliance across the DIB has not been consistent or widespread. CMMC is effectively forcing the compliance mandate by addressing key loopholes and caveats in NIST 800-171:

  • – You can no longer self-certify.
  • – You can no longer rely on applicability caveats.
  • – There is no flexibility for in-process compliance.

So, if you’ve been skirting the strictness of compliance previously, know you can no longer do that with CMMC, and are overwhelmed with where to even begin, we recommend you fully dive into and leverage a tool that can be a single source of truth for your whole process – Splunk.

Leverage a Prescriptive Solution and Implementation Consultancy to Expedite CMMC Compliance

Rather than getting bogged down in analysis paralysis, accelerate your journey to CMMC compliance by implementing an automated CMMC monitoring solution like Splunk. Splunk labels itself “the data to everything platform.” It is purpose-built to act as a big data clearinghouse for all relevant enterprise data regardless of context. In this case, as the leading SIEM provider, Splunk is uniquely able to provide visibility to compliance-related events as the overlap with security-related data is comprehensive.

Generally, the process will begin with ingesting all available information across your enterprise and then implementing automated practice compliance. Through that implementation process, gaps are naturally discovered. If there is missing or unavailable data, processes can then be defined as “gap fillers” to ensure compliance.

The automated practice controls are then leveraged as Standard Operating Procedures (SOPs) that are repurposed into applicable System Security Plans (SSPs), Plans of Action and Milestones (POAMs), and business plans. In many cases, much of the specific content for these documents can be generated from the dashboards that we deliver as a part of our CMMC solution.

The benefits realized by a solution-driven approach, rather than an analysis-driven one, are many:

  1. Starting with a capable solution reduces the overall time to compliance.
  2. Gaps are difficult to anticipate, as they are often not discovered until the source of data is examined (e.g. one cannot presume that data includes a user, an IP address, or a MAC address until the data is exposed). Assumption-driven analysis is cut short.
  3. Automated practice dashboards and the collection of underlying metadata (e.g. authorized ports, machines, users, etc.) can be harvested for document generation.
  4. Having a consolidated solution for overall compliance tracking across all security appliances and technologies provides guidance and visibility to C3PAOs, quelling natural audit curiosity creep, and shortening the attestation cycle.

Not only does this process get you past the analysis paralysis barrier, but it reduces non-compliance risk and the effort needed for attestation. It also helps keep you compliant – and out of auditors’ crosshairs.

Let Splunk and TekStream Get You Compliant in Weeks, Not Months

Beyond the guides and assessments consulting firms are offering for CMMC, TekStream has a practical, proven, and effective solution to get you compliant in under 30 days.

By working with TekStream and Splunk, you’ll get:

  • – Installation and configuration of Splunk, CMMC App, and Premium Apps
  • – Pre/Post CMMC Assessment consulting work to ensure you meet or exceed your CMMC level requirements
  • – Optional MSP/MSSP/compliance monitoring services to take away the burden of data management, security, and compliance monitoring
  • – Ongoing monitoring for each practice on an automated basis, summarized in a central auditing dashboard
  • – Comprehensive TekStream ownership of your Splunk instance, including implementation, licensing, support, outsourcing (compliance, security, and admin), and resource staffing.

If you’re already a Splunk user, this opportunity is a no-brainer. If you’re new to Splunk, this is the best way to procure best-in-class security, full compliance, and an operational intelligence platform, especially when you consider the financial benefit of allowable costs.

If you’d like to talk to someone from our team, fill out the form below.

CMMC Maturity – Understanding What is Needed to Level Up

At its core, the Cybersecurity Maturity Model Certification (CMMC) is designed to protect mission-critical government systems and data and has the primary objective of protecting the government’s Controlled Unclassified Information (CUI) from cyber risk.

CMMC goes beyond NIST 800-171 to require strict adherence to a complex set of standards, an attestation, and a certification by a third-party assessor.

The CMMC framework defines five maturity (or “trust”) levels. As you likely know, the certification level your organization needs to reach is going to be largely situational and dependent on the kinds of contracts you currently have and will seek out in the future.

The CMMC compliance process is still so new that many organizations are just prioritizing what baseline level they need to reach. For most, that’s level 3. With that said, there is certainly value to gain from an incremental approach to leveling up.

Why Seek CMMC Level 4 or 5 Compliance, Anyway?

First, let’s define our terms and understand the meaning behind the jump from Level 3 up to 4 or 5. CMMC trust levels 3-5 are defined as:

Level 3: Managed

  • – 130 practices (including all 110 from NIST 800-171 Rev. 1)
  • – Meant to protect CUI in environments that hold and transmit classified information
  • – All contractors must establish, maintain, and resource a plan that includes their identified domain

Level 4: Reviewed

  • – Additional 26 practices
  • – Proactive, focused on protecting CUI from Advanced Persistent Threats (APTs), and encompassing a subset of the enhanced security requirements from Draft NIST SP 800-171B (as well as other cybersecurity best practices). In Splunk terms, that means a shift from monitoring and maintaining compliance to proactively responding to threats, which puts an emphasis on SOAR tools such as Splunk Phantom to automate security threat response in specific practice categories.
  • – All contractors should review and measure their identified domain activities for effectiveness

Level 5: Optimizing

  • – Additional 15 practices
  • – An advanced and proactive approach to protect CUI from APTs
  • – Requires a contractor to standardize and optimize process implementation across their organization. In Splunk terms, this means expansion to more sophisticated threat identification algorithms to include tools such as User Behavior Analytics.

The benefits of taking an incremental approach and making the jump up to Level 4 (and potentially 5 later) are several:

  1. It can make your bids more appealing. Even if the contracts that you are seeking only require Level 3 compliance, having the added security level is an enticing differentiator in a competitive bidding market.
  2. You can open your organization up to new contracts and opportunities that require a higher level of certification and are often worth a lot more money.
  3. It puts in place the tools and techniques to automatically respond to security-related events. This shortens response times to threats, shortens triage, increases accuracy and visibility, automates tasks that would typically be done manually by expensive security resources, and makes you safer.

Plus, with “allowable costs” in the mix, by defraying the spend on compliance back to the DoD, you get the added financial benefit as well.

How Do You Move Up to the Higher CMMC Trust Levels?

Our recommendation is to start small and at a manageable level. Seek the compliance level that matches your current contract needs. As was highlighted earlier, for most, that is Level 3.

To have reached Level 3, you are already using a single technology solution (like Splunk) or a combination of other tools.

Getting to Level 4 and adhering to the additional 26 practices is going to be an incremental process of layering in another tool, technique, or technology on top of all your previous work. It’s additive.

For TekStream clients, that translates to adding Splunk Phantom to your Splunk Core and Enterprise Security solution. It’s not a massive or insurmountable task, and it is a great way to defray costs associated with manual security tasks and differentiate your organization from your fellow DIB contractors.

TekStream Can Help You Reach the Right Certification Level for You

Ready to start your compliance process? Ready to reach Level 3, Level 4, or even Level 5? Acting now positions you to meet DoD needs immediately and opens the door for early opportunities. See how TekStream has teamed up with Splunk to bring you a prescriptive solution and implementation consultancy.

If you’d like to talk to someone from our team, fill out the form below.

CMMC Response – Managing Security & Compliance Alerts & Response for Maturity Levels 4 and 5

The Cybersecurity Maturity Model Certification (CMMC) is here to stay. There are increased complexities that come with the new compliance model as compared to NIST 800-171, and organizations have to be prepared not only to navigate the new process but also to reach the level that makes the most sense for them.

Level 3 (Good Cyber Hygiene, 130 Practices, NIST SP 800-171 + New Practices) is the most common compliance threshold that Defense Industrial Base (DIB) contractors are seeking out. However, there can be significant value in increasing to a Level 4 and eventually a Level 5, especially if you’re leveraging the Splunk for CMMC Solution.

Thanks to the DoD’s “allowable costs” model (where you can defray costs of becoming CMMC compliant back to the DoD), reaching Level 4 offers significant value at no expense to your organization.

Even if you’re not currently pursuing contracts that mandate Level 4 compliance, by using TekStream and Splunk’s combined CMMC solution to reach Level 4, you end up with:

  • – A winning differentiator against the competition when bidding on Level 3 (and below) contracts
  • – The option to bid on Level 4 contracts worth considerably more money
  • – Automating security tasks with Splunk ES & Phantom
  • – Excellent security posture with Splunk ES & Phantom

And all of these benefits fall under the “allowable costs” umbrella.

The case for reaching Level 4 is clear, but there are definitely complexities as you move up the maturity model. For this blog, we want to zero in on a specific complexity — the alert and response set up needed to be at Level 4 or 5 and how a SOAR solution like Splunk Phantom can get you there.

How Does Splunk Phantom Factor into Levels 4 and 5?

Level 4 is 26 practices above Level 3 and 15 practices below Level 5. Level 4 focuses primarily on protecting CUI and security practices that surround the detection and response capabilities of an organization. Level 5 is centered on standardizing process implementation and has additional practices to enhance the cybersecurity capabilities of the organization.

Both Level 4 and Level 5 are considered proactive, and 5 is even considered advanced/progressive.

Alert and incident response are foundational to Levels 4 and 5, and Splunk Phantom is a SOAR (Security Orchestration, Automation, and Response) tool that helps DIB contractors focus on automating the alert process and responding as necessary.

You can think about Splunk Phantom in three parts:

  1. SOC Automation: Phantom gives teams the power to execute automated actions across their security infrastructure in seconds, rather than the hours+ it would take manually. Teams can codify workflows into Phantom’s automated playbooks using the visual editor or the integrated Python development environment.
  2. Orchestration: Phantom connects existing security tools to help them work better together, unifying the defense strategy.
  3. Incident Response: Phantom’s automated detection, investigation, and response capabilities mean that teams can reduce malware dwell time, execute response actions at machine speed, and lower their overall mean time to resolve (MTTR).

The above features of Phantom allow contractors to home in on their ability to respond to incidents.

By using Phantom’s workbooks, you’re able to put playbooks into reusable templates, as well as divide and assign tasks among members and document operations and processes. You’re also able to build custom workbooks as well as use included industry-standard workbooks. This is particularly useful for Level 5 contractors as a focus of Level 5 is the standardization of your cybersecurity operations.

TekStream and Splunk’s CMMC Solution

With TekStream and Splunk’s CMMC Solution, our approach is to introduce as much automation as possible to the security & compliance alerts & response requirements of Levels 4 and 5.

Leveraging Splunk Phantom, we’re able to introduce important automation and workbook features to standardize processes, free up time, and make the process of handling, verifying, and testing incident responses significantly more manageable.

If you’d like to talk to someone from our team, fill out the form below.

Troubleshooting Your Splunk Environment Utilizing Btool

By: Chris Winarski | Splunk Consultant

 

Btool is a utility created and provided within the Splunk Enterprise download, and when it comes to troubleshooting your .conf files, Btool is your friend. From a technical standpoint, Btool shows you the “merged” view of the .conf files as they are written to disk at the time of execution. HOWEVER, this may not show you what Splunk is actually using at that specific time, because Splunk runs off the settings held in memory, and for a change to a .conf file to be read from disk into memory you must restart that Splunk instance or force Splunk to reload its .conf files. This blog is focused primarily on a Linux environment, but if you would like more information on how to go about this in a Windows environment, feel free to inquire below! Here are some use cases for troubleshooting with Btool.

 

Btool checks disk NOT what Splunk has in Memory

Let’s say you just changed an inputs.conf file on a forwarder – Adding a sourcetype to the incoming data:

The next step would be to change directory to the $SPLUNK_HOME/bin directory

($SPLUNK_HOME = where you installed splunk, best practice is /opt/splunk)

Now once in the bin directory, you will be able to use the command:

 

./splunk btool inputs list

This will output the settings from every inputs.conf file currently saved on that machine, merged according to configuration file precedence. This merged view is what Splunk reads into memory when it restarts, which is why the running instance needs to be restarted for our “sourcetype” change above to take effect. If we don’t restart the instance, Splunk will have no idea that we edited a .conf file and will not use our added attribute.

The above command shows us that our change was saved to disk, but in order for Splunk to utilize this attribute, we still have to restart the instance.

 

./splunk restart

Once restarted, the merged settings that Btool reports are loaded into memory and describe how Splunk is behaving at that given time.

 

Saving Btool results to a file

The above example simply prints the results to the console, whereas the command below runs the same query and writes all of the returned inputs.conf settings for your Splunk instance to a file in your /tmp folder.

 

./splunk btool inputs list > /tmp/btool_inputs.txt

 

Where do these conf files come from?

When running the basic Btool command above, we get ALL the settings from all inputs.conf files for the entire instance; however, we can’t tell which inputs.conf file each setting is defined in. This can be done by adding the --debug parameter.

 

./splunk btool inputs list --debug

 

Making Btool output more legible

When printing out the long list of settings, everything seems to be smashed together. Using the ‘sed’ command, we can pretty it up a bit with some simple regex.

 

./splunk btool inputs list | sed 's/^\([^\[]\)/   \1/'

 

There are many other useful ways to utilize Btool such as incorporating scripts, etc. If you would like more information and would like to know more about how to utilize Btool in your environment, contact us today!

 

How to Leverage a Bring-Your-Own-License Model on Oracle Cloud Infrastructure and Amazon Web Services

It’s no secret that Oracle licensing can be complicated. Between the never-ending legal jargon, core calculations, and usage analysis, there is a lot to untangle. Often, it’s navigating these licenses, not the underlying technology, that can halt even the most well-intentioned Oracle Cloud migration efforts.

In this blog post, we’re going to take a closer look at how you can leverage your existing Oracle license to support your Oracle Cloud migration efforts to either Oracle Cloud Infrastructure (OCI) or Amazon Web Services (AWS).

What is a Bring Your Own License Model, Anyway?

Simply put, Bring Your Own License (BYOL) is a licensing model that lets you utilize your current on-premise Oracle license to support your Oracle migration and deployment to the Cloud – oftentimes at a significant cost savings.

BYOL on OCI

Is your organization leaning toward migrating your legacy Oracle system to OCI? If you have any existing Oracle software licenses for services like Oracle Database, Oracle Middleware, or Oracle Business Intelligence, you can leverage those existing licenses when subscribing to Oracle Platform Cloud Services (Oracle PaaS).

With BYOL, you can leverage existing software licenses for Oracle PaaS subscriptions at a lower cost. As an example, if you already have a perpetual license for Oracle Database Standard Edition, then you can leverage that license to purchase a cloud subscription to Standard Edition Database as a Service at a lower cost.

The total cost of ownership calculations can be complex with this option as you need to consider your existing cost of support, the added value you will gain from a cloud-based, self-healing, self-patching solution versus the cost of buying the solution outright without using BYOL. TekStream can help you weigh these options if you are thinking about leveraging BYOL for your cloud journey.

How Do You Use Your BYOL for Oracle PaaS?

So, how exactly do you use your existing Oracle software license to support your OCI migration needs? It’s easier than you may think:

• Select specific Oracle BYOL options in the Cost Estimator to get your BYOL pricing.

• Apply your BYOL pricing to individual cloud service instances when creating a new instance of your PaaS service. BYOL is the default licensing option during instance creation for all services that support it.

As noted, when creating a new instance of Oracle Database Cloud Service using the QuickStart wizard, the BYOL option is automatically applied.

Bring Your Own License to AWS

Oracle can be deployed on AWS using compute resources (Amazon EC2). As with a standard server in your datacenter today, when using this migration strategy you are responsible for the licenses of any software running on the instances (including Oracle Database, middleware, or any other software).

You can use your existing Oracle licenses to run on AWS. If you choose this licensing approach, it is important to consider a couple of supporting factors.

If you are licensing a product by processor or named users on this platform, you need to consider the Oracle core multipliers referenced in the terms and conditions of your license agreement.

If you are using employee or user-based metrics, you can deploy solutions on AWS with little concern about these issues.

Many Oracle Unlimited and Enterprise License Agreements do not allow usage in AWS. If you are using one of these options for your Oracle licensing, we would recommend reviewing your contracts carefully before deploying these Oracle licenses on AWS.

Is the BYOL Licensing Model Right for You?

Regardless of which Cloud platform you choose (AWS or OCI), a Cloud migration is the perfect opportunity to reexamine your Oracle license structure. Whether you opt for the BYOL licensing model or choose to utilize a new licensing structure, take this opportunity to identify ways to reduce the cost of your overarching licensing structure.

Learn about alternative licensing models by downloading our free eBook, “A Primer on Licensing Options, Issues, and Strategies for Running Oracle CPU-based Licenses on Cloud.”

Need help? TekStream can help demystify the Oracle licensing process. We provide straightforward counsel and, most importantly, identify cost-saving opportunities while still maintaining full licensing compliance.

 

If you’d like to talk to someone from our team, fill out the form below.