Options to Consider for Your Oracle 12c WebCenter Upgrade

By: Brandon Prasnicki | Technical Architect

 

If you search the Oracle knowledge base for how to upgrade your existing Oracle WebCenter Content (WCC), Imaging, or Portal instance from 11g to 12c, your options are to do an in-place upgrade or to migrate the entire repository using Oracle WebCenter Content supported tools. However, if an upgrade involves new hardware (on-premises), new cloud infrastructure (Oracle Cloud Infrastructure (OCI), Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, etc.), an upgraded operating system (Microsoft Windows or Linux), or a database upgrade (Oracle Database 12c), the only supported method is to use these migration tools. To move the content from one machine to the next, the process consists of the following:

  1. Install 12c on the new environment
  2. Create the 12c schemas with RCU
  3. Create and configure the 12c domain
  4. Migrate the WCC configurations with CMU and Archiver
  5. Migrate the WCC content with Archiver

While this is a straightforward approach, the question becomes:  Is this feasible?

The answer to that question is:  It depends.

With any upgrade project, TekStream Solutions evaluates the scope of the upgrade and migration and makes recommendations on the appropriate approach.  Here is a high-level outline of starting points considered during the TekStream QuickStream process:

  1. Is the repository small? This supported methodology is a good alternative for instances that do not hold a lot of content. We have seen implementations that leverage WCC as a backend for products like Portal, where the content repository isn’t very large. For those, the supported methodology is a reasonable choice.
  2. Are there opportunities to decommission old enterprise content management systems? Sometimes an upgrade is also an opportunity to consolidate and decommission old content repositories. Examples include old shared filesystems not currently managed by any content management system (CMS), or little-used legacy CMS platforms (such as Adobe or Documentum) where, depending on the customer’s license structure, the ROI of rolling content into an Oracle WebCenter Content (WCC) instance makes sense at upgrade time. For this, TekStream uses a proprietary utility called “Content Loader” to handle WCC delta migrations and merge content from deprecated CMS applications.
  3. Is the repository large? For very large repositories, TekStream uses a cost-effective approach called the “out of place” in-place upgrade, which eliminates the need to migrate the content. The ‘supported’ Oracle approach simply is not feasible here, as repositories with millions of content items could take months or even years to migrate. Implementations that tend to have large repositories include Digital Asset Management (DAM), Records Management (RM), and even some regular content management repositories. When Oracle states that this “out of place” in-place upgrade is not a supported approach, they are referring to all the ‘gotchas’ that can occur; Oracle’s support team members are not consultants who handle such an approach. That is where TekStream Solutions comes in to guide and implement the upgrade to a successful outcome.
    1. Have we seen ‘gotchas’ in this approach? Certainly. Every version and situation has its nuances. TekStream’s QuickStream process digs deeper into identifying unique customer situations to account for during a migration. TekStream has handled these challenges and delivered successful implementations, and our experience performing these upgrades has proven vital to customer success.
    2. Could a customer do this approach by themselves? Honestly, probably not. We have been through this before and are here to guide you through the approach, help you avoid the pitfalls that can occur, and deliver a successful upgrade.

TekStream Solutions makes sure that the system is migrated, upgraded, and in a clean, working, and supported state at the completion of the project. This approach has proven to save customers a great deal of time and money. TekStream also offers extended support and is an Oracle Managed Services provider, giving customers peace of mind and freeing up internal resources for more demanding in-house projects.

Want to learn more about Oracle 12c WebCenter upgrades? Contact us today!

Press Release: TekStream Makes 2019 INC. 5000 List for Fifth Consecutive Year

For the 5th Time, Atlanta-based Technology Company Named One of the Fastest-growing Private Companies in America with Three-Year Sales Growth of 166%

ATLANTA, GA, August 14, 2019– Atlanta-based technology company, TekStream Solutions, is excited to announce that for the fifth time in a row, it has made the Inc. 5000 list of the fastest-growing private companies in America. This prestigious recognition comes again just eight years after Rob Jansen, Judd Robins, and Mark Gannon left major firms and pursued a dream of creating a strategic offering to provide enterprise technology software, services, solutions, and sourcing. Now, they’re a part of an elite group that, over the years, has included companies such as Chobani, Intuit, Microsoft, Oracle, Timberland, Vizio, and Zappos.com.

“Being included in the Inc. 5000 for the fifth straight year is something we are truly proud of as very few organizations in the history of the Inc. 5000 list since 2007 can sustain the consistent and profitable growth year over year needed to be included in this prestigious group of companies,” said Chief Executive Officer, Rob Jansen. “The accelerated growth we are seeing to help clients leverage Cloud-based technologies and Big Data solutions to solve complex business problems has been truly exciting. We are helping our clients take advantage of today’s most advanced recruiting and technology solutions to digitally transform their businesses and address the ever-changing market.”

This year’s Inc. 5000 nomination comes after TekStream has seen a three-year growth of over 166%, and 2019 is already on pace to continue this exceptional growth rate. In addition, the company has added 30% more jobs over the last 12 months.

“Customers continue to invest in ‘Cloud First’ strategies to move their on-premises environments to the cloud, but often struggle with how to get started. There is a vast market for specialized experts familiar with both legacy systems and newer cloud technology platforms. Bridging those two worlds to address rapid line of business changes and reducing technology costs are focal points of those strategies. TekStream is well-positioned to continue that thought leadership position over the next several years,” stated Judd Robins, Executive Vice President of Sales.

To qualify for the award, companies had to be privately owned, have been established in the first quarter of 2015 or earlier, have experienced two-year sales growth of more than 50 percent, and have garnered revenue between $2 million and $300 million in 2018.

“The continued recognition is evidence of our team’s response to clients’ recruiting needs across multiple industries and sectors. The growth in hiring demands commercially and federally, along with the need to deliver on changing candidate demands, has fueled the work we have put into having both outsourced and immediate-response contingent recruiting solutions,” stated Mark Gannon, Executive Vice President of Recruitment.

TekStream
We are “The Experts of Business & Digital Transformation,” but more importantly, we understand the challenges facing businesses and the myriad of technology choices and skillsets required in today’s “always on” companies and markets. We help you navigate the mix of transformative enterprise platforms, talent, and processes to create future-proof solutions in preparing for tomorrow’s opportunities…so you don’t have to. TekStream’s IT consulting solutions, combined with its specialized IT recruiting expertise, help businesses increase efficiencies, streamline costs, and remain competitive in an extremely fast-changing market. For more information about TekStream Solutions, visit www.tekstream.com or email info@tekstream.com.

Integrating Oracle Human Capital Management (HCM) and Content and Experience Cloud (CEC)

By: Greg Becker | Technical Architect

OVERVIEW

During the first phase of a recent project we built an employee file repository for a Healthcare client in the Oracle Cloud Infrastructure – Classic (OCI-C) space. A number of services were used, including Oracle Content and Experience Cloud (the repository), Oracle Process Cloud Service (for filing the documents in a logical structure), Oracle WebCenter Enterprise Capture (for scanning), and Oracle Database Cloud Service (for custom application tables).

During the second phase of the project, our client had a requirement to automatically update metadata values on content items stored in the CEC repository. They wanted to trigger a change based on events or updates to an employee record stored in Oracle Human Capital Management, for example, when an Employee Status changes from Active to Inactive.

Our solution was to use an Oracle Process Cloud Service (PCS) process to perform the metadata updates when certain values are passed into the process. The reason for updating the metadata is so that end users can perform accurate searches. The tricky part of the implementation is how to call the PCS process based on the change. To accomplish this, Informatica is used to detect a ‘change’ based on data from the tables within the HCM structure and pass that change record to a database table used by the client solution. A database function was then developed to invoke the PCS REST web service, and the final step was to build a database trigger that calls the function.

First, you need to do some initial setup to be able to use the APEX libraries, as well as create a network ACL that allows the database to connect to the PCS domain you’re using. You can find this information in various places online. You can use either SOAP or REST web services; we chose REST. If you want to call the web service over SSL (which we did), you’ll also have to create an Oracle wallet.
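The ACL setup itself is not shown in this article. As a minimal sketch using Oracle Database 12c’s DBMS_NETWORK_ACL_ADMIN package (the PCS hostname and schema name below are placeholders, not the values used in the project):

BEGIN
  DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(
    host       => 'your-pcs-instance.process.ocp.oraclecloud.com',  -- placeholder PCS host
    lower_port => 443,
    upper_port => 443,
    ace        => xs$ace_type(
                    privilege_list => xs$name_list('connect', 'resolve'),
                    principal_name => 'HCM_INTEGRATION',             -- schema that owns the function
                    principal_type => xs_acl.ptype_db));
END;
/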

CODE SNIPPETS

Function Definition:
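The original snippet is not reproduced here. A rough sketch of what such a function can look like, assuming the APEX apex_web_service.make_rest_request call; the endpoint URL, credentials, wallet path, and JSON payload below are placeholders rather than the project’s actual values:

CREATE OR REPLACE FUNCTION call_pcs_update (p_employee_id IN VARCHAR2,
                                            p_status      IN VARCHAR2)
  RETURN CLOB
IS
  l_response CLOB;
BEGIN
  -- Send a JSON payload to the PCS REST endpoint that starts the metadata-update process
  apex_web_service.g_request_headers(1).name  := 'Content-Type';
  apex_web_service.g_request_headers(1).value := 'application/json';

  l_response := apex_web_service.make_rest_request(
    p_url         => 'https://<pcs-host>/ic/api/process/v1/processes',  -- placeholder URL
    p_http_method => 'POST',
    p_username    => 'integration_user',                                -- placeholder
    p_password    => 'integration_password',                            -- placeholder
    p_body        => '{"employeeId":"' || p_employee_id ||
                     '","status":"'    || p_status      || '"}',
    p_wallet_path => 'file:/u01/app/oracle/wallet',                     -- placeholder
    p_wallet_pwd  => 'wallet_password');                                -- placeholder

  RETURN l_response;
END call_pcs_update;
/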

SOAP Envelope:

Call the Function from a Trigger:
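Again, a sketch rather than the original code, assuming a hypothetical EMPLOYEE_CHANGES staging table populated by Informatica:

CREATE OR REPLACE TRIGGER trg_employee_change
  AFTER INSERT ON employee_changes          -- hypothetical staging table written by Informatica
  FOR EACH ROW
DECLARE
  l_response CLOB;
BEGIN
  -- Invoke the PCS process whenever a change record arrives
  l_response := call_pcs_update(:NEW.employee_id, :NEW.employee_status);
END;
/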

SUMMARY

There is more than one way to fulfill this customer requirement, but these are the pieces that worked well in this case. If you have any additional integration needs between Oracle Human Capital Management and Oracle Content and Experience Cloud, please contact TekStream and we’d be happy to assist you.

Iplocation: Simple Explanation for Iplocation Search Command

By: Charles Dills | Splunk Consultant

Iplocation can be used to find some very important information. It is a very simple yet powerful search command that can help identify where traffic from a specific IP address is coming from.

To start, iplocation on its own won’t display any visualizations. What it does is add a number of additional fields that can be used in your searches and added to dashboards, panels, and tables. Below we will use a simple base search over Splunk example data:
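The original screenshots are not reproduced here; as an illustration, a base search over the Splunk tutorial web access data might look like the following (the index and sourcetype are assumptions, not necessarily the data used in the original article):

index=main sourcetype=access_combined_wcookie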

From here we will add iplocation to our search, running it against the clientip field. This adds a few fields that we can use, such as City, Country, Region, lat, and lon:
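Building on the same assumed base search, the iplocation step looks like this:

index=main sourcetype=access_combined_wcookie
| iplocation clientip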

From here we can alter our search with a table to display the information we need. For example, a company that is based in and fully operates out of the US could consider any traffic going outside the US to a foreign country as unauthorized or malicious. Using iplocation in combination with the stats values() function, we are able to list each IP address that is not located inside the US and display each by the country in which it is located:
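A sketch of that search, filtering out US traffic and grouping the client IPs by country (same assumed index and sourcetype):

index=main sourcetype=access_combined_wcookie
| iplocation clientip
| search Country!="United States"
| stats values(clientip) AS client_ips by Country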

The last thing we will do is clean up our table using rename. This provides a simple way to distinguish where traffic from a specific IP address is coming from:
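And the final, cleaned-up version with rename applied to the generated fields:

index=main sourcetype=access_combined_wcookie
| iplocation clientip
| search Country!="United States"
| stats values(clientip) AS client_ips by Country
| rename client_ips AS "Suspect IP Addresses", Country AS "Country of Origin"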

Want to learn more about iplocation? Contact us today!

Take your Traditional OCR up a notch

By: Greg Moler | Director of Imaging Solutions

While the baseline OCR landscape has not changed much, AWS aims to change that. Traditional OCR engines are quite limited in what details they can provide. Detecting the characters is only half the battle; getting meaningful data out of them is the real challenge. Traditional OCR follows the ‘what you see is what you get’ mantra, meaning that once you run your document through, the blob of seemingly unnavigable text is all you are left with. What if we could enhance this output with other meaningful data elements useful for judging extraction confidence? What if we could improve the navigation of the traditional OCR block of text?

Enter Textract from AWS, a public web service aimed at improving your traditional OCR experience in an easily scalable, integrable, and low-cost package. Textract is built on an OCR extraction engine that is optimized by AWS’ advanced machine learning. It has been taught how to extract thousands of different types of forms, so you don’t have to worry about it; the ‘template’ days are over. It also provides a number of useful advanced features that other engines simply do not offer: confidence ratings, word block identification, word and line object identification, table extraction, and key-value output. Let’s take a quick look at each of these:

  • Confidence Ratings: The ability to make intelligent choices to accept results or require human intervention based on your own thresholds. Building this into your workflow or product can greatly improve data accuracy.
  • Word Blocks: Textract identifies word blocks, allowing you to navigate through them to identify things like address blocks or known blocks of text in your documents. The ability to identify grouped wording rather than sifting through a massive blob of OCR output helps you find the information you are looking for faster.
  • Word and Line Objects: Rather than getting a block of text from a traditional OCR engine, having code-navigable objects to parse your documents will greatly improve your efficiency and accuracy. Paired with location data, you can use the returned coordinates to pinpoint where text was extracted from. This becomes useful when you know your data is found in specific areas or ranges of a given document, to further improve accuracy and filter out false positives.
  • Table Extraction: Using AWS’ AI-backed extraction technology, table extraction intelligently identifies and extracts tabular data to pipe into whatever your use case may need, allowing you to quickly calculate and navigate these table data elements.
  • Key-Value Output: AWS, again using AI-backed extraction technology, intelligently identifies key-value pairs found on the document without requiring you to write custom engines to parse the data programmatically. Optionally, send these key-value pairs to your favorite key-value engine like Splunk or Elasticsearch (Elastic Stack) for easily searchable, trigger-able, and analytical actions on your document’s data.
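To make this concrete, here is a minimal sketch of calling Textract with the AWS SDK for Python (boto3). The bucket and document names are placeholders, and error handling is omitted:

import boto3

# Analyze a document stored in S3, asking for table and form (key-value) extraction
textract = boto3.client('textract')

response = textract.analyze_document(
    Document={'S3Object': {'Bucket': 'my-bucket', 'Name': 'invoice.png'}},  # placeholders
    FeatureTypes=['TABLES', 'FORMS'])

# Each block carries a type, text, confidence rating, and bounding-box geometry
for block in response['Blocks']:
    if block['BlockType'] == 'LINE':
        print(block['Text'], round(block['Confidence'], 2))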

Contact us today to find out how Textract from AWS can help streamline your OCR based solutions to improve your data’s accuracy!

Tsidx Reduction for Storage Savings

By: Yetunde Awojoodu | Splunk Consultant

Introduction

Tsidx Reduction was introduced in Splunk Enterprise v6.4 to provide users with the option of reducing the size of index files (tsidx files) primarily to save on storage space. The tsidx reduction process transforms full size index files into minified versions which will contain only essential metadata. A few scenarios to consider tsidx reduction include:

  • Consistently running out of disk space or nearing storage limits but not ready to incur additional storage costs
  • Have older data that are not searched regularly
  • Can afford a tradeoff between storage costs and search performance

How it works

Each bucket contains a tsidx file (time series index data) and a journal.gz file (raw data). A tsidx file associates each unique keyword in your data with location references to events, which are stored in the associated rawdata file. This allows for fast full text searches. By default, an indexer retains tsidx files for all its indexed data for as long as it retains the data itself.

When buckets are tsidx reduced, they still contain a smaller version of the tsidx files. The reduction applies mainly to the lexicon of the bucket which is used to find events matching any keywords in the search. The bloom filters, tsidx headers, and metadata files are still left in place. This means that for reduced buckets, search terms will not be checked against the lexicon to see where they occur in the raw data.

Once a bucket is identified as potentially containing a search term, the entire raw data of the bucket that matches the time range of the search will need to be scanned to find the search term rather than first scanning the lexicon to find a pointer to the term in the raw data. This is where the tradeoff with search performance occurs. If a search hits a reduced bucket, the resulting effect will be slower searches. By reducing tsidx files for older data, you incur little performance hit for most searches while gaining large savings in disk usage.

The process can decrease bucket size by one-third to two-thirds, depending on the type of data; for example, a 1GB bucket would shrink to somewhere between roughly 350MB and 700MB. Data with many unique terms requires larger tsidx files. To make a rough estimate of a bucket’s reduction potential, look at the size of its merged_lexicon.lex file, which is an indicator of the number of unique terms in the bucket’s data. Buckets with larger lexicon files have tsidx files that reduce to a greater degree.

When a search hits the reduced buckets, a message appears in Splunk Web to warn users of a potential delay in search completion: “Search on most recent data has completed. Expect slower search speeds as we search the minified buckets.” Once you enable tsidx reduction, the indexer begins to look for buckets to reduce. Each indexer reduces one bucket at a time, so performance impact should be minimal.

Benefits

  • Savings in disk usage due to reduced tsidx files
  • Extension of data lifespan by permitting data to be kept longer (and searchable) in Splunk
  • Longer term storage without the need for extra architectural steps like adding S3 archival or rolling to Hadoop.

Configuration

The configuration is straightforward, and you can perform a trial by starting with one index and observing the results before taking further action on any other indexes. You will need to specify a reduction age on a per-index basis:

1. In Splunk Web:

  • Go to Settings > Indexes > Select an Index
    Set tsidx reduction policy.

2. Splunk Configuration File:

  • indexes.conf
    [<indexname>]
    enableTsidxReduction = true
    timePeriodInSecBeforeTsidxReduction = <NumberOfSeconds>

The attribute timePeriodInSecBeforeTsidxReduction is the amount of time, in seconds, that a bucket can age before it becomes eligible for tsidx reduction. The default is 604800 seconds (7 days).
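For example, to reduce buckets once they are roughly 90 days old (90 x 86,400 = 7,776,000 seconds), the indexes.conf stanza might look like this (the index name is a placeholder):

[web_logs]
enableTsidxReduction = true
timePeriodInSecBeforeTsidxReduction = 7776000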

To check whether a bucket is reduced, run the dbinspect search command:

| dbinspect index=_internal
The tsidxState field in the results specifies “full” or “mini” for each bucket.
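For a quick summary of how many buckets are full versus mini, you can pipe dbinspect into stats, for example:

| dbinspect index=_internal
| stats count by tsidxState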

To restore reduced buckets to their original state, refer to Splunk Docs.

A few notes

  • Tsidx reduction should be used on old data and not on frequently searched data. You can continue to search across the aged data, if necessary, but such searches will exhibit significantly worse performance. Rare term searches, in particular, will run slowly.
  • A few search commands do not work with reduced buckets. These include ‘tstats’ and ‘typeahead’. Warnings will be included in search.log

Reference Links

https://docs.splunk.com/Documentation/Splunk/7.2.6/Indexer/Reducetsidxdiskusage

https://conf.splunk.com/files/2016/slides/behind-the-magnifying-glass-how-search-works.pdf

https://conf.splunk.com/files/2017/slides/splunk-data-life-cycle-determining-when-and-where-to-roll-data.pdf

Want to learn more about Tsidx Reduction for Storage Savings? Contact us today!

Operating a Splunk Environment with Multiple Deployment Servers

By: Eric Howell | Splunk Consultant

Splunk environments come in all shapes and sizes, from the small single-server installation that manages all of your Splunk needs in one easily-managed box, to multi-site, highly complex environments scaled out for huge amounts of data, with all the bells and whistles needed for in-depth visibility and reporting across functionally any use case you can throw at Splunk. And, of course, everything in between.

For the multi-site or multi-homed environments that many data centers require, managing your configurations gets complicated quickly between additional firewall rules, data management stipulations, and the broad range of other issues that can crop up.

Thankfully, Splunk Enterprise allows your administrative team, or Splunk professional services, to set up a deployment server to manage the configurations (bundled into apps) for all of your universal forwarders, so long as they have been set up as deployment clients. In a complicated environment, you may find that you need two deployment servers to manage the workload, for any number of reasons: perhaps you are trying to keep uniform configuration management across multiple environments, or perhaps you are aiming to spread the communication load across multiple servers. Whatever the use case, setting up two (or more) deployment servers is not the heartache you may be worried about, and the guide below should be ample to get you on the right track.

Multiple Deployment Servers – Appropriate Setup

To set up multiple deployment servers in an environment, you will need to designate one of the deployment servers as the “Master” or “Parent” server (DS1). This is likely to be the original deployment server that houses all of the necessary apps and is likely already serving as the deployment server for your environment.

The use case below will allow you to service a multi-site environment where each environment requires the same pool of apps, but is small enough to be serviced by a single deployment server.

  1. Stand up a new box (or repurpose a decommissioned server, as is your prerogative) and install Splunk on it. This will act as your second deployment server (DS2).
  2. The key difference between these servers is that DS2 will actually be a client of DS1.
  3. Initial setup is minimal, but make sure that this server has any standard configurations the rest of your environment holds, such as an outputs.conf to send its internal logs to the indexer layer, if you are leveraging that functionality.
  4. You will create a deployment client app on DS2. You could use a copy of a similar app that resides on one of your heavy forwarders that poll DS1 for configuration management, but you will need to make two key adjustments in deploymentclient.conf (see the sketch after this list).
  5. Once this change has been made, the apps pulled down from DS1 will reside in the appropriate location on DS2 to be deployed out to any servers that poll it.
  6. Restart Splunk on DS2.
  7. Next, navigate to the Forwarder Management UI on DS1 and create a server class for your “Slave” or “Child” deployment servers (DS2 in this case).
  8. Add all apps to this new server class. Allowing Splunk to restart with these apps is fine; changes made to the originating deployment server (DS1) will let DS2 recognize that the apps it holds have been updated and are ready for deployment.
  9. Add DS2 to this server class.
  10. Depending on the settings you have configured in deploymentclient.conf on DS2 for its polling period (the phoneHomeIntervalInSecs attribute), and how many apps there are for it to pull down from DS1, wait an appropriate amount of time (longer than your polling period) and verify that the apps have all been deployed.
  11. After this, updates made to the apps on DS1 will propagate down to DS2.
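The article does not reproduce the exact deploymentclient.conf settings. As a sketch of one common way to accomplish this (the hostname is a placeholder and your environment may differ), the two adjustments are typically the target URI, so DS2 phones home to DS1, and the repository location, so apps downloaded from DS1 land in DS2’s own deployment-apps directory for redistribution:

[deployment-client]
# Place apps pulled from DS1 where DS2 serves them to its own clients
repositoryLocation = $SPLUNK_HOME/etc/deployment-apps
# Ignore the repository location advertised by DS1 and use the local one above
serverRepositoryLocationPolicy = rejectAlways

[target-broker:deploymentServer]
# DS1 is the parent deployment server (placeholder hostname)
targetUri = ds1.example.com:8089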

Alternative Use Case

If you are planning to leverage multiple deployment servers to service the same group of servers/forwarders, you will also want to copy over the serverclass.conf from DS1. If all server classes have been created through the web UI, the file should be available here:

$SPLUNK_HOME/etc/system/local/serverclass.conf

If this is your intended use case, you will also want to work with your network team to place the deployment servers behind a load balancer. If you do so, you’ll need to modify the following attribute in deploymentclient.conf, in the deployment client app that resides on your forwarders, so that clients point at the load balancer’s address rather than an individual deployment server:
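A sketch of that change; the attribute in question is presumably the standard targetUri setting, and the load-balancer hostname below is a placeholder:

[target-broker:deploymentServer]
# Point forwarders at the load balancer address instead of a single deployment server
targetUri = ds-lb.example.com:8089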

You will also need to make sure both deployment servers generate the same “checksums” so that servers polling in and reaching different deployment servers do not re-download the full list of apps with each check-in.

To do so, you will need to modify serverclass.conf on both Deployment Servers to include the following attribute:
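The attribute in question is likely crossServerChecksum (an assumption based on the standard serverclass.conf spec), set on both deployment servers:

[global]
# Generate identical app checksums on every deployment server
crossServerChecksum = true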

This attribute may not be listed by default, so you may need to include it manually. This can be included with the other attributes in your [global] stanza.

Want to learn more about operating a Splunk environment with multiple deployment servers? Contact us today!