Fortune 500 Next Generation Content Platform



Case Study: Fortune 500 Next Generation Content Management Platform

By Brian McCallion

 

 

Summary


This Fortune 500 firm required a stable, neutral, and flexible platform on which to host content and data. The platform had to support:

  1. Business-to-business partner subscriptions to data
  2. Static content combined with very large sets of structured data
  3. High-performance data processing and continuous analysis of data
  4. Single-millisecond search response time for customer searches
  5. Push notifications to customer subscribers when new and relevant data and reports surface


Business Requirements


After reviewing its options, the customer chose Amazon Web Services for its consistency across global regions, its neutral service platform, and its available web services.

1.1    Challenges

1.1.1    Data Security and Compliance
Customer security, risk management, and compliance policies mandate that all data not stored on customer premises be encrypted in transit and at rest.

1.1.2    Existing Customer Investment in Software Licenses
The customer held an existing Oracle enterprise license covering both Oracle Database Enterprise Edition and Oracle WebLogic Application Server. How would these licenses apply in Amazon Web Services, if at all?

1.1.3    Common Application Server and Operating System for Data Center and Cloud
The customer preferred (in practice, required) running Oracle WebLogic so that support and operations staff would have a familiar application server.

1.1.4    Batch Process to Load Data Sets

A 500-million-record data set had to load within an initial twelve-hour window, with the ability to keep improving on that load time and eventually make loading continuous.
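To put the load window in perspective, the short calculation below works out the sustained rate that 500 million records in twelve hours implies. The function name and the shorter windows are illustrative assumptions, not figures from the project.

```python
def required_rate(records: int, window_hours: float) -> float:
    """Return the sustained records-per-second rate needed to finish in the window."""
    return records / (window_hours * 3600)


TOTAL_RECORDS = 500_000_000  # data set size stated in the requirement

# 12 h is the stated window; 6 h and 1 h are hypothetical improvement targets.
for hours in (12, 6, 1):
    rate = required_rate(TOTAL_RECORDS, hours)
    print(f"{hours:>2} h window: {rate:,.0f} records/s")
```

The twelve-hour window works out to roughly 11,600 records per second sustained, which is why the load had to be treated as a throughput problem rather than a one-off import.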

1.1.5    High Availability and Financial Penalties in the Event of Downtime

Business subscribers to the content require high availability. The application SLA covered even brief outages, and interruptions were backed by hundreds of thousands of dollars in potential payouts, even for an outage of a single hour.

1.2    Solution Design and Success Criteria

1.2.1    High Availability in the Event of Loss of an Availability Zone (loosely equivalent to a data center)

Amazon RDS Oracle 11gR2 Enterprise with Advanced Security Option (Bring Your Own License)

Licensing Considerations

  • Enterprise License
  • Bring Your Own License option for AWS RDS Oracle
  • Oracle Advanced Security Option License

1.2.1.1   Enterprise License

The customer already held an enterprise license for the Oracle database and separately licensed the required Advanced Security Option from Oracle. Under the BYOL (Bring Your Own License) option, the customer was then able to use both the enterprise license and the Advanced Security Option license in AWS.

1.2.2    Database

1.2.3    High Availability

While each component of the solution was critical, the database posed the initial challenge. At the time, Amazon did not offer Transparent Data Encryption for RDS Oracle. Bronze Drum worked with AWS, and the necessary capability was released.

Bronze Drum then worked with AWS and the customer to test the RDS Oracle Multi-AZ capability, which synchronously replicates storage to a standby Oracle database running in another Availability Zone. If the primary database has a problem, the standby is automatically brought online and the database endpoint is updated to point to it. In testing, the application reconnected to the Oracle database seamlessly.

Bronze Drum and the customer observed that, by comparison, if the corporate data center were lost, bringing the DR databases online would ordinarily take hours and require manual intervention by database administrators. RDS Oracle Multi-AZ managed the failover automatically. Further, all encryption keys and data were up to date, even for changes committed just a second before cutover to the standby database.
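The reconnect behavior described above depends on the application retrying against the same endpoint name while the standby is promoted. A minimal sketch of such a retry loop follows. Here `connect` stands in for whatever driver call the application actually uses, and the attempt count and delays are assumptions, not the customer's settings.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(connect: Callable[[], T], attempts: int = 5,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() until it succeeds, backing off exponentially between tries.

    `connect` is a hypothetical zero-argument callable that opens a session
    against the database endpoint and raises ConnectionError on failure.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)  # waits 1 s, 2 s, 4 s, ...
    raise ValueError("attempts must be at least 1")


# Simulated failover: the first two attempts fail, the third succeeds.
_calls = {"n": 0}


def _flaky_connect() -> str:
    _calls["n"] += 1
    if _calls["n"] < 3:
        raise ConnectionError("endpoint still failing over")
    return "session"


print(connect_with_retry(_flaky_connect, sleep=lambda seconds: None))
```

Because the endpoint name stays the same after failover, the retry loop needs no knowledge of which Availability Zone is currently serving the database.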

1.2.4    Data Security and Compliance

After all PII and other confidential data were identified and mapped, that data was placed in a dedicated, encrypted area of the Oracle database, with encryption managed by Oracle’s Transparent Data Encryption (TDE) technology. As a result, the developers did not have to change any of their code or even be aware of the encryption; the “transparent” nature of the encryption service handles these concerns. Further, the Oracle Advanced Security module enabled other required capabilities, such as encrypting data “in transit” as it moves between the database and the application servers across the Virtual Private Cloud network.

To further control access to the data, the Oracle database runs in designated network segments. Security groups and access control lists strictly define which services and systems may connect to the database, and how. Even in the event of a breach, multiple layers of controls still restrict access to the database and the sensitive data. Continuous log replication and analysis provide a monitoring service that examines network activity.
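As one illustration of the network controls described above, the sketch below builds a security group ingress rule that admits only the application tier to the Oracle listener. The group IDs are placeholders, and port 1521 is Oracle's default listener port, assumed here rather than taken from the customer's configuration. The dictionary follows the `IpPermissions` shape accepted by the EC2 `authorize_security_group_ingress` call in boto3.

```python
def db_ingress_rule(app_tier_group_id: str, port: int = 1521) -> dict:
    """Build an ingress permission allowing TCP on `port` from one security group only."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # Referencing a source security group instead of a CIDR range means only
        # instances in the application tier group can reach the database.
        "UserIdGroupPairs": [{"GroupId": app_tier_group_id}],
    }


rule = db_ingress_rule("sg-APP-TIER-PLACEHOLDER")
print(rule)

# With boto3 installed and credentials configured, the rule could be applied as:
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId="sg-DB-TIER-PLACEHOLDER", IpPermissions=[rule])
```

Keying the rule to a source security group keeps the database reachable from the application servers even as their IP addresses change.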

1.2.5     Phase I Technology

  • Amazon S3
  • Amazon Redshift
  • AWS Data Pipeline
  • AWS Elastic Load Balancer (http://aws.amazon.com/elasticloadbalancing/)
  • RDS Oracle with Transparent Data Encryption
  • MicroStrategy Business Intelligence
  • SQL Server with Transparent Data Encryption

1.3    Solution Design, Documentation, and Delivery

The customer’s on-premises process included preparing a detailed solution design package document. Bronze Drum updated this document to include cloud- and Amazon-specific information, and continued to update it as the application moved from the planning stage, to the build stage, to the run stage.[i]

1.4    Application and Content High Availability

1.4.1    Oracle WebLogic Application Server

Separate “strings” of web (Apache) and application (WebLogic) instances run in two separate Availability Zones.

A health check running on the application server creates an HTML page if all application and service tests for the “string” succeed. Otherwise, the health check fails, and the Elastic Load Balancer does not direct any requests to that degraded “string.” While there are other ways to approach this, the simplicity of routing around any issue carried the day.
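The health-check pattern above can be sketched as a small script run periodically on each application server. The check functions, file location, and page contents are hypothetical; the source does not describe the actual tests. The idea is that the page exists only while every check passes, so a load balancer health check that requests it fails for a degraded string.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def all_checks_pass(checks: Iterable[Callable[[], bool]]) -> bool:
    """Return True only if every check returns True without raising."""
    for check in checks:
        try:
            if not check():
                return False
        except Exception:
            return False  # a crashing check counts as a failed check
    return True


def update_health_page(checks: Iterable[Callable[[], bool]], page: Path) -> bool:
    """Publish the health page if the string is healthy; otherwise remove it."""
    healthy = all_checks_pass(checks)
    if healthy:
        page.write_text("<html><body>OK</body></html>")
    else:
        page.unlink(missing_ok=True)  # no page means the health check fails
    return healthy


# Demo with stand-in checks; real checks would probe the application and its services.
page = Path(tempfile.mkdtemp()) / "health.html"
print(update_health_page([lambda: True, lambda: True], page), page.exists())
print(update_health_page([lambda: True, lambda: False], page), page.exists())
```

Removing the page on failure, rather than writing an error page, keeps the load balancer side simple: any response other than success takes the string out of rotation.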

1.4.2    Alfresco Enterprise Content Management Server

Two nodes run in separate availability zones configured as independent strings. The only common infrastructure is the S3 storage and the database.

Note: Because each of these services (RDS Oracle Multi-AZ and S3) has inherent failover and data durability capabilities, the architecture and operation of the solution rely on these services rather than attempting to build a “bespoke” implementation of either. In prior solutions, attempts to use NFS or to implement Oracle high availability by other means proved to be significant impediments and introduced considerable cost, complexity, and “entropy.”

1.4.3    Object Content Store

1.4.4    Search and Index Service

A Solr cluster was implemented in the SolrCloud configuration, with ZooKeeper managing the availability and integrity of the cluster. ZooKeeper may be used more widely as a “service locator” and service manager in future deployments.[ii]


[i] Based on our experience with this customer solution, Bronze Drum subsequently modernized this approach by creating automated “stacks” that substantially document and manage the application. Today, Bronze Drum’s Stack for WebLogic and Oracle AWS OpsWorks provides a layered approach to managing the different components of the solution, along with its configuration and operations.

 

To discuss the details of this solution please call to schedule a consultation, or register for our workshop on building an enterprise content management platform on Amazon Web Services.

[ii] At AWS re:Invent 2013, Airbnb presented how ZooKeeper plays a central role in managing Airbnb services.