2.8 Lake house with SAP data

Welcome to the AWS Data Lakes Workshop

User Story

As evolving cloud capabilities transform systems and IT landscapes, many companies make a solid data and analytics platform that breaks down their data silos an early priority in their business transformation.

As part of this transformation journey, customers need to collect data from sources across the organization and build reporting capabilities that provide a “single version of the truth” about business events, so that they can make informed decisions and innovate.

Analytics at scale with Amazon Redshift lake house architecture using data from SAP and Amazon Connect

Lab Agenda

  1. We will set up a data and analytics platform to consolidate SAP data along with non-SAP application data to get better insights into the frequency of customer service calls.

  2. We will use SAP Data Services, an application-level extractor, to extract the sales order data from SAP into Amazon Redshift for data warehousing. We will then analyze the frequency of customer service calls using Amazon Connect data sets stored in Amazon Simple Storage Service (Amazon S3).

  3. We will use AWS Glue to transform the service call data and load the data to Amazon Redshift.

  4. For the reporting layer, we will use Amazon QuickSight and SAP Analytics Cloud to filter the frequency of customer service calls by sales order type.
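
To make the transformation in step 3 more concrete, here is a minimal sketch of what flattening a nested service call record into a single warehouse-friendly row can look like. It is plain Python for illustration only, not the AWS Glue job used in this lab; in Glue the same idea would normally be expressed with Glue's own transforms. The `flatten` helper, the sample record, and its field values are all hypothetical, and the field names are assumed to resemble the Contact Trace Records (CTR) data model rather than copied from it.

```python
import json
from typing import Any, Dict


def flatten(record: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Turn nested dictionaries into one flat dictionary with joined key names."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent key as a prefix.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical, trimmed-down record; real CTR files carry many more fields.
sample_ctr = json.loads("""
{
  "ContactId": "example-contact-1",
  "Channel": "VOICE",
  "InitiationTimestamp": "2020-01-01T10:00:00Z",
  "Queue": {"Name": "BasicQueue", "Duration": 12},
  "Agent": {"Username": "agent1"}
}
""")

# One flat row whose keys can map directly to table columns.
row = flatten(sample_ctr)
print(sorted(row))
```

Each nested object becomes a set of prefixed columns (for example `Queue_Name`), which is the shape a relational table in Amazon Redshift expects.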

Centralized lake house architecture using Amazon Redshift

Preparation menu

➡️ For this lab, we will use SAP Operational Data Provisioning (ODP), a framework that enables data replication between SAP applications and both SAP and non-SAP data targets using a provider and subscriber model. Let us start with the SAP source and look at the steps involved.


For this tutorial, you should have the following prerequisites:

  1. An AWS account with the necessary permissions to configure Amazon Redshift, Amazon S3, AWS Glue, and Amazon QuickSight. See the setup instructions:
  2. The ability to create source data in either an SAP ECC or an SAP S/4HANA system.
  3. The necessary permissions in SAP Data Services to configure data integration and transformation.
  4. Download the sample Contact Trace Records (CTR) JSON data model from Amazon S3 for the post-processing analytics exercise.