Let’s understand the flow first.
➡️ In this step, we create an ABAP report in SAP that calls an external operating system command (authorized through the EC2 instance role) to land the extracted SAP data files in an Amazon S3 bucket.
➡️ Note that this code is for demonstration purposes only.
*&---------------------------------------------------------------------*
*& Report Z_SAP_SALES_DATA_EXTRACTOR
*&---------------------------------------------------------------------*
*&
*&---------------------------------------------------------------------*
REPORT z_sap_sales_data_extractor.
DATA: BEGIN OF it_sales_data OCCURS 0,
        matnr  LIKE vbap-matnr,
        erdat  LIKE vbap-erdat,
        kwmeng LIKE vbap-kwmeng,
      END OF it_sales_data.

TABLES: vbap, vbak.
* In this code we run a simple query on sales data in SAP: line item data resides in table VBAP and header information in table VBAK.
* The features (SAP table fields) in this query are:
*   Material number (MATNR) and the sum of the order quantity (KWMENG) for the material from table VBAP. The output fields are extracted as item_id (MATNR) and demand (KWMENG).
*   Date on which the record was created (ERDAT). The output field is extracted as timestamp (ERDAT).
*   The line items in VBAP are joined to the header information in VBAK on the sales document number (VBELN).
*   The WHERE clause restricts the selection to materials 'MZ-FG-C950' and 'MZ-FG-C900' (bike models). This is hard-coded for this lab, but you can use your own selection parameters to filter materials instead.
*   Finally the result is grouped by p~matnr p~erdat.
SELECT p~matnr
       p~erdat
       SUM( p~kwmeng )
  FROM vbak AS k INNER JOIN vbap AS p ON k~vbeln = p~vbeln
  INTO TABLE it_sales_data
  WHERE p~matnr IN ('MZ-FG-C950','MZ-FG-C900')
  GROUP BY p~matnr p~erdat.
*LOOP AT IT_SALES_DATA.
* WRITE : IT_SALES_DATA-MATNR,
* IT_SALES_DATA-ERDAT,
* IT_SALES_DATA-KWMENG.
* NEW-LINE.
*ENDLOOP.
*call function 'GUI_DOWNLOAD'
* exporting
* filename = 'C:/Users/Administrator/Desktop/reportout.txt'
* tables
* data_tab = IT_SALES_DATA.
*item_id,demand,timestamp
DATA lt_file_table TYPE rsanm_file_table.
DATA ls_file_table TYPE rsanm_file_line.
DATA lv_lines_written TYPE i.
DATA s_demand TYPE string.
DATA s_date TYPE string.

APPEND 'item_id,demand,timestamp' TO lt_file_table.

LOOP AT it_sales_data.
  s_demand = it_sales_data-kwmeng.
  CONCATENATE it_sales_data-erdat+0(4) it_sales_data-erdat+4(2) it_sales_data-erdat+6(2)
    INTO s_date SEPARATED BY '-'.
  CONDENSE s_demand.
  CONDENSE s_date.
  CONCATENATE it_sales_data-matnr ',' s_demand ',' s_date INTO ls_file_table.
  APPEND ls_file_table TO lt_file_table.
ENDLOOP.
* Generate a random file name using the function module GENERAL_GET_RANDOM_STRING. Pass the length you want and the FM returns a random alphabetical string.
DATA filename TYPE string.
DATA filepath TYPE string.

CALL FUNCTION 'GENERAL_GET_RANDOM_STRING'
  EXPORTING
    number_chars  = 32
  IMPORTING
    random_string = filename.
* Write the extracted data to the application server file path as a CSV file.
CONCATENATE '/usr/sap/S4E/D00/work/' filename '.csv' INTO filepath.
CALL METHOD cl_rsan_ut_appserv_file_writer=>appserver_file_write
  EXPORTING
    i_filename      = filepath
    i_overwrite     = abap_true
    i_data_tab      = lt_file_table
  IMPORTING
    e_lines_written = lv_lines_written
  EXCEPTIONS
    open_failed     = 1
    write_failed    = 2
    close_failed    = 3
    OTHERS          = 4.

IF sy-subrc <> 0.
  MESSAGE ID sy-msgid TYPE sy-msgty NUMBER sy-msgno
    WITH sy-msgv1 sy-msgv2 sy-msgv3 sy-msgv4.
ELSE.
  WRITE :/ filepath, ' created'.
ENDIF.
DATA: lv_status  TYPE extcmdexex-status,
      lv_retcode TYPE extcmdexex-exitcode,
      lt_output  TYPE STANDARD TABLE OF btcxpm.

FIELD-SYMBOLS <lfs_output> LIKE LINE OF lt_output.

* Send the data to the S3 bucket
DATA addcommand TYPE btcxpgpar.
DATA s3path TYPE string.

CONCATENATE 's3://ft/sapsalesdata/' filename '.csv' INTO s3path.
CONDENSE s3path.
CONCATENATE 'cp' filepath s3path INTO addcommand SEPARATED BY ' '.
CALL FUNCTION 'SXPG_COMMAND_EXECUTE'
  EXPORTING
    commandname                   = 'ZAWSS3'
    additional_parameters         = addcommand
    operatingsystem               = 'Linux'
  IMPORTING
    status                        = lv_status
    exitcode                      = lv_retcode
  TABLES
    exec_protocol                 = lt_output
  EXCEPTIONS
    no_permission                 = 1
    command_not_found             = 2
    parameters_too_long           = 3
    security_risk                 = 4
    wrong_check_call_interface    = 5
    program_start_error           = 6
    program_termination_error     = 7
    x_error                       = 8
    parameter_expected            = 9
    too_many_parameters           = 10
    illegal_command               = 11
    wrong_asynchronous_parameters = 12
    cant_enq_tbtco_entry          = 13
    jobcount_generation_error     = 14
    OTHERS                        = 15.
IF lv_retcode = 0.
  WRITE : / 'File ', filepath , 'copied to ', s3path.
ENDIF.
*"Display the results of the command:
*LOOP AT LT_OUTPUT ASSIGNING <LFS_OUTPUT>.
* WRITE: / <LFS_OUTPUT>-MESSAGE.
*ENDLOOP.
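The external command ZAWSS3 is assumed here to wrap the AWS CLI s3 command on the application server, so the additional parameters `cp <filepath> <s3path>` become an `aws s3 cp` call that authenticates through the EC2 instance role. As a quick sanity check after the report runs, you can confirm the landed object from any environment with boto3; the bucket and prefix below mirror the s3path built in the report and should be replaced with your own values:

```python
# Sketch only: confirm the landed CSV from any environment with boto3.
# Bucket ("ft") and prefix mirror the s3path built in the report; adjust to your own bucket.
import boto3

s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket="ft", Prefix="sapsalesdata/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```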
➡️ In this exercise, you use the Amazon Forecast console to import the extracted sales data from Amazon S3, using the date of sale as the only feature. For reliable forecasting in your own use case, however, you should consider all relevant features, such as product attributes (color, shape, size, and so on), time of year, and seasonality.
| item_id | demand | timestamp |
|---|---|---|
| MZ-FG-C900 | 124 | 10/18/17 |
| MZ-FG-C900 | 124 | 10/18/17 |
| MZ-FG-C950 | 102 | 10/18/17 |
Sign in to the AWS Management Console and open the Amazon Forecast console at https://console.aws.amazon.com/forecast/
On the Amazon Forecast home page, choose Create dataset group.
On the Create dataset group page, for Dataset group details, provide the following information:
Dataset group name – Enter a name for your dataset group.
Forecasting domain – From the drop-down menu, choose Custom. For more information about how to choose a forecasting domain, see How Amazon Forecast Works and dataset domains and types.
Dataset name – Enter a name for your dataset.
Frequency of your data – Keep the default value of 1, and choose day from the drop-down menu. This setting must be consistent with the input time series data.
Data schema – Update the schema to match the columns of the time series data in data types and order. For the sales forecasting input data in this lab, the columns are, in order: the ID of the material you want to forecast (item_id), the demand on the given day (target_value), and the timestamp.
The timestamp format is yyyy-MM-dd.
Your screen should look similar to the following:
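If you prefer to script this step rather than use the console, the equivalent boto3 calls look roughly like the sketch below; the names and region are placeholders, not values mandated by this lab:

```python
# Sketch only: create a CUSTOM-domain dataset group and a daily target time series dataset.
# Names and region are placeholders.
import boto3

forecast = boto3.client("forecast", region_name="us-east-1")

dataset = forecast.create_dataset(
    DatasetName="sap_sales_dataset",
    Domain="CUSTOM",
    DatasetType="TARGET_TIME_SERIES",
    DataFrequency="D",  # must match the daily frequency of the input data
    Schema={"Attributes": [
        {"AttributeName": "item_id", "AttributeType": "string"},
        {"AttributeName": "target_value", "AttributeType": "float"},
        {"AttributeName": "timestamp", "AttributeType": "timestamp"},
    ]},
)

dataset_group = forecast.create_dataset_group(
    DatasetGroupName="sap_sales_dataset_group",
    Domain="CUSTOM",
    DatasetArns=[dataset["DatasetArn"]],
)
```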
On the Import target time series data page, for Dataset import job details, provide the following information:
Dataset import job name – Enter a name for your dataset import job.
Timestamp format – Select (yyyy-MM-dd). The format must be consistent with the input time series data.
IAM role – Keep the default Enter a custom IAM role ARN.
Alternatively, you can have Amazon Forecast create the required IAM role for you by choosing Create a new role from the drop-down menu and following the on-screen instructions.
Custom IAM role ARN – Enter the Amazon Resource Name (ARN) of the IAM role that you created in Create an IAM Role for Amazon Forecast (IAM Console).
Data location – Use the following format to enter the location of your .csv file on Amazon S3:
s3://name of your S3 bucket/folder path/filename.csv
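The same import can also be started programmatically; a minimal sketch, with the dataset ARN, S3 path, and IAM role ARN as placeholders:

```python
# Sketch only: import the landed CSV from S3 into the target time series dataset.
# The dataset ARN, S3 path, and role ARN are placeholders.
import boto3

forecast = boto3.client("forecast", region_name="us-east-1")
import_job = forecast.create_dataset_import_job(
    DatasetImportJobName="sap_sales_import",
    DatasetArn="arn:aws:forecast:us-east-1:123456789012:dataset/sap_sales_dataset",
    DataSource={"S3Config": {
        "Path": "s3://your-bucket/sapsalesdata/your-file.csv",
        "RoleArn": "arn:aws:iam::123456789012:role/ForecastS3AccessRole",
    }},
    TimestampFormat="yyyy-MM-dd",  # matches the format written by the ABAP report
)
print(import_job["DatasetImportJobArn"])
```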
To create a predictor, which is a trained model, choose an algorithm and the number (length times frequency) of predictions to make. You can choose a particular algorithm, or you can choose AutoML to have Amazon Forecast process your data and choose an algorithm to best suit your dataset group.
Amazon Forecast provides six built-in algorithms for you to choose from. These range from commonly used statistical algorithms like Autoregressive Integrated Moving Average (ARIMA), to complex neural network algorithms like CNN-QR and DeepAR+.
For information about algorithms, see Choosing an Amazon Forecast Algorithm.
Under Train a predictor, choose Start. The Train predictor page is displayed.
Your screen should look similar to the following:
Forecast horizon – Choose how far into the future to make predictions. This number, multiplied by the data entry frequency (daily) that you specified when you imported the training data, determines how far into the future to make predictions. For this exercise, set the number to 365 to provide predictions for 1 year.
Forecast frequency – Keep the default value of 1. From the drop-down menu, choose day. This setting must be consistent with the input time series data; the time interval in the SAP sales data is a day.
Algorithm selection – Keep the default value Manual. From the drop-down menu, choose the ETS algorithm. For more information about algorithms, see Choosing an Amazon Forecast Algorithm.
The remaining settings are optional, so leave the default values. Choose Train predictor.
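For reference, the equivalent predictor training call with boto3 might look like this; the predictor name and dataset group ARN are placeholders, and the algorithm ARN shown is the one for the built-in ETS algorithm:

```python
# Sketch only: train a predictor with the built-in ETS algorithm and a 365-day horizon.
# The predictor name and dataset group ARN are placeholders.
import boto3

forecast = boto3.client("forecast", region_name="us-east-1")
predictor = forecast.create_predictor(
    PredictorName="sap_sales_predictor",
    AlgorithmArn="arn:aws:forecast:::algorithm/ETS",  # manual algorithm selection
    ForecastHorizon=365,                              # 365 x daily frequency = 1 year
    PerformAutoML=False,
    InputDataConfig={"DatasetGroupArn": "arn:aws:forecast:us-east-1:123456789012:dataset-group/sap_sales_dataset_group"},
    FeaturizationConfig={"ForecastFrequency": "D"},
)
print(predictor["PredictorArn"])
```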
➡️ To make predictions (inferences), you use a predictor to create a forecast. A forecast is a group of predictions, one for every item in the target dataset. To retrieve the prediction for a single item, you query the forecast. To retrieve the complete forecast, you create an export job.
Under Forecast generation, choose Start. The Create a forecast page is displayed.
On the Create a forecast page, for Forecast details, provide the following information:
Forecast name – Enter a name for your forecast.
Predictor – From the drop-down menu, choose the predictor that you created in Step 2: Train a Predictor.
The remaining setting is optional, so leave the default value. Your screen should look similar to the following:
Copy the forecast ARN to your clipboard for the next lab. You will update the forecast ARN in the Lambda function in that lab.
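Programmatically, creating the forecast and retrieving its ARN would look roughly like this; the forecast name and predictor ARN are placeholders:

```python
# Sketch only: generate a forecast from the trained predictor and print its ARN.
# The forecast name and predictor ARN are placeholders.
import boto3

forecast = boto3.client("forecast", region_name="us-east-1")
forecast_job = forecast.create_forecast(
    ForecastName="sap_sales_forecast",
    PredictorArn="arn:aws:forecast:us-east-1:123456789012:predictor/sap_sales_predictor",
)
print(forecast_job["ForecastArn"])  # copy this ARN for the Lambda function in the next lab
```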
➡️ After the forecast has been created, you can query for a single item or export the complete forecast.
If the dashboard is not displayed, in the navigation pane, under your dataset group, choose Dashboard.
In the Dashboard, under Generate forecasts, choose Lookup forecast. The Forecast lookup page is displayed.
On the Forecast lookup page, for Forecast details, provide the following information.
Forecast – From the drop-down menu, choose the forecast that you created in Step 3: Create a Forecast.
Start date – Enter the start date for the historical demand that you want to view. Keep the default time of 00:00:00.
End date – Enter the end date for the forecast that you want to view. Keep the default time of 00:00:00.
The date range that you enter must fall within the forecast horizon that you specified in Step 2: Train a Predictor.
Choose which keys/filters – Choose Add forecast key.
Forecast key – From the drop-down menu, choose item_id.
Value – Enter a value from the item_id column of the input time series of the sales data. An item_id (for example, MZ-FG-C900) identifies a particular item included in the dataset.
Your screen should look similar to the following:
The forecast should look similar to the following:
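The same lookup can be performed with the forecastquery client, which is also what the Lambda function later in this lab uses; a minimal sketch with a placeholder forecast ARN, date range, and item_id:

```python
# Sketch only: query a single item's forecast for a date range.
# The forecast ARN, dates, and item_id are placeholders.
import boto3

forecastquery = boto3.client("forecastquery", region_name="us-east-1")
resp = forecastquery.query_forecast(
    ForecastArn="arn:aws:forecast:us-east-1:123456789012:forecast/sap_sales_forecast",
    StartDate="2020-01-01T00:00:00",
    EndDate="2020-01-31T00:00:00",
    Filters={"item_id": "MZ-FG-C900"},
)
# Predictions are grouped by forecast type, e.g. resp["Forecast"]["Predictions"]["p50"]
for point in resp["Forecast"]["Predictions"]["p50"]:
    print(point["Timestamp"], point["Value"])
```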
Forecast enables you to evaluate predictors using different forecast types, which can be a set of quantile forecasts and the mean forecast. The mean forecast provides a point estimate, whereas quantile forecasts typically provide a range of possible outcomes.
By default, Forecast computes wQL at 0.1 (P10), 0.5 (P50), and 0.9 (P90).
P10 (0.1) - For the p10 forecast, the true value is expected to be lower than the predicted value 10% of the time, and the wQuantileLoss[0.1] can be used to assess its accuracy. For a use case when there is not a lot of storage space and the cost of invested capital is high, or the price of being overstocked on an item is of concern, the p10 quantile forecast is useful to order a relatively low number of stock. The p10 forecast over-estimates the demand for an item only 10% of the time, meaning that approximately 90% of the time, that item will be sold out.
P50 (0.5) - For the p50 forecast (often also called the median) the true value is expected to be lower than the predicted value 50% of the time, and the wQuantileLoss[0.5] can be used to assess its accuracy. When being overstocked is not too concerning, and there is a moderate amount of demand for a given item, the p50 quantile forecast can be useful.
P90 (0.9) - For the p90 forecast, the true value is expected to be lower than the predicted value 90% of the time, and the wQuantileLoss[0.9] can be used to assess its accuracy. When being understocked on an item will result in a large amount of lost revenue, that is, the cost of not selling the item is extremely high or the cost of invested capital is low, the p90 forecast can be useful to order a relatively high number of stock.
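As a rough illustration of how these quantiles translate into a stocking decision (the numbers below are invented for illustration, not output from this lab):

```python
# Illustration only: hypothetical one-day quantile forecasts for a single item.
predictions = {"p10": 88.0, "p50": 104.0, "p90": 131.0}

# Tight storage or high cost of capital: order near the p10 level and accept some stock-outs.
conservative_order = predictions["p10"]

# Stock-outs are expensive (lost revenue): order near the p90 level instead.
aggressive_order = predictions["p90"]

print(conservative_order, aggressive_order)
```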
➡️ See the Amazon Forecast whitepaper for deeper insights.
To export the complete forecast:
In the navigation pane, under your dataset group, choose Forecasts.
Choose the radio button next to the forecast that you created in Step 3: Create a Forecast.
Choose Create forecast export. The Create forecast export page is displayed.
On the Create forecast export page, for Export details, provide the following information.
Export name – Enter a name for your forecast export job.
Generated forecast – From the drop-down menu, choose the forecast that you created in Step 3: Create a Forecast.
IAM role – Keep the default Enter a custom IAM role ARN.
Alternatively, you can have Amazon Forecast create the required IAM role for you by choosing Create a new role from the drop-down menu and following the on-screen instructions.
Custom IAM role ARN – Enter the Amazon Resource Name (ARN) of the IAM role that you created in Create an IAM Role for Amazon Forecast (IAM Console).
S3 forecast export location – Use the following format to enter the location of your Amazon Simple Storage Service (Amazon S3) bucket or folder in the bucket:
s3://name of your S3 bucket/folder path/
Your screen should look similar to the following:
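The export can also be triggered from code; a minimal sketch, with the export name, forecast ARN, S3 path, and role ARN as placeholders:

```python
# Sketch only: export the complete forecast to an S3 location.
# The export name, forecast ARN, S3 path, and role ARN are placeholders.
import boto3

forecast = boto3.client("forecast", region_name="us-east-1")
export_job = forecast.create_forecast_export_job(
    ForecastExportJobName="sap_sales_forecast_export",
    ForecastArn="arn:aws:forecast:us-east-1:123456789012:forecast/sap_sales_forecast",
    Destination={"S3Config": {
        "Path": "s3://your-bucket/forecast-export/",
        "RoleArn": "arn:aws:iam::123456789012:role/ForecastS3AccessRole",
    }},
)
print(export_job["ForecastExportJobArn"])
```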
➡️ The API Gateway method uses Lambda proxy integration, so the request parameters are passed as-is to the Lambda function.
To create a Lambda function with the console, open the Functions page on the Lambda console.
Choose Create function.
Under Basic information, do the following:
For Function name, enter my-function.
For Runtime, confirm that Python 3.7 is selected.
Choose Create function.
Lambda creates a Python 3.7 function and an execution role that grants the function permission to upload logs. The Lambda function assumes the execution role when you invoke your function and uses it to create credentials for the AWS SDK and to read data from event sources.
The Python code snippet below shows how the request and response are processed in the Lambda function attached to API Gateway. First, the credentials passed by the SAP system are validated against the credentials stored in AWS Secrets Manager. The function returns a JSON array of predictions in the following shape:
[
  {
    "I": "string",
    "D": "string",
    "ME": "string",
    "P10": "string",
    "P50": "string",
    "P90": "string"
  }
]
Here is the complete code snippet in Lambda for the lab. Copy and paste the code into your Lambda function.
import json
import boto3
import base64
from botocore.exceptions import ClientError

forecastclient = boto3.client('forecastquery')
forecastARNclient = boto3.client('forecast')


def lambda_handler(event, context):
    if ('Authorization' not in event['headers']):
        # User ID / password not provided
        return {"statusCode": 401, "isBase64Encoded": False}
    else:
        secret_name = "SAP_Sales_Forecast_UserPass"
        region_name = "us-east-1"

        # Create a Secrets Manager client
        session = boto3.session.Session()
        client = session.client(service_name='secretsmanager', region_name=region_name)

        # Get secret from Secrets Manager
        get_secret_value_response = client.get_secret_value(SecretId=secret_name)
        secret = json.loads(get_secret_value_response['SecretString'])

        # Get credentials from the Authorization header and decode to string
        userdet = event['headers']['Authorization'].split()[1]
        userdet = base64.b64decode(userdet).decode("utf-8")

        if (secret["SAPSALES"] != userdet.split(":")[1]):
            # Password not matched
            return {"statusCode": 401, "isBase64Encoded": False}
        else:
            # Password matched, authorization successful
            itemid = event['queryStringParameters']['item_id']
            fromdate = event['queryStringParameters']['from_date']
            todate = event['queryStringParameters']['to_date']

            response = get_forecast(itemid, fromdate, todate)

            result = []
            for num, res in enumerate(response['Forecast']['Predictions']['p90']):
                data = {}
                data['I'] = itemid
                data['D'] = res['Timestamp'][0:10]
                if 'mean' in response['Forecast']['Predictions']: data["ME"] = str(round(response['Forecast']['Predictions']['mean'][num]['Value'], 2))
                if 'p10' in response['Forecast']['Predictions']: data["P10"] = str(round(response['Forecast']['Predictions']['p10'][num]['Value'], 2))
                if 'p50' in response['Forecast']['Predictions']: data["P50"] = str(round(response['Forecast']['Predictions']['p50'][num]['Value'], 2))
                if 'p90' in response['Forecast']['Predictions']: data["P90"] = str(round(response['Forecast']['Predictions']['p90'][num]['Value'], 2))
                if 'p99' in response['Forecast']['Predictions']: data["P99"] = str(round(response['Forecast']['Predictions']['p99'][num]['Value'], 2))
                result.append(data)

            return {
                "statusCode": 200,
                "isBase64Encoded": False,
                "body": json.dumps(result),
                "headers": {'Content-Type': 'application/json'},
                "multiValueHeaders": {}
            }


def get_forecast(item_id, from_date, to_date):
    # List forecasts in the dataset group (for logging/inspection)
    response = forecastARNclient.list_forecasts(
        MaxResults=50,
        Filters=[
            {
                'Key': 'DatasetGroupArn',
                'Value': 'arn:aws:forecast:us-east-1:********:dataset/my_erp_sales_data',
                'Condition': 'IS'
            },
        ]
    )
    for item in response['Forecasts']:
        print(item['ForecastArn'])

    # Query the forecast for the requested item and date range
    response = forecastclient.query_forecast(
        ForecastArn='arn:aws:forecast:us-east-1:******:forecast/my_erp_forecast',
        StartDate=from_date + 'T00:00:00',
        EndDate=to_date + 'T00:00:00',
        Filters={
            'item_id': item_id
        }
    )
    return response
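Once the function is attached to an API Gateway method with Lambda proxy integration, any HTTP client can call it the same way the SAP system does; a sketch with a placeholder endpoint URL and credentials (the user name is arbitrary here, since the function only checks the password part against the SAPSALES secret):

```python
# Sketch only: invoke the deployed API the same way the SAP side would.
# The endpoint URL, user, and password are placeholders.
import base64
import requests

url = "https://your-api-id.execute-api.us-east-1.amazonaws.com/prod/forecast"
token = base64.b64encode(b"SAPSALES:your-password").decode()

resp = requests.get(
    url,
    headers={"Authorization": f"Basic {token}"},
    params={"item_id": "MZ-FG-C900", "from_date": "2020-01-01", "to_date": "2020-01-31"},
)
print(resp.status_code)
print(resp.json())  # list of {"I", "D", "ME", "P10", "P50", "P90"} entries
```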