Real-time Reporting

Overview

The Real-Time Reporting Service enables data analysis of data collected from multiple sources over time. The service provides:

  • A means to retrieve metadata about existing data tables.
  • A means to create, update, and delete data tables.
  • A query mechanism for event data that allows users to discover and interpret meaningful patterns in the data.

The query mechanism excels at aggregating time-based data, and allows for multi-dimensional operations such as limiting, grouping, and filtering. Depending on the data that you have submitted to the system, you might use the Real-Time Reporting Service's query functionality to determine, during a specified time period:

  • The two most popular browsers used by a particular type of user
  • The most frequently-viewed product categories grouped by day for the last week
  • The total number of successful checkout events in the last month
  • The total number of site visitors for each hour during the previous day

NOTE: Data is automatically ingested into the Real-Time Reporting system via the Profile Edge service. If you create a table whose name and type are both 'clickstream' or are both 'commerce', relevant events submitted to the Profile Edge service will be ingested into this table.

NOTE: It is strongly recommended that PII (Personally Identifiable Information) be excluded from data submitted to the Real-Time Reporting Service.


API Reference

/templates

/templates/{templateId}

get

Returns a table meta data template, such as for 'clickstream' or 'commerce'. Requires a hybris.realtime_reporting_view scope.

/{tenant}

/{tenant}/lookups

get

Returns a list of all lookup name and associated entry value counts for the active tenant. Requires a hybris.realtime_reporting_view scope.

/{tenant}/lookups/{lookup}

get

Returns a description of the specified lookup for the active tenant. Requires a hybris.realtime_reporting_view scope.

put

Creates a lookup that defines sets of associated IDs. Requires a hybris.realtime_reporting_manage scope.

delete

Delete the lookup definition and all of the lookup's data. Requires a hybris.realtime_reporting_manage scope.

patch

Inserts lookup values into a lookup. Key and value field names must match those specified in lookup metadata. Requires a hybris.realtime_reporting_manage scope.

/{tenant}/tables

get

Returns a description of each table defined for the tenant. Requires a hybris.realtime_reporting_view scope.

/{tenant}/tables/restrictions

get

Returns a list of active data restrictions. Requires a hybris.realtime_reporting_manage scope.

/{tenant}/tables/restrictions/{profileId}

A data restriction associated with a profileId. Requires a hybris.realtime_reporting_manage scope.

put

Places a data restriction for a profile id. Impacts all tables for the tenant.

delete

Removes a data restriction for a profile id. Impacts all tables for the tenant.

/{tenant}/tables/{tableId}

A table represents a logical grouping of dimension columns as well as an input specification.

get

Retrieves the table definition (metadata). Requires a hybris.realtime_reporting_view scope.

put

Creates or replaces a table definition, which defines table dimension columns and parameters for dimension data. If replacement excludes a pre-existing dimension column, the associated data will be maintained but will not be accessible through subsequent queries. Requires a hybris.realtime_reporting_manage scope.

/{tenant}/tables/{tableId}/query

post

Performs the provided query, targeting the specified tenant and table, and returns a JSON representation of the query results. Requires a hybris.realtime_reporting_view scope.

/{tenant}/tables/{tableId}/events

patch

Posts a request to edit or delete data from a specified table. Such modification requests are executed periodically. Requires a hybris.realtime_reporting_manage scope.


Data Ingestion

The Real-time Reporting Service can automatically ingest event data submitted to the Edge Service. This includes the following event types:

  • Clickstream and page-load data representing user interaction with your web interface.
  • Commerce data about customer transactions.

Event data is ingested by the Real-time Reporting Service if a Real-time Reporting table associated with that data has been created.

Transformation of JSON Values to Target Column Type

In order to validate input data and guarantee ingestion, it is strongly recommended that event properties destined for numeric columns have an appropriate numeric JSON type. Properties targeting 'Long'-type columns should be expressed as an integral number in the input JSON; properties targeting 'Float'-type columns should be expressed as a JSON 'number' type.

The Real-time Reporting Service currently offers implicit conversion of String values to numeric columns, as long as the String presentation can be parsed into the target type. For instance, the String "1234.67" cannot be parsed into a Long column, because it represents a decimal value.

At this time, the service will strip any commas from Strings targeting numeric columns prior to attempting a conversion. The period symbol '.' will be treated as decimal point. This implicit conversion behavior may change in future releases and should be considered "experimental".

Messages containing String properties that cannot be successfully parsed will be rejected entirely at ingestion time.


Table Queries

The Real-time Reporting Service provides a rich query language to aggregate ingested data in a wide variety of ways.

As we explore the individual aspects of the query language, we will use some sample data to show how different elements of a query affect the data returned.

Example

The following represents a sample dataset ingested into the Real-time Reporting Service:

timestampidbrowserreferrer_domaincategory
2017-04-01T15:00:00Zevent1chromegoogle.comshoes
2017-04-01T15:01:00Zevent2firefoxfacebook.comelectronics
2017-04-01T15:03:00Zevent3chromegoogle.comelectronics
2017-04-02T11:03:00Zevent4unknowncnn.comshoes
2017-04-02T11:05:00Zevent5unknowncnn.comfood
2017-04-02T11:07:00Zevent6safaricnn.comfood
2017-04-02T11:09:00Zevent7unknowncnn.comelectronics
2017-04-02T11:11:00Zevent8nullcnn.comshoes

The following is an example query for this data that performs the following aggregation:

  • Retrieves data within an interval between April 1 (inclusive) and April 3, 2017 (exclusive)
  • Counts the number of page views per browser
  • Orders results by page views, from highest to lowest
  • Excludes results wherein the value of the "browser" column is null
  • Limits results to four records maximum
{
    "interval": "2017-04-01/2017-04-03",
    "granularity": "all",
    "dimensions": ["browser"],
    "aggregations": [
        {"type": "count", "name": "page_views"}
    ],
    "order": [
        {"column": "page_views", "direction": "descending"}
    ],
    "limit": 4,
    "filter": { "type": "notequal", "column": "browser", "value": null }
}

The above query produces the following results:

[
  {
    "page_views": 3,
    "browser": "unknown"
  },
  {
    "page_views": 2,
    "browser": "chrome"
  },
  {
    "page_views": 1,
    "browser": "firefox"
  },
  {
    "page_views": 1,
    "browser": "safari"
  }
]

Parameters

This section describes in detail the seven elements, or parameters, used in a Table Query.

Interval

REQUIRED An interval represents the time period that a query targets. The interval must be a string representing an ISO-8601 interval. Each of the following examples represents an interval of two days, starting on April 1st, 2017:

  • 2017-04-01T00:00:00Z/2017-04-03T00:00:00Z
  • 2017-04-01/2017-04-03
  • 2017-04-01/P2D

You can specify only one duration, prefix P, in the interval, either as the start or end component. The other item must be a date. The system interprets the duration relative to the date specified.

NOTE that if an interval specifies a start and end date (as in the first two examples above), then the first date is inclusive and the second date is exclusive.

Read more about ISO-8601 time intervals here.

Granularity

REQUIRED A granularity represents the timeframe into which events are grouped. For example: A granularity of hour will return data grouped by hour. The granularity can be one of the following strings:

  • all
  • hour
  • day
  • week
  • month

Note that a value of all yields no timeframe grouping.

Alternatively a granularity object can be specified. This object consists of two fields, period and timeZone. The period field must be a valid ISO-8601 period. For instance "P1M" would represent a month, or "PT2H" would represent 2 hours. The timeZone field represents which time zone to use for the grouping. A list of valid time zones is available here.

As an example the following granularity would group by quarters based on the Denver time zone:

{
    ...
    "granularity": { "period": "P3M", "timeZone": "America/Denver" }
    ...
}

<a id="Dimensions"></a>
#### Dimensions
The **dimensions** element defines returned data groups by dimension.  The **dimensions** element must be represented as a comma-delimited array of strings referencing one or more dimensions.  If no dimensions are specified in a query, then no grouping occurs and the raw events are returned.

In our example query we specify grouping by a single `browser` dimension:
```json
{
    ...
    "dimensions": ["browser"],
    ...
}

You can also group data using more than one dimension. The example below groups results in unique combinations of browser and device_type:

{
    ...
    "dimensions": ["browser", "device_type"],
    ...
}

When you aggregate data, the dimensions may not contain columns of the TIMESTAMP type.

Limit

A limit is an integer that defines the maximum number of records the query will return. The maximum number of return values allowed by the service is 10,000.

In this example, the limit is specified as 4:

{
    ...
    "limit": 4,
    ...
}

Aggregations

The aggregations element is a comma-delimited array of objects that defines one or more operations to be applied to retrieved data. Each such object contains a type, name, and optional column field.

  • name : An arbitrary label for the aggregation which can be referenced in the order element.
  • column : Identifies the data column to which aggregation is applied.
  • type : Defines the type of aggregation to be applied, and can be one of the following values:
    • count : Yields a count of value instances.
    • count_distinct : Yields an estimated count of unique instances of a value.
    • max : Yields the maximum value, according to the natural ordering of the data type.
    • mean : Yields the mean value, according to the natural ordering of the data type.
    • min : Yields the minimum value, according to the natural ordering of the data type.
    • sum : Yields the sum of data values.
    • stddev: Yields the standard deviation of data values.
    • variance: Yields the variance of data values.

In our example query we request row counts, and define the name page_views to label such counts:

{
    ...
    "aggregations": [ {"type": "count", "name": "page_views"} ]
    ...
}

You can specify multiple aggregations. The following returns row counts as page_views, and returns counts of distinct referrer_domain values as distinct_referrer_domains:

{
    ...
    "aggregations": [
        { "type": "count", "name": "page_views" },
        { "type": "count_distinct", "name": "distinct_referrer_domains", "column": "referrer_domain"},
    ]
    ...
}

A query with no "aggregations" element retrieves individual records, and thus may yield a very large result set and reduced performance.

When you aggregate data, the dimensions may not contain columns of the TIMESTAMP type.

Order

An order defines the order of returned results. The order is represented as a comma-delimited array of objects; multiple objects yield hierarchical ordering. Each such object must contain a column and a direction field.

  • column : Identifies the data column to which ordering is applied, and can reference a dimension or an aggregation name.
  • direction : Defines the sort order of the returned data, using one of the following values:
    • ascending
    • descending

In this example query, the results are specified to be returned in descending order of page_views:

{
    ...
    "order": [
        {"column": "page_views", "direction": "descending"}
    ]
    ...
}

Here the order element references a dimension (referrer_domain), and aggregation names (page_views, distinct_referrer_domains):

{
    ...
    "dimensions": [ "referrer_domain", "session_id"],
    "aggregations": [
        { "type": "count_distinct", "name": "distinct_referrer_domains", "column": "referrer_domain"}
    ],
    "order": [
        {"column": "referrer_domain", "direction": "descending"},
        {"column": "page_views", "direction": "descending"},
        {"column": "distinct_referrer_domains", "direction": "descending"}
    ]
    ...
}

Filter

A filter includes or excludes matched data from the retrieved set by applying rules represented as objects. Each rules object must contain a type, column, and value field.

  • type : Defines a condition to be met; valid values are 'equal' and 'notequal'.
  • column : Identifies the data column to which the rule applies.
  • value : Defines the data value targeted by the rule.

In our example query we specify a single filter that includes only data whose browser column value is not null:

{
    ...
    "filter": { "type": "notequal", "column": "browser", "value": null }
    ...
}

Rules, and sets of rules, optionally can be concatenated with the conjunctions 'and' and 'or' to create complex logic. The following returns data wherein: browser is not null and category is not "shoes":

{
    ...
    "filter": {
        "and": [
          { "type": "notequal", "column": "browser", "value": null },
          { "type": "notequal", "column": "category", "value": "shoes" }
        ]
    }
    ...
}

The following returns data wherein: (browser is not null) and (category is "shoes" or category is "food"):

{
    ...
     "filter": {
        "and": [
          { "type": "notequal", "column": "browser", "value": null },
          { "or": [
            { "type": "equal", "column": "category", "value": "shoes" },
            { "type": "equal", "column": "category", "value": "food" }
          ]}
        ]
    }
    ...
}


Table Metadata Retrieval

The Real-time Reporting Service enables retrieval of metadata about existing data tables. Table metadata comprises:

  • name: The required value is either 'clickstream' or 'commerce'.
  • description
  • ingestionType: The required value is either 'clickstream' or 'commerce'.
  • columns: A List of objects representing columns, each comprising:
    • name
    • description
    • type: Valid column types are 'string', 'long', 'float', and 'timestamp'. Exactly one 'timestamp' column must exist.
    • path: An XPath representing the path to the column in the input JSON data. For example, '$.x.y' indicates that this column is associated with a 'y' element in the JSON that is embedded in an 'x' element. Note that the column name can differ from the path location: A column named 'color' can be associated with an element named 'y' in the JSON.
    • collection: 'false' if the column holds a single data value, and 'true' if it holds an array of values.


Table Creation and Replacement

The Real-time Reporting Service enables creation and replacement of data tables.

Note that the value of the 'tableId' URI path of the PUT request and the table 'name' in the JSON must match. Once created, the table name cannot be altered.

The table schema defines table columns and parameters for column data. The basic schema for a 'clickstream' table is as follows:

{
  "description": "Profile clickstream events",
  "ingestionType": "clickstream",
  "name": "clickstream",
  "columns": [
    {
      "name": "action_name",
      "type": "string",
      "description": "Identifier for the type of event; e.g. 'CartUpdate', 'ProductDetailPageViewEvent'",
      "collection": false
    },
    {
      "name": "cookies_enabled",
      "type": "string",
      "description": "Indicates whether the source browser has cookies enabled",
      "path": "$.cookie",
      "collection": false
    },
    {
      "name": "browser",
      "type": "string",
      "description": "A string representing the source browser, including major and minor version; e.g. 'Chrome_21_1_5'",
      "collection": false
    },
    {
      "name": "browser_no_minor",
      "type": "string",
      "description": "A string representing the source browser, including major version only; e.g. 'Chrome_21'",
      "collection": false
    },
    {
      "name": "browser_no_version",
      "type": "string",
      "description": "A string representing the source browser, excluding version information; e.g. 'Chrome'",
      "collection": false
    },
    {
      "name": "browser_type",
      "type": "string",
      "description": "Indicates the type of the source browser; must be one of: 'browser', 'mobile_browser', 'email_client', 'application', 'unknown'",
      "collection": false
    },
    {
      "name": "city",
      "type": "string",
      "path": "$.event.location.city",
      "description": "City associated with the web request, based on IP geolocation data.",
      "collection": false
    },
    {
      "name": "country_code",
      "type": "string",
      "path": "$.event.location.country_code",
      "description": "Country code associated with the web request, based on IP geolocation data.",
      "collection": false
    },
    {
      "name": "date",
      "type": "timestamp",
      "description": "Indicates the event timestamp, in ISO-8601 format",
      "collection": false
    },
    {
      "name": "device_type",
      "type": "string",
      "description": "Indicates the type of the source device; must be one of: 'computer', 'mobile, 'tablet, 'game_console, 'digital_media_receiver, 'wearable_computer, 'unknown'",
      "collection": false
    },
    {
      "name": "funnel_level",
      "type": "long",
      "description": "Indicates the stage in the customer's journey through an internet advertising system or search system",
      "path": "$.profile_custom.funnelLevel",
      "collection": false
    },
    {
      "name": "guid",
      "description": "Unique identifier for the event. Auto-generated at ingestion time",
      "type": "string",
      "collection": false
    },
    {
      "name": "language",
      "type": "string",
      "description": "Represents the source language as an Accept-Language string; e.g. 'en', 'de'",
      "collection": false
    },
    {
      "name": "latitude",
      "type": "string",
      "path": "$.event.location.latitude",
      "description": "Latitude as derived from IP geolocation data.",
      "collection": false
    },
    {
      "name": "locale",
      "type": "string",
      "description": "Represents the source language and locale as an Accept-Language string; e.g. 'en-US', 'de-CH'",
      "collection": false
    },
    {
      "name": "longitude",
      "type": "string",
      "path": "$.event.location.longitude",
      "description": "Longitude as derived from IP geolocation data.",
      "collection": false
    },
    {
      "name": "operating_system",
      "type": "string",
      "description": "Indicates the source operating system and its version number; e.g. 'Windows_95'",
      "collection": false
    },
    {
      "name": "operating_system_no_version",
      "type": "string",
      "description": "Indicates the source operating system, excluding version number; e.g. 'Windows'",
      "collection": false
    },
    {
      "name": "product_category",
      "type": "string",
      "description": "Product category",
      "path": "$.profile_custom.productCategory",
      "collection": false
    },
    {
      "name": "product_price",
      "description": "Product price",
      "type": "float",
      "path": "$.profile_custom.productPrice",
      "collection": false
    },
    {
      "name": "product_sku",
      "description": "Product sku",
      "type": "string",
      "path": "$.profile_custom.productSku",
      "collection": false
    },
    {
      "name": "profile_id",
      "type": "string",
      "description": "Profile id associated with the end consumer",
      "collection": false
    },
    {
      "name": "profile_trace_id",
      "description": "Profile Trace Id. Auto-populated at ingestion time",
      "type": "string",
      "collection": false
    },
    {
      "name": "referrer",
      "type": "string",
      "description": "Represents the full URL of the referring site; e.g. 'http://m.facebook.com/ad/12345'",
      "path": "$.urlref",
      "collection": false
    },
    {
      "name": "referrer_domain",
      "type": "string",
      "description": "Represents the domain of the referring site; e.g. 'facebook.com'",
      "collection": false
    },
    {
      "name": "region_code",
      "type": "string",
      "path": "$.event.location.region_code",
      "description": "Region code associated with the web request, based on IP geolocation data.",
      "collection": false
    },
    {
      "name": "screen_resolution",
      "type": "string",
      "description": "Represents the raw resolution of the source display, as [width]x[height]; e.g. '1920x1200'",
      "path": "$.res",
      "collection": false
    },
    {
      "name": "search",
      "type": "string",
      "description": "Contains the literal search string applied, if any; e.g. 'mineral content, density'",
      "collection": false
    },
    {
      "name": "search_category",
      "type": "string",
      "description": "Represents the category of a search indicated by the 'search' term; e.g. 'Snow'",
      "path": "$.search_cat",
      "collection": false
    },
    {
      "name": "segments",
      "type": "string",
      "description": "Contains the segments involved in a personalization.",
      "path": "$.profile_custom.segments",
      "collection": true
    },
    {
      "name": "session_id",
      "type": "string",
      "description": "An internally-generated string identifying the current session",
      "path": "$.id",
      "collection": false
    },
    {
      "name": "url",
      "type": "string",
      "description": "Indicates the URL of the event source",
      "collection": false
    }
  ]
}

The basic schema for a 'commerce' table is as follows:

{
  "name": "commerce",
  "description": "ECP commerce events",
  "ingestionType": "commerce",
  "columns": [
    {
      "name": "channel_ref",
      "description": "References the channel for the event; e.g. 'webshop'",
      "type": "string",
      "path": "$.channelRef",
      "collection": false
    },
    {
      "name": "consumer_ref",
      "description": "References the consumer that the event was issued for; e.g. 'someone@mail.com'",
      "type": "string",
      "path": "$.consumer.ref",
      "collection": false
    },
    {
      "name": "consumer_type",
      "description": "Indicates the consumer reference type; e.g. 'email'",
      "type": "string",
      "path": "$.consumer.type",
      "collection": false
    },
    {
      "name": "currency",
      "description": "Indicates the currency of an order; e.g. 'EUR'",
      "type": "string",
      "path": "$.body.currency",
      "collection": false
    },
    {
      "name": "date",
      "description": "Indicates the event timestamp, in ISO-8601 format",
      "type": "timestamp",
      "collection": false
    },
    {
      "name": "event_type",
      "description": "Identifies the type of event: 'order', 'order shipment', 'return', 'partial return', 'cancellation', 'account registration', or 'login'",
      "type": "string",
      "path": "$.type",
      "collection": false
    },
    {
      "name": "guid",
      "description": "Unique identifier for the event. Auto-generated at ingestion time",
      "type": "string",
      "collection": false
    },
    {
      "name": "item_currency",
      "description": "Indicates the currency of a particular line item; e.g. 'EUR'",
      "type": "string",
      "path": "$.body.lineItems[*].currency",
      "collection": false
    },
    {
      "name": "item_index",
      "description": "The presence of this column indicates that the body.lineItems arrays in the event will be expanded, yielding a unique record for each array entry. The value of this column indicates the index of a given line item record. For example, an event with two line items will yield two unique line item records with 'item_index' equal to '0' and 'item_index' equal to '1', respectively",
      "type": "expansion",
      "path": "$.body.lineItems[*]",
      "collection": false
    },
    {
      "name": "item_pos",
      "description": "Indicates the position of an item in the array of line items. Synonymous with the 'item_index' column; e.g. '1' represents the second entry in an array",
      "type": "long",
      "path": "$.body.lineItems[*].pos",
      "collection": false
    },
    {
      "name": "item_price_effective",
      "description": "Contains the effective price for a particular line item; e.g. '15.00'",
      "type": "float",
      "path": "$.body.lineItems[*].price_effective",
      "collection": false
    },
    {
      "name": "item_price_list",
      "description": "Contains the original listed price for a particular line item; e.g. '17.00'",
      "type": "float",
      "path": "$.body.lineItems[*].price_list",
      "collection": false
    },
    {
      "name": "item_quantity",
      "description": "Indicates the quantity of a particular line item; e.g. '2'",
      "type": "long",
      "path": "$.body.lineItems[*].quantity",
      "collection": false
    },
    {
      "name": "item_ref",
      "description": "Defines the path to a line item reference ID; e.g. 'ABCDEF1'",
      "type": "string",
      "path": "$.body.lineItems[*].ref",
      "collection": false
    },
    {
      "name": "item_status",
      "description": "Indicates the status of a particular line item within the entire order; e.g. 'cancellation requested' or 'pre-order'",
      "type": "string",
      "path": "$.body.lineItems[*].status",
      "collection": false
    },
    {
      "name": "item_tax_amount",
      "description": "Contains the tax amount as a percentage for a particular line item; e.g. '17%'",
      "type": "float",
      "path": "$.body.lineItems[*].taxAmount",
      "collection": false
    },
    {
      "name": "item_type",
      "description": "Indicates the type of a particular line item; e.g. 'Product'",
      "type": "string",
      "path": "$.body.lineItems[*].type",
      "collection": false
    },
    {
      "name": "item_unit",
      "description": "Indicates the unit applied to a particular line item; e.g. 'kg', 'pieces'",
      "type": "string",
      "path": "$.body.lineItems[*].unit",
      "collection": false
    },
    {
      "name": "order_id",
      "description": "Represents an individual order; e.g. '123abc'",
      "type": "string",
      "path": "$.body.orderId",
      "collection": false
    },
    {
      "name": "order_value",
      "description": "Contains the total order value; e.g. '114.00'",
      "type": "float",
      "path": "$.body.orderValue",
      "collection": false
    },
    {
      "name": "payment_info_payment_type",
      "description": "Indicates the type of payment used for an order; e.g. 'Master'",
      "type": "string",
      "path": "$.body.paymentInfo.paymentType",
      "collection": false
    },
    {
      "name": "payment_info_status",
      "description": "Indicates the status of an order payment; e.g. 'requested'",
      "type": "string",
      "path": "$.body.paymentInfo.status",
      "collection": false
    },
    {
      "name": "profile_id",
      "description": "References the event's profile stored in the profile service",
      "type": "string",
      "collection": false
    },
    {
      "name": "profile_trace_id",
      "description": "Profile Trace Id. Auto-populated at ingestion time",
      "type": "string",
      "collection": false
    },
    {
      "name": "promotion_info_ref",
      "description": "References a particular promotion; e.g. 'xmas2017abc'",
      "type": "string",
      "path": "$.body.promotionInfo.ref",
      "collection": false
    },
    {
      "name": "promotion_info_type",
      "description": "Indicates the type of promotion; e.g. 'absolute reduction'",
      "type": "string",
      "path": "$.body.promotionInfo.type",
      "collection": false
    },
    {
      "name": "shipment_info_address_addition",
      "description": "Contains additional information for the shipment address; e.g. the company name",
      "type": "string",
      "path": "$.body.shipmentInfo.address.addition",
      "collection": false
    },
    {
      "name": "shipment_info_address_city",
      "description": "Contains the city of the shipment address; e.g. 'Boulder'",
      "type": "string",
      "path": "$.body.shipmentInfo.address.city",
      "collection": false
    },
    {
      "name": "shipment_info_address_country",
      "description": "Contains the country of the shipment address; e.g. 'Germany'",
      "type": "string",
      "path": "$.body.shipmentInfo.address.country",
      "collection": false
    },
    {
      "name": "shipment_info_address_first_name",
      "description": "Contains the first name of the addressee; e.g. 'Joe'",
      "type": "string",
      "path": "$.body.shipmentInfo.address.firstName",
      "collection": false
    },
    {
      "name": "shipment_info_address_last_name",
      "description": "Contains the last name of the addressee; e.g. 'Smith'",
      "type": "string",
      "path": "$.body.shipmentInfo.address.lastName",
      "collection": false
    },
    {
      "name": "shipment_info_address_number",
      "description": "Contains the address number of the shipment address; e.g. '1234'",
      "type": "string",
      "path": "$.body.shipmentInfo.address.number",
      "collection": false
    },
    {
      "name": "shipment_info_address_street",
      "description": "Contains the street name of the shipment address; e.g. 'Broadway'",
      "type": "string",
      "path": "$.body.shipmentInfo.address.street",
      "collection": false
    },
    {
      "name": "shipment_info_address_zip",
      "description": "Contains the zip code of the shipment address; e.g. '80302'",
      "type": "string",
      "path": "$.body.shipmentInfo.address.zip",
      "collection": false
    },
    {
      "name": "shipment_info_carrier",
      "description": "Indicates the order shipment carrier; e.g. 'UPS'",
      "type": "string",
      "path": "$.body.shipmentInfo.carrier",
      "collection": false
    },
    {
      "name": "shipment_info_status",
      "description": "Indicates the status of the order shipment; e.g. 'not delivered'",
      "type": "string",
      "path": "$.body.shipmentInfo.status",
      "collection": false
    },
    {
      "name": "shipment_info_tracking_ref",
      "description": "References the tracking information for the order shipment; e.g. 'abc123XYZ'",
      "type": "string",
      "path": "$.body.shipmentInfo.trackingRef",
      "collection": false
    },
    {
      "name": "status",
      "description": "Indicates the status of an order; e.g. 'new'",
      "type": "string",
      "path": "$.body.status",
      "collection": false
    }
  ]
}

If table replacement excludes a pre-existing column, the associated data will be maintained but will not be accessible through subsequent queries. Renaming columns is not supported at this time.

Table Details

Supported column types are:

  • String
  • Long
  • Float
  • Timestamp

NOTE that only 'String'-type columns can be flagged with "collection":true; all other column types must be single-valued.

The 'path' attribute defines the XPath expression that will be used to populate the column during ingestion, based on the supplied event data in JSON format. If no 'path' is defined, the path value defaults to the 'name' of the column; for example, if 'name' is 'url', the 'path' value defaults to ''$.url'.

If a column is flagged as "collection", data ingested into this column must be in the form of a JSON array of comma-delimited values, at the XPath location indicated.

Mandatory Columns

Timestamp Column Exactly one column of type Timestamp must be provided.

RTR GUID In order to support GDPR requirements for record editing, a column named guid of type string must be included in the table spec. The path may be omitted. Values for this column will be generated by Real-time Reporting at ingestion time.

Profile Trace ID In order to track events throughout profile, a column named profile_trace_id of type string must be included in the table spec. The path may be omitted. Values for this column will be populated by Real-time Reporting at ingestion time.

Profile ID If your table stores customer data, the table must have a string column named profile_id. This column will store the customer's consent reference, which enables Real-time Reporting to comply with privacy regulations.

Table Limitations

  • 'name'
    • must be provided
    • must match the table id used in the url
    • must be between 1 and 100 characters
    • may contain only the following characters: a-z, A-Z, 0-9, _, -
    • if the ingestion type is 'clickstream' or 'commerce', the table name must match the ingestion type
  • 'description'
    • must be between 1 and 1000 characters
  • 'ingestionType'
    • must be provided
    • if the ingestion type is 'clickstream' or 'commerce', the table name must match the ingestion type
  • 'columns'
    • the set of column names must be unique

Column Limitations

  • 'name'
    • must be provided
    • must be between 1 and 100 characters
    • may contain only the following characters: a-z, A-Z, 0-9, _, -
  • 'description'
    • must be between 1 and 1000 characters
  • 'type'
    • column type must be provided
  • 'path'
    • must be a valid JSON path
    • path must be between 1 and 100 characters


Table Data Editing

The Real-time Reporting service allows you to delete and edit previously-ingested table data.

The Real-time Reporting system caches and executes data modification requests periodically. Thus, there might be a significant delay between the submission of the request and the update of the target data.

Delete data

A data deletion request must provide filter that either specifies the guid of a single record in the database, or a profile_id. If a profile_id is specified, the system deletes all customer data associated with that profile_id. For example, this message deletes all data where the profile_id column has the value abc.

{
  "type": "delete",
  "filter": {
    "type": "equal",
    "column": "profile_id",
    "value": "abc"
  }
}

Edit data

A request to edit data requires a filter that specifies the guid of a single record in the database. In addition, the request provides a modifiedEvent key-value map of all record properties to reflect in the updated entry. The value map specifies the guid and the same timestamp as the record to modify. For example, this data edit message specifies that the record stored for the guid has two properties: productId with the value abc123, and productDescription with the value Tennis Shoes.

{
  "type": "edit",
  "filter": {
    "type": "equal",
    "column": "guid",
    "value": "61bc5b52-73e6-4dde-bb28-f361267b27ad"
  },
  "modifiedEvent": {
    "guid": "61bc5b52-73e6-4dde-bb28-f361267b27ad",
    "time": "2017-05-09T18:24:00.000Z",
    "productId": "abc123",
    "productDescription": "Tennis Shoes"
  }
}


Lookups

The Real-time Reporting Service supports the creation of lookup tables, or lookups. A lookup can represent a master list of coded values, such as an ISO country code to country name mapping, or it can link one or more aliases to a primary value.

For example, in data records for a single user, a first_name field can contain the value 'Robert' in one record and 'Bob' in another record, where 'Robert' is the primary value and 'Bob' is an alias for 'Robert'.

The Real-time Reporting service supports queries that match all aliases of a specified primary value. Thus, a query to retrieve records where first_name='Robert' can also target records where first_name='Bob'.
The service can also replace alias values with primary values in query results, such that the service returns records for 'Robert' and 'Bob' as first_name='Robert'.

Lookup Example

This example represents a lookup that contains two primary values, 'Robert' and 'Jennifer', with their associated aliases:

aliasValueprimaryValue
BobRobert
BobbyRobert
JenJennifer
JennyJennifer

Restrictions

For a given lookup table, each alias must be unique. For instance, the table shown above could not support mapping the alias "Bobby" to two primary values of "Robert" and "Roberta".

At this time, a tenant can have five custom lookup tables. A lookup table can hold a maximum of 10,000 entries.

The lookup profile is a reserved, auto-created, and auto-populated lookup for referencing related Profile IDs.

In addition, lookup table names may not be prefixed with "__". These lookups are reserved for system level operations.

For instructions about managing lookups and using them in queries, see the Manage Lookups section in the Tutorial.


Headers

Messages posted to the Real-time Reporting Service as a POST request have the following headers:

NameRequired?DescriptionExample
AuthorizationYesThe bearer token issued by yaas.ioBearer 020-943da2ff-0092-4228-bb4c-b44bfa0f4cb8
Content-TypeFor POST and PUT requests onlyIdentifies the data type of the submitted contentapplication/json


Data Privacy

In accordance with the European General Data Protection Regulation (GDPR), this service allows you to export and remove your tenant's customer data, as well as rectify the data of a single customer.

Identifying customer data

All tables storing customer information must provide a column named profile_id. This column contains a customer's consent reference so that you can edit or delete personal information when requested.

Export a customer's data

Customers can request a copy of their data. You can export customer data stored in Real-time Reporting in the form of JSON files, each consisting of an array of objects with column data for all records associated with a customer. The exact format depends on the table schema.

To request a customer's files, the tenant must contact the support team and specify the preferred delivery method.

Remove all of a customer's data

When a customer revokes consent, the system automatically removes all of a customer's data from Real-time Reporting. It can take up to 24 hours for the for the system to delete the data in Real-time Reporting.

You can also initiate data removal manually by using the service's PATCH /{tenant}/tables/{table}/events endpoint.

Edit a customer's data

You can edit a customer's data using the PATCH /{tenant}/tables/{table}/events endpoint. It can take up to 24 hours for the update to appear in Real-time Reporting.

Restrict a customer's data

When a customer restricts his or her data from processing in Profile, Real-time Reporting flags the data accordingly. As a result, the customer's records are excluded from all queries, as well as all modification and manual deletion attempts. However, normal data retention purge schedules still permanently remove the customer's data once they age out of the active data interval.

If a customer's data needs to be restricted for legal reasons, the tenant must contact the support team and request a customer data export for safekeeping until the legal restrictions are lifted.

This retains the data beyond the normal data retention schedule. You must ensure that the exported data is destroyed when the restriction is lifted.


Array Expansion

The Realtime Reporting Service currently does not support a truly relational data design. Many events, however, contain compound objects in the form of array elements. This example represents an order in which the customer purchases two items: a Canon camera costing 99.99, and a Nikon flash costing 5.50:

Order Event

{
  "order_id" : 123,
  "items" : [
    { "name" : "camera", "brand" : "canon", "price" : 99.99 },
    { "name" : "flash", "brand" : "nikon", "price" : 5.50 }
  ]
}

In order to store these nested event details in columnar fashion, this single items array is expanded to yield two independent order events, one for each item purchased. Realtime Reporting offers an array-expansion feature to achieve this, yielding the following data. Note that the items_index column, which tracks the order of the expanded records, is defined in table metadata rather than in the order event:

Order Data

order_iditems_indexitem_nameitem_branditem_price
1230cameracanon99.99
1231flashnikon5.50

The table's schema definition enables the array expansion, and can target one or more arrays for expansion. For each expanded array, the schema must define an index column, for instance, items_index in Order Data above. The system auto-generates index values during array expansion, starting at zero.

Enabling Array Expansion

For each array you expand, define:

  • An index column that targets the array you expand. In the example, items_index is the index column.
  • A column for each of the array entry's attributes included in the expanded data. In the example, item_name is the data column label for the array entry's name attribute, and item_brand is the data column label for the array entry's brand attribute.

The XPath expression 'items[*]' indicates "select all elements of the array found at root element 'items'". This XPath format is used to target arrays that are to be expanded. Thus, the path for a given array's index column must end in the array selector [*].

Order Schema

{
  [...],
  "columns": [
    [...],
     {
       "name": "items_index",
       "path": "$.items[*]",
       "type": "expansion"
     },
     {
       "name": "item_name",
       "path": "$.items[*].name",
       "type": "string"
     },
     {
       "name": "item_brand",
       "path": "$.items[*].brand",
       "type": "string"
     },
     [...]
  ]
}

Index Column Schema Rules

  • You must set the column type to expansion to indicate that this column serves as the index column for array expansion
  • You must set the collection attribute to 'false' (property will default to false if omitted)
  • You must define the path attribute path to the array that the system expands

As best practice, form the index column name by concatenating the name of the expanded array with the suffix '_index'. For example, items_index is the index column name for an expanded items array.

Child Arrays and Peer Arrays

Child Arrays

An event array may contain one or more child arrays. You can define a child array as expandable if its parent array is a top-level expandable array. Here, the event's expandable items array contains an expandable child array of variantOptions.

Child Array Event

{
  "order_id": 123,
  "items": [
    {
      "product": "socks",
      "variantOptions": [
        {
          "name": "size",
          "value": "large"
        },
        {
          "name": "color",
          "value": "white"
        }
      ]
    },
    {
      "product": "Core Workout",
      "variantOptions": [
        {
          "name": "format",
          "value": "MPEG"
        },
        {
          "name": "length",
          "value": "90 min"
        },
        {
          "name": "publisher",
          "value": "Fit4Life"
        }
      ]
    }
  ]
}

The schema defines expandable child arrays using the same rules and format applied to expandable parent arrays:

Child Array Order Schema

{
  [...],
  "columns": [
    [...],
    {
       "name": "variantOptions_index",
       "path": "$.items[*].variantOptions[*]",
       "type": "expansion"
    },
    {
       "name": "variant_name",
       "path": "$.items[*].variantOptions[*].name",
       "type": "string"
    },
    {
       "name": "variant_value",
       "path": "$.items[*].variantOptions[*].value",
       "type": "string"
    },
    [...]
  ]
}

The Child Array Event and Child Array Order Schema examples above result in the following expanded data store:

order_iditems_indexproductvariantOptions_indexvariant_namevariant_value
1230socks0sizelarge
1230socks1colorwhite
1231Core Workout0formatMPEG
1231Core Workout1length90 min
1231Core Workout2publisherFit4Life

Peer Arrays

An event can contain multiple arrays at the same level, known as 'peer' arrays. This example shows the peer arrays, items and payment:

Peer Array Event

{
  "order_id": 123,
  "items": [
    {
      "product": "socks",
      "sku": "abc"
    },
    {
      "product": "Core Workout",
      "sku": "efg"
    }
  ],
  "payment": [
    {
      "type": "giftCard",
      "amount": 25.00
    },
    {
      "type": "creditCard",
      "amount": 30.98
    }
  ]
}

If the schema defines multiple peer arrays as expandable, the resulting data for one expanded array excludes the data from its peer arrays. In this example, the first two columns include expanded items data and exclude payment data, and the final two columns include expanded payment data and exclude items data:

Peer Array Data

order_iditems_indexproductskupay_indexpay_typepay_amount
1230socksabc
1231Core...efg
1230giftCard25.00
1231creditCard30.98

Combining Child and Peer Arrays

You can combine child array and peer array expansions. At this time, you can expand parent and direct child arrays, and no more than 3 peer arrays at any given hierarchy level.

Querying Expanded Arrays

Consider array expansion when defining your queries, because the resulting data rows can share duplicate information. In the following example, an event contains duplicated data in the order_id and order_total fields in the expanded data store:

{
  "order_id" : 123,
  "order_total": 105.49
  "items" : [
    { "name" : "camera", "price" : 99.99 },
    { "name" : "flash", "price" : 5.50 }
  ]
}
order_iditems_indexorder_totalitem_nameitem_price
1230105.49camera99.99
1231105.49flash5.50

To count the number of unique order_id values, use the 'count_distinct' aggregation:

  [...]
  "aggregations": [
    { "type": "count_distinct", "name": "distinct_order_ids", "column": "order_id"}
  ]
  [...]

To retrieve the unique order_total value, target only rows where the index value is '0':

  [...]
  "filter": { "type": "equal", "column": "items_index", "value": 0 }
  [...]

NOTE

Using the expansion feature may drastically increase the number of database records that are created from a single ingestion event. At this time, the system will discard any ingestion event that would result in more than 1,000 individual records.


Prerequisites

  • You must have a tenant
  • Your tenant must be subscribed to the Realtime Reporting package
  • Your tenant must have a Client with the scopes hybris.realtime_reporting_view and hybris.realtime_reporting_manage


Setup

Generate a Bearer Token as follows:

  • Browse to your tenant in BUILDER
  • Browse to Development > Clients > [Client with realtime_reporting scopes]
  • Under GENERATE CREDENTIALS, click GENERATE ACCESS TOKEN
  • Select the correct Requested Region
  • Under Select Scopes, choose hybris.realtime_reporting_view
  • Click GENERATE and copy the resulting string, including the prefix "Bearer".


Execution

In a separate browser tab, browse to the SERVICE URL indicated at the top of this tutorial screen.

VIEW TABLES

For this example, use a table whose name and ingestionType are both 'clickstream'.

To determine whether such a table pre-exists, attempt to view the 'clickstream' table:

  • In the /{tenant}/tables/{tableId} row, click GET
  • In the tableId field, enter the name of the target table; for instance: 'clickstream'
  • In the tenant and hybris-tenant fields, enter the name of your tenant
  • In the Authorization field, enter the copied value of your access token, including the prefix "Bearer"

If the table exists, the query will yield a 200 response and a representation of the table similar to the following:

{
    "name": "clickstream",
    "description": "yProfile clickstream events",
    "ingestionType": "clickstream",
    "status": "active",
    "columns": [
        {
            "name": "action_name",
            "type": "string",
            "path": "$.action_name",
            "collection": false
        },
        {
            "name": "date",
            "type": "timestamp",
            "path": "$.date",
            "collection": false
        },
        [etc.]
    ]
}

The status attribute of the returned table indicates that it is 'active': available for use. Other possible status values are 'deleting' (in the process of being deleted) and 'error'. If the returned table's status is 'error', please contact support.

CREATE/REPLACE A TABLE

If no 'clickstream' table exists, you can create a new one by submitting a JSON request with the structure described in the VIEW TABLES section above. If a 'clickstream' table pre-exists, you can replace it with one whose structure matches your needs. Both goals can be met with the same process, described here.

  • In the /{tenant}/tables/{tableId} row, click PUT
  • In the tableId field, enter 'clickstream' - the required table name
  • In the tenant and hybris-tenant fields, enter the name of your tenant
  • In the Authorization field, enter the copied value of your access token, including the prefix "Bearer"
  • In the BODY section, enter the JSON representation of your table. A default representation is included in this section. NOTE that a timestamp-type column is required, and that - if you are replacing an existing table - this column cannot be renamed.

The request should yield a 204 response, indicating that your table is being created or replaced. After a short waiting period, your table will be available.

QUERY FOR DATA

  • In the /{tenant}/tables/{tableId}/query row, click POST
  • Enter 'clickstream' in the tableId field
  • In the tenant and hybris-tenant fields, enter the name of your tenant
  • In the Authorization field, enter the copied value of your access token, including the prefix "Bearer"
  • In the BODY field, enter the following, replacing interval datetime values with appropriate values:
    {
      "granularity": "all",
      "interval": "2017-04-12T00:00:00.000/2017-04-13T00:00:00.000",
      "limit": 5
    }
    
    This query should yield a 200 response containing up to five JSON response elements. To retrieve more elements, increase the value of the limit attribute in the query.

You can now execute more complex queries, such as:

{
  "granularity": "day",
  "interval": "2017-04-12T19:00:00.000/2017-04-13T20:00:00.000",
  "dimensions": [ "url" ],
  "filter": {"type": "notequal", "column": "url", "value": null},
  "aggregations": [
    { "type": "count_distinct", "name": "distinct_sessions_by_url", "column": "session_id"}
  ],
  "limit": 5,
  "order": [ {"column": "distinct_sessions_by_url", "direction": "descending"} ]
}


Manage Lookups

In a separate browser tab, browse to the SERVICE URL indicated at the top of this tutorial screen.

Create, update, view, and delete lookups

See the Lookups section for details about lookups.

Each of the lookup endpoints can be called following this protocol:

  • In the /{tenant}/lookups/{lookup} row, click PUT, PATCH, GET, or DELETE, as appropriate.
  • In the lookup field, enter the name of the target lookup; for instance: 'mylookup'
  • In the tenant and hybris-tenant fields, enter the name of your tenant
  • In the Authorization field, enter the copied value of your access token, including the prefix "Bearer"

To create a lookup, use the PUT endpoint:

  • In the BODY section, enter the JSON representation of your lookup. For example:
    {
    "values": [
      { "primaryValue": "Robert",   "aliasValue": "Bob" },
      { "primaryValue": "Robert",   "aliasValue": "Bobby" },
      { "primaryValue": "Jennifer", "aliasValue": "Jen" },
      { "primaryValue": "Jennifer", "aliasValue": "Jenny" }
    ]
    }
    
    The request should yield a 204 response, indicating that your lookup is being created. After a short waiting period, your lookup will be available.

To update a lookup, use the PATCH endpoint:

  • In the BODY section, enter the JSON representation of your lookup. Note that a 'delete' attribute indicates that the associated mapping will be deleted. For example:
    {
    "values": [
      { "primaryValue": "Robert",   "aliasValue": "Bob", "delete": true },
      { "primaryValue": "Robert",   "aliasValue": "Rob" }
    ]
    }
    
    The request should yield a 204 response, indicating that your lookup is being updated. After a short waiting period, your lookup changes will be available.

To view a lookup, use the GET endpoint. The query should yield a 200 response and a representation of the target lookup. For example:

{
  "values": [
    { "primaryValue": "Robert",   "aliasValue": "Bob" },
    { "primaryValue": "Robert",   "aliasValue": "Bobby" },
    { "primaryValue": "Jennifer", "aliasValue": "Jen" },
    { "primaryValue": "Jennifer", "aliasValue": "Jenny" }
  ]
}

To delete a lookup, use the DELETE endpoint. If the request yields a 204 response, this means the system is deleting your lookup. After a short waiting period, the system deletes your lookup.

Query using lookups

By referencing a lookup, a query that contains a filter element can target all aliases associated with a primary value. For example, the following filter element targets the primary value 'Robert' in the first_name field:

"filter": {"type": "equal", "column": "first_name", "value": "Robert"}

To target all aliases of 'Robert' as well as the primary value itself, reference the associated lookup in the filter element, as follows:

"filter": {"type": "equal", "column": "first_name", "value": "Robert", "lookup": "lookup_name"}

The results of a query can display the literal values returned, or it can replace all alias values with the primary value. For example, the following dimensions element returns the literal primary and alias values for the first_name field:

"dimensions": [ "first_name", "last_name" ]

To convert all aliases of 'Robert' to the string 'Robert' in the results, create a dimension element that indicates the lookup source, 'lookup', and the dimension element to be converted, 'name', as in this example:

"dimensions": [ {"lookup": "lookup_name", "name": "first_name"}, "last_name" ]


Glossary

TermDescription
aggregationsAn optional query element that defines one or more operations to be applied to retrieved data. Valid types are "count", "count_distinct", "sum", "min", "max", "mean", "stddev", and "variance".
columnA string, numeric, or timestamp field representing a data category. A column can represent data stored as a category in the service (e.g. "url", "price"), or data derived upon retrieval from the service (e.g. "distinct_urls", "count").
dimensionA column that represents a data category explicitly stored in the service, rather than derived upon retrieval. A dimension can be filtered or grouped on.
filterAn optional query element that includes or excludes matched data from the retrieved set by applying logical operators; for example, data equals a specified value and does not equal a specified value.
granularityA query element that determines the scale of time groupings for retrieved data: "hour", "day", "week", "month", or "all" (which yields a single grouping).
intervalA query element that indicates the ISO-8601 date-time bounds for data targeted by a query. The interval is delimited by a forward slash. If the interval comprises two dates (for example, "2017-03-02T19:00:00.000/2017-03-02T22:00:00.000"), then the first date is inclusive and the second date is exclusive.
limitAn optional query element that defines the maximum number of results to return.
orderAn optional query element that enforces ordering of data by data type; for example, dimension X values are returned in numerically-ascending order.
packageA developer-defined bundle of microservices. A partner, team, tenant, or another package can subscribe to a particular package to access the functionality of those bundled microservices.
tableA collection of dimensions associated with a particular tenant.
tenantA registered entity with a shared commercial goal that subscribes to SAP Hybris Profile services and packages to reach that goal. A tenant can also develop and contribute enrichers and context adapters to the SAP Hybris Profile suite. Within YaaS, a tenant is a project.


  • Send feedback

    If you find any information that is unclear or incorrect, please let us know so that we can improve the Dev Portal content.

  • Get Help

    Use our private help channel. Receive updates over email and contact our specialists directly.

  • hybris Experts

    If you need more information about this topic, visit hybris Experts to post your own question and interact with our community and experts.