SAP DATA INTELLIGENCE
SAP Data Intelligence provides the tools to keep control of your data and the paths it takes in a rapidly evolving, complex IT landscape, especially one that combines on-premise and cloud systems.
Is your data stored in different ways across a heterogeneous landscape, from files in various cloud storage services to prepared data in a business warehouse? Is it becoming harder to keep control of the available data and to prepare it for quality-assured reporting or other uses without losing too much flexibility?
For this scenario, SAP offers SAP Data Intelligence, a solution that primarily orchestrates data streams in distributed IT landscapes, including data management. Corresponding manipulation of the data is also possible. In addition, data categorization and identification as well as machine learning capabilities are included.
The product should not be seen as a competitor to your business warehouse, but as a tool for placing it in a larger context.
SAP Data Intelligence can be provided as a service in the SAP Cloud Platform, but can also be run on non-SAP cloud environments (AWS, Google, Microsoft) or completely on-premise.
The extremely modular foundation builds on standards established in the cloud environment (Kubernetes and Docker). It brings together on-premise landscapes (HANA, BW, Data Services, ...) and cloud landscapes in a single orchestration layer, enables end-to-end management of data flows, and acts as a bracket around your distributed data storage.
Included tools
SAP Data Intelligence includes a variety of applications through which the software can be managed and used. The most important apps are briefly presented below.
Connection Management
Within this component, all connections used by other components can be defined and managed centrally. The type of connection in turn defines which operations are possible.
The following connection types are currently supported:
SAP Sources (On Premise or Cloud)
- ABAP (S/4, ECC, R/3)
- BW (SAP BW, BW on HANA, BW/4HANA)
- DATASERVICES (SAP Data Services Remote Connection)
- HANA_DB (SAP HANA Database)
- HANA_XS (SAP HANA Application System)
- VORA (SAP VORA database)
- CPI (SAP Cloud Platform Integration)
- OPEN_CONNECTORS (SAP Cloud Platform Open Connectors)
- SAP_IQ (SAP IQ Database)
- CLOUD_DATA_INTEGRATION (Cloud Data Integration)
Third-party databases
- DB2 (IBM DB2 Database)
- MSSQL (Microsoft SQL Server Database)
- ORACLE (Oracle Database)
- MYSQL (MySQL Database)
- REDSHIFT (Redshift Database)
- AZURE_SQL_DB (Azure SQL Database)
General protocols
- ODATA (Open Data Protocol)
- HTTP (HTTP system)
- IMAP (Internet Message Access Protocol (receive e-mails))
- SMTP (Simple Mail Transfer Protocol (send e-mails))
- SFTP (SSH File Transfer Protocol)
Cloud service providers
- ADL (Azure Data Lake Store)
- AWS_SNS (Amazon Simple Notification Service)
- WASB (Microsoft Windows Azure Storage Blobs)
- GCP_BIGQUERY (Google Cloud Platform - BigQuery)
- GCP_DATAPROC (Google Cloud Platform - Dataproc)
- GCP_PUBSUB (Google Cloud Platform - Pub/Sub)
- GCS (Google Cloud Storage)
- S3 (Amazon Simple Storage Service)
- OSS (Alibaba Cloud Object Storage Service)
More
- HDFS (Hadoop Distributed File System)
- KAFKA (Kafka cluster for messages production and consumption)
- RSERVE (RServe System)
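The idea behind Connection Management, that connections are defined once in a central place and that the connection type determines which operations are possible, can be illustrated with a small Python sketch. This is not the SAP Data Intelligence API: the class names, the connection IDs and in particular the operation names in the capability table are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical capability table. The connection type names come from the list
# above; the operation names are invented for this sketch.
CAPABILITIES = {
    "HANA_DB": {"read_table", "write_table"},
    "S3": {"read_file", "write_file"},
    "SMTP": {"send_mail"},
}


@dataclass
class Connection:
    conn_id: str
    conn_type: str
    settings: dict = field(default_factory=dict)

    def supports(self, operation: str) -> bool:
        # The connection type alone decides which operations are allowed.
        return operation in CAPABILITIES.get(self.conn_type, set())


class ConnectionRegistry:
    """Central place where all connections are defined and looked up."""

    def __init__(self):
        self._connections = {}

    def register(self, connection: Connection) -> None:
        if connection.conn_type not in CAPABILITIES:
            raise ValueError(f"unknown connection type: {connection.conn_type}")
        self._connections[connection.conn_id] = connection

    def get(self, conn_id: str, operation: str) -> Connection:
        connection = self._connections[conn_id]
        if not connection.supports(operation):
            raise PermissionError(
                f"{connection.conn_type} does not support {operation}"
            )
        return connection


registry = ConnectionRegistry()
registry.register(Connection("warehouse", "HANA_DB", {"host": "hana.example"}))
registry.register(Connection("landing_zone", "S3", {"bucket": "raw-data"}))

# Other components ask the registry instead of keeping their own credentials.
source = registry.get("landing_zone", "read_file")
```

Keeping the capability check in the registry means that a consuming component cannot, for example, try to send an e-mail through a database connection.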
Modeler
This is the central component for orchestrating data streams. It provides the tools for defining so-called pipelines. A pipeline combines specific uses of different operators, and operator-specific attributes can be stored for each use within a pipeline. The operators themselves are largely deployed as standalone Docker containers. SAP provides a large and growing number of operators that can be used directly in pipeline definitions. Operators that encapsulate customer implementations add a great deal of flexibility: they allow reusable custom developments in different programming languages, e.g. Python, JavaScript, Go or R.
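The relationship between operators, their per-use attributes and a pipeline can be sketched in plain Python. The sketch deliberately does not use the Modeler's actual operator API; `Step`, `filter_rows`, `rename_column` and `run_pipeline` are hypothetical names, and the operators run in one process here rather than in separate Docker containers.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, Iterator, List

# An operator is modeled as a function that receives a stream of records
# and the attributes configured for this particular use of the operator.
OperatorFn = Callable[[Iterable[Dict[str, Any]], Dict[str, Any]], Iterator[Dict[str, Any]]]


@dataclass
class Step:
    """One use of an operator inside a pipeline, with its own attributes."""
    operator: OperatorFn
    attributes: Dict[str, Any] = field(default_factory=dict)


def filter_rows(records, attrs):
    # Keep only records whose column equals the configured value.
    column, expected = attrs["column"], attrs["equals"]
    return (r for r in records if r.get(column) == expected)


def rename_column(records, attrs):
    # Rename one column, leaving the input records untouched.
    old, new = attrs["old"], attrs["new"]
    for r in records:
        out = dict(r)
        out[new] = out.pop(old)
        yield out


def run_pipeline(steps: List[Step], source: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # Chain the operators: the output stream of one step feeds the next.
    stream: Iterable[Dict[str, Any]] = iter(source)
    for step in steps:
        stream = step.operator(stream, step.attributes)
    return list(stream)


rows = [
    {"region": "EU", "amt": 10},
    {"region": "US", "amt": 7},
    {"region": "EU", "amt": 3},
]
pipeline = [
    Step(filter_rows, {"column": "region", "equals": "EU"}),
    Step(rename_column, {"old": "amt", "new": "amount"}),
    # The same operator is reused with different attributes.
    Step(filter_rows, {"column": "amount", "equals": 3}),
]
result = run_pipeline(pipeline, rows)
```

The operator code is written once, while each `Step` carries the attributes for one concrete use. This is the property that makes custom operators reusable across pipelines.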
Metadata Explorer
In contrast to the Modeler, this functionality is intended for a wider group of users. Data sources from different connections can be enriched with metadata (column labels, data types, descriptions) and, based on this classification, released for general exploration (for example by data scientists).
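As a rough illustration of this idea, the following Python sketch models a catalog in which datasets are annotated with column metadata and become searchable only once they have been released. All names (`ColumnInfo`, `Dataset`, `Catalog`, the `published` flag, the sample paths) are assumptions made for this sketch and do not reflect the Metadata Explorer's real data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ColumnInfo:
    label: str
    data_type: str
    description: str = ""


@dataclass
class Dataset:
    connection_id: str
    path: str
    columns: Dict[str, ColumnInfo] = field(default_factory=dict)
    published: bool = False  # released for general exploration


class Catalog:
    def __init__(self):
        self._datasets: List[Dataset] = []

    def add(self, dataset: Dataset) -> None:
        self._datasets.append(dataset)

    def search(self, term: str) -> List[Dataset]:
        # Only published datasets are visible; match on labels and descriptions.
        term = term.lower()
        return [
            d for d in self._datasets
            if d.published and any(
                term in c.label.lower() or term in c.description.lower()
                for c in d.columns.values()
            )
        ]


catalog = Catalog()
sales = Dataset("landing_zone", "/sales/2020.csv")
sales.columns["amt"] = ColumnInfo("Net amount", "DECIMAL", "Revenue per order in EUR")
catalog.add(sales)

hidden = catalog.search("revenue")   # dataset not yet released
sales.published = True
visible = catalog.search("revenue")  # now found via its column description
```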
VORA Tools
This app is used to manage the embedded SAP VORA instance, a distributed database system that provides unified access to different storage systems (from files in cloud storage to an on-premise HANA database).
Complementary apps
Apps for monitoring, policy management (user administration) and system management are also available. Furthermore, there are various tools for SAP's own machine learning framework.
Conclusion
Even though this is a very new SAP product with some gaps and teething problems, we consider its extremely modular and flexible approach very helpful for meeting quality requirements when diverse data is consumed in different ways. However, the breadth of possibilities also demands close attention to the architecture and design of the corresponding applications.
Our offer
If SAP Data Intelligence has caught your interest and you would like to learn more about its possible applications, please contact us for an initial, non-binding conversation.