Lookout
Expert Trails guide Denodo users through all the relevant materials related to a specific topic, including official documentation, KB articles, training, Professional Services offerings, and more. The main goal is to give users a single place with references to all the information they need to become a Denodo expert on any specific topic.
Solution Expert Trails are intended to provide a backcountry guide to the options available and the concepts that need to be considered when designing a comprehensive solution, with resources around each topic to prepare for the journey. Solution Expert Trails are typically more open-ended: as we head into the backcountry, every organization needs to plot its own course based on the specific goals and resources of its implementation.
Basecamp
This Solution Expert Trail should be understood in the context of the Self-Service for Data Democratization use case. In particular, it guides an organization on how to accelerate business access to data and, in this case, how to deliver the complete solution with an API-first approach.
In today’s modern data ecosystem, having data is not enough. We are past the point where the major need of an organization is to collect and store data. More and more, the demand for connectivity and sharing of data assets, internally and externally, is becoming the major focus of digital business initiatives. It all sounds simple, but once you factor in the myriad of disparate formats and technologies used to store the growing volume of business data assets, the traditional approach of physically consolidating all data in a single location seems exhausting, time-consuming, and inefficient. The article Modern Data Ecosystems that Drive Business Value gives an in-depth description of what a modern data ecosystem looks like and the emerging solutions being created to keep up with its demands.
One of the solutions that has emerged in recent years to answer this need is the data service: an abstraction layer that provides data to multiple consumers through application programming interfaces, or APIs. This enables applications to “talk” to each other and share data without needing to store a copy of the data. By exposing data beyond its initial scope and borders, applications participate in an interconnected web of applications and contribute to a teeming data ecosystem.
It is also worth noting that a “data service” must not be confused with “Data-as-a-Service,” which provides curated data delivered specifically to business users to aid in decision-making. Data services deliver a variety of transactional and historical data to a variety of users, including developers, for a variety of purposes.
Adopting an API-based approach to providing data services at the enterprise level requires a considerable amount of planning to design a comprehensive solution suited to the organization’s needs. In this Solution Expert Trail, we will take you through a guided path for mapping out your enterprise data services solution, from the design and implementation stages through data delivery, governance, and the iterative improvement process.
The Hike
Stage 1: Design
In designing an enterprise data services solution, one of the first things organizations must consider is whether they need to build a data services layer at all, weighing the role it plays in streamlining data access across the organization. Although it is difficult to answer in a general way whether a data services layer is the right fit for an organization, here are some points to consider in answering this question:
- How many types of end users need to access the data? Consider the users who access the data through third-party analytical software and tools. More user types mean more administrative effort required to manage security and permissions at the data source level.
- How many applications need to connect and consume the data? Consider any existing or future internal applications implemented in the organization, taking into account all the application developers that may need access during their development lifecycle.
- What are the various data access patterns the users and applications employ to satisfy their business needs? The level of heterogeneity of the access patterns determines the number of delivery methods that need to be implemented for each data asset.
- How homogeneous are the data sources? The less homogeneous the sources, the more value a data services layer adds.
- How many data sources does your data services layer need to access? A large number of source connections to configure and maintain points to efficiency gains from centralizing connection management in a data services layer.
Characteristics of an enterprise data services solution
There are key features and characteristics that must be considered when designing an enterprise data services solution so that it is robust and flexible and effectively serves the demands of a modern data ecosystem. The webinar Data Services and the Modern Data Ecosystem explains the key characteristics of a data services solution in detail, highlighting why they are important.
API and Data Virtualization
In most API-based data services implementations, API platforms act as gateways, providing a single point of entry for users to access the data services. API platforms enable organizations to manage large numbers of APIs, monitor their usage, and establish security across them. An API-based approach also enables organizations to adopt a microservice architectural style by ensuring that the data services are independently deployable and loosely coupled from one another.
Data virtualization is a technology that can greatly enhance the capabilities of an API platform, augmenting the benefits of an API-based architecture. The solution brief Data Virtualization and the API Ecosystem discusses the various ways in which data virtualization adds value to any API-based data services strategy, along with use cases illustrating how data virtualization can support an organization’s existing API ecosystem.
Top-down design
Building a data services solution means exposing data to a diverse set of applications and users with varying data needs, formats, and levels of granularity. Employing a top-down approach to designing and modeling the data services or APIs to be consumed by other applications and users can help speed up the overall implementation of a data services solution by decoupling the requirements-gathering work from the API implementation.
The webinar Data Services and Data Mesh projects made easy using Top-Down Modeling discusses the benefits of top-down design and provides a guide on how to build data services using a top-down approach.
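As a rough illustration of the decoupling that a top-down approach enables, the sketch below (generic Python with hypothetical class and field names; it is not a Denodo artifact) defines the consumer-facing contract first and fills in an implementation later:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List

# Step 1 (top-down): agree on the business-facing model and the service
# contract with consumers, before any source system is connected.
@dataclass
class Customer:
    customer_id: str
    name: str
    segment: str

class CustomerService(ABC):
    """Contract that consumers can code against while the backend is built."""
    @abstractmethod
    def list_customers(self, segment: str) -> List[Customer]: ...

# Step 2 (later): implement the contract against the real sources.
# Here a stub returns canned data so consumer development can start early.
class StubCustomerService(CustomerService):
    def list_customers(self, segment: str) -> List[Customer]:
        sample = [Customer("C001", "Acme Corp", "enterprise"),
                  Customer("C002", "Globex", "mid-market")]
        return [c for c in sample if c.segment == segment]

if __name__ == "__main__":
    svc: CustomerService = StubCustomerService()
    print(svc.list_customers("enterprise"))
```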
Stage 2: Integration
To serve the diverse needs of various applications and users, a data services solution should enable the cleansing, integration, and preparation of data sets, ensuring that data is exposed in a suitable and standardized format to other applications and users. The integration of raw source data into usable data services or APIs typically entails multiple facets of Data Preparation, including modeling the data to ensure its discoverability and correct usability.
Data Cleansing
Raw source data is not always clean. There are times when it includes duplicate records, inconsistencies in formatting or context, unexpected values, and so on. In these scenarios, data cleansing must be addressed when preparing the data services layer so that consuming applications and users have a clean and consistent set of data to work with.
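The sketch below illustrates, in generic Python with hypothetical field names, the kind of deduplication and normalization that typically happens at this stage; in a real implementation this logic would live in the data services layer itself rather than in consumer code:

```python
# Deduplicate records and normalize inconsistent formatting before exposing
# the data through the services layer (hypothetical 'customers' records).
raw_customers = [
    {"id": "001", "email": "ANA@EXAMPLE.COM ", "country": "usa"},
    {"id": "001", "email": "ana@example.com", "country": "USA"},   # duplicate
    {"id": "002", "email": "bo@example.com", "country": None},     # missing value
]

def clean(record):
    return {
        "id": record["id"].strip(),
        "email": (record["email"] or "").strip().lower(),
        "country": (record["country"] or "UNKNOWN").upper(),
    }

# Keep the first occurrence of each id after normalization.
seen, cleaned = set(), []
for rec in map(clean, raw_customers):
    if rec["id"] not in seen:
        seen.add(rec["id"])
        cleaned.append(rec)

print(cleaned)
```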
Data Integration
Any data blending, filtering, or denormalization that is required by consuming applications or users must also be performed when preparing data for the data services layer. With this in place, units of data are delivered already combined and filtered, without consuming applications or users having to do this themselves, avoiding errors that could result in incorrect or inconsistent data.
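A minimal sketch of this idea, again in generic Python with hypothetical sources, showing orders being joined with customer data and filtered once, centrally, before delivery:

```python
# Join (denormalize) two hypothetical sources so consumers receive a single,
# pre-combined record instead of having to blend the data themselves.
orders = [
    {"order_id": 1, "customer_id": "C001", "amount": 250.0},
    {"order_id": 2, "customer_id": "C002", "amount": 75.5},
]
customers = {
    "C001": {"name": "Acme Corp", "region": "EMEA"},
    "C002": {"name": "Globex", "region": "APAC"},
}

def denormalized_orders(min_amount=0.0):
    for order in orders:
        if order["amount"] < min_amount:
            continue  # filtering applied once, in the services layer
        cust = customers.get(order["customer_id"], {})
        yield {**order, "customer_name": cust.get("name"),
               "region": cust.get("region")}

print(list(denormalized_orders(min_amount=100)))
```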
Relationship Modeling
As part of Data Augmentation, it is critical that the data provided through the data services layer is built with clearly defined relationships between data objects. This gives consuming applications and users the ability to discover and traverse the underlying data model and helps them make sense of the data, so they can use the data services effectively without needing deep knowledge of the model.
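One common way to surface these relationships to API consumers is to embed navigable links in each response, as in the hypothetical HATEOAS-style sketch below (resource names and URLs are illustrative):

```python
# Attach link relations to each record so consumers can discover and traverse
# related resources without prior knowledge of the data model.
def with_links(order):
    base = "https://data.example.com/api"   # hypothetical base URL
    return {
        **order,
        "_links": {
            "self":     f"{base}/orders/{order['order_id']}",
            "customer": f"{base}/customers/{order['customer_id']}",
            "items":    f"{base}/orders/{order['order_id']}/items",
        },
    }

print(with_links({"order_id": 1, "customer_id": "C001", "amount": 250.0}))
```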
Stage 3: Data delivery
An enterprise data services solution has to serve multiple users and applications. The data delivery method and format must be on par with today’s standards, and must also be easily consumed to ensure interoperability with other applications. Additionally, it must support the organization’s growing data API needs by providing easy and quick development, deployment, and documentation. The following sections detail the essential points to consider when delivering data through a data services layer.
Delivery methods
REST APIs have become the de facto standard delivery method for data services because they are lightweight, flexible, and easily deployed over HTTP. Any data services solution must support this method, along with standardized protocols such as OData and GraphQL services. In addition to REST, it is good to be able to deliver data via the SOAP protocol to cater to and support legacy platforms.
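To make the REST delivery idea concrete, the following toy endpoint is built with the Python standard library only (the path and data are hypothetical; in practice the data services platform would publish and host these endpoints rather than hand-written handlers):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

PRODUCTS = [{"id": 1, "name": "Widget"}, {"id": 2, "name": "Gadget"}]

class ProductHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A single hypothetical resource: GET /products returns the collection.
        if self.path == "/products":
            body = json.dumps(PRODUCTS).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

if __name__ == "__main__":
    # Serve on http://localhost:8000/products until interrupted.
    HTTPServer(("localhost", 8000), ProductHandler).serve_forever()
```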
Data representations
For years, JSON and XML have been two of the most widely used representations for electronic data interchange between applications. Any data services solution must be able to support these formats for easy interoperability with other applications. The data services layer can define a primary data delivery format, but with the ability to switch formats to account for changing business needs.
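The same logical record can be rendered in either representation; the sketch below uses only the Python standard library, with illustrative field names, and picks the format from a simplified Accept header:

```python
import json
import xml.etree.ElementTree as ET

record = {"id": "C001", "name": "Acme Corp", "region": "EMEA"}

def to_json(rec):
    return json.dumps(rec)

def to_xml(rec, root_tag="customer"):
    root = ET.Element(root_tag)
    for key, value in rec.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

# A data service could pick the representation from the client's Accept header.
def render(rec, accept="application/json"):
    return to_xml(rec) if "xml" in accept else to_json(rec)

print(render(record, accept="application/json"))
print(render(record, accept="application/xml"))
```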
Low code development and deployment
A data services solution must also support a rapid development and deployment approach to meet the fast-growing demands of data consumers. If a consumer has to wait for weeks for a field or data set to be delivered, it negatively impacts the effectiveness and adoption of the data services solution.
Documentation (OpenAPI / Swagger)
In an enterprise-level data services solution, a large variety of API endpoints can be made available to the consuming applications and users. To allow both applications and users to discover and understand the capabilities of these API endpoints, a standard, language-independent interface description must be provided and published along with the APIs.
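As an illustration of what such an interface description contains, here is a minimal OpenAPI 3.0 document, expressed as a Python dictionary, for the hypothetical /products endpoint shown earlier; in practice the platform publishing the API would typically generate this document:

```python
import json

# A minimal OpenAPI 3.0 description of the hypothetical /products endpoint.
openapi_doc = {
    "openapi": "3.0.0",
    "info": {"title": "Product Data Service", "version": "1.0.0"},
    "paths": {
        "/products": {
            "get": {
                "summary": "List products",
                "responses": {
                    "200": {
                        "description": "A JSON array of products",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "array",
                                    "items": {"$ref": "#/components/schemas/Product"},
                                }
                            }
                        },
                    }
                },
            }
        }
    },
    "components": {
        "schemas": {
            "Product": {
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                },
            }
        }
    },
}

print(json.dumps(openapi_doc, indent=2))
```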
Stage 4: Governance
Any enterprise data solution that manages an increasing volume of data and must comply with various regulations such as GDPR needs a data governance model to secure and govern the organization’s data. This can be challenging to implement, especially in today’s demanding and complex data landscape, so it is important for organizations to streamline and simplify their approach to data governance. The key components to consider when implementing data governance in a data services solution are discussed in the following sections.
Data Security
Organizations must always maintain complete control and understanding of who has access to the data they publish through the data services layer, considering not just the consumers inside the organization but also external consumers that need access to the data. Maintaining this control at the data source level is very challenging, if not impossible, because different data repositories and technologies take diverse approaches to access control. A simpler approach is to maintain the access control and security policies in a logical layer that leverages data virtualization technology, which helps centralize and simplify the data access control process.
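The toy sketch below illustrates what centralizing policies in a single logical layer buys you: roles, column masks, and row filters (all hypothetical here) are defined once and applied uniformly, instead of being re-implemented in every source system:

```python
# One central policy table instead of per-source ACLs: each role maps to the
# columns it may see and an optional row-level filter (all names hypothetical).
POLICIES = {
    "analyst": {"columns": {"id", "region", "amount"}, "row_filter": None},
    "partner": {"columns": {"id", "amount"},
                "row_filter": lambda r: r["region"] == "EMEA"},
}

def apply_policy(role, rows):
    policy = POLICIES.get(role)
    if policy is None:
        raise PermissionError(f"role '{role}' has no access")
    allowed = policy["columns"]
    keep = policy["row_filter"] or (lambda r: True)
    # Column masking and row filtering applied centrally, once.
    return [{k: v for k, v in row.items() if k in allowed}
            for row in rows if keep(row)]

data = [{"id": 1, "region": "EMEA", "amount": 10.0, "ssn": "xxx"},
        {"id": 2, "region": "APAC", "amount": 20.0, "ssn": "yyy"}]
print(apply_policy("partner", data))
```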
Data Discovery and Catalog
Enabling the easy discovery of data assets is essential to drive the quick adoption and effectiveness of the organization’s data services solution. This includes a collaboration-ready platform where users and applications can browse data assets enriched with metadata that describes their intended purpose, enterprise data dictionaries, as well as categorization, tags, and additional properties. A workflow can also be incorporated to streamline the data access request process.
In addition, OpenAPI assets can be published along with the API to be used as a reference for various aspects of the API, including its endpoints, version, format, schema, etc. This can serve as an alternative avenue for users and developers to discover the data assets when providing access to the organization’s traditional data discovery tools does not make sense.
Monitoring
It is a best practice, if not mandatory, to have a monitoring platform to capture and analyze the usage of the data services solution. It can be used for audit purposes to track who is accessing what data, as well as to diagnose and monitor the performance of all the deployed data services, including information about the underlying queries, data sources, and servers. The information captured by the monitoring platform can help decision makers to implement necessary improvements related to the solution’s infrastructure, security, data quality, etc.
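As a minimal illustration of the kind of usage record a monitoring layer might capture per request, the sketch below logs who called which endpoint and how long it took (field names are illustrative; a dedicated monitoring platform captures far richer execution information):

```python
import time
import logging
from functools import wraps

logging.basicConfig(level=logging.INFO, format="%(message)s")

def audited(endpoint):
    """Record who called which endpoint and how long the call took."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            start = time.perf_counter()
            try:
                return func(user, *args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                logging.info("user=%s endpoint=%s elapsed_ms=%.1f",
                             user, endpoint, elapsed_ms)
        return wrapper
    return decorator

@audited("/products")
def get_products(user):
    return [{"id": 1, "name": "Widget"}]

get_products("ana@example.com")
```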
Stage 5: Iterative improvement
Organizations can, and usually should, take an iterative approach when implementing a data services solution. This means organizations can start with a small business requirement, build the data services solution end to end, from the design phase to the data delivery stage, and then incrementally improve and expand the solution based on feedback from the consuming users and applications. However, it is important to build a solution that is flexible enough to accommodate and document changes, and to have a process that streamlines the propagation of data services updates to end users and applications. Publishing OpenAPI assets helps in managing API changes, since standardized versioning can be adopted for minor and major changes, backward compatibility, and deprecation of endpoints.
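A small sketch of how versioned endpoints and deprecation signaling might look in practice (paths, dates, and response headers are illustrative and follow common REST versioning conventions rather than any specific platform's):

```python
# Route table mapping versioned paths to handlers; v1 still works but signals
# deprecation so clients can migrate to v2 at their own pace (illustrative).
def products_v1():
    body = [{"id": 1, "name": "Widget"}]                        # original shape
    headers = {"Deprecation": "true",
               "Sunset": "Sat, 31 Jan 2026 00:00:00 GMT"}
    return 200, headers, body

def products_v2():
    body = [{"id": 1, "name": "Widget", "category": "tools"}]   # added field
    return 200, {}, body

ROUTES = {"/api/v1/products": products_v1, "/api/v2/products": products_v2}

for path, handler in ROUTES.items():
    status, headers, body = handler()
    print(path, status, headers, body)
```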
Lastly, organizations can also leverage the information captured by the monitoring tools to continually improve the quality and performance of the data services offered by the solution.
Exploration
Fill up your backpack with additional gear:
- Program Design: KB Articles, Solution Brief, Webinars, Podcasts, Additional Resources
- Governance and Delivery: Expert Trails, Webinars, Podcasts, Additional Resources
Guided Routes
Denodo Training Courses
Denodo training courses provide expert data virtualization training for data professionals, including administrators, architects, and developers.
If you are interested in Enterprise Data Services, you should enroll in the following courses:
- Data Modeling with Denodo
- Building API Services with Denodo
- Connecting to Denodo from my Application
- Denodo Security Deep-Dive
- Self-Service and Data Marketplace with Denodo
Success Services
Denodo Customer Success Services can help you at the start of, or at any point along, your Enterprise Data Services trail. You can find information about the Denodo Success Services offering in:
Advisory Sessions
Denodo customers with active subscriptions can request Meet a Technical Advisory sessions.
These are the available sessions related to Enterprise Data Services:
Understand roadmap | Use Case: Guidance | Guidance on designing and implementing a use case using the Denodo Platform. Review whether it fits as a good data virtualization use case, considering such topics as data services, analytics, data science, sandboxing recommendations, etc.
Architecture Definition | Reference Architecture | Assistance in defining or reviewing your Reference Architecture: where to position the Denodo Platform in your ecosystem and the main role that the Denodo Platform can play in your organization’s use cases.
Development Methodology | Development Lifecycle | Assistance in defining the development lifecycle when working on a Denodo Platform project.
Development Methodology | View Modeling: Best Practices | Recommendations on defining the business model (Canonical Model, report presentation layer, relationships), integration with database modeling tools (i.e. Embarcadero, Erwin, etc.), and top-down and bottom-up development approaches.
Development Methodology | Data Delivery: Architecture Overview & Best Practices | Definition of policies for accessing the Denodo Platform through the different available interfaces, plus data services best practices: RESTful architecture, OData, GraphQL, Swagger, etc.
Data Delivery Enablement | Client Access Configuration | Assistance on integrating with client applications using out-of-the-box features or Denodo Connects: access via services (REST/SOAP/OData/GraphQL) and IDU (insert, delete, update) operations via VQL, SOAP, and REST.
Security of Denodo components: Protocols | SAML/OAuth Configuration | Steps to configure SAML or OAuth for the REST services in the VDP Server.
Data Catalog | Data Catalog: Overview & Best Practices | Capabilities review and general best practices to kick-start your cataloging activities: governance (relationships, data lineage, and model descriptions), metadata documentation and classification (tags, categories) for data stewardship, advanced metadata and content search, data exploration (web-based query wizard for non-technical users, personal reports, and sharing options), and usage statistics.
Success Accelerators
In addition to Advisory sessions, Success Services includes Success Accelerators that can help you.
- Architecture Modernization
Engineer and evolve existing client architecture with technical / development teams to support modern business requirements.
Deliverables and Outcomes
- Assessment of current state architecture
- Future state design and recommendations
- Business Case Discovery
Prioritize Data Virtualization Use Cases based on overall business value and feasibility to maximize the impact of your Data Virtualization program and investments.
Deliverables and Outcomes
- Categorize and align selected Use Cases with Data Virtualization capabilities and expectations
- Prioritize and rank recommended Use Cases based on expected Business Value and Organizational Feasibility for implementation
- Operating Model and CoE
Identify and formalize the DV roles, responsibilities, artifacts, and processes required to accelerate insights and deliver high-value business outcomes.
Deliverables and Outcomes
- Selection Guidance and organizational recommendations for CoE deployment
- Discussion of the best proactive Governance requirements aligned with selected Denodo Use Cases and Solutions
If you are a Denodo customer, you can reach out to your Customer Success Manager for details about any Guided Route that you need.
Big Hike Prep Check
Let’s see if you are ready to start your big trail. Take this 5-question questionnaire to check your readiness for an enjoyable hike.
Read the questions below, think about your answer, and then check whether you got it right by looking at the solution. Have you become an expert?
- What are the essential characteristics of an effective enterprise data services solution?
An effective data services solution is able to abstract the complexity of the underlying data sources. Its semantic design is aligned to logical data models, and its endpoints are reusable, interoperable, and flexible enough to support various consumption patterns. It is also important that it is secure, with controlled access, and has a good governance framework to help with data discovery and to centralize usage metrics and monitoring.
- In what ways can data virtualization technology augment an API ecosystem?
To explore these in detail, refer to the solution brief Data Virtualization and the API Ecosystem.
- What are the ways to prepare your data for consumption and why are they important?
To prepare data for consumption, data cleansing, data integration, and data modeling must be performed. These data preparation activities help ensure that data is exposed in a suitable and standardized format to consuming applications and users.
- How does having a logical layer simplify the implementation of data access control and security policies in a data services solution?
A logical layer leveraging data virtualization technology can help centralize and simplify data access control and monitoring within a single control plane. By mapping all the data sources into a logical layer and centralizing all data access points, it dramatically simplifies the data access control process. A logical layer also decouples the data security policies from the underlying data repositories: by implementing the necessary business rules at the logical layer, they can be defined once and applied across all data repositories. Refer to the solution brief Data Virtualization - The Essential Tool for Security and Governance for more information.
- What can be implemented in a data services solution to enable the easy discovery of data assets within the organization?
Organizations can implement a collaboration-ready platform where users and applications can browse data assets along with metadata that describes their intended purpose, references to enterprise data dictionaries, as well as categorization, tags, and additional properties.