Data Fabric vs Data Virtualization

What is the best way to connect data across multiple sources while supporting large data sets, complex data structures, and real-time needs?

 

How do Data Fabric and Data Virtualization compare when speeding up data & analytics? What do they have in common, and how do they differ?


Let’s define the two before diving into the differences, pros, and cons.


What is Data Fabric?


Forrester defines data fabric as a platform for “orchestrating disparate data sources intelligently and securely in a self-service and automated manner... to deliver a unified, trusted, and comprehensive real-time view of customer and business data across the enterprise.”


What is Data Virtualization?


Data virtualization is a logical data layer that integrates enterprise data siloed across disparate systems, manages and unifies it for centralized security and governance, and delivers it to business users in real time.
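To make the idea of a logical data layer concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the `LogicalDataLayer` class, its `register` and `query` methods, and the sample rows are all hypothetical. The layer stores how to reach each source rather than a copy of the data, so a change in the source shows up in the next query.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class LogicalDataLayer:
    """Hypothetical logical layer: it keeps a way to reach each source, never the data itself."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], Iterable[Row]]] = {}

    def register(self, name: str, fetch: Callable[[], Iterable[Row]]) -> None:
        # Only the fetch function is stored; no rows are copied at registration time.
        self._sources[name] = fetch

    def query(self, name: str, **filters: object) -> List[Row]:
        # Rows are fetched at query time, so results reflect the source's current state.
        rows = self._sources[name]()
        return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


# Two stand-in "siloed systems" (plain lists here; real ones would be databases or APIs).
crm_rows: List[Row] = [{"account": "Acme", "region": "EU"}]
erp_rows: List[Row] = [{"account": "Acme", "open_orders": 3}]

layer = LogicalDataLayer()
layer.register("crm", lambda: crm_rows)
layer.register("erp", lambda: erp_rows)

before = layer.query("crm", region="EU")
crm_rows.append({"account": "Globex", "region": "EU"})  # the change happens in the source
after = layer.query("crm", region="EU")  # the new row is visible without any reload step
```

A real virtualization layer would also translate queries into each source's dialect and enforce security policies; the sketch leaves those out.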


How do they compare?

| Functionality | Data Fabric | Data Virtualization |
|---|---|---|
| Data Catalog | Yes | Limited |
| Data Pipeline | Yes | Limited |
| Data Modeling | Yes | Limited |
| Data Types | Structured, Unstructured, Semi-Structured | Primarily Structured Data |
| Data Connectivity | Extensive | Extensive |
| Data Preparation | Yes | No |
| Push-Down | Yes | No |
| Caching, In-memory | Yes | Optional |
| Data Security, Governance | Yes | Yes |
| Natural Language Processing (NLP) | Yes | No |
| AI, ML Based Automation (actively uses metadata) | Yes | No |
| Composable, Reusable Components | Yes | No |
| Self-Service Data (Governed) | Yes | No |
| Self-Service Analytics | Yes | No |


What are the Pros of Data Virtualization?

  • Provides a virtual approach to accessing and delivering data

  • Helps to integrate data siloed across enterprise systems

  • Returns the integrated information in real-time to the applications used by business users

What are the Cons of Data Virtualization?

  • An incomplete solution when compared to Data Fabric

  • Limited data pipeline capabilities

  • Limited data catalog capabilities

  • No built-in data preparation

  • No way to use Natural Language Processing to run queries across datasets

  • No Artificial Intelligence or Machine Learning-based automation

What are the Pros of Data Fabric?

  • Data does not have to be moved; you can access it where it lives

  • Ingest, transform and integrate data on the fly without needing to persist data to a data lake or warehouse first

  • See results in real-time at each step without waiting for the data to be transformed

  • Save money by minimizing the amount of data duplication

  • Data can still be persisted when performance or other reasons require it
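The on-the-fly pattern in the list above can be sketched with Python generators. This is a simplified illustration under stated assumptions, not a vendor implementation: the function names `ingest`, `transform`, and `persist`, and the sample order rows, are hypothetical. Each step yields rows as they are processed, so the first result is available before the rest of the data has been transformed, and persisting is an explicit, optional last step.

```python
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def ingest(source: Iterable[Row]) -> Iterator[Row]:
    # Read rows one at a time from where they live; nothing is staged first.
    for row in source:
        yield row


def transform(rows: Iterable[Row]) -> Iterator[Row]:
    # Add a derived column on the fly, row by row.
    for row in rows:
        yield {**row, "revenue": row["units"] * row["unit_price"]}


def persist(rows: Iterable[Row]) -> List[Row]:
    # Optional step: materialize the result only when it is actually needed.
    return list(rows)


# Hypothetical sample data standing in for a live source.
orders: List[Row] = [
    {"sku": "cola", "units": 10, "unit_price": 2},
    {"sku": "tonic", "units": 4, "unit_price": 3},
]

pipeline = transform(ingest(orders))
first = next(pipeline)  # available before the remaining rows are processed

snapshot = persist(transform(ingest(orders)))  # persisted copy, created on demand
```

Because the pipeline is lazy, the only duplicated data is whatever is explicitly passed to `persist`.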

What are the Cons of Data Fabric?

  • The traditional approach to a Data Fabric is to buy several tools and stitch them together, which means long, expensive system integration projects

  • The way vendors market Data Fabric causes confusion (see "What is a Data Fabric")

  • Teams often go too big on day one instead of targeting a smaller, achievable outcome

What is a business use case of a Data Fabric?


Let’s say your business is in the beverage industry. You have data in Salesforce, Excel, and Oracle. The trick is that data about your corporate accounts lives in Salesforce, data about the account managers maintaining relationships with vendors lives in Excel, and data about supply-chain updates lives in Oracle. Data Fabric connects all three. It also models the relationships between the sources without moving any of the data, and lets you run queries across them through Natural Language Processing.
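As a rough sketch of this use case, the Python below joins three stand-in sources at read time. Everything in it is an assumption made for illustration: the field names (`account_id`, `manager`, `status`), the sample values, and the helper functions are made up, and plain lists take the place of real Salesforce, Excel, and Oracle connections. The natural language query step is not modeled; a plain function call stands in for it.

```python
from typing import Dict, Iterator, List

# Stand-ins for the three systems; all field names and values are made up.
salesforce_accounts = [
    {"account_id": "A1", "account_name": "Fizz Co"},
    {"account_id": "A2", "account_name": "Bubble Inc"},
]
excel_account_managers = [
    {"account_id": "A1", "manager": "Dana"},
    {"account_id": "A2", "manager": "Lee"},
]
oracle_supply_updates = [
    {"account_id": "A1", "status": "delayed"},
    {"account_id": "A2", "status": "on time"},
    {"account_id": "A1", "status": "shipped"},
]


def account_view() -> Iterator[Dict[str, str]]:
    """Relate the three sources through account_id each time the view is read."""
    # Small lookup indexes are rebuilt per read; the sources themselves are left in place.
    names = {r["account_id"]: r["account_name"] for r in salesforce_accounts}
    managers = {r["account_id"]: r["manager"] for r in excel_account_managers}
    for update in oracle_supply_updates:
        account_id = update["account_id"]
        yield {
            "account_name": names[account_id],
            "manager": managers[account_id],
            "status": update["status"],
        }


def managers_with_status(status: str) -> List[str]:
    # Example question: which account managers have an update with this status?
    return sorted({row["manager"] for row in account_view() if row["status"] == status})


delayed = managers_with_status("delayed")
```

The shared `account_id` key plays the role of the modeled relationship between sources; in practice that mapping would be defined once in the fabric's data model rather than hard-coded.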


Learn how the Promethium Data Fabric connects 400+ data sources in a single data analytics platform saving go-to-market time and over 91% of integration costs.


