Concepts
This page describes the core concepts of the Dataworkz platform.
Datasource
A data source represents a repository of data. Data sources fall into the following categories:
Cloud object stores - AWS S3, Azure Blob Storage, or Google Cloud Storage.
Relational databases - Oracle, MySQL, MSSQL, and more.
SaaS applications - Salesforce, Marketo, Outreach, and Pendo, to name a few.
NoSQL databases - Aerospike, Couchbase, and MongoDB.
Workspace
A workspace is a logical construct that groups related collections and datasets. Use workspaces to keep different datasets and collections organized as logically separate units.
Collection
A collection is a group of datasets. The datasets in a collection do not need to share the same format.
Datasets
Datasets are the foundation for data-related tasks in Dataworkz, where they are displayed in a tabular format. The underlying data might be stored in a relational database, Hive table, cloud data warehouse, or NoSQL database. When you navigate to a dataset in Dataworkz, you are presented with a visual profile of the data with summary statistics.
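The nesting of workspaces, collections, and datasets described above can be pictured with a small data model. This is only an illustrative sketch: the class names, fields, and format labels below are hypothetical and are not part of any Dataworkz API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Dataset:
    """A tabular view over data held in some underlying store (hypothetical model)."""
    name: str
    data_format: str  # illustrative label such as "table" or "csv"
    rows: List[dict] = field(default_factory=list)


@dataclass
class Collection:
    """A group of datasets; formats may differ between members."""
    name: str
    datasets: Dict[str, Dataset] = field(default_factory=dict)

    def add(self, dataset: Dataset) -> None:
        self.datasets[dataset.name] = dataset

    def formats(self) -> Set[str]:
        # The set of distinct formats present in this collection.
        return {d.data_format for d in self.datasets.values()}


@dataclass
class Workspace:
    """A logical unit that groups related collections."""
    name: str
    collections: Dict[str, Collection] = field(default_factory=dict)

    def add(self, collection: Collection) -> None:
        self.collections[collection.name] = collection

    def dataset_names(self) -> List[str]:
        # Sorted "collection/dataset" paths for every dataset in the workspace.
        return sorted(
            f"{c.name}/{d.name}"
            for c in self.collections.values()
            for d in c.datasets.values()
        )


# Example: one workspace holding one collection with two differently formatted datasets.
workspace = Workspace("marketing")
leads = Collection("leads")
leads.add(Dataset("salesforce_leads", "table"))
leads.add(Dataset("web_signups", "csv"))
workspace.add(leads)
```

In this sketch a collection accepts datasets of any format, which mirrors the point that formats within a collection can vary.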
Transformations
A transformation is an operation that is performed on a field in a dataset. The transformation operations vary depending on the data type of the field.
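To illustrate how the available operations can depend on a field's data type, here is a minimal sketch in plain Python. The operation names, the `OPERATIONS` registry, and the `transform_field` helper are assumptions made for this example; they do not describe how Dataworkz implements transformations.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry: which operations are allowed for which data type.
OPERATIONS: Dict[type, Dict[str, Callable[[Any], Any]]] = {
    str: {"trim": str.strip, "uppercase": str.upper},
    float: {"round": round, "absolute": abs},
}


def transform_field(rows: List[dict], field_name: str, operation: str) -> List[dict]:
    """Apply one operation to a single field in every row, returning new rows."""
    result = []
    for row in rows:
        value = row[field_name]
        allowed = OPERATIONS.get(type(value), {})
        if operation not in allowed:
            raise ValueError(
                f"operation {operation!r} is not available for "
                f"{type(value).__name__} field {field_name!r}"
            )
        new_row = dict(row)  # copy so the input rows are left unchanged
        new_row[field_name] = allowed[operation](value)
        result.append(new_row)
    return result


rows = [{"name": "  ada ", "score": 91.6}, {"name": "grace", "score": -3.2}]
trimmed = transform_field(rows, "name", "trim")    # string operation
rounded = transform_field(rows, "score", "round")  # numeric operation
```

Requesting a string operation such as `"uppercase"` on the numeric `score` field raises a `ValueError` in this sketch, reflecting that the set of operations varies with the field's type.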
Lineage
Catalog
Data Discovery
Data discovery is the process of connecting to datasets stored in a data lake, a cloud data warehouse, a relational or NoSQL database, or a SaaS application such as Salesforce.