Concepts

This page describes the core concepts of the Dataworkz platform.

  • Datasource

    • A data source represents a repository of data. Data sources fall into the following categories:

      • Cloud object stores - AWS S3, Azure Blob Storage, or Google Cloud Storage

      • Relational databases - Oracle, MySQL, MSSQL, and more

      • SaaS applications - Salesforce, Marketo, Outreach, and Pendo, to name a few

      • NoSQL databases - Aerospike, Couchbase, and MongoDB

  • Workspace

    • A workspace is a logical construct that groups related collections and datasets, so that they can be organized as logically separate units.

  • Collection

    • A collection is a grouping of multiple datasets. The datasets in a collection can have different formats.

  • Datasets

    • Datasets provide the foundation for various data-related tasks and are displayed in a tabular format in Dataworkz. The underlying data might be stored in a relational database, Hive table, cloud data warehouse, or NoSQL database. When you navigate to a dataset in Dataworkz, you are presented with a visual profile of the data along with summary statistics.

  • Transformations

    • A transformation is an operation performed on a field in a dataset. The available operations depend on the data type of the field.

  • Lineage

  • Catalog

  • Data Discovery

    • Data discovery is the process of connecting to datasets stored in a data lake, a cloud data warehouse, a relational/NoSQL database, or a SaaS application like Salesforce.
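
The relationships between these concepts can be sketched as a small data model. The Python below is only an illustration of the containment described above and is not the Dataworkz API: every class, field, and transformation name in it is hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class SourceKind(Enum):
    """The four data source categories listed on this page."""
    CLOUD_OBJECT_STORE = "cloud object store"
    RELATIONAL_DATABASE = "relational database"
    SAAS_APPLICATION = "SaaS application"
    NOSQL_DATABASE = "NoSQL database"


@dataclass
class Datasource:
    name: str
    kind: SourceKind


@dataclass
class Dataset:
    name: str
    source: Datasource
    # Maps field name to a data type label (labels are hypothetical).
    fields: Dict[str, str] = field(default_factory=dict)


@dataclass
class Collection:
    """A grouping of datasets; their formats may differ."""
    name: str
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class Workspace:
    """Groups related collections and datasets."""
    name: str
    collections: List[Collection] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)

    def all_datasets(self) -> List[Dataset]:
        # Datasets held directly, followed by those inside collections.
        nested = [d for c in self.collections for d in c.datasets]
        return self.datasets + nested


# Hypothetical operation names, showing only that the available
# transformations depend on the data type of the field.
TRANSFORMATIONS_BY_TYPE: Dict[str, List[str]] = {
    "string": ["trim", "uppercase"],
    "number": ["round", "absolute_value"],
}


def available_transformations(dataset: Dataset, field_name: str) -> List[str]:
    """Look up the operations offered for one field of a dataset."""
    return TRANSFORMATIONS_BY_TYPE.get(dataset.fields[field_name], [])


# Usage: one SaaS data source feeding a dataset inside a collection.
crm = Datasource("crm", SourceKind.SAAS_APPLICATION)
leads = Dataset("leads", crm, {"email": "string", "score": "number"})
sales = Workspace("sales", collections=[Collection("crm_exports", [leads])])
```

In this sketch a workspace can hold datasets both directly and through collections, which mirrors the definition of a workspace as a grouping of "related collections and datasets".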
