Dataset
Dataworkz Datasets provide a common abstraction over a number of data sources. These include Relational Databases such as MySQL or Postgres, Cloud Databases such as MongoDB or Snowflake, and a variety of other systems. Dataworkz imports the schema and metadata of the table or collection of data from the connected data source. Dataworkz Agents can leverage this data via the Dataset Tool.
Follow the Discovery process to discover relevant tables/collections into your Workspace
To use a Dataset Tool, complete the following steps:
Select AI Agents > Create a Tool > Dataset to create the Dataset Tool
Name: Provide a name for the tool
Description: Provide a description of what the tool achieves and should be used for
Dataset: Select the Dataset from the Dataset Explorer in the tool
Filter Criteria: Provide your filter criteria depending on what you want your tool to achieve. E.g. if your tool is meant to return a list of orders of a customer, then you would want to filter the data by customerId.
The filter is written in a simple SQL dialect. The filter criteria form the WHERE clause of a SQL SELECT query.
Parameters to the filter query can be referenced by using the format ${parameter_name}
e.g. customer_id = '${customer_id}'
Input Parameters: Any filter parameters will automatically show up in the Input Parameters section
Adjust the type and the description of each parameter. The description should include any additional information, such as the list of allowed values if the parameter can take only a few fixed values, or the expected format of the data.
Output Parameters: Here’s where you select the projections - columns/headers/fields that you want this tool to output
Providing a description helps the Agent chain tools correctly when this tool is used as a part of a larger plan
Pro Tip: If you are using the same Dataset for multiple tools, you should edit the Header descriptions in the Catalog of the Dataset and then you will be able to reuse the descriptions in all tools
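To make the relationship between the filter criteria, input parameters, and output parameters concrete, here is a minimal sketch in Python. It is not how Dataworkz implements the Dataset Tool; it only illustrates the idea under stated assumptions. The helper names (extract_parameters, build_query, run_tool), the orders table, and the use of an in-memory SQLite database are all hypothetical.

```python
import re
import sqlite3

# Matches ${name}, optionally wrapped in single quotes as in the example filter.
PLACEHOLDER = re.compile(r"'?\$\{(\w+)\}'?")


def extract_parameters(filter_criteria):
    """Return the placeholder names in order of first appearance.

    This mirrors the idea that filter parameters show up as input parameters.
    """
    names = []
    for name in PLACEHOLDER.findall(filter_criteria):
        if name not in names:
            names.append(name)
    return names


def build_query(table, columns, filter_criteria):
    """Build a SELECT whose WHERE clause is the filter criteria.

    Placeholders become named bind parameters (:name) so that values are
    passed separately from the SQL text instead of being pasted into it.
    """
    where = PLACEHOLDER.sub(r":\1", filter_criteria)
    return f"SELECT {', '.join(columns)} FROM {table} WHERE {where}"


def run_tool(conn, table, columns, filter_criteria, params):
    """Run the filtered query and return one dict per row (the output parameters)."""
    sql = build_query(table, columns, filter_criteria)
    rows = conn.execute(sql, params).fetchall()
    return [dict(zip(columns, row)) for row in rows]


# Hypothetical data standing in for a discovered Dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "C1", 20.0), (2, "C2", 12.5), (3, "C1", 5.5)],
)

FILTER = "customer_id = '${customer_id}'"
orders_for_c1 = run_tool(
    conn, "orders", ["order_id", "total"], FILTER, {"customer_id": "C1"}
)
```

In this sketch, the selected columns play the role of the output parameters, and the dictionary passed to run_tool plays the role of the values the Agent supplies for the input parameters.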
Use the Dataset tool when you need to access data such as operational databases, lookups, etc. The Dataset tool currently works for Relational Databases, Snowflake and MongoDB. It is limited to simple filtered views of the data, but this usually works well with Agents. You can create multiple Dataset tools, and the Agent can perform joins by chaining parameters across multiple tool calls on different Datasets within one scenario.
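The chaining described above can be pictured as one tool's output parameter feeding the next tool's input parameter. The sketch below is purely illustrative: the two functions stand in for two hypothetical Dataset tools over made-up in-memory data, and the loop in orders_for_email stands in for the plan an Agent might produce. None of these names come from Dataworkz.

```python
# Hypothetical rows standing in for two separate Datasets.
CUSTOMERS = [
    {"customer_id": "C1", "email": "ana@example.com"},
    {"customer_id": "C2", "email": "bo@example.com"},
]
ORDERS = [
    {"order_id": 1, "customer_id": "C1", "total": 20.0},
    {"order_id": 2, "customer_id": "C2", "total": 12.5},
    {"order_id": 3, "customer_id": "C1", "total": 5.5},
]


def customer_by_email_tool(email):
    """Stand-in for a tool with filter: email = '${email}'; outputs customer_id."""
    return [{"customer_id": c["customer_id"]} for c in CUSTOMERS if c["email"] == email]


def orders_by_customer_tool(customer_id):
    """Stand-in for a tool with filter: customer_id = '${customer_id}'."""
    return [
        {"order_id": o["order_id"], "total": o["total"]}
        for o in ORDERS
        if o["customer_id"] == customer_id
    ]


def orders_for_email(email):
    """Chain the two tools: the first tool's output is the second tool's input.

    The effect is a join between the two Datasets on customer_id, even though
    each tool only performs a simple filtered lookup.
    """
    results = []
    for customer in customer_by_email_tool(email):
        results.extend(orders_by_customer_tool(customer["customer_id"]))
    return results
```

This is also why clear parameter descriptions matter: the chain only works if it is evident that the customer_id returned by the first tool is the same value the second tool expects.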