
Creating RAG Experiments with Dataworkz

A quick overview of experiments: how to create a RAG experiment and how to compare response accuracy.

Last updated 2 months ago

Overview: This guide walks you through the process of creating experiments with your RAG (Retrieval-Augmented Generation) applications using Dataworkz. You’ll learn how to create a new RAG application, then experiment with different chunking and embedding strategies to optimize the performance of your RAG-based models.

Prerequisites

Before you start creating RAG experiments, ensure you’ve set up a RAG application in Dataworkz. For detailed steps on creating your RAG application, refer to Create a RAG Application.


Step 1: Create a RAG Application

To start creating experiments, you first need to create a RAG application. If you haven't done so already, follow the instructions below:

  1. Navigate to the RAG Applications tab in Dataworkz.

  2. Click on Create New Application and follow the prompts to set up your RAG application. For detailed instructions on creating a RAG application, refer to the RAG Application Quickstart.


Step 2: Create a RAG Experiment

Once your RAG application is set up, you can begin experimenting with different configurations. Here’s how:

  1. Go to the RAG Applications tab in Dataworkz.

  2. Locate the RAG application you want to experiment with and click on the 'Create Experiment' button next to it.

    This will allow you to create a new experiment based on the application and modify key parameters.


Step 3: Select Your Experiment Parameters

After clicking the 'Create Experiment' button, you’ll be prompted to configure several parameters for your experiment:

1. Chunking Strategy

Choose a chunking strategy that suits your application’s needs. The chunking strategy defines how your documents are split into smaller pieces for efficient retrieval and processing. Options may include:

  • Fixed-size chunks

  • Semantic-based chunking

  • Custom strategies
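To make the difference between strategies concrete, here is a minimal Python sketch of a boundary-aware chunker that packs whole sentences into chunks instead of cutting at a fixed position. It is only an illustration: the function name `sentence_chunks`, the `max_chars` parameter, and the naive sentence-splitting rule are assumptions for this example, and it is a simple stand-in rather than the semantic chunking that Dataworkz applies.

```python
import re


def sentence_chunks(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow the chunk, so start a new one.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "Alpha is first. Beta is second. Gamma is third."
    for chunk in sentence_chunks(sample, max_chars=32):
        print(repr(chunk))
```

Because chunks end on sentence boundaries, their lengths vary, which is the main trade-off against fixed-size chunks.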

2. Chunking Size

Specify the size of the chunks you want to use. This value determines the length of each chunk (in terms of tokens or characters). Larger chunks may reduce retrieval overhead, while smaller chunks could improve accuracy.

3. Chunk Overlap

Set the overlap between adjacent chunks. Overlapping chunks can help ensure that relevant information is retained across splits, but too much overlap may lead to redundancy. Adjust the overlap to strike a balance between performance and precision.
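The interaction between chunk size and overlap can be sketched in a few lines of Python. This is an illustrative example, not the Dataworkz implementation: the function name `fixed_size_chunks` is hypothetical, and whitespace-separated words stand in for tokens.

```python
def fixed_size_chunks(tokens: list[str], chunk_size: int, overlap: int) -> list[list[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


if __name__ == "__main__":
    words = "the quick brown fox jumps over the lazy dog today".split()
    for chunk in fixed_size_chunks(words, chunk_size=4, overlap=1):
        print(chunk)
```

With a fixed chunk size, a larger overlap means a smaller step and therefore more chunks to embed and store, which is the redundancy cost described above.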

4. Embedding Model

Select the embedding model that will be used to vectorize the chunks in this experiment.
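When planning several experiments, it can help to write the four parameters down as a small grid before creating them in the UI. The sketch below is a planning aid under stated assumptions: `ExperimentConfig`, `candidate_configs`, and the strategy and model names are hypothetical placeholders, not part of any Dataworkz API.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ExperimentConfig:
    """One combination of the four experiment parameters (hypothetical structure)."""
    chunking_strategy: str
    chunk_size: int
    chunk_overlap: int
    embedding_model: str

    def __post_init__(self) -> None:
        if self.chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        if not 0 <= self.chunk_overlap < self.chunk_size:
            raise ValueError("chunk_overlap must be at least 0 and smaller than chunk_size")


def candidate_configs(strategies, sizes, overlaps, models) -> list[ExperimentConfig]:
    """Enumerate every valid combination, skipping overlaps that do not fit the chunk size."""
    configs = []
    for strategy, size, overlap, model in product(strategies, sizes, overlaps, models):
        if overlap < size:
            configs.append(ExperimentConfig(strategy, size, overlap, model))
    return configs


if __name__ == "__main__":
    # Placeholder names; substitute the options shown in your own workspace.
    grid = candidate_configs(["fixed-size"], [256, 512], [0, 64], ["embedding-model-a"])
    for config in grid:
        print(config)
```

Each printed configuration corresponds to one experiment you would create by hand in Step 2.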

Step 4: Save and Run the Experiment

Once you've configured the settings, click Save. Dataworkz creates a copy of your original RAG application with the new chunking and embedding strategies applied. The new experimental RAG application appears in the drop-down below the original, and you can compare its results against the original RAG application to check for performance improvements.
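One simple way to compare the two applications is to ask both the same questions and score each answer against a reference answer you trust. The sketch below uses token-level F1 as the score. It is a minimal illustration: the helper names and the sample data are hypothetical, the answers are assumed to have been collected by you beforehand, and this is not the scoring method used by Dataworkz.

```python
from collections import Counter


def token_f1(answer: str, reference: str) -> float:
    """Token-overlap F1 between an answer and a reference answer (case-insensitive)."""
    answer_tokens = answer.lower().split()
    reference_tokens = reference.lower().split()
    if not answer_tokens or not reference_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(answer_tokens == reference_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    common = sum((Counter(answer_tokens) & Counter(reference_tokens)).values())
    if common == 0:
        return 0.0
    precision = common / len(answer_tokens)
    recall = common / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


def mean_f1(answers: list[str], references: list[str]) -> float:
    """Average token F1 over a non-empty set of question/answer pairs."""
    if not answers or len(answers) != len(references):
        raise ValueError("answers and references must be non-empty and the same length")
    return sum(token_f1(a, r) for a, r in zip(answers, references)) / len(answers)


if __name__ == "__main__":
    references = ["chunks are split by token count"]
    original_answers = ["documents are split"]
    experiment_answers = ["chunks are split by token count"]
    print("original:  ", mean_f1(original_answers, references))
    print("experiment:", mean_f1(experiment_answers, references))
```

A higher average for the experimental application suggests the new settings help on this question set. Token overlap is a rough proxy, so it is worth reading a sample of the answers as well.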


Next Steps:

Once the experiment is created, you can monitor its progress and analyze the results to see how the new chunking and embedding strategies impact your RAG application’s performance. If needed, you can create additional experiments to further refine the setup. If you want to use the RAG Evaluation repo, you can follow its steps to compare the experimental version of your RAG app.

RAG Evaluation repo
Create a RAG Application
RAG Application Quickstart