
AWS S3


Before setting up S3 connectivity in Dataworkz, the necessary permissions must be configured in AWS.

AWS IAM Policy and User Creation

Dataworkz recommends creating an IAM policy and user for Dataworkz to access the S3 bucket.

Once the policy is attached to the bucket/user and AWS's security credentials are generated, these credentials need to be configured in Dataworkz to access objects in S3.
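As a reference point, a least-privilege policy of this kind might look like the sketch below. The bucket name `my-dataworkz-bucket` and the statement IDs are placeholders, not values from this guide; adjust the actions to match what your dataflows actually need.

```python
import json

# Sketch of a least-privilege IAM policy granting read/write access to the
# objects in a single bucket, plus permission to list the bucket itself.
# "my-dataworkz-bucket" is a placeholder bucket name.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DataworkzObjectAccess",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::my-dataworkz-bucket/*",
        },
        {
            "Sid": "DataworkzListBucket",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::my-dataworkz-bucket",
        },
    ],
}

# Print the JSON document to paste into the AWS policy editor.
print(json.dumps(policy, indent=2))
```

Note that object-level actions use the `/*` resource ARN while `s3:ListBucket` applies to the bucket ARN itself; mixing these up is a common reason a policy silently fails.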

IAM Policy

From the AWS management console, navigate to the IAM section. Choose Account Settings from the left pane.

Check the Security Token Service Regions list and confirm that the region for your account is active.

IAM User

To create a new AWS IAM user, choose the “Users” option in the left pane and click on “Add users”.

Specify the user name and save it.

Click the newly created user in the list and select the “Security credentials” tab.

Scroll down to the “Access keys” section and click “Create access key”.

Choose “Other” for the use case in the next step, since Dataworkz only needs permission to access objects in a specific bucket.

Create the access key and click “Download .csv file” to obtain the credentials for configuring Dataworkz.
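The downloaded file typically contains a header row followed by the key pair (verify the column names against your own file, since the AWS console format can vary). A small sketch of reading it, using an inline sample in place of the real file:

```python
import csv
import io

# Inline stand-in for the accessKeys.csv downloaded from the AWS console.
# The header "Access key ID,Secret access key" and both values here are
# illustrative placeholders only.
sample = (
    "Access key ID,Secret access key\n"
    "AKIAEXAMPLEKEYID,wJalrXUtnFEMI/EXAMPLESECRET\n"
)

reader = csv.DictReader(io.StringIO(sample))
row = next(reader)
access_key_id = row["Access key ID"]          # the "Key" value in Dataworkz
secret_access_key = row["Secret access key"]  # the "Value" value in Dataworkz
print(access_key_id)
```

The two columns map directly onto the Key and Value fields entered when configuring the S3 connector later in this guide.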

S3 Bucket

Create a new S3 bucket that is managed using IAM policies.

There are two options for granting Dataworkz access to the bucket: attach a policy to the bucket (Bucket Permissions) or attach a policy to the user (User Permissions).

Bucket Permissions

Select the bucket, go to the “Permissions” tab, and edit the bucket policy.
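A bucket policy names the allowed caller via a `Principal` element. The sketch below shows the general shape; the account ID `123456789012`, the user name `dataworkz`, and the bucket name are placeholders, so substitute the ARN of the IAM user you created earlier.

```python
import json

# Sketch of a bucket policy granting a specific IAM user read/write/list
# access. All identifiers below are illustrative placeholders.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/dataworkz"},
            "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::my-dataworkz-bucket",
                "arn:aws:s3:::my-dataworkz-bucket/*",
            ],
        }
    ],
}

# Print the JSON document to paste into the bucket policy editor.
print(json.dumps(bucket_policy, indent=2))
```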

User Permissions

Go to IAM -> Users.

Select the user, go to the “Permissions” section, and select “Add permissions”. Choose the option to attach policies directly, select AmazonS3FullAccess, and save. This grants the user read, write, and list access to all S3 resources.

Configuring S3 connector in Dataworkz

  • Log in to the Dataworkz application

  • Go to Configuration -> Cloud Data Platforms -> S3

  • Click the + icon to add a new S3 connection

  • Give the storage a name

  • Add the Key (your S3 user's security access key)

  • Add the Value (your S3 user's security key value)

  • Select the storage base path (the URL path to the storage location)

  • Choose the region

  • Test the connection

  • Once the connection succeeds, save the configuration
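Before saving, it can help to sanity-check the storage base path format. The sketch below assumes the base path uses the common `s3://bucket/prefix` form (the exact format Dataworkz expects may differ; treat this as an illustration only).

```python
from urllib.parse import urlparse

def check_base_path(path: str) -> tuple[str, str]:
    """Split an s3://bucket/prefix path into (bucket, prefix), or raise."""
    parsed = urlparse(path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Expected s3://bucket/prefix, got: {path}")
    return parsed.netloc, parsed.path.lstrip("/")

# "my-dataworkz-bucket" and "ingest/" are placeholder values.
bucket, prefix = check_base_path("s3://my-dataworkz-bucket/ingest/")
print(bucket, prefix)
```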

When attaching a policy to the bucket, grant all permissions (read & write) on the bucket and the encapsulating directory, and edit the policy JSON to include the Principal of the IAM user created for this purpose (see the “IAM User” section).