Unstructured File Ingestion

PDF / Excel / Plain Text / Markdown / Word / PowerPoint

This feature ingests unstructured files (PDF, Excel, plain text, Markdown, Word, and PowerPoint) and converts them into a structured dataset that is ready for chunking and embedding. To ingest your files, follow these steps:

1. Select File Type

  • From the list, choose the type of file you want to ingest. You can select more than one file type (for example, PDF, Excel, and Word).

2. Select Storage Location

  • Choose the storage location that holds your files (e.g., S3, GCS, or Other Object Store).

3. View Available Files

  • Click View All Files to list the files available in the selected storage location.

4. Select Files

  • From the displayed list, select the files you want to include in the ingestion process.
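
The steps above amount to filtering a storage listing by file type and then picking files from the result. The Python sketch below illustrates that logic only; it is not the product's implementation. The extension mapping, the function name filter_files, and the sample object keys are all assumptions made for the example.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file-type label to file extensions.
# The product's actual mapping is not documented here.
EXTENSIONS_BY_TYPE = {
    "PDF": {".pdf"},
    "Excel": {".xls", ".xlsx"},
    "Plain Text": {".txt"},
    "Markdown": {".md"},
    "Word": {".doc", ".docx"},
    "PowerPoint": {".ppt", ".pptx"},
}


def filter_files(object_keys, selected_types):
    """Return the keys whose extension belongs to any selected file type."""
    allowed = set()
    for file_type in selected_types:
        allowed |= EXTENSIONS_BY_TYPE[file_type]
    # Compare extensions case-insensitively so "sales.XLSX" still matches.
    return [
        key for key in object_keys
        if PurePosixPath(key).suffix.lower() in allowed
    ]


# Made-up object keys standing in for a storage listing (step 3).
available = ["reports/q1.pdf", "data/sales.XLSX", "notes/readme.md", "img/logo.png"]

# Steps 1 and 4: choose file types, then keep only the matching files.
selected = filter_files(available, ["PDF", "Excel"])
print(selected)
```

Selecting several file types simply widens the set of accepted extensions, which is why files that match none of the chosen types (such as the PNG above) do not appear in the result.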
