File Sources

Upload PDF, Word, TXT, Markdown, and image files to build your agent's knowledge base

Upload documents to your agent's knowledge base. AlonChat extracts text from your files, processes it, and indexes it so your agent can answer questions using that content.

Supported File Formats#

Format	Extensions	Best For	Notes
PDF	.pdf	Documents, manuals, reports	Text-based PDFs only (scanned, image-only PDFs are not supported)
Microsoft Word	.docx	Documents, policies, guides	Text and structure are extracted
Plain Text	.txt	Notes, transcripts, simple docs	Imported as-is
Markdown	.md	Technical docs, README files	Formatting is preserved
Images	.jpg, .png, .webp	Screenshots, product photos, diagrams	Visual content for your agent

For spreadsheet data (price lists, catalogs), use Structured Data instead. It imports Excel (.xlsx/.xls) and CSV files and keeps each row queryable.
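The routing described above (file sources versus Structured Data) can be sketched as a small lookup. This is an illustration only: `classify_upload` and the category labels are hypothetical names, not part of AlonChat. The extension lists are taken from the table and the spreadsheet note.

```python
from pathlib import Path

# Extensions listed in the Supported File Formats table.
FILE_SOURCE_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".jpg", ".png", ".webp"}
# Spreadsheet formats that belong in Structured Data instead.
STRUCTURED_DATA_EXTENSIONS = {".xlsx", ".xls", ".csv"}


def classify_upload(filename: str) -> str:
    """Return where a file should go, based on its extension (hypothetical helper)."""
    ext = Path(filename).suffix.lower()
    if ext in FILE_SOURCE_EXTENSIONS:
        return "file_source"
    if ext in STRUCTURED_DATA_EXTENSIONS:
        return "structured_data"
    return "unsupported"


if __name__ == "__main__":
    for name in ["manual.pdf", "prices.xlsx", "notes.TXT", "video.mp4"]:
        print(f"{name}: {classify_upload(name)}")
```

A check like this only looks at the extension; it cannot tell whether a PDF is text-based or scanned.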

Step-by-Step: Adding a File Source#

1. Navigate to Sources#

Go to your agent's dashboard
Click Sources in the sidebar
Open Files
Click Add Source

2. Upload Your File#

Option A: Drag and Drop

Drag your file into the upload area
The file name appears when the upload completes

Option B: Click to Browse

Click Choose File
Select a file from your computer
Click Open

3. Configure Source Settings#

Source Name (required)

Give this source a descriptive name
Examples: "Product Manual 2026", "Pricing Sheet", "Policy Document"
This helps you identify sources later in the dashboard

Priority (optional)

Normal (default): Standard retrieval weight
High: More likely to be retrieved when relevant (use for important information)
Low: Lower retrieval priority (use for background or supplementary information)

Mark as Price Information (checkbox)

Check this if the file contains pricing information
Pricing questions will prioritize this source in results
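If you keep a record of your sources outside the dashboard, the three settings above map naturally onto a small data structure. The sketch below is a hypothetical model, not an AlonChat API: the class name, field names, and validation rules are assumptions based only on the form fields described in this step.

```python
from dataclasses import dataclass

# Priority values as they appear in the upload form.
VALID_PRIORITIES = ("Normal", "High", "Low")


@dataclass
class SourceSettings:
    """Hypothetical container mirroring the fields in the upload form."""

    name: str                  # Source Name (required)
    priority: str = "Normal"   # Priority (optional, defaults to Normal)
    is_price: bool = False     # Mark as Price Information checkbox

    def __post_init__(self) -> None:
        # The form requires a name, so reject empty or whitespace-only values.
        if not self.name.strip():
            raise ValueError("Source Name is required")
        if self.priority not in VALID_PRIORITIES:
            raise ValueError(f"Priority must be one of {VALID_PRIORITIES}")


if __name__ == "__main__":
    manual = SourceSettings(name="Product Manual 2026", priority="High")
    pricing = SourceSettings(name="Pricing Sheet", priority="High", is_price=True)
    print(manual)
    print(pricing)
```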

4. Upload and Train#

Click Upload File
Wait for the upload to complete (you will see a progress indicator)
The source appears with a "Processing" status
Click Train Agent to process the file
Wait for the status to change to "Trained"

Your agent can now answer questions using this file's content.

Processing Time#

Small files (under 1 MB): 30-60 seconds
Medium files (1-10 MB): 1-3 minutes
Large files (over 10 MB): 5-10 minutes
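As a rough planning aid, the ranges above can be turned into a lookup by file size. This is a sketch under stated assumptions: `estimate_processing_time` is a hypothetical helper, 1 MB is taken as 1024 * 1024 bytes, and a file of exactly 10 MB is counted as medium. Actual times may differ.

```python
def estimate_processing_time(size_bytes: int) -> str:
    """Map a file size to the rough processing-time range listed above."""
    size_mb = size_bytes / (1024 * 1024)  # assumption: 1 MB = 1024 * 1024 bytes
    if size_mb < 1:
        return "30-60 seconds"
    if size_mb <= 10:  # assumption: exactly 10 MB counts as a medium file
        return "1-3 minutes"
    return "5-10 minutes"


if __name__ == "__main__":
    # Sizes in bytes: a 250 KB file, an 8.5 MB file, and a 25 MB file.
    for size in (250 * 1024, int(8.5 * 1024 * 1024), 25 * 1024 * 1024):
        print(f"{size} bytes: {estimate_processing_time(size)}")
```

For a file on disk, `os.path.getsize(path)` from the Python standard library returns the size in bytes.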

Best Practices#

File Preparation#

Do:

Use clear, descriptive filenames
Use searchable PDFs (created from Word, Google Docs, or similar)
Organize content with headings and sections
Remove unnecessary pages (covers, blank pages, appendices you do not need)

Don't:

Upload scanned, image-only PDFs (text cannot be extracted from them)
Upload password-protected files
Upload corrupted files
Use special characters in filenames
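The filename advice can be applied before uploading with a short cleanup step. This guide does not define which characters count as "special", so the sketch below assumes a conservative safe set of ASCII letters, digits, hyphens, and underscores. `safe_filename` is a hypothetical helper.

```python
import re
from pathlib import Path


def safe_filename(filename: str) -> str:
    """Replace characters outside an assumed safe set with hyphens."""
    path = Path(filename)
    stem, ext = path.stem, path.suffix
    # Assumption: "safe" means ASCII letters, digits, hyphen, and underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "-", stem).strip("-")
    # Fall back to a generic name if nothing usable is left.
    return f"{cleaned or 'file'}{ext.lower()}"


if __name__ == "__main__":
    print(safe_filename("Product Manual (v2) #final!.PDF"))
    print(safe_filename("pricing-2026.pdf"))
```

Names that are already clean, such as `pricing-2026.pdf`, pass through unchanged.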

PDF Best Practices#

Good PDFs:

Text-based (created digitally, not scanned)
Clear structure with headings
Tables with selectable text
You can highlight and copy text from the document

Problematic PDFs:

Scanned documents (images, not text)
Password-protected files
Heavily image-based layouts
Complex multi-column layouts

If you have a scanned PDF, use OCR software (such as Adobe Acrobat or an online OCR tool) to convert it to searchable text first.
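To decide whether a PDF needs OCR before you upload it, one option is to extract the text page by page and see how much comes back. The sketch below covers only the decision step and makes several assumptions: `looks_scanned` is a hypothetical helper, the 20-character threshold and the "fewer than half the pages" rule are arbitrary, and the page texts are expected to come from a PDF library of your choice (for example, calling `extract_text()` on each page of a pypdf `PdfReader`).

```python
from typing import Sequence


def looks_scanned(page_texts: Sequence[str], min_chars_per_page: int = 20) -> bool:
    """Guess whether a PDF is image-only from the text extracted per page."""
    if not page_texts:
        return True  # nothing extractable at all
    # Count pages that contain a meaningful amount of non-whitespace text.
    text_pages = sum(
        1
        for text in page_texts
        if len("".join(text.split())) >= min_chars_per_page
    )
    # Assumption: treat the file as scanned if fewer than half the pages have text.
    return text_pages < len(page_texts) / 2


if __name__ == "__main__":
    digital = ["Chapter 1. Installation steps and requirements."] * 3
    scanned = ["", "  ", ""]
    print(looks_scanned(digital))
    print(looks_scanned(scanned))
```

This is a heuristic, not a guarantee: a quick manual check is to try highlighting and copying text in a PDF viewer.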

Spreadsheet Data#

Spreadsheets are not file sources. To import price lists, catalogs, or any tabular data from Excel (.xlsx/.xls) or CSV, use Structured Data. It keeps each row queryable so your agent can look up exact items and prices.

Word Document Best Practices#

Good Word documents:

Clear headings (Heading 1, Heading 2, etc.)
Bulleted or numbered lists
Simple tables
Primarily text content

Problematic Word documents:

Complex formatting (text boxes, multi-column layouts)
Embedded objects (videos, complex charts)
Track changes enabled (may cause parsing issues)

Updating File Sources#

Editing Settings#

Find the source in the Files list
Click Edit (pencil icon)
Update the name, priority, or pricing flag
Click Save
Retrain your agent for changes to take effect

Replacing File Content#

Delete or archive the old source
Upload the new file as a new source
Train your agent

Reprocessing a File#

If processing failed or the extracted content looks wrong:

Find the source in the Files list
Click Reprocess (refresh icon)
Wait for processing to complete
Check the status

Troubleshooting#

"Upload Failed"#

Common causes:

File is too large
Network connection was interrupted
File format is not supported
File is corrupted

Solutions:

Check the file size against your plan's limit and compress the file if needed
Try uploading again on a stable connection
Convert to a supported format (e.g., export to PDF)
Verify the file opens correctly on your computer

"Processing Failed"#

Common causes:

File is password-protected
File is corrupted or malformed
Text extraction failed (scanned PDF without OCR)

Solutions:

Remove password protection before uploading
Try re-exporting the file from its original application
Convert scanned PDFs to searchable text using OCR
Try converting the file to a text-based PDF or TXT

"Agent Not Using File Content"#

Common causes:

You did not retrain after uploading
The file is still processing
Questions are not related to the file content
Priority is set too low

Solutions:

Click the Train Agent button
Wait for the source status to show "Trained"
Test with questions that directly reference content in the file
Increase the priority to "High" for important sources

Examples#

Example 1: Product Manual#

File: product-manual-v2.pdf (8.5MB, 120 pages)

Settings:

Name: "Product Manual v2 (2026)"
Priority: High
Mark as Price Information: No

Example questions your agent can answer:

"How do I install the product?"
"What are the technical specifications?"
"How do I troubleshoot error codes?"

Example 2: Price List#

File: pricing-2026.pdf (250KB, 3 pages)

Settings:

Name: "2026 Pricing"
Priority: High
Mark as Price Information: Yes

For a live, queryable price list that updates from a spreadsheet, use Structured Data instead.

Example questions your agent can answer:

"How much does the Pro plan cost?"
"What's included in the Enterprise tier?"
"Do you offer discounts for annual plans?"

Example 3: Company Policies#

File: employee-handbook.docx (2.1MB, 45 pages)

Settings:

Name: "Employee Handbook 2026"
Priority: Normal
Mark as Price Information: No

Example questions your agent can answer:

"What's the vacation policy?"
"What are the work-from-home guidelines?"
"What benefits do employees get?"

Limits#

File size limits and the maximum number of sources vary by plan. Check your plan details in Settings for the specific limits that apply to your account.

Next Steps#

Text Sources -- Add text directly without uploading a file
Q&A Sources -- Create specific question-answer pairs
Website Sources -- Crawl and index websites
Training Your Agent -- Best practices for training