File Sources
Upload PDF, Word, Excel, CSV, and Markdown files to build your agent's knowledge base
Upload documents to your agent's knowledge base. AlonChat extracts text from your files, processes it, and indexes it so your agent can answer questions using that content.
Supported File Formats#
| Format | Extensions | Best For | Notes |
|---|---|---|---|
| PDF | .pdf | Documents, manuals, reports | Text-based PDFs only (scanned images are not supported) |
| Microsoft Word | .docx | Documents, policies, guides | Text and structure are extracted |
| Microsoft Excel, CSV | .xlsx, .csv | Price lists, data tables | Tables are converted to text |
| Markdown | .md | Technical docs, README files | Formatting is preserved |
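As a rough pre-upload check, the extensions in the table can be tested with a short script. This is a minimal sketch and not part of AlonChat: the names SUPPORTED_EXTENSIONS, is_supported, and unsupported_files are made up for illustration, and the extension set is copied from the formats listed on this page (with .pdf assumed for PDF files).

```python
from pathlib import Path

# Assumption: copied from the supported formats on this page; adjust if the list changes.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".csv", ".md"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is in the supported set (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def unsupported_files(filenames):
    """Return the subset of filenames whose extension is not supported."""
    return [name for name in filenames if not is_supported(name)]
```

Running unsupported_files over a folder listing before uploading gives a quick list of files that would need converting first. The check looks only at the extension, so it cannot tell whether the file contents are valid.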
Step-by-Step: Adding a File Source#
1. Navigate to Sources#
- Go to your agent's dashboard
- Click Sources in the sidebar
- Open Files
- Click Add Source
2. Upload Your File#
Option A: Drag and Drop
- Drag your file into the upload area
- The file name appears when the upload completes
Option B: Click to Browse
- Click Choose File
- Select a file from your computer
- Click Open
3. Configure Source Settings#
Source Name (required)
- Give this source a descriptive name
- Examples: "Product Manual 2026", "Pricing Sheet", "Policy Document"
- This helps you identify sources later in the dashboard
Priority (optional)
- Normal (default): Standard retrieval weight
- High: More likely to be retrieved when relevant (use for important information)
- Low: Lower retrieval priority (use for background or supplementary information)
Mark as Price Information (checkbox)
- Check this if the file contains pricing information
- The agent prioritizes this source when answering pricing questions
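If you plan settings for many files before uploading them, it can help to write the choices down in a structured form. The sketch below is only a planning aid under that assumption: SourceSettings and Priority are hypothetical names that mirror the dashboard fields above, and they do not correspond to any documented AlonChat API.

```python
from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    """The three priority levels offered in the dashboard."""
    LOW = "Low"
    NORMAL = "Normal"
    HIGH = "High"


@dataclass(frozen=True)
class SourceSettings:
    """Hypothetical record of the settings chosen for one file source."""
    name: str                              # Source Name (required)
    priority: Priority = Priority.NORMAL   # Normal is the default
    is_price_information: bool = False     # the "Mark as Price Information" checkbox

    def __post_init__(self):
        # The source name is the only required field.
        if not self.name.strip():
            raise ValueError("Source Name is required")


# Example plan for a pricing file, using a name suggested above.
pricing = SourceSettings("Pricing Sheet", Priority.HIGH, is_price_information=True)
```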
4. Upload and Train#
- Click Upload File
- Wait for the upload to complete (you will see a progress indicator)
- The source appears with a "Processing" status
- Click Train Agent to process the file
- Wait for the status to change to "Ready"
Your agent can now answer questions using this file's content.
Processing Time#
- Small files (under 1MB): 30-60 seconds
- Medium files (1-10MB): 1-3 minutes
- Large files (10MB+): 5-10 minutes
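The ranges above can be turned into a small lookup, for example to set expectations when queuing several uploads. This is a sketch under two assumptions that the page does not state: 1 MB is taken as 1,048,576 bytes, and a file of exactly 10 MB is treated as medium. The function name estimated_processing_time is illustrative.

```python
MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def estimated_processing_time(size_bytes: int) -> str:
    """Map a file size to the rough processing-time range listed above."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes < 1 * MB:
        return "30-60 seconds"
    if size_bytes <= 10 * MB:  # assumption: exactly 10 MB counts as medium
        return "1-3 minutes"
    return "5-10 minutes"
```

The returned ranges are the estimates from this page, not guarantees.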
Best Practices#
File Preparation#
Do:
- Use clear, descriptive filenames
- Use searchable PDFs (created from Word, Google Docs, or similar)
- Organize content with headings and sections
- Remove unnecessary pages (covers, blank pages, appendices you do not need)
Don't:
- Upload scanned images (text cannot be extracted from image-only PDFs)
- Upload password-protected files
- Upload corrupted files
- Use special characters in filenames
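One way to follow the filename advice is to normalize names before uploading. The sketch below assumes that "special characters" means anything other than ASCII letters, digits, hyphens, and underscores; the page does not define the exact set, so treat that rule and the helper name safe_filename as assumptions.

```python
import re
from pathlib import Path


def safe_filename(filename: str) -> str:
    """Replace runs of characters outside A-Z, a-z, 0-9, '_' and '-' with one hyphen."""
    path = Path(filename)
    stem = re.sub(r"[^A-Za-z0-9_-]+", "-", path.stem).strip("-")
    # Fall back to a generic stem if nothing usable is left; lowercase the extension.
    return (stem or "file") + path.suffix.lower()
```

For example, safe_filename("Pricing Sheet (2026)!.XLSX") returns "Pricing-Sheet-2026.xlsx", while a name such as "product-manual-v2.pdf" is returned unchanged.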
PDF Best Practices#
Good PDFs:
- Text-based (created digitally, not scanned)
- Clear structure with headings
- Tables with selectable text
- You can highlight and copy text from the document
Problematic PDFs:
- Scanned documents (images, not text)
- Password-protected files
- Heavily image-based layouts
- Complex multi-column layouts
If you have a scanned PDF, use OCR software (such as Adobe Acrobat or an online OCR tool) to convert it to searchable text first.
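Two of the problems above, corruption and password protection, can sometimes be spotted before uploading with a crude byte-level check. This is a heuristic sketch using only the Python standard library, and pdf_quick_check is an illustrative name. It looks for the %PDF- header near the start of the file and for an /Encrypt entry anywhere in it. It cannot tell whether a PDF is scanned; for that, the simplest test remains trying to highlight and copy text in a PDF viewer.

```python
def pdf_quick_check(data: bytes) -> list[str]:
    """Return a list of likely problems found in the raw bytes of a PDF file."""
    problems = []
    # A PDF normally starts with "%PDF-"; many readers accept it within the first 1024 bytes.
    if b"%PDF-" not in data[:1024]:
        problems.append("missing %PDF- header (file may be corrupted or not a PDF)")
    # Encrypted PDFs reference an /Encrypt dictionary. This is a heuristic, not a full parse.
    if b"/Encrypt" in data:
        problems.append("contains an /Encrypt entry (file is probably password-protected)")
    return problems
```

To check a file on disk, pass its contents, for example pdf_quick_check(pathlib.Path("product-manual-v2.pdf").read_bytes()). An empty list means only that neither warning sign was found.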
Excel and CSV Best Practices#
Good spreadsheets:
- Clear column headers in the first row
- One table per sheet
- Simple formatting without merged cells
Problematic spreadsheets:
- Multiple unrelated tables on one sheet
- Heavily merged cells
- Charts and graphs (these will not be processed)
For complex Excel files, consider exporting relevant data to CSV or creating a Text source with the important data instead.
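For CSV exports, the header and layout advice can be checked automatically. The sketch below uses Python's csv module to flag empty or duplicate headers in the first row and rows whose column count differs from the header, which often indicates several tables in one file. The function name csv_problems and the sample data are made up for illustration.

```python
import csv
import io


def csv_problems(text: str) -> list[str]:
    """Return a list of layout problems found in CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    problems = []
    header = [cell.strip() for cell in rows[0]]
    if any(not cell for cell in header):
        problems.append("first row has empty column headers")
    if len(set(header)) != len(header):
        problems.append("first row has duplicate column headers")
    # Row numbers are 1-based, so the first data row is row 2. Blank rows are skipped.
    ragged = [i for i, row in enumerate(rows[1:], start=2)
              if row and len(row) != len(header)]
    if ragged:
        problems.append(f"rows with a different column count than the header: {ragged}")
    return problems


# Made-up sample data: a clean table produces no warnings.
clean = "Plan,Price\nPro,29\nEnterprise,99\n"
```

Calling csv_problems(clean) returns an empty list. To check a file, read it as text first, for example with open(path, newline="", encoding="utf-8").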
Word Document Best Practices#
Good Word documents:
- Clear headings (Heading 1, Heading 2, etc.)
- Bulleted or numbered lists
- Simple tables
- Primarily text content
Problematic Word documents:
- Complex formatting (text boxes, multi-column layouts)
- Embedded objects (videos, complex charts)
- Track changes enabled (may cause parsing issues)
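A .docx file is a ZIP archive, so unresolved tracked changes can often be detected before uploading. The sketch below searches word/document.xml for the w:ins and w:del revision elements that Word writes for tracked insertions and deletions. It is a heuristic that does not parse the XML, and it only detects recorded revisions in the main document body; has_tracked_changes is an illustrative name.

```python
import re
import zipfile

# Matches <w:ins ...> and <w:del ...> but not <w:insideH> or <w:delText>.
TRACKED_CHANGE = re.compile(rb"<w:(?:ins|del)[\s>/]")


def has_tracked_changes(docx_file) -> bool:
    """Return True if the main document body contains tracked insertions or deletions.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        body = archive.read("word/document.xml")
    return TRACKED_CHANGE.search(body) is not None
```

If the function returns True, accept or reject all changes in Word and save the document again before uploading.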
Updating File Sources#
Editing Settings#
- Find the source in the Files list
- Click Edit (pencil icon)
- Update the name, priority, or pricing flag
- Click Save
- Retrain your agent for changes to take effect
Replacing File Content#
- Delete or archive the old source
- Upload the new file as a new source
- Train your agent
Reprocessing a File#
If processing failed or you suspect issues:
- Find the source in the Files list
- Click Reprocess (refresh icon)
- Wait for processing to complete
- Check the status
Troubleshooting#
"Upload Failed"#
Common causes:
- File is too large
- Network connection was interrupted
- File format is not supported
- File is corrupted
Solutions:
- Check file size and compress if needed
- Try uploading again on a stable connection
- Convert to a supported format (e.g., export to PDF)
- Verify the file opens correctly on your computer
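The size-related causes can be ruled out with a local check before retrying. This is a minimal sketch: preflight is an illustrative name, and max_bytes is a value you supply yourself from your plan details, since this page does not state a fixed limit.

```python
from pathlib import Path


def preflight(path: str, max_bytes: int) -> list[str]:
    """Return a list of problems that would likely make an upload fail."""
    file = Path(path)
    if not file.is_file():
        return ["file not found"]
    size = file.stat().st_size
    problems = []
    if size == 0:
        problems.append("file is empty (possibly corrupted)")
    if size > max_bytes:
        problems.append(f"file is {size} bytes, over the {max_bytes}-byte limit")
    return problems
```

An empty result does not guarantee a successful upload; network interruptions and unsupported formats still need to be checked separately.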
"Processing Failed"#
Common causes:
- File is password-protected
- File is corrupted or malformed
- Text extraction failed (scanned PDF without OCR)
Solutions:
- Remove password protection before uploading
- Try re-exporting the file from its original application
- Convert scanned PDFs to searchable text using OCR
- Try converting the file to a plain, text-based PDF, or add the content as a Text source instead
"Agent Not Using File Content"#
Common causes:
- The agent was not retrained after the upload
- The file is still processing
- Questions are not related to the file content
- Priority is set too low
Solutions:
- Click the Train Agent button
- Wait for the source status to show "Ready"
- Test with questions that directly reference content in the file
- Increase the priority to "High" for important sources
Examples#
Example 1: Product Manual#
File: product-manual-v2.pdf (8.5MB, 120 pages)
Settings:
- Name: "Product Manual v2 (2026)"
- Priority: High
- Mark as Price Information: No
Example questions your agent can answer:
- "How do I install the product?"
- "What are the technical specifications?"
- "How do I troubleshoot error codes?"
Example 2: Price List#
File: pricing-2026.xlsx (250KB, 3 sheets)
Settings:
- Name: "2026 Pricing"
- Priority: High
- Mark as Price Information: Yes
Example questions your agent can answer:
- "How much does the Pro plan cost?"
- "What's included in the Enterprise tier?"
- "Do you offer discounts for annual plans?"
Example 3: Company Policies#
File: employee-handbook.docx (2.1MB, 45 pages)
Settings:
- Name: "Employee Handbook 2026"
- Priority: Normal
- Mark as Price Information: No
Example questions your agent can answer:
- "What's the vacation policy?"
- "What are the work-from-home guidelines?"
- "What benefits do employees get?"
Limits#
File size limits and the maximum number of sources vary by plan. Check your plan details in Settings for the specific limits that apply to your account.
Next Steps#
- Text Sources -- Add text directly without uploading a file
- Q&A Sources -- Create specific question-answer pairs
- Website Sources -- Crawl and index websites
- Training Your Agent -- Best practices for training