Vectorize Newsletter 2025-01-13

Jamie Ferguson • January 13, 2025

Welcome to the first-ever Vectorize newsletter! We’ll be sharing our latest features, technical insights, and community highlights to help you build better AI systems.

✨ What’s New

Source connectors are used to ingest unstructured data into your retrieval-augmented generation (RAG) pipeline. We’ve added a number of new connectors to Vectorize over the past few months – here’s a quick overview of what we support today.

For cloud storage, we have connectors for S3, Azure Blob Storage, and Google Cloud Storage. These are great at efficiently handling large document collections.

Need to pull from workspace tools? We support Confluence, Dropbox, Google Drive, OneDrive, and SharePoint.

Our Discord and Intercom connectors let you tap into valuable support conversations and user discussions.

Use Firecrawl or our Web Crawler connector to incorporate external knowledge bases and documentation.

And for local files or testing, there’s a simple File Upload connector.

You can use multiple connectors in each RAG pipeline. For example, you could combine web-crawled documentation with Discord discussions and Intercom tickets, giving an AI assistant both the official docs and real-world usage to draw on.
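To make the multi-connector idea concrete, here is a minimal Python sketch of how such a pipeline could be modeled. This is not the Vectorize API: the `SourceConnector` and `RagPipeline` classes, the connector names, and the sample text are all hypothetical, and each connector simply returns canned strings instead of calling a real service.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    source: str  # name of the connector that produced this text
    text: str


@dataclass
class SourceConnector:
    """Hypothetical stand-in for a source connector (not a Vectorize class)."""
    kind: str
    items: list[str] = field(default_factory=list)

    def fetch(self) -> list[Document]:
        # A real connector would call an external service here.
        return [Document(source=self.kind, text=t) for t in self.items]


@dataclass
class RagPipeline:
    """Hypothetical pipeline that merges documents from several connectors."""
    connectors: list[SourceConnector]

    def ingest(self) -> list[Document]:
        docs: list[Document] = []
        for connector in self.connectors:
            docs.extend(connector.fetch())
        return docs


pipeline = RagPipeline(connectors=[
    SourceConnector("web_crawler", ["Official docs: how to configure retries."]),
    SourceConnector("discord", ["User tip: retries helped with flaky uploads."]),
    SourceConnector("intercom", ["Ticket: upload failed until retries were enabled."]),
])

docs = pipeline.ingest()
sources = [d.source for d in docs]
print(sources)
```

Tagging each document with its source is a deliberate choice in this sketch: it lets a downstream assistant tell official documentation apart from community discussion when it cites an answer.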

We’ve also created a collection of how-to guides that walk you through related tasks like how to create a Discord bot or a Confluence API token.

We’re constantly adding more connectors. If there’s a source you need that’s not listed here, let us know!

💡 AI Engineering Insights

We’re publishing an in-depth series on building AI agent systems that actually work in production – from core architecture fundamentals to the real challenges you’ll face as your system grows:

– Part 1: Agent Architectures – How to build across tool, reasoning, and action layers while avoiding common pitfalls that can cascade into production issues

– Part 2: Modularity – Breaking monolithic agents into specialized sub-agents that handle specific domains like returns, orders, or product information

– Part 3: Agent to Agent Interactions – Creating standardized interfaces for clean agent communication and easier troubleshooting

– Part 4: Data Retrieval and Agentic RAG – Moving beyond basic RAG to connect with databases, APIs, and vector stores while maintaining speed and accuracy

– Coming Soon – Part 5: Cross-Cutting Concerns – The often-overlooked but critical elements: monitoring, security, governance, and ethics
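The modularity and standardized-interface ideas from Parts 2 and 3 can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not code from the series: the `AgentRequest`, `AgentResponse`, `SubAgent`, and `Router` names are hypothetical, and the sub-agents return fixed strings where a real system would call a model or a backend.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class AgentRequest:
    domain: str   # e.g. "returns" or "orders"
    message: str


@dataclass(frozen=True)
class AgentResponse:
    agent: str    # which component produced the answer
    ok: bool
    answer: str


class SubAgent(Protocol):
    """Shared interface every sub-agent implements (hypothetical)."""
    name: str

    def handle(self, request: AgentRequest) -> AgentResponse: ...


class ReturnsAgent:
    name = "returns"

    def handle(self, request: AgentRequest) -> AgentResponse:
        return AgentResponse(self.name, True, f"Starting a return for: {request.message}")


class OrdersAgent:
    name = "orders"

    def handle(self, request: AgentRequest) -> AgentResponse:
        return AgentResponse(self.name, True, f"Looking up order: {request.message}")


class Router:
    """Dispatches each request to the sub-agent registered for its domain."""

    def __init__(self, agents: list[SubAgent]) -> None:
        self._agents = {agent.name: agent for agent in agents}

    def dispatch(self, request: AgentRequest) -> AgentResponse:
        agent = self._agents.get(request.domain)
        if agent is None:
            # An explicit failure response keeps troubleshooting simple.
            return AgentResponse("router", False, f"No agent registered for domain '{request.domain}'")
        return agent.handle(request)


router = Router([ReturnsAgent(), OrdersAgent()])
handled = router.dispatch(AgentRequest("returns", "blue jacket"))
unhandled = router.dispatch(AgentRequest("billing", "refund status"))
print(handled)
print(unhandled)
```

Because every sub-agent accepts and returns the same two types, adding a product-information agent would mean writing one more class and registering it, with no changes to the router.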

📚 Resource Corner

Check out our latest video: Building a Web Scraping Chatbot – Transform any website into a production-ready conversational AI. This tutorial shows you how to build and deploy a RAG-powered chatbot that actually understands your content.

⚡ Coming Up

We’re adding support for Weaviate, an open-source vector database known for its flexibility and built-in ML capabilities. This integration will let you create RAG pipelines that automatically keep your embeddings fresh as your data evolves – ensuring your AI always works with up-to-date information.
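One common way to keep embeddings in sync with changing data is to compare content hashes and re-embed only what changed. The Python sketch below illustrates that general technique; it is not a description of how Vectorize or Weaviate implement it, and the `content_hash` and `plan_refresh` helpers and the document IDs are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_refresh(
    current_docs: dict[str, str],
    indexed_hashes: dict[str, str],
) -> tuple[list[str], list[str]]:
    """Return (IDs to embed or re-embed, IDs to delete from the index)."""
    to_embed = sorted(
        doc_id
        for doc_id, text in current_docs.items()
        if indexed_hashes.get(doc_id) != content_hash(text)
    )
    to_delete = sorted(
        doc_id for doc_id in indexed_hashes if doc_id not in current_docs
    )
    return to_embed, to_delete


# Hashes recorded the last time the index was built.
indexed = {
    "a": content_hash("old text for a"),
    "b": content_hash("unchanged text for b"),
    "c": content_hash("text for a document that was later removed"),
}

# What the source contains now: "a" changed, "b" is the same, "d" is new.
current = {
    "a": "new text for a",
    "b": "unchanged text for b",
    "d": "text for a brand-new document",
}

to_embed, to_delete = plan_refresh(current, indexed)
print(to_embed, to_delete)
```

Hashing avoids paying for embeddings on documents that have not changed, which matters once a pipeline covers large collections.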

We’ve created Vectorize Iris, a fine-tuned vision model that transforms how RAG systems handle PDFs and other complex documents. We recently gave FamilyCloud.AI early access to Iris, and a perfect use case presented itself almost immediately. As FamilyCloud.AI founder Venkat Lakshmi describes:

“It’s unbelievable what we can do now. One of our users sent a photo of a physical document to our WhatsApp Bot, and the contents were extracted. Normally that would’ve been the end of the story. But Iris also extracted a link to a Google form, and our chatbot was able to provide that link to the user when queried – turning a static document into an interactive resource.”

📬 Stay Connected

If you have any questions, email us at contact@vectorize.io or message us on Discord.

Thanks,

The Vectorize Team