Divergent, Not Convergent, RAG

September 7, 2024 promptgramming RAG

Last year I showed a friend some RAG stuff I was hacking around with. Earlier this year, he asked if I could build him an app, based on those ideas, that he could use to (finally) plow through years (decades, probably) of disorganized notes, writings, half-finished book outlines, and more.

I said heck yeah.

One of the things that was intuitively obvious to both of us is that the retrieval phase of the workflow should be separated from the LLM conversation phase. I’m not sure I’ve ever seen another RAG app architected this way. They all seem to be driven by a “Knowledgebase Chatbot” design goal.

Even Quivr, which positions itself as an AI-augmented “brain”, seems to hew to this design goal.

The Knowledgebase Chatbot assumes you’re looking for an answer: the job of the RAG phase is to find relevant context, and the job of the LLM is to massage that context into an answer. That’s a totally worthwhile goal, but it’s just a fraction of what a RAG + LLM system can do. In terms of flow, the Knowledgebase Chatbot will:

1. Take your question/prompt (USER TOUCHPOINT)
2. Maybe re-work the question/prompt to be retrieval-optimized
3. Conduct a vector or hybrid search on the knowledgestore, generally with a fixed top-k and other parameters similarly fixed or globally configured
4. Combine the retrieved content with a global or predetermined prompt and send to the LLM
5. Show the LLM response to the user and let the user continue the conversation (USER TOUCHPOINT)
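
To make that flow concrete, here is a minimal sketch of one Knowledgebase Chatbot turn. Everything in it is an assumption for illustration: `TOP_K`, `PROMPT_TEMPLATE`, `rewrite_query`, `search`, and `chatbot_turn` are hypothetical names, the "search" is a toy word-overlap ranking standing in for a real vector or hybrid search, and `llm` is any callable you pass in. It is not taken from Quivr or any other product.

```python
from typing import Callable, List

# Hypothetical global settings: fixed for every query, as in the flow above.
TOP_K = 2
PROMPT_TEMPLATE = "Answer using only this context:\n{context}\n\nQuestion: {question}"


def rewrite_query(question: str) -> str:
    """Stand-in for query rewriting; a real system might ask an LLM to rephrase."""
    return question.lower().strip("?! .")


def search(query: str, store: List[str], top_k: int) -> List[str]:
    """Toy word-overlap ranking; an assumption standing in for vector/hybrid search."""
    terms = set(query.split())
    ranked = sorted(
        store,
        key=lambda doc: len(terms & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def chatbot_turn(question: str, store: List[str], llm: Callable[[str], str]) -> str:
    """One turn: the user only touches the input and the final response."""
    query = rewrite_query(question)        # re-work the question
    chunks = search(query, store, TOP_K)   # retrieve with a fixed top-k
    prompt = PROMPT_TEMPLATE.format(context="\n".join(chunks), question=question)
    return llm(prompt)                     # response goes back to the user
```

Notice that nothing between the question going in and the answer coming out is visible to the user, let alone adjustable.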

This model is driven by the idea that we’re trying to converge on an answer. And maybe from that point we’ll want to explore deeper, diverge, explore other directions, etc. But I think a good RAG system offers a lot of value if it affords tweaking and user interaction in steps 2 through 4 of the process above.
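
As a rough illustration of what "tweaking in the middle" could look like, here is a minimal sketch that keeps retrieval state in an object the user can edit before any LLM call happens. This is not the app described in this post; `RetrievalSession`, `drop`, and `converse` are hypothetical names, the ranking is a toy word-overlap score rather than real vector search, and `llm` is any callable you supply.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RetrievalSession:
    """Hypothetical holder for retrieval state the user can edit at will."""
    store: List[str]
    query: str = ""
    top_k: int = 3
    selected: List[str] = field(default_factory=list)

    def search(self) -> List[str]:
        # Toy word-overlap ranking; an assumption standing in for vector/hybrid search.
        terms = set(self.query.lower().split())
        ranked = sorted(
            self.store,
            key=lambda doc: len(terms & set(doc.lower().split())),
            reverse=True,
        )
        self.selected = ranked[: self.top_k]
        return self.selected

    def drop(self, chunk: str) -> None:
        # User touchpoint: discard a result that does not belong.
        self.selected = [c for c in self.selected if c != chunk]


def converse(session: RetrievalSession, prompt_template: str,
             llm: Callable[[str], str]) -> str:
    # Separate phase: runs only once the user is happy with the selection,
    # and the prompt is chosen per call rather than fixed globally.
    return llm(prompt_template.format(context="\n".join(session.selected)))
```

A session might go: set `query`, call `search()`, bump `top_k` and search again, `drop()` a couple of chunks, and only then call `converse()` with whatever prompt suits the moment, say "Find connections between:\n{context}".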

My friend is brilliant, and has explored a ton of topics across multiple domains. Fox, not hedgehog. The corpus of everything he’s committed to writing has tons of value, but no answers, and definitely no singular answer. So he needs a different RAG + LLM model.

I’m at the library and not able to screenshot what I’m building for him, so that’ll come in a separate post.