For many engineering teams looking to bring additional context to large language models (LLMs) to serve their business needs, there has long been a familiar sequence of techniques: first you try prompt engineering, and if that doesn’t work, you try retrieval-augmented generation (RAG), and then, if all else fails, you fine-tune your own model.
But as The New Stack contributor and Oracle senior application engineer Ibrahim Kamal argues, it’s a mistake to think of these as a linear progression.
“Instead, they represent different architectural methods for addressing different types of problems and introduce their own limitations and failure modes. Viewing them as a linear progression creates a false narrative that can lead to brittle systems that cannot adapt to changing requirements,” Kamal writes.
To decide which method will work best for a given use case and business need, Kamal proposes a decision tree with six dimensions.
Read the full piece to learn how to decide which architecture will work best for your use case.
Other stories we’re following today:
- Google is following Anthropic’s lead and building hooks into its Gemini CLI coding agent. Those hooks let developers execute pre-written scripts at specific moments in the agent loop, for example to run security scans, log tool usage, or add additional context for the model. TNS Senior Editor for AI Frederic Lardinois explains how to use this new feature.
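To make the pattern concrete, a hook script of this kind is typically just a small program that receives an event payload on stdin and does something with it. The sketch below is illustrative only, not Gemini CLI’s actual hook API: the payload field names and the `HOOK_LOG` variable are assumptions for the example.

```shell
# Illustrative tool-usage logging hook (a sketch, not Gemini CLI's real schema).
# The agent would invoke this at a hook point, passing event details on stdin.
log_tool_call() {
  payload=$(cat)  # assumed: a JSON payload describing the tool call arrives on stdin
  # Append a timestamped record to a log file (HOOK_LOG is a hypothetical override).
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$payload" >> "${HOOK_LOG:-/tmp/tool-usage.log}"
}
```

The same shape covers the other use cases mentioned: a security-scan hook would inspect the payload and exit nonzero to block the action, and a context hook would print extra text for the model instead of writing to a log.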