Implementing RAG with Amazon Bedrock Knowledge Bases
The Cost-Effective, Faster Alternative to LLM Fine-Tuning
Large language models (LLMs) have a key limitation: they rely solely on their pre-trained knowledge, which might be outdated or incomplete. Imagine an LLM trying to answer questions about a brand new product that you’ve been working on. Its knowledge could be years behind (if it is even aware of it in the first place).
This is where Retrieval-Augmented Generation (RAG) comes in.
Unlike fine-tuning, which requires modifying model parameters and retraining with domain-specific data, RAG allows models to dynamically fetch relevant information from external sources.
RAG is far more cost-effective and significantly less time-consuming. It is ideal for scenarios where data updates frequently. When it comes to building lightweight business applications, it is likely the better option.
Fine-tuning, on the other hand, is useful for highly specialised applications like medical diagnostics or legal document interpretation, where consistent responses based on structured internal knowledge are required.
While the underlying principles of RAG are straightforward, implementing it effectively can involve a few moving parts. This is where cloud platforms like Amazon Bedrock significantly simplify the process, particularly with their Knowledge Bases feature which enables RAG-based querying.
With Knowledge Bases, you can easily iterate from the AWS Console, adding more knowledge to a model and testing it out until you're happy with the results. When you’re ready, you can integrate everything into your application using the AWS SDK.
Setting Up a Knowledge Base in Amazon Bedrock
Step 1: Define Your Knowledge Base
Amazon Bedrock’s Knowledge Bases can ingest unstructured and semi-structured data, including documents in various formats like plain text, HTML, Markdown, and PDF.
Step 2: Prepare Your Data
The quality of data ingested directly impacts RAG performance. Typically with RAG, the content you upload as a data source should be as clean as possible. For example, you’d want to:
- Remove unnecessary metadata (such as YAML frontmatter that isn’t useful for querying).
- Standardise headings and formatting to ensure logical information retrieval.
- Chunk content strategically—divide long articles into smaller segments for better retrieval precision.
- Ensure consistency in terminology to improve keyword relevance.
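As a rough illustration of that kind of pre-processing, here is a minimal Python sketch that strips YAML frontmatter and normalises setext-style headings into `#` headings. The regexes are a simplification and assume reasonably well-formed Markdown; treat this as a starting point, not a complete cleaner.

```python
import re

def strip_frontmatter(markdown: str) -> str:
    """Remove a leading YAML frontmatter block (--- ... ---)."""
    return re.sub(r"\A---\n.*?\n---\n", "", markdown, flags=re.DOTALL)

def normalise_headings(markdown: str) -> str:
    """Convert setext-style headings (underlined with === or ---) to ATX '#' headings."""
    markdown = re.sub(r"(?m)^(.+)\n=+[ \t]*$", r"# \1", markdown)
    markdown = re.sub(r"(?m)^(.+)\n-{2,}[ \t]*$", r"## \1", markdown)
    return markdown

def clean_document(markdown: str) -> str:
    # Strip frontmatter first so its '---' delimiters aren't mistaken for headings.
    return normalise_headings(strip_frontmatter(markdown))
```

Running every document through a pass like this before uploading to S3 keeps the chunks that end up in your vector store free of noise that would otherwise dilute retrieval.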
While meticulously clean data is the gold standard for RAG, one real advantage of Bedrock Knowledge Bases for rapid prototyping is how tolerant it is of less-than-perfect data during initial setup.
Bedrock Knowledge Bases also performs standard chunking out of the box. You can control how the chunking is done, but you may not need to. If you're not happy with the initial results, you can dig a little deeper there and find the chunking strategy that works best for your particular data set.
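If you do want to override the default, chunking is configured when you register a data source. A sketch using boto3's `bedrock-agent` client is below; the knowledge base ID, data source name, and bucket ARN are placeholders, so check the parameters against the current API reference before relying on them.

```python
def fixed_size_chunking(max_tokens: int = 300, overlap_percentage: int = 20) -> dict:
    """Build the vectorIngestionConfiguration payload for fixed-size chunking."""
    return {
        "chunkingConfiguration": {
            "chunkingStrategy": "FIXED_SIZE",
            "fixedSizeChunkingConfiguration": {
                "maxTokens": max_tokens,
                "overlapPercentage": overlap_percentage,
            },
        }
    }

def create_article_data_source(knowledge_base_id: str, bucket_arn: str) -> dict:
    """Register an S3 bucket as a data source with a custom chunking strategy.

    The knowledge base ID and bucket ARN are illustrative placeholders.
    """
    import boto3  # AWS SDK for Python

    client = boto3.client("bedrock-agent")
    return client.create_data_source(
        knowledgeBaseId=knowledge_base_id,
        name="my-articles",  # hypothetical data source name
        dataSourceConfiguration={
            "type": "S3",
            "s3Configuration": {"bucketArn": bucket_arn},
        },
        vectorIngestionConfiguration=fixed_size_chunking(),
    )
```

The same configuration can be set from the console; the SDK route just makes it repeatable once you've settled on values.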
Step 3: Index Your Knowledge Base
Once your documents are cleaned and structured, they need to be indexed for efficient retrieval. This step converts textual data into numerical representations (embeddings) stored in a vector database, allowing Amazon Bedrock to identify and fetch relevant information.
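To give a feel for what "numerical representations" means in practice: each chunk is embedded as a vector, and retrieval boils down to finding the stored vectors closest to the query's vector, typically by cosine similarity. A toy illustration of that comparison, with made-up three-dimensional "embeddings" (real ones have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity of two vectors: 1.0 = same direction, ~0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Tiny invented vectors, purely for illustration.
query = [0.9, 0.1, 0.0]
chunk_about_pricing = [0.8, 0.2, 0.1]
chunk_about_hiring = [0.0, 0.1, 0.9]

# The pricing chunk points in nearly the same direction as the query,
# so it scores higher and would be retrieved first.
scores = {
    "pricing": cosine_similarity(query, chunk_about_pricing),
    "hiring": cosine_similarity(query, chunk_about_hiring),
}
```

Bedrock handles all of this internally; the point is simply that retrieval quality depends on how well the embedding model places related text near each other in vector space.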
What You Need:
- An embedding model: Amazon Titan Embeddings is the default choice; other Bedrock-supported embedding models may be available depending on your region, and custom embedding models can be integrated via advanced configurations.
- A vector database: Amazon OpenSearch Serverless is the easiest solution, but Aurora PostgreSQL is a solid option too, especially if you’re already familiar with Postgres.
- Chunking strategy: Optimal chunk sizes (e.g., 200–500 words) for better retrieval. Again, don't worry about this when you're first experimenting with a new data set.
Traditionally, setting up these components can be a significant time investment when building RAG from scratch. However, Bedrock's Knowledge Bases streamline this entire process, allowing you to get these essential pieces in place with just a few clicks.
Step 4: Connect Your Knowledge Base to Amazon Bedrock
Now that your knowledge base is ready to go, you need to connect it to Amazon Bedrock so it can power your RAG-based queries.
How to Connect:
Configure your Knowledge Base: This involves pointing Bedrock to your S3 bucket (your primary data source) and selecting a vector database (like Amazon OpenSearch Serverless or Aurora PostgreSQL with pgvector) to store the embeddings. Bedrock then automatically handles the indexing and establishes the retrieval pipeline, which converts your queries into vectors and uses them to find relevant information.
Set up access control: Use AWS IAM roles to make sure your data access is secure and controlled.
Test it out: Run some sample queries to confirm that Bedrock is pulling the right information.
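Sample queries can also be run programmatically via the `bedrock-agent-runtime` client's retrieval-only `retrieve` API, which is handy for spot-checking which chunks come back before you involve a foundation model. A sketch, with the knowledge base ID as a placeholder:

```python
def top_snippets(response: dict, limit: int = 3) -> list[str]:
    """Pull the matched text chunks out of a retrieve() response."""
    return [r["content"]["text"] for r in response["retrievalResults"][:limit]]

def query_knowledge_base(knowledge_base_id: str, query: str, max_results: int = 3) -> list[str]:
    """Run a retrieval-only query and return the top matching chunks."""
    import boto3  # AWS SDK for Python

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": max_results}
        },
    )
    return top_snippets(response, limit=max_results)
```

If the snippets coming back look wrong, that points to a data or chunking problem rather than a model problem, which is exactly the distinction you want at this stage.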
Step 5: Query Your Knowledge Base
Once everything is connected, Bedrock makes it really easy to try out the new knowledge.

What’s fun is that you can quickly and easily switch between foundation models.
Once you’re done playing with it, you can click the copy icon at the top right of the Configurations section. This conveniently places the JSON configuration for your setup on your clipboard, letting you easily transfer your experimental settings.
You can then pass that JSON when hitting the service using the AWS SDK from your application.
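In Python that hand-off might look like the following. The exact shape of the copied JSON can vary, so this sketch assumes it carries the knowledge base ID and model ARN (both shown as placeholders):

```python
def build_rag_request(kb_config: dict, question: str) -> dict:
    """Assemble retrieve_and_generate parameters from a console-copied config.

    kb_config is assumed to look something like:
    {"knowledgeBaseId": "KB1234567890",
     "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/..."}
    """
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": kb_config,
        },
    }

def ask(kb_config: dict, question: str) -> str:
    """Send a RAG query and return the generated answer text."""
    import boto3  # AWS SDK for Python

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(**build_rag_request(kb_config, question))
    return response["output"]["text"]
```

Because the configuration travels as plain JSON, the settings you tuned in the console carry over to your application unchanged.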
***
By following these steps, you can build an efficient RAG pipeline in Amazon Bedrock without costly model fine-tuning. The quality of the AI-generated responses will vary based on how well-structured the data is.
Thinking about your own AI journey?
I've been working closely with clients to help them navigate their AI strategy, especially when it comes to crucial areas like safety, security, and copyright protection. It's all about empowering your team to leverage AI internally without the risks of exposing sensitive information or compromising company secrets and IP.
If you're looking to integrate AI responsibly within your organisation, I'd love to chat and see how I can help. Shoot me an email to find out more!