> ## Documentation Index
> Fetch the complete documentation index at: https://docs.oxen.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# 🧠 RAG: Extracting Answers

> **Retrieval-Augmented Generation (RAG)** is the process of retrieving information that the AI does not have in its training data to answer a question. RAG is usually used to answer customer service questions, analyze legal documents, answer internal business questions, and much more.

This is part 2 of a 2 step tutorial. Here you'll learn how to use RAG to extract answers from a dataset assuming search is already done. For the first step of embedding your data, check out [🔢 RAG Generating Embeddings](/use-cases/rag-embeddings).

A couple of examples would be:

> If you're a student and want to reference a textbook for any questions you have to your AI, you would use RAG to ensure it answers based on the textbook.
>
> If a business wants to know "What was the total amount of the invoice?" the LLM would access the invoice, comb through it to get the total amount, and answer with the correct number.

Evaluating how well your model can extract answers is an important part of building a robust pipeline. You may want to tweak your prompt, evaluate different models, and continuously add new data to your dataset to improve the quality of your results.

## Upload Your Dataset

Open the dataset you want to work with. You can find an example dataset on our [explore page](https://www.oxen.ai/explore) or if you want to follow along with the example, you can clone the [RAG Answer Extraction dataset](https://www.oxen.ai/ox/RAG-Answer-Extraction) we are using. We're going to use a subset of the 50,000 rows available (that would take foreverrr) so if you want to do the same go to the branch 'train\_data\_subset'.

```bash theme={null}
oxen clone https://hub.oxen.ai/ox/RAG-Answer-Extraction --all
export OXEN_USERNAME="your-username" # NOTE: Replace with your username
oxen create-remote --name "$OXEN_USERNAME/RAG-Answer-Extraction"
oxen config --set-remote origin "$OXEN_USERNAME/RAG-Answer-Extraction"
oxen push origin main
```

<img className="block" src="https://mintcdn.com/oxenai/W8EBryVYRkX4WOUa/images/rag_dataset.png?fit=max&auto=format&n=W8EBryVYRkX4WOUa&q=85&s=41a6674590df78abe0e4a9428daf4ad7" alt="Oxen.ai RAG Repo" width="2880" height="1734" data-path="images/rag_dataset.png" />

## Create a Model Evaluation

Open the file you want to use and press the glowing button with the rocket 🚀 on it at the top right of the screen.

<img className="block" src="https://mintcdn.com/oxenai/W8EBryVYRkX4WOUa/images/rmod_button.png?fit=max&auto=format&n=W8EBryVYRkX4WOUa&q=85&s=6555cb2650a84bdc133996caa90cfa0f" alt="Where to find model evaluations" width="2880" height="1736" data-path="images/rmod_button.png" />

## Setting Up The Evaluation

You will now find Oxen's model evaluation feature. This is where you can choose a model, set up a prompt, and choose the output column.

In this case, we are using OpenAI's [o1-Mini](https://www.oxen.ai/explore/models#openai) and passing in the question and data related to the question in the prompt:

```
Answer the following question only using facts from the facts given after the question.
Keep your answer grounded in the facts given.
If no facts given after the question, return 'None'.

Question:
{query}

Facts:
{context}
```

After selecting your model, give your evaluation a name, fill in your prompt with the values you want to pass to the model and decide if you want to run a quick sample on a few rows or click "Next" to finalize the answer extracting preparations:

<img className="block" src="https://mintcdn.com/oxenai/W8EBryVYRkX4WOUa/images/rag_sample_responses.png?fit=max&auto=format&n=W8EBryVYRkX4WOUa&q=85&s=d3568d722f65f1f97517de1642c0d4ea" alt="RMOD prompt example" width="2880" height="1734" data-path="images/rag_sample_responses.png" />

## Select Your Destination

After clicking "Next" once your sample has been completed, you will see a commiting page. Here you will decide the target branch, target path, and if you would like to commit instantly or after reviewing the analysis.
In this case, a new branch called `o1-mini_tests` is being created and storing the changes. Once you've decided, click "Run Evaluation".

<img className="block" src="https://mintcdn.com/oxenai/W8EBryVYRkX4WOUa/images/rag_commit_page.png?fit=max&auto=format&n=W8EBryVYRkX4WOUa&q=85&s=b75a46e516a170de12df68d5d27228ad" alt="RAG commit page" width="2880" height="1734" data-path="images/rag_commit_page.png" />

## Monitor Your Evaluation

Feel free to grab a coffee, close the tab, or do something else while the evaluation is running. Your trusty Oxen Herd will be running in the background.

While the evaluation is running you will see a progress bar showing how many rows have been completed, an update of how many tokens are being used, and how expensive the run is so far.

<img className="block" src="https://mintcdn.com/oxenai/W8EBryVYRkX4WOUa/images/rag_status.png?fit=max&auto=format&n=W8EBryVYRkX4WOUa&q=85&s=0a684ab4c3ee443b3cad9039a52a09a6" alt="RMOD processing bar" width="2880" height="1734" data-path="images/rag_status.png" />

## Next Steps

Once done, you will see your new dataset committed to the branch you specified. If you don't like the results, don't worry! Under the hood, all the runs are versioned so you can always revert to or compare to a previous version.

<img className="block" src="https://mintcdn.com/oxenai/W8EBryVYRkX4WOUa/images/rag_final.png?fit=max&auto=format&n=W8EBryVYRkX4WOUa&q=85&s=a7edd5bb14f6288d68e1fe65e44f2931" alt="RMOD completed" width="2880" height="1734" data-path="images/rag_final.png" />

You can then search through the outcomes immediately with Oxen's Text2SQL feature. In this case, we are finding all the rows where the answer and prediction columns have different results to check o1-mini's work.

```
give me all the rows where the answer column and prediction columns are different

```

<img className="block" src="https://mintcdn.com/oxenai/s_o9ZlhOEkYJf27_/images/RAG_t2s.png?fit=max&auto=format&n=s_o9ZlhOEkYJf27_&q=85&s=50356e5ee474f76e5bd9748b21f2da05" alt="Text2SQL example" width="2880" height="1734" data-path="images/RAG_t2s.png" />

You can also run queries such as *"Give me all the rows with invoice in the query column?"* or *"How many rows have May 3rd in the answer column?"*

Congratulations! You've just seen how easy it is to use RAG on your data. Feel free to tweak your prompt, model, or dataset and see how the results change.
