<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Bartek Sadlej, Author at TantusData</title>
	<atom:link href="https://tantusdata.com/author/bartosz/feed/" rel="self" type="application/rss+xml" />
	<link>https://tantusdata.com</link>
	<description>That uncovers wisdom.</description>
	<lastBuildDate>Tue, 25 Mar 2025 11:40:06 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.7.1</generator>

<image>
	<url>https://tantusdata.com/app/uploads/2023/01/cropped-Favicon-32x32.png</url>
	<title>Bartek Sadlej, Author at TantusData</title>
	<link>https://tantusdata.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>What you need to know before deploying Open Source LLM</title>
		<link>https://tantusdata.com/insights/what-to-know-before-deploying-open-source-llm/</link>
		
		<dc:creator><![CDATA[Bartek Sadlej]]></dc:creator>
		<pubDate>Sat, 19 Oct 2024 13:57:27 +0000</pubDate>
				<category><![CDATA[Deployment Strategies for LLMs]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[LLM Benchmarks Understanding]]></category>
		<category><![CDATA[LLM Licensing]]></category>
		<category><![CDATA[LLM Performance Trade-offs]]></category>
		<category><![CDATA[Open Source LLM Deployment]]></category>
		<guid isPermaLink="false">https://tantusdata.com/?post_type=insights&#038;p=1925</guid>

					<description><![CDATA[<p>Navigating the complexities of deploying open-source Large Language Models (LLMs) can be daunting. From understanding licensing restrictions and making crucial decisions about accuracy, speed, and cost trade-offs, to comprehending benchmark evaluations and exploring deployment strategies, this guide provides essential insights for leveraging open-source LLMs effectively in your projects.</p>
<p>The post <a href="https://tantusdata.com/insights/what-to-know-before-deploying-open-source-llm/">What you need to know before deploying Open Source LLM</a> appeared first on <a href="https://tantusdata.com">TantusData</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="1024" height="585" src="https://tantusdata.com/app/uploads/2024/03/What-you-need-to-know-before-deploying-Open-Source-LLM-1024x585.jpg" alt="" class="wp-image-2174" srcset="https://tantusdata.com/app/uploads/2024/03/What-you-need-to-know-before-deploying-Open-Source-LLM-1024x585.jpg 1024w, https://tantusdata.com/app/uploads/2024/03/What-you-need-to-know-before-deploying-Open-Source-LLM-300x171.jpg 300w, https://tantusdata.com/app/uploads/2024/03/What-you-need-to-know-before-deploying-Open-Source-LLM-768x439.jpg 768w, https://tantusdata.com/app/uploads/2024/03/What-you-need-to-know-before-deploying-Open-Source-LLM-1536x878.jpg 1536w, https://tantusdata.com/app/uploads/2024/03/What-you-need-to-know-before-deploying-Open-Source-LLM.jpg 1792w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>There are a few key questions which need to be thoroughly understood and answered before selecting a large language model to be used for building an application:</p>



<ul class="wp-block-list">
<li>License &#8211; because you don’t want to end up in a legal trap</li>



<li>Expectations: accuracy, speed and cost tradeoffs</li>



<li>Understanding of benchmarks the model was evaluated on &#8211; so you don’t get surprised when evaluating the model with your users on your data</li>



<li>Deployment options &#8211; because building a PoC you run on your laptop is often far from production deployment.</li>
</ul>



<h2 class="wp-block-heading">License</h2>



<p>This sounds easy; open source is open, as the name suggests. Well, not exactly. Ensure that the model you choose can be used as you want. For example, there is a statement in the Llama-2 license:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p><em>v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof)</em>.</p>
</blockquote>



<p>This means that if you start with Llama-2 or its fine-tuned successors and later decide to switch to a different model, you are not allowed to use your historical data to train the new LLM. Or are you? If you were to modify one line of the model code, is it still the original <code>Llama Material</code>? In general, licensing and regulation around AI and software products are hard to assess and interpret, so it is probably safest to look first for a model with an Apache license, or an even more permissive one, if possible.</p>



<h2 class="wp-block-heading">Define your expectation: accuracy, speed and cost tradeoffs.</h2>



<p>It is tempting to dream big, especially for non-technical people who have seen the recent OpenAI Dev Day with the announcements of GPTs, Assistants and Google Gemini &amp; Lumiere models. But in reality, meeting excessive expectations is challenging and often impossible. Going from 0% to 90% AI automation is difficult but doable; closing the gap between 90% and 100% is exceptionally demanding.</p>



<p>Think about GitHub Copilot. It won&#8217;t write a project for you, but its blazingly fast few-line completions, which usually require little adjusting, make engineers far more productive.&nbsp;</p>



<p>Ask the question: does my model need to figure out the nitty-gritty details? Or can it leave some room for users to interact and fill in the gaps, while still being a valuable product?</p>



<p>Maybe some parts of the pipeline can be postponed and implemented as batch jobs? The cost reductions might be significant in this case, since closed-model providers don&#8217;t offer lower prices for batch requests, whereas an in-house LLM lets you process tasks offline in batches.</p>



<p>The recently released small open-source LLMs, such as Llama-2 and Mistral, or their further fine-tuned versions, like Zephyr and OpenHermes-2.5, are a perfect match for such a scenario. If you cannot compromise on accuracy, maybe there is a way to algorithmically fix weak spots.&nbsp;</p>



<p>On the other hand, it might be valuable to provide users with a few different model outputs or allow them to iterate and guide the suggestions quickly, as GitHub Copilot does. GPT-4 is powerful, but calling it several times can take minutes. Smaller models make such interactions feasible.&nbsp;<a href="https://huggingface.co/blog/optimum-nvidia" target="_blank" rel="noreferrer noopener">Recent features</a>&nbsp;from Hugging Face and Nvidia can run Llama-v2-13b at an unbelievable speed of 1200 tokens per second.&nbsp;</p>



<h2 class="wp-block-heading">Understand the benchmarks the model was evaluated on</h2>



<p>When choosing the model, you will probably focus on its size, performance, and &#8216;the vibe&#8217; &#8211; whether the model&#8217;s responses generally feel good. The performance is most often checked using the results of well-known benchmarks.</p>



<p>What are the weak spots of this approach?</p>



<p>First, the ML labs releasing a model do not always publish the training data, or even precise information about what the model was trained on. Often we only learn that &#8216;the model was trained on a well-curated corpus of X tokens&#8217;. And because those benchmarks are so popular, some of their content may leak into the training set. Not necessarily the whole corpus, but, for example, an automatic web crawl can contain conversations from Reddit or X/Twitter feeds where people discuss parts of the benchmark.</p>



<p>Secondly, keep in mind that, in general, it is hard to benchmark written text automatically.&nbsp;&nbsp;</p>



<p>To see why, it is crucial to understand how each of those benchmarks is created and what it measures.&nbsp;</p>



<p>Let&#8217;s see an example question from one of the most popular ones, the MMLU (<a href="https://arxiv.org/abs/2009.03300" target="_blank" rel="noreferrer noopener">Massive Multitask Language Understanding</a>):</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Question: Glucose is transported into the muscle cell:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Choices:<br>A. via protein transporters called GLUT4.<br>B. only in the presence of insulin.<br>C. via hexokinase.<br>D. via monocarboxylic acid transporters.</p>
</blockquote>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Correct answer: A</p>
</blockquote>
</blockquote>



<p>And let’s ask ChatGPT:</p>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="676" src="https://tantusdata.com/app/uploads/2024/07/glucose_q-1024x676.jpg" alt="" class="wp-image-2132" srcset="https://tantusdata.com/app/uploads/2024/07/glucose_q-1024x676.jpg 1024w, https://tantusdata.com/app/uploads/2024/07/glucose_q-300x198.jpg 300w, https://tantusdata.com/app/uploads/2024/07/glucose_q-768x507.jpg 768w, https://tantusdata.com/app/uploads/2024/07/glucose_q-1536x1014.jpg 1536w, https://tantusdata.com/app/uploads/2024/07/glucose_q-2048x1352.jpg 2048w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Good? The expected answer is just &#8220;A&#8221;, so even though the model gets it right, a naive automatic evaluation of its free-text reply would score it as a failure!</p>



<p>We won&#8217;t do a deep dive into how the evaluation is actually done; there is an excellent&nbsp;<a target="_blank" href="https://huggingface.co/blog/evaluating-mmlu-leaderboard" rel="noreferrer noopener">blog post</a>&nbsp;on Hugging Face explaining it in detail. You should just know that it relies on the raw probabilities of the next tokens and uses the model through code, which differs from how you would interact with it through chat in a web UI.</p>



<p>So, the key takeaway is that while those benchmarks provide us with a general ranking of models&#8217; performance, one should pay close attention to how they are evaluated and whether this form of evaluation is meaningful for their use case.&nbsp;</p>



<p>For example, ChatGPT is a so-called instruction-finetuned model, tuned to follow user instructions and interact with users. If you send it a prompt like:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Can you help me with that:<br>{arbitrary problem description}</p>
</blockquote>



<p>It will very likely start its response with:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Certainly! {probably a good solution to your problem}</p>
</blockquote>



<p>And if you were to check the token probabilities for options A, B, C, and D from the above-mentioned MMLU example, as is done in one implementation of MMLU, you might get C! But not because the model thinks the completion for the</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Correct answer</p>
</blockquote>



<p>is C, but because it wants to start with &#8220;Certainly!&#8221;</p>
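<p>The effect can be illustrated with a toy sketch of the scoring step. It assumes that the evaluator compares only the next-token scores of the four answer letters. The logit values, the boost on &#8220;C&#8221; and the function names are made up for illustration and do not come from any real model or benchmark harness.</p>

```python
import math


def choice_probabilities(next_token_logits, choices=("A", "B", "C", "D")):
    """Softmax restricted to the answer letters, as a letter-only scorer would do."""
    top = max(next_token_logits[c] for c in choices)
    exps = {c: math.exp(next_token_logits[c] - top) for c in choices}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def pick_choice(next_token_logits):
    """Return the letter with the highest restricted probability."""
    probs = choice_probabilities(next_token_logits)
    return max(probs, key=probs.get)


# Hypothetical next-token logits after a prompt ending in "Correct answer:".
base_model = {"A": 3.0, "B": 1.0, "C": 0.5, "D": 0.0}
# Same knowledge, plus an assumed boost on "C" from the habit of opening with "Certainly!".
chat_model = {"A": 3.0, "B": 1.0, "C": 4.5, "D": 0.0}

print(pick_choice(base_model))  # A
print(pick_choice(chat_model))  # C
```

<p>The scorer never sees a full answer; it only ranks four token scores, so a stylistic habit of the model can outweigh what it actually knows.</p>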



<h2 class="wp-block-heading"><strong>Deployment options</strong></h2>



<p>Last but not least, let&#8217;s talk about inference. When you have chosen and maybe even fine-tuned your model further, it&#8217;s time to answer the question of what exactly you want to deploy and where.</p>



<p>The &#8216;what exactly&#8217; part is essential. To start with, you probably have an X-billion-parameter model in (b)float16. There are two options for making it cheaper to run: quantization and pruning.</p>



<p>Quantization converts some 16-bit weights into 8 or 4 bits so you can run the model on a smaller and cheaper GPU. Of course, by doing so, we lose some information and accuracy. It can be done automatically using general formulas, or you can supply an evaluation dataset so that quantization degrades your chosen metrics as little as possible.</p>



<p>It is important to note that currently, on most hardware, quantization reduces memory usage but also reduces inference speed. Although the model weights are much smaller, some values must be cast back and forth. Still, it allows you to run inference on or fine-tune the model on cheaper hardware, or simply on the hardware you have, since it might be hard for new players to get access to A100 &amp; H100 clusters.</p>
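<p>A rough back-of-the-envelope calculation shows why this matters. The sketch below counts only the memory needed to store the weights and ignores activations and other runtime overhead; the 7-billion-parameter size is an assumed example.</p>

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3


# Assumed example: a 7-billion-parameter model at different precisions.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {weight_memory_gib(7e9, bits):.1f} GiB")
# 16-bit: 13.0 GiB
#  8-bit: 6.5 GiB
#  4-bit: 3.3 GiB
```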



<p>Both quantization approaches, formula-based and dataset-driven, are available in the Hugging Face libraries and can be easily applied.&nbsp;<a target="_blank" href="https://huggingface.co/blog/overview-quantization-transformers" rel="noreferrer noopener">Here</a>, you can find a blog post going through their pros and cons, with an inference speed and memory comparison.</p>
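<p>To make the idea of a &#8216;general formula&#8217; concrete, here is a minimal sketch of symmetric absmax quantization to 8-bit integers. It is an illustration under simplifying assumptions (one scale for the whole list, plain Python lists instead of tensors), not the algorithm any particular library uses; the function names are hypothetical.</p>

```python
def quantize_absmax_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_absmax_int8(weights)
restored = dequantize(quantized, scale)

print(quantized)  # [50, -127, 0, 100]
print(max(abs(w - r) for w, r in zip(weights, restored)))  # rounding error, at most scale / 2
```

<p>Note how the small weight 0.003 collapses to zero: this is the information loss mentioned above, and it is also why values have to be rescaled at inference time.</p>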



<p>Pruning, on the other hand, works by completely removing some weights from the model.</p>



<p>It is important to remember that the transformer model under the hood does matrix multiplication, so you can&#8217;t just remove all entries close to zero and expect the performance to improve, because doing so causes non-sequential memory accesses. A more gentle solution is needed. The PyTorch team has recently posted two&nbsp;<a href="https://pytorch.org/blog/accelerating-generative-ai/" target="_blank" rel="noreferrer noopener">blog posts</a>&nbsp;about accelerating generative AI, where they go into detail about available options.</p>
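<p>One hardware-friendly pattern is 2:4 semi-structured sparsity, where exactly two of every four consecutive weights are zeroed, so the memory layout stays regular. The sketch below is a minimal illustration that assumes plain magnitude-based selection; the helper name is hypothetical, and real libraries apply this to tensors rather than Python lists.</p>

```python
def prune_2_of_4(row):
    """Zero the two smallest-magnitude values in every group of four."""
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        pruned.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return pruned


row = [0.1, -0.9, 0.4, 0.05, 1.0, 0.0, -0.2, 0.3]
print(prune_2_of_4(row))  # [0.0, -0.9, 0.4, 0.0, 1.0, 0.0, 0.0, 0.3]
```

<p>Because every group keeps the same number of values, the result can be stored compactly and read sequentially, unlike a matrix with zeros scattered at arbitrary positions.</p>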



<h2 class="wp-block-heading">Where to deploy?</h2>



<p>Real-time chat applications call for data centre or on-premise deployment with high availability, but there are some cost-saving techniques if you have offline steps in your data pipeline.</p>



<p>Currently, most teams run LLMs on A10, A100 or H100 GPUs, but surprisingly few people know that cards from the RTX family are also a good choice, performance-wise, for such applications.</p>



<p>Unfortunately, NVIDIA knows that, and they put the following statements in their license.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>No Datacenter Deployment. The SOFTWARE is not licensed for datacenter deployment, except that blockchain processing in a datacenter is permitted.</p>
</blockquote>



<p>But there are companies like&nbsp;<a href="https://vast.ai/" target="_blank" rel="noreferrer noopener">vast.ai</a>&nbsp;which offer RTX cards that you can use for offline data processing, though with lower reliability than, for example, AWS EC2 instances. The default availability filter there is set to 90%, while the AWS EC2 Service Level Agreement commits to 99.99%.</p>
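<p>To get a feel for what those percentages mean, the sketch below converts an availability figure into the maximum downtime it allows. The 30-day period is an assumption made for the example, and the function name is hypothetical.</p>

```python
def max_downtime_hours(availability_pct: float, period_hours: float = 30 * 24) -> float:
    """Downtime allowed within the period at a given availability percentage."""
    return period_hours * (1 - availability_pct / 100)


print(f"90%:    {max_downtime_hours(90):.1f} hours")          # 72.0 hours
print(f"99.99%: {max_downtime_hours(99.99) * 60:.1f} minutes")  # 4.3 minutes
```

<p>Three days of possible downtime per month is unacceptable for a chat application, but it is often tolerable for batch jobs that can be retried.</p>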
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>NeMo-Guardrails</title>
		<link>https://tantusdata.com/insights/nvidia-nemo-guardrails-chatbot-development-guide/</link>
		
		<dc:creator><![CDATA[Bartek Sadlej]]></dc:creator>
		<pubDate>Tue, 05 Dec 2023 12:10:04 +0000</pubDate>
				<category><![CDATA[Chatbot Development]]></category>
		<category><![CDATA[Conversational AI]]></category>
		<category><![CDATA[Custom LLM Integration]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Nvidia NeMo-Guardrails]]></category>
		<guid isPermaLink="false">https://tantusdata.com/?post_type=insights&#038;p=1865</guid>

					<description><![CDATA[<p>Building a dedicated chatbot is both challenging and dangerous. At company X, the model should talk about X&#8217;s offer and, ideally, nothing else to save cost, not block throughput, and be sure not to insult anyone. It would also be nice to meet all of those requirements while not sacrificing the chatbot&#8217;s performance. The field [&#8230;]</p>
<p>The post <a href="https://tantusdata.com/insights/nvidia-nemo-guardrails-chatbot-development-guide/">NeMo-Guardrails</a> appeared first on <a href="https://tantusdata.com">TantusData</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="1024" src="https://tantusdata.com/app/uploads/2023/12/circuit-board-patterns-1024x1024.png" alt="" class="wp-image-1866" srcset="https://tantusdata.com/app/uploads/2023/12/circuit-board-patterns-1024x1024.png 1024w, https://tantusdata.com/app/uploads/2023/12/circuit-board-patterns-300x300.png 300w, https://tantusdata.com/app/uploads/2023/12/circuit-board-patterns-150x150.png 150w, https://tantusdata.com/app/uploads/2023/12/circuit-board-patterns-768x768.png 768w, https://tantusdata.com/app/uploads/2023/12/circuit-board-patterns-1536x1536.png 1536w, https://tantusdata.com/app/uploads/2023/12/circuit-board-patterns-2048x2048.png 2048w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Building a dedicated chatbot is both challenging and dangerous. At company X, the model should talk about X&#8217;s offer and, ideally, nothing else to save cost, not block throughput, and be sure not to insult anyone. It would also be nice to meet all of those requirements while not sacrificing the chatbot&#8217;s performance.</p>



<p>The field of LLM-powered bots is new and rapidly evolving, so many different solutions have emerged, but one of them caught our attention: Nvidia&nbsp;<a href="https://github.com/NVIDIA/NeMo-Guardrails">NeMo-Guardrails</a>. Its core value is the ability to define rails to guide conversations while being able to connect an LLM to other services seamlessly and securely.&nbsp;</p>



<p>You can check out how to get started using the examples and user guide on its GitHub page, but since it is very new and, at the time of this writing, the current release is alpha 0.5, there are not many resources online on how to build more complex applications. At TantusData, we&#8217;ve been using it a lot recently and want to share a few practical tips.</p>



<p>Agenda:</p>



<ul class="wp-block-list">
<li>How to make it work with a model of your choice</li>



<li>Multiple bot actions and responses per one user message and output formatting</li>



<li>Two chat histories: one for displaying to the user, another to guide the model</li>



<li>General tips</li>
</ul>



<h2 class="wp-block-heading">How to make it work with a model of your choice</h2>



<p>We will use&nbsp;<a href="https://docs.mistral.ai/quickstart">Mistral7BInstruct</a>&nbsp;to illustrate that. The advantage of using this open-source model is that it comes with an official Docker image, which you can use to self-host it, and the API schema follows the one from OpenAI, so it is super easy to integrate. You can also use this Docker image with any other model from Hugging Face. If it is a gated one, such as Llama-2, remember to run Docker with -e HF_TOKEN=&#8230; to get access.</p>
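<p>Because the schema follows OpenAI&#8217;s, a request to the self-hosted model is just a JSON body with the usual completion fields. The sketch below only builds that body; the helper name is hypothetical, and the URL you would POST it to (for example, a <code>/v1/completions</code> path on your own host) depends on how you run the container.</p>

```python
import json


def build_completion_request(model, prompt, stop=None, **params):
    """Serialize an OpenAI-style completion request body."""
    body = {"model": model, "prompt": prompt, **params}
    if stop:
        body["stop"] = stop
    return json.dumps(body)


payload = build_completion_request(
    "mistralai/Mistral-7B-Instruct-v0.1",
    "Say hello.",
    stop=["\n"],
    temperature=0.0,
    max_tokens=64,
)
print(payload)
```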



<p>There are two things to cover here: connection to the model and prompting.</p>



<p>The connection consists of two parts: config and implementation. The bare-minimum implementation follows the&nbsp;<a href="https://python.langchain.com/docs/modules/model_io/models/llms/custom_llm">LangChain LLM interface</a>&nbsp;and should be put in the `config.py` file, with an additional line registering it in guardrails:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# config.py
from typing import Any, List, Optional

import openai  # pre-1.0 openai client API
from langchain.callbacks.manager import CallbackManagerForLLMRun
from langchain.llms.base import LLM
from nemoguardrails.llm.providers import register_llm_provider


class Mistral7BInstruct(LLM):
    model: str
    endpoint_url: str
    # also useful to define: temperature ~ 0.0, max_tokens ~ 2K, frequency_penalty ~ 1.

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -&gt; str:
        # dummy key: the self-hosted endpoint is assumed not to check it
        openai.api_key = &quot;EMPTY&quot;
        openai.api_base = self.endpoint_url

        response = openai.Completion.create(
            model = self.model,
            prompt = prompt,
            stop = stop,
            **kwargs
        )

        return response.choices[0].text

    @property
    def _identifying_params(self):
        # illustrative: return whatever identifies this configuration
        return {&quot;model&quot;: self.model, &quot;endpoint_url&quot;: self.endpoint_url}

    @property
    def _llm_type(self):
        # any descriptive string; this value is illustrative
        return &quot;mistral_7b_instruct&quot;


register_llm_provider(&quot;my_engine_name&quot;, Mistral7BInstruct)" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF"># </span><span style="color: #D8DEE9">config</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">py</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> typing </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> Any, List, Optional</span></span>
<span class="line"></span>
<span class="line"><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">openai</span><span style="color: #D8DEE9FF">  # pre-1.0 openai client API</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> langchain.callbacks.manager </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> CallbackManagerForLLMRun</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> langchain.llms.base </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> LLM</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> nemoguardrails.llm.providers </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> register_llm_provider</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">class</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">Mistral7BInstruct</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">LLM</span><span style="color: #D8DEE9FF">):</span></span>
<span class="line">    <span style="color: #8FBCBB">model</span><span style="color: #D8DEE9FF">: </span><span style="color: #8FBCBB">str</span></span>
<span class="line">    <span style="color: #8FBCBB">endpoint_url</span><span style="color: #D8DEE9FF">: </span><span style="color: #8FBCBB">str</span></span>
<span class="line">    <span style="color: #D8DEE9FF"># </span><span style="color: #8FBCBB">also</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">useful</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">to</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">define</span><span style="color: #D8DEE9FF">: </span><span style="color: #8FBCBB">temperature</span><span style="color: #D8DEE9FF"> ~ 0.0</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">max_tokens</span><span style="color: #D8DEE9FF"> ~ 2</span><span style="color: #8FBCBB">K</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">frequency_penalty</span><span style="color: #D8DEE9FF"> ~ 1.</span></span>
<span class="line"></span>
<span class="line">    <span style="color: #8FBCBB">def</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">_call</span><span style="color: #D8DEE9FF">(</span></span>
<span class="line">        <span style="color: #8FBCBB">self</span><span style="color: #ECEFF4">,</span></span>
<span class="line">        <span style="color: #8FBCBB">prompt</span><span style="color: #D8DEE9FF">: </span><span style="color: #8FBCBB">str</span><span style="color: #ECEFF4">,</span></span>
<span class="line">        <span style="color: #8FBCBB">stop</span><span style="color: #D8DEE9FF">: </span><span style="color: #8FBCBB">Optional</span><span style="color: #D8DEE9FF">[</span><span style="color: #8FBCBB">List</span><span style="color: #D8DEE9FF">[</span><span style="color: #8FBCBB">str</span><span style="color: #D8DEE9FF">]] = </span><span style="color: #8FBCBB">None</span><span style="color: #ECEFF4">,</span></span>
<span class="line">        <span style="color: #8FBCBB">run_manager</span><span style="color: #D8DEE9FF">: </span><span style="color: #8FBCBB">Optional</span><span style="color: #D8DEE9FF">[</span><span style="color: #8FBCBB">CallbackManagerForLLMRun</span><span style="color: #D8DEE9FF">] = </span><span style="color: #8FBCBB">None</span><span style="color: #ECEFF4">,</span></span>
<span class="line">        <span style="color: #81A1C1">**</span><span style="color: #8FBCBB">kwargs</span><span style="color: #D8DEE9FF">: </span><span style="color: #8FBCBB">Any</span><span style="color: #ECEFF4">,</span></span>
<span class="line">    <span style="color: #D8DEE9FF">) -&gt; </span><span style="color: #8FBCBB">str</span><span style="color: #D8DEE9FF">:</span></span>
<span class="line">        <span style="color: #D8DEE9FF"># dummy key: the self-hosted endpoint is assumed not to check it</span></span>
<span class="line">        <span style="color: #8FBCBB">openai</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">api_key</span><span style="color: #D8DEE9FF"> = </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">EMPTY</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line">        <span style="color: #8FBCBB">openai</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">api_base</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">self</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">endpoint_url</span></span>
<span class="line"></span>
<span class="line">        <span style="color: #8FBCBB">response</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">openai</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">Completion</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">create</span><span style="color: #D8DEE9FF">(</span></span>
<span class="line">            <span style="color: #8FBCBB">model</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">self</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">model</span><span style="color: #ECEFF4">,</span></span>
<span class="line">            <span style="color: #8FBCBB">prompt</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">prompt</span><span style="color: #ECEFF4">,</span></span>
<span class="line">            <span style="color: #8FBCBB">stop</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">stop</span><span style="color: #ECEFF4">,</span></span>
<span class="line">            <span style="color: #81A1C1">**</span><span style="color: #8FBCBB">kwargs</span></span>
<span class="line">        <span style="color: #D8DEE9FF">)</span></span>
<span class="line"></span>
<span class="line">        <span style="color: #8FBCBB">return</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">response</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">choices</span><span style="color: #D8DEE9FF">[0].</span><span style="color: #8FBCBB">text</span></span>
<span class="line"></span>
<span class="line">    <span style="color: #D8DEE9FF">@</span><span style="color: #8FBCBB">property</span></span>
<span class="line">    <span style="color: #8FBCBB">def</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">_identifying_params</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">self</span><span style="color: #D8DEE9FF">):</span></span>
<span class="line">        <span style="color: #D8DEE9FF"># illustrative: return whatever identifies this configuration</span></span>
<span class="line">        <span style="color: #8FBCBB">return</span><span style="color: #D8DEE9FF"> {</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">model</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">: self.model, </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">endpoint_url</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">: self.endpoint_url}</span></span>
<span class="line"></span>
<span class="line">    <span style="color: #D8DEE9FF">@</span><span style="color: #8FBCBB">property</span></span>
<span class="line">    <span style="color: #8FBCBB">def</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">_llm_type</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">self</span><span style="color: #D8DEE9FF">):</span></span>
<span class="line">        <span style="color: #D8DEE9FF"># any descriptive string; this value is illustrative</span></span>
<span class="line">        <span style="color: #8FBCBB">return</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">mistral_7b_instruct</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">register_llm_provider</span><span style="color: #D8DEE9FF">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">my_engine_name</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">Mistral7BInstruct</span><span style="color: #D8DEE9FF">)</span></span></code></pre></div>






<p>Then specify the engine and its parameters in the `config.yml` file:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="models:
  - type: main
    engine: my_engine_name
    parameters:
      model: mistralai/Mistral-7B-Instruct-v0.1
      endpoint_url: ..." style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF">models</span><span style="color: #ECEFF4">:</span></span>
<span class="line">  <span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> type</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">main</span></span>
<span class="line">    <span style="color: #D8DEE9FF">engine</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">my_engine_name</span></span>
<span class="line">    <span style="color: #D8DEE9FF">parameters</span><span style="color: #ECEFF4">:</span></span>
<span class="line">      <span style="color: #D8DEE9FF">model</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">mistralai</span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9">Mistral</span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF">7</span><span style="color: #D8DEE9">B</span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9">Instruct</span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9">v0</span><span style="color: #ECEFF4">.</span><span style="color: #B48EAD">1</span></span>
<span class="line">      <span style="color: #D8DEE9FF">endpoint_url</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">...</span></span></code></pre></div>



<p></p>



<p>The next thing to cover is prompts.</p>



<p>At the time of writing, NeMo Guardrails works best with `text-davinci-003` (a GPT-3.5-series completion model). More recent OpenAI models expect different prompts to produce structured output, whereas open-source models need stricter instructions on what to do; they won&#8217;t automatically spot the pattern in two examples and follow it.</p>



<p>The main challenge is generating the user intent from the current input and the definitions provided in `*.co` files. There are prompts already implemented for some engines, plus general ones that are used if the engine is not explicitly implemented. The problem with the general prompts is that they lack an explicit instruction on what to do. As we noticed, less powerful models usually ignore the intent pattern and try to respond to the user input directly.</p>



<p>The solution for Mistral is to include an explicit instruction.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="- task: generate_user_intent
content: |-
&quot;&quot;&quot;
{{ general_instruction }}
You must write only user intent as shown in the example. Do not respond to the user. Do not write anything else.
&quot;&quot;&quot;
..." style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> task</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">generate_user_intent</span></span>
<span class="line"><span style="color: #D8DEE9FF">content</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">|-</span></span>
<span class="line"><span style="color: #ECEFF4">&quot;&quot;&quot;</span></span>
<span class="line"><span style="color: #A3BE8C">{{ general_instruction }</span><span style="color: #D8DEE9">}</span></span>
<span class="line"><span style="color: #D8DEE9">You</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">must</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">write</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">only</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">user</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">intent</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">as</span><span style="color: #D8DEE9FF"> shown in the example</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF"> Do not respond to the user</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF"> Do not write anything else</span><span style="color: #ECEFF4">.</span></span>
<span class="line"><span style="color: #ECEFF4">&quot;&quot;&quot;</span></span>
<span class="line"><span style="color: #A3BE8C">..</span><span style="color: #D8DEE9">.</span></span></code></pre></div>



<p></p>



<h2 class="wp-block-heading">Multiple bot actions and responses per one user message and output formatting</h2>



<p>There are situations when we want to execute more than one action or value extraction per round and combine all outputs into the final response. The reason not to just write a wrapper function that does everything at once is that separate steps make it easy to later filter or modify parts of the history, which gets automatically inserted into the model prompt. I will cover that in the next section.</p>



<p>Also, if you include line breaks in the default bot message definitions for better formatting, they will either not be visible or be displayed as a literal &#8216;\n&#8217;, and all bot messages from one round will simply get concatenated.</p>



<p>Now, let&#8217;s look at what an example flow might look like:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="define bot answer_with_cited_document provide answer
&quot;Answer: $answer_with_cited_document \n\n&quot;

define bot matches_in_db found matches
&quot;Found matches: $matches_in_db&quot;

define flow answer with cited documents
user ask question
$answer_with_cited_document = ...
bot $answer_with_cited_document provide answer
$cited_documents = ... 
$matches_in_db = execute db_search(cited_documents=$cited_documents)
bot $matches_in_db inform found matches" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9">define</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">bot</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">answer_with_cited_document</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">provide</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">answer</span></span>
<span class="line"><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Answer: $answer_with_cited_document </span><span style="color: #EBCB8B">\n\n</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">define</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">bot</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">matches_in_db</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">found</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">matches</span></span>
<span class="line"><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Found matches: $matches_in_db</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">define</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">flow</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">answer</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">with</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">cited</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">documents</span></span>
<span class="line"><span style="color: #D8DEE9">user</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">ask</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">question</span></span>
<span class="line"><span style="color: #D8DEE9">$answer_with_cited_document</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">...</span></span>
<span class="line"><span style="color: #D8DEE9">bot</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">$answer_with_cited_document</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">provide</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">answer</span></span>
<span class="line"><span style="color: #D8DEE9">$cited_documents</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">...</span><span style="color: #D8DEE9FF"> </span></span>
<span class="line"><span style="color: #D8DEE9">$matches_in_db</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">execute</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">db_search</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">cited_documents</span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9">$cited_documents</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"><span style="color: #D8DEE9">bot</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">$matches_in_db</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">inform</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">found</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">matches</span></span></code></pre></div>



<p></p>



<p>Here, the printed bot answer after concatenation will look as follows:</p>



<p>&#8220;Answer: $answer_with_cited_document \n\nFound matches: $matches_in_db&#8221;</p>



<p>The way to achieve better formatting might look like this:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="define bot formatted_answer print answer
&quot;$formatted_answer&quot;
define flow answer with cited documents
user ask question
$answer_with_cited_document = ...
$cited_documents = ... 
$matches_in_db = execute db_search(cited_documents=$cited_documents)
$formatted_answer = execute format_answer(ans=$answer_with_cited_document, docs=$matches_in_db)
bot formatted_answer print answer" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9">define</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">bot</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">formatted_answer</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">print</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">answer</span></span>
<span class="line"><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">$formatted_answer</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line"><span style="color: #D8DEE9">define</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">flow</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">answer</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">with</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">cited</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">documents</span></span>
<span class="line"><span style="color: #D8DEE9">user</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">ask</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">question</span></span>
<span class="line"><span style="color: #D8DEE9">$answer_with_cited_document</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">...</span></span>
<span class="line"><span style="color: #D8DEE9">$cited_documents</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">...</span><span style="color: #D8DEE9FF"> </span></span>
<span class="line"><span style="color: #D8DEE9">$matches_in_db</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">execute</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">db_search</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">cited_documents</span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9">$cited_documents</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"><span style="color: #D8DEE9">$formatted_answer</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">execute</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">format_answer</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">ans</span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9">$answer_with_cited_document</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">docs</span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9">$matches_in_db</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"><span style="color: #D8DEE9">bot</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">formatted_answer</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">print</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">answer</span></span></code></pre></div>



<p></p>
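<p>The flow above calls a `format_answer` action but does not show its body. Below is a minimal sketch of what such a function might look like in Python. The shape of `docs` (a list of dictionaries with `title`, `url` and `score` keys) is an assumption made for illustration only; your `db_search` may return something different, and registering the function as an action is left out here.</p>

```python
from typing import Any, Dict, List


def format_answer(ans: str, docs: List[Dict[str, Any]]) -> str:
    """Combine the model answer and the search matches into one display string."""
    # Real line breaks are added here, in Python, instead of in the bot message definition.
    lines = [f"Answer: {ans.strip()}", ""]
    if docs:
        lines.append("Found matches:")
        for position, doc in enumerate(docs, start=1):
            # 'title', 'url' and 'score' are hypothetical keys used for illustration.
            lines.append(f"{position}. {doc['title']} ({doc['url']}), score {doc['score']:.2f}")
    else:
        lines.append("No matches found.")
    return "\n".join(lines)
```

<p>Because the whole response is built in one place, the bot message definition only has to print a single variable, and the layout can change without touching the flow.</p>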



<h2 class="wp-block-heading">Two chat histories: one for displaying to the user, different to guide the model</h2>



<p>The most straightforward reason we must do something with history at some point is that LLMs have a limited context window. But apart from that, you should understand what gets inserted into the model&#8217;s prompt and whether you are wasting tokens unnecessarily.</p>



<p>By default, NeMo Guardrails inserts the action output into the prompt in the following format:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="execute db_search
# The result was /* Full result returned from action here */" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9">execute</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">db_search</span></span>
<span class="line"><span style="color: #D8DEE9FF"># </span><span style="color: #D8DEE9">The</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">result</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">was</span><span style="color: #D8DEE9FF"> </span><span style="color: #616E88">/* Full result returned from action here */</span></span></code></pre></div>



<p></p>



<p>If our `db_search` returns a massive JSON, we have a problem. In the long run, it will fill up the context, but even before that, it can distract the model from paying attention to the relevant parts.</p>



<p>It depends on the particular use case, but suppose all you want to do is display the results with, e.g., links and scores. Left unchanged, the search results will be inserted into the prompt twice: once after the action execution and a second time as the final bot answer. If you use an additional action for output formatting, they will appear three times!</p>



<p>We can take advantage of filters to adjust that.</p>



<p>In the general prompts, you can find templates like this:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# prompts_general.yml

{{ history | colang }}" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF"># </span><span style="color: #D8DEE9">prompts_general</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">yml</span></span>
<span class="line"></span>
<span class="line"><span style="color: #ECEFF4">{{</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">history</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">|</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">colang</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">}}</span></span></code></pre></div>



<p></p>



<p>This takes the whole history and renders it into the prompt as Colang.</p>



<p>To filter or modify some events, you can add a custom filter like this:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# config.py
from copy import deepcopy
from typing import List

from nemoguardrails import LLMRails

def modify_actions(events: List[dict]) -&gt; List[dict]:
    events = deepcopy(events)

    # filter formatting since we will see the exact same string as the final bot answer
    events = [event for event in events if not (event['type'] == 'InternalSystemActionFinished' and event['action_name'] == 'format_answer')]

    for event in events:
        if event['type'] == 'InternalSystemActionFinished' and event['action_name'] == &quot;your_action_name_here&quot;:
            # modify() is a placeholder for your own function that shortens or rewrites the result
            event['return_value'] = modify(event['return_value'])

    return events

def init(llm_rails: LLMRails):
    llm_rails.register_filter(modify_actions, &quot;modify_actions&quot;)" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF"># </span><span style="color: #D8DEE9">config</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">py</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">copy</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">deepcopy</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">typing</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">List</span></span>
<span class="line"></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">nemoguardrails</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">LLMRails</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">def</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">modify_actions</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">events</span><span style="color: #D8DEE9FF">: </span><span style="color: #D8DEE9">List</span><span style="color: #D8DEE9FF">[</span><span style="color: #D8DEE9">dict</span><span style="color: #D8DEE9FF">]) </span><span style="color: #81A1C1">-&gt;</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">List</span><span style="color: #D8DEE9FF">[</span><span style="color: #D8DEE9">dict</span><span style="color: #D8DEE9FF">]:</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #D8DEE9">events</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">deepcopy</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">events</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">    # </span><span style="color: #D8DEE9">filter</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">formatting</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">since</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">we</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">will</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">see</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">the</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">exact</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">same</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">string</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">as</span><span style="color: #D8DEE9FF"> the final bot answer</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #D8DEE9">events</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> [</span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">in</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">events</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">if</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">not</span><span style="color: #D8DEE9FF"> (</span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">type</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">] </span><span style="color: #81A1C1">==</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">InternalSystemActionFinished</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">and</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">action_name</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">] </span><span style="color: #81A1C1">==</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">format_answer</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">)]</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">in</span><span style="color: #D8DEE9FF"> events</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9FF">        </span><span style="color: #81A1C1">if</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">type</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">] </span><span style="color: #81A1C1">==</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">InternalSystemActionFinished</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">and</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">action_name</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">] </span><span style="color: #81A1C1">==</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">your_action_name_here</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">:</span></span>
<span class="line"><span style="color: #616E88">            # modify() is a placeholder for your own function that shortens or rewrites the result</span></span>
<span class="line"><span style="color: #D8DEE9FF">            </span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">return_value</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">] </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">modify</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">return_value</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">])</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #81A1C1">return</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">events</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">def</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">init</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">llm_rails</span><span style="color: #D8DEE9FF">: </span><span style="color: #D8DEE9">LLMRails</span><span style="color: #D8DEE9FF">):</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #D8DEE9">llm_rails</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">register_filter</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">modify_actions</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">modify_actions</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">)</span></span></code></pre></div>



<p></p>



<p>And then use it in prompts like this:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# prompts_general.yml

{{ history | modify_actions | colang }}" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF"># </span><span style="color: #D8DEE9">prompts_general</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">yml</span></span>
<span class="line"></span>
<span class="line"><span style="color: #ECEFF4">{{</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">history</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">|</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">modify_actions</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">|</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">colang</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">}}</span></span></code></pre></div>



<p></p>



<p>We use deepcopy because in-place dictionary assignments such as my_dict[&#39;key&#39;] = val mutate the object that was passed to the function. Without the copy, the stored history itself would change, and in later chat rounds we would have to check whether each value had already been modified.</p>
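
<p>A minimal sketch of that aliasing problem, using only the standard library. The function names and the event contents are hypothetical and are not part of NeMo Guardrails:</p>

```python
from copy import deepcopy


def shorten_in_place(events):
    """Overwrite action results directly on the list that was passed in."""
    for event in events:
        if event.get("type") == "InternalSystemActionFinished":
            event["return_value"] = "[shortened]"
    return events


def shorten_on_copy(events):
    """Same change, applied to a deep copy so the caller's list stays intact."""
    return shorten_in_place(deepcopy(events))


# Hypothetical stored history with a single finished action
stored = [{"type": "InternalSystemActionFinished", "return_value": "long search result"}]
prompt_view = shorten_on_copy(stored)
```

<p>Here prompt_view holds the shortened value while stored keeps the original one, so the filter can run again in the next round without special cases.</p>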



<p>Sometimes it makes sense to clear the whole history, for example when a user wants to start over with a new request. Without clearing, previously entered information can leak into the prompts and cause irrelevant search results. To achieve that, we define the following user intent in the colang file:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="define user start new search
    &quot;I'd like to start a new search&quot;
    &quot;May I look for something different&quot;
    &quot;I want to try another conditions&quot;
    &quot;Forget all I asked before&quot;" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9">define</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">user</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">start</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">new</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">search</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">I&#39;d like to start a new search</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">May I look for something different</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">I want to try another conditions</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Forget all I asked before</span><span style="color: #ECEFF4">&quot;</span></span></code></pre></div>



<p></p>



<p>After that, you can extend the modify_actions function with the following snippet:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="    history = []
    for event in events:
        history.append(event)
        if event['type'] == &quot;UserIntent&quot; and event['intent'] == &quot;start new search&quot;:
            history.clear()
    return history" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #D8DEE9">history</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> []</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">in</span><span style="color: #D8DEE9FF"> events</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9FF">        </span><span style="color: #D8DEE9">history</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">append</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">        </span><span style="color: #81A1C1">if</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">type</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">] </span><span style="color: #81A1C1">==</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">UserIntent</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">and</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">event</span><span style="color: #D8DEE9FF">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">intent</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">] </span><span style="color: #81A1C1">==</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">start new search</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">:</span></span>
<span class="line"><span style="color: #D8DEE9FF">            </span><span style="color: #D8DEE9">history</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">clear</span><span style="color: #D8DEE9FF">()</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #81A1C1">return</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">history</span></span></code></pre></div>



<p></p>
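
<p>To see the effect in isolation, here is a standalone sketch of the same reset logic with a hand-written event list. The function name and the sample intents are made up for illustration; real Guardrails events carry more fields:</p>

```python
def reset_on_new_search(events):
    """Return only the events that follow the last 'start new search' intent."""
    history = []
    for event in events:
        history.append(event)
        if event.get("type") == "UserIntent" and event.get("intent") == "start new search":
            # Drop everything seen so far, including the reset intent itself
            history.clear()
    return history


# Hypothetical conversation: one request, a reset, then a fresh request
events = [
    {"type": "UserIntent", "intent": "ask about flats"},
    {"type": "UserIntent", "intent": "start new search"},
    {"type": "UserIntent", "intent": "ask about houses"},
]
kept = reset_on_new_search(events)
```

<p>Only the request made after the reset remains in kept, while the original events list is left untouched because a new list is built.</p>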



<h2 class="wp-block-heading">General tips</h2>



<p>When working with NeMo Guardrails, you may also find these tips useful.</p>



<ul class="wp-block-list">
<li>Use chat mode instead of server mode when developing. It makes errors easier to spot and highlights the output in verbose mode.</li>



<li>Take advantage of Python&#8217;s logging module. Guardrails prints a lot in verbose mode, and routing the output of different modules to different files makes it much easier to read.</li>



<li>When using a custom LLM, explicitly log its inputs and outputs, as this is the most fragile part of Guardrails. If your model does not follow the colang pattern for extracting the user intent, the conversation cannot move forward.</li>
</ul>
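
<p>The logging tips can be sketched with the standard logging module as below. The logger names and file names are assumptions: "nemoguardrails" presumes the library names its loggers after its package, and "my_app.custom_llm" is a hypothetical logger for your own LLM wrapper.</p>

```python
import logging


def route_logger_to_file(logger_name, path, level=logging.DEBUG):
    """Send everything logged under logger_name to its own file."""
    logger = logging.getLogger(logger_name)
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(level)
    # Keep these records out of the root logger so the console stays readable
    logger.propagate = False
    return logger


# Assumed logger names, adjust them to your setup
guardrails_log = route_logger_to_file("nemoguardrails", "guardrails.log")
llm_log = route_logger_to_file("my_app.custom_llm", "custom_llm.log")

# Inside a custom LLM wrapper you would log both directions explicitly
llm_log.debug("prompt: %s", "example prompt")
llm_log.debug("completion: %s", "example completion")
```

<p>With one file per logger, the prompts and completions of the custom model can be read without scrolling through the rest of the verbose output.</p>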
<p>The post <a href="https://tantusdata.com/insights/nvidia-nemo-guardrails-chatbot-development-guide/">NeMo-Guardrails</a> appeared first on <a href="https://tantusdata.com">TantusData</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>What if the data is too large for the LLM context?</title>
		<link>https://tantusdata.com/insights/what-if-the-data-is-too-large-for-the-llm-context/</link>
		
		<dc:creator><![CDATA[Bartek Sadlej]]></dc:creator>
		<pubDate>Thu, 28 Sep 2023 11:00:00 +0000</pubDate>
				<category><![CDATA[ChatBot]]></category>
		<category><![CDATA[LangChain]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[RAG]]></category>
		<category><![CDATA[Self-Query]]></category>
		<guid isPermaLink="false">https://tantusdata.com/?post_type=insights&#038;p=1844</guid>

					<description><![CDATA[<p>Navigating Large Data with LLM: Splitting, Context, and Self-Query Solutions</p>
<p>The post <a href="https://tantusdata.com/insights/what-if-the-data-is-too-large-for-the-llm-context/">What if the data is too large for the LLM context?</a> appeared first on <a href="https://tantusdata.com">TantusData</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="513" src="https://tantusdata.com/app/uploads/2023/09/LLM2-1024x513.jpg" alt="" class="wp-image-1845" srcset="https://tantusdata.com/app/uploads/2023/09/LLM2-1024x513.jpg 1024w, https://tantusdata.com/app/uploads/2023/09/LLM2-300x150.jpg 300w, https://tantusdata.com/app/uploads/2023/09/LLM2-768x384.jpg 768w, https://tantusdata.com/app/uploads/2023/09/LLM2-1536x769.jpg 1536w, https://tantusdata.com/app/uploads/2023/09/LLM2-2048x1025.jpg 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>In the previous <a href="https://tantusdata.com/insights/what-data-format-is-suitable-for-llm/" target="_blank" rel="noreferrer noopener">article</a>, we covered extracting information from unstructured data. However, this is just the tip of the iceberg. Another problem arises when you have long documents that don’t fit into the embedding model’s context length. The natural move in such a situation is to split the documents into multiple parts. Another reason to split is that an entire document may not produce a good enough embedding. Last but not least, you might want to retrieve smaller chunks in order to lower the token usage.</p>



<p>The subject of a document or paragraph is usually stated at the beginning of the section and does not show up again in later parts. So when we simply split a document into multiple parts, we are likely to end up with many chunks that lack contextual information. For example:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="Subscription prices for 1 month:
20 USD / month


Subscription prices for 1 year:
200 USD / year
" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9">Subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">prices</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">1</span><span style="color: #D8DEE9FF"> month</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #B48EAD">20</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">Subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">prices</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">1</span><span style="color: #D8DEE9FF"> year</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #B48EAD">200</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"></span></code></pre></div>



<p></p>



<p>In the snippet above, we see the prices, but we lack information about what they are for (a TV subscription? a broadband subscription?).</p>
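
<p>A small sketch of how such context-free chunks come about. The splitter below is a deliberately naive stand-in for a real text splitter, and the "TV offer" heading is a hypothetical title added for illustration:</p>

```python
def split_into_chunks(text):
    """Split on blank lines, the simplest possible chunking strategy."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]


# Hypothetical source document: the product is named only in the heading
document = """TV offer

Subscription prices for 1 month:
20 USD / month

Subscription prices for 1 year:
200 USD / year"""

chunks = split_into_chunks(document)

# Every chunk after the heading has lost the information about the product
orphaned = [chunk for chunk in chunks[1:] if "TV" not in chunk]
```

<p>Both price chunks end up in orphaned: once they are embedded on their own, nothing ties them to the TV offer any more.</p>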



<p>When we query a vector database and pass the result to the LLM application, this document will likely look relevant to TV, broadband, or mobile subscription requests alike. The reason is that it gets a high cosine similarity score for any query related to a subscription price. Here is an example:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="from typing import List

from langchain.chains import RetrievalQA
from langchain.docstore.document import Document
from langchain.schema.retriever import BaseRetriever
from langchain.chat_models import ChatOpenAI


# The context-free chunk from the snippet above
doc = Document(
   page_content=&quot;Subscription prices for 1 month:\n20 USD / month\n\nSubscription prices for 1 year:\n200 USD / year&quot;
)


class ConstRetriever(BaseRetriever):
   # Always returns the same document, whatever the query is
   def _get_relevant_documents(self, *args, **kwargs) -&gt; List[Document]:
       return [doc]


llm = ChatOpenAI(model_name=&quot;gpt-4&quot;)
retriever = ConstRetriever()
qa = RetrievalQA.from_llm(llm, retriever=retriever)


offers = [&quot;TV&quot;, &quot;Internet&quot;, &quot;Car&quot;, &quot;Gym membership&quot;]


for offer in offers:
   res = qa(f&quot;What is the {offer} subscription price for one year?&quot;)['result']
   print(res)" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">typing</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">List</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">langchain</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">chains</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">RetrievalQA</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">langchain</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">docstore</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">document</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">Document</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">langchain</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">schema</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">retriever</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">BaseRetriever</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">langchain</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">chat_models</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">ChatOpenAI</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #616E88"># The context-free chunk from the snippet above</span></span>
<span class="line"><span style="color: #8FBCBB">doc</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">Document</span><span style="color: #D8DEE9FF">(</span></span>
<span class="line"><span style="color: #D8DEE9FF">   </span><span style="color: #8FBCBB">page_content</span><span style="color: #D8DEE9FF">=</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Subscription prices for 1 month:\n20 USD / month\n\nSubscription prices for 1 year:\n200 USD / year</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line"><span style="color: #D8DEE9FF">)</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">class</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">ConstRetriever</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">BaseRetriever</span><span style="color: #D8DEE9FF">):</span></span>
<span class="line"><span style="color: #616E88">   # Always returns the same document, whatever the query is</span></span>
<span class="line"><span style="color: #D8DEE9FF">   </span><span style="color: #8FBCBB">def</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">_get_relevant_documents</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">self</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">*</span><span style="color: #8FBCBB">args</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">**</span><span style="color: #8FBCBB">kwargs</span><span style="color: #D8DEE9FF">) -&gt; </span><span style="color: #8FBCBB">List</span><span style="color: #D8DEE9FF">[</span><span style="color: #8FBCBB">Document</span><span style="color: #D8DEE9FF">]:</span></span>
<span class="line"><span style="color: #D8DEE9FF">       </span><span style="color: #8FBCBB">return</span><span style="color: #D8DEE9FF"> [</span><span style="color: #8FBCBB">doc</span><span style="color: #D8DEE9FF">]</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">llm</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">ChatOpenAI</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">model_name</span><span style="color: #D8DEE9FF">=</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">gpt-4</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"><span style="color: #8FBCBB">retriever</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">ConstRetriever</span><span style="color: #D8DEE9FF">()</span></span>
<span class="line"><span style="color: #8FBCBB">qa</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">RetrievalQA</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">from_llm</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">llm</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">retriever</span><span style="color: #D8DEE9FF">=</span><span style="color: #8FBCBB">retriever</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">offers</span><span style="color: #D8DEE9FF"> = [</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">TV</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Internet</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Car</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Gym membership</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">]</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">offer</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">in</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">offers</span><span style="color: #D8DEE9FF">:</span></span>
<span class="line"><span style="color: #D8DEE9FF">   </span><span style="color: #8FBCBB">res</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">qa</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">f</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">What is the {offer} subscription price for one year?</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">)[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">result</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">]</span></span>
<span class="line"><span style="color: #D8DEE9FF">   </span><span style="color: #8FBCBB">print</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">res</span><span style="color: #D8DEE9FF">)</span></span></code></pre></div>



<p></p>



<ul class="wp-block-list">
<li>The subscription price for one year is 200 USD.</li>



<li>The subscription price for one year is 200 USD.</li>



<li>The context does not provide information on the car subscription price for one year.</li>



<li>The context provided does not specify what the subscription prices are for, such as a gym membership. Therefore, I can&#8217;t provide the exact price for a gym membership subscription for one year.</li>
</ul>



<p>All those queries have a cosine similarity of roughly 0.85-0.9 with the example document. The document is ‘close enough’, so it gets passed to the LLM, which then has to decide how to answer the question. Moreover, the document content is not enough to say what the price is for, so the best we can expect is an ‘I don’t know’ answer, which at least does not make up information the model never had. That answer is not satisfying either &#8211; we do have the information about the prices, and we would like the chatbot to answer with it. We just have to find a better way of providing it with the correct information.</p>
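
<p>For reference, cosine similarity itself is a short computation. The sketch below uses made-up three-dimensional vectors purely for illustration; they are not real embeddings, which typically have hundreds or thousands of dimensions:</p>

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings (values are invented)
chunk_vector = [0.9, 0.4, 0.1]
query_vectors = {
    "TV": [0.8, 0.5, 0.2],
    "Gym membership": [0.85, 0.45, 0.3],
}

scores = {
    offer: cosine_similarity(chunk_vector, vector)
    for offer, vector in query_vectors.items()
}
```

<p>Because every query shares the "subscription price" part of its meaning with the chunk, the scores stay close to each other regardless of the product being asked about.</p>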



<h2 class="wp-block-heading">How do we tackle this problem?</h2>



<p>The most straightforward solution is to add context to each part after splitting. For example, we can prepend “Details for the TV offer:” if the prices come from such an offer. This helps with hallucinations, but the similarity score may remain high for such documents, which can crowd the most relevant documents out of the retriever&#8217;s results. In that case, the model will answer that it does not have enough context to answer the question.</p>
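
<p>A minimal sketch of this first approach. The helper name is hypothetical, and the context line is simply prepended as plain text before the chunk gets embedded:</p>

```python
def add_context(chunks, context):
    """Prefix every chunk with a line saying what the chunk is about."""
    return [f"{context}\n{chunk}" for chunk in chunks]


chunks = [
    "Subscription prices for 1 month:\n20 USD / month",
    "Subscription prices for 1 year:\n200 USD / year",
]
tv_chunks = add_context(chunks, "Details for the TV offer:")
```

<p>Each element of tv_chunks now names the product, so an LLM that receives it can tell which offer the price belongs to.</p>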



<p>Another solution is to include document metadata and use a feature called self-query.</p>



<p>Here, instead of including the context information directly in the document text, we store it as an additional filtering index and use the LLM to produce a relevant query.</p>



<p>The difference is that a document whose metadata does not match the query will not get fetched, even if it has a high similarity score, so the retriever can return genuinely relevant sources. In other words, instead of relying on the vector store to select documents only by their embeddings&#8217; similarity to the query, we add an extra index and provide the LLM with its description. The model can then decide whether to use it and with what arguments.</p>
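
<p>Stripped of any library, the idea is a metadata filter applied before similarity ranking. In the sketch below the filter is hard-coded, whereas in self-query the LLM would derive it from the user&#8217;s question. The function names, the dictionary layout, and the constant score are assumptions made for illustration:</p>

```python
def filtered_search(documents, metadata_filter, score):
    """Keep only documents whose metadata matches, then rank them by score."""
    candidates = [
        doc
        for doc in documents
        if all(doc["metadata"].get(key) == value for key, value in metadata_filter.items())
    ]
    return sorted(candidates, key=score, reverse=True)


def always_similar(doc):
    """Stand-in for an embedding similarity that is high for every price query."""
    return 0.9


documents = [
    {
        "text": "Subscription prices for 1 year:\n200 USD / year",
        "metadata": {"product": "TV"},
    },
]

# Hard-coded filters; an LLM would produce these from the question
tv_hits = filtered_search(documents, {"product": "TV"}, always_similar)
gym_hits = filtered_search(documents, {"product": "Gym membership"}, always_similar)
```

<p>The TV query still finds the document, while the gym query gets nothing back despite the high similarity score, which is exactly the behaviour we want from the retriever.</p>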



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Weaviate
from langchain.retrievers.self_query.base import SelfQueryRetriever
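from langchain.chains.query_constructor.base import AttributeInfo
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.schema import Document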


embeddings = OpenAIEmbeddings()


page_content=&quot;&quot;&quot;
Subscription prices for 1 month:
20 USD / month


Subscription prices for 1 year:
200 USD / year
&quot;&quot;&quot;


docs = [
   Document(
       page_content=page_content,
       metadata={
           &quot;product&quot;: &quot;TV&quot;,
       },
   ),
]
vectorstore = Weaviate.from_documents(
   docs, embeddings, weaviate_url=&quot;http://127.0.0.1:8080&quot;
)


metadata_field_info = [
   AttributeInfo(
       name=&quot;product&quot;,
       description=&quot;The name of the product for which the subscription prices are&quot;,
       type=&quot;string&quot;,
   ),
]
document_content_description = &quot;Details for all the products offers&quot;
llm = ChatOpenAI(model_name=&quot;gpt-3.5-turbo&quot;, temperature=0., verbose=True)
retriever = SelfQueryRetriever.from_llm(
   llm, vectorstore, document_content_description, metadata_field_info, verbose=True
)
qa_with_self_query = RetrievalQA.from_llm(llm, retriever=retriever, return_source_documents=True)


for offer in [&quot;TV&quot;, &quot;Internet&quot;, &quot;Car&quot;, &quot;Gym membership&quot;]:
    res = qa_with_self_query(f&quot;What is the {offer} subscription price for one year?&quot;)
    print(f&quot;n docs: {len(res['source_documents'])}, answer: {res['result']}&quot;)" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">langchain</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">embeddings</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">openai</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">OpenAIEmbeddings</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">langchain</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">vectorstores</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">Weaviate</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">langchain</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">retrievers</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">self_query</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">base</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">SelfQueryRetriever</span></span>
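<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">langchain</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">chains</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">query_constructor</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">base</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">AttributeInfo</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">langchain</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">chains</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">RetrievalQA</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">langchain</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">chat_models</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">ChatOpenAI</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">langchain</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">schema</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">Document</span></span>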
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">embeddings</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">OpenAIEmbeddings</span><span style="color: #D8DEE9FF">()</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">page_content</span><span style="color: #D8DEE9FF">=</span><span style="color: #ECEFF4">&quot;&quot;&quot;</span></span>
<span class="line"><span style="color: #A3BE8C">Subscription prices for 1 month</span><span style="color: #D8DEE9">:</span></span>
<span class="line"><span style="color: #D8DEE9FF">20 </span><span style="color: #8FBCBB">USD</span><span style="color: #D8DEE9FF"> / </span><span style="color: #8FBCBB">month</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">Subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">prices</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">for</span><span style="color: #D8DEE9FF"> 1 </span><span style="color: #8FBCBB">year</span><span style="color: #D8DEE9FF">:</span></span>
<span class="line"><span style="color: #D8DEE9FF">200 </span><span style="color: #8FBCBB">USD</span><span style="color: #D8DEE9FF"> / </span><span style="color: #8FBCBB">year</span></span>
<span class="line"><span style="color: #ECEFF4">&quot;&quot;&quot;</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #A3BE8C">docs = </span><span style="color: #D8DEE9">[</span></span>
<span class="line"><span style="color: #D8DEE9FF">   </span><span style="color: #8FBCBB">Document</span><span style="color: #D8DEE9FF">(</span></span>
<span class="line"><span style="color: #D8DEE9FF">       </span><span style="color: #8FBCBB">page_content</span><span style="color: #D8DEE9FF">=</span><span style="color: #8FBCBB">page_content</span><span style="color: #ECEFF4">,</span></span>
<span class="line"><span style="color: #D8DEE9FF">       </span><span style="color: #8FBCBB">metadata</span><span style="color: #D8DEE9FF">=</span><span style="color: #ECEFF4">{</span></span>
<span class="line"><span style="color: #D8DEE9FF">           &quot;</span><span style="color: #8FBCBB">product</span><span style="color: #D8DEE9FF">&quot;: &quot;</span><span style="color: #8FBCBB">TV</span><span style="color: #D8DEE9FF">&quot;</span><span style="color: #ECEFF4">,</span></span>
<span class="line"><span style="color: #D8DEE9FF">       </span><span style="color: #ECEFF4">},</span></span>
<span class="line"><span style="color: #D8DEE9FF">   )</span><span style="color: #ECEFF4">,</span></span>
<span class="line"><span style="color: #D8DEE9FF">]</span></span>
<span class="line"><span style="color: #8FBCBB">vectorstore</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">Weaviate</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">from_documents</span><span style="color: #D8DEE9FF">(</span></span>
<span class="line"><span style="color: #D8DEE9FF">   </span><span style="color: #8FBCBB">docs</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">embeddings</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">weaviate_url</span><span style="color: #D8DEE9FF">=</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">http://127.0.0.1:8080</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line"><span style="color: #D8DEE9FF">)</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">metadata_field_info</span><span style="color: #D8DEE9FF"> = [</span></span>
<span class="line"><span style="color: #D8DEE9FF">   </span><span style="color: #8FBCBB">AttributeInfo</span><span style="color: #D8DEE9FF">(</span></span>
<span class="line"><span style="color: #D8DEE9FF">       </span><span style="color: #8FBCBB">name</span><span style="color: #D8DEE9FF">=</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">product</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span></span>
<span class="line"><span style="color: #D8DEE9FF">       </span><span style="color: #8FBCBB">description</span><span style="color: #D8DEE9FF">=</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">The name of the product for which the subscription prices are</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span></span>
<span class="line"><span style="color: #D8DEE9FF">       </span><span style="color: #8FBCBB">type</span><span style="color: #D8DEE9FF">=</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">string</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span></span>
<span class="line"><span style="color: #D8DEE9FF">   )</span><span style="color: #ECEFF4">,</span></span>
<span class="line"><span style="color: #D8DEE9FF">]</span></span>
<span class="line"><span style="color: #8FBCBB">document_content_description</span><span style="color: #D8DEE9FF"> = </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Details for all the products offers</span><span style="color: #ECEFF4">&quot;</span></span>
<span class="line"><span style="color: #8FBCBB">llm</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">ChatOpenAI</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">model_name</span><span style="color: #D8DEE9FF">=</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">gpt-3.5-turbo</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">temperature</span><span style="color: #D8DEE9FF">=0.</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">verbose</span><span style="color: #D8DEE9FF">=</span><span style="color: #8FBCBB">True</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"><span style="color: #8FBCBB">retriever</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">SelfQueryRetriever</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">from_llm</span><span style="color: #D8DEE9FF">(</span></span>
<span class="line"><span style="color: #D8DEE9FF">   </span><span style="color: #8FBCBB">llm</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">vectorstore</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">document_content_description</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">metadata_field_info</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">verbose</span><span style="color: #D8DEE9FF">=</span><span style="color: #8FBCBB">True</span></span>
<span class="line"><span style="color: #D8DEE9FF">)</span></span>
<span class="line"><span style="color: #8FBCBB">qa_with_self_query</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">RetrievalQA</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">from_llm</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">llm</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">retriever</span><span style="color: #D8DEE9FF">=</span><span style="color: #8FBCBB">retriever</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">return_source_documents</span><span style="color: #D8DEE9FF">=</span><span style="color: #8FBCBB">True</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">offer</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">in</span><span style="color: #D8DEE9FF"> [</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">TV</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Internet</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Car</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Gym membership</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">]:</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #8FBCBB">res</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">qa_with_self_query</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">f</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">What is the {offer} subscription price for one year?</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #8FBCBB">print</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">f</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">n docs: {len(res[&#39;source_documents&#39;])}, answer: {res[&#39;result&#39;]}</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">)</span></span></code></pre></div>






<p>This is the result produced by the code above:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="query='TV subscription price' 
filter=Comparison(comparator=<Comparator.EQ: 'eq'&gt;, attribute='product', value='TV') limit=None
n docs: 1, answer: The TV subscription price for one year is 200 USD.
------------------------------------------------
query='Internet subscription price' filter=Comparison(comparator=<Comparator.EQ: 'eq'&gt;, attribute='product', value='Internet') limit=None
n docs: 0, answer: I'm sorry, but I don't have access to specific pricing information …
query='Car subscription price' filter=Comparison(comparator=<Comparator.EQ: 'eq'&gt;, attribute='product', value='Car') limit=None
n docs: 0, answer: I'm sorry, but I don't have enough information …
query='Gym membership subscription price' filter=Comparison(comparator=<Comparator.EQ: 'eq'&gt;, attribute='product', value='Gym membership') limit=None
n docs: 0, answer: I'm sorry, but I don't have access to specific pricing information for gym memberships. …" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9">query</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">TV subscription price</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF"> </span></span>
<span class="line"><span style="color: #D8DEE9">filter</span><span style="color: #81A1C1">=</span><span style="color: #88C0D0">Comparison</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">comparator</span><span style="color: #81A1C1">=&lt;</span><span style="color: #D8DEE9">Comparator</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">EQ</span><span style="color: #D8DEE9FF">: </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">eq</span><span style="color: #ECEFF4">&#39;</span><span style="color: #81A1C1">&gt;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">attribute</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">product</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">value</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">TV</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">) </span><span style="color: #D8DEE9">limit</span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9">None</span></span>
<span class="line"><span style="color: #D8DEE9">n</span><span style="color: #D8DEE9FF"> docs</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">1</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> answer</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">The</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">TV</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">price</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">one</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">is</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">200</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #ECEFF4">.</span></span>
<span class="line"><span style="color: #81A1C1">------------------------------------------------</span></span>
<span class="line"><span style="color: #D8DEE9">query</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Internet subscription price</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">filter</span><span style="color: #81A1C1">=</span><span style="color: #88C0D0">Comparison</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">comparator</span><span style="color: #81A1C1">=&lt;</span><span style="color: #D8DEE9">Comparator</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">EQ</span><span style="color: #D8DEE9FF">: </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">eq</span><span style="color: #ECEFF4">&#39;</span><span style="color: #81A1C1">&gt;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">attribute</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">product</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">value</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Internet</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">) </span><span style="color: #D8DEE9">limit</span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9">None</span></span>
<span class="line"><span style="color: #D8DEE9">n</span><span style="color: #D8DEE9FF"> docs</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">0</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> answer</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">I</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">m sorry, but I don</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9">t</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">have</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">access</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">to</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">specific</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">pricing</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">information</span><span style="color: #D8DEE9FF"> …</span></span>
<span class="line"><span style="color: #D8DEE9">query</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Car subscription price</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">filter</span><span style="color: #81A1C1">=</span><span style="color: #88C0D0">Comparison</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">comparator</span><span style="color: #81A1C1">=&lt;</span><span style="color: #D8DEE9">Comparator</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">EQ</span><span style="color: #D8DEE9FF">: </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">eq</span><span style="color: #ECEFF4">&#39;</span><span style="color: #81A1C1">&gt;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">attribute</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">product</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">value</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Car</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">) </span><span style="color: #D8DEE9">limit</span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9">None</span></span>
<span class="line"><span style="color: #D8DEE9">n</span><span style="color: #D8DEE9FF"> docs</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">0</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> answer</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">I</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">m sorry, but I don</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9">t</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">have</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">enough</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">information</span><span style="color: #D8DEE9FF"> …</span></span>
<span class="line"><span style="color: #D8DEE9">query</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Gym membership subscription price</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">filter</span><span style="color: #81A1C1">=</span><span style="color: #88C0D0">Comparison</span><span style="color: #D8DEE9FF">(</span><span style="color: #D8DEE9">comparator</span><span style="color: #81A1C1">=&lt;</span><span style="color: #D8DEE9">Comparator</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">EQ</span><span style="color: #D8DEE9FF">: </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">eq</span><span style="color: #ECEFF4">&#39;</span><span style="color: #81A1C1">&gt;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">attribute</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">product</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">value</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Gym membership</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">) </span><span style="color: #D8DEE9">limit</span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9">None</span></span>
<span class="line"><span style="color: #D8DEE9">n</span><span style="color: #D8DEE9FF"> docs</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">0</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> answer</span><span style="color: #ECEFF4">:</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">I</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">m sorry, but I don</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9">t</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">have</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">access</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">to</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">specific</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">pricing</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">information</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">gym</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">memberships</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF"> …</span></span></code></pre></div>
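<p>To make the effect of the <code>eq</code> filter concrete, here is a rough plain-Python sketch of exact-match metadata filtering. It is only an illustration of the idea, not how LangChain or Weaviate implement it; the names <code>Doc</code> and <code>filter_eq</code> are hypothetical:</p>

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a stored document with metadata."""
    text: str
    metadata: dict = field(default_factory=dict)


def filter_eq(docs, attribute, value):
    """Keep only documents whose metadata[attribute] equals value."""
    return [d for d in docs if d.metadata.get(attribute) == value]


# A single stored document, tagged with the product it describes.
docs = [
    Doc("Subscription prices for 1 year:\n200 USD / year", {"product": "TV"}),
]

for product in ["TV", "Internet", "Car", "Gym membership"]:
    matches = filter_eq(docs, "product", product)
    print(f"{product}: n docs = {len(matches)}")
```

<p>Only the question about the TV offer gets a document back. For the other products the candidate list is empty before any similarity ranking happens, which is why the model reports that it lacks the information instead of quoting the TV prices.</p>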






<p>The drawback of this approach is that it is significantly more expensive, because additional LLM calls are needed to provide this functionality. With LangChain&#8217;s OpenAI callback, we can easily monitor API usage: the first solution costs about 0.00015 USD per question, whereas the second costs about 0.0015 USD, a tenfold increase.</p>
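<p>As a back-of-the-envelope illustration, the per-question figures quoted above can be projected onto a workload. The question volumes below are made-up assumptions, and actual costs depend on the model, its pricing, and the prompt lengths:</p>

```python
COST_PLAIN = 0.00015      # USD per question, first solution (figure from the text)
COST_SELF_QUERY = 0.0015  # USD per question, self-query (figure from the text)


def projected_cost(cost_per_question, n_questions):
    """Total cost in USD for a given number of questions."""
    return cost_per_question * n_questions


ratio = COST_SELF_QUERY / COST_PLAIN
print(f"self-query is about {ratio:.0f}x more expensive per question")

# Hypothetical monthly volumes, chosen only for illustration.
for n in (1_000, 100_000):
    plain = projected_cost(COST_PLAIN, n)
    self_query = projected_cost(COST_SELF_QUERY, n)
    print(f"{n} questions: {plain:.2f} USD vs {self_query:.2f} USD")
```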



<p>We also have to keep in mind that creating metadata for split documents might not be trivial and may need human supervision.</p>



<p>All things considered, it is no surprise that an LLM will only be as good as the data you provide to it: the more detailed and relevant the information, the higher the chance of getting a good response. Self-querying is a powerful technique that might be useful in the project you are working on. Whether to use it, and how to provide the metadata, depends on the specific business problem to be solved.</p>
<p>The post <a href="https://tantusdata.com/insights/what-if-the-data-is-too-large-for-the-llm-context/">What if the data is too large for the LLM context?</a> appeared first on <a href="https://tantusdata.com">TantusData</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>What Data Format is suitable for LLM?</title>
		<link>https://tantusdata.com/insights/what-data-format-is-suitable-for-llm/</link>
		
		<dc:creator><![CDATA[Bartek Sadlej]]></dc:creator>
		<pubDate>Tue, 26 Sep 2023 09:23:17 +0000</pubDate>
				<category><![CDATA[ChatBot]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[Embeddings]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[RAG]]></category>
		<guid isPermaLink="false">https://tantusdata.com/?post_type=insights&#038;p=1825</guid>

					<description><![CDATA[<p>LLM's Frontier: The Data Format Issue</p>
<p>The post <a href="https://tantusdata.com/insights/what-data-format-is-suitable-for-llm/">What Data Format is suitable for LLM?</a> appeared first on <a href="https://tantusdata.com">TantusData</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="https://tantusdata.com/app/uploads/2023/09/Data.format.LLM_.tantusData-1024x576.jpg" alt="" class="wp-image-1840" srcset="https://tantusdata.com/app/uploads/2023/09/Data.format.LLM_.tantusData-1024x576.jpg 1024w, https://tantusdata.com/app/uploads/2023/09/Data.format.LLM_.tantusData-300x169.jpg 300w, https://tantusdata.com/app/uploads/2023/09/Data.format.LLM_.tantusData-768x432.jpg 768w, https://tantusdata.com/app/uploads/2023/09/Data.format.LLM_.tantusData.jpg 1500w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h2 class="wp-block-heading"><strong>Unpacking LLM: From Hype to Reality</strong></h2>



<p>Since the release of ChatGPT and, more recently, the Llama-v2 models, building context-aware LLM applications has become increasingly popular. One such use case is Question Answering over documents. Many focus on cool prototype examples where, after feeding lots of Wikipedia articles to the vector database, one can confirm that Joe Biden is indeed the president of the United States. Only a few focus on current limitations and unsolved problems, which make it challenging to arrive at production-ready applications.</p>



<p>At TantusData, we have been paying close attention to the weak spots that need to be addressed to provide the desired functionality. We will present them in upcoming articles. In this article, we start with the challenges related to the data format.</p>



<p>The code examples use LangChain, currently the most popular library for creating LLM applications.</p>



<h2 class="wp-block-heading"><strong>Data Format</strong></h2>



<p>Imagine that you are building a chatbot to answer users&#8217; questions based on company offers.</p>



<p>The question we will be testing is:&nbsp;</p>



<p>“What are the prices for the internet subscription?”&nbsp;</p>



<p>Let’s assume for now that our database contains a document with the relevant data:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="TV Prices:


Subscription prices for 1 month:
L - 40 USD / month
M - 30 USD / month
S - 20 USD / month


Subscription prices for 1 year:
L - 400 USD / year
M - 300 USD / year
S - 200 USD / year


Internet Prices:


Subscription prices for 1 month:
L - 50 USD / month
M - 25 USD / month
S - 10 USD / month


Subscription prices for 1 year:
L - 500 USD / year
M - 250 USD / year
S - 100 USD / year


Phone prices:


Subscription prices for 1 month:
L - 18 USD / month
M - 12 USD / month
S - 6 USD / month


Subscription prices for 1 year:
L - 180 USD / year
M - 120 USD / year
S - 60 USD / year" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9">TV</span><span style="color: #D8DEE9FF"> Prices</span><span style="color: #ECEFF4">:</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">Subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">prices</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">1</span><span style="color: #D8DEE9FF"> month</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9">L</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">40</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"><span style="color: #D8DEE9">M</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">30</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"><span style="color: #D8DEE9">S</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">20</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">Subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">prices</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">1</span><span style="color: #D8DEE9FF"> year</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9">L</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">400</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"><span style="color: #D8DEE9">M</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">300</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"><span style="color: #D8DEE9">S</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">200</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">Internet</span><span style="color: #D8DEE9FF"> Prices</span><span style="color: #ECEFF4">:</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">Subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">prices</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">1</span><span style="color: #D8DEE9FF"> month</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9">L</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">50</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"><span style="color: #D8DEE9">M</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">25</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"><span style="color: #D8DEE9">S</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">10</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">Subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">prices</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">1</span><span style="color: #D8DEE9FF"> year</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9">L</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">500</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"><span style="color: #D8DEE9">M</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">250</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"><span style="color: #D8DEE9">S</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">100</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">Phone</span><span style="color: #D8DEE9FF"> prices</span><span style="color: #ECEFF4">:</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">Subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">prices</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">1</span><span style="color: #D8DEE9FF"> month</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9">L</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">18</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"><span style="color: #D8DEE9">M</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">12</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"><span style="color: #D8DEE9">S</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">6</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9">Subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">prices</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">1</span><span style="color: #D8DEE9FF"> year</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9">L</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">180</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"><span style="color: #D8DEE9">M</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">120</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"><span style="color: #D8DEE9">S</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">60</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span></code></pre></div>






<p>When we provide it as context for the question, we get the desired answer:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="from langchain.chains import RetrievalQA
….


doc = Document(page_content=...)


class ConstRetriever(BaseRetriever):
   def _get_relevant_documents(self, *args, **kwargs) -&gt; List[Document]:
       return [doc]


llm = ChatOpenAI(model_name=&quot;gpt-3.5-turbo&quot;)
retriever = ConstRetriever()
qa = RetrievalQA.from_llm(llm, retriever=retriever)
print(qa(&quot;What are the prices for the internet subscription?&quot;)['result'])" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9">from</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">langchain</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9">chains</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">RetrievalQA</span></span>
<span class="line"><span style="color: #D8DEE9FF">….</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">doc</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">Document</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">page_content</span><span style="color: #D8DEE9FF">=...)</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">class</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">ConstRetriever</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">BaseRetriever</span><span style="color: #D8DEE9FF">):</span></span>
<span class="line"><span style="color: #D8DEE9FF">   </span><span style="color: #8FBCBB">def</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">_get_relevant_documents</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">self</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">*</span><span style="color: #8FBCBB">args</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">**</span><span style="color: #8FBCBB">kwargs</span><span style="color: #D8DEE9FF">) -&gt; </span><span style="color: #8FBCBB">List</span><span style="color: #D8DEE9FF">[</span><span style="color: #8FBCBB">Document</span><span style="color: #D8DEE9FF">]:</span></span>
<span class="line"><span style="color: #D8DEE9FF">       </span><span style="color: #8FBCBB">return</span><span style="color: #D8DEE9FF"> [</span><span style="color: #8FBCBB">doc</span><span style="color: #D8DEE9FF">]</span></span>
<span class="line"></span>
<span class="line"></span>
<span class="line"><span style="color: #8FBCBB">llm</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">ChatOpenAI</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">model_name</span><span style="color: #D8DEE9FF">=</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">gpt-3.5-turbo</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"><span style="color: #8FBCBB">retriever</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">ConstRetriever</span><span style="color: #D8DEE9FF">()</span></span>
<span class="line"><span style="color: #8FBCBB">qa</span><span style="color: #D8DEE9FF"> = </span><span style="color: #8FBCBB">RetrievalQA</span><span style="color: #D8DEE9FF">.</span><span style="color: #8FBCBB">from_llm</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">llm</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #8FBCBB">retriever</span><span style="color: #D8DEE9FF">=</span><span style="color: #8FBCBB">retriever</span><span style="color: #D8DEE9FF">)</span></span>
<span class="line"><span style="color: #8FBCBB">print</span><span style="color: #D8DEE9FF">(</span><span style="color: #8FBCBB">qa</span><span style="color: #D8DEE9FF">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">What are the prices for the internet subscription?</span><span style="color: #ECEFF4">&quot;</span><span style="color: #D8DEE9FF">)[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">result</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF">])</span></span></code></pre></div>






<p>The code produces the following answer from the model:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="The prices for the internet subscription are as follows:

1 month:
L - 50 USD / month
M - 25 USD / month
S - 10 USD / month

1 year:
L - 500 USD / year
M - 250 USD / year
S - 100 USD / year" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9">The</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">prices</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">for</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">the</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">internet</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">subscription</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">are</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">as</span><span style="color: #D8DEE9FF"> follows:</span></span>
<span class="line"></span>
<span class="line"><span style="color: #B48EAD">1</span><span style="color: #D8DEE9FF"> month</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9">L</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">50</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"><span style="color: #D8DEE9">M</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">25</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"><span style="color: #D8DEE9">S</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">10</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">month</span></span>
<span class="line"></span>
<span class="line"><span style="color: #B48EAD">1</span><span style="color: #D8DEE9FF"> year</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9">L</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">500</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"><span style="color: #D8DEE9">M</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">250</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span>
<span class="line"><span style="color: #D8DEE9">S</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">-</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">100</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">USD</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">/</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">year</span></span></code></pre></div>
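
<p>Conceptually, providing a document as context means placing its text in the prompt next to the question. The sketch below illustrates the idea in plain Python. The template wording and the <code>build_prompt</code> helper are assumptions made for illustration; they are not the prompt or the API that LangChain uses internally.</p>

```python
# Hypothetical template: the wording is an assumption for illustration,
# not the prompt that LangChain actually sends to the model.
PROMPT_TEMPLATE = (
    "Use the following context to answer the question.\n"
    "If the answer is not in the context, say that you don't know.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(documents, question):
    """Join the retrieved documents and place them in the prompt (hypothetical helper)."""
    context = "\n\n".join(documents)
    return PROMPT_TEMPLATE.format(context=context, question=question)


# The Internet section of the example document above.
internet_prices = "\n".join([
    "Internet Prices:",
    "Subscription prices for 1 month:",
    "L - 50 USD / month",
    "M - 25 USD / month",
    "S - 10 USD / month",
    "Subscription prices for 1 year:",
    "L - 500 USD / year",
    "M - 250 USD / year",
    "S - 100 USD / year",
])

prompt = build_prompt(
    [internet_prices],
    "What are the prices for the internet subscription?",
)
print(prompt)
```

<p>Whatever text the retriever returns is what the model reads, so the quality of the answer depends directly on how readable that text is.</p>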






<p>Great, it worked. So are we good to go? Well, not really. The hidden problem is that we usually don&#8217;t have the relevant documents in such a clean text format. Usually, the data comes from scraping web pages or parsing PDFs, and it might originally be displayed as a table. When you think about it, the reasons are quite natural. The idea often comes from a business unit that would like to reduce customer service effort. Customer service works with these documents because they are easy for humans to read. So, when we just mimic what a person does with the document, we might put ourselves in a tricky situation.</p>



<p>What we can do is pick one of the loaders available in LangChain, but we may end up with text that is not convenient for humans to read. Maybe it will be good enough for the model? Let’s see.</p>



<p>Let’s look at an example of a table in a PDF document.</p>



<figure class="wp-block-table">
    <table>
        <tbody>
            <tr>
                <td><strong>Service</strong></td>
                <td><strong>Period</strong></td>
                <td><strong>Subscription</strong></td>
                <td><strong>Price</strong></td>
            </tr>
            <tr>
                <td rowspan="6">TV</td>
                <td rowspan="3">Month</td>
                <td>S</td>
                <td>20 USD</td>
            </tr>
            <tr>
                <td>M</td>
                <td>30 USD</td>
            </tr>
            <tr>
                <td>L</td>
                <td>400 USD</td>
            </tr>
            <tr>
                <td rowspan="3">Year</td>
                <td>S</td>
                <td>200 USD</td>
            </tr>
            <tr>
                <td>M</td>
                <td>300 USD</td>
            </tr>
            <tr>
                <td>L</td>
                <td>400 USD</td>
            </tr>
            <tr>
                <td rowspan="6">Internet</td>
                <td rowspan="3">Month</td>
                <td>S</td>
                <td>10 USD</td>
            </tr>
            <tr>
                <td>M</td>
                <td>25 USD</td>
            </tr>
            <tr>
                <td>L</td>
                <td>50 USD</td>
            </tr>
            <tr>
                <td rowspan="3">Year</td>
                <td>S</td>
                <td>100 USD</td>
            </tr>
            <tr>
                <td>M</td>
                <td>250 USD</td>
            </tr>
            <tr>
                <td>L</td>
                <td>500 USD</td>
            </tr>
            <tr><td rowspan="6" style="border: 0;">Phone</td><td rowspan="3">Month</td><td>S</td><td>6 USD</td></tr>
            <tr><td>M</td><td>12 USD</td></tr>
            <tr><td>L</td><td>18 USD</td></tr>
            <tr><td rowspan="3" style="border: 0;">Year</td><td>S</td><td>60 USD</td></tr>
            <tr><td>M</td><td>120 USD</td></tr>
            <tr><td style="border: 0;">L</td><td style="border: 0;">180 USD</td></tr>
        </tbody>
    </table>
    <figcaption class="wp-element-caption">Table 1</figcaption>
</figure>



<p>When we try to extract the text from the table, the output depends on the PDF loader we select:</p>



<figure class="wp-block-table wp-block-table--scrolled"><table><tbody><tr><td><mark style="background-color:rgba(0, 0, 0, 0);color:#c8855a" class="has-inline-color"><strong>UnstructuredPDFLoader</strong></mark></td><td><mark style="background-color:rgba(0, 0, 0, 0);color:#c8855a" class="has-inline-color"><strong>PDFMinerLoader</strong></mark></td><td><mark style="background-color:rgba(0, 0, 0, 0);color:#c8855a" class="has-inline-color"><strong>PDFPlumberLoader</strong></mark></td><td><mark style="background-color:rgba(0, 0, 0, 0);color:#c8855a" class="has-inline-color"><strong>PyPDFLoader</strong></mark></td></tr><tr><td>Service Period<br>month<br>TV<br>year<br>month<br>Internet<br>year<br>month<br>Phone<br>year<br>Subscription<br>Price<br>20 USD<br>30 USD<br>400 USD<br>200 USD<br>300 USD<br>400 USD<br>10 USD<br>25 USD<br>50 USD<br>100 USD<br>250 USD<br>500 USD<br>6 USD<br>12 USD<br>18 USD<br>60 USD<br>120 USD<br>180 USD</td><td>Service<br>Period<br>Subscription<br>Price<br>TV<br>Internet<br>Phone<br>month<br>year<br>month<br>year<br>month<br>year<br>S<br>M<br>L<br>S<br>M<br>L<br>S<br>M<br>L<br>S<br>M<br>L<br>S<br>M<br>L<br>S<br>M<br>L<br>20 USD<br>30 USD<br>400 USD<br>200 USD<br>300 USD<br>400 USD<br>10 USD<br>25 USD<br>50 USD<br>100 USD<br>250 USD<br>500 USD<br>6 USD<br>12 USD<br>18 USD<br>60 USD<br>120 USD<br>180 USD</td><td>Service Period <br>Subscription Price<br>S 20 USD<br>month M 30 USD<br>L 400 USD<br>TV<br>S 200 USD<br>year M 300 USD<br>L 400 USD<br>S 10 USD<br>month M 25 USD<br>L 50 USD<br>Internet<br>S 100 USD<br>year M 250 USD<br>L 500 USD<br>S 6 USD<br>month M 12 USD<br>L 18 USD<br>Phone<br>S 60 USD<br>year M 120 USD<br>L 180 USD</td><td>Service Period <br>Subscription Price<br>TVmonthS 20 USD<br>M 30 USD<br>L 400 USD<br>yearS 200 USD<br>M 300 USD<br>L 400 USD<br>InternetmonthS 10 USD<br>M 25 USD<br>L 50 USD<br>yearS 100 USD<br>M 250 USD<br>L 500 USD<br>PhonemonthS 6 USD<br>M 12 USD<br>L 18 USD<br>yearS 60 USD<br>M 120 USD<br>L 180 USD</td></tr></tbody></table><figcaption class="wp-element-caption">Table 2</figcaption></figure>



<p>As you probably noticed, there are significant differences in the results.</p>
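
<p>One way to build intuition for these differences is to look at the order in which the cells of a table are read. The sketch below is a simplified illustration in plain Python, not how any of the loaders above is implemented. It takes the Internet rows of Table 1 and emits them once row by row and once column by column; the function names and the handling of merged cells are assumptions made for this example.</p>

```python
# The Internet rows of Table 1, with the merged cells written out in full.
ROWS = [
    ("Internet", "month", "S", "10 USD"),
    ("Internet", "month", "M", "25 USD"),
    ("Internet", "month", "L", "50 USD"),
    ("Internet", "year", "S", "100 USD"),
    ("Internet", "year", "M", "250 USD"),
    ("Internet", "year", "L", "500 USD"),
]


def row_major(rows):
    """Read the table row by row: each price stays next to its labels."""
    return [" ".join(row) for row in rows]


def column_major(rows):
    """Read the table column by column, emitting each merged cell once."""
    lines = []
    for column in zip(*rows):
        previous = None
        for cell in column:
            # Simplification: a merged cell spans several rows, so consecutive
            # repeats within a column are treated as one cell and skipped.
            if cell != previous:
                lines.append(cell)
            previous = cell
    return lines


print("\n".join(row_major(ROWS)))
print("---")
print("\n".join(column_major(ROWS)))
```

<p>In the row-by-row output every price sits on the same line as its service, period, and subscription size. In the column-by-column output the labels and the prices end up far apart, and the reader has to work out which price belongs to which label.</p>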



<p>This extracted text is probably not what we would hope for. Still, let&#8217;s check whether the model can get the correct answer from it:</p>



<figure class="wp-block-table wp-block-table--scrolled"><table><tbody><tr><td>loader</td><td><strong><mark style="background-color:rgba(0, 0, 0, 0);color:#2b3340" class="has-inline-color">gpt-3.5-turbo</mark></strong></td><td><strong><mark style="background-color:rgba(0, 0, 0, 0);color:#2b3340" class="has-inline-color">gpt-4</mark></strong></td></tr><tr><td><mark style="background-color:rgba(0, 0, 0, 0);color:#c8855a" class="has-inline-color"><strong>UnstructuredPDFLoader</strong></mark></td><td><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">The prices for the internet subscription are as follows:<br>&#8211; 20 USD per month</mark><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">&#8211; 200 USD per year</mark><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">&#8211; 400 USD for 2 years</mark></td><td><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">The text doesn&#8217;t provide specific prices for an internet subscription.</mark><br><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color"><br></mark><br></td></tr><tr><td><mark style="background-color:rgba(0, 0, 0, 0);color:#c8855a" class="has-inline-color"><strong>PDFMinerLoader</strong></mark></td><td><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">The prices for the internet subscription are as follows:<br>&#8211; Small (S) package: $20 per month or $200 per year</mark><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">&#8211; Medium (M) package: $30 per month or $300 per year</mark><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">&#8211; Large (L) package: $40 per month or $400 per year</mark></td><td><mark style="background-color:rgba(0, 0, 0, 0)" 
class="has-inline-color has-vivid-red-color">The prices for the internet subscription are:<br>&#8211; Small (S) size: 200 USD per month / 100 USD per year</mark><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">&#8211; Medium (M) size: 300 USD per month / 250 USD per year</mark><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">&#8211; Large (L) size: 400 USD per month / 500 USD per year<br></mark></td></tr><tr><td><mark style="background-color:rgba(0, 0, 0, 0);color:#c8855a" class="has-inline-color"><strong>PDFPlumberLoader</strong></mark></td><td><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">The prices for the internet subscription are as follows:<br>&#8211; S: 100 USD per year or 6 USD per month</mark><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">&#8211; M: 250 USD per year or 12 USD per month</mark><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">&#8211; L: 500 USD per year or 18 USD per month</mark></td><td><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">The prices for the internet subscription are:<br>For the S plan: 100 USD per year or 6 USD per month</mark><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">For the M plan: 250 USD per year or 12 USD per month</mark><br><mark style="background-color:rgba(0, 0, 0, 0)" class="has-inline-color has-vivid-red-color">For the L plan: 500 USD per year or 18 USD per month<br></mark></td></tr><tr><td><mark style="background-color:rgba(0, 0, 0, 0);color:#c8855a" class="has-inline-color"><strong>PyPDFLoader</strong></mark></td><td><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">The prices for the internet subscription are as follows:<br>&#8211; For the S 
(Small) plan:&nbsp;&nbsp;&nbsp;</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; Monthly subscription: 10 USD&nbsp;&nbsp;</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; Yearly subscription: 100 USD<br>&#8211; For the M (Medium) plan:&nbsp;&nbsp;&nbsp;</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; Monthly subscription: 25 USD&nbsp;&nbsp;&nbsp;</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; Yearly subscription: 250 USD<br>&#8211; For the L (Large) plan:&nbsp;&nbsp;&nbsp;</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; Monthly subscription: 50 USD&nbsp;&nbsp;&nbsp;</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; Yearly subscription: 500 USD</mark></td><td><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">The prices for the internet subscription are:<br>For a month:</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; S: 10 USD</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; M: 25 USD</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; L: 50 USD<br>For a year:</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; S: 100 USD</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; M: 250 USD</mark><br><mark style="background-color:rgba(0, 0, 0, 0);color:#827a02" class="has-inline-color">&#8211; L: 500 USD</mark><br><br><br><br><br><br><br><br></td></tr></tbody></table><figcaption class="wp-element-caption">table 3 version 
selected</figcaption></figure>



<p>As we can see, only one of the four loaders, PyPDFLoader, parsed the file in a way that the chat model could understand.</p>



<p>That is why creating a reliable data source for a question-answering model is often not straightforward: the extracted text should be inspected carefully, because it usually needs to be corrected. The problem is also easy to overlook, because a loader that fails does so silently.</p>
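
<p>One way to make such silent failures visible is a small smoke test that compares each loader's extracted text with a few hand-verified facts from the source document. The sketch below is a minimal illustration, not a complete solution: the function names and the sample strings are hypothetical, and the expected prices are assumed to be the ones shown in the PyPDFLoader row of the table above. It only checks that the numbers are present, so it cannot prove that each price is still attached to the right plan.</p>

```python
import re

# Hand-verified prices in USD as (monthly, yearly) per plan. ASSUMPTION: these
# are the values from the PyPDFLoader row of the table above.
EXPECTED_PRICES = {"S": (10, 100), "M": (25, 250), "L": (50, 500)}


def missing_prices(extracted_text, expected=EXPECTED_PRICES):
    """Return the expected prices that never occur as whole numbers in the text."""
    found = {int(token) for token in re.findall(r"\d+", extracted_text)}
    wanted = {price for pair in expected.values() for price in pair}
    return sorted(wanted - found)


def check_loaders(texts_by_loader):
    """Map each loader name to the prices missing from its extracted text."""
    return {name: missing_prices(text) for name, text in texts_by_loader.items()}


if __name__ == "__main__":
    # Hypothetical extraction results, written by hand for illustration only.
    samples = {
        "loader_a": "Plan S M L\nmonth 10 25 50\nyear 100 250 500",
        "loader_b": "Plan S M L\nyear 100 250 500",
    }
    for name, missing in check_loaders(samples).items():
        status = "ok" if not missing else f"missing prices: {missing}"
        print(f"{name}: {status}")
```

<p>A check like this is cheap enough to run on every document update. A loader that passes it should still be reviewed by hand, because a scrambled table can contain all the right numbers in the wrong places.</p>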



<h2 class="wp-block-heading">In summary &#8211; What can we do about the situation described?</h2>



<ul class="wp-block-list">
<li>First of all, careful engineering is a must, and that starts with spotting that the problem exists. If you expected shortcuts, I’m sorry to disappoint you: it’s very easy to build a system that hallucinates very convincingly.</li>



<li>Plan to test the results with domain experts.</li>



<li>Very likely, the PDF is generated from a billing system database. Relying on that database directly will simplify data extraction and make it much easier to keep up with updates.</li>



<li>Even if you rely on a database, you should still have data quality checks in place.</li>



<li>Last but not least &#8211; remember that Generative AI does not solve all possible computer science problems. In the case above, we are struggling with searching for documents in the first place. This problem still needs careful engineering &#8211; only when you provide some reasonable structure can you benefit from the LLM magic &#8211; human-like interactions and understanding how the data fits into the question being asked.</li>
</ul>
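<p>The points about relying on a database and keeping data quality checks in place can be sketched as follows. This is a minimal sketch under stated assumptions: the <code>PlanPrice</code> record, the function names, and the rule that a yearly price should not exceed twelve monthly payments are all hypothetical and would need to be replaced with the rules of the actual billing system. The idea is to validate structured rows first and only then render them into unambiguous text for the model.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanPrice:
    """Hypothetical row as it might come from a billing database."""
    plan: str
    monthly_usd: float
    yearly_usd: float


def validate(rows):
    """Return a list of human-readable problems; an empty list means the rows passed."""
    problems = []
    seen = set()
    for row in rows:
        if row.plan in seen:
            problems.append(f"duplicate plan {row.plan}")
        seen.add(row.plan)
        if row.monthly_usd <= 0 or row.yearly_usd <= 0:
            problems.append(f"non-positive price for plan {row.plan}")
        elif row.yearly_usd > 12 * row.monthly_usd:
            # ASSUMPTION: an illustrative business rule, not taken from the article.
            problems.append(f"yearly price for plan {row.plan} exceeds 12 monthly payments")
    return problems


def to_context(rows):
    """Render validated rows as one explicit sentence per plan for the prompt."""
    problems = validate(rows)
    if problems:
        raise ValueError("; ".join(problems))
    return "\n".join(
        f"Plan {r.plan}: {r.monthly_usd:g} USD per month, {r.yearly_usd:g} USD per year"
        for r in rows
    )


if __name__ == "__main__":
    rows = [PlanPrice("S", 10, 100), PlanPrice("M", 25, 250), PlanPrice("L", 50, 500)]
    print(to_context(rows))
```

<p>Raising an error instead of passing questionable rows to the model keeps the failure loud, which is the opposite of what happened with the PDF loaders.</p>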
<p>The post <a href="https://tantusdata.com/insights/what-data-format-is-suitable-for-llm/">What Data Format is suitable for LLM?</a> appeared first on <a href="https://tantusdata.com">TantusData</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
