
Context Engineering 101: How to Build AI Agents That Actually Work

By myaicommunity || Learn AI PM with Mahesh

Summary

## Key takeaways

- **Agents Break DAGs**: Coding agents like Lovable break the traditional directed acyclic graph (DAG) model by determining the next step at runtime based on user events, dynamically creating steps like building pages, getting pictures, and updating databases. [05:25], [06:24]
- **Context Engineering Defined**: Context engineering is the art of managing and passing the right context (what has happened before) at each step of an agent so it can determine the next action successfully. [07:55], [08:04]
- **Ingredients of Good Context**: Good context consists of memory, tools, signals, and knowledge provided at each agent step, such as HR policy for holidays, an employee-type signal, and a search tool for booking. [10:20], [11:26]
- **Prompt is Subset of Context**: Prompt engineering is just a small part of context engineering; prompts are user or system instructions, but full context also includes step-by-step memory, knowledge, signals, and tools. [17:32], [18:05]
- **NDA Context Failure Example**: Without proper context, the Legal Graph AI said an NDA 'looks mostly fine'; with full context such as company policies and tools, it flagged an unlimited indemnification clause that could bankrupt a company and a perpetual auto-renewal. [21:06], [22:11]
- **Context Overload Causes Drop**: Stanford research shows that by the 60th step, a context that had grown to 18,000 tokens collapsed to 122 tokens and accuracy dropped 10 points, forcing agents to restart due to confusion and distraction. [30:55], [31:18]

Topics Covered

  • Agents Break DAGs with Runtime Steps
  • Context Engineering Manages Agent Memory
  • Prompt Subset of Vast Context
  • Bad Context Causes 70% AI Failures
  • Context Explosion Drops Accuracy

Full Transcript

Okay, good morning. I think these days, at least in the first few weeks of a new cohort, it always feels like I'm teaching the cohort class. Today's topic is context engineering: challenges and tips for PMs. I will talk about what context engineering is. If you have questions, I'm experimenting with something new today. I put a scramler for all of you. This is for the community; we do it in the cohort, but today I'm putting it here also. Puja and our team will make sure that you can put your questions. You can put your questions in chat, and our team can move them here. Ideally, you should move them here yourself. Okay. And

before we go to context engineering, let's talk about a short history of agentic AI, with an example of how we reached here and what agents are promising, taking coding agents as one good example. Coding agents: anybody know coding agents? What are coding agents? Give me examples.

>> Codex.
>> Can I say Lovable? I like them because they offer the highest level of abstraction.
>> So non-coders can be coders. That's why I put them on the very top layer, the one that offers the highest level of abstraction in coding agents: they write the whole code for you and you don't even need to know it, but if you want to know, they give you the files and you can see they're writing the code. So I will call them a coding agent with a very high layer of abstraction, and maybe the persona is not developers.

Okay, so we are doing context engineering, but before we can do context engineering, I am taking a detour to figure out what this idea of agentic AI is, with an example of coding agents. Agents

that can write code. And we can apply this to anything, to be honest: coding agents, chatbots, or workflows that automate things in your office. But let's take the coding example and see what is needed to build agentic AI. That's where I'm trying to take you: what it takes to build agentic solutions, with the example of a coding agent. Okay, let's go 60 years back. 60 years back, if you wanted to write code, you had one programming language called C.
>> Basic.

Great. Great. Right. So all you could do was if statements. And then something came 20 years ago which is called a DAG.
>> Anybody? It's easy to call programming languages names.
>> Directed graph.
>> Directed acyclic graph. What does that mean?
>> It's a workflow where there are no cycles.
>> Great. So you cannot trace back the arrows in this; you can only go forward. You can loop through things on an event, you can react, and you can create a graph of nodes which do things, and there is nothing going back. And this is how most of the work, any app that you see today (ATMs, your Facebook, any code that you have seen) is based on some kind of directed acyclic graph: a graph of nodes with loops, if-elses, and "if this event happens, then that thing happens too." Good. Then

for the first time ever we are saying: hey, you know what, this was fun, but let's make a promise that on agents you don't need a DAG. We are saying: how can we code without knowing a directed acyclic graph, without this idea of writing "if this comes, then do this, then do this" with only one flow? Let's break that. This is the first time these coding agents are able to break it. And how they work is: based on an event, you can have different blocks of code, or different things that you can do, loosely given to some kind of thing which just determines the next step. Determine next step.

And then when I go and say, "hey, create a website with lawnmower checkout," it says: okay, there's an event, I need to create a website, so I will call this page, this framework; I need to create an index page. Once that is done, it submits that work. Then it says: okay, I have this, but I need pictures. Okay, get pictures; that creates another step, and maybe that whole thing is called "get pictures," and then it submits the work again. Then it says: now write buttons for checkout, for actions with a database; create and update the database. Then it comes back again, and it goes on and on till it has a website. Should I show you Lovable, or is everybody comfortable knowing what Lovable is and how it works?

Behind the scenes, there is nobody defining how these blocks need to be connected before the user comes in and asks. Based on the user event, it determines the next steps and creates whatever needs to be done at runtime. This is how all of the agents that you have seen, whether they are doing code or no-code, are working behind the scenes. And that's the world that you live in today. And living in that world breaks the DAG.

So I'm showing you a kind of GIF which does a better job than what I was able to do on the board. But the idea is the same: based on a user event, you come in, and it determines the next steps; based on that, it calls an API; based on the results, it defines "this event has happened, I have called this API, this is the result of that API; now let's define the next step." And then you keep adding to this. All of this is the context that is getting built. And now, as product managers or engineers, you have to deal with this context before you go to the next step, because this "determine next step" requires the context of what has happened before.
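The loop described above can be sketched roughly like this. This is a minimal sketch under assumptions: `llm_decide` and the step names are hypothetical stand-ins for a real LLM call and real work, not any specific product's API.

```python
# Minimal sketch of an agent loop: no predefined DAG; the next step is
# chosen at runtime from the context accumulated so far.

def llm_decide(context):
    # Hypothetical stand-in for an LLM call that picks the next step.
    # Here we hard-code a tiny policy so the sketch is runnable.
    done = {step for step, _ in context["history"]}
    for step in ["create_index_page", "get_pictures", "update_database"]:
        if step not in done:
            return step
    return "finish"

def run_agent(user_event):
    context = {"user_event": user_event, "history": []}
    while True:
        next_step = llm_decide(context)          # determined at runtime
        if next_step == "finish":
            return context
        result = f"result of {next_step}"        # stand-in for real work
        context["history"].append((next_step, result))  # context builds up

ctx = run_agent("create a website with lawnmower checkout")
print([step for step, _ in ctx["history"]])
# ['create_index_page', 'get_pictures', 'update_database']
```

The key design point is that the ordering of steps lives nowhere in the code; it emerges from repeated calls to the decision function as the history grows.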

Good. Now, figuring out how to pass this context, or defining what context needs to be passed, or managing this context, is the art of context engineering.

Good. Any questions? So far we have just gone into history, and we have figured out how agents work, and how on each step in an agent there is context that is getting built.

>> Okay. Is the context the same as memory? Like, essentially...
>> Good question. We'll take it. Sheila, just hold on to that; I will take it in a second. Rudy?
>> My question is: how do you know that you've reached the final answer?

>> Yeah, good. So this "determine next step" is an orchestration thing which also has some way of checking, every time: have I reached the goal? If I have not reached the goal, then... that check is also a system inside, which also needs the context of what the user wants to do. It has figured out what success looks like, and it compares against that success.
>> And then it comes up with yes or no, maybe just a prompt to an LLM saying "have I achieved the goal or not?"
>> And if it says yes, you say yes; if it says no, it is no.

>> Okay, I will take the later questions; just hold on, I'll take two questions in between. Next time, just try to beat it; there is a race to fun. Okay. Similarly, what I did for the code, the

same applies to chatbots too, right? When you ask them a question like "hey, do a research on blah blah blah," you can see that when they say "thinking," they are doing many steps. They're saying: I do this first, then I go do this, then this, and eventually they combine it and give you a report. They are doing a lot of things, and while they are doing the next step, they're building the context of the previous steps. So now all of us have a basic idea of what's going on behind the scenes in the coding agents, or any agents that you are playing with: it is not solving everything in a single step; it's solving it in multiple steps. That's the only thing I want you to take away. And if it works in multiple steps, then it needs to know what happened before, right? And in this community we also know that agents have this kind of structure: they have some kind of intelligence layer, we call them LLMs or foundation models, and they also have a context. What are the four things here?

>> Knowledge.
>> Memory, signals.
>> Tools.
>> Tools. Great. So on each step I need all of this, right? So I have the context. Then I can make a table and define, at each step, what is the knowledge needed to answer this question: look here, do this. If I want to reverse-engineer it, then for the perfect answer at each step I need to define what knowledge, what database, or what documents to read. Let's take an example. The person comes and says: hey, first figure out how many holidays are left for this year, and for that exact number of holidays, please book

a hotel for me in Cancun. My budget is $10,000 and I have two kids and I'm looking for a fully blah blah resort.

How many holidays are left? For that, I can specify what knowledge I need: my HR policy on holidays. I also need a signal, which is what kind of employee this is; is it a permanent employee or a contractor, right? Memory: we will start with the customer request and then build on it. Then tools: in the first call I maybe just need the search tool, that's it. And then guardrails, whatever guardrails you want to put in, whatever the user or your system specified. This is the context that is needed. But this is not a one-step problem, right? The first step is to determine the user's number of vacations left. Once I have that context, it checks and determines the next step, which is: do research and find resorts under $10,000 and rank them with the priorities (kid-friendly, in this area, I'm vegetarian, and all that), whatever preferences I provided, or previous preferences stored in our signals from the previous things they did with us. So this context needs to change now.
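One way to picture that per-step assembly is a context bundle rebuilt for every step. This is an illustrative sketch; the field names and step names are assumptions, not a standard schema.

```python
# Each agent step gets its own context bundle: knowledge, signals,
# memory, tools, and guardrails chosen for that step, not one static blob.

def build_step_context(step, memory):
    if step == "check_holidays":
        return {
            "knowledge": ["HR holiday policy"],          # what to read
            "signals": {"employee_type": "permanent"},   # who is asking
            "memory": memory,                            # what happened so far
            "tools": ["hr_database_lookup"],             # what it may call
            "guardrails": ["only this employee's data"],
        }
    if step == "find_resorts":
        return {
            "knowledge": ["travel preferences"],
            "signals": {"budget_usd": 10000, "kids": 2},
            "memory": memory,
            "tools": ["search"],
            "guardrails": ["stay under budget"],
        }

memory = ["user asked: book Cancun trip using my remaining holidays"]
ctx1 = build_step_context("check_holidays", memory)
memory = memory + ["holidays left: 12"]                  # memory grows per step
ctx2 = build_step_context("find_resorts", memory)
print(ctx1["tools"], ctx2["tools"])  # ['hr_database_lookup'] ['search']
```

Note that the memory field is the only part carried across steps; everything else is re-selected for the step at hand.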

>> So does it sound like a flowchart, Mahesh?
>> Yeah. So at each step, as a product manager, what you can do is figure out the different contexts that a query will need based on the user request, and then fit them inside the context window. What is a context window?

>> Yeah. So a user sends a request, and you send it to the intelligence with all these things.
>> The context window is what you send: the size of all this information you are sending can be called the context, and the window is how much of it you can supply in one request. What's the limit on that? Is there a limit, or is it unlimited? Can I send as many tokens or words as I want every time I query, or do I have a limit?
>> On GPT-5 the limit is [inaudible] tokens.
>> And we settled that 256k was good enough, by the way.
>> [inaudible]
>> Okay, yeah, so let's continue. Okay, so 1 million tokens corresponds to maybe some 500 pages. The user needs

to just tell me that, and on each step I can give 500 pages of information to get better responses. Now, a good agent will have those 500 pages, or 100 pages, or 50 pages, stuffed with the right context: the right memory, the right tools to call, the right knowledge to extract. And a bad agent can have 500 pages of just crap, right? And that's what we are seeing in the world. So that's the idea: context engineering is the art of filling the context window, not only once but on each "determine next step," with the right context. And what are the ingredients of that? Memory, tools, signals, your prompt, your guardrails; all of these are the tenets, and you, or your system, fill in the value at each step. Okay. So all of us now have a common understanding, at least some idea, of what context engineering is.
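Filling the window "with the right context" is, mechanically, a budgeting problem: rank candidate pieces of context and pack the most relevant ones until the token budget is spent. A rough sketch, where the characters-divided-by-four token estimate and the priority scores are assumptions for illustration (a real system would use the model's tokenizer):

```python
# Pack the highest-priority context items into a fixed token budget.
# Token counting here is a crude chars/4 estimate, not a real tokenizer.

def estimate_tokens(text):
    return max(1, len(text) // 4)

def pack_context(items, budget_tokens):
    # items: list of (priority, text); higher priority is packed first
    packed, used = [], 0
    for priority, text in sorted(items, reverse=True):
        cost = estimate_tokens(text)
        if used + cost <= budget_tokens:
            packed.append(text)
            used += cost
    return packed

items = [
    (9, "HR policy: employees get 20 holidays per year"),
    (8, "signal: user is a permanent employee"),
    (2, "old chat about unrelated lawnmower website"),
]
print(pack_context(items, budget_tokens=25))
```

The low-priority leftover chat is dropped once the budget is spent, which is exactly the "good agent" behavior described above: the same window size, but filled with the right things.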

>> If the context changes each time there is an event, how do we update it?
>> Yes. Every time there is a change in the event, or even within one event, one graph, one execution, this context needs to change at each step inside that event too. The event is: a customer comes and says "book my Cancun trip" and gives you some instructions. But on that event you will not build context for everything; you will first break it down and then say: for executing the next step, here is my context. And the next step is first finding out how many vacations Mahesh has, because maybe he has zero.
>> How does it impact the memory?
>> It impacts the memory, yeah, but see here: the context goes here, so it's not only memory. It's also what knowledge you will need, what tools, what the input to those tools is, and what the output from the tools was. The tool in this case could be that I did some research on how many resorts in Cancun have kid-friendly offerings under $10,000; that would be your next step, and that's the context that came from that research, from the tools. Okay, then the

context to an AI is everything here: the user-provided instructions, the guardrails, the memory, the tools, the signals, the knowledge. How is it different from prompt engineering? Anyone?

This is fun, right? Because I can only ask questions. By the way, we're just getting started in this cohort. In this session... my name is Mahesh, if you don't know me; this is my LinkedIn profile. You can connect with me; I accept connections. I am happy to help you in your journey of becoming a good AI PM or a good builder in AI. If you sign up with this form, we will invite you to these sessions automatically, so you need not do it yourself. We do these sessions every Friday, and all these people who are so cool here have done our cohort or been in this community for long, and we can do the same for you. There are no privileges that they have; you can have the same by filling the form. That's it. Let's get back to it, then. People talk about prompt engineering, and context engineering is a new word in the prompt engineering world. Is there a difference, anyone?

>> So, context engineering... I think the prompt is just one part of the context, right?
>> Great. So the prompt is what the user tells you, which is "book me a ticket," and you can also have a system prompt, which is "hey, act like this helper or a concierge for these queries." So you will have a system prompt and a user prompt, but the context is much bigger: it is what you build step by step, what knowledge you will need to solve this, what memory, what signals, and all that. So the prompt is a subset. The prompt is the smaller thing; context is the larger thing. I have written a long list of the different tenets that you can look at and read through. The idea is that prompt engineering is part of context engineering. That's all I want you to take away, and you can read more when you have time. Okay. Then: what is the impact of good versus bad context, with real examples? So now you

understood what context is. Let's go a little deeper and see the idea of good or bad context. What can good or bad context do? This is the CEO of Box, Aaron; one thing I like about him is that he writes continuously. This is what he wrote one week ago. He said that in the agent era, product management is just figuring out what a very smart person, without any initial context whatsoever, would need to perform the task successfully. Every time they are debugging anything in their agents, the challenge is that it's the wrong context, so every debugging session becomes a context exercise. And if you got the right context, the LLM, the intelligence layer, the GPTs, are not failing you; it's filling the context correctly that is failing you. Good. So context is all you need. And if context is all you need, then what's not working in context? So RAG is a way to get the right

knowledge, and to extract the right knowledge, you need the context of what happened before. If I know that I need the right knowledge, then I need to know what you told me or what I have done in the first three steps. And in this paper, this source, they found a 20 to 40% drop in retrieval accuracy: we can't even retrieve the right knowledge, because we don't have the right context from the previous steps. Similarly, 60% of AI failures, or even 70%, happen just because we don't have the right context. And 45% of enterprise AI is failing because we have not thought a lot about context engineering. We thought a lot about, you know, whatever; I'm not going to go there today. Okay.

Let me give you a real example. So this is an app we built; a little bit of our marketing, Mahesh is built with so much marketing. This is Legal Graph. This is a company name we gave it, and what we do is help lawyers, or anybody out there, to basically not sign a contract without knowing what is in it. You can get our plug-in for Word, and you can upload any contract; this is an NDA. You click Legal Graph, it analyzes the document, takes one minute, and creates a graph of the clauses, what is important and what's not. And then when you click "start analysis," it will give you risks.

Right? So that's the tool. What happened with this tool in our own journey is that when we ran it on an NDA and said, "hey, go ahead and tell me what the risks in this NDA are," the response was: this looks mostly fine; there are some liability things, but not too bad; yes, there was something about renewal, but it did not seem critical; should be okay, but can revise if needed. This was a normal prompt, and the prompt engineering was very good. We gave it role, goal, instructions, and everything.

Then we improved our context and gave it all the context. We said: this is your system instructions; this is the user input; this is what we have in your short-term memory or chat history; these are the company policies; this is the retrieved knowledge from the knowledge bases; these are the two tools you can call. Now it said: actually, there is an indemnification clause that has no limitation. What does that mean? That if you sign this NDA, you will indemnify them for whatever amount they want; there's no limit to it. Which is a very critical thing in an NDA; any lawyer will stop you. But our tool was not stopping it. When we gave the right context, it started doing that. Right? That means they can bankrupt you.

Even if this company just signed an NDA with you, and let's say you ended up saying their name, or your employee did, they can claim whatever damages they want. "Whatever damages" means they can claim everything your company is worth, and your life, right? So you need to negotiate a cap on liability. And similarly, this agreement automatically renews on successive one-year terms. That's a very bad term. It means this NDA applies to you in perpetuity: even if you spoke to them once and then never spoke to them in 20 or 30 years, if somebody then talked about it, they could come and sue you. And this is left and right in these NDAs. Without context our agent couldn't find it, but with context it can find these real challenges. That's what differentiates a good product from a bad one, and that's why we believe context engineering is the thing that will differentiate awesome agents from bad or average agents.
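The difference between the two runs is just what went into the request. Schematically, it might look like this; a hypothetical sketch where `call_llm` stands in for any chat-completion API and the toy policy-matching logic exists only so the contrast is runnable, not a claim about how Legal Graph works internally.

```python
# The same question, asked two ways: a bare prompt vs. an engineered context.
# call_llm is a stand-in; a real system would send this payload to a model API.

def call_llm(payload):
    # Pretend model: it can only flag risks that its context names.
    answer = []
    for policy in payload.get("company_policies", []):
        if policy.lower() in payload["document"].lower():
            answer.append(f"RISK: {policy}")
    return answer or ["Looks mostly fine."]

nda = "Indemnification with no cap. Auto-renewal on successive one-year terms."

bare = {"user_prompt": "What are the risks in this NDA?", "document": nda}
full = dict(bare,
            company_policies=["indemnification", "auto-renewal"],
            tools=["clause_lookup"],
            memory=["user is reviewing an NDA"])

print(call_llm(bare))  # ['Looks mostly fine.']
print(call_llm(full))  # ['RISK: indemnification', 'RISK: auto-renewal']
```

Same model, same document, same question; only the context payload changed, and with it the quality of the answer.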

And that's why I think it's super important for all of you to learn it. So what is not working? What is the challenge? If context engineering is everything, then why can't we just go do context engineering? This is the right placement, team; just feedback for our team. [laughter] They push these things in between; I made the deck and they push these things in. Okay. The next session is transformers. If you liked the session on neural networks that we did last week, which is on YouTube now, please watch it; I'm doing a similar session to make you understand transformers. Neural networks are much easier to understand.

The transformer paper, the "Attention Is All You Need" paper, is very hard to understand. I will try to use my magic, or whatever you taught me so far, to decode transformers for you and explain how we are able to encode attention using machines. That has unblocked everything you are seeing in GPTs and all this AI craziness; the foundation of it is transformers, and I will try to unbox it for you in a language you can understand. So that's our session next week. If you have not joined our cohort, I want you to join. I think it's the best cohort; everybody in the cohort says it's the best cohort. No, no, don't do that. But if you're just beginning, I think it's a really good time to join, because if you join now, you have only missed one week. Cohort people will tell you you have missed everything, but I think we can bring you back. If you start now, you can use the holidays, and we can give you January also, so you have this good window of learning. We will be doing interview prep as soon as you come back from vacations in January; I come back from those four days, and that is free. There is a lot going on, and without this you will not be successful, so I highly recommend you join the cohort. If you join now, you get almost three months with us. So please do that.

Now let's talk about problems. Okay, what are the main problems? Agentic AI is figuring out automatically what steps to take to get jobs done for people, and that is a step-by-step process; on each step you have to provide a lot of context to be successful. And we just saw everybody saying: if I get the right context... it's like, who said it, "give me a lever long enough and I can move the world."
>> I think that's... thank you. See a little bit of me there. Let me know, it's you there, not me. But now the new saying is: if you give me the right context, I can solve every world problem. Okay, then what is the problem in giving the right context? One is context confusion. It's

really hard to keep building this context, because when you start building it and passing it to the model step by step, on one step you will say one thing and on another it will say something else, and most of it is also irrelevant information going into the context window, which confuses the model rather than helping it. That is causing problems, especially in which tools to call: we saw only 30% accuracy. The more context we added, it dropped from 100% to 30%. That means if I give you tools like "access my database," "update my database," "do search," "book tickets," the accuracy of choosing which tool to use, based on the user query or the next step, dropped the more context I added, because there was just confusion. In the previous step it was this tool, and if I bring that back, then it's like, "okay, maybe I need to call the search tool," and because search looks like a solution to everything once research comes up, you start talking about it. That's why there is a lot of confusion, and because of the confusion, the tool calling started failing seven out of ten times. So that's one problem.

The second problem is clash: there is just contradictory information in the context. In the first step you said this is a contractor; in between you did a search, and there is some information saying "we give this discount to contractors and that discount to full-time employees." So now I have two things, full-time and contractor, and when I go to apply discounts while booking the tickets, I just applied the full-time discount. There's a clash of information, and who wins that clash is not in the user's control. The more information you put in, the higher the probability of conflicting information in there, and that is causing a lot of problems.
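One guard against this kind of clash is to give context sources an explicit precedence, so that a verified fact overrides a stale or retrieved one deterministically rather than by the model's whim. A sketch; the source ranking here is an assumption, not a standard:

```python
# Resolve conflicting facts by source precedence instead of letting the
# model pick arbitrarily: verified signals beat retrieved documents.

PRECEDENCE = {"verified_signal": 3, "user_statement": 2, "retrieved_doc": 1}

def resolve(facts):
    # facts: list of (source, key, value); highest-precedence source wins
    best = {}
    for source, key, value in facts:
        rank = PRECEDENCE[source]
        if key not in best or rank > best[key][0]:
            best[key] = (rank, value)
    return {k: v for k, (_, v) in best.items()}

facts = [
    ("retrieved_doc", "employee_type", "full-time"),
    ("verified_signal", "employee_type", "contractor"),
]
print(resolve(facts))  # {'employee_type': 'contractor'}
```

In the discount example above, the contractor signal would win, and only the winning value would be placed in the context window.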

And then there is just the idea of distraction. The moment you start sending 500 pages of information with a user request of just 20 words, there's so much information that the models sometimes don't even know what they're talking about. You have not seen that problem in ChatGPT, but behind the scenes it's happening; they are just covering it up with guardrails, because they have put in a lot of checks and balances which you can't afford. So in your agents, if you just blindly put in the context of everything and keep adding to it, you will see it take a very different route. [clears throat] Yesterday somebody in our cohort was explaining that when they go to Lovable or Bolt, after 30 steps of building a website, at the 31st step Bolt doesn't know what it is doing anymore. It just takes you off and starts doing screwed-up stuff. You have messed up the context enough.
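A standard countermeasure here is context compaction: after some number of steps, summarize the older history into a short digest instead of carrying every word forward. A sketch under assumptions: the `summarize` function stands in for an LLM summarization call and here just keeps each step's first line.

```python
# Compact old history once it grows past a threshold, so the model sees a
# short digest plus only the most recent steps in full.

def summarize(steps):
    # Stand-in for an LLM summarization call: keep one line per step.
    return "digest: " + "; ".join(s.splitlines()[0] for s in steps)

def compact(history, keep_recent=2, threshold=5):
    if len(history) <= threshold:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent

history = [f"step {i} output\ndetails..." for i in range(1, 8)]
print(compact(history))  # a one-line digest of steps 1-5, then steps 6-7 in full
```

The trade-off is lossy memory: the digest must keep exactly the facts later steps will need, which is itself a context-engineering decision.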

Hallucination is like a small part of it. Distraction is that you have just gone off on a very different tangent, right? It's like a superset; I wanted to say it somehow not in mathematical terms, but I chose superset. The idea is that hallucination is a very small thing, and at this point it's not even hallucinating, because you have distracted it enough.
>> How is distraction different from confusion? In confusion also we are giving a lot of material, right?
>> In confusion, you have put in information that tends to confuse; in distraction, you have put in so much information that it's getting distracted. Distraction is just an abundance of information, and confusion is more like there is confusion

in the instructions that you have added in your context, or some agent has added, because the user is not adding it. The user just said, "hey, I want to book tickets to Cancun; please check how many vacations I have and then book them." But you may have to take 27 steps behind the scenes, and by the time you reach the 25th step, this is just 500 pages, and now, as the LLM, I just don't know how to handle it, because the larger the context, the larger the head, the stronger the headache. A good example of this is this paper; I think Ricardo sent it to me, so I ended up reading it. I have a link for it. They're saying: now use agents for context building too, so agentic context engineering. This is from Stanford, and they showed that at the 60th step, the context they were giving was 18,000 tokens; it collapses to 122, and the accuracy drops 10 points. So this happens automatically. As the context increased, your accuracy was also increasing; the more context you added, step by step, you had a good gain. But then suddenly, at the 60th step, they saw it drop significantly, and after that you have to start from scratch. That is for all LLMs.

>> Sorry, sorry.

>> Is this for all LLMs, or just for GPT-5?

>> I think they ran an experiment, right? So they might have validated it for all models, or only some. But the idea is that the longer the conversation goes, the higher the chance of filling the context with something useless; eventually it breaks, and after that your LLM starts from scratch on accuracy — which, for a user, is a very surprising event. Here we are just looking at a graph, but in practice the agent suddenly asks, "Hey, how can I help you?" This happened with my son yesterday, by the way. He was playing with ChatGPT.

We play in voice mode — he's just four years old. For the first 15 minutes we play soccer, then for the next 15 minutes we do a quiz, with ChatGPT in between, and we were asking it a lot of quiz questions. At some point it just said, "Hey, how can I help you?" He said, "I was talking about fruits and monkeys," and it replied, "Yeah, fruits and monkeys are great, they go together." He said, "But we were doing this question" — and there was no context left at that point. I was like, dude, you have no clue what happened behind the scenes: it forgot everything you had done and went off on a very different tangent. Then he said, "This is so stupid." And I said, yeah, this is so stupid, because you think you're talking to somebody who has been listening to you all along, but you're actually talking to somebody who either forgot everything or has so much that it has taken a very different route at the next step, gone off on a very different tangent, and there's no way for me to bring it back.
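One practical takeaway from this context blow-up story is to cap the context before each agent step rather than letting it grow until the model collapses. Here is a minimal sketch in Python; the drop-oldest-turns strategy, the rough 4-characters-per-token estimate, and the budget numbers are illustrative assumptions, not something from the talk or the Stanford paper.

```python
# Sketch: guard against context blow-up by trimming old turns to a token
# budget before each agent step. The 4-chars-per-token estimate and the
# budget values below are illustrative assumptions.

def estimate_tokens(text: str) -> int:
    """Rough token estimate (~4 characters per token)."""
    return max(1, len(text) // 4)

def trim_context(turns: list[str], budget: int = 18_000) -> list[str]:
    """Keep the most recent turns that fit inside the token budget.

    Dropping the oldest turns is one simple way to keep the context from
    growing until the model 'forgets' everything at once.
    """
    kept, used = [], 0
    for turn in reversed(turns):          # walk newest first
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))           # restore chronological order

conversation = [f"step {i}: some intermediate result" for i in range(200)]
trimmed = trim_context(conversation, budget=500)
print(len(trimmed), "turns kept of", len(conversation))
```

A real agent would more likely summarize the dropped turns into memory instead of discarding them outright, but the budget check is the same.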

Right? So now that you understand what context engineering is and what the challenges are — so what? Why should you care? Let's do that in five minutes and then I'll leave you. Every time I do these sessions I want to focus on "now what," and I end up focusing on "what" a lot, so hopefully one day we'll get there. The first step: you need to write better requirements across these context-specific things — requirements for memory management, context size and structuring, tools and retrieval, and observability and safety on your agents. That's your responsibility; nobody is coming to save you from it. So stop saying just "this is the input, this is the output." Say, "for this input and this output, I need this context." Provide a table in your requirements for each step, and take a sample conversation. When I was at this large search

engine company, we were debating how to give the best requirements to engineering, and we came up with this format: on the left-hand side we write "the user asked this, our response is this." On the right-hand side we created a table, and in that table we said: this is the knowledge you will need to extract, these are the tools that need to be called and the responses we want from them, this is the final response, and this is what should ideally go into memory. That's the level of detail we provided, for the five most frequently occurring conversations on our agent.
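The left-side/right-side requirements format described here can be captured as structured data that engineers can test against. This is a hypothetical sketch — the field names, the `hr_lookup`/`flight_search` tools, and the vacation numbers are all made up for illustration, not the actual format used at that company.

```python
# Sketch: one per-step requirement in the "left side / right side" format.
# Left side: user turn and expected response. Right side: the context spec
# (knowledge, tools, memory) the agent needs at that step.

sample_requirement = {
    "user_turn": "I want to book tickets to Cancun. How many vacation days do I have?",
    "expected_response": "You have 12 vacation days left. Shall I search flights to Cancun?",
    "context_spec": {
        "knowledge": ["HR vacation policy", "employee profile"],
        "tools": [
            {"name": "hr_lookup", "expected_output": "remaining_days: 12"},
            {"name": "flight_search", "expected_output": "list of CUN flights"},
        ],
        "memory": ["destination=Cancun", "remaining_days=12"],
    },
}

def tools_called_match(spec: dict, tools_called: list[str]) -> bool:
    """Check a logged step: were all the tools the spec requires called?"""
    expected = {t["name"] for t in spec["context_spec"]["tools"]}
    return expected.issubset(set(tools_called))

print(tools_called_match(sample_requirement, ["hr_lookup", "flight_search", "logger"]))
```

With five such entries for the most common conversations, coders can run sample conversations and check each step against the spec mechanically.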

We're building a support agent, right? So first we provide the requirements, then we provide examples too. With at least one — or five — worked examples, our coders can check: are the tools retrieving the right thing? Is the memory being handled? Is the knowledge doing the right thing? Are they meeting the requirements? And if you can define the requirements, you can define the evaluation criteria too.

"Context aligns with the user goal and intent" — how often does that happen? You can check it in the conversations. You can have logging, inspect the context, and build another LLM that judges whether this context was relevant to this query or not. So when my son's fruits-and-monkeys example drops its context, it gets flagged, your helpfulness score drops, and you can say, "Okay, it's dropping for these kinds of conversations; we need to change the code or assign a bug." So these are your criteria. Another one: is the answer attributable to a source, or is it just some context that got built up and is now left hanging?

And then you can also define PII checks. I will give you a detailed list of these. If you can do that, you can set the correct scorecard — because if you can't set the right dashboard or scorecard, you can't align a cross-functional team towards the right goal.

Right now everybody is chasing the goal of launching the next best agent, or of bringing AI into their company. Instead, ask: "What is the intent-alignment score of your agent?" That's a metric you can follow. How will I know whether my agent is aligned to the intent? Because I have defined these criteria, I can take an aggregation of them: if conversations are not aligned, that score percentage is lower. And what is the noise ratio — of the tokens we are fitting into the context, how many are relevant and how many are just waste? Your goal as an engineer is to increase your intent alignment and decrease your noise ratio.
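Those two scorecard metrics — intent alignment and noise ratio — are simple to compute once logging and judging are in place. A minimal sketch; the per-conversation judgements and token counts are hypothetical numbers standing in for real logs.

```python
# Sketch: the two scorecard metrics from the talk, computed over logs.

def intent_alignment(judgements: list[bool]) -> float:
    """Fraction of logged conversations whose context matched the user intent."""
    return sum(judgements) / len(judgements)

def noise_ratio(relevant_tokens: int, total_tokens: int) -> float:
    """Fraction of context tokens that were waste rather than relevant."""
    return 1 - relevant_tokens / total_tokens

judged = [True, True, False, True]  # e.g. verdicts from an LLM judge over logs
print(f"intent alignment: {intent_alignment(judged):.2f}")
print(f"noise ratio: {noise_ratio(relevant_tokens=1200, total_tokens=1800):.2f}")
```

The engineering goal stated in the talk then becomes concrete: push the first number up and the second number down, release over release.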

Good. I've put more here for you to think about. This is something I created myself — I don't think it's in textbooks yet, and influencers don't have it yet either, unless they watch this recording, make a fancy chart of it, and then tell you that "AI is changing everything, this is the new way of doing things," and all that. If you want to stay ahead, these are the resources: a few papers I read to make sure I could land these ideas better. They helped me design this content, so you can go to the sources and get more information. If

not, to recap: we talked about what context engineering is, how it differs from prompt engineering, and the impact of good versus bad context; we covered the requirements you should be writing, how to evaluate, and how to set your scorecards. By the way, Amah, I think all of us — even if we are small — can beat the big guys in this new world, and that's the mission I'm on. I teach a course, but I'm also starting to build a company. I believe small people can do big things if we align the technology to our values, without being backed by big investors who just throw stuff at you, like in context engineering. Okay, then

I will see you again next Friday. We will go back and look at how transformers work. Bye, team.
