LLMs are context-driven services. You feed them context and they generate a response. The context is the entire chat conversation (or as much of it as fits within the context window), plus any files attached to that conversation.
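Concretely, you can picture the context as a plain list of messages. This is only a sketch; real providers have their own message schemas, and the role names, file contents, and helper below are illustrative rather than any specific API:

# Illustrative sketch of a chat context; the message format is made up,
# not tied to any particular provider's API.
context = [
    {"role": "user", "content": "Attached file: deploy.yaml\n<file contents>"},
    {"role": "user", "content": "Why does the deployment keep failing?"},
    {"role": "llm_assistant", "content": "The image tag in deploy.yaml looks wrong..."},
]

# When the conversation outgrows the context window, something has to be
# dropped; the crudest strategy is to keep only the most recent messages.
def fit_to_window(context, max_messages=50):
    return context[-max_messages:]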
Context rot is when the LLM context has degraded so much that it becomes unusable.
Except for the response to the initial prompt, every response is a product of the current prompt plus the chat history. In other words, the LLM uses its own previous responses as context when generating new ones.
context = []
while user_prompt := get_user_prompt():
    # Each new prompt is appended to the running context...
    context.append({"role": "user", "content": user_prompt})
    # ...and the model answers based on the full history so far.
    prompt_response = get_llm_response(context)
    # The response itself becomes context for the next turn.
    context.append({"role": "llm_assistant", "content": prompt_response})
    print(prompt_response)
It’s well known that LLMs sometimes hallucinate. I once read that LLMs, in fact, always hallucinate; it’s just that sometimes they also get things right. I like that definition more. The accuracy of their responses is probabilistically determined. When you include incorrect data in the context, the probability of getting incorrect responses increases.
Additionally, I’ve seen LLMs reuse previous incorrect approaches, even after explicitly being told they were incorrect.
In my opinion, all LLM chats inevitably converge toward context rot as their context becomes increasingly polluted. I’ve had to ditch chats countless times because the LLM started spewing nonsense and no corrective prompt could put it back on track. I have to admit, though, that my prompting skills definitely need improvement.
The solution is simple: open a new chat and rework the context using the lessons learned from the previous one, as sketched below. Your prompts will likely be better this time around, and you’ll probably get better results.
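In code, that reset amounts to rebuilding the context from scratch instead of carrying the old messages over. The lessons string here is purely illustrative:

# Hypothetical sketch: start a new chat whose context contains only the
# distilled lessons from the rotted one, not its polluted history.
lessons = (
    "Constraints: target Python 3.11. "
    "Known dead end: the async rewrite broke the retry logic, don't suggest it again."
)
context = [{"role": "user", "content": lessons}]
# ...then run the same prompt/response loop as above on the fresh context.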