
Async tracking and streaming

note

Since LangChain version 0.0.155, this special treatment for async is no longer necessary; all you need to do is wrap your chain calls in the PromptWatch context.

This document is therefore no longer relevant.

However, in async mode there are some limitations: we can't capture all the data that we track in sync mode. This can be overcome by registering the prompt template.
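
For reference, the context-manager approach mentioned in the note looks roughly like this (a minimal sketch; tracking_project and the chain inputs are illustrative values, and answer_chain stands for a chain like the one built below):

from promptwatch import PromptWatch

# on recent LangChain versions, wrapping the call in the PromptWatch
# context is all that is needed for tracing
with PromptWatch(tracking_project="chat"):
    answer_chain.run(history="", human_input="my question")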

Introduction

Async tracing is generally useful when you want to avoid waiting for the tracker, but let's be honest: compared to the LLM's execution time, that wait is very likely insignificant.

Where it is necessary, though, is when you want to stream tokens as they become available (continuously writing the response, as ChatGPT does).
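
Under the hood, streaming is driven by LLM callbacks: LangChain invokes on_llm_new_token for every token the model emits. As an illustration, a minimal async handler that writes tokens to stdout as they arrive could look like this (the class name is hypothetical; AsyncCallbackHandler and on_llm_new_token come from the LangChain API of that era):

from langchain.callbacks.base import AsyncCallbackHandler

class StreamToStdoutHandler(AsyncCallbackHandler):
    """Hypothetical handler: print each token the moment it is generated."""

    async def on_llm_new_token(self, token: str, **kwargs) -> None:
        print(token, end="", flush=True)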

How to do async tracing

In general, if you want to use asynchronous tracing, you need to use AsyncCallbackManager, which ships with LangChain but is not the default callback manager. Therefore, we need to assign it manually to the LLM and to all the chains.

from langchain.callbacks.base import AsyncCallbackManager
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

template = """Assistant is a large language model trained by OpenAI.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

{history}
Human: {human_input}
Assistant:"""


prompt_template = PromptTemplate(
    input_variables=["history", "human_input"],
    template=template,
)

# create an instance of AsyncCallbackManager
async_manager = AsyncCallbackManager()

# and set it on the OpenAI LLM
streaming_llm = OpenAI(
    streaming=True,
    callback_manager=async_manager,
    temperature=0,
)

# as well as on the chain itself
answer_chain = LLMChain(
    llm=streaming_llm,
    prompt=prompt_template,
    callback_manager=async_manager,
)

# the chain has two input variables, so pass them by keyword
answer_chain.run(history="", human_input="my question")
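
Since AsyncCallbackManager's handlers are coroutines, the chain is typically invoked from an async context via arun, the async counterpart of run. A minimal sketch, assuming the objects defined above:

import asyncio

async def main():
    # await the async variant of run so the callbacks (and tokens)
    # are processed asynchronously
    result = await answer_chain.arun(history="", human_input="my question")
    print(result)

asyncio.run(main())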

How to integrate PromptWatch with async handler

However, there are two important caveats to this approach that cause the easiest way of integrating PromptWatch not to work here:

  1. Since we swapped out the default callback manager, we need to attach the tracing callback handlers manually.
  2. We leverage the inspect module to get at some information about the LLM's state that is not available through the standard LangChain API. This approach, however, doesn't work with async functions. It can be overcome if you register your prompt template: that way we can do some monkeypatching and wrap the functions we need to deliver the full potential. This step is optional; if you don't need PromptWatch's LLM playground functionality, you don't need to register your template.
# create PromptWatch instance
pw = PromptWatch(tracking_project="chat")

async_manager = AsyncCallbackManager()
# 1. get PromptWatch's LangChain callback handler and add it to the async manager
async_manager.add_handler(pw.get_langchain_callback_handler())

# 2. register the prompt template (optional, enables the LLM playground)
pw.register_prompt_template("assistant_template", prompt_template)

So the final code would look like this:

from langchain.callbacks.base import AsyncCallbackManager
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from promptwatch import PromptWatch

template = """Assistant is a large language model trained by OpenAI.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

{history}
Human: {human_input}
Assistant:"""


prompt_template = PromptTemplate(
    input_variables=["history", "human_input"],
    template=template,
)

# create PromptWatch instance
pw = PromptWatch(tracking_project="chat")

async_manager = AsyncCallbackManager()
# 1. get PromptWatch's LangChain callback handler and add it to the async manager
async_manager.add_handler(pw.get_langchain_callback_handler())

# 2. register the prompt template (optional, enables the LLM playground)
pw.register_prompt_template("assistant_template", prompt_template)

# and set it on the OpenAI LLM
streaming_llm = OpenAI(
    streaming=True,
    callback_manager=async_manager,
    verbose=True,
    temperature=0,
)

# as well as on the chain itself
answer_chain = LLMChain(
    llm=streaming_llm,
    prompt=prompt_template,
    callback_manager=async_manager,
)
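
As in the first example, the chain can then be invoked (the input values are placeholders):

# run the chain; the PromptWatch handler records the whole trace
# (use `await answer_chain.arun(...)` from an async context when streaming)
answer_chain.run(history="", human_input="my question")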