Web search
Running web searches to get current information
Web searches matter especially for questions where the answer can change between when the language model was trained and today. For example:
What was the weather on this date?
What is the market cap of Google?
Who is the president of the United States?
If you run the last question using the question-answerer, you might get an answer like:
The current president of the United States is Donald Trump.
Let’s start by simply providing the list of search results as additional context before answering a question. To do this, let’s write a helper function that uses SerpAPI to retrieve the search results. (You could similarly use the Bing API. In either case you need an API key.)
Running web searches
import httpx
from fvalues import F
from ice.recipe import recipe


def make_qa_prompt(context: str, question: str) -> str:
    return F(
        f"""
Background text: "{context}"
Answer the following question about the background text above:
Question: "{question}"
Answer: "
"""
    ).strip()


async def search(query: str = "Who is the president of the United States?") -> dict:
    async with httpx.AsyncClient() as client:
        # Replace the placeholder api_key with your own SerpAPI key.
        params = {"q": query, "hl": "en", "gl": "us", "api_key": "e29...b4c"}
        response = await client.get("https://serpapi.com/search", params=params)
        return response.json()


recipe.main(search)

Running python search_json.py returns a large JSON object:
Rendering search results to prompts
We add a method to render the search results to a string (remember to update the code below with your own API key):
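A minimal sketch of such a rendering helper, not necessarily the exact code used in this recipe. It assumes the SerpAPI response lists its hits under an "organic_results" key, each with "title", "link", and "snippet" fields. The name render_results and the sample data are illustrative. It is a pure function, so it needs no API key on its own.

```python
def render_results(data: dict) -> str:
    """Render SerpAPI-style search results as plain text, one block per hit."""
    # Assumption: hits live under "organic_results", and each hit has
    # "title", "link", and "snippet" keys.
    organic = (data or {}).get("organic_results")
    if not organic:
        return "No results found"

    blocks = []
    for result in organic:
        title = result.get("title")
        link = result.get("link")
        snippet = result.get("snippet")
        # Skip incomplete entries rather than rendering the string "None".
        if not (title and link and snippet):
            continue
        blocks.append(f"{title}\n{link}\n{snippet}\n")

    return "\n".join(blocks) if blocks else "No results found"


# Hypothetical sample shaped like a (heavily trimmed) search response.
sample = {
    "organic_results": [
        {
            "title": "Example title",
            "link": "https://example.com",
            "snippet": "Example snippet text.",
        },
        {"title": "Entry without link or snippet"},
    ]
}
print(render_results(sample))
```

In the recipe, the dictionary returned by search would be passed to this helper before the text goes into a prompt.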
Now the results are much more manageable:
Answering questions given search results
Now all we need to do is stick the search results into the Q&A prompt (remember to update the code below with your own API key):
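One way this step could look, as a hedged sketch rather than the recipe's exact code. The prompt wording and the names make_search_result_prompt and answer_by_search are assumptions. The search and the language model call are passed in as functions, so the sketch runs without an API key: search_and_render stands in for "search, then render to a string", and complete stands in for whatever completion call the recipe uses.

```python
import asyncio
from typing import Awaitable, Callable


def make_search_result_prompt(context: str, query: str, question: str) -> str:
    # Hypothetical prompt wording: search results first, then the question.
    return f"""
Search results from Google for the query "{query}": "{context}"

Answer the following question, using the search results if helpful:

Question: "{question}"
Answer: "
""".strip()


async def answer_by_search(
    question: str,
    search_and_render: Callable[[str], Awaitable[str]],
    complete: Callable[[str], Awaitable[str]],
) -> str:
    # Search directly for the question, then ask the model to answer
    # with the rendered results as context.
    context = await search_and_render(question)
    prompt = make_search_result_prompt(context, question, question)
    return await complete(prompt)


# Stand-ins so the sketch runs offline; replace them with the real
# search helper and model call.
async def fake_search_and_render(query: str) -> str:
    return "Example title\nhttps://example.com\nExample snippet text.\n"


async def fake_complete(prompt: str) -> str:
    return "(model answer would appear here)"


print(
    asyncio.run(
        answer_by_search(
            "Who is the president of the United States?",
            fake_search_and_render,
            fake_complete,
        )
    )
)
```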
If we run this file…
…we get:
Much better!

Choosing better queries
There’s still something unsatisfying—we’re directly searching for the question, but it could be better to let the model control what search terms we use. This is especially true for complex questions that we don’t expect to get a full answer to through Google, like:
Based on the weather on Sep 14th 2022, how many people do you think went to the beach in San Francisco?
Here it’s probably better to just research the weather on that date using Google, not to enter the whole question. So let’s introduce a choose_query method (remember to update the code below with your own API key):
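A rough sketch of what a choose_query step might look like, under stated assumptions. The prompt wording and the helper name make_search_query_prompt are illustrative. As before, complete is an injected stand-in for the model call, so the example runs without any API key.

```python
import asyncio
from typing import Awaitable, Callable


def make_search_query_prompt(question: str) -> str:
    # Hypothetical wording: ask the model for a search query, not an answer.
    return f"""
You're trying to answer the question "{question}". You get to type in a search query to Google, and then you'll be shown the results. What query do you want to search for?

Query: "
""".strip()


async def choose_query(
    question: str,
    complete: Callable[[str], Awaitable[str]],
) -> str:
    prompt = make_search_query_prompt(question)
    raw = await complete(prompt)
    # Drop surrounding whitespace and any stray quote the model echoes back.
    return raw.strip().strip('"').strip()


# Stand-in model call so the sketch runs offline.
async def fake_complete(prompt: str) -> str:
    return ' weather san francisco september 14th 2022"\n'


question = (
    "Based on the weather on Sep 14th 2022, how many people do you think "
    "went to the beach in San Francisco?"
)
print(asyncio.run(choose_query(question, fake_complete)))
```

The chosen query would then be used as the search term, while the original question still goes into the answering prompt.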
If we run our question…
…we get:
The query chosen by the model was “beach weather san francisco september 12th 2022”. The results here may differ on each run. For another example, see this trace:

Exercises
It’s nice to look at search results, but often the answers are in the actual web pages. Extend the recipe to add the text of the first web page.
Use the model to decide which of the search results to expand.
References
Nakano, Reiichiro, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, et al. WebGPT: Browser-Assisted Question-Answering with Human Feedback