Chain of Thought
Let’s think step by step.
The idea behind chain of thought is simple: instead of directly asking the model to generate an answer, we add an answer prefix like “Let’s think step by step.” and let the model continue from there.
Step-by-step answers
Let’s start with the question-answerer and add a parameter to the prompt so that we can see the effect of different prefixes:
```python
from fvalues import F
from ice.recipe import recipe


def make_chain_of_thought_prompt(question: str, answer_prefix: str = "") -> str:
    # The prefix is placed at the start of the answer, so the model continues
    # from it when completing the prompt.
    return F(
        f"""Answer the following question:

Question: "{question}"
Answer: "{answer_prefix}
"""
    ).strip()


async def chain_of_thought(
    question: str = "What would happen if the average temperature in Northern California went up by 5 degrees Fahrenheit?",
    answer_prefix: str = "Let's think step by step.",
) -> str:
    prompt = make_chain_of_thought_prompt(question, answer_prefix)
    # Stop at the closing quote so the completion stays inside the quoted answer.
    answer = await recipe.agent().complete(prompt=prompt, stop='"')
    return answer


recipe.main(chain_of_thought)
```

Let’s first run the recipe without an answer prefix:
We get an answer:
If we provide “Let’s think step by step.” as an answer prefix…
…we get a much more elaborate answer:
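If you want to see both behaviors side by side in a single run, one option is a small wrapper recipe that calls chain_of_thought twice. This is a sketch, not part of the original recipe; compare_prefixes is an illustrative name, and if you add it to the same file you would point recipe.main at it instead of chain_of_thought:

```python
async def compare_prefixes(
    question: str = "What would happen if the average temperature in Northern California went up by 5 degrees Fahrenheit?",
) -> dict:
    # Answer the same question without and with the step-by-step prefix.
    direct = await chain_of_thought(question, answer_prefix="")
    step_by_step = await chain_of_thought(question, answer_prefix="Let's think step by step.")
    return {"direct": direct, "step_by_step": step_by_step}


recipe.main(compare_prefixes)  # replaces recipe.main(chain_of_thought)
```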
Step-by-step reasoning for concise answers
In the previous example, chain of thought was used to elicit a more elaborate answer. Often, however, we only want to improve the correctness of the final answer without changing the answer format itself.
We can achieve this by eliciting the reasoning and the final answer in two separate steps, which also makes it easier to compare the final answer to what the model produces without chain of thought:
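The script below is a minimal sketch of this two-step approach, reusing the style of the recipe above; the prompt wording and the names make_reasoning_prompt, make_concise_answer_prompt, and answer_by_reasoning are illustrative rather than canonical:

```python
from fvalues import F
from ice.recipe import recipe


def make_reasoning_prompt(question: str) -> str:
    # Step 1: elicit step-by-step reasoning, as in the previous recipe.
    return F(
        f"""Answer the following question:

Question: "{question}"
Answer: "Let's think step by step.
"""
    ).strip()


def make_concise_answer_prompt(question: str, reasoning: str) -> str:
    # Step 2: ask for a short final answer, conditioned on the reasoning.
    return F(
        f"""Answer the following question, taking into account the reasoning below:

Question: "{question}"
Reasoning: "{reasoning}"
Short answer: "
"""
    ).strip()


async def answer_by_reasoning(
    question: str = "What would happen if the average temperature in Northern California went up by 5 degrees Fahrenheit?",
) -> str:
    reasoning_prompt = make_reasoning_prompt(question)
    reasoning = await recipe.agent().complete(prompt=reasoning_prompt, stop='"')
    answer_prompt = make_concise_answer_prompt(question, reasoning)
    answer = await recipe.agent().complete(prompt=answer_prompt, stop='"')
    return answer


recipe.main(answer_by_reasoning)
```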
If we now run this script:
We get a summary of the long reasoning chain:

Let’s apply this to the math problem we saw in the chapter on checking reasoning steps:
Beth bakes 4x 2 dozen batches of cookies in a week. If these cookies are shared amongst 16 people equally, how many cookies does each person consume?
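For reference, reading “4x 2 dozen” as four batches of two dozen cookies each, the correct result is 4 × 24 = 96 cookies in total, or 96 / 16 = 6 cookies per person.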
The answer:
Inspecting the reasoning, we see that something went wrong in step two:
Exercise
Combine generating reasoning chains with verifiers to generate more reliable reasoning.
References
Wei, Jason, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Chi, Quoc Le, and Denny Zhou. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.
Wang, Xuezhi, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, and Denny Zhou. Self-Consistency Improves Chain of Thought Reasoning in Language Models. March 21, 2022.
Kojima, Takeshi, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, and Yusuke Iwasawa. Large Language Models Are Zero-Shot Reasoners. May 24, 2022.