Let’s start by classifying whether the first paragraph answers a question. To do this, we’ll use a new agent method, classify. It takes a prompt and a list of choices, and returns probabilities for each choice, along with (for some agent implementations) an explanation.
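In isolation, a classify call looks roughly like this. This is a minimal sketch: the prompt is made up, but the call pattern and return shape match the recipe below.

```python
from ice.recipe import recipe


async def classify_example():
    # The leading spaces in the choices match how completion tokens
    # typically appear after "Answer:" in the prompt.
    choice_probs, explanation = await recipe.agent().classify(
        prompt="Question: Is water wet? Say Yes or No.\nAnswer:",
        choices=(" Yes", " No"),
    )
    # choice_probs maps each choice to a probability,
    # e.g. {" Yes": 0.9, " No": 0.1}
    return choice_probs, explanation


recipe.main(classify_example)
```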
Our single-paragraph classifier looks like this:
paper_qa_class.py
```python
from fvalues import F

from ice.paper import Paper
from ice.paper import Paragraph
from ice.recipe import recipe


def make_prompt(paragraph: Paragraph, question: str) -> str:
    return F(
        f"""Here is a paragraph from a research paper: "{paragraph}"

Question: Does this paragraph answer the question '{question}'? Say Yes or No.
Answer:"""
    ).strip()


async def classify_paragraph(paragraph: Paragraph, question: str) -> float:
    choice_probs, _ = await recipe.agent().classify(
        prompt=make_prompt(paragraph, question),
        choices=(" Yes", " No"),
    )
    return choice_probs.get(" Yes", 0.0)


async def answer_for_paper(paper: Paper, question: str):
    paragraph = paper.paragraphs[0]
    return await classify_paragraph(paragraph, question)


recipe.main(answer_for_paper)
```
Save it and run it on a paper:
```shell
python paper_qa_class.py --paper papers/keenan-2018.pdf --question "What was the study population?"
```
You should see a result like this:
```
0.024985359096987403
```
The trace looks simple: a single classify call.

According to the model, the first paragraph is unlikely to answer the question.
Classifying all paragraphs in parallel with map_async
To find the most relevant paragraphs, we map the paragraph classifier over all paragraphs and get the most likely ones.
For mapping, we use the utility map_async which runs the language model calls in parallel:
paper_qa_classes.py
```python
from fvalues import F

from ice.paper import Paper
from ice.paper import Paragraph
from ice.recipe import recipe
from ice.utils import map_async


def make_prompt(paragraph: Paragraph, question: str) -> str:
    return F(
        f"""Here is a paragraph from a research paper: "{paragraph}"

Question: Does this paragraph answer the question '{question}'? Say Yes or No.
Answer:"""
    )


async def classify_paragraph(paragraph: Paragraph, question: str) -> float:
    choice_probs, _ = await recipe.agent().classify(
        prompt=make_prompt(paragraph, question),
        choices=(" Yes", " No"),
    )
    return choice_probs.get(" Yes", 0.0)


async def answer_for_paper(paper: Paper, question: str = "What was the study population?"):
    probs = await map_async(
        paper.paragraphs, lambda par: classify_paragraph(par, question)
    )
    return probs


recipe.main(answer_for_paper)
```
You will now see a list of probabilities, one for each paragraph.
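If you're curious, map_async behaves much like asyncio.gather: it starts one coroutine per element and awaits them all concurrently. A rough sketch of the idea (not the actual ice.utils implementation):

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")


async def map_async_sketch(
    items: list[A], fn: Callable[[A], Awaitable[B]]
) -> list[B]:
    # Run fn on every item concurrently; results come back in input order.
    return list(await asyncio.gather(*(fn(item) for item in items)))
```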
Now all we need to do is add a utility function for looking up the paragraphs with the highest probabilities:
paper_qa_ranker.py
```python
from fvalues import F

from ice.paper import Paper
from ice.paper import Paragraph
from ice.recipe import recipe
from ice.utils import map_async


def make_classification_prompt(paragraph: Paragraph, question: str) -> str:
    return F(
        f"""Here is a paragraph from a research paper: "{paragraph}"

Question: Does this paragraph answer the question '{question}'? Say Yes or No.
Answer:"""
    )


async def classify_paragraph(paragraph: Paragraph, question: str) -> float:
    choice_probs, _ = await recipe.agent().classify(
        prompt=make_classification_prompt(paragraph, question),
        choices=(" Yes", " No"),
    )
    return choice_probs.get(" Yes", 0)


async def answer_for_paper(
    paper: Paper, question: str, top_n: int = 3
) -> list[Paragraph]:
    probs = await map_async(
        paper.paragraphs, lambda par: classify_paragraph(par, question)
    )
    sorted_pairs = sorted(
        zip(paper.paragraphs, probs), key=lambda x: x[1], reverse=True
    )
    return [par for par, prob in sorted_pairs[:top_n]]


recipe.main(answer_for_paper)
```
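The ranking step is plain Python: zip pairs each paragraph with its probability, sorted orders the pairs by probability (descending), and the slice keeps the top n. An equivalent alternative, if you'd rather avoid sorting the whole list, is heapq.nlargest (not what the recipe uses, just an option):

```python
import heapq
from typing import Sequence, TypeVar

T = TypeVar("T")


def top_n_by_score(items: Sequence[T], scores: Sequence[float], n: int = 3) -> list[T]:
    # Keep the n highest-scoring items without fully sorting;
    # nlargest compares only the scores, never the items themselves.
    best = heapq.nlargest(n, zip(items, scores), key=lambda pair: pair[1])
    return [item for item, _score in best]
```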
Running the analogous command on the new script…
```shell
python paper_qa_ranker.py --paper papers/keenan-2018.pdf --question "What was the study population?"
```
…we indeed get back paragraphs that answer the question of what the study population was:
```python
[
    Paragraph(
        sentences=[
            'A total of 1624 communities were eligible for inclusion in the trial on the basis of the most recent census (Fig. 1 ).',
            'A random selection of 1533 communities were included in the current trial, and the remaining 91 were enrolled in smaller parallel trials at each site, in which additional microbiologic, anthropometric, and adverse-event data were collected.',
            'In Niger, 1 community declined to participate and 20 were excluded because of census inaccuracies.',
            'No randomization units were lost to follow-up after the initial census.',
        ],
        sections=[Section(title='Participating Communities', number=None)],
        section_type='main',
    ),
    ...
]
```