# Interpreters

Sometimes the limitation isn’t factual knowledge, but the ability to do computation.

For example, if we ask [the basic question-answerer](https://primer.ought.org/chapters/question-answering/q-and-a-without-context) “What is 578921 days \* 12312 miles/day?”:

```shell
python qa_simple.py --question "What is 578921 days * 12312 miles/day?"
```

we get:

```
7223849252 miles
```

This is similar to the correct answer `7127675352 miles`, but not the same.
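As a quick sanity check, we can compute the product directly in plain Python, with no model involved:

```python
# Direct arithmetic check: 578921 days at 12312 miles/day.
print(578921 * 12312)  # → 7127675352
```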

## Evaluating Python expressions

Let’s add a method for evaluating Python expressions:

{% code title="eval\_direct.py" %}

```python
from fvalues import F

from ice.recipe import recipe


def eval_python(expression: str) -> str:
    try:
        result = eval(expression)
    except Exception as e:
        result = F(f"Error: {e}")
    return str(result)


async def answer_by_computation(question: str):
    return eval_python(question)


recipe.main(answer_by_computation)
```

{% endcode %}

This works as expected for expressions that are literally Python code:

```shell
python eval_direct.py --question "1 + 1"
```

```
2
```

Of course, it doesn’t work for natural-language questions that would benefit from computation:

{% code overflow="wrap" %}

```shell
python eval_direct.py --question "What is 578921 days * 12312 miles/day?"
```

{% endcode %}

```
Error: invalid syntax (<string>, line 1)
```

So, we need to choose what to evaluate.

{% hint style="warning" %}
Evaluating arbitrary expressions is dangerous. Don’t use this approach outside of highly experimental code.
{% endhint %}
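If you do experiment with this pattern, one small mitigation is to evaluate with empty builtins so the expression can’t casually call functions like `open` or `__import__`. This is a speed bump, not a sandbox, and the helper name below is our own:

```python
def eval_python_restricted(expression: str) -> str:
    # Evaluate with no builtins available. This blocks casual misuse
    # (file access, imports) but is NOT a real sandbox: crafted inputs
    # can still escape via object introspection.
    try:
        result = eval(expression, {"__builtins__": {}}, {})
    except Exception as e:
        result = f"Error: {e}"
    return str(result)
```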

## Choosing what to evaluate

We make a prompt that asks the model what expression to enter into a Python interpreter to answer the question. We’ll also print out the result of evaluating this expression:

{% code title="eval\_selective.py" %}

```python
from fvalues import F

from ice.recipe import recipe


def make_computation_choice_prompt(question: str) -> str:
    return F(
        f"""You've been asked to answer the question "{question}".

You have access to a Python interpreter.

Enter an expression that will help you answer the question.
>>>"""
    )


def eval_python(expression: str) -> str:
    try:
        result = eval(expression)
    except Exception as e:
        result = F(f"Error: {e}")
    return str(result)


async def choose_computation(question: str) -> str:
    prompt = make_computation_choice_prompt(question)
    answer = await recipe.agent().complete(prompt=prompt, stop='"')
    return answer


async def eval_selective(question: str):
    expression = await choose_computation(question)
    result = eval_python(expression)
    return (expression, result)


recipe.main(eval_selective)
```

{% endcode %}

If we run this on our example…

{% code overflow="wrap" %}

```shell
python eval_selective.py --question "What is 578921 days * 12312 miles/day?"
```

{% endcode %}

…we get:

```
('578921 * 12312', '7127675352')
```

This is a helpful expression and result!

<figure><img src="https://393762053-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFqoUXVrYie7Ht7Fi4JrU%2Fuploads%2FwQyQyq65e5Kvx65bbToK%2FScreenshot%20p9X3OJla%402x.png?alt=media&#x26;token=efd6e494-54bd-4b4e-b8b2-baf3c262a96e" alt=""><figcaption><p>Execution trace (<a href="https://ice.ought.org/traces/01GE0XAYWSKX59VXRP0QQBFTQV">view online</a>)</p></figcaption></figure>

## Using the results of evaluation

Now all we need to do is provide this expression and its result as additional context for the basic question-answerer.

{% code title="answer\_by\_computation.py" %}

```python
from fvalues import F

from ice.recipe import recipe


def make_computation_choice_prompt(question: str) -> str:
    return F(
        f"""You've been asked to answer the question "{question}".

You have access to a Python interpreter.

Enter an expression that will help you answer the question.
>>>"""
    )


def make_compute_qa_prompt(question: str, expression: str, result: str) -> str:
    return F(
        f"""A recording of a Python interpreter session:

>>> {expression}: {result}

Answer the following question, using the Python session if helpful:

Question: "{question}"
Answer: "
"""
    ).strip()


def eval_python(expression: str) -> str:
    try:
        result = eval(expression)
    except Exception as e:
        result = F(f"Error: {e}")
    return str(result)


async def choose_computation(question: str) -> str:
    prompt = make_computation_choice_prompt(question)
    answer = await recipe.agent().complete(prompt=prompt, stop='"')
    return answer


async def answer_by_computation(question: str):
    expression = await choose_computation(question)
    result = eval_python(expression)
    prompt = make_compute_qa_prompt(question, expression, result)
    answer = await recipe.agent().complete(prompt=prompt, stop='"')
    return answer


recipe.main(answer_by_computation)
```

{% endcode %}

Rerunning our test case…

{% code overflow="wrap" %}

```shell
python answer_by_computation.py --question "What is 578921 days * 12312 miles/day?"
```

{% endcode %}

…we get the correct answer:

```
7127675352 miles
```

Another example:

> If I have $500 and get 3.7% interest over 16 years, what do I have at the end?

Running this:

{% code overflow="wrap" %}

```shell
python answer_by_computation.py --question "If I have \$500 and get 3.7% interest over 16 years, what do I have at the end?"
```

{% endcode %}

We get:

{% code overflow="wrap" %}

```
If you have $500 and get 3.7% interest over 16 years, you will have $894.19 at the end.
```

{% endcode %}

In contrast, the basic question-answerer says “You would have $1,034,957.29 at the end.”
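We can sanity-check the computed answer ourselves in plain Python, assuming annual compounding:

```python
# Compound interest: principal * (1 + rate) ** years
print(round(500 * 1.037 ** 16, 2))  # → 894.19
```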

<figure><img src="https://393762053-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FFqoUXVrYie7Ht7Fi4JrU%2Fuploads%2FgemnL1EhCMdCzSzGC2X6%2FScreenshot%20rjRCMc4G%402x.png?alt=media&#x26;token=af30a547-fa5f-4c5d-b97a-3238fcbd5a15" alt=""><figcaption><p>Execution trace (<a href="https://ice.ought.org/traces/01GE0XFAVDNWSP5TNWZ944NWSW">view online</a>)</p></figcaption></figure>

## Exercises

1. Many questions can only be answered using longer algorithms in Python. Extend the code above to support multi-line Python programs ([example](https://twitter.com/sergeykarayev/status/1569377881440276481/photo/1)).
2. Another approach to (1) is to let the model “enter” multiple expressions into the interpreter. Extend the recipe to support this.
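For exercise 1, one possible starting point (our own sketch, with the same safety caveats as `eval`) is to run the program with `exec` and capture anything it prints:

```python
import contextlib
import io


def exec_python(program: str) -> str:
    # Run a multi-line program and return its captured stdout.
    # exec() is just as dangerous as eval(); experimental use only.
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(program)
    except Exception as e:
        return f"Error: {e}"
    return buffer.getvalue()
```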

<details>

<summary>Get feedback on exercise solutions</summary>

If you want feedback on your exercise solutions, submit them through [this form](https://docs.google.com/forms/d/e/1FAIpQLSdNNHeQAT7GIzn4tdsVYCkrVEPMNaZmBFkZCAJdvTvLzUAnzQ/viewform). We—the team at Ought—are happy to give our quick take on whether you missed any interesting ideas.

</details>
