Loading papers as structured data
ICE has built-in functionality for parsing and loading papers, and includes example papers that you can download. Here’s a minimal recipe that loads a paper and prints out the first paragraph (often the abstract):
You can run the recipe as follows, providing the path to the downloaded paper as a keyword argument:
You’ll see a result like this:
Note that:
Papers are represented as lists of paragraphs.
Paragraphs are represented as lists of sentences.
Each paragraph has information about which section it’s from.
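To make the structure described above concrete, here is a minimal, self-contained sketch in plain Python. It does not use ICE itself: the `Paper` and `Paragraph` dataclasses, their field names (`paragraphs`, `sentences`, `sections`), and the `first_paragraph` helper are hypothetical stand-ins chosen for illustration, and ICE's real classes may differ in naming and detail.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Paragraph:
    """Hypothetical stand-in for a parsed paragraph (not ICE's actual class)."""

    # A paragraph is a list of sentences.
    sentences: List[str]
    # Names of the section(s) the paragraph belongs to (assumed field name).
    sections: List[str] = field(default_factory=list)

    def __str__(self) -> str:
        # Join the sentences back into running text.
        return " ".join(self.sentences)


@dataclass
class Paper:
    """Hypothetical stand-in for a parsed paper: a list of paragraphs."""

    paragraphs: List[Paragraph]


def first_paragraph(paper: Paper) -> Paragraph:
    # The first paragraph is often the abstract.
    return paper.paragraphs[0]


# Build a toy paper by hand instead of parsing a PDF.
paper = Paper(
    paragraphs=[
        Paragraph(
            sentences=["This is the abstract.", "It has two sentences."],
            sections=["Abstract"],
        ),
        Paragraph(
            sentences=["This is the introduction."],
            sections=["Introduction"],
        ),
    ]
)

first = first_paragraph(paper)
print(first.sections, str(first))
```

Keeping the sentences as a list, rather than one string per paragraph, makes it easy to pass individual sentences to later steps while still recovering the full paragraph text by joining them.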
Try it with your own PDF papers!