(Re)generate: (find-pdf-like-intro)
Source code:  (find-eev "eev-intro.el" "find-pdf-like-intro")
More intros:  (find-eev-quick-intro)
              (find-eval-intro)
              (find-eepitch-intro)
This buffer is _temporary_ and _editable_.
It is meant as both a tutorial and a sandbox.




Note: this intro needs to be rewritten!
Ideally it should _complement_ the material in:
  (find-eev-quick-intro "9.3. Hyperlinks to PDF files")




PDF-like documents

Let's introduce a bit of (improvised!) terminology: we will say
that a document is "PDF-like" when it is in a format like PDF,
PostScript, DVI or DJVU - i.e., divided into pages. Emacs has a
standard mode for viewing PDF-like documents,

  (find-enode "Document View")

but we will see a more eev-like way of pointing to pages of
PDF-like documents.

Two test documents

The following script creates two PDF-like documents - a DVI and
a PDF - that we will use in the examples below.

* (eepitch-shell)
* (eepitch-kill)
* (eepitch-shell)
cd /tmp/
cat > /tmp/foo.tex <<'%%%'
\documentclass[12pt,oneside]{book}
\begin{document}
\Huge
\frontmatter
a \newpage
b \newpage
c \newpage
\mainmatter
\chapter{One}
\newpage foo
\chapter{Two}
\end{document}
%%%
latex    foo.tex
pdflatex foo.tex

In these two documents the page _names_ do not correspond to the
page _numbers_; the pages are named "i", "ii", "iii", "1", "2",
"3", but their numbers are 1, 2, 3, 4, 5, 6. In a table:

  number  name  contents
  -------------------------------
    1     i     a
    2     ii    b
    3     iii   c
    4     1     Chapter 1 - One
    5     2     foo
    6     3     Chapter 2 - Two

Using external viewers

The following sexps can be used to open the documents
"/tmp/foo.dvi" and "/tmp/foo.pdf" on the first page of Chapter 1
- i.e., the page whose number is 4, and whose "name" is 1 - using
two of my favorite viewers, xdvi and xpdf, and a low-level
function, `find-bgprocess':

  (find-bgprocess '("xdvi" "+4" "/tmp/foo.dvi"))
  (find-bgprocess '("xpdf" "/tmp/foo.pdf" "4"))

Alternatively, we can invoke these viewers like this,

  (find-xdvi-page "/tmp/foo.dvi" 4)
  (find-xpdf-page "/tmp/foo.pdf" 4)

or, as they ignore extra arguments, like this,

  (find-xdvi-page "/tmp/foo.dvi" (+ 3 1) "Chapter 1")
  (find-xpdf-page "/tmp/foo.pdf" (+ 3 1) "Chapter 1")

where the `(+ 3 1)' and the "Chapter 1" are just to make these
links more readable by humans. The `3' is what we will call the
"offset" of the document: a quantity that can be added to page
"names" (outside the "front matter" of the document) to convert
them to page "numbers".

Let's introduce more terminology. Programs like xdvi and xpdf are
"external viewers for PDF-like documents", but that's too long,
so let's shorten this to "external PDF-like viewers", or
"external viewers", or just "viewers"; `find-xdvi-page',
`find-xpdf-page' and similar functions are "medium-level viewing
words".
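
The offset arithmetic above can be sketched in plain Emacs Lisp.
The function `my-page-number' below is hypothetical (it is not
part of eev); it only illustrates how a page "name" in the main
matter becomes a page "number" once the offset is known.

```elisp
;; Hypothetical helper, not part of eev: convert a main-matter page
;; "name" into a page "number" by adding the document's offset.
(defun my-page-number (name offset)
  "Return the page number of the main-matter page NAME, given OFFSET."
  (+ offset name))

;; /tmp/foo.pdf has three front-matter pages, so its offset is 3.
(my-page-number 1 3)   ; => 4, the first page of Chapter 1
(my-page-number 3 3)   ; => 6, the page named "3"
```

This only works for pages outside the front matter; the pages
named "i", "ii" and "iii" have to be addressed by their numbers.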

The high-level way

File names of PDF-like documents are often very long - especially
for documents that we have "psne"-ed from the web. To avoid
having to keep copies of these file names everywhere we can use
`code-c-d'-like words - like these:

  (code-xdvi "fd" "/tmp/foo.dvi")
  (code-xpdf "fp" "/tmp/foo.pdf")
  (find-fdpage (+ 3 1) "Chapter 1")
  (find-fppage (+ 3 1) "Chapter 1")

Each medium-level viewing word has an associated code-c-d-like
word that creates "high-level viewing words". In the example
above, we used `code-xdvi' to create the high-level viewing word
`find-fdpage', which invokes `find-xdvi-page', and `code-xpdf' to
create the high-level viewing word `find-fppage', which invokes
`find-xpdf-page'.

Note that the "fd" in `find-fdpage' stands not only for the
filename - "/tmp/foo.dvi" - but also for the medium-level word to
be used - `find-xdvi-page'; same for "fp".
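
Here is a minimal sketch of the idea behind such words. It is
NOT how eev implements `code-xpdf' (use the `find-code-...' sexps
to see the real code); the names `my-code-viewer',
`my-echo-viewer' and the code "zz" are all hypothetical, and the
stand-in viewer just returns a list instead of starting an
external program.

```elisp
;; Stand-in for a medium-level viewing word (hypothetical): it does
;; not open anything, it only reports what it would open.
(defun my-echo-viewer (fname &optional page &rest _rest)
  "Return a description of opening FNAME at PAGE."
  (list 'would-open fname 'at-page page))

;; Hypothetical `code-c-d'-like word: define `find-Cpage' as a
;; function that remembers both FNAME and the VIEWER to call.
(defun my-code-viewer (c fname viewer)
  "Define `find-Cpage' as a function that calls VIEWER on FNAME."
  (defalias (intern (format "find-%spage" c))
    `(lambda (&optional page &rest rest)
       (apply #',viewer ,fname page rest))))

(my-code-viewer "zz" "/tmp/foo.pdf" 'my-echo-viewer)

;; The short word now stands for the file name and for the viewer.
(find-zzpage (+ 3 1) "Chapter 1")
;; => (would-open "/tmp/foo.pdf" at-page 4)
```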

Default external viewers

We saw that for each of the supported formats of PDF-like
documents - DVI, PostScript, PDF, DJVU - there are medium-level
and high-level viewing words that use specific programs; for
example, for "xpdf" we have `find-xpdf-page' and `code-xpdf', and
for "evince" we have `find-evince-page' and `code-evince'. But
for each of the formats we also have words that use the current
default viewer for that format:

  Format      Medium-level     High-level
  ----------------------------------------
  DVI         find-dvi-page    code-dvi
  PostScript  find-ps-page     code-ps
  PDF         find-pdf-page    code-pdf
  DJVU        find-djvu-page   code-djvu

The four `find-<formatname>-page' words above are aliases to
`find-<viewername>-page' names, and to change a default viewer
you should use a `defalias' on the `find-' word, like one of
these:

  (defalias 'find-pdf-page 'find-evince-page)
  (defalias 'find-pdf-page 'find-xpdf-page)

After running a `defalias' like the above all the high-level
viewing words defined using `code-pdf' will automatically switch
to the new default viewer (because words defined with `code-pdf'
call `find-pdf-page').
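
The reason a single `defalias' is enough can be seen in a small
experiment with plain Emacs Lisp. All the names below
(`my-viewer-a', `my-viewer-b', `my-default-viewer',
`my-high-level') are hypothetical stand-ins, not eev functions;
the two "viewers" just return lists.

```elisp
;; Two stand-in viewers (hypothetical): each one tags its arguments.
(defun my-viewer-a (&rest args) (cons 'viewer-a args))
(defun my-viewer-b (&rest args) (cons 'viewer-b args))

;; The "default viewer" is just an alias to one of them.
(defalias 'my-default-viewer 'my-viewer-a)

;; A stand-in high-level word: it calls the alias by name, so the
;; alias is looked up again on every call.
(defun my-high-level (page)
  (my-default-viewer "/tmp/foo.pdf" page))

(my-high-level 4)   ; => (viewer-a "/tmp/foo.pdf" 4)

;; Redefining the alias changes what the high-level word does,
;; without redefining the high-level word itself.
(defalias 'my-default-viewer 'my-viewer-b)

(my-high-level 4)   ; => (viewer-b "/tmp/foo.pdf" 4)
```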

PDF-like documents as text

Some PDF-like documents can be converted to text - usually uglily
and imprecisely, but the result is often useful anyway - by
external programs like "pdftotext" and "djvutxt". The
medium-level sexps below invoke these programs on the given
filenames and display their output in an Emacs buffer:

  (find-pdftotext-text "/tmp/foo.pdf")
  (find-djvutxt-text   "/tmp/foo.djvu")

We can also use the corresponding generic medium-level words,
which are aliases to the default converters:

  (find-pdf-text  "/tmp/foo.pdf")
  (find-djvu-text "/tmp/foo.djvu")

As the output of these converters is also divided into pages -
with formfeeds as separators - it is easy to jump to specific
pages in the output, and if the first argument after the file
name is a number it is interpreted as a page number; string
arguments coming after that are interpreted as strings to be
searched (forward) for. So these links make sense:

  (find-pdf-text "/tmp/foo.pdf" (+ 3 1))
  (find-pdf-text "/tmp/foo.pdf" (+ 3 1) "Chapter 1")

and note that the following pair of links make sense too - the
first one calls an external viewer, the second one opens the
conversion to text:

  (find-pdf-page "/tmp/foo.pdf" (+ 3 1) "Chapter 1")
  (find-pdf-text "/tmp/foo.pdf" (+ 3 1) "Chapter 1")

Note that they both point to the same page... The argument
"Chapter 1" is ignored in the first link, but when a pair of
links like that appears on consecutive lines it is clear to human
readers that they are both links to the same place, only rendered
in different ways. Note that the passage from this:

  (find-pdf-text "/tmp/foo.pdf" (+ 3 1))

to this:

  (find-pdf-text "/tmp/foo.pdf" (+ 3 1))
  (find-pdf-text "/tmp/foo.pdf" (+ 3 1) "Chapter 1")

is a special case of "refining hyperlinks", an idea that we saw
in:

  (find-eval-intro "Refining hyperlinks")
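
The "page number, then strings to search for" convention can be
sketched like this. The function `my-goto-page-and-search' is
hypothetical, not eev's implementation, and `my-sample-text' is
hand-written stand-in text, not real pdftotext output: it only
mimics six pages separated by formfeeds.

```elisp
;; Stand-in for converter output (an assumption): six "pages"
;; separated by formfeed characters, written as "\f".
(defvar my-sample-text
  "a\fb\fc\fChapter 1\nOne\ffoo\fChapter 2\nTwo")

;; Hypothetical helper: skip PAGE-1 formfeeds, then search forward
;; for each of STRINGS, and return the line where point ends up.
(defun my-goto-page-and-search (text page &rest strings)
  "Go to PAGE of TEXT, search for STRINGS, return the current line."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (> page 1)
      (search-forward "\f" nil nil (1- page)))
    (dolist (s strings)
      (search-forward s))
    (buffer-substring (line-beginning-position)
                      (line-end-position))))

(my-goto-page-and-search my-sample-text (+ 3 1))
;; => "Chapter 1"
(my-goto-page-and-search my-sample-text (+ 3 3) "Two")
;; => "Two"
```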

High-level hyperlinks to pdf-like documents

By executing

  (code-pdf      "fp" "/tmp/foo.pdf")
  (code-pdf-text "fp" "/tmp/foo.pdf" 3)

we can use shorter hyperlinks, like

  (find-fppage (+ 3 1) "Chapter 1")
  (find-fptext (+ 3 1) "Chapter 1")

instead of the longer forms with `find-pdf-page' and
`find-pdf-text'. This works exactly like `code-c-d', as explained
here:

  (find-code-c-d-intro)

Try these sexps to see the code that the `code-pdf' and the
`code-pdf-text' above execute:

  (find-code-pdf      "fp" "/tmp/foo.pdf")
  (find-code-pdf-text "fp" "/tmp/foo.pdf" 3)

There is a wrapping command for producing these
`code-pdf'/`code-pdf-text' pairs quickly - `M-P'. Try it here:

  fp /tmp/foo.pdf

Producing and refining hyperlinks to pages

We also have something like this

  (find-eval-intro "Producing and refining hyperlinks")

for pdf-like documents, that will let us produce hyperlinks to
the current page of the current pdf-like document very quickly,
but it depends on several hacks.

Note that the functions `code-pdf', `code-pdf-text',
`find-xxxpage', `find-xxxtext', set the global variables
`ee-page-c', `ee-page-fname', and `ee-page-offset'. You can
inspect their definitions with:

  (find-code-pdf      "fp" "/tmp/foo.pdf")
  (find-code-pdf-text "fp" "/tmp/foo.pdf" 3)

Here's how these variables are used. Try this:

  (code-pdf      "fp" "/tmp/foo.pdf")
  (code-pdf-text "fp" "/tmp/foo.pdf" 3)
  (kill-new "Two")
  (eek "M-h M-p")

You should get a page with several hyperlinks to the "current
page" of the current pdf-like document, including some like
these:

  (find-fppage 1)
  (find-fptext 1)
  (find-fppage (+ 3 -2))
  (find-fptext (+ 3 -2))

  (find-fppage 1 "Two")
  (find-fptext 1 "Two")
  (find-fppage (+ 3 -2) "Two")
  (find-fptext (+ 3 -2) "Two")

Where did the "fp", the "1", the "3", the "-2" and the "Two"
above come from?

The page number, which in the links above is sometimes "1",
sometimes "(+ 3 -2)", is obtained by counting the number of
formfeeds before point; this makes sense only when we are
visiting the buffer generated by "(find-fptext ...)". The "fp" is
taken from the variable `ee-page-c', which was set by
`(code-pdf-text "fp" ...)' or by `(find-fptext ...)'; same for
"3", which is taken from the variable `ee-page-offset'. Finally,
the "Two" is the last kill, from the top of the kill-ring; we
usually set it by selecting a region of text from the
`(find-fptext ...)' buffer and typing `M-w'.

An alternative way to produce hyperlinks to pages, which, like
the hack above, also uses `ee-page-c' and `ee-page-offset', is to
prepare a series of lines with a page number followed by a text
that will play a role similar to the "last kill", and then type
`M-Q' on each line. Try this below, by first executing the
`code-pdf' and the `code-pdf-text' and then typing `M-Q' on each
of the four lines.

  (code-pdf      "debt" "~/books/graeber__debt.pdf")
  (code-pdf-text "debt" "~/books/graeber__debt.pdf" 8)

1 1 On The Experience of Moral Confusion
21 2 The Myth of Barter
43 3 Primordial Debts
73 4 Cruelty and Redemption

It is usually not hard to produce such page-number-plus-text
lines for `M-Q' from the table of contents of a book. The ones
above were extracted from

  (find-debttext 7 "Contents")

with a bit of fiddling by hand and keyboard macros. Keyboard
macros are VERY useful; if you don't use them yet, see:

  (find-enode "Keyboard Macros")
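
The two tricks described above - counting formfeeds before point,
and turning a "page number plus text" line into a hyperlink - can
be sketched as follows. The three functions are hypothetical and
are not eev's code; in particular, the exact text that `M-Q'
inserts may differ from what `my-line-to-link' returns. The code
"c" and the offset are passed as arguments here, where eev would
take them from `ee-page-c' and `ee-page-offset'.

```elisp
;; Hypothetical: the current page is 1 plus the number of formfeeds
;; between the beginning of the buffer and point.
(defun my-current-page ()
  "Return the page number at point in a formfeed-separated buffer."
  (1+ (how-many "\f" (point-min) (point))))

;; Hypothetical: build the plain and the offset-style links to PAGE
;; as strings, optionally refined with the string STR.
(defun my-page-links (c page offset &optional str)
  "Return two link strings to PAGE of the document with code C."
  (let ((sum  (format "(+ %d %d)" offset (- page offset)))
        (tail (if str (format " %S" str) "")))
    (list (format "(find-%spage %d%s)" c page tail)
          (format "(find-%spage %s%s)" c sum tail))))

(my-page-links "fp" 1 3 "Two")
;; => ("(find-fppage 1 \"Two\")" "(find-fppage (+ 3 -2) \"Two\")")

;; Hypothetical: turn a line like "21 2 The Myth of Barter" into a
;; link whose page is written as offset plus page name.
(defun my-line-to-link (c offset line)
  "Convert LINE, a page name followed by text, into a link string."
  (if (string-match "\\`\\s-*\\([0-9]+\\)\\s-+\\(.*\\)\\'" line)
      (format "(find-%spage (+ %d %s) %S)"
              c offset (match-string 1 line) (match-string 2 line))
    (error "Not a page-number-plus-text line: %s" line)))

(my-line-to-link "debt" 8 "21 2 The Myth of Barter")
;; => "(find-debtpage (+ 8 21) \"2 The Myth of Barter\")"
```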