(Chapa 1)


Emacs and eev, or: How to Automate Almost Everything

Eduardo Ochs
(HTML version: 2006jan23)


Interacting with programs with command-line interfaces always involves a bit of line editing, and each CLI program tends to implement independently its own minimalistic editing features. We show a way of centralizing these editing tasks by making these programs receive commands that are prepared, and sent from, Emacs. The resulting system is a kind of Emacs- and Emacs Lisp-based “universal scripting language” in which commands can be sent to both external programs and to Emacs itself either in blocks or step-by-step under very fine control from the user.

Note: this is a working draft that has many pieces missing and needs urgent revision on the pieces it has. Current version: (see top). Newer versions are being uploaded to http://angg.twu.net/eev-current/article/, and two animations (in Flash) showing eev at work can be found at: http://angg.twu.net/eev-current/anim/channels.anim.html and http://angg.twu.net/eev-current/anim/gdb.anim.html.

Quick Index:

Installing eev involves changing some rcfiles (using eev-rctool) (find-eev "eev-rctool") (find-angg "TH/eev-article.blogme" "making-progs-receive-cmds") but unistalling and undoing the changes to rcfiles is very easy. (find-eev "INSTALL")

1. Three kinds of interfaces

Interactive programs in a Un*x system(1) can have basically three kinds of interfaces: they can be mouse-oriented, like most programs with graphical interfaces nowadays, in which commands are given by clicking with the mouse; they can be character-oriented, like most editors and mail readers, in which most commands are single keys or short sequences of keys; and they can be line-oriented, as, for example, shells are: in a shell commands are given by editing a full line and then typing “enter” to process that line.

It is commonplace to classify computer users in a spectrum where the “users” are in one extreme and the “programmers” are in the other; the “users” tend to use only mouse-oriented and character-oriented programs, and the “programmers” only character-oriented and line-oriented programs.

In this paper we will show a way to “automate” interactions with line-oriented programs, and, but not so well, to character-oriented programs; more precisely, it is a way to edit commands for these programs in a single central place --- Emacs --- and then send them to the programs; re-sending the same commands afterwards, with or without modifications, then becomes very easy.

This way (“e-scripts”) can not be used to send commands to mouse-oriented programs --- at least not without introducing several new tricks. But “programmers” using Un*x systems usually see most mouse-oriented programs --- except for a few that are intrinsically mouse-oriented, like drawing programs --- as being just wrappers around line-oriented programs than perform the same tasks with different interfaces; and so, most mouse-oriented programs “do not matter”, and our method of automating interactions using e-scripts can be used to automate “almost everything”; hence the title of the paper.

(1): Actually we are more interested in GNU systems than in “real” Unix systems; the reasons will become clear in the section nnn. By the way: the term “Unix” is Copyright (C) Bell Labs).

2. “Make each program do one thing well”

One of the tenets of the Unix philosophy is that each program should do one thing, and do it well; this is a good design rule for Unix programs because the system makes it easy to invoke external programs to perform tasks, and to connect programs.

Some of parts of a Unix system are more like “meta-programs” or “sub-programs” than like self-contained programs that do some clearly useful task by themselves. Shells, for example, are meta-programs: their main function is to allow users to invoke “real programs” and to connect these programs using pipes, redirections, control structures (if, for, etc) and Unix “signals”. On the other hand, libraries are sub-programs: for example, on GNU systems there's a library called GNU readline that line-oriented programs can use to get input; if a program, say, bc (a calculator) gets its input by calling readline(...) instead of using the more basic function fgets(...) then its line-oriented interface will have a little more functionality: it will allow the user to do some minimal editing in the current line, and also to recall, edit and issue again some of the latest commands given.

3. Making programs receive commands

Many line-oriented programs allow “scripting”, which means executing commands from a file. For example, in most shells we can say “source ~/ee.sh”, and the shell will then execute the commands in the file ~/ee.sh. There are other ways of executing commands from a file --- like “sh ~/ee.sh” --- but the one with “source” is the one that we'll be more interested in, because it is closer to running the commands in ~/ee.sh one by one by hand: for example, with “source ~/ee.sh” the commands that change parameters of the shell --- like the current directory and the environment variables --- will work in the obvious way, while with “sh ~/ee.sh” they would only change the parameters of a temporary sub-shell; the current directory and the environment variables of the present shell would be “protected”.

So, it is possible to prepare commands for a shell (or for scriptable line-oriented programs; for arbitrary line-oriented programs see the section nnn) in several ways: by typing them at the shell's interface --- and if the shell uses readline its interface can be reasonably friendly --- or, alternatively, by using a text editor to edit a file, say, ~/ee.sh, and by then “executing” that file with “source ~/ee.sh”. “source ~/ee.sh” is a lot of keystrokes, but that can be shortened if we can define a shell function: by putting

function ee () { source ~/ee.sh; }
in the shell's initialization file (~/.bashrc, ~/.zshrc, ...) we can reduce “source ~/ee.sh” to just “ee”: e, e, enter --- three keystrokes.

We just saw how a shell --- or, by the way, any line-oriented program in which we can define an `ee' function like we did for the shell --- can receive commands prepared in an external editor and stored in a certain file; let's refer to that file, ~/ee.sh, as a “temporary script file”. Now it remains to see how an external text editor can “send commands to the shell”, i.e., how to make the editor save some commands in a temporary script file in a convenient way, that is, without using too many keystrokes...

4. Sending commands

GNU Emacs, “the extensible, self-documenting text-editor” ([S79]), does at least two things very well: one is to edit text, and so it can be used to edit temporary scripts, and thus to send commands to shells and to line-oriented programs with `ee' functions; and the other one is to run Lisp. Lisp is a powerful programming language, and (at least in principle!) any action or series of actions can be expressed as a program in Lisp; the first thing that we want to do is a way to mark a region of a text and “send it as commands to a shell”, by saving it in a temporary script file. We implement that in two ways:

1: (defun ee (s e)
 2:   "Save the region in a temporary script"
 3:   (interactive "r")
 4:   (write-region s e "~/ee.sh"))
 6: (defun eev (s e)
 7:   "Like `ee', but the script executes in verbose mode"
 8:   (interactive "r")
 9:   (write-region
10:    (concat "set -v\n" (buffer-substring s e)
11:            "\nset+v")
12:    nil "~/ee.sh"))
`ee' (the name stands for something like `emacs-execute') just saves the currently-marked region of text to ~/ee.sh; `eev' (for something like `emacs-execute-verbose') does the same but adding to the beginning of the temporary script a command to put the shell in “verbose mode”, where each command is displayed before being executed, and also adding at the end an command to leave verbose mode.

We can now use `ee' and `eev' to send a block of commands to a shell: just select a region and then run `ee' or `eev'. More precisely: mark a region, that is, put the cursor at one of the extremities of the region, then type C-SPC to set Emacs's “mark” to that position, then go to other extremity of the region and type M-x eev (C-SPC and M-x are Emacs's notations for Control-Space and Alt-x, a.k.a. “Meta-x”). After doing that, go to a shell and make it “receive these commands”, by typing `ee'.

5. Hyperlinks

When we are using a system like *NIX, in a part of the time we are using programs with which we are perfectly familiar, and in the rest of the time we are using things that we don't understand completely and that make us have to access the documentation from time to time. In a GNU system the documentation is all on-line, and the steps needed to access any piece of documentation can be automated. We can use Emacs Lisp “one-liners” to create “hyperlinks” to files:

A: (info "(emacs)Lisp Eval")
B: (find-file "~/usrc/busybox-1.00/shell/ash.c")
C: (find-file "/usr/share/emacs/21.4/lisp/info.el")

These expressions, when executed --- which is done by placing the cursor after them and then typing C-x C-e, or, equivalently, M-x eval-last-sexp --- will (A) open a page of Emacs manual (the manual is a set of files in “Info” format), (B) open the source file `shell/ash.c' of a program called busybox, and (C) open the file `info.el' from the Emacs sources, respectively. As some of these files and pages can be very big, these hyperlinks are not yet very satisfactory: we want ways to not only open these files and pages but also to “point to specific positions”, i.e., to make the cursor go to these positions automatically. We can do that by defining some new hyperlink functions, that are invoked like this:

A': (find-node "(emacs)Lisp Eval" "C-x C-e")
B': (find-fline "~/usrc/busybox-1.00/shell/ash.c"
C': (find-fline "/usr/share/emacs/21.4/lisp/info.el"
                "defun info")

The convention is that these “extended hyperlink functions” have names like `find-xxxnode', `find-xxxfile', or `find-xxxyyy'; as the name `find-file' was already taken by a standard Emacs function we had to use `find-fline' for ours.

Here are the definitions of `find-node' and `find-fline':

14: (defun ee-goto-position (&optional pos-spec)
15:   "If POS-SPEC is a string search for its first
16:    occurrence in the file; if it is a number go to the
17:    POS-SPECth line; if it is nil, don't move."
18:   (cond ((null pos-spec))
19:         ((numberp pos-spec)
20:          (goto-char (point-min))
21:          (forward-line (1- pos-spec)))
22:         ((stringp pos-spec)
23:          (goto-char (point-min))
24:          (search-forward pos-spec))
25:         (t (error "Invalid pos-spec: %S" pos-spec))))
27: (defun find-fline (fname &optional pos-spec)
28:   "Like (find-file FNAME), but accepts a POS-SPEC"
29:   (find-file fname)
20:   (ee-goto-position pos-spec))
32: (defun find-node (node &optional pos-spec)
33:   "Like (info NODE), but accepts a POS-SPEC"
34:   (info node)
35:   (ee-goto-position pos-spec)))

Now consider what happens when we send to a shell a sequence of commands like this one:

# (find-node "(gawk)Fields")
seq 4 9 | gawk '{print $1, $1*$1}'
the shell ignores the first line because of the `#', that makes the shell treat that line as a comment; but when we are editing that in Emacs we can execute the `(find-node ...)' with C-x C-e. Hyperlinks can be mixed with shell code --- they just need to be marked as comments.

Note: the actual definitions of `eev', `ee-goto-position', `find-fline' and `find-node' in eev's source code are a bit more complex than the code in the listings above (lines 6--12 in the previous section and 14--35 in the current section). In all the (few) occasions in this paper where we will present the source code of eev's functions what will be shown are versions that implement only the “essence” of those functions, stripped down of all extra functionality. The point that we wanted to stress with those listings is how natural it is to use Emacs in a certain way, as an editor for commands for external programs, and with these plain-text hyperlinks that can be put almost anywhere: the essence of that idea can be implemented in 30 lines of Lisp and one or two lines of shell code.

(See also: Section \ref{e-scripts})

6. Shorter Hyperlinks

The hyperlinks in lines A'', B'' and C'', below,

A'': (find-enode "Lisp Eval" "C-x C-e")
B'': (find-busyboxfile "shell/ash.c" "void\nevalpipe")
C'': (find-efile "info.el" "defun info")
are equivalent to the ones labeled A', B', C' in Section 5, but are a bit shorter, and they hide details like Emacs's path and the version of BusyBox; if we switch to newer versions of Emacs and BusyBox we only need to change the definitions of `find-busyboxfile' and `find-efile' to update the hyperlinks. Usually not many things change from one version of a package to another, so most hyperlinks continue to work after the update.

Eev defines a function called `code-c-d' that makes defining functions like `find-enode', `find-busyboxfile' and `find-efile' much easier:

(code-c-d "busybox" "~/usrc/busybox-1.00/")
(code-c-d "e" "/usr/share/emacs/21.4/lisp/" "emacs")

The arguments for `code-c-d' are (1) a “code” (the “xxx” in a “find-xxxfile”), (2) a directory, and optionally (3) the name of a manual in Info format. The definition of code-c-d is not very interesting, so we won't show it here.

7. Keys for following hyperlinks and for going back

(Rewrite this; mention M-k, M-K, `to' and the (disabled) stubs to implement a `back' command)

It is so common to have Lisp hyperlinks that extend from some position in a line --- usually after a comment sign --- to the end of the line that eev implements a special key for executing these hyperlinks: the effect of typing M-e (when eev is installed and “eev mode” is on) is roughly the same of first going to the end of the line and then typing C-x C-e; that is, M-e does the same as the key sequence C-e C-x C-e(1).

(There are many other kinds of hyperlinks. Examples?)

(1) The main difference between M-e and C-e C-x C-e is how they behave when called with numeric “prefix arguments”: for example, M-0 M-e highlights temporarily the Lisp expression instead of executing it and M-4 M-e executes it with some debugging flags turned on, while C-x C-e when called with any prefix argument inserts the result of the expression at the cursor instead of just showing it at the echo area.

8. Dangerous hyperlinks

Note that these “hyperlinks” can do very dangerous things. If we start to execute blindly every Lisp expression we see just because it can do something interesting or take us to an interesting place then we can end up running something like:

(shell-c ommand "rm -Rf ~")
which destroy all files in our home directory; not a good idea. Hyperlinks ought to be safer than that...

The modern approach to safety in hyperlinks --- the one found in web browsers, for example --- is that following a hyperlink can execute only a few kinds of actions, all known to be safe; the “target” of a hyperlink is something of the form http://..., ftp://..., file://..., info://..., mailto:... or at worst like javascript:...; none of these kinds of actions can even erase our files. That approach limits a lot what hyperlinks can do, but makes it harmless to hide the hyperlink action and display only some descriptive text.

Eev's approach is the opposite of that. I wrote the first functions of eev in my first weeks after installing GNU/Linux in my home machine and starting using GNU Emacs, in 1994; before that I was using mostly Forth (on MS-DOS), and I hadn't had a lot of exposure to *nix systems by then --- in particular, I had tried to understand *nix's notions of user IDs and file ownerships and permissions, and I felt that they were a thick layer of complexity that I wasn't being able to get through.

Forth's attitude is more like ``the user knows what he's doing''; the system is kept very simple, so that understanding all the consequences of an action is not very hard. If the user wants to change a byte in a critical memory position and crash the machine he can do that, and partly because of that simplicity bringing the machine up again didn't use to take more than one minute (in the good old days, of course). Forth people developed good backup strategies to cope with the insecurities, and --- as strange as that might sound nowadays, where all machines are connected and multi-user and crackers abound --- using the system in the Forth way was productive and fun.

*NIX systems are not like Forth, but when I started using them I was accustomed to this idea of achieving simplicity through the lack of safeguards, and eev reflects that. The only thing that keeps eev's hyperlinks reasonably safe is transparency: the code that a hyperlink executes is so visible that it is hard to mistake a dangerous Lisp expression for a “real” hyperlink. Also, all the safe hyperlink functions implemented by eev start with `find-', and all the `find-' functions in eev are safe, except for those with names like `find-xxxsh' and `find-xxxsh0: for example,

(find-sh "wget --help" "recursive download")
executes “wget --help”, puts the output of that in an Emacs buffer and then jumps to the first occurrence of the string “recursive download” there; other `find-xxxsh' functions are variations on that that execute some extra shell commands before executing the first argument --- typically either switching to another directory or loading an initialization file, like ~/.bashrc or ~/.zshrc. The `find-xxxsh0' functions are similar to their `find-xxxsh' counterparts, but instead of creating a buffer with their output they just show it at Emacs's echo area and they use only the first argument and ignore the others (the pos-spec).

9. Generating Hyperlinks

Do we need to remember the names of all hyperlinks functions, like find-fline and find-node? Do we need to type the code for each hyperlink in full by hand? The answers are “no” and “no”.

Eev implements several functions that create temporary buffers containing hyperlinks, that can then be cut and pasted to other buffers. For example, `M-h M-f' creates links about an Emacs Lisp function: typing `M-h M-f' displays a prompt in a minibuffer asking for the name of an Elisp function; if we type, say, `find-file' there (note: name completion with the TAB key works in that prompt) we get a buffer like the one in figure 1.

|# (find-efunction-links 'find-file)                      |
|                                                         |
|# (where-is 'find-file)                                  |
|# (describe-function 'find-file)                         |
|# (find-efunctiondescr 'find-file)                       |
|# (find-efunction 'find-file)                            |
|# (find-efunctionpp 'find-file)                          |
|# (find-efunctiond 'find-file)                           |
|# (find-eCfunction 'find-file)                           |
|# (find-estring (documentation 'find-file))              |
|# (find-estring (documentation 'find-file t))            |
|                                                         |
|# (Info-goto-emacs-command-node 'find-file)              |
|# (find-enode "Command Index" "* find-file:")            |
|# (find-elnode "Index" "* find-file:")                   |
|                                                         |
|                                                         |
|                                                         |
|--:**  *Elisp hyperlinks*   All L18    (Fundamental)-----|

 Figure 1: the result of typing M-h M-f find-file

The first line of that buffer is a hyperlink to that dynamically-generated page of hyperlinks. Its function --- `find-efunction-links' --- has a long name that is hard to remember, but there's a shorter link that will do the same job:

(eek "M-h M-f find-file")

The argument to `eek' is a string describing a sequence of keys in a certain verbose format, and the effect of running, say, (eek "M-h M-f find-file") is the same as of typing `M-h M-f find-file'.

((M-h is a prefix; (eek "M-h C-h") shows all the sequences with the same prefix.))

((Exceptions: M-h M-c, M-h M-2, M-h M-y. Show examples of how to edit hyperlinks with M-h M-2 and M-h M-y.))

((Mention hyperlinks about a key sequence? (eek "M-h M-k C-x C-f")))

((Mention hyperlinks about a Debian package? (eek "M-h M-d bash")))

10. Returning from Hyperlinks

((Mention M-k to kill the current buffer, and how Emacs asks for confirmation when it's a file and it's modified))

((Mention M-K for burying the current buffer))

((Mention what to do in the cases where a hyperlink points to the current buffer (section 16); there used to be an “ee-back” function bound to M-B, but to reactivate it I would have to add back some ugly code to `to'... (by the way, that included Rubikitch's contributions)))

((Web browsers have a way to “return” from hyperlinks: the “back” button... In eev we have many kinds of hyperlinks, including some that are unsafe and irreversible, but we have a few kinds of “back”s that work... 1) if the hyperlink opened a new file or buffer, then to kill the file or buffer, use M-k (an eev binding for kill-this-buffer); note that it asks for a confirmation when the buffer is associated to a file and it has been modified --- or we can use bury-buffer; M-K is an eev binding for bury-buffer. ((explain how emacs keeps a list of buffers?)) Note: if the buffer contains, say, a manpage, or an html page rendered by w3m, which take a significant time to generate, then M-K is better is than M-k. 2) if the hyperlink was a `to' then it jumped to another position in the same file... it is possible to keep a list of previous positions in a buffer and to create an `ee-back' function (suggestion: bind it to M-B) but I haver never been satisfied with the implementations that I did so we're only keeping a hook in `to' for a function that saves the current position before the jump))

((dto recommended winner-undo))

11. Local copies of files from the internet

Emacs knows how to fetch files from the internet, but for most purposes it is better to use local copies. Suppose that the environment variable $S is set to ~/snarf/; then running this on a shell

mkdir -p $S/http/www.gnu.org/software/emacs/
cd       $S/http/www.gnu.org/software/emacs/
wget http://www.gnu.org/software/emacs/emacs-paper.html

# (find-fline "$S/http/www.gnu.org/software/emacs/emacs-paper.html")
# (find-w3m   "$S/http/www.gnu.org/software/emacs/emacs-paper.html")
creates a local copy of emacs-paper.html inside ~/snarf/http/. The two last lines are hyperlinks to the local copy; `find-w3m' opens it “as HTML”, using a web browser called w3m that can be run either in standalone mode or inside Emacs; `find-w3m' uses w3m's Emacs interface, and it accepts extra arguments, which are treated as a pos-spec-list.

Instead of running the `mkdir', `cd' and `wget' lines above we can run a single command that does everything:

psne http://www.gnu.org/software/emacs/emacs-paper.html
which also adds a line with that URL to a log file (usually ~/.psne.log). It is more convenient to have a `psne' that changes the current directory of the shell than one that doesn't, and for that it must be defined as a shell function.

Eev comes with an installer script, called eev-rctool, that can help in adding the definitions for eev (like the “function ee () { source ~/ee.sh; }” of section 3) to initialization files like ~/.bashrc (such initialization files are termed “rcfiles”). Eev-rctool does not add by default the definitions for `psne' and for $S to rcfiles; however, it adds commented-out lines with instructions, which might be something like:

# To define $S and psne uncomment this:
#   . $EEVTMPDIR/psne.sh
# (find-eevtmpfile "psne.sh")

((See: http://lists.gnu.org/archive/html/eev/2005-06/msg00000.html))

12. Glyphs

Emacs allows redefining how characters are displayed, and one of the modules of eev --- eev-glyphs --- uses that to make some characters stand out. Character 15, for example, is displayed on the screen by default as '^O' (two characters, suggesting “control-O”), sometimes in a different color from normal text(3).

Eev changes the appearance of char 15 to make it be displayed as a red star. Here is how: Emacs has some structures called “faces” that store font and color information, and `eeglyphs-face-red' is a face that says “use the default font and the default background color, but a red foreground”; eev's initialization code runs this,

(eev-set-glyph 15 ?* 'eev-glyph-face-red)
which sets the representation of char 15 to the “glyph” made of a star in the face eeglyphs-face-red.

For this article, as red doesn't print well in black and white, we used this instead:

(eev-set-glyph 15 342434)
this made occurrences of char 15 appear as the character 342434, `*' (note that this is outside of the ascii range), using the default face, i.e., the default font and color.

Eev also sets a few other glyphs with non-standard faces. The most important of those are `«' and '»', which are set to appear in green against the default background, with:

(eev-set-glyph 171 171 'eev-glyph-face-green)
(eev-set-glyph 187 187 'eev-glyph-face-green)
There's a technical point to be raised here. Emacs can use several “encodings” for files and buffers, and `«' and `»' only have character codes 171 and 187 in a few cases, mainly in the `raw-text' encoding and in “unibyte” buffers; in most other encodings they have other char codes, usually above 255, and when they have these other codes Emacs considers that they are other characters for which no special glyphs were set and shows them in the default face. This visual distinction between the below-255 `«' and `' and the other `«' and `»'s is deliberate --- it helps preventing some subtle bugs involving the anchor functions of section \ref{anchors}.

(3). Determined by the “face” escape-glyph-face, introduced in GNU Emacs in late 2004.

13. Compose Pairs

To insert a `*' in a text we type `C-q C-o' --- C-q “quotes” the next key that Emacs receives, and `C-q C-o' inserts a “literal C-o”, which is a char 15. Typing `«' and `»'s --- and other non-standard glyphs, if we decide to define our own --- involves using another module of eev: eev-compose.

Eev-compose defines a few variables that hold tables of “compose pairs”, which map pairs of characters that are easy to type into other, weirder characters; for example, `eev-composes-otheriso' says that the pair "<<" is mapped to "«" and that ">>" is mapped to "»", among others. When we are in “eev mode” the prefix `M-,' can be used to perform the translation: typing `M-, < <' enters `«', and typing `M-, > >' enters `»'.

The variable `eev-composes-accents' holds mappings for accented chars, like "'a" to "á" and "cc" to "ç"; `eev-composes-otheriso' takes care of the other mappings that still concern characters found in the ISO8859-1 character set, like `«' and '»' as above, "_a" to "ª", "xx" to "×", and a few others; `eev-composes-globalmath' and `eev-composes-localmath' are initially empty and are meant to be used for used-defined glyphs. The suffix `math' in their names is a relic: Emacs implements its own ways to enter special characters, which support several languages and character encodings, but their code is quite complex and they are difficult to extend; the code that implements eev's `M-,', on the other hand, takes about just 10 lines of Lisp (excluding the tables of compose pairs) and it is trivial to understand and to change its tables of pairs. `M-,' was created originally to enter special glyphs for editing mathematical texts in TeX, but it turned out to be a convenient hack, and it stuck.

14. Delimited regions

Sometimes it happens that we need to run a certain (long) series of commands over and over again, maybe with some changes from one run to the next; then having to mark the block all the time becomes a hassle.

One alternative to that is using a variaton on `M-x eev': `M-x eev-bounded'. It saves the region around the cursor up to certain delimiters instead of saving what's between Emacs's “point” and “mark”.

The original definition of eev-bounded was something like this:

(defun eev-bounded ()
  (eev (ee-search-backwards "\n#*\n")
       (ee-search-forward   "\n#*\n")))
the call to `ee-search-backwards' searches for the first occurrence of the string "\n#*\n" (newline, hash sign, control-O, newline) before the cursor and returns the position after the "\n#*\n", without moving the cursor; the call to ee-search-forward does something similar with a forward search. As the arguments to `eev' indicate the extremities of the region to be saved into the temporary script, this saves the region between the first "\n#*\n" backwards from the cursor to the first "\n#*\n" after the cursor.

The actual definition of `eev-bounded' includes some extra code to highlight temporarily the region that was used; see Figure \ref{fig:F3}. Normally the highlighting lasts for less than one second, but here we have set its duration to several seconds to produce a more interesting screenshot.

____________________ emacs@localhost _______________________
|                                                _________ xterm __________
|#*                                             |/home/edrx(edrx)# ee      |
|# Global variables                             |# Global variables        |
|lua50 -e '                                     |lua50 -e '                |
|  print(print)                                 |  print(print)            |
|  print(_G["print"])                           |  print(_G["print"])      |
|  print(_G.print)                              |  print(_G.print)         |
|  print(_G)                                    |  print(_G)               |
|  print(_G._G)                                 |  print(_G._G)            |
|'                                              |'                         |
|#*                                             |function: 0x804dfc0       |
|# Capture of local variables                   |function: 0x804dfc0       |
|lua50 -e '                                     |function: 0x804dfc0       |
|  foo = function ()                            |table: 0x804d420          |
|    local storage                              |table: 0x804d420          |
|    return                                     |/home/edrx(edrx)#         |
|      (function () return storage end),        |__________________________|
|      (function (x) storage = x; return x end)              |
|  end                                                       |
|  get1, set1 = foo()                                        |
|  get2, set2 = foo()               -- Output:               |
|  print(set1(22), get1())          -- 22 22                 |
|  print(set2(33), get1(), get2())  -- 33 22 33              |
|'                                                           |
|#*                                                          |
|                                                            |
|-:--  lua5.e   91% L325    (Fundamental)--------------------|

  Figure 2: sending a delimited block with F3
  (find-fline "ss-lua.png")
  (find-eevex "screenshots.e" "fisl-screenshots")

Eev binds the key F3 to the function `eeb-default', which runs the current “default bounded function” (which is set initially to `eev', not `eev-bounded') on the region between the current default delimiters, using the current default “highlight-spec”; so, instead of typing `M-x eev-bounded' inside the region to save it we can just type F3.

All these defaults values come from a single list, which is stored in the variable `eeb-defaults'. The real definition of `eev-bounded' is something like:

(setq eev-bounded
  '(eev ee-delimiter-hash nil t t))

(defun eev-bounded ()
  (setq eeb-defaults eev-bounded)

Note that in Emacs Lisp (and in most other Lisps) each symbol has a value as a variable that is independent from its “value as a function”: actually a symbol is a structure containg a name, a “value cell”, a “function cell” and a few other fields. Our definition of `eev-bounded', above, includes both a definition of the function `eev-bounded' and a value for the variable `eev-bounded'.

Eev has an auxiliary function for defining these “bounded functions”; running

(eeb-define 'eev-bounded 'eev 'ee-delimiter-hash nil t t)
has the same effect as doing the `setq' and the `defun' above.

As for the meaning of the entries of the list `eeb-defaults', the first one (`eev') says which function to run; the second one (`ee-delimiter-hash') says which initial delimiter to use --- in this case it is a symbol instead of a string, and so `eeb-default' takes the value of the variable `ee-delimiter-hash'; the third one (nil) is like the second one, but for the final delimiter, and when it is nil `eeb-default' considers that the final delimiter is equal to the initial delimiter; the fourth entry (t) means to use the standard highlight-spec, and the fifth one (t, again) tells `eeb-default' to make an adjustment to the highlighted region for purely aestethical reasons: the saved region does not include the initial "\n" in the final delimiter, "\n#*\n", but the highlighting looks nicer if it is included; without it the last highlighted line in Figure 2 would have only its first character --- an apostrophe --- highlighted.

Eev also implements other of these “bounded” functions. For example, running `M-x eelatex' on a region saves it in a temporary LaTeX file, and also saves into the temporary script file the commands to process it with LaTeX; `eelatex-bounded' is defined by

(eeb-define 'eelatex-bounded 'eelatex
  'ee-delimiter-percent nil t t)
where the variable `ee-delimiter-percent' holds the string "\n%*\n"; comments in LaTeX start with percent signs, not hash signs, and it is convenient to use delimiters that are treated as comments.

((The block below ... tricky ... blah. How to typeset `*' in LaTeX. Running eelatex-bounded changed the defaults stored in eeb-defaults, but ee-once blah doesn't.))

% (eelatex-bounded)
% (ee-once (eelatex-bounded))
  \hbox to \wd0{\hss#1\hss}%
\catcode`*=13 \def*{\myttbox{$\bullet$}}

...for example eelatex, that saves the region (plus certain standard header and footer lines) to a “temporary LaTeX file” and saves into the temporary script file the commands to make `ee' run LaTeX on that and display the result. The block below is an example of (...)

...The block below shows a typical application of eev-bounded:

# (find-es "lua5" "install-5.0.2")
# (find-es "lua5" "install-5.0.2" "Edrx's changes")

# (code-c-d "lua5" "/tmp/usrc/lua-5.0.2/")
# (find-lua5file "INSTALL")
# (find-lua5file "config" "support for dynamic loading")
# (find-lua5file "config")
# (find-lua5file "")
rm   -Rv ~/usrc/lua-5.0.2/
mkdir -p ~/usrc/lua-5.0.2/
tar   -C ~/usrc/ \
   -xvzf $S/http/www.lua.org/ftp/lua-5.0.2.tar.gz
cd       ~/usrc/lua-5.0.2/
cat >> config <<'---'
DLLIB= -ldl
EXTRA_LIBS= -lm -ldl
make test  2>&1 | tee omt
./bin/lua -e 'print(loadlib)'
it unpacks a program (the interpreter for Lua), changes its default configuration slightly, then compiles and tests it.

((Comment about the size: the above code is “too small for being a script”, and the hyperlinks are important))

gdb (here-documents, gcc, ee-once)

(alternative: here-documents, gcc, gdb, screenshot(s) for gdb)

15. Communication channels

The way that we saw to send commands to a shell is in two steps: first we use M-x eev in Emacs to “send” a block of commands, and then we run `ee' at the shell to make it “receive” these commands. But there is also a way to create shells that “listen” not only to the keyboard for their input, but also to certain “communication channels”; by making Emacs send commands through these communication channels we can skip the step of going to the shell and typing `ee' --- the commands are received immediately.

_________emacs@localhost____________   ___________channel A______________
|                                    | |/tmp(edrx)# # Send things to port |
|* (eechannel-xterm "A") ;; create   | | 1234                             |
|* (eechannel-xterm "B") ;; create   | |/tmp(edrx)# {                     |
|# Listen on port 1234               | |>   echo hi                       |
|netcat -l -p 1234                   | |>   sleep 1                       |
|*                                   | |>   echo bye                      |
|* (eechannel "A") ;; change target  | |>   sleep 1                       |
|# Send things to port 1234          | |> } | netcat -q 0 localhost 1234  |
|{                                   | |/tmp(edrx)#                       |
|  echo hi                           | |/tmp(edrx)#                       |
|  sleep 1                           | |__________________________________|
|  echo bye                          |  ___________channel B______________
|  sleep 1                           | |/tmp(edrx)# # Listen on port 1234 |
|} | netcat -q 0 localhost 1234      | |/tmp(edrx)# netcat -l -p 1234     |
|                                    | |hi                                |
|-:--  screenshots.e   95% L409   (Fu| |bye                               |
|_Wrote /home/edrx/.eev/eeg.A.str____| |/tmp(edrx)#                       |
                                       |                                  |
 Figure 3: sending commands to two xterms using F9
 (find-eevex "screenshots.e" "fisl-screenshots")
 (find-eevfile "article/ss-f9.png")

The screenshot at Figure 3 shows this at work. The user has started with the cursor at the second line from the top of the screen in the Emacs window and then has typed F9 several times. Eev binds F9 to a command that operates on the current line and then moves down to the next line; if the current line starts with `*' then what comes after the `*' is considered as Lisp code and executed immediately, and the current line doesn't start with `*' then its contents are sent through the default communication channel, or though a dummy communication channel if no default was set.

The first F9 executed `(eechannel-xterm "A")', which created an xterm with title “channel A”, running a shell listening on the communication channel “A”, and set the default channel to A; the second F9 created another xterm, now listening to channel B, and set the default channel to B.

The next two `F9's sent each one one line to channel B. The first line was a shell comment (“# Listen...”); the second one started the program netcat, with options to make netcat “listen to the internet port 1234” and dump to standard output what it receives.

The next line had just `*'; executing the rest of it as Lisp did nothing. The following line changed the default channel to A.

In the following lines there is a small shell program that outputs “hi”, then waits one second, then outputs “bye”, then waits for another second, then finishes; due to the “| netcat...” its output is redirected to the internet port 1234, and so we see it appearing as the output of the netcat running on channel B, with all the expected delays: one second between “hi” and “bye”, and one second after “bye”; after that last one-second delay the netcat at channel A finishes receiving input (because the program between `{' and `}' ends) and it finishes its execution, closing the port 1234; the netcat at B notices that the port was closed and finishes its execution too, and both shells return to the shell prompt.

There are also ways to send whole blocks of lines at once through communication channels; see Section \ref{bigmodular}.

15.1. The Implementation of Communication Channels

Communication channels are implemented using an auxiliary script called `eegchannel', which is written in Expect ([L90] and [L95]). If we start an xterm in the default way it starts a shell (say, /bin/bash) and interacts with it: the xterm sends to the shell as characters the keystrokes that it receives from the window manager and treats the characters that the shell sends back as being instructions to draw characters, numbers and symbols on the screen. But when we run `(eechannel-xterm "A")' Emacs creates an xterm that interacts with another program --- eegchannel --- instead of with a shell, and eegchannel in its turn runs a shell and interacts with it.

Eegchannel passes characters back and forth between the xterm and the shell without changing them in any way; it mostly tries to pretend that it is not there and that the xterm is communicating directly with the shell. However, when eegchannel receives a certain signal it sends to the shell a certain sequence of characters that were not sent by the xterm; it “fakes a sequence of keystrokes”.

Let's see a concrete example. Suppose than Emacs was running with process id (“pid”) 1000, and running (eechannel-xterm "A") in it made it create an xterm, which got pid 1001; that xterm ran eegchannel (pid 1002), which ran /bin/bash (pid 1003). Actually Emacs invoked xterm using this command line:

xterm -n "channel A" -e eegchannel A /bin/bash
and xterm invoked eegchannel with “eegchannel A /bin/bash”; eegchannel saw the `A', saved its pid (1002) to the file ~/.eev/eeg.A.pid, and watched for SIGUSR1 signals; every time that it (the eegchannel) receives a SIGUSR1 it reads the contents of ~/.eev/eeg.A.str and sends that as fake input to the shell that it is controlling. So, running
echo 'echo $[1+2]' > ~/.eev/eeg.A.str
kill -USR1 $(cat ~/.eev/eeg.A.pid)
in a shell sends the string “echo $[1+2]” (plus a newline) “through the channel A”; what Emacs does when we type F9 on a line that does not start with `*' corresponds exactly to that.

16. Anchors

The function `to' can be used to create hyperlink to certain positions --- called “anchors” --- in the current file. For example,

# Index:
# «.first_block»        (to "first_block")
# «.second_block»       (to "second_block")

# «first_block»  (to ".first_block")
echo blah
# «second_block»  (to ".second_block")
echo blah blah

What `to' does is simply to wrap its argument inside `«' and `»' characters and then jump to the first occurrence of the resulting string in the current file. In the (toy) example above, the line that starts with “# «.first_block»” has a link that jumps to the line that starts with “# «first_block»”, which has a link that jumps back --- the anchors and “(to ...)”s act like an index for that file.

The function `find-anchor' works like a `to' that first opens another file. For example,

(find-anchor "~/.zshrc" "update-homepage")
does roughly the same as:
(find-fline "~/.zshrc" "«update-homepage»")

Actually `find-anchor' consults a variable, `ee-anchor-format', to see in which strings to wrap the argument. Some functions modify `ee-anchor-format' temporarily to obtain special effects; for example, a lot of information about the packages installed in a Debian GNU system is kept in a text file called /var/lib/dpkg/info/status; (find-status "emacs21") opens this file and searches for the string "\nPackage: emacs21\n" there --- that string is the header for the block with information about the package emacs21, and it tells the size of the package, description, version, whether it is installed or not, etc, in a format that is both machine-readable and human-readable.

17. E-scripts

The best short definition for eev that I've found involves some cheating, as it is a circular definition: “eev is a library that adds support for e-scripts to Emacs” --- and e-scripts are files that contain chunks meant to be processed by eev's functions. Almost any file can contain parts “meant for eev”: for example, a HOWTO or README file about some program will usually contain some example shell commands, and we can mark these commands and execute them with M-x eev; and if we have the habit of using eev and we are writing code in, say, C or Lua we will often put elisp hyperlinks inside comment blocks in our code. These two specific languages (and a few others) have a feature that is quite convenient for eev: they have syntactical constructs that allow comment blocks spanning several lines --- for example, in Lua, where these comment blocks are delimited by “--((” and “--))”s, we can have a block like

# This file: (find-fline "~/LUA/lstoindexhtml.lua")
# A test:
cd /tmp/
ls -laF | col -x \
  | lua50 ~/LUA/lstoindexhtml.lua tmp/ \
  | lua50 -e 'writefile("index.html", io.read("*a"))'
in a Lua script, and the script will be at the same time a Lua script and an e-script.

When I started using GNU and Emacs the notion of an e-script was something quite precise to me: I was keeping notes on what I was learning and on all that I was trying to do, and I was keeping those notes in a format that was partly English (or Portuguese), partly executable things --- not all of them finished, or working --- after all, it was much more practical to write

rm -Rv ~/usrc/busybox-1.00/
tar -C ~/usrc/ -xvzf \
cd ~/usrc/busybox-1.00/
cp -iv ~/BUSYBOX/myconfig .config
make menuconfig
make       2>&1 | tee om
than to write
Unpack BusyBox's source, then run "make menuconfig"
and "make" on its main directory
because if I had the second form in my notes I would have to translate that from English into machine commands every time... So, those files where I was keeping my notes contained “executable notes”, or were “scripts for Emacs”, and I was quite sure that everyone else around were also keeping notes in executable formats, possibly using other editors and environments (vi, maybe?) and that if I showed these people my notes and they were about some task that they were also struggling with then they would also show me their notes... I ended up making a system that uploaded regularly all my e-scripts (no matter how messy they were) to my home page, and writing a text --- “The Eev Manifesto” ([O99]) --- about sharing these executable notes.

Actually trying to define an e-script as being “a file containing executable parts, that are picked up and executed interactively” makes the concept of an e-script very loose.

Note that we can execute the Lua parts in the code above by running the Lua interpreter on it, we can execute the elisp one-liner with M-e in Emacs, and we can execute the shell commands using F3 or M-x eev; but the code will do nothing by itself --- it is passive.

A piece of code containing instructions in English on how to use it is also an e-script, in a sense; but to execute these instructions we need to invoke an external entity --- a human, usually ourselves --- to interpret them. This is much more flexible, but also much more error-prone and slow, than just pressing a simple sequence of keys like M-e, or F9, or F3, alt-tab, e, e, enter.

18. Splitting eev.el

When I first submittted eev for inclusion in GNU Emacs, in 1999, the people at the FSF requested some changes. One of them was to split eev.el --- the code at that point was all in a single Emacs Lisp file, called eev.el --- into several separate source files according to functionality; at least the code for saving temporary scripts and the code for hyperlinks should be kept separate.

It turned out that that was the wrong way of splitting eev. The frontier between what is a hyperlink and what is a block of commands is blurry:

man foo
man -P 'less +/bar' foo
# (eev "man foo")
# (eev "man -P 'less +/bar' foo")
# (find-man "foo" "bar")

The two `man' commands above can be considered as hyperlinks to a manpage, but we need to send those commands to a shell to actually open the manpage; the option "-P 'less +/bar'" instructs `man' to use the program `less' to display the manpage, and it tells `less' to jump to the first occurrence of the string “bar” in the text, and so it is a hyperlink to a specific position in a manpage. Each of the two `eev' lines, when executed, saves one of these `man' commands to the temporary script file; because they contain Lisp expressions they look much more like hyperlinks than the `man' lines. The last line, `find-man', behaves much more like a “real” hyperlink: it opens the manpage inside Emacs and searches for the first occurrence of `bar' there; but Emacs's code for displaying manpages was tricky, and it took me a few years to figure out how to add support for pos-spec-lists to it...

So, what happens is that often a new kind of hyperlink will begin its life as a series of shell commands (another example: using `gv --page 14 file.ps' to open a PostScript file and then jump to a certain page) and then it takes some time to make a nice hyperlink function that does the same thing; and often these functions are implemented by executing commands in external programs.

There's a much better way to split conceptually what eev does, though. Most functions in eev take a region of text (for example Emacs's own “selected region”, or the extent of Lisp expression coming before the cursor) and “execute” that in some way; the kinds of regions are

Emacs's (selected) region   | M-x eev, M-x eelatex (sec. 4)
last-sexp (Lisp expression  | C-x C-e, M-E         (sec. 5)
at the left of the cursor)  |
sexp-eol (go to end of      | C-e C-x C-e, M-e     (sec. 7)
line, then last-sexp)       |
bounded region              | F3, M-x eev-bounded,
                            | M-x eelatex-bounded (sec. 14)
bounded region around       | (ee-at [`` anchor] ...)
anchor                      |                     (sec. 20)
current line                | F9                  (sec. 15)
no text (instead use the    | F12                 (sec. 19)
next item in a list)        |

Actions (can be composed): * Saving a region or a string into a file * Sending a signal to a process * Executing as Lisp * Executing immediately in a shell * Start a debugger

((Emacs terminology: commands))

19. Steps

((Simple examples))

((writing demos))

((hyperlinks for which no short form is known))

((producing animations and screenshots))

20. Big Modular E-scripts

A shell can be run in two modes: either interactively, by expecting lines from the user and executing them as soon as they are received\footnote{except for multi-line commands.}, or by scripts: in the later case the shell already has access to the commands, and executes them in sequence as fast as possible, with no pause between one command and the next.

When we are sending lines to a shell with F9 we are telling it not only what to execute but also when to execute it; this is somewhat similar to running a program step-by-step inside a debugger --- but note that most shells provide no single-stepping facilities.

We will start with a toy example --- actually the example from Section \ref{anchors} with five new lines added at the end --- and then in the next section we will see a real-world example that uses these ideas.

Figure 4: sending a block at once with eevnow-at
  (find-fline "ss-modular.png")

  Figure 5: single-stepping through a C program
  (find-fline "ss-gdbwide.png")

((Somewhere between a script and direct user interaction))

((No loops, no conditionals))

((Several xterms))

21. Internet Skills for Disconnected People

Suppose that we have a person P who has learned how to use a computer and now wants to learn how the internet works. That person $P$ knows a bit of programming and can use Emacs, and sure she can use e-mail clients and web browsers by clicking around with the mouse, but she has grown tired of just using those things as black boxes; now she wants to experiment with setting up HTTP and mail servers, to understand how data packets are driven around, how firewalls can block some connections, such things.

The problem is that P has never had access to any machine besides her own, which is connected to the internet only through a modem; and also, she doesn't have any friends who are computer technicians or sysadmins, because from the little contact that she's had with these people she's got the impression that they live lifes that are almost as grey as the ones of factory workers, and she's afraid of them. To add up to all that, $P$ has some hippie job that makes her happy but poor, so she's not going to buy a second computer, and the books she can borrow, for example, Richard Stevens' series on TCP/IP programming, just don't cut.

One of eev's intents isto make life easier for autodidacts. Can it be used to rescue people in positions like $P$'s(4)? It was thinking on that that I created a side-project to eev called “Internet Skills for Disconnected People”: it consists of e-scripts about running a second machine, called the “guest”, emulated inside the “host”, and making the two talk to each other via standard internet protocols, via emulated ethernet cards. Those e-scripts make heavy use of the concepts in the last section ((...))

Figure 6: a call map
  (find-fline "iskidip.png")
  (find-eimage0 "./iskidip.png")

% (find-eevex "busybox.e" "bb_chroot_main")
% (find-eevex "busybox.e" "bbinitrd-qemu-main")
% (find-eevex "busybox.e" "iso-qemu-main")
% (find-eevex "busybox.e" "iso-qemu-main-2")

(4). by the way, I created $P$ inspired on myself; my hippie job is being a mathematician.

22. Availability and Resources

Eev can be downloaded from the author's homepage, http://angg.twu.net/. That page also contains lots of examples, some animations showing some of eev's features at work, a mailing list, etc.

Eev is in the middle of the process of becoming a standard part of GNU Emacs; I expect it to be integrated just after the release of GNU Emacs 22.1 in mid-2005. Eev's copyright has already been transferred to the FSF; it is distributed under the GPL license.

23. Acknowledgments

I would like to thank David O'Toole, Diogo Leal and Leslie Watter for our countless hours of discussions about eev; many of the recent features of eev --- almost half of this article --- were conceived at our talks.

((Thank also the people at #emacs, for help with the code and for small revision tips))

24. References

[L90] - Libes, D. - Expect: Curing Those Uncontrollable Fits of Interaction. 1990. Available online from http://expect.nist.gov/.

[L95] - Libes, D. - Exploring Expect. O'Reilly, 1995.

[O99] - Ochs, E. - The Eev Manifesto (http://angg.twu.net/eev-manifesto.html).

[S79] - Stallman, R. - EMACS: The Extensible, Customizable Display Editor. (http://www.gnu.org/software/emacs/emacs-paper.html)