This is the file `INTERNALS' of blogme.
This file describes the internals of blogme2, which are quite different
from the internals of (the old) blogme; blogme2 is MUCH cleaner.

Author:  Eduardo Ochs <edrx@mat.puc-rio.br>
Version: 2005sep09
License: GPL
Site:    <http://angg.twu.net/>


The main tables used by blogme
==============================
_G:  Lua globals (<http://www.lua.org/manual/5.0/manual.html#predefined>)
_W:  blogme words
_P:  low-level parsers
_A:  argument-parsing functions for blogme words
_AA: abbreviations for argument-parsing functions (see `def')
_V:  blogme variables (see "$" and `withvars')


Blogme words (the tables _W and _A)
===================================
Let's examine an example. When blogme processes:

  [HREF http://foo bar]

it expands it to:

  <a href="http://foo">bar</a>

When the blogme evaluator processes a bracketed expression (a "brexp")
it first obtains the first "word" of the brexp (called the "head" of
the brexp), which in this case is "HREF"; then it parses and evaluates
the "arguments" of the brexp, and invokes the function associated with
the word "HREF" using those arguments. Different words may have
different ways of parsing and evaluating their arguments; this is like
the distinction in Lisp between functions and special forms, and like
special words such as LIT in Forth. Here are the hairy details: if
HREF is defined by

  HREF = function (url, str)
      return "<a href=\""..url.."\">"..str.."</a>"
    end
  _W["HREF"] = HREF
  _A["HREF"] = vargs2

then the "value" of [HREF http://foo bar] will be the same as the
value returned by HREF("http://foo", "bar"), because

  _W["HREF"](_A["HREF"]())

will be the same as:

  HREF(vargs2())

When vargs2 is run, the parser is just after the end of the word "HREF"
in the brexp, and running vargs2() there parses the rest of the brexp
and returns two strings, "http://foo" and "bar".
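The dispatch described above can be tried out with a small standalone
sketch. It is not blogme2's actual code: the vargs2 below is a
simplified stand-in that only splits plain text (the real one also has
to evaluate nested brexps), and eval_brexp_body is a hypothetical
helper that receives the text between the brackets. Only HREF, _W and
_A are taken from the text above.

```lua
-- Sketch only; vargs2 and eval_brexp_body are simplified stand-ins.
_W, _A = {}, {}
subj, pos = "", 1

-- Stand-in for vargs2: skip spaces, read one word, skip spaces,
-- and return that word plus the rest of the subject.
function vargs2()
  local _, _, a, rest = string.find(subj, "^%s*(%S+)%s*(.*)$", pos)
  pos = string.len(subj) + 1
  return a, rest
end

HREF = function (url, str)
    return "<a href=\""..url.."\">"..str.."</a>"
  end
_W["HREF"] = HREF
_A["HREF"] = vargs2

-- Hypothetical helper: evaluate the inside of a brexp (no nesting).
function eval_brexp_body(body)
  subj, pos = body, 1
  local _, e, head = string.find(subj, "^%s*(%S+)", pos)
  pos = e + 1                      -- now just after the head word
  return _W[head](_A[head]())      -- same shape as in the text above
end

print(eval_brexp_body("HREF http://foo bar"))
--> <a href="http://foo">bar</a>
```

Because _A[head]() is the last argument of the call, Lua passes all of
its return values on to _W[head], which is how vargs2 can supply both
arguments of HREF at once.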

See: (info "(elisp)Function Forms")
and: (info "(elisp)Special Forms")


The blogme parsers (the table _P)
=================================
Blogme has a number of low-level parsers, each one identified by a
string (a "blogme pattern"); the (informal) "syntax" of those blogme
patterns was vaguely inspired by Lua5's syntax for patterns.
(See: <http://www.lua.org/manual/5.0/manual.html#pm>).
In the table below "BP" stands for "blogme pattern".

  BP    Long name/meaning      Corresponding Lua pattern
 -----+----------------------+--------------------------
 "%s" | space char           | "[ \t\n]"
 "%w" | word char            | "[^%[%]]"
 "%c" | normal char          | "[^ \t\n%[%]]"
 "%B" | bracketed expression | "%b[]"
 "%W" | bigword              | "(%w*%b[]*)*" (but not the empty string!)

The low-level parsing functions of blogme are of two kinds (levels):

  * Functions in the "parse only" level only succeed or fail. When
    they succeed they return true and advance the global variable
    `pos'; when they fail they return nil and leave pos unchanged (*).

  * Functions in the "parse and process" level are like the functions
    in the "parse only" level, but with something extra: when they
    succeed they store in the global variable `val' the "semantic
    value" of the thing that they parsed. When they fail they are
    allowed to garble `val', but they won't change `pos'.

See: (info "(bison)Semantic Values")

These low-level parsing functions are stored in the table `_P', with
the index being the "blogme patterns". They use the global variables
`subj', `pos', `b', `e', and `val'.

An example: running _P["%w+"]() tries to parse a (non-empty) series of
word chars starting at pos; running _P["%w+:string"]() does the same,
but in case of success the semantic value is stored into `val' as a
string -- the comment ":string" in the name of the pattern indicates
that this is a "parse and process" function, and says something about
how the semantic value is built.
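
To make the two levels concrete, here is a minimal standalone sketch
of how _P["%w+"] and _P["%w+:string"] might look. It is an
illustration, not the code from blogme2. The Lua pattern for word
chars is copied from the table above; the use of `b' and `e' as the
start of the last match and the position just after it is an
assumption, since this file only names those variables.

```lua
-- Sketch only; not the real blogme2 definitions.
_P = {}
subj, pos, val = "", 1, nil

-- "%w" (word char) is "[^%[%]]" in the table above; "^" anchors at pos.
local WORDCHARS = "^[^%[%]]+"

-- Parse only: return true and advance pos, or return nil and
-- leave pos unchanged.
_P["%w+"] = function ()
  local s, f = string.find(subj, WORDCHARS, pos)
  if not s then return nil end
  b, e = s, f + 1        -- assumption: b = start, e = just past the end
  pos = e
  return true
end

-- Parse and process: same, but on success store the matched
-- text in val as a string.
_P["%w+:string"] = function ()
  if not _P["%w+"]() then return nil end
  val = string.sub(subj, b, e - 1)
  return true
end

subj, pos = "abc[def]", 1
print(_P["%w+:string"](), pos, val)   --> true  4  abc
print(_P["%w+"](), pos)               --> nil   4
```

The second call starts at the "[" and fails, so pos stays at 4, as the
"parse only" convention requires.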

(*): Blogme patterns containing a semicolon (";") violate the
convention that patterns that fail do not advance pos. Parsing "A;B"
means first parsing "A", not caring whether it succeeds or fails, and
discarding its semantic value (if any); then parsing "B", and
returning the result of parsing "B". If "A" succeeds but "B" fails
then "A;B" will fail, but pos will have been advanced to the end of
"A". "A" is usually "%s*".
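
The behaviour of "A;B" can be seen in a small standalone sketch. The
helpers make_parser and seq are hypothetical names introduced only for
this illustration, and the pattern name "%s*;%w+" is just an example
key; none of this is taken from the blogme2 sources. Only parse-only
functions are used, so there is no semantic value to discard.

```lua
-- Sketch only; make_parser and seq are hypothetical helpers.
_P = {}
subj, pos = "", 1

-- Build a parse-only function from a Lua pattern anchored at pos.
function make_parser(luapat)
  return function ()
    local s, f = string.find(subj, "^" .. luapat, pos)
    if not s then return nil end
    pos = f + 1
    return true
  end
end

-- "A;B": run A and ignore its result, then return the result of B.
-- If A advanced pos and B fails, pos is NOT restored.
function seq(parse_a, parse_b)
  return function ()
    parse_a()
    return parse_b()
  end
end

_P["%s*"]     = make_parser("[ \t\n]*")   -- space chars, from the table
_P["%w+"]     = make_parser("[^%[%]]+")   -- word chars, from the table
_P["%s*;%w+"] = seq(_P["%s*"], _P["%w+"])

subj, pos = "   [x]", 1
print(_P["%s*;%w+"](), pos)   --> nil   4
subj, pos = "  abc]", 1
print(_P["%s*;%w+"](), pos)   --> true  6
```

In the first call "%s*" consumes the three spaces and "%w+" then fails
at the "[", so the combined parser fails with pos left at 4 instead of
1, which is exactly the violation described above.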