12.1 - Serialization

Frequently we need to serialize some data, that is, to convert the data into a stream of bytes or characters, so that we can save it into a file or send it through a network connection. We can represent serialized data as Lua code, in such a way that, when we run the code, it reconstructs the saved values into the reading program.
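
As a minimal sketch of this idea, assume a saved file contained the two assignments held in the string saved below. Running that text as a chunk recreates the values as global variables. The names saved, count, and title are invented for this illustration, and the expression loadstring or load is used only so that the sketch runs on Lua 5.1 as well as on later versions.

```lua
-- Hypothetical contents of a data file written by some saving program.
local saved = [[
count = 10
title = "Lua"
]]

-- Lua 5.1 compiles strings with loadstring; later versions use load.
local compile = loadstring or load

-- Compile the text into a function, then call it to run the assignments.
local chunk = assert(compile(saved))
chunk()

-- Both globals now exist in the reading program.
print(count, title)
```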

Usually, if we want to restore the value of a global variable, our chunk will be something like varname = <exp>, where <exp> is the Lua code to create the value. The varname is the easy part, so let us see how to write the code that creates a value. For a numeric value, the task is easy:

    function serialize (o)
      if type(o) == "number" then
        io.write(o)
      else
        -- ... (other types are added below)
      end
    end
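
To see how the numeric case fits the varname = <exp> pattern, here is a small sketch. It builds the chunk as a string instead of writing it with io.write, so that the result can be inspected and loaded back. The helper name number_chunk and the variable limit are assumptions made for this illustration only.

```lua
-- Hypothetical helper: build the chunk text "varname = <exp>" for a number.
local function number_chunk (varname, value)
  assert(type(value) == "number", "this sketch handles numbers only")
  -- Convert the number to its textual form and prepend the assignment.
  return varname .. " = " .. tostring(value)
end

local text = number_chunk("limit", 42)
print(text)   --> limit = 42

-- Running the chunk restores the global variable.
local compile = loadstring or load   -- Lua 5.1 vs. later versions
assert(compile(text))()
print(limit)  --> 42
```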

For a string value, a naive approach would be something like

    if type(o) == "string" then
      io.write("'", o, "'")

However, if the string contains special characters (such as quotes or newlines), the resulting code will not be a valid Lua program. Here, you may be tempted to solve this problem by changing the quotes:

    if type(o) == "string" then
      io.write("[[", o, "]]")

Do not do that! Double square brackets are intended for hand-written strings, not for automatically generated ones. If a malicious user manages to direct your program to save something like " ]]..os.execute('rm *')..[[ " (for instance, she can supply that string as her address), your final chunk will be

    varname = [[ ]]..os.execute('rm *')..[[ ]]

You will get a nasty surprise when you try to load this "data": the chunk calls os.execute('rm *') while evaluating the expression.
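
The danger can be demonstrated safely by replacing os.execute with a harmless stand-in. In the sketch below, the function harmless and the variable address are hypothetical names chosen for the illustration; the point is only that code hidden in the "data" runs as soon as the chunk is loaded.

```lua
local called = false

-- Harmless stand-in for os.execute: it only records that it was called.
function harmless ()
  called = true
  return ""
end

-- The "address" supplied by the malicious user.
local address = " ]]..harmless()..[[ "

-- The naive serializer wraps the string in double square brackets.
local chunk = "varname = [[" .. address .. "]]"
print(chunk)   --> varname = [[ ]]..harmless()..[[ ]]

-- Loading the "data" executes the injected call.
local compile = loadstring or load   -- Lua 5.1 vs. later versions
assert(compile(chunk))()
print(called)  --> true
```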

To quote an arbitrary string in a secure way, the format function, from the standard string library, offers the option "%q". It surrounds the string with double quotes and properly escapes double quotes, newlines, and some other characters inside the string. Using this feature, our serialize function now looks like this:

    function serialize (o)
      if type(o) == "number" then
        io.write(o)
      elseif type(o) == "string" then
        io.write(string.format("%q", o))
      else
        -- ... (other types)
      end
    end
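
A quick way to gain confidence in "%q" is a round trip: quote a string that contains troublesome characters, load the result as Lua code, and compare it with the original. The following sketch relies only on string.format and the standard loader; the sample string and the variable names are invented for the test.

```lua
-- Sample string with quotes, a newline, a backslash, and closing brackets.
local original = 'She said "hi"\nthen typed ]] and a \\ backslash'

-- Quote it safely and wrap it in a chunk that returns the value.
local chunk = "return " .. string.format("%q", original)

-- Compile and run the chunk to get the string back.
local compile = loadstring or load   -- Lua 5.1 vs. later versions
local restored = assert(compile(chunk))()

print(restored == original)  --> true
```

Because the quoted text is an ordinary Lua string literal, the injected brackets and quotes stay inside the string instead of ending it.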