An MUA (Mail User Agent) featuring an innovative approach to email message storage is now under development, and it bodes well for those email users who number their messages in the thousands. Gmail (http://gmail.linuxpower.org/), the brainchild of Wayne Schuller, a young Australian programmer, uses the MySQL database server, and retrieves, sorts and displays messages via SQL, the well known "Structured Query Language." It became possible to develop Gmail as an Open Source project when, earlier this year, MySQL changed its licensing to the GNU GPL. Schuller is a advocate: "I'm a big supporter of the GNU project and the importance of GPL licensing; that pushed me in the [direction of] Gnome two years ago." (He also reminded this reviewer that Gmail is not the only MUA using a SQL database; Pronto (http://www.muhri.net/pronto/), written in Perl, is very much a going concern.)
Gmail in "three pane" mode. (Click here for larger image)
It isn't hard to accumulate thousands of email messages. If one makes some careful choices among the myriad mailing lists that abound across the Internet, messages by the hundreds will pour in daily, and add up to thousands as the days go by. Given that hard drive space has become quite inexpensive, anyone can now create their own archives of lists that are of interest to them. If you've ever struggled with those online "searchable" archives of email lists, accessed with a web browser, then you can appreciate how such personal archives can have a use, sometimes a life-saving use, if the solution to a quandry can be quickly found because those archives were created and maintained.
Gnomes, Trees, and QueriesStoring thousands of pieces of information in a fashion that permits easy retrieval is a task computers have always been eminently qualified to undertake. Over the past ten years or so the development of computer databases has converged on SQL as the preeminent tool for manipulating the contents of these vast warehouses or "mines" of data. SQL is common to practically all the major commercial database products. The historical timeline stretches from origins in the Ingres and Postgres databases and forward to the current Open Source incarnations of PostgreSQL and MySQL. Gmail takes advantage of this mature technology to "nail down" the software requirements for storing, sorting, and viewing thousands upon thousands of messages.
Like many Open Source projects, Gmail began as a way to learn programming in a particular environment, in this case, the Gnome graphical desktop. In keeping perhaps with the trend set by pine and elm, several MUAs written for the Gnome desktop have adopted names of trees, such as Balsa, Spruce, Mahagony and Eucalyptus. Initially Schuller wanted to join the Balsa project, already well underway:
"The project [Gmail] allowed me to learn how to code in gtk and the gnome technologies. I did want to contribute to Balsa, but after many hours effort I found I couldn't really understand the code. Starting from scratch allowed me to learn all the gtk/gnome stuff."
Although still very much in "beta" phase, Gmail is already on its way to becoming a full-featured email client. For receiving mail it supports POP downloads from multiple mail accounts. APOP and IMAP have not been added yet, but APOP accounts can be utilized via a combination of fetchmail and a POP server (for instance qpopper) running on the local host. There is drag and drop support for utilizing the Gnome addressbook to manage email addresses. Messages stored in the MySQL tables can be exported to a standard Unix mbox file format, and, as noted, Unix inbox spool files can be read into Gmail with a POP server. Sound events can be configured via the Gnome Control Center, and printing is provided by the Gnome-Print facility. For users of dialup connections, outgoing messages can be composed offline and queued for sending later. Just in the past month or so Gmail's mime support has made great strides. For the true "hacker" (not to be confused with computer security vandals) Gmail's user interface can be custom tuned with a little proficiency in the use of Gnome's "glade" facility.
Virtual Folders: Mileage Out of SQLDespite this impressive list of features, Gmail's real strength, what sets it apart from the pack, is its use of SQL queries to create and maintain "virtual folders." Many email clients have built-in facilities for sorting incoming mail into various "folders," or separate files according to rules defined by the user. The old Unix workhorse for this purpose, procmail, is still widely used, but many newcomers shy away from procmail due to "regex phobia," fear of the arcane syntax used by the "regular expressions" that make up procmail's rules for sorting.
For users who need only a few folders, and don't anticipate accumulating thousands of messages, a few separate "folder" files will not present any problems. But for the industrial strength email list addict (or anyone else who finds it necessary to shepherd thousands of messages) these separate files soon become both numerous and large, factors which make the work of backing up or archiving messages increasingly tedious. At a certain point the diminishing returns of the file folders method become apparent, and it makes more sense to maintain (and eventually archive) *one* file that contains all the messages received in a certain period of time. Instead of individual files ("folders") containing the messages of each email list (or subject, or sender, or whatever criteria are used to define the folders...more on which later) Gmail creates virtual folders which are simply SQL "views" of the message database. From Gmail's User Guide:
"It's main feature is the powerful vfolder mechanism. Imagine all your emails are sitting in the one big folder, but you can look at it in different ways (queries). These queries happen at run-time. So you can have a virtual folder (vfolder) which gives you all messages with 'ham' in the subject. And you can have another vfolder which gives you all messages with 'pork' in the subject. And if someone sends you a message with 'pork and ham' in the subject, it will appear under both queries!"
No new files are created, but from the user's point of view, these virtual folders appear to be identical in appearance and use to those that are based on a set of individual folder files. When it comes time to archive the MySQL tables (two are used) that hold Gmail's messages, one has the option of exporting the tables to an mbox format file. Then all sorts of possibilities are open, such as indexing that file with a tool like "glimpse" (http://www.webglimpse.org), or compressing it with either gzip or bzip2 and then using grepmail (http://grepmail.sourceforge.net) to read out messages matching a given condition or set of keywords. Or, other MySQL tools can be utilized (or created!) to access and retrieve messages directly from the tables. The use of MySQL tables means the keyword for Gmail's message storage is: power.