Adapting or creating a site for a global audience is a significant challenge. You need to localize the content, be ready to answer requests in foreign languages and, if you sell online, comply with local regulations in terms of privacy, taxation and more.
From a design standpoint, you need to integrate the localized content in the navigation. You want foreign visitors to find the localized content that was so hard to create!
HTTP can help with the navigation thanks to "content negotiation." The principle is the following: when the client requests a resource from the server, it adds information about the user's preferences, including the languages that he or she reads. The server analyses this information, sent in the Accept-Language header, and returns the most appropriate localized version.
The header looks like:
Accept-Language: fr-be, fr;q=0.8, en;q=0.5
That is, it lists the acceptable locales, written as language tags (according to RFC 1766). The optional q ("quality") parameter establishes a hierarchy: it ranges from 0 to 1, and an entry without q counts as 1. Higher values are preferred so, in the above example, the user wants the Belgian French version; failing that, the generic French version; failing that, the English version.
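To make the ranking concrete, here is a minimal Python sketch of how a script might read such a header. The function name parse_accept_language is mine, not part of Apache or of any library, and treating a malformed q as 0 is an assumption of this sketch.

```python
def parse_accept_language(header):
    """Return (language_tag, q) pairs sorted from most to least preferred."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, *params = [part.strip() for part in item.split(";")]
        q = 1.0  # an entry without q gets the highest preference
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: a malformed q means "not acceptable"
        prefs.append((tag.lower(), q))
    # sorted() is stable, so tags with the same q keep their header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


print(parse_accept_language("fr-be, fr;q=0.8, en;q=0.5"))
# [('fr-be', 1.0), ('fr', 0.8), ('en', 0.5)]
```

The entries are separated by commas, and each q is attached to its language tag with a semicolon.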
The Apache web server implements content negotiation through the mod_negotiation module. By default the module is built with Apache.
mod_negotiation supports two working modes: type-map files and MultiViews. Type-maps are more flexible and I prefer to use them. To enable type-maps, uncomment the following line in httpd.conf and restart the server:
AddHandler type-map var
Alternatively you can add the statement to an .htaccess file, e.g. for virtual servers.
Next you must prepare a type-map for localized resources. The type-map associates files and languages. Listing 1 is an example:
Listing 1: about.var
URI: about

URI: special.cgi
Content-Language: fr, fr-be, fr-ca, fr-ch, fr-lu, fr-mc
Content-type: text/html
Description: Français

URI: about-german.html
Content-Language: de, de-ch, de-at, de-lu, de-li
Content-type: text/html
Description: Deutsch

URI: about.html
Content-Language: en, en-au, en-bz, en-ca, en-ie, en-jm, en-nz, en-ph, en-za, en-tt, en-gb, en-us, en-zw
Content-type: text/html
Description: English
The type-map is broken down into sections. The first section gives the URI for the multilingual resource. In the above example, if the client accesses http://domain/about, the server returns localized content.
The following sections associate language tags with specific resources. If the client requests one of the French locales, the server calls the special.cgi script; if it requests a German locale, the server returns about-german.html; and if it requests an English locale, the server returns about.html.
You can add more sections to support more languages. The URIs can point to CGI scripts, redirections, servlets, anything.
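The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Apache's actual algorithm, which weighs quality factors and other dimensions; the shortened TYPE_MAP and the function names parse_type_map and choose_variant are assumptions of the sketch.

```python
# A shortened type-map in the same layout as Listing 1 (sections separated
# by blank lines); the language lists are trimmed for the example.
TYPE_MAP = """\
URI: about

URI: special.cgi
Content-Language: fr, fr-be
Content-type: text/html

URI: about-german.html
Content-Language: de, de-ch
Content-type: text/html

URI: about.html
Content-Language: en, en-gb, en-us
Content-type: text/html
"""


def parse_type_map(text):
    """Split a type-map into sections; return one header dict per section."""
    sections = []
    for block in text.strip().split("\n\n"):
        headers = {}
        for line in block.splitlines():
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        sections.append(headers)
    return sections


def choose_variant(sections, preferred_tags):
    """Return the URI of the first section listing a preferred tag, else None."""
    for tag in preferred_tags:
        for section in sections:
            languages = section.get("content-language", "").split(",")
            if tag.lower() in [lang.strip().lower() for lang in languages]:
                return section["uri"]
    return None


sections = parse_type_map(TYPE_MAP)
print(choose_variant(sections, ["fr-be", "fr", "en"]))
```

The preferred tags are tried in order, so a client that lists Belgian French first is sent to special.cgi even though English is also acceptable.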
The Language Tag Problem
Unfortunately content negotiation is another great protocol that is hampered by poor browser implementations.
Technically, the user can request either a generic language, e.g. en for English, or a variation of the language, e.g. en-us for American English or en-gb for British English. In practice, users who read a variation understand the generic language as well.
Therefore browsers are encouraged to request both the specific variation and the more generic language. By default, however, the major browsers request only the specific variation, e.g. American English to the exclusion of generic English.
Since few users bother to change the default configuration, it pays to list all the variations in the type-map.
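If one of your URIs points to a script, the script can also compensate for the missing generic tags itself. The following Python sketch shows one way to do it; the function name with_generic_fallbacks is hypothetical, and placing the generic languages last, as a last resort, is my own choice rather than anything HTTP or Apache prescribes.

```python
def with_generic_fallbacks(tags):
    """Append the generic language of each tag, unless it is already listed.

    Explicitly requested tags keep their order; generic languages that the
    browser did not send are added at the end.
    """
    explicit = []
    for tag in tags:
        tag = tag.lower()
        if tag not in explicit:
            explicit.append(tag)

    generic = []
    for tag in explicit:
        primary = tag.split("-")[0]  # "en-us" -> "en"
        if primary not in explicit and primary not in generic:
            generic.append(primary)
    return explicit + generic


print(with_generic_fallbacks(["en-US"]))
# ['en-us', 'en']
```

With this in place, a browser that only asks for American English still matches a page tagged with plain en.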
If the browser requests a locale that is not available, mod_negotiation generates a (rather unfriendly) error 406 (Not Acceptable). You can replace the error page through the usual ErrorDocument directive or by adding the following section, which has no Content-Language line, immediately under the URI: about section:
URI: none.html
Content-type: text/html
Description: Default
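The difference between the two outcomes, a 406 error or a default page, can be pictured with a small Python sketch. It is a simplified model rather than what the module does internally, and the negotiate function and the variants dictionary are made up for the example.

```python
def negotiate(variants, preferred_tags, default_uri=None):
    """Return (status, uri) for a request.

    variants maps lowercase language tags to URIs. Without a default,
    an unmatched request ends in 406 (Not Acceptable).
    """
    for tag in preferred_tags:
        uri = variants.get(tag.lower())
        if uri is not None:
            return 200, uri
    if default_uri is not None:
        return 200, default_uri  # fall back to the language-neutral page
    return 406, None


variants = {"fr": "special.cgi", "de": "about-german.html", "en": "about.html"}
print(negotiate(variants, ["ja"]))
# (406, None)
print(negotiate(variants, ["ja"], default_uri="none.html"))
# (200, 'none.html')
```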
I like type-maps because they are very flexible. Depending on the locale, a resource can point to HTML pages, CGI scripts, servlets or even redirections (to send visitors to a country-specific site).
If you localize many pages, you may not want to write all the type-maps. mod_negotiation also supports MultiViews, which lets it extract the language from the file name. See the Apache manual for information on MultiViews.
Content negotiation greatly simplifies the navigation of a localized web site. Don't bother asking users for their language preferences; get them from the browser.
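To give a rough idea of what extracting the language from the file name means, here is a Python sketch. It assumes files named like about.fr.html or about.html.de and a table that maps extensions to language tags; both the naming scheme and the LANGUAGE_EXTENSIONS table are assumptions for illustration, so check the Apache manual for the real rules.

```python
# Hypothetical mapping from file extension to language tag.
LANGUAGE_EXTENSIONS = {"en": "en", "fr": "fr", "de": "de"}


def language_of(filename):
    """Return the language implied by a file name's extensions, or None."""
    _base, *extensions = filename.split(".")
    for extension in extensions:
        language = LANGUAGE_EXTENSIONS.get(extension.lower())
        if language is not None:
            return language
    return None


for name in ["about.fr.html", "about.html.de", "about.html"]:
    print(name, language_of(name))
```

The point is that the language travels with the file itself, so there is no separate map to maintain for each page.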
About the Author
Benoît Marchal is a Belgian writer and consultant. He is the author of XML by Example and other XML books. He is currently developing new training material on UML modelling and XML.