Computers have vastly enhanced our ability to collect large amounts of data. It's perfectly feasible for most retail establishments to hook up their network to every cash register these days, and capture the time, date, and details of every purchase. With customer rewards cards, these can be matched back to consumer details. Throw in basic weather information, the checker's name, the store number, and you can have terabytes of data piling up in no time.
But there's a problem with all that data: it's hard for people to figure out what it all means. That's fueled a rise in Business Intelligence (BI) software: software designed to help extract patterns from large amounts of data. Most BI software lets you summarize large amounts of data into simple tables and graphs. A new entrant in the market, Tableau, raises the bar a bit further. The vendor refers to Tableau as "visual analysis" software. The goal of Tableau is to take masses of numeric data and turn it into charts and graphs with minimal user intervention, letting you spot patterns via color, shape, and placement: skills that most of us are pretty good at. I used version 1 of Tableau to explore some data, and overall the experience was a good one.
Getting Started with Tableau
Like an Excel pivot chart, a Tableau sheet starts blank, with a batch of places you can drag data fields to. You connect to a data source (which can be Access, Excel, Microsoft Analysis Services, MySQL, or various other things) and Tableau guesses which fields are dimensions and which are measures. You can change the guess if it's wrong, but it's usually pretty accurate. Figure 1 shows a new Tableau sheet connected to an Excel worksheet full of sample data. The left side of the screen shows the various dimensions and measures, and the right side is waiting for these fields to be dragged to it.
To analyze the data, you just drag and drop. You can drag to a "shelf" to designate fields for columns, or rows, or filters - but you can also designate fields for marker shapes, or colors, or sizes. And depending on what you drag where, and which menu choices you make, Tableau will generate standard text crosstabs, or amazingly clever graphics.
For a quick start, drag the Date dimension to the Columns shelf, the Sales measure to the Rows shelf, and the Product Type dimension to the Color shelf. Tableau turns this into the bar chart shown in Figure 2, complete with legend. With most BI products, turning a crosstab into a chart would be a several-step process; Tableau effectively short-circuits the process to let you create the charts directly.
But just knowing sales by product by year is a very coarse measure of performance. It's easy to drill in and get more information out of Tableau by adding more fields to the worksheet. Drag the Market field onto the Columns shelf, and it adds subcolumns to the layout. Similarly, drag the Sales field to the Rows shelf and the Profit measure to the Size shelf. Figure 3 shows the result.
Let's take a moment and look at some of the information that you can see on this worksheet now, after just half a dozen drag-and-drop operations:
- In decaf beverages, the profit is all in smoothies.
- Almost all the profit in decaf is coming from the Central and West regions.
- There are no tea sales in the South region.
- The East region sells a lot of espresso but doesn't manage to make any money doing it.
These facts would be there whether you were looking at a numeric spreadsheet or even the raw data, of course. The nice thing about Tableau is that the automatic use of factors such as size and color makes this sort of information jump out at the human eye. We're very good at spotting such visual patterns.
Going Beyond Simplicity
Tableau's capabilities don't end with drag-and-drop chart construction, impressive though that is. For example, you can easily get the details for that mysteriously underperforming espresso in the East region. Just use Ctrl-click to select the two bar segments representing that data, then right-click and select View Underlying Data. Tableau opens up a data sheet, as shown in Figure 4, with the actual data represented by that portion of the chart. You can also choose to export the data behind any piece of a Tableau worksheet to an Access database. Or, if you're preparing a presentation, you can copy the table as an image for easy pasting into your slides.
Filtering data is easy as well. You can use the dropdown arrow next to any field that you've dragged to the worksheet to filter on that field, selecting one, some, or all values from the field to display. You can also drag any other field to the Filters shelf, and use that field to filter the data without it otherwise affecting the display. Done with a filter? Just drag it off the Filters shelf to get rid of it.
For finer distinctions, you can drag a field to the Level of Detail shelf. For example, drop the Product field there and each bar in the bar charts becomes segmented to show the contribution of individual products to its size and height. Hovering the cursor over any segment brings up a little window (similar to a tooltip) with the full details about what the segment represents.
In addition to displaying and working with data contained in the original data source, Tableau can also create calculated data fields. There's a reasonably rich set of operators for creating calculated fields, including numeric, date, and string functions. You can also create binned dimensions, which is useful for building histograms from raw numeric data.
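To make the two ideas concrete outside of Tableau, here is a minimal Python sketch of a calculated field and a binned dimension. It is not Tableau's formula syntax; the function names, the bin size of 100, and the sample rows are all made up for illustration.

```python
from collections import Counter

def profit_margin(sales, profit):
    """Calculated field: profit as a fraction of sales (0.0 when sales is 0)."""
    return profit / sales if sales else 0.0

def bin_label(value, bin_size):
    """Binned dimension: map a number to the lower edge of its bin."""
    return (value // bin_size) * bin_size

# Hypothetical rows of (sales, profit); not taken from the sample workbook.
rows = [(120, 30), (80, 8), (250, 100), (95, 19), (210, 42)]

# One derived value per row, as a calculated field would produce.
margins = [profit_margin(s, p) for s, p in rows]

# Count how many rows fall into each sales bin: the raw material for a histogram.
histogram = Counter(bin_label(s, 100) for s, _ in rows)

for lower_edge in sorted(histogram):
    print(f"{lower_edge}-{lower_edge + 99}: {'#' * histogram[lower_edge]}")
```

The binning step is what turns a continuous measure into a dimension you can group by, which is why it is the natural starting point for a histogram.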
All in all, I'm favorably impressed with Tableau as a data analysis tool. For its core purpose - picking out patterns from large masses of denormalized data - it works very well indeed. The tool is easy to learn, and a gallery of sample charts in the help file makes it easy to figure out how to produce everything from Gantt charts to simple line graphs from straightforward data sets. I tested with data up to a few hundred thousand rows on my LAN, and on that amount of data Tableau's response was very fast.
Of course, no tool is perfect, especially in its first release. Perhaps the most glaring limit here is the lack of support for a wide variety of data sources. Oracle is notably missing from the list of supported databases, and even the supported types are constrained in which version you can use. SQL Server 7.0, for example, is not supported (only SQL Server 2000 can be used with Tableau). If your data warehouse is in an unsupported format you're looking at a potentially time-consuming and annoying conversion to pull it into a SQL Server, Access, or other supported database.
You may also find yourself having to massage your data a bit before moving to Tableau, since Tableau itself expects to work with a single table, view, or OLAP cube at a time. If the data in question is spread across a batch of normalized tables, you need to design the view to denormalize it before your Tableau session. This is only a minor nuisance, but a nuisance nonetheless.
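As a rough sketch of what that denormalizing view might look like, the Python below builds three small normalized tables and flattens them with a single CREATE VIEW. SQLite is used here only because it ships with Python and keeps the example self-contained; it is not one of the data sources listed in this review, and the table names, column names, and rows are all hypothetical. In practice you would write the equivalent view in whichever supported database holds your data.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical normalized schema: one fact table, two lookup tables.
CREATE TABLE products (
    product_id   INTEGER PRIMARY KEY,
    product      TEXT,
    product_type TEXT
);
CREATE TABLE markets (
    market_id INTEGER PRIMARY KEY,
    market    TEXT
);
CREATE TABLE sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES products,
    market_id  INTEGER REFERENCES markets,
    sale_date  TEXT,
    sales      REAL,
    profit     REAL
);

INSERT INTO products VALUES (1, 'Decaf Espresso', 'Espresso'), (2, 'Green Tea', 'Tea');
INSERT INTO markets  VALUES (1, 'East'), (2, 'West');
INSERT INTO sales    VALUES (1, 1, 1, '2005-01-15', 200.0, -10.0),
                            (2, 2, 2, '2005-01-16', 150.0, 45.0);

-- The denormalizing view: one wide row per sale, lookups already joined in.
CREATE VIEW sales_flat AS
SELECT s.sale_date, m.market, p.product_type, p.product, s.sales, s.profit
FROM sales AS s
JOIN products AS p ON p.product_id = s.product_id
JOIN markets  AS m ON m.market_id  = s.market_id;
""")

# The analysis tool would be pointed at sales_flat rather than the base tables.
flat_rows = conn.execute("SELECT * FROM sales_flat ORDER BY sale_date").fetchall()
for row in flat_rows:
    print(row)
```

The view does the joins once, so the analysis tool sees a single flat source with every dimension and measure on each row.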
Still, I think these annoyances are far outweighed by the integration of graphics with analysis. In most BI products, you massage the data in a purely numeric grid, and then turn it into a graphic when you're satisfied with your results. This means that you spend a lot of time staring at rows and columns of numbers, trying to make sense out of them and spot patterns. With Tableau, you can use the graphics to help spot and refine the patterns even as you're slicing and dicing the data. This makes a huge difference in the ability to find useful patterns in the first place, and should justify the purchase price for many organizations.
Pricing and Specifications
Tableau 1.0 is available in three editions. The $999 Standard Edition can connect to Microsoft Excel, Microsoft Access, or plain text files. The $1299 Professional (MySQL) Edition adds MySQL to the list of supported data sources, while the $1799 Professional Edition extends the list to include Microsoft SQL Server and Microsoft SQL Server Analysis Services. If you have the full Professional Edition, you can also purchase a separate server-based product to add connectivity to Hyperion Essbase and IBM DB2 OLAP Server databases, though the vendor doesn't publicize the pricing for that product. All prices include a year of maintenance.
Tableau runs on Windows XP or Windows 2000. You'll need 128MB of RAM (though, as always, more is better) and 50MB of free disk space to install the product. If you're interested in evaluating Tableau with your own data, you can apply for a 30-day free trial at the Tableau Software Web site.
About the Author
Mike Gunderloy is the author of over 20 books and numerous articles on development topics, and the lead developer for Larkware. Check out his latest books, Coder to Developer (from which this article was partially adapted) and Developer to Designer, both from Sybex. When he's not writing code, Mike putters in the garden on his farm in eastern Washington state.