Sunday, April 26, 2015

Fast, Easy, Free Workbook Documentation and Management with the TWB Gem

Finally.
After years of asking Tableau to implement mechanisms for documenting Workbooks in the Workbooks, and for automated ways to manage Workbooks and their contents, the wait is over.

We have the technology
to build apps that can access, interrogate, and manipulate Tableau Workbooks in ways that haven't previously been possible. And the technology is free, as in speech and beer.

Up until now

there have been ways to document your Workbooks, including:

and to manage them, e.g.

  • Ruby scripts I've published here (free)
  • Ruby scripts I've created for clients (unavailable to the public)
  • Interwork's tools ($$$)
  • others (?)

TWIS and The Workbook Auditor have been around for a number of years, and provide useful and valuable information about the workbooks they're asked to look at. Andy's Auditor is very polished, with good coverage, and works very well for most people. TWIS is more of a working tool that goes pretty deep, uncovering lots of rich detail, e.g. field calculations, and it still works well, but it's a monolithic Java app. I've since taken a different tack, which has led to this post, and I'm not planning on maintaining TWIS.

I'm a bit familiar with Interwork's tools, but haven't had much use for them as I build apps that cover my needs, and can quickly create new ones on demand. And their tools are expensive.

The existing solutions are deficient
for a number of reasons.

There's a mismatch between the nature of the information in Workbooks and Tableau's structural model of analyzable data.

  • Workbooks are complex, multi-segmented, deep XML files with cross-links between the branches and many individual paths, most of which may be only partially populated.
  • Tableau is limited to accessing flat, two-dimensional tabular data. It has no concept of higher-level structures, even something as simple as a two-level master-detail hierarchy, as is discussed here. Further, Tableau shows no sign of recognizing the existence of these structures, or of the value of enabling their transparent analysis. Sadly, we had tools that could do this thirty years ago.
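To make the mismatch concrete, here is a minimal Ruby sketch of what flattening a hierarchy into tabular rows involves. The nested hash is a hypothetical, much-simplified stand-in for a Workbook's structure, not the real TWB XML layout, and `sample_workbook` and `flatten_fields` are illustrative names only.

```ruby
# Hypothetical, much-simplified stand-in for a Workbook's nested structure.
# Real TWB files are far deeper XML; every name here is illustrative only.
def sample_workbook
  {
    name: 'Sales.twb',
    datasources: [
      { name: 'Orders',  fields: ['Order ID', 'Profit'] },
      { name: 'Returns', fields: ['Order ID'] },
      { name: 'Empty',   fields: [] } # a partially populated branch
    ]
  }
end

# Flatten the workbook -> data source -> field hierarchy into
# two-dimensional rows: one row per field, repeating the parent values.
def flatten_fields(workbook)
  workbook[:datasources].flat_map do |ds|
    ds[:fields].map { |field| [workbook[:name], ds[:name], field] }
  end
end

flatten_fields(sample_workbook).each { |row| puts row.join(',') }
```

Note that the data source with no fields drops out of this particular slice entirely. Reporting on it would need a second, differently shaped slice, which is the kind of multiplication that makes a single all-purpose app hard to keep up.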

In building TWIS it became clear that any single app able to capture the multitude of two-dimensional slices through Tableau workbooks becomes very, very complex very quickly. Building one app to pull all of the interesting data about workbooks out of them became the chase of a rapidly receding horizon, a pursuit that one person, at least this person, couldn't keep up in the long run.

On top of this, the analyses and documentation generated by these approaches are external to the Tableau Workbooks themselves. As a result, there's a lot of friction in using them: one needs to run the tool and examine the documentation separately from Tableau. This is a surprisingly large hurdle to helping people understand their individual workbooks, which is what most people are concerned with.


A better way.

Several years ago I began writing simple, straightforward Ruby programs to access, analyze, and manipulate Workbooks according to the specific needs at that time. These were rooted in the experience gained from creating TWIS, and proved to be good solutions for those cases TWIS didn't cover. Some of them I've published here, many others I haven't.

The TWB Ruby gem.

After writing lots of these Ruby apps, and writing the same boilerplate code multiple times, I decided to create a library that models Workbooks and their components. Being Ruby, it was natural to create this as a Ruby gem, and so Twb, the gem, was born. My hope and ambition is that the Twb gem can and will be used by others to create apps to access, interrogate, and manipulate Workbooks.

Twb on RubyGems.org – ready for use.
Twb is freely available from RubyGems.org. Just like any other Ruby gem, it's easy to install and start using. Assuming that you have Ruby and RubyGems installed on your computer, installing Twb is this simple (in Windows):

    gem install twb

That's about as simple as can be. One of the attractions of using Ruby is its emphasis on making things simple, easy and straightforward, with a minimum of fuss and bother.


Twb in use.

Once installed, Twb can be used to make writing useful and valuable apps simple and straightforward. Or at least as simple as what you're trying to do allows: the things one wishes to know about and do with Workbooks are only as simple as the information models of the Workbook elements involved. Twb's ambition is to make these apps easy to create and transparent to use.


What Versions are your Workbooks?

At the simple end of the spectrum is a simple app that scans a set of Workbooks and reports the Version and Build information for each. Version and Build are related to the Tableau version last used to save the Workbook. Build is the technical release number that is also available in Tableau's "Help | About" menu option. Version is loosely but not strictly tied to the Tableau version – 8.2, 8.3, 9.0, etc. – I've not puzzled out the real relationship, largely because I've not had the real need to.

WorkbookVersion.rb

This Ruby script is available online from Github here.


#  Copyright (C) 2014, 2015  Chris Gerrard
#  (GNU General Public License v3 notice here)

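require 'twb'

puts "Identifying the Version and Build of these Workbooks:"

# Open the CSV output file and write its header record.
$csv = File.open("WorkbookVersion.csv", "w")
$csv << "Workbook,Version,Build\n"

# Use the file name pattern given on the command line, defaulting to
# every Workbook in the current directory.
path = if ARGV.empty? then '*.twb' else ARGV[0] end
Dir.glob(path) do |fname|
  # Twb does the Workbook parsing; report to the console and to the CSV.
  twb = Twb::Workbook.new(fname)
  puts sprintf("  %-20s %5s  %s", twb.name, twb.version, twb.build)
  $csv.puts "#{twb.name},#{twb.version},#{twb.build}"
end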

When run, the script scans for Workbooks and upon finding some does two things with each:

  • Prints the Workbook's Name, Version, and Build to the console.
  • Creates a CSV record in the file "WorkbookVersion.csv" containing the Name, Version, and Build.

This is accomplished with only 10 lines of Ruby code, leveraging Twb to handle the heavy Workbook-parsing lifting.

Here's the script in use, in a directory containing the Sample Workbooks from Tableau versions 8.3 and 9.0:

This Tableau Public Dashboard shows a viz of the CSV file generated above.

It's an example of the sort of analysis that can be achieved with TWIS, The Workbook Auditor, etc.

The Twb gem doesn't really do anything new here, but it does make doing exactly the thing you need done easy.

WorkbookVersion.rb can be instructed to scan specific Workbooks by passing the file name pattern on the command line, e.g.:

  • WorkbookVersion.rb Science.twb
    will scan only the single Workbook Science.twb
  • WorkbookVersion.rb '**/*.twb'
    will scan all of the Workbooks in the current directory and in all of its subdirectories.
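For anyone unsure how these patterns behave, the following is a small sketch that exercises Ruby's Dir.glob against a throwaway directory tree. The file names are placeholders and `matching_workbooks` is a made-up helper for illustration; it is not part of Twb or of WorkbookVersion.rb.

```ruby
require 'tmpdir'
require 'fileutils'

# Return the sorted names matched by a glob pattern, relative to dir.
def matching_workbooks(dir, pattern)
  Dir.chdir(dir) { Dir.glob(pattern).sort }
end

Dir.mktmpdir do |dir|
  # Build a temporary tree of empty placeholder files.
  FileUtils.mkdir_p(File.join(dir, 'archive'))
  FileUtils.touch(File.join(dir, 'Science.twb'))
  FileUtils.touch(File.join(dir, 'Finance.twb'))
  FileUtils.touch(File.join(dir, 'archive', 'Old.twb'))

  p matching_workbooks(dir, 'Science.twb') # one named file
  p matching_workbooks(dir, '*.twb')       # the current directory only
  p matching_workbooks(dir, '**/*.twb')    # the current directory and below
end
```

One thing worth knowing: the script reads only ARGV[0], so in a shell that expands wildcards itself, an unquoted pattern would likely reach the script as a list of file names and only the first would be scanned. Quoting the pattern leaves the expansion to Dir.glob.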

Running WorkbookVersion.rb in a directory containing all of your Workbooks makes it very easy to see which Tableau versions created how many of them, and which ones, by building a couple of very simple vizzes.
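As a quick check before building any vizzes, the per-version counts can also be sketched in a few lines of Ruby. This is only an illustration: `version_counts` is a made-up helper name, the sample records are invented, and the Build values are placeholders rather than real Tableau builds. It assumes the "Workbook,Version,Build" layout that the script writes, with no commas inside Workbook names, since the fields are written unquoted.

```ruby
# Count Workbooks per Version from CSV text laid out as
# "Workbook,Version,Build": a header line, then one line per Workbook.
def version_counts(csv_text)
  rows = csv_text.lines.map(&:chomp).reject(&:empty?).drop(1) # skip header
  rows.each_with_object(Hash.new(0)) do |line, counts|
    version = line.split(',')[1]
    counts[version] += 1
  end
end

# Invented sample data; the Build column holds placeholders only.
sample = <<~CSV
  Workbook,Version,Build
  Science.twb,8.3,build-a
  Finance.twb,8.3,build-a
  Regional.twb,9.0,build-b
CSV

version_counts(sample).each { |version, count| puts "#{version}: #{count}" }
```

To run it against real output, pass in the file's contents instead of the sample, e.g. version_counts(File.read('WorkbookVersion.csv')).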


What's next? We're just getting started.

I've already used Twb to create a wide variety of apps, which I'll be publishing here, with their source code on Github, in future posts. Some of them are brand new, some of them are existing apps re-implemented to take advantage of Twb. I hope they're all useful.


The Grand Ambition.

My intention is to create a vibrant ecosystem of tools for managing Tableau workbooks, and I hope that it's fertile ground for nurturing a community that uses and advances the tools.

Twb and the Tableau Tools will be released as Open Source projects, as soon as I can get the ducks in a row.

Everybody's welcome.

No matter what your level of experience with Tableau, Ruby, workbook XML munging, open source projects, etc., the Twb gem and related apps are available for your use, and hopefully contribution. Please feel free to join in.

Let the fun begin.

3 comments:

  1. Great job!
    Here is another little server tool
    http://www.tableau.com/about/blog/2013/2/tableau-server-click-away-chrome-extension-21413

    1. Thanks for pointing that out, Alexander. It really is a handy tool, and hasn't had nearly the recognition it deserves.
