or, Is my 'Profit' the same as your 'Profit'?
It's helpful to be able to identify what fields are being calculated in Tableau. From an enterprise governance perspective it's essential.
Unfortunately Tableau doesn't provide this information except through manual inspection of the fields' calculations in the calculation dialog. This is a serious deficiency, one that hinders Tableau's adoption in the Enterprise, where senior management is reluctant to let Tableau loose "in the wild" for fear of having dueling metrics without transparency.
Fortunately, there's an easy solution. This post presents a simple Ruby script I whipped up that examines Tableau workbooks and creates a CSV file containing the relevant information about the calculated fields. This CSV file is then usable with Tableau to provide the transparency necessary to tell whether or not two metrics are in fact the same.
Example: the Tableau Sample Workbooks' Calculated Fields
As an illustration of somewhat real world calculated fields, here are those found in the Tableau v8.2 sample workbooks. They range from simple to reasonably complex.
Introducing TTC_CalculatedFields.rb
Here's how it works in the simplest case: you copy/paste the code below into file named oh, say "TTC_CalculatedFields.rb
" — it's then a bona fide Ruby script. Then from a command prompt run it in from the directory containing the TBWs you want the calculated fields from.
The usual way to run it is with
"{pathtoRuby}\Ruby" TTC_CalculatedFields.rb
It will then find all of the TWBs in the directory, and all the sub-directories, and record all of their calculated fields.
The usual caveats apply.
-
You need to have Ruby and the nokogiri and open-uri gems installed for it to work. This is very simple and straightforward, and most organizations have someone who's already using Ruby to help if you need it.
-
It only examines TWBs, not packaged workbooks. It's mechanically not difficult to add the functionality but it introduces complexities in handling name collisions and I wanted to keep it simple but useful.
-
There are technical properties of calculated fields that are meaningful to deeper analysis but don't mean much on the surface.
-
Groups are a specialized calculated field, but their internal structure is very different from the other types, making it difficult to add them to the properties/fields of the other classes. I decided not to force them so they show without formulae - this is a fairly common problem in analyzing Tableau workbooks and other hierarchical and network data structures, and Tableau isn't designed to support analysis of data like this.
-
It carries no warranty or guarantees that it'll work.
On the other hand, it should be pretty easy to address any shortcomings it has. Ruby is an elegant language that makes fixing and improving things generally simple and straightforward.
# TTC_CalculatedFields.rb - this Ruby script Copyright 2013, 2014 Christopher Gerrard
require 'nokogiri'
require 'open-uri'
$recNum = 0
$CSVHeader = <<CSVHEADER.gsub("\n",'')
Record Number,
Workbook,
Workbook Dir,
Data Source Name,
Data Source Caption,
Data Source Name (tech),
Name,
Caption,
Name (tech),
Data Type,
Role,
Type,
Class,
Scope Isolation,
Formula Code,
Formula,
Formula Comments
CSVHEADER
def init
$f = File.open("TTC_CalculatedFields.csv",'w')
$f.puts $CSVHeader unless $f.nil?
end
def parseCalculatedFields twbWithDir
twb = File.basename(twbWithDir)
twbDir = File.dirname(File.expand_path(twbWithDir))
puts "Found '#{twb}' in #{twbDir}"
doc = Nokogiri::XML(open(twb))
calculationNodes = doc.xpath("//workbook/datasources/datasource/column/calculation").to_a
calculationNodes.each do |d|
#-- Data Source --
dsTechName = d.xpath('../../@name').text
dsCaption = d.xpath('../../@caption').text
dsName = if dsCaption == '' then dsTechName else dsCaption end
#-- Field --
caption, = d.xpath('../@caption').text
techName = d.xpath('../@name').text
name = if caption == '' then getName(techName) else caption end
dataType = d.xpath('../@datatype').text
role = d.xpath('../@role').text
type = d.xpath('../@type').text
#-- Formula --
formulaCode = d.xpath('./@formula').text.gsub(/\r\n/, ' ')
formulaLines = d.xpath('./@formula').text.split(/\r\n/)
formula = getFormula( formulaLines )
comments = getComments( formulaLines )
fClass = d.xpath('./@class').text
scopeIsolation = d.xpath('./@scope-isolation').text
#-- emit CSV
putCSV(twb,
twbDir,
dsName,
dsCaption,
dsTechName,
name,
caption,
techName,
dataType,
role,
type,
fClass,
scopeIsolation,
formulaCode,
formula,
comments
)
end
end
def getName techName
techName.gsub(/^\[/,'').gsub(/\]$/,'')
end
def getFormula lines
formula = ''
lines.each do |line|
line.strip
formula += ' ' + line unless line =~ /[ ]*\/\//
end
return formula.strip
end
def getComments lines
comments = ''
lines.each do |line|
line.strip
comments += ' ' + line.strip if line.start_with?('//')
end
return comments.strip
end
def putCSV(twb,
twbDir,
dsName,
dsCaption,
dsTechName,
name,
caption,
techName,
dataType,
role,
type,
fClass,
scopeIsolation,
formulaCode,
formula,
comments
)
$recNum += 1
csvLine = <<CSV.gsub("\n",'')
"#{$recNum}",
"#{twb.gsub('"','""')}",
"#{twbDir.gsub('"','""')}",
"#{dsName.gsub('"','""')}",
"#{dsCaption.gsub('"','""')}",
"#{dsTechName.gsub('"','""')}",
"#{name.gsub('"','""')}",
"#{caption.gsub('"','""')}",
"#{techName.gsub('"','""')}",
"#{dataType.gsub('"','""')}",
"#{role.gsub('"','""')}",
"#{type.gsub('"','""')}",
"#{fClass.gsub('"','""')}",
"#{scopeIsolation.gsub('"','""')}",
"#{formulaCode.gsub('"','""')}",
"#{formula.gsub('"','""')}",
"#{comments.gsub('"','""')}"
CSV
$f.puts csvLine unless $f.nil?
end
init
Dir.glob("**/*.twb") {|twb| parseCalculatedFields twb }
$f.close unless $f.nil?
An interesting exercise would be to round up the various workbooks that have been published about Table Calculations to see what that population looks like. Maybe something to do during the conference this week.
I hope you find this little utility useful. If so, please leave a comment. If it doesn't work, please leave a comment — I may or may not be able to help (time, other demands upon, etc.). If you extend it in some useful way, please leave a comment.
Or just leave a comment.