Saturday, September 29, 2012

Reconsidering previous IFNULL post

This post is an extension to my original post on IFNULL.

The comments it received (thanks to everyone) got me thinking a little differently and somewhat deeper (I think). The upshot is that I propose this as a workable improvement to the current IFNULL function:

new IFNULL

IFNULL(expr1, expr2, expr3)

Its structure matches the language:

IF expr1 IS NULL
THEN expr2
ELSE expr3
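
As a rough illustration of the proposed semantics, here is a minimal Python sketch. It is not Tableau code: the name `ifnull3` is hypothetical, and Python's `None` stands in for NULL.

```python
def ifnull3(expr1, expr2, expr3):
    """Proposed three-argument form: IF expr1 IS NULL THEN expr2 ELSE expr3.

    Hypothetical helper for illustration only; None stands in for NULL.
    """
    return expr2 if expr1 is None else expr3


# The common case: expr3 repeats expr1, so a NULL becomes 0
# and any other value passes through unchanged.
sales = None
print(ifnull3(sales, 0, sales))   # NULL input, so expr2 (0) is returned

sales = 125.5
print(ifnull3(sales, 0, sales))   # non-NULL input, so expr3 (125.5) is returned
```

Because expr3 is explicit, the non-NULL branch can also return something other than expr1, for example a label such as "present".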

Benefits

  • The logic is clear and transparent.
  • It's more powerful and flexible, with the opportunity to provide an explicit non-NULL value when expr1 IS NULL.
  • It avoids backwards compatibility problems—a new function, it doesn't conflict with existing calculations.

Drawbacks

  • In the normal (maybe) case, expr3 will be the same as expr1; this will be a bit of a cognitive burden, but not all that much.

About the current IFNULL

new - Mapping IFNULL parsing

I'm adding this section because I'm pretty sure that my objection to the current IFNULL operation is clear. So, pictures being better than words in some circumstances, I mapped it out:

Even though these formulations are logically equivalent, I'm confident that the second one is clearer than the first, and it matches the current IFNULL documentation in the function editor.

The problem with the second reading is that the function's name is the reading's negation. It's this collision that I'm urging be fixed.

I understand that the current formulation is readable by programmers and technical people who are experienced in interpreting these things (I've been at it since 1974). But, and this really is the whole point of this blog, Tableau was invented to move data analysis and visualization from the technical-only realm and open it up to non-technical people who need to understand their data but shouldn't be burdened with the arcane technical aspects. Every single thing that's not as clear and transparent as it could be is an impediment to real human people. Forcing non-technical people to adapt to the technical conventions is contrary to the whole idea of eliminating the friction between people and the information and insights in their data.

addendum to original post

Note: in my original post's comments, this section was unclear because I used angle brackets—< and >, and these were interpreted as invalid tags and not shown. Sorry about that.

In a nutshell, my objection to the current IFNULL(expr1,expr2) function is that from a human standpoint, the reader has to figure out:

  • what, exactly, is being evaluated to see if it's null, and depending upon the assessment:
    • what positive action is being taken when it is null, and
    • what positive action is being taken when it's not null.

If I have it right (it's been a long time since undergraduate logic class), the two semantic constructions for {FUNCTION}(expr1,expr2) are:

  1. If {IS NOT NULL -expr1-}, then return expr1 else return expr2
  2. If {IS NULL     -expr1-}, then return expr2 else return expr1

  1. parses strictly left to right and is easy to understand
  2. requires a forward leapfrog when -expr1- IS NULL, followed by a backreference when -expr1- IS NOT NULL, and is, at least for me, twisty and unclear
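
To make the two constructions concrete, here is a small Python sketch that writes each one out and checks that they return the same result. The function names `reading_1` and `reading_2` and the sample values are made up for illustration, and `None` stands in for NULL.

```python
def reading_1(expr1, expr2):
    # 1. If expr1 IS NOT NULL, then return expr1, else return expr2
    if expr1 is not None:
        return expr1
    else:
        return expr2


def reading_2(expr1, expr2):
    # 2. If expr1 IS NULL, then return expr2, else return expr1
    if expr1 is None:
        return expr2
    else:
        return expr1


# Made-up sample inputs, including a NULL expr1 and "empty but not NULL" values.
samples = [(None, 0), (7, 0), (0, 99), ("", "n/a"), (None, None)]
for expr1, expr2 in samples:
    assert reading_1(expr1, expr2) == reading_2(expr1, expr2)
print("both readings agree on all samples")
```

The results are identical; the only difference is the order in which a reader has to visit expr1 and expr2 while following the logic.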

I believe that:

  1. construction 1 is better than construction 2, and
  2. the documentation describes construction 1, although it's not as clearly worded as it could be.

I understand the problems associated with 1., most notably that a negative expression is typically harder to parse and a negative–null coupling is doubly difficult to wrap one's head around, and changing the expression causes backward compatibility problems (and Tableau hasn't yet implemented self-analysis).

Footnote: I'm sure someone could do a better job on the logical formalisms.

Thursday, September 20, 2012

Putting Measures in the Middle of a Viz

The Tableau Public-published dashboard "Proposed Placement of Measures in Interior Viz Columns" (below) illustrates one of my favorite alternate vizzes, shown in the "Measures in the Middle" worksheet in the top of the dashboard.

In it, the sums of the measures of interest are placed in the body of the viz, not hanging out way over at the right end, as is the standard Tableau layout.

There's a lot of value in promoting the measures to the more cognitively valuable left-hand side of the viz, e.g. they're picked up more quickly in the standard visual scanning pattern; and they're closer to and thus more intimately linked to the Name as ID of the Dwarf to whom they belong.

Even though "Measures in the Middle" and "Measures to the Right - Tableau's Layout" contain the same data, there's a very large visual difference between them: the effect of moving the measures' bars from the right side to near the left side is dramatic.

I had intended to write up some more about this (published it late in the evening). As luck would have it, Joe Mako commented with an alternate example worksheet that's visually cleaner than my faked-up dashboard. In responding to his example and comment I covered much or most of what I intended to add here. So I'm pulling it up from the comments to make it a little easier to see. But you should see Joe's example.

My emphasis is on how to make Tableau better by making it easy to do reasonable things. And I believe that liberating the measures' presentation from its current rightmost-only position in the viz is reasonable and valuable.

I wrote "there's no data-structural reason that this placement can't occur." and I stand by this.

There's nothing in the structure of the source data, nor in the structure of the data in the viz, that precludes what I want Tableau to do. The elements on display are identical in my two views - they're all at the same level in the data hierarchy. (This is a simple case, because there's only one record per Dwarf, but the principle holds for multi-dimensional data that's aggregated to a common level.)

Re: Tableau's VizQL foundation

I wish I knew how VizQL works. I don't. I've never seen documentation on it. Not a grammar, not a write-up, nada. One of the things that drew me to Tableau was its basis in a data management and presentation language, but Tableau has seemingly stopped promoting, even surfacing, VizQL as its underpinning.

I do know very well the original 4GLs that invented the data structuring and presentation space. FOCUS was created for exactly these purposes, and had all of the data-operational constructs necessary to access, organize, aggregate, and present data, some of which were much clearer than Tableau's analogs. Unfortunately, since it originated in the age of mainframes, COBOL reporting, and data processing, it lacked Tableau's deliberate central focus on data visualization capabilities. I've often wondered what the structural similarities, and dissimilarities, are between them.

I don't think that VizQL is somehow incapable of accomplishing what I'm asking for, either as-is or with a modicum of adjustment. I see no structural barrier for it, and think I see the shape of the problem domain well enough to be confident in my assessment. If nothing else, if VizQL isn't capable of being improved in this way, or Tableau is unwilling or unable to do it, it opens the door for the innovator who will.

Is this visualization, or is it reporting?

Where's the bright line between what Tableau does and what reporting software does? I don't see one. Tableau is a data visualization product. Reports are data visualizations.

The biggest differences I see between Tableau tables and normal reporting software reports are things I consider to be deficiencies in Tableau and hope to see corrected soon.

Some examples:

  • The strict left-right ordering of dimensions in the same row is a big waste of space compared to dimensional folding, with each succeeding dimension underneath and slightly offset from its predecessor.
  • Strict sub-totaling for every most-granular dimension (when sub-totaling is on), even those with only one record. This is worse than useless: it dramatically increases the cognitive load on the user, who has to sort out the true sub-totals from simple restatements of a single data value.
  • Granular formatting of the elements; it would be extremely helpful to be able to provide richer formatting, with control over the individual formats for each structural element in the viz.
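
The sub-totaling point can be sketched in a few lines of Python. This is only an illustration of the behavior I'd like, not a description of how Tableau or any reporting product works; the rows, the `report_lines` name, and the "Subtotal" label are all made up.

```python
from itertools import groupby

# Made-up rows: (region, customer, sales), already sorted by region.
rows = [
    ("East", "Acme", 100),
    ("East", "Bolt", 250),
    ("North", "Cray", 75),
    ("West", "Dyne", 40),
    ("West", "Echo", 60),
]


def report_lines(rows):
    """Return detail lines, adding a subtotal only for multi-record groups."""
    lines = []
    for region, group in groupby(rows, key=lambda r: r[0]):
        members = list(group)
        for _, customer, sales in members:
            lines.append((region, customer, sales))
        # A one-record group's subtotal would just restate its single value,
        # so it is skipped.
        if len(members) > 1:
            lines.append((region, "Subtotal", sum(s for _, _, s in members)))
    return lines


for line in report_lines(rows):
    print(line)
```

Here East and West each get a subtotal line, while North, with its single record, does not.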

Warts on a toad.

Are you still reading this?

Why? Most people don't care about warts on a toad. And why should they? Toads are warty; it's their natural state, normal and expected. If, when handling a toad, one encounters a wart, it's no surprise. If a toad grows another wart, nobody really cares; after all, what's another wart on a creature already festooned with them?

(I know that toads' skin lumps aren't really warts; I'm invoking poetic license)

An Ode to Warts

–or–

A Fractured BI Fairy Tale

The traditional BI products are warty. Toady. Awkward, even unpleasant to handle, they're difficult to use, putting lots and lots of crufty operational controls, gadgets, layers, and other lumpy stuff on top of your data, making it very difficult to get at, and even harder to make sense of.

And yet, in a world populated only by toads, the population of people seeking to understand their data thought that toadiness was an essential characteristic of their BI tools: all the tools were like that, so it must be good, eh? They even came to embrace their essential wartiness, even to the point of proclaiming that wartiness/toadiness was a desirable characteristic of BI tools. In advanced cases some kool-aid drinking souls grew to love their tools' toady wartiness and declared it a virtue without which BI could not exist, much less flourish.

Wart-Free BI

–or–

Enter Tableau

When Tableau came upon the BI scene it was revolutionary. Designed from the outset to make it as easy as possible to connect to, organize, and quantitatively visualize data, it was virtually wart-free. Devoid of the usual lumpiness and bumpiness, it provided a smooth unblemished palette upon which data could be painted in a dynamic model that encouraged and fostered exploration of the data.

Close but not quite.

–or–

Why Tableau's warts are so irksome.

Tableau has its warts. Very few compared to its enterprise BI brethren, but there are some. This blog is in one sense an inventory of the warts I encounter that vex me enough to keep track of.

Because Tableau is so remarkably unwarty, those that it does have stand out like, well, a wart on the tip of someone's nose. They're prominent because of their rarity, sometimes disproportionately so in relation to their real effect.

Encountering a Tableau wart is jarring. It interrupts the normally fluid user-Tableau interaction and imposes upon the user the cognitive cost of recognizing and interpreting it, and the effort to adjust and accommodate the wart and its effect.

Why we need to worry about warts.

–or–

Removing warts is important.

There are two aspects to addressing Tableau's warts: guarding against the introduction of new warts should be a primary principle of designing and implementing Tableau's new features, and existing warts must be aggressively hunted down and eliminated.

Evolving a commercial software product takes a lot of time, energy and attention. Adding new features, expanding the breadth and depth of functionality are the whole point. When adding the new features it's critical to be constantly on the lookout for new warts and ruthlessly eliminate them. As Tableau has evolved it appears that this principle hasn't always been given the attention it needs.

Guarding against the introduction of new warts isn't enough. It's even more important to relentlessly focus on hunting down and removing the existing warts. Doing this will continue to improve the product in a different dimension than adding new features, but one that's arguably more important to the overall product quality. Releasing a new version that consists only of wart removal is good and valuable, and sends a clear signal to the product's users that their investment in the product is well made and worth continuing.

Evolving a software product by only considering the new features and functionality, and not also considering fixing existing problems (removing existing warts), is shortsighted and inevitably leads to an increasingly warty product. Eventually, the product evolves into a toad.

Tableau's future.

–or–

Tableau's future?

As Tableau continues to evolve it's becoming increasingly warty.

If the trend continues Tableau will become a toad. If this happens Tableau will have become one of the products it successfully competed against by being simple, easy, and transparent in use, and with enough depth and breadth of functionality to let people achieve their data analysis goals with a minimum of friction. At this point Tableau will be vulnerable to new products that provide the same basic functionality without Tableau's warts. Tableau the disruptive BI tool will itself be displaced by newer, less warty competitors.

A request for Tableau Software.

Don't let Tableau become a toad.

My Tableau Wish List

With TCC 2012 coming up, there's not enough time to write separate blog posts for all the individual friction points and areas where Tableau should (IMHO) make improvements, so I'm listing all the big ones here. I'll elaborate on them as time and opportunity permit.

Have something to add to the list? Leave a comment or send me a message and I'll add it in if it fits, with attribution unless you wish not.

Eliminate all the friction identified in this blog –or– Don't let Tableau become a toad.
(naturally)

Provide a dashboard layout manager capable of producing real publication-quality material.
Tableau's improved the dashboard layout manager with each release, but it's still like drawing with crayons.

Provide Tableau Desktop automation.
Routine repetitive tasks shouldn't require us to do work a trained monkey could handle.

Incorporate a true programming language, preferably Ruby or Python.
For calculated field coding, automation (per previous point), data munging etc.
Along with this and automation, add the ability for Tableau to access data from a standard pipe; this would enable a tremendous expansion of the high-value Tableau use cases.

Create and publish a bona fide Tableau philosophy.
It would explain the guiding principles behind the product, and be the standard to hold the product to.

Add the ability to access hierarchical, multi-path data.
This is a significantly harder problem to solve from the user experience perspective, but Tableau's inventors cracked this nut with single-path data visualization, to their eternal credit. This ability is necessary for Tableau to implement the next item in this list:

Add introspection and self-analysis.
Tableau has no general ability to expose its internals for analysis.
Workbooks are collections of data sources, worksheets, dashboards, real fields, calculated fields, etc., that have multiple interrelationships. It's extremely helpful to be able to see, from a data perspective, what's in a workbook, and how the parts are related.
With this ability we'd be able to answer questions such as:
  • "Is this data source used, and if so where?"
  • "Which dashboards does this worksheet appear in?"
  • "How many different ways is Profit calculated in my workbooks, and in which worksheets and dashboards do they appear?"
  • "If I change this database, how much work will I need to do to update my Tableau workbooks?"
I built The Tableau Companion (TCC, formerly TWIS) to be able to answer these questions, but it would be far, far better to have it be an integral part of Tableau.
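
To show the kind of question-answering I have in mind, here is a minimal Python sketch. It assumes the workbook's contents have already been extracted into a plain dictionary; that dictionary layout, every name in it, and the three helper functions are hypothetical, and none of this reflects an actual Tableau API or file format.

```python
# Hypothetical workbook inventory; all names are made up for illustration.
workbook = {
    "data_sources": ["Orders", "Targets", "HR", "Old Extract"],
    "worksheets": {
        "Sales by Region": {"data_sources": ["Orders"]},
        "Profit Trend": {"data_sources": ["Orders", "Targets"]},
        "Headcount": {"data_sources": ["HR"]},
    },
    "dashboards": {
        "Executive Summary": ["Sales by Region", "Profit Trend"],
        "Staffing": ["Headcount"],
    },
}


def worksheets_using(wb, data_source):
    """Is this data source used, and if so where?"""
    return sorted(
        name
        for name, sheet in wb["worksheets"].items()
        if data_source in sheet["data_sources"]
    )


def dashboards_containing(wb, worksheet):
    """Which dashboards does this worksheet appear in?"""
    return sorted(
        name for name, sheets in wb["dashboards"].items() if worksheet in sheets
    )


def unused_data_sources(wb):
    """Data sources that no worksheet refers to."""
    used = {ds for sheet in wb["worksheets"].values() for ds in sheet["data_sources"]}
    return sorted(ds for ds in wb["data_sources"] if ds not in used)


print(worksheets_using(workbook, "Orders"))             # ['Profit Trend', 'Sales by Region']
print(dashboards_containing(workbook, "Profit Trend"))  # ['Executive Summary']
print(unused_data_sources(workbook))                    # ['Old Extract']
```

Once the relationships are available as data, each question becomes a simple lookup, which is the argument for having Tableau expose them itself.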

Add visualizations for hierarchical, multi-path data.
Extensions of the previous items, this ability is at least useful in visualizing the structure and content of Tableau workbooks. A visualization of the Tableau Science Sample Workbook can be seen here, and other examples can be seen here.

Hire someone to ensure that Tableau honors the Tableau philosophy.
Someone responsible for ensuring that the various Tableau products adhere to the Tableau philosophy in their broad strokes and details. Ideally, this would be me.

...

Tuesday, September 18, 2012

IFNULL - is not "IF NULL", is "IF NOT NULL"

Here's one that really twists my head around.

I frequently need to assign a placeholder value to a field when Tableau interprets the data value as NULL. In most cases it's assigning a zero to a null value. Using a calculated field as a proxy for the real data field allows me to do this, and all I need is a way to assign the value zero (0) to the calculated field when the data value is null. The logic of this is:
    to the calculated field:
        if the data field is null assign zero
        else assign the data field's value
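
Spelled out as a Python sketch (not Tableau syntax), with `None` standing in for NULL; the name `zero_if_null` and the sample values are made up:

```python
def zero_if_null(value):
    # if the data field is null assign zero,
    # else assign the data field's value
    return 0 if value is None else value


data_field = [12, None, 7, None, 3]   # made-up sample values
calculated_field = [zero_if_null(v) for v in data_field]
print(calculated_field)   # [12, 0, 7, 0, 3]
```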

The Tableau IFNULL() function does this, and in exactly the way that it should. Here's the Tableau calculated field dialog showing IFNULL() (edited for space):

Spot the problem?

The description of the function is the accurate description of the function's behavior. When the data field's value is NOT null, its value is used for the calculated field, otherwise the value of expr2 provides the value. The function's description and behavior match each other, and are exactly what we need.

But the function's name is wrong.

It should be IFNOTNULL
—NOT—
IFNULL

This misnaming is at least confusing, and has the potential to lead people to making poor or outright wrong decisions when they rely on the results of employing it based on the semantics of its name.

Oddly, the logically inverted naming makes it harder for experienced programmers to find the right function to use than for inexperienced people.

Monday, September 17, 2012

Why this blog?

I loved Tableau when I found it more than five years ago because it embraced a novel approach: removing the friction from the process of accessing, organizing, and analyzing data.

Compared to the products that were dominant in the marketplace it was an absolute revelation. But it's not perfect—there are lots of little grains of sand in the gears that might not be too bad individually, but in combination impose a substantial cognitive load.

As Tableau moves forward adding functionality it's going to be very interesting to see how it evolves. Will time and attention be spent finding and removing the sand, or will the pressure to add new features dominate, leaving the existing irritations in place and adding new ones?

So far it feels like new features and functionality are being added by individual developers, each perhaps reasonable in isolation, but out of synch enough with the other parts to be just a little, and sometimes a fair bit, jarring.

I started writing this blog as a place to keep track of all the little things that irritate me as I use Tableau.

Part of the reason is because I'm something of a purist—I believe that the best tools should be as good as they can be.

Partly it's because I have an attachment to Tableau: it's what made BI interesting again after a decade and a half of boredom listening to the droning on and on of data warehousing's blathering at the altar of big-Big-BIGGER-BIGGEST tooling as the only way to make sense out of data.

Some of it was a shameless attempt to get Tableau to hire me to come and help fix their product—if I had a good, solid comprehensive list of things I could fix I could talk my way into getting them to let me do it. I've talked to a fair number of Tableau people, many of whom are very sympathetic. But nobody at Tableau seems interested in hiring someone who's only going to be cleaning and tidying up. I get the sense that the emphasis is on moving the product forward by adding more to it.

But why do I care, really?

Because I believe that computers should help us, and get out of the way absolutely as much as possible.

I once worked for a company that had a product much like Tableau, one that busted down the barriers between people and data and made it easy and straightforward to explore and find information and insights in the data. But that company forgot what made the product special - its simplicity and ease of use - and went down the path of adding more and more cruft and crap onto the basic product, layering more and more friction onto it until it lost its way and failed to live up to its promise.

Saturday, September 8, 2012

Dashboard Text Kerning Problems

This embedded Tableau Public-published dashboard reveals a problem with Tableau's text kerning. The spacing bordering bolded and italicized text is greater than it should be, sometimes much greater. With bolding, it appears that the character immediately preceding the bolded text influences the magnitude of the problem.