How the EXCEL do I start spatial analysis in Tableau? 5 calculations and a lot of fun!

Got spatial data and don’t know where to start? Here’s how to hone in those coordinates to a tidy cartographic work of art.

We’ll start off with a use case: our favorite cafe rocked its socks off and launched a second cafe. We figured those closest to it would take advantage of the newer closer location and our informal surveys have proven this. Now, let’s get to the data.

Our first analysis shows 80% of homes are closer to the primary cafe. Let’s dig into this and see what we can do to get real solutions.

Points, lines, and other marks

Tableau includes a number of ways to plot data based on known geographic elements – city, state, county, and a host of other options. When you do so, it usually looks like this:

The data gets aggregated to the geographic level selected using generated lat/long fields. In this example, the points are geographically averaged to the center of the city and sized by total counts of houses. HOORAY and Tableau toddler approved! I can see I have several homes across a number of cities, as well as 31 unknown plot points. This could be a non-unique city name (solved by adding state) or a city that’s not known in Tableau (manually add those coordinates). All great things, but this type of analysis won’t help me today.

I have hard-coded coordinate data that gets down to the street level. You’ll notice Tableau chooses to average these marks. Putting Street address on detail means they’re displayed at the row level.

This helps, but doesn’t get me all the functionality I want. It’s time to make points!

Points are spatially wrapped data bites. Pick your favorite appetizer and that geo-data is wrapped all in a fluffy pastry. It makes it better to analyze and sates hunger.

Here’s how to do it:

MAKEPOINT([Lat],[Long])

Latitude goes first. It has major attitude and won’t take being in second place. Notice what happens when we toss this out in Tableau (double click and watch it go!).

Here’s why real point data rocks: It’s “aggregated” but every point is visible. It reports as one unit (hover and it will tell you the total number) but displays broken down without fuzzy averages obscuring the spatial truth.

Within my data, I have two types of entities: my two restaurants and homes. I don’t want to play data shaping games, so I am going to calc the EX**L out of this.

I need to be able to compare distance between the restaurants and the homes. I only have 2 cafes, so I can calculate them out into their own spatial columns.

MAKEPOINT(
{MAX
//This is a hidden Level of detail calculation
(If [Row]=17 THEN [Lat] END)
//Conditional statement ONLY gets the Latitude for Row 17 or my primary cafe. Use a name instead of row or whatever else here.
},
{MAX(If [Row]=17 THEN [Long] END)}
)

I’ve visually split this calculation out. Punctuation matters a lot here. The hidden level of detail (LOD) calculation is key. I absolutely do not want to play ETL tricks, so this calculation lets me pass go quickly. Rather than my primary cafe’s point being on 1 row, the LOD puts it on every row so I can compare distances. Without it, I’m stuck with aggregate/non-aggregate errors or playing Cartesian union and join games (not fun). I’ll repeat this calculation for my second cafe.

Here comes the fun part: real grown up spatial analysis! Anecdotally, I know people go to the cafe closest to them from talking to them. How much could this be affecting my business?

At a basic level, I have this calc:

MAKELINE([Geopoint],[Primary Hospital Geo])

This encodes the space between my points as a line – a real solid line I can quantify and use for analysis.

By itself, this gives me a picture of life before I opened the second restaurant. And, maybe, a future logo…

Going the distance

We’ve got our points, we have our lines, but to really make this shine, we need distance. This calculation gives a bird’s eye view of how far my houses are.

DISTANCE([Geopoint],[Primary Cafe Geo],'miles')
//'km' also works, and other things do too

We can now see superfans will drive over 100 miles to enjoy our fare. Yes, it’s that good! Now, to kick this up a notch, I’m going to plot paths by the closest restaurant.

IF DISTANCE([Geopoint],[Primary Geo],'miles')<=DISTANCE([Geopoint],[Secondary Geo],'miles')
//If the distance is the same, I'm assuming they're going to the primary cafe
THEN MAKELINE([Geopoint],[Primary Geo])
ELSE MAKELINE([Geopoint],[Secondary Geo]) END

This gives us the exact analysis we want. From here, we can also calculate percentage of homes that are most likely to frequent a particular cafe due to distance.

Hungry for part 2? The next post explores several ways to do scenario testing with spatial analysis.