PL-300 – Section 6: Part 1 Level 5: Other visualizations
Hello and welcome to level five. In this level, we’re going to have a look at some of the other visualisations that we’ve not previously looked at. So, in level two, we had a look at the table visualisation. In level three, the matrix and the bar and line charts. In level four, we had a look at adding more control, more user interaction, into your visualisations. So now, we’re going to continue to expand our repertoire of visualisations.
So first of all, we’re going to create a line visualisation and it’s one that we’ve created in the past. We’re going to put the date and we’re going to put the sales volume as the value, and we’re going to separate this in terms of the different region names. And you saw that we had a problem last time. The problem is that what is fifth and what is second keeps changing.
So, here for instance, we’ve got West Yorkshire being second, this is in 2006, but when we get to 2010, West Yorkshire is now third. And it’s fairly easy to see with, perhaps, these six, but not so easy when you’ve got more than these six, to say what is happening with all the jockeying for position: who’s first, who’s second. Well, we can solve that dilemma by using a ribbon chart.
Now, I just want to look at the stacked area chart, just to have another look. Here now we can’t even see which is first, second, or third but we can see the grand total. So, both of these have got advantages and disadvantages. The ribbon chart takes some of the more interesting aspects from each.
So, let’s have a look. Here, you can see that we’ve got West Yorkshire starting off third, and then it comes up to second, then goes down to third again. Similarly, you can see very clearly that South Yorkshire starts at fifth and then becomes fourth in 2001, before slipping back down. So, the ribbon chart here allows you to see the volume together with the total. So, it’s equivalent to the stacked area chart that we had previously, but it doesn’t display the categories in a fixed order like the stacked charts do, with West Yorkshire always at the top and Greater Manchester always at the bottom. Instead, it puts the biggest at the top and the smallest at the bottom. And you can see, if you hover over these things, that alongside all of these values we’ve got the rank as well. So here we can see the 2005 and 2006 rank is four. Here we’ve gone from five to four, up one place. And we can also see the sales volume change, which is also quite interesting to have, all at the tips of your fingers. I didn’t need to do any additional programming to get that from the line chart. All I needed to do was change it to a ribbon chart.
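The rank and the change in rank shown in the ribbon chart tooltip can be worked out by hand. The sketch below is not how Power BI computes them internally; it simply ranks some made-up sales volumes per year and compares two years. The figures and the `rank_by_year` helper are assumptions for illustration only.

```python
# Hypothetical sales volumes per year and region (illustrative numbers only).
sales = {
    2005: {"Greater Manchester": 60, "West Midlands": 55, "West Yorkshire": 50,
           "Merseyside": 30, "South Yorkshire": 28, "Tyne and Wear": 25},
    2006: {"Greater Manchester": 62, "West Midlands": 54, "West Yorkshire": 52,
           "Merseyside": 27, "South Yorkshire": 29, "Tyne and Wear": 24},
}


def rank_by_year(data):
    """Return {year: {region: rank}}, where rank 1 is the largest volume."""
    ranks = {}
    for year, volumes in data.items():
        ordered = sorted(volumes, key=volumes.get, reverse=True)
        ranks[year] = {region: pos for pos, region in enumerate(ordered, start=1)}
    return ranks


ranks = rank_by_year(sales)
region = "South Yorkshire"
# A positive number means the region moved up the ranking.
places_gained = ranks[2005][region] - ranks[2006][region]
volume_change = sales[2006][region] - sales[2005][region]
print(region, "rank", ranks[2005][region], "->", ranks[2006][region],
      "| places gained:", places_gained, "| volume change:", volume_change)
```

With these made-up numbers, South Yorkshire moves from fifth to fourth, which is the kind of movement the ribbon makes visible at a glance.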
Now, let’s just have a look at the formatting for the ribbon chart, and you will see that there is a special category here called Ribbons. First of all, we’ve got the spacing: are there any gaps between the various categories? So, now you can see it being exploded a bit. Now, this number is a percentage of the total height, so the entirety of the Y-axis. Next, do the series colours need to match throughout? So, red here is entirely South Yorkshire. Without that, we would have a lot of greys in between, so these are the actual ribbons where it flows in between. I can’t really see the use of turning that off, so always have this on. Transparency, well, fairly obvious: how transparent do you want the in-between bits, the ribbons, to be? So, that gives you a bit of shading. And then a border: do you want a border at the top and bottom of each of the strands, each of the ribbons, each of the lines?
Now, you may also notice there’s a lack of a Y-axis here. Part of the reason for this is that you can space the ribbons out, and therefore the top, up here, is not really meant to be the total of everything. It’s not stacked in the sense of being able to give you a total answer. So, this is when your data labels might come in useful.
Now, you see, they can’t actually be shown at the moment, there’s too much data on the screen. But if I just reduce the text size of the labels, here we can see, 42K, 51K, and so forth. And we have the usual options here for the data labels. If you want to change the data colours, that’s also available under the data colours.
So, ribbons are useful when you want to know the ranking of individual categories, together with their size. But unless you’ve got them not exploded, you can’t take it for a fact that the total height here is directly related to the total of, in this case, the sales volume. If they’re not exploded, yes you can, but if they are exploded, then it’s more of a picturesque view, rather than being something that’s, strictly speaking, 100% accurate. So, it serves a different purpose to the line charts and the stacked area charts, but it does make for an interesting topic of conversation.
Now, we’re looking at a lot of house sales, but have we actually totalled them up?
So, if I have a new page and I just click on sales volume, that adds it to a new visualisation. You can see that, in total, we have got 3.8 million homes sold. So, how many homes did we sell by the end of 2004? Okay, I can put on a filter for that. So, I’ll drag date down into the filters, or maybe I’ll add a slicer. So, I want “is on or before”, and I can type in 12/31/2004 and apply the filter. So, you can see we’ve now got 1.976 million. But what about the end of 2005 and so forth? Can we have a running tally? So, what I’m going to do is, instead of applying this as a filter, I’m going to put it into the axis. So, here we can see an axis with the sales volume per individual year, but it’s not cumulative. I want it all to add up. So, I’m going to change it to a waterfall chart. Look what happens.
So now each year’s sales volume gets added to the previous year. So, you can see eventually how the 3.8 million has been calculated. And you can see it at each individual point: by 2004, we were around the 2 million mark sold.
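The difference between the filtered total and the running tally can be sketched in a few lines. The yearly figures below are made up for illustration (they are not the course data), and this only mirrors the arithmetic; it is not how Power BI builds the chart.

```python
from itertools import accumulate

# Hypothetical yearly sales volumes in thousands (illustrative numbers only).
yearly_volume = {2001: 200, 2002: 210, 2003: 190, 2004: 180, 2005: 150}

# Filter approach: a single total for everything on or before a cut-off year.
sold_by_end_of_2004 = sum(v for year, v in yearly_volume.items() if year <= 2004)

# Waterfall approach: each year's volume is added to the total so far,
# so every year carries its own "sold by the end of this year" figure.
running_total = dict(zip(yearly_volume, accumulate(yearly_volume.values())))

print(sold_by_end_of_2004)
print(running_total)
```

The running total for 2004 matches the filtered figure, and the last entry matches the grand total, which is what the final column of the waterfall shows.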
Now, what if you add region name to this? Well, we don’t add it to the category or the Y-axis. We add it to something called the Breakdown. So, if I add the region name into the breakdown, you can see that in 1995 we have West Midlands, Greater Manchester, West Yorkshire, and each of those gets added into all of the years.
So, here we have a few positives, so it goes up, but we have a few negatives as well, so it goes down. And when we get to 2004-2005, we’ve got some really negative sales volume compared to the previous year. And then 2008 drops off completely.
Now, I should point out that this axis starts at 100K. So when we have a figure all the way down here, it’s not actually down to zero. If I wanted to change that, I would go into the formatting, not for the X-axis but for the Y-axis, and I would change the minimum from auto to zero, which doesn’t make it look quite as bad. But let’s just change it back for now.
Now, you’ll notice there are some yellow colours as well as red and green. Green means that these are your biggest advancers. Red means these are your biggest decliners. And yellow? Well, what happens is I can say, okay, I may have 10, 20, 30 products here; I don’t want to see all 10, 20, 30, I want to see a maximum of five as a default.
So, if I go into the formatting of this visualisation and go to Breakdown, you’ll see that it asks how many breakdowns I want. So, if I want two breakdowns, it’s only going to show me the most significant two, and then all the others are going to be wrapped up into Other. So, it still allows me to see what’s the biggest bottom and the biggest top, for instance, but everything else is included in this Other section. You’ll also notice that as soon as I put in a breakdown, it’s no longer cumulative. So, the end figure is not the totality of 1995 to 2016, it’s just the year 2016. So, you have a choice. You can either have it all cumulative or you can take each year as the final figure. And we can look at 2003, down here at the bottom, and see what has caused that to change the most. So, we can see that Merseyside has gone up 2,000 units and West Midlands has gone down by 2,000 units. So, to take another example, let’s have a look at a line chart of the average house price change. So, we have a 12-month percentage change and I’m going to show that by year.
Now, we’ve got a problem in that there isn’t really a percentage change of 2016. So, I’m going to change that from sum to average. Always make sure the answer looks realistic, and we’ve seen previously how we are able to change the modelling so that the default summarization is average, for instance. So, you can see that we start off at around 4%, go all the way up to plus 28%, and then all the way down to minus nine.
Now, if I was to change it to the waterfall, you can see roughly how that is all cumulative. It doesn’t work exactly in terms of the actual figures, because a 12-month percentage change is an exponential figure. So, this doesn’t really give proper weight to all of the drops that it should do. It’s an interesting graph, but it’s much better if I add in the breakdown. So, I add in the region name and then change the breakdown to a maximum of two. So, we can see that between 2001 and 2002 the biggest risers are South Yorkshire and Tyne and Wear. But then between 2004 and 2005, while other things were going up or at the very least recovering, we have got huge declines in Tyne and Wear and Merseyside.
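To see why adding percentage changes together is only roughly right, here is a small sketch comparing a straight sum with compounding. The three percentages are made up for illustration; the point is only that each year's change applies to the level reached the year before.

```python
# Hypothetical 12-month percentage changes for consecutive years (illustrative only).
yearly_pct_change = [10.0, 20.0, -10.0]

# Adding the percentages, which is effectively what stacking them up does.
summed = sum(yearly_pct_change)

# Compounding them: each change is applied to the previous year's level.
level = 1.0
for pct in yearly_pct_change:
    level *= 1 + pct / 100
compounded = (level - 1) * 100

print(f"summed: {summed:.1f}%  compounded: {compounded:.1f}%")
```

Here the sum says plus 20%, while the compounded figure is a little under 19%, because the 10% drop is taken from a higher starting level.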
So, it just allows a different view of your data. It allows you to see how it’s been going over time, in this particular case, but also what the biggest contributors are, what has really been driving this. And equally, if you don’t want the breakdown, then you can see it at a more cumulative level.
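The idea of keeping only the most significant breakdowns and wrapping the rest into Other can be sketched like this. It is an illustration under stated assumptions: the figures are made up, `top_breakdowns` is a hypothetical helper, and choosing "most significant" by absolute size of change is an assumption rather than a statement of Power BI's exact rule.

```python
# Hypothetical year-on-year changes in sales volume per region (illustrative only).
changes = {
    "Merseyside": 2000,
    "West Midlands": -2000,
    "Greater Manchester": 500,
    "West Yorkshire": -300,
    "South Yorkshire": 150,
    "Tyne and Wear": -100,
}


def top_breakdowns(changes, max_breakdowns):
    """Keep the largest changes by absolute size; fold the rest into 'Other'."""
    ordered = sorted(changes, key=lambda region: abs(changes[region]), reverse=True)
    kept = {region: changes[region] for region in ordered[:max_breakdowns]}
    if len(ordered) > max_breakdowns:
        kept["Other"] = sum(changes[region] for region in ordered[max_breakdowns:])
    return kept


breakdown = top_breakdowns(changes, max_breakdowns=2)
print(breakdown)
```

The kept items plus Other still add up to the same net change for the year, so nothing is lost; the smaller contributors are just no longer shown individually.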
So, waterfall charts are great if you’ve got positive and negative figures. For instance, you might be calculating a company’s profits and seeing which items were the most profitable, and equally those which made the biggest loss. You can be auditing the major changes, so, for instance, seeing which particular regions contributed heavily to changes. It could be looking at the change in totals. So, for instance, suppose this was a headcount, the number of employees; then you can see how many new people, minus leavers, have arrived. Or you can use it for how much money you make and spend each month. So, waterfall charts are fairly specialised, but I hope you can see they can be quite useful when you want to have a cumulative count, or if you want to drill down into the significant figures.
Now, in this video, I’m going to want to answer the question, how does the sales volume vary according to price inflation?
So, if house prices are rising quickly, do we have more sales volume? Are people trying to buy houses before the prices get too high? And if prices are going down, do we have a reduction in the sales volume? Are people frightened and not wanting to buy? So here, instead of what we’ve done previously, we are comparing two numbers together. We’re comparing sales volume and house price inflation, and for that we use a scatter graph. So, if I add a new page and add a scatter graph, which looks like these little circles, we can see that we’ve got lots of axes. So, let’s just think about the question again. I want sales volume against house price inflation. So, sales volume, I will put that onto the X-axis, and for house price inflation, I’m going to take the 12m%Change and put that in the Y-axis. It doesn’t really look much like a scatter graph. It’s only got the one item. Well, what’s happened is that it’s given the sum of the sales volume against the average of the 12m%Change, overall. Okay, I don’t want it overall. I want it, let’s say, per year. So, I’m going to drag the year into the details, and now it’s getting more like a scatter graph.
We’re really starting to get some additional dots. Okay, I’m going to add the region name to that. So, we’ve got 22 years and six regions, so we should have 132 dots by the time we’re finished. So, let’s add in the region name. It doesn’t look any different, and that’s because we’re back to these drill levels. So, you could expand to the next level and you’ll see all 132. But if you really want this to be the default view, rather than it being drilled down and drilled up, then you need some sort of unique identifier here, something that combines date and region name, and we’ll have a look at how this could be done in the modelling section later on. But for now, just know we can only choose one series unless we are going to be drilling down and expanding to the next level. So, what I’m going to do instead is focus on the region name.
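The unique identifier idea, one key per year and region pair, can be sketched as follows. This only illustrates the counting; it is not the modelling technique covered later in the course, and the `year | region` key format is a hypothetical choice.

```python
from itertools import product

years = range(1995, 2017)  # 1995 to 2016 inclusive, so 22 years
regions = ["Greater Manchester", "Merseyside", "South Yorkshire",
           "Tyne and Wear", "West Midlands", "West Yorkshire"]

# One identifier per year/region pair; the "year | region" format is an assumption.
point_ids = [f"{year} | {region}" for year, region in product(years, regions)]

print(len(point_ids))
print(point_ids[0])
```

Each identifier corresponds to one dot, so 22 years times six regions gives the 132 dots mentioned above.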
Now, the regions are all in the one colour. When I expand it to the next level, it’s still all in the one colour. So, can we have it in lots of different colours? We can, if I drag region name to the legend. So, now they’re all differently coloured, and you can see the difference in sales volume for the various regions. Okay, but what if I wanted to concentrate on one year at a time? Well, I can drag year down from details to the Play Axis. Oh, that’s a new axis, we haven’t seen that before. What does that do?
Now, let’s drag the date into the Play Axis. This is going to give us all the dates, and I want to change that to the year. So, I’m going to click on the date hierarchy, and we’ve got, at the bottom, this little play button. If I click on it, you’ll see that we have all of these dots for each year, and so we can see how they change over time. So, we can see now that as things go down to negative 10, the sales volume has really collapsed.
Now, we’ve started to have a little bit of confidence, and the sales volume goes up. So, this gives us a bit of flexibility in what we’re going to do. So, do we want all of the dates to be shown at once, or do we want them to be shown more as a presentation?
Now, we’ve got another property here, another thing we can fill in, which is the size. What I’m going to do is fill that with sales volume again, and you can see the bigger the sales volume, the bigger the size. So, let’s play this again. So, the further right we go, the bigger the circles get, but as soon as we get to the left-hand side, they really start reducing in size. So, the idea of this is generally to have three independent numeric variables. So, we’ve got the sales volume, we’ve got the 12-month % change. So, let’s have a different, third value for the size. So, let’s have the average price as the size. So, now we can see the cumulative effect of all of the house price inflation. When we get to around 2007, this is when the house prices were at their maximum, and then they slightly declined, but they’re still fairly big. Even though there is some negative inflation, and even though sales volumes are down, we still have much lower house prices in 1996 compared to the peak of the negative house price inflation around 2009, where, as you can see, the circles are much bigger. So, this allows us to compare three different values. Now, it’s all been grouped together under something called the scatter visualisation, the scatter chart. But when we add this third value, it becomes technically a bubble chart.
There is a third variant of this called the dot plot, and this doesn’t actually use a number on the X-axis; it uses anything else, a category. So, I could put the area code on the X-axis, for instance. You can see we can’t use the Play Axis to do that. So, I get rid of the Play Axis, and now for each of these we have got all of these individual plots. I think this is probably the most underused version of the scatter plot, largely because you can fairly well replace it with a column chart, but if you don’t want the actual column, you just want the dot, then it can be used. But most of the time it is used with two numeric values for the scatter and three numeric values for the bubble.
So, I’ll just undo a few of these changes to get back to where I was. There we go. So, let’s have a look at some of the formatting that we can do with scatter plots. First of all, we can change the shape. So, they needn’t be round circles; they could be squares, diamonds or triangles, and if you wanted one particular item to be something different, then you can have that as well. So, that’s useful if you wanted to highlight, say, Greater Manchester: make it a different shape. You can also change the shape size as well, relative to whatever other variables you’ve got. You can add in the category labels if you so wish, which is quite useful when everything’s moving.
So, you can visually see which is Greater Manchester and which is Tyne and Wear, for instance. You can add borders, so adding slightly darker shades of the shape around it, and that’s it. And that’s a shame really, because one thing I haven’t been able to spot is how you can actually change the speed of this playback, because there isn’t actually a Play Axis category in the formatting. However, scatter plots, or scatter graphs, or bubble plots are useful for being able to plot multiple items, dots or squares, when comparing two numeric values, plus a third numeric value if you want to add the size. And just to confirm, there is currently no way to actually change the play speed, but hopefully that will be something that Power BI will implement at some point.
They do change Power BI every single month, and new features come out, so hopefully we’re getting a better and better product each month. And finally, one other way of highlighting a particular series is by clicking on it, and you’ll see that a trail appears, so you can see where it has been. This is also quite useful when you play it. So, if I start playing this, then you can see the trail disappears and then it starts again. So, you can see how it progresses over the years.
So, this is how you can create a scatter, a bubble or a dot chart.