PL-300 – Section 9: Part 1 Level 8: Other Visualization Items Part 4
So far, we’ve got quite a lot of information about what happened. We know that for this particular point, we have a sales volume of 5,400 and an average price of 135,000. This is called descriptive analytics; it tells you what has happened in the past. So, you can see what has happened in terms of year-on-year sales, compared to the previous year. This doesn’t tell me why. Asking why is not descriptive analytics, but diagnostic analytics. So, we’re going to try and find a root cause for variations in the sales volume. And we can do this using the Key influencers visualisation.
Now, the text can be quite small on there, so make sure you blow it up as much as you want. There are three fields that we can put values in. The first is Analyse. This is the metric, the measure we want to analyse. So, I want to know about sales volume. Now, what are going to be the key factors in sales volume?
Well, I’m going to say it could be the year. It could be the month. And let’s have a look at what influences sales volume to increase. Well, the start of the new millennium: when the year is between 2000 and 2007, the sales volume increases by 1,230 on average. And you can see that the computer has done some bins. It has grouped together some of the values. Here we have 2000 or below, 2000 to 2007, and some other year values, and it is comparing them with the average.
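To make the idea of bins compared with the average more concrete, here is a minimal sketch in Python. It is only arithmetic on made-up rows, not the statistical model that Power BI runs; the `rows` data and the `year_bin` helper are assumptions for illustration.

```python
from statistics import mean

# Hypothetical (year, sales_volume) rows; the values are made up for illustration.
rows = [
    (1998, 4000), (1999, 4200),
    (2002, 6000), (2005, 6400),
    (2009, 3500), (2013, 3900),
]

def year_bin(year):
    """Assign a year to one of three illustrative bins."""
    if year <= 2000:
        return "2000 or below"
    if year <= 2007:
        return "2000 to 2007"
    return "over 2007"

overall_average = mean(volume for _, volume in rows)

# Group the sales volumes by bin.
volumes_by_bin = {}
for year, volume in rows:
    volumes_by_bin.setdefault(year_bin(year), []).append(volume)

# Difference between each bin's average and the overall average.
difference_by_bin = {
    label: mean(volumes) - overall_average
    for label, volumes in volumes_by_bin.items()
}

for label, difference in difference_by_bin.items():
    print(f"{label}: {difference:+.0f} compared with the average")
```

A bin with a positive difference would show up as something that influences the measure to increase, and a negative one as something that influences it to decrease.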
So, we’ve seen how we can do all of these lines in our own visualisations, but here, the computer has just done it for us automatically. So, what about what causes sales volume to decrease?
Well, when we’re after 2008, or we’re in the period 2007 to 2008, or the month is January. When the month is January, sales volume decreases by around 648 units compared to all of the other months. So, it could be February, March, et cetera, but January is particularly lower. Or when the year is over 2012. So, here you can see an analysis. You can just hover over this analysis and the computer gives you some information.
Now, top segments. This is what happens when the computer divides up the data into various segments. So, we’ve got two segments here, one with 298 rows and the other with 360. So, let’s have a look at the 298; I’ve just clicked on it. This is a segment in between 2009 and 2012. And you can see that the sales volume is 939 units lower than the overall average. Similarly, we have one where the year is less than or equal to 2008, or is greater than 2012, so not in that period, and the year is greater than 2007. So, it defines it fairly oddly. And then where do we find sales volume being high? We find it high when the year is greater than 2000 and less than or equal to 2007.
Now, we can then experiment. We can say, well, maybe the region name has something to do with this. And we can see that the region name being Greater Manchester influences the sales volume to increase by an average of 1,300 units. And the region name being West Midlands or West Yorkshire also does. And what causes it to decrease? The other regions. Again, we still have the month of January being a factor, and we have Merseyside, South Yorkshire, and Tyne and Wear being factors. Now that we’ve got more than one item, we can click on each of these to get a different graph.
Now, suppose we didn’t want the region name to be considered at all as an influencer. Well, we could just get rid of it and go back to year and month. The problem with that is that now we have got our sales volume grouped by year and month only. When we had it grouped by region name, we had six times as many groupings, six times as many rows for the computer to consider.
But if I drag it into Explain by, it is considered as a key influencer. If I want to have the data grouped by region name, thus giving the computer more data to use, but I don’t want region name to be considered as a key influencer, I can drag it into Expand by. However, there is a problem. When the Analyse field, sales volume, is not summarised, the analysis always runs at the row level. Well, currently, I’m not summarising sales volume.
Suppose I now summarise it by Sum, and then group it. So, what we have is a table. If I just have a table that has the sum of the sales volume by year, then by month, well, that gives me all of that data, but now I want it to be by region name as well. So that gives me a lot more data. You can see that instead of just having one data point for 1995 January, I now have six.
But I need to have this sales volume aggregated. I need to have it in some sort of calculation: sum or average, or min or max. So, if I change this Analyse field from Don’t summarise, which means it looks at all of the rows, to Sum, then it has the equivalent of this table. It’s not considering region name as an influencer, but it is taking it into account when it looks at the granularity. If I didn’t have this, then it would be like giving the computer only this much detail to have a look at. If I do drag in region name, I’m saying, don’t use it as a key influencer, but I do want you to use it when you’re looking at the amount of data that you can investigate.
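As a rough sketch of what that change in granularity means, the Python below sums made-up rows at two levels. The volumes and the `sum_by` helper are assumptions for illustration; the region names are the six mentioned in the video.

```python
from collections import defaultdict

# Hypothetical row-level data: (year, month, region_name, sales_volume).
# The region names come from the video; the volumes are made up.
regions = ["Greater Manchester", "Merseyside", "South Yorkshire",
           "Tyne and Wear", "West Midlands", "West Yorkshire"]
rows = [(1995, month, region, 100 + 10 * i)
        for month in ("January", "February")
        for i, region in enumerate(regions)]

def sum_by(rows, key):
    """Sum sales volume for each distinct grouping key."""
    totals = defaultdict(int)
    for year, month, region, volume in rows:
        totals[key(year, month, region)] += volume
    return dict(totals)

# Grouped by year and month only: one data point per year and month.
by_year_month = sum_by(rows, lambda y, m, r: (y, m))

# Region name added to the grouping: one data point per year, month and region.
by_year_month_region = sum_by(rows, lambda y, m, r: (y, m, r))

print(len(by_year_month))          # number of data points without region name
print(len(by_year_month_region))   # number of data points with region name
```

With six regions, the second grouping has six times as many data points as the first, which is the extra detail the computer gets to work with.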
So, this is the Key influencers visualisation. It allows you to see what factors might influence a particular measure and have a look at the relative importance of these, whether some are more important than others.
Now, just before we end this video on key influencers, let’s just have a quick look at the formatting.
Now, we’ve got things like the visual colours, both for the drill and for the analysis. But in the analysis section, near the bottom, there is Enable counts. Now, what does this do? Have a look at the circle; have a look at the outside of the circle. It allows you to see how many rows are affected by any single statement. So, if I spotlight this one, you can see this particular circle is around a quarter of the data. There’s very little, so you can see which one has more impact over the entirety of the data.
Now, the count type: absolute and relative. If you change it to relative, then the item on the page with the biggest count will be shown as 100%, and everything else will be shown as a percentage of that.
So, in absolute terms, this is only a quarter of the data, maybe even a fifth of the data, and so this is a much smaller fraction. In relative terms, out of all of this data, this is a quarter of that data. So, this is probably the most interesting of the options that you’ve got available here in the format section for key influencers.
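The difference between the two count types can be sketched with a little arithmetic. This is a minimal illustration with made-up row counts, not what Power BI computes internally; the influencer labels and `total_rows` are assumptions.

```python
# Hypothetical row counts behind each influencer circle; the numbers are made up.
total_rows = 1200
rows_per_influencer = {
    "Year is 2000 to 2007": 300,
    "Month is January": 100,
    "Year is over 2012": 150,
}

# Absolute: each count as a share of all the rows in the data.
absolute = {name: count / total_rows
            for name, count in rows_per_influencer.items()}

# Relative: the biggest influencer is treated as 100%,
# and the others are shown as a share of it.
largest = max(rows_per_influencer.values())
relative = {name: count / largest
            for name, count in rows_per_influencer.items()}

for name in rows_per_influencer:
    print(f"{name}: absolute {absolute[name]:.0%}, relative {relative[name]:.0%}")
```

In this sketch the biggest influencer covers a quarter of the data in absolute terms, but is drawn as a full ring in relative terms.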
Now, in the previous video, we had a look at these key influencers, so we can see, as a whole, what is happening to our data. But what if you’ve got one particular point you want to question?
So, for example, suppose you wanted to know why this 2016 point was so much higher than 2015. Well, what we can do is right-click on it, and go to Analyse and Explain the increase or Explain the decrease. So, it’s going to use the same artificial intelligence that powers the key influencers to explain the difference between that point and the previous point. So, the 6.04% increase.
Now, some of the explanations are not going to be helpful. For example, North, North West and Midlands had the most significant impact. Well, those are the only three areas we’ve got. However, if I scroll down to here, you can see West Midlands and Greater Manchester are the ones which had the most significant impact among the region names. So, we can see here the West Midlands going up from an average price of 153,000 to 166,000. So, an increase of 13,000.
Similarly, with Greater Manchester, it’s gone up from 152,000 to 163,000.
So, if we were to look at another one, West Yorkshire, it’s gone up by 9,000, as opposed to the 13,000 for the West Midlands or the 11,000 for Greater Manchester. So, those two have the most significant impact. It doesn’t mean that they are the only impactful thing, but this may be something to have a look at in terms of why there’s an increase. You can also see that we’ve got a dashed line representing the average for 2015, so you can see how much they’ve increased, and you can also see, if you click on any of these buttons, that Greater Manchester or West Midlands will be highlighted.
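The comparison being made here is just a ranking of the change per region, which can be sketched as below. The West Midlands and Greater Manchester figures are the ones quoted above; the West Yorkshire starting price is an assumption, since only its rise of 9,000 is quoted.

```python
# Average price per region in 2015 and 2016.
# West Midlands and Greater Manchester use the figures quoted in the video.
# The West Yorkshire starting price is an assumption; only its 9,000 rise is quoted.
average_price = {
    "West Midlands": (153_000, 166_000),
    "Greater Manchester": (152_000, 163_000),
    "West Yorkshire": (150_000, 159_000),
}

# Change between the two years for each region.
change = {region: after - before
          for region, (before, after) in average_price.items()}

# Rank the regions from the largest increase to the smallest.
ranked = sorted(change.items(), key=lambda item: item[1], reverse=True)

for region, increase in ranked:
    print(f"{region}: +{increase:,}")
```

This only ranks the raw differences; the Explain the increase feature may weigh other things as well, so treat it as an illustration of the idea rather than of the feature itself.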
Now, we can give it a thumbs up or thumbs down; that just provides feedback to Microsoft, it doesn’t actually do anything else. Or we can click the plus, and this adds this visualisation to the page. So, I’ll just click away from it, and now we’ve got this visualisation and we can choose to do what we wish, because everything is in here just like a normal visualisation.
Now, if you take another example, why do we have a peak all the way up here? So, if I right-click and go to Analyse, Explain the increase and scroll down, you can see that the Yorkshires had the most significant increase among the region names. So, the Yorkshires went up from 42% to 68%, compared to Other, which went just from 45% to 57%. And you can see at the bottom that we’ve got different visualisations we can quickly have a look at, while explaining this increase.
Now, sometimes it might not be a particular point, but it might be the distribution. So, right here we’ve got all of our bins. This is the grouping, or binning, that we had a look at a few videos ago.
Now, there will be some regions where the distribution won’t be like this, and it might be interesting to see which particular region stands out. So, I can right-click on any point. Now, here it doesn’t really matter about the particular point, because I’m not going to be asking Power BI to explain the difference between this point and the previous point; I’m just going to get it to look at the distribution as a whole. So, Find where this distribution is different. So, here’s the distribution that we’ve had. You can see this here in the background.
Now, look at the Tyne and Wear bars. Quite a different distribution, so we’ve got quite a lot right at the beginning. It’s very front-loaded, and we’ve not much of a tail over here. Compare this with Greater Manchester, where we’ve got more of a tail and very little at the beginning, and similarly for West Midlands.
Now, just one point. This is the totality of all of our regions, but it looks as though Tyne and Wear goes all the way up here. In reality, of course, it doesn’t; it is a subset of this. Here we are just comparing proportions. We are not looking at like for like. We can see on the right-hand side that we’ve got an axis for all sales volume, going from zero to 400, whilst here on the left-hand side, it’s just Tyne and Wear, going from zero to about 70 where the other gets to 400. So, it’s comparing proportions. If we didn’t want that, and we wanted everything to be on the same axis, then I would get rid of this, and here you can see Tyne and Wear as the subset of the totality. So, here is Greater Manchester, and here is the West Midlands. And again, we can add any visualisations that we like to the page.
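To show the difference between comparing proportions and comparing on the same axis, here is a minimal sketch with made-up counts per bin. The two lists and the `proportions` helper are assumptions for illustration; only the rough maximums of 400 and 70 echo the axes described above.

```python
# Hypothetical counts per price bin; the numbers are made up for illustration.
all_regions = [100, 250, 400, 300, 150]
tyne_and_wear = [40, 70, 50, 20, 10]

def proportions(counts):
    """Rescale counts so that they sum to 1."""
    total = sum(counts)
    return [count / total for count in counts]

# Comparing proportions: each distribution is rescaled separately,
# so the shapes can be compared even though the sizes differ.
all_share = proportions(all_regions)
tyne_share = proportions(tyne_and_wear)

for bin_number, (overall, region) in enumerate(zip(all_share, tyne_share), start=1):
    print(f"bin {bin_number}: all regions {overall:.0%}, Tyne and Wear {region:.0%}")

# Same axis: with the raw counts, the region is a subset of the total,
# so it never exceeds the total in any bin.
is_subset = all(region <= overall
                for region, overall in zip(tyne_and_wear, all_regions))
print(is_subset)
```

On proportions, the made-up Tyne and Wear data has a larger share in the first bin than the total does, which is the front-loaded shape; on raw counts it simply sits inside the total.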
Now, there are a few restrictions. So, at the time of recording, the Analyse feature won’t work with certain types of filters. For instance, it won’t work with Top N filters.
In addition, some types of data connection are not currently supported: DirectQuery and Live Connect.
Basically, anything which is not using the standard import. However, outside of those, if you want to find out why one particular point is different from its neighbour, or if you want to have a look at the distribution of all of the points and the way it might be different, have a look at Analyse, and you can explain the increase or decrease, or find out where this distribution is different.