This is happening very frequently, especially now, because people are learning how to optimize dax and therefore they post their solution, asking us to check whether there are any issue or if they’ve actually found a good solution. I love that, because, whenever i see code written by somebody else, i always find new ideas or i find errors or mistakes that i can talk about in order to make you understand what to care when you optimize your code, but i don’t want to do a Long introduction, so let’s go straight to the code and see what the proposal is and how we are going to check the performance. The article i’m speaking about is understanding data lineage index. This article talks about data lineage and i’m, not going to talk about the article because it’s, not the topic of the video. The article at the end proposes this code first say first daily first day, sales, v4 that’s, the fourth version of an expression. Now this measure computes the sales amount, considering only the first day of sales, despite the product having been sold for maybe a year two years. We only want the sales in the first day. So the day we launched the product and to do that, we first create a variable products with sales that contains the product names that appear in sales. By using summarize, then, we compute for each product, the first sale using calculate minimum of sales order data. And finally – and this is the topic of the article – we use tritas – to change the lineage of product and first data – so to make a variable that will be able to filter the model.

Indeed, the next variable we use the product and first date with correct lineage to filter the model and compute the sales amount, and this is the code we authored, whereas in the comment renato one day ago, said well excellent, as always. Thank you renato. I was rereading your article for study purposes. Now. Let me make a quick stop here. Re reading, an article is like re reading. The definite guide to dax is the best gift that you can make to yourself. If you really want to learn the accent, there are a lot of small details and it’s utterly easy to forget them. If you just do a quick read of any article or of the task, the textbook so reread study consider that something to study not just to learn about in a casual way. Anyway, he was rereading the article and tried several different ways of writing the same code, which again is a absolutely beautiful way of learning dax and he watched an unplugged video that means i’m recording an unplugged video about a guy that posted an article after i haven’t Seen another unplugged video, but he ended up with this version of the code that he claims is faster we’re, going to analyze the code later for now, it’s important just to see that we have a first version here and a few hours later, renato came up with Another version, a second version and actually a third version, which is the best one.

So we just copy and we just copy everything, and then we check how it works and how it goes now. As a first look, look just looking at group by seems to be a problem here, but we need to test it so in order to test it, i have the same model that renato was using and i only use it because i want that studio. We are going to run attack, studio, test the measure and see how it goes. Let me paste everything format the code and then we need to run a query. This is a measure definition, so we need to define a measure let’s call it sales first day, sales, renato and then we need to write a query like evaluate summarize columns by product brand test and let’s use a new line. So we can replace that later and then the measure is first, they says renato let’s see if it runs it does, because we will need to check the performance of this measure against ours. It’S also good. If we copy our measure this one and we define that too let’s put it later, this is the code of our measure, okay, and we also need to define that let’s call that sql bi format the code. So we have first renatos measure, then our measure and a test query that runs the first. That runs the measure. This is renato’s measure running. I need to enable server timings to check performance.

Okay, so let’s do a first run and we have server timings. Renato’S measure runs in 12 milliseconds, whereas our measure runs in 14 milliseconds now before doing any other conclusion, there is an important detail here: the difference between 40 milliseconds and 12 milliseconds is not two milliseconds. The difference is nothing at all, so there is basically no difference at all between a measure that runs in 10 or 15 milliseconds. It looks like 5 milliseconds is the difference, but that is not true. Whenever you are running on a pc, 5. 10, even 20. Milliseconds of difference is absolutely normal, depending on the workload on the computer at that precise moment in time. So do not get fooled by tiny differences, because you are not optimizing anything if you just cut a few milliseconds out of a measure with that said i’m. Not here to tell that they run the same way because they actually are two different algorithms let’s, look at renato’s measures and start to think that we will need to have a different database running on the standard contoso database. We are not going to see difference in performance, because what renato does is create a variable and it uses he uses the group by function now. Group by is very close to summarize, but the big difference between group by and summarize is that samurais can be pushed down to storage engine quite easily by dax, whereas group by is entirely computed in formula engine.

So most of the scenarios grew by needs to be much slower than samurais. It is not in this case they are performing, they are performing in a very similar way, but again, i suspect that this is because of the size of the model. So, on a small model, all the measures run amazingly fast, so it’s grouping by say is using group by and it’s grouping sales by product name computing. At the same time, the minimum out of the current group of the data, then it uses it, uses three tasks. Exactly as we are doing and finally calculate our version of the code is nearly identical, the difference is that we are using summarize and add columns instead of using group by and, as i said in this model, they are nearly the same that’s. Why i’m going to use not this model, but instead i’m going to check the difference in performance between the two on contoso, the full contoso, the one containing 12 million rows? We need to re, enable server timings and then we look at the two measures. First renato’s measure: we run it of course, having a larger model. Now the speed will be different now it’s 217 milliseconds, and if we run our version of the code, it runs in 86 milliseconds. Now the difference start to be meaningful because one is running in half the time of the other one and the difference is around 120 milliseconds. So you can measure that.

But we need to understand it better. So let’s go to renato’s measure and see exactly what it does. Okay, this is the measure and do we have everything here, yep the measure and the code. Here there are two storage engine queries you see the first one and the second one, as i told you group by, is a function that runs entirely in formula engine. So how is the engine going to answer this query? This query is grouping sales by product name and computing. The minimum date out of the the minimum related date is actually the minimum sales order. Data, but fee is a code that needs to be executed by formula engine. Therefore, dax requires to create a temporary table that contains the product name, all the dates when the product was sold and finally performs the grouping by so the both the group by operation and finding the minimum entirely informal engine. This requires a matter realization that is visible here, that it’s computing quantity times, net price and it’s selecting brand. No sorry, i was looking at the wrong one: it’s, selecting product name, product brand and the data so for each product and brand and date it’s retrieving the three columns. You see that there is no mean here. There is no group by operation. There is the mean operation. This min x is not being executed by storage engine out of this table. It will formal engine will compute the minimum value, and indeed the materialization of this table is quite large, it’s a half, a million rows, which i guess is the combination of product brand product product name, product brand and data.

We can actually give that a try if we create a new query and we evaluate count rows of summarize sales by product name, product brand and data, and we pack everything into a table. Okay – and i run it, you see that the result is 500 half a million rows that are all the combination of product name, product brand and data. That is exactly the same number we have here. So storage engine is creating a data, cache containing half a million rows on top of which formula engine computes at the minimum, and then we have another query that computes other values, but this contains the filter and the sales amount. So i would expect to find the very same query also in our version. There are other important details to note here: uh the ratio between storage engine and formal engine is not that bad, but the parallelism for storage engine is not that large, because storage engine spent most of its time building this large data cache. What happens if we run the same on our model? So, instead of with our measure, we use the sqlbi measure. If i run that because i used summarize instead of using group by this – is our code, i expect samurais to try to push most of its operation down to storage engine, and you can see that here. First of all, the materialization is much smaller before it was half a million rows now, it’s 2.

5 thousand. We have two materialization of 2.5 000 and one of 14 rows and the first one computes by product name, product brand, the minimum of sales order date here is the key of why our measure is faster because the mean operation is not executed in storage, informal engine. It is executed in storage engine and we do not require to have all the combination of product brand and date. We only have a product brand and minimum date so, depending on how many sales you have, this will be way way smaller and it’s a lot smaller. 2.5: 000 against half a million uh. The ratio between storage engine and formula engine follows storage engine. You see that most of the code is executed in storage engine and the degree of parallelism is even higher, because storage engine actually made more work and it spent less time in computing in building this huge data cache. So, at the end, what it produced is a faster measure that does less materialization, because we used summarize so it was not bad. The idea of using group by instead of summarize, but you need to remember that group by was not designed to work on large data sets group by, is designed to be one of the last operations during a query. Once once you have variables or data structure which are not that large, then you can use group by to push to formula engine the final grouping by operation. But if you need to group large data sets group, i is not.

Your best friend, summarize is typically much faster. Then summarize has the problem that you cannot compute value, so you always need to use summarize width and columns whenever you want to compute a value, but never replace summarize with group by unless you deliberately want to push the calculation to formula engine, sometimes that’s exactly what You want to do you have a calculation that is bound to storage engine and you want to get to move a part of the calculation to formula engine for whatever reason, then using group by is your best friend, but not in this case. In this case, using summarize is your best option and there are some important details that are worth talking about during the conclusion. First of all, you have seen that the same query executed on different databases provides, of course, different numbers. If you need to do performance optimization, you cannot run on a very small database because there are, the numbers are so small that you can end up with the wrong conclusion. You need to use tens or hundreds of millions of rows in order to appreciate differences in the numbers. The difference between 12 and 14 milliseconds is basically nothing at all. Then, whenever you want to push calculation down to storage engine, try to avoid using group by because group i was not designed as a formula that is able to push calculation down to storage. Engine group by is entirely executed in formula engine, which means slower calculations with all that said, keep that in mind trying different versions of the code is that the best way to learn dax? You have a formula it works.

You try to write it in a different way and then again in another different way, and you continue this process of testing different examples. All this is extremely important to make ducts stick in your brain and learn a different way of expressing the same code. Thank you.