DAX: SUM vs SUMX misunderstanding
How common is this misunderstanding, and why does it exist?
<false statement>
“SUMX ( ‘Table’, ‘Table'[Column] ) is much more expensive (from performance point of view) than SUM ( ‘Table'[Column] ) because the iterator functions differ.
</false statement>
Is the confusion a terminology issue? “Aggregators” versus “iterators” might suggest a fundamental difference in their operation.
Perhaps it’s because of an attempt to explain that SUMX could be costly if it involves a complex calculation.
✅ SUMX might be expensive = TRUE
🔴 SUMX ( ‘Table’, ‘Table'[Column] ) is more expensive than SUM ( ‘Table'[Column] ) = FALSE, they are the same SUMX function with different syntax
⚠ SUMX ( ‘Table’, <expression more complex than a single column reference> ) is more expensive than SUM ( <same expression> ) = NULL or ERROR, the comparison makes no sense, SUM syntax won’t allow that, there is nothing to compare with
P.S. I believe, when I was learning Power BI, years ago, I also was sure iterators actually work differently on the engine level, but how and why it was a mystery that had no logical explanation. The explanation: “they iterate, not aggregate had no sense”. To aggregate you still need to iterate through all rows, there is no other way. Then, I can’t recall when and how, I learned the truth.