Category Archives: SSAS Basic Concept

Basic Concepts Used within MDX Query:-

Here you learn what I couldn’t learn in a month 🙂

Multi-Dimensional Expression (MDX)

You can use MDX queries to get at data stored in a SQL Server Analysis Server cube by bringing back facts based on dimensions An MDX expression returns a multi-dimensional result set (dataset) that consists of axis data and cell data.

Cube:-

OLAP Cube is the basic unit of storage for Multidimensional data, on which we can do analysis on stored data and study the various patterns

Dimensions:-

Dimensions, in general, we can say are the Master entities with related member attributes using which we can study data stored in OLAP Cube Quickly and effectively, The primary functions of dimensions are to provide Filtering, Grouping, and Labeling on your data. Dimension tables contain textual descriptions about the subjects of the business.

Measure & Measure Groups:-

Metrics value stored in your Fact Tables is called Measure. Measures are used to analyze the performance of the Business. Measure usually contains numeric data, which can be aggregated against usage of associated dimensions. Measure Group holds the collection of related Measures.

1

Here Basic Terms you should know while working on Multi Dimensional Cube

Introduction to Level, Member, Hierarchy

Level

Generally, Attributes under Dimension are considered as levels, they are also called as Attribute

 Hierarchy.

Let’s take an example of Date Dimension in this we have various levels of Quarter of the Year, Semester of the Year, Week of the Year, Calendar Year, etc.

Members

A key component of the MDX query is member. Each Level contains one or more members.

e.g. Calendar Quarter of Year contains various members like CY Q1, CY Q2, CY Q3, CYQ4.

2

Now we are ready to start playing with MDX Query in our Query Editor Window.

“I think this is the beginning of a beautiful friendship between you and MDX.”

3

Introduction to Axis in MDX Query

MDX queries can have 0, 1, 2 or up to 129 query axes in the SELECT statement. Each axis behaves in exactly the same way, unlike SQL where there are significant differences between how the rows and the columns of a query behave.

Refer to the following table for Axis Numbers reserved and Alias given to them:
Axis Number Alias
0 Columns
1 Rows

Using SQL Server Management Studio (SSMS), we can only browse values on two axis, Columns (Axis 0) and Rows (Axis 1).

 

1. Start with Simple MDX Query

Syntax:  Select From [Your Cube Name] ;.

Which will give you aggregated result as shown in result pane.
4

2. Dropping Dimensions on Axis

Syntax  Select Dimension.Member on Column From [OLAPCubeName ]

Select [Customer].[Customer].[Customer] on columns From [Adventure Works]; 5

3. Using Both the Axis (Rows & Columns)

You can select Dimension or Measure on any Axis.

 Syntax Select [Measure] on Columns,

[Dimension].[Members] on Rows From [Cube Name] ;

Select [Measures].[Internet Sales Amount] on Columns,[Customer].[Customer].[Customer] on RowsFrom [Adventure Works];

6

4. Introduction to .members, and .children in MDX Query

.Members

If you will use this with hierarchy level, then it will retrieve all the values below it and also bring aggregation of that in the form of [ALL].

 Syntax Select [Dimension].[Hierarchy].members on Columns from CubeName

Select [Measures].[Internet Average Sales Amount] on Columns,[Product].[Category].members on RowsFrom [Adventure Works];

7

.Children

When we want to retrieve all members values under particular level of a dimension at that time we use .children ,This will exclude aggregation values [ALL] in your result set.

Syntax  Select [Measures].[Internet Average Sales Amount] on Columns,[Product].[Category].children on RowsFrom [Adventure Works];

8

5. Introduction to Tuple and Set

Tuple:

When we need to place more than one members of a dimension or hierarchy of that dimension on a axis at that time tuple comes into the picture, tuple is enclosed within curly bracket { }, for single tuple bracket is optional.

select {[Date].[CY 2005], [Date].[CY 2006] , [Date].[CY 2007]} on rows,[Measures].[Internet Sales Amount] on columns from[Adventure Works];

9

Set:

Combination of tuple or tuples will give you set , When You want to include range at that time you can use : instead of separating tuple members by comma if they are belonging to same dimension member.

select {[Date].[CY 2008] : [Date].[CY 2005]}  on rows,[Measures].[Internet Sales Amount] on columns from[Adventure Works];

10

6. Using CROSS JOIN

Cross Join Function returns cross product of one or more sets.

Whenever we need to combine more than one member from same or different dimension at that time we can use cross join. * sign can be use to implement cross join between dimension members.

select {[Product].[Category].children * [Product].[Subcategory].children} on rows,[Measures].[Internet Sales Amount] on columns from[Adventure Works];

11

7. Using NonEmpty

NonEmpty function evaluated first so it will remove rows if there was no data in first measure.

8. Apply Slicing using Where Clause

Select [Measures].[Internet Sales Amount] on columns,[Product].[Product].[Product].members on rows from [Adventure Works]where [Date].[Calendar Year].[CY 2007];

Looks like now you know what is mdx at least; Let’s put a smile on face.

Advertisements

Difference between Star Flake or Snow Flake Schema?

Let’s find out which schema should use on which condition.

We will go in three Phase:-

1:- Basic diff b/w Star Flake and Snow Flake

2:- Come up with some details which one I should prefer when?

3:- Where do the two methods fit in?

Comparison chart

Snowflake Schema Star Schema
Ease of maintenance / change No redundancy, so snowflake schemas are easier to maintain and change. Has redundant data and hence less easy to maintain/change
Ease of Use More complex queries and hence less easy to understand Lower query complexity and easy to understand
Query Performance More foreign keys and hence longer query execution time (slower) Less number of foreign keys and hence shorter query execution time (faster)
Type of Datawarehouse Good to use for datawarehouse core to simplify complex relationships (many:many) Good for datamarts with simple relationships (1:1 or 1:many)
Joins Higher number of Joins Fewer Joins
Dimension table A snowflake schema may have more than one dimension table for each dimension. A star schema contains only single dimension table for each dimension.
When to use When dimension table is relatively big in size, snowflaking is better as it reduces space. When dimension table contains less number of rows, we can choose Star schema.
Normalization/ De-Normalization Dimension Tables are in Normalized form but Fact Table is in De-Normalized form Both Dimension and Fact Tables are in De-Normalized form
Data model Bottom up approach Top down approach

Where should they be applied?

And why?

 

The Star schema vs Snowflake schema comparison brings forth four fundamental differences to the fore:

  1. Data optimization: 

Snowflake model uses normalized data, i.e. the data is organized inside the database in order to eliminate redundancy and thus helps to reduce the amount of data. The hierarchy of the business and its dimensions are preserved in the data model through referential integrity.

Figure 1 – Snow flake model

Star model on the other hand uses de-normalized data. In the star model, dimensions directly refer to fact table and business hierarchy is not implemented via referential integrity between dimensions.

Figure 2 – Star model

  1. Business model:

Primary key is a single unique key (data attribute) that is selected for a particular data. In the previous ‘advertiser’ example, the Advertiser_ID will be the primary key (business key) of a dimension table. The foreign key (referential attribute) is just a field in one table that matches a primary key of another dimension table. In our example, the Advertiser_ID could be a foreign key in Account_dimension.

In the snowflake model, the business hierarchy of data model is represented in a primary key – Foreign key relationship between the various dimension tables.

In the star model all required dimension-tables have only foreign keys in the fact tables.

  1. Performance:

The third differentiator in this Star schema vs Snowflake schema face off is the performance of these models. The Snowflake model has higher number of joins between dimension table and then again the fact table and hence the performance is slower. For instance, if you want to know the Advertiser details, this model will ask for a lot of information such as the Advertiser Name, ID and address for which advertiser and account table needs to be joined with each other and then joined with fact table.

The Star model on the other hand has lesser joins between dimension tables and the facts table. In this model if you need information on the advertiser you will just have to join Advertiser dimension table with fact table.

  1. ETL

Snowflake model loads the data marts and hence the ETL job is more complex in design and cannot be parallelized as dependency model restricts it.

The Star model loads dimension table without dependency between dimensions and hence the ETL job is simpler and can achieve higher parallelism.

This brings us to the end of the Star schema vs Snowflake schema debate. But where exactly do these approaches make sense?

Are we on Same Page?

I think you have doubt. Let me give two situation hope you will get better understanding.

Where do the two methods fit in?

With the snowflake model, dimension analysis is easier. For example, ‘how many accounts or campaigns are online for a given Advertiser?’

The star schema model is useful for Metrics analysis, such as – ‘What is the revenue for a given customer?’