Difference among xml auto,xml raw and xml explicit.

RAW

The RAW mode takes the query result and transforms each row
in the result set into an XML element with a generic
identifier row as the element tag and the columns in the
SELECT as attributes.

AUTO

The AUTO mode returns query results in a simple, nested XML
tree. Each table in the FROM clause for which at least one
column is listed in the SELECT clause is represented as an
XML element. The columns listed in the SELECT clause are
mapped to the appropriate element’s attributes. The columns
can be optionally mapped to subelements. The nesting of the
elements or hierarchy in the result set is based on the
order of tables identified by the columns specified in the
SELECT clause. The leftmost table will be the top element.
The second leftmost table (identified by columns in the
SELECT statement) will be nested within the top element, and
so on.

EXPLICIT

In EXPLICIT mode, you can explicitly define the shape of the
resulting XML tree. Using this mode requires that the
queries be written in a specific way, so that additional

play with given query.

IF OBJECT_ID('tempdb..#XmlTestTable') IS NOT NULL DROP TABLE #XmlTestTable
CREATE TABLE #XmlTestTable 
(
    ID INT PRIMARY KEY IDENTITY(1,1),
    FirstName VARCHAR(20),
    LastName VARCHAR(20)
)
INSERT INTO #XmlTestTable (FirstName,LastName) VALUES
('John','Doe'),
('Jane','Doe'),
('Brian','Smith'),
('Your','Mom')

--YOUR TESTS
SELECT * FROM #XmlTestTable FOR XML AUTO
SELECT * FROM #XmlTestTable FOR XML RAW
SELECT * FROM #XmlTestTable FOR XML RAW, ELEMENTS
SELECT * FROM #XmlTestTable FOR XML PATH('Customers')

DROP TABLE #XmlTestTable

information about the desired nesting is specified
explicitly as part of the query.

Advertisements

Difference between Star Flake or Snow Flake Schema?

Let’s find out which schema should use on which condition.

We will go in three Phase:-

1:- Basic diff b/w Star Flake and Snow Flake

2:- Come up with some details which one I should prefer when?

3:- Where do the two methods fit in?

Comparison chart

Snowflake Schema Star Schema
Ease of maintenance / change No redundancy, so snowflake schemas are easier to maintain and change. Has redundant data and hence less easy to maintain/change
Ease of Use More complex queries and hence less easy to understand Lower query complexity and easy to understand
Query Performance More foreign keys and hence longer query execution time (slower) Less number of foreign keys and hence shorter query execution time (faster)
Type of Datawarehouse Good to use for datawarehouse core to simplify complex relationships (many:many) Good for datamarts with simple relationships (1:1 or 1:many)
Joins Higher number of Joins Fewer Joins
Dimension table A snowflake schema may have more than one dimension table for each dimension. A star schema contains only single dimension table for each dimension.
When to use When dimension table is relatively big in size, snowflaking is better as it reduces space. When dimension table contains less number of rows, we can choose Star schema.
Normalization/ De-Normalization Dimension Tables are in Normalized form but Fact Table is in De-Normalized form Both Dimension and Fact Tables are in De-Normalized form
Data model Bottom up approach Top down approach

Where should they be applied?

And why?

 

The Star schema vs Snowflake schema comparison brings forth four fundamental differences to the fore:

  1. Data optimization: 

Snowflake model uses normalized data, i.e. the data is organized inside the database in order to eliminate redundancy and thus helps to reduce the amount of data. The hierarchy of the business and its dimensions are preserved in the data model through referential integrity.

Figure 1 – Snow flake model

Star model on the other hand uses de-normalized data. In the star model, dimensions directly refer to fact table and business hierarchy is not implemented via referential integrity between dimensions.

Figure 2 – Star model

  1. Business model:

Primary key is a single unique key (data attribute) that is selected for a particular data. In the previous ‘advertiser’ example, the Advertiser_ID will be the primary key (business key) of a dimension table. The foreign key (referential attribute) is just a field in one table that matches a primary key of another dimension table. In our example, the Advertiser_ID could be a foreign key in Account_dimension.

In the snowflake model, the business hierarchy of data model is represented in a primary key – Foreign key relationship between the various dimension tables.

In the star model all required dimension-tables have only foreign keys in the fact tables.

  1. Performance:

The third differentiator in this Star schema vs Snowflake schema face off is the performance of these models. The Snowflake model has higher number of joins between dimension table and then again the fact table and hence the performance is slower. For instance, if you want to know the Advertiser details, this model will ask for a lot of information such as the Advertiser Name, ID and address for which advertiser and account table needs to be joined with each other and then joined with fact table.

The Star model on the other hand has lesser joins between dimension tables and the facts table. In this model if you need information on the advertiser you will just have to join Advertiser dimension table with fact table.

  1. ETL

Snowflake model loads the data marts and hence the ETL job is more complex in design and cannot be parallelized as dependency model restricts it.

The Star model loads dimension table without dependency between dimensions and hence the ETL job is simpler and can achieve higher parallelism.

This brings us to the end of the Star schema vs Snowflake schema debate. But where exactly do these approaches make sense?

Are we on Same Page?

I think you have doubt. Let me give two situation hope you will get better understanding.

Where do the two methods fit in?

With the snowflake model, dimension analysis is easier. For example, ‘how many accounts or campaigns are online for a given Advertiser?’

The star schema model is useful for Metrics analysis, such as – ‘What is the revenue for a given customer?’

CTE – part 2

Complex Scenarios

So what is the big deal about CTE, the deal is when you need some complex queries or operations, trust me nothing goes as good as CTE. Let’s have a look at one of the most commonly encountered complex issues… Duplicates. We have a sample table (EMP) for the example.

Select * from EMP

EID         ENAME                DEPT
----------- -------------------- ----------
1           Sara                 IT
2           Rick                 HR
3           Ted                  IT
4           Sheldon              Accounts
5           Sara                 IT

(5 row(s) affected)

For removing the duplicate employee, i.e. ‘Sara’ from the table, we create a CTE:

;WITH EliminateDup(Eid,Name,Dept,RowID) AS
(
SELECT Eid,Ename,Dept, ROW_NUMBER()OVER(PARTITION BY Ename,Dept ORDER BY EID)AS RowID
FROM EMP
)
DELETE FROM EliminateDup WHERE RID>1
The query below creates a temporary result set as :
SELECT Eid,Ename,Dept, ROW_NUMBER()OVER(PARTITION BY Ename,Dept ORDER BY EID)AS RowID
FROM EMP

Eid         Ename                Dept       RowID
----------- -------------------- ---------- --------------------
2           Rick                 HR         1
1           Sara                 IT         1
5           Sara                 IT         2
4           Sheldon              Accounts   1
3           Ted                  IT         1

(5 row(s) affected)

And later, we remove the duplicate with the DELETE. Quite simple, isn’t it.

Recursion

The next and the most important feature is recursion.

With the UNION ALL, we can make the CTE recursive to formulate a final result. There is an anchor member and a recursive member which may or may not have a terminating condition. Let’s see with an example..

Suppose we have a comma separated string and we wish to extract each word from the string..

Let’s consider the string to be ‘Where,there,is,a,will,there,is,a,way’.

DECLARE @T VARCHAR(100)='Where,there,is,a,will,there,is,a,way'
SET @T =@T+','
;WITH MyCTE(Start,[End]) AS(

SELECT 1 AS Start,CHARINDEX(',',@T,1) AS [End]
UNION ALL
SELECT [End]+1 AS Start,CHARINDEX(',',@T,[End]+1)AS [End] from MyCTE where [End]<LEN(@T)
)
Select SUBSTRING(@T,Start,[End]-Start)from MyCTE;

Let’s understand what we have done here… We have an anchor in the form of SELECT 1,CHARINDEX(',',@T,1).

For the first pass, the anchor returns the values 1,6 (this value being the CHARINDEX of first comma after the word ‘Where,’) for columns Start & [End].

Next the recursive code returns [End]+1=7 as Start and 12 for CHARINDEX(',',@T,[End]+1, i.e. 7) AS [End], this code recurses unless the terminating condition is met which is [End]<LEN(@T)i.e. 37.

The UNION ALL operator unites all the start & [End], for clarity let’s take another look at the values.

DECLARE @T VARCHAR(100)='Where,there,is,a,will,there,is,a,way'
SET @T =@T+','
;WITH MyCTE(Start,[End]) AS(

SELECT 1 AS Start,CHARINDEX(',',@T,1) AS [End]
UNION ALL
SELECT [End]+1 AS Start,CHARINDEX(',',@T,[End]+1)AS [End] from MyCTE where [End]<LEN(@T)
)
Select Start,[End],SUBSTRING(@T,Start,[End]-Start)AS String from MyCTE;

Hope that makes things clearer. With CTE, we can achieve the same feats of programmability as C# or C++ with respect to generating Fibonacci series, a specific string patterns, etc. The recursion specifically finds an important use while you need a hierarchy to be reported, we will see that in a while. Currently, let’s look into the recursion option.

What if we want the first two values only out of the string?

DECLARE @T VARCHAR(100)='Where,there,is,a,will,there,is,a,way'
SET @T =@T+','
;WITH MyCTE(Start,[End]) AS(

SELECT 1 AS Start,CHARINDEX(',',@T,1) AS [End]
UNION ALL
SELECT [End]+1 AS Start,CHARINDEX(',',@T,[End]+1)AS [End] from MyCTE where [End]<LEN(@T)
)
Select Start,[End],SUBSTRING(@T,Start,[End]-Start)AS String from MyCTE
OPTION (MAXRECURSION 1);

The OPTION MAXRECURSION enables the code to recurse only once and terminates as soon as that happens.The self explanatory message flashes and the values returned out on the results pane is:

  1. initial anchor value
  2. first recursed value

MAXRECURSION value can be between 0 and 32,767. 32,767 is fine but would what a 0 return? 0 enables an infinite recursion hence if the recursive statement does not have a terminating condition, the program loops infinitely. For first hand experience, try the below code?

;WITH MyCTE(Val) AS(

SELECT 1 AS Val
UNION ALL
SELECT Val=(Val+1) FROM MyCTE
)
Select Val from MyCTE
OPTION (MAXRECURSION 0);

Fetching Hierarchy

Before we call it a day, let’s look at the final example of fetching the complete hierarchy of a particular organization. In such scenarios, CTE could outperform any complex code both in terms of simplicity and LOC (lines of code) required to derive the result.

We have a table Org as below:

Select * from Org

For fetching the bottom up hierarchy, we pass the eid and get the complete hierarchy for the concerned employee. For example, for Andy’s organizational hierarchy, we pass @T = 6 (His Eid).

DECLARE @T INT = 6
;WITH OrgCTE(Eid,Employee,SupervisorID,ReportsTo)AS
(
SELECT @T,O.Name,O2.EID,O2.Name FROM Org O
INNER JOIN Org O2 ON O.SupervisorID=O2.EID
AND O.EID=@T

UNION ALL

SELECT OC.SupervisorID,OC.ReportsTo,O2.EID,O2.Name
FROM OrgCTE OC
INNER JOIN
Org O
ON OC.SupervisorID=O.EID
INNER JOIN
Org O2
ON O.SupervisorID=O2.EID
)
SELECT * FROM OrgCTE

So we have been able to get the hierarchy for Andy.

Similarly for the top down hierarchy, we can implement the below CTE which gives the level indicating the top down org chart.

;WITH OrgCTE(Eid,SupervisorID,Employee,[Role],[Level])AS
(
SELECT EID,SupervisorID,Name,[Role],0 FROM
Org WHERE SupervisorID=0

UNION ALL

SELECT O.EID,O.SupervisorID,O.Name,O.[Role],[Level]+1
FROM Org O
INNER JOIN OrgCTE OC
ON O.SupervisorID=OC.Eid
)
SELECT * FROM OrgCTE

SQL programmers find CTE of immense use and thanks to this feature, complexities in programming life have been considerably simplified. I hope I have been able to justify CTE reasonably well in this article.

CTE for beginner – part 1

CTE as a Derived Table

We have a simple table Products in our database.

Select * from PRODUCTS
ProductID ProductDesc ManufacturingDate       ExpiryDate              IsSalable Price
---------------------------------------------------------------------------------------
1	  Biscuits    2011-09-01 00:00:00.000  2012-09-01 00:00:00.000 1	20.00
2	  Butter      2010-09-01 00:00:00.000  2011-09-01 00:00:00.000 1	30.00
3	  Milk	     2011-10-01 00:00:00.000 2011-11-01 00:00:00.000  1	46.00

We have created a simple ProductsCTE for displaying all the Products with Price greater than 20.00. Here CTE performs the job of acting as a simple derived table.

;WITH ProductsCTE(ProdName,Price) AS
( SELECT ProductDesc,Price
  FROM PRODUCTS
  WHERE Price>20.00
)
SELECT * FROM ProductsCTE
ProdName                                           Price
-------------------------------------------------- ---------------------
Butter                                             30.00
Milk                                               46.00

(2 row(s) affected)

Important point that needs a mention is the SELECT followed by the CTE definition, any operation SELECT,INSERT, UPDATE, DELETE or Merge can be performed immediately after the CTE and the CTE lasts only for a single such operation. When I say that, what do I mean…

It means that the below code is erroneous:

DECLARE @T INT,@I INT
SET @T = 10
SET @I = 20
;WITH ProductsCTE(ProdName,Price) AS
( SELECT ProductDesc,Price
  FROM PRODUCTS
  WHERE Price>20.00
)
SELECT @T+@I
SELECT * FROM ProductsCTE

On executing the code, it results in the below error. Which means I need to execute the ProductsCTE SELECTimmediately after the CTE definition completes.

Msg 422, Level 16, State 4, Line 10
Common table expression defined but not used.

To rectify the same… I would reinstate the order as:

DECLARE @T INT,@I INT
SET @T = 10
SET @I = 20
;WITH ProductsCTE(ProdName,Price) AS
( SELECT ProductDesc,Price
  FROM PRODUCTS
  WHERE Price>20.00
)
SELECT * FROM ProductsCTE
SELECT @T+@I
ProdName                                           Price
-------------------------------------------------- ---------------------
Butter                                             30.00
Milk                                               46.00

(2 row(s) affected)

-----------
30

(1 row(s) affected)

Let’s perform an UPDATE:

;WITH ProductsCTE(ProdName,Price) AS
( SELECT ProductDesc,Price
  FROM PRODUCTS
  WHERE Price>20.00
)
UPDATE ProductsCTE SET Price=50 WHERE ProdName='Milk'
SELECT * FROM ProductsCTE
-------------------

(1 row(s) affected)
Msg 208, Level 16, State 1, Line 7
Invalid object name 'ProductsCTE'.

The price for Milk gets duly updated to 50 but the next set of select doesn’t work. Always remember, you can hit for a result set once and hit immediately after the CTE definition. Like:

;WITH ProductsCTE(ProdName,Price) AS
( SELECT ProductDesc,Price
  FROM PRODUCTS
  WHERE Price>20.00
)
SELECT * FROM ProductsCTE
UNION
SELECT 'Bread' AS ProdName,MIN(Price) AS PRICE from ProductsCTE

ProdName                                           Price
-------------------------------------------------- ---------------------
Bread                                              30.00
Butter                                             30.00
Milk                                               50.00

(3 row(s) affected)
Calling Multiple CTEs

We can have multiple CTEs calls from one single query. Let’s have a look at the example. We have 2 tables:

Select * from Student

Select * from Teacher

Let’s implement a CTE to get all the respective class teachers for the students.

;WITH StudCTE(RollNo,StudentName,TeacherID)
AS
(
SELECT ID,Name,TID FROM Student
)
,TeacherCTE(TID,TeacherName)
AS
(
SELECT ID,Name FROM Teacher
)
SELECT RollNo,StudentName,TeacherName
FROM StudCTE SC
INNER JOIN
TeacherCTE TC
ON SC.TeacherID=TC.TID

We have called 2 CTEs from a single SELECT and based upon the inner join returned the student-teacher information. That was a simple example to show how multiple CTEs are done.