Tuesday, April 22, 2014

FOR XML Clause With Join

Today, we will explore how to transforms rows into XML elements when dealing with joining multiple tables. The examples below will demonstrate the important of the columns ordering when using join.

-- Example 1.
-- This query returns the first 2 customers' orders 
-- by INNER JOIN Sales.Customers and Sales.Orders table
SELECT c.custid, c.contactname, o.orderid, o.orderdate
FROM Sales.Customers AS c
JOIN Sales.Orders AS o ON o.custid = c.custid
WHERE c.custid <= 2;

Result: The customers' orders are not ordered based on the custid. Without the ORDER BY clause, there's no guarantee of the rows ordering.













-- Example 2
--In this query, ORDER BY clause is not specified and result is transformed to XML.
SELECT c.custid, c.contactname, o.orderid, o.orderdate
FROM Sales.Customers AS c
JOIN Sales.Orders AS o ON o.custid = c.custid
WHERE c.custid <= 2
FOR XML AUTO;

Result: Notice just like, in term of ordering,  the result of this query is the same as previous query above, the customers' orders are scattered. The element from Sales.Customers table are repeated multiple time. Also the XML element names are the tables' aliases, and XML result formatted using attribute-centric presentation since ELEMENTS directive wasn't specified.
<c custid="2" contactname="Hassall, Mark">
  <o orderid="10308" orderdate="2006-09-18T00:00:00" />
  <o orderid="10625" orderdate="2007-08-08T00:00:00" />
</c>
<c custid="1" contactname="Allen, Michael">
  <o orderid="10643" orderdate="2007-08-25T00:00:00" />
  <o orderid="10692" orderdate="2007-10-03T00:00:00" />
  <o orderid="10702" orderdate="2007-10-13T00:00:00" />
</c>
<c custid="2" contactname="Hassall, Mark">
  <o orderid="10759" orderdate="2007-11-28T00:00:00" />
</c>
<c custid="1" contactname="Allen, Michael">
  <o orderid="10835" orderdate="2008-01-15T00:00:00" />
</c>
<c custid="2" contactname="Hassall, Mark">
  <o orderid="10926" orderdate="2008-03-04T00:00:00" />
</c>
<c custid="1" contactname="Allen, Michael">
  <o orderid="10952" orderdate="2008-03-16T00:00:00" />
  <o orderid="11011" orderdate="2008-04-09T00:00:00" />
</c>

-- Example 3.
--this query include an ORDER BY clause, result in tabular format.
SELECT c.custid, c.contactname, o.orderid, o.orderdate
FROM  Sales.Customers AS c
JOIN  Sales.Orders AS o ON o.custid = c.custid
WHERE c.custid <= 2
ORDER BY c.custid, o.orderid;
Result: Notice that all order are listed in ascending order based on custid and orderid.














-- Example 4.
--this query include an ORDER BY clause and output result as XML.
SELECT c.custid, c.contactname, o.orderid, o.orderdate
FROM Sales.Customers AS c
JOIN Sales.Orders AS o ON o.custid = c.custid
WHERE c.custid <= 2
ORDER BY c.custid, o.orderid
FOR XML AUTO;
Result: since we list column from table Sales.Customers first in the SELECT and ORDER BY clause, Sales.Customers is the parent. Thus all elements from Sales.Orders are the children and will be grouped under Sales.Customers's elements. In this case a customer can have one or many orders.
<c custid="1" contactname="Allen, Michael">
  <o orderid="10643" orderdate="2007-08-25T00:00:00" />
  <o orderid="10692" orderdate="2007-10-03T00:00:00" />
  <o orderid="10702" orderdate="2007-10-13T00:00:00" />
  <o orderid="10835" orderdate="2008-01-15T00:00:00" />
  <o orderid="10952" orderdate="2008-03-16T00:00:00" />
  <o orderid="11011" orderdate="2008-04-09T00:00:00" />
</c>
<c custid="2" contactname="Hassall, Mark">
  <o orderid="10308" orderdate="2006-09-18T00:00:00" />
  <o orderid="10625" orderdate="2007-08-08T00:00:00" />
  <o orderid="10759" orderdate="2007-11-28T00:00:00" />
  <o orderid="10926" orderdate="2008-03-04T00:00:00" />
</c>

-- Example 5.
-- This query, we swap the ordering of the columns by specifying columns 
--  in the Sales.Orders table first in both SELECT and ORDER BY clause
SELECT o.orderid, o.orderdate, c.custid, c.contactname 
FROM Sales.Customers AS c
JOIN Sales.Orders AS o ON o.custid = c.custid
WHERE c.custid <= 2
ORDER BY o.orderid, c.custid
FOR XML AUTO;
Result:
In this case, the parents elements are from Sales.Orders, and all elements from Sales.Customers are the children elements. The XML fragment is not well formatted compared to the result of example 4.
<o orderid="10308" orderdate="2006-09-18T00:00:00">
  <c custid="2" contactname="Hassall, Mark" />
</o>
<o orderid="10625" orderdate="2007-08-08T00:00:00">
  <c custid="2" contactname="Hassall, Mark" />
</o>
<o orderid="10643" orderdate="2007-08-25T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="10692" orderdate="2007-10-03T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="10702" orderdate="2007-10-13T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="10759" orderdate="2007-11-28T00:00:00">
  <c custid="2" contactname="Hassall, Mark" />
</o>
<o orderid="10835" orderdate="2008-01-15T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="10926" orderdate="2008-03-04T00:00:00">
  <c custid="2" contactname="Hassall, Mark" />
</o>
<o orderid="10952" orderdate="2008-03-16T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="11011" orderdate="2008-04-09T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>

-- Example 6.
-- This query, we swap the ordering of the column by specifying columns 
--  in the Sales.Orders table first in ORDER BY clause
SELECT c.custid, c.contactname, o.orderid, o.orderdate
FROM Sales.Customers AS c
JOIN Sales.Orders AS o ON o.custid = c.custid
WHERE c.custid <= 2
ORDER BY o.orderid, c.custid
FOR XML AUTO;
Resutl:
This XML output is similar as in Example 2. Elements from Sales.Customers are repeated multiple time.
<c custid="2" contactname="Hassall, Mark">
  <o orderid="10308" orderdate="2006-09-18T00:00:00" />
  <o orderid="10625" orderdate="2007-08-08T00:00:00" />
</c>
<c custid="1" contactname="Allen, Michael">
  <o orderid="10643" orderdate="2007-08-25T00:00:00" />
  <o orderid="10692" orderdate="2007-10-03T00:00:00" />
  <o orderid="10702" orderdate="2007-10-13T00:00:00" />
</c>
<c custid="2" contactname="Hassall, Mark">
  <o orderid="10759" orderdate="2007-11-28T00:00:00" />
</c>
<c custid="1" contactname="Allen, Michael">
  <o orderid="10835" orderdate="2008-01-15T00:00:00" />
</c>
<c custid="2" contactname="Hassall, Mark">
  <o orderid="10926" orderdate="2008-03-04T00:00:00" />
</c>
<c custid="1" contactname="Allen, Michael">
  <o orderid="10952" orderdate="2008-03-16T00:00:00" />
  <o orderid="11011" orderdate="2008-04-09T00:00:00" />
</c>

-- Example 7.
-- This query, we swap the ordering of the column
-- by specifying columns in the Sales.Orders table first in SELECT clause
SELECT o.orderid, o.orderdate, c.custid, c.contactname
FROM Sales.Customers AS c
JOIN Sales.Orders AS o ON o.custid = c.custid
WHERE c.custid <= 2
ORDER BY c.custid, o.orderid
FOR XML AUTO;
Result:
The result is not the same as example 4, even though the Sales.Customer.custid is listed first in the ORDER BY clause. Thus the columns order in the SELECT clause will also effect the XML output.
<o orderid="10643" orderdate="2007-08-25T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="10692" orderdate="2007-10-03T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="10702" orderdate="2007-10-13T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="10835" orderdate="2008-01-15T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="10952" orderdate="2008-03-16T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="11011" orderdate="2008-04-09T00:00:00">
  <c custid="1" contactname="Allen, Michael" />
</o>
<o orderid="10308" orderdate="2006-09-18T00:00:00">
  <c custid="2" contactname="Hassall, Mark" />
</o>
<o orderid="10625" orderdate="2007-08-08T00:00:00">
  <c custid="2" contactname="Hassall, Mark" />
</o>
<o orderid="10759" orderdate="2007-11-28T00:00:00">
  <c custid="2" contactname="Hassall, Mark" />
</o>
<o orderid="10926" orderdate="2008-03-04T00:00:00">
  <c custid="2" contactname="Hassall, Mark" />
</o>


In conclusion, when output XML from joining the tables, use ORDER BY clause to group element together. This is important because if the ORDER BY clause is not specified, the XML elements are not group together. That's not all children elements are grouped under their parent element. Also, the order of the columns in the SELECT clause will influence the XML result. Recommendation: list the order of the columns in the ORDER BY and SELECT clause in one-to-many order. Example: a customer can have one or many orders.

That's all for now. Next several posts, we'll learn about XML Path Mode, XPath Expression and XQuery.

No comments:

Post a Comment