Saturday, June 28, 2014

Inner Join XML Data Using XQuery

Today we'll learn how to join XML data using XQuery.
--
-- assume that we have 2 XML documents, customers and orders.
-- load xml files to SQL Server as CLOB = Character Large Object 
DECLARE @customers XML = 
(SELECT * FROM OPENROWSET(BULK 'C:\Self_Dev\XML\customers.xml', SINGLE_CLOB) AS customers);

DECLARE @orders XML =
(SELECT * FROM OPENROWSET(BULK 'C:\Self_Dev\XML\orders.xml', SINGLE_CLOB) AS orders);

-- Concatenate XML documents using XML PATH mode.
DECLARE @customers_orders XML = (SELECT @customers, @orders
                                 FOR XML PATH(''));
SELECT @customers_orders;
Here's the XML data that we'll use for the examples below.
--cross join
--When multiple for clauses are specified, the result is similar as a nested loops.
--In this case, the return clause is evaluated once for each of the combination of the $c and $o variable's value.
--Since there're 3 customers and 2 orders, the return clause is evaluate 3 * 2 = 6 times.
SELECT @customers_orders.query(
       'for $c in customers/customer
        for $o in orders/order
        return $c');
--Inner Join.
--In this example: the join conditions are specified in the where clause like SQL ANSI-89 join syntax.
--The non matching nodes of the cross join result are filtered.
--This query returns all customers that have at least one order
SELECT @customers_orders.query(
        'for $c in customers/customer
         for $o in orders/order
         where $c/@custid = $o/custid
         return $c');
--
-- what if we only want the customer's first and last name?.
--all customers' first-name and last-name that have at least one orders
--this won't work. Error: XQuery [query()]: The variable '$c' was not found in the scope in which it was referenced.
SELECT @customers_orders.query('
       for $c in customers/customer,                                 
       $o in orders/order                           
       where $c/@custid = $o/custid                             
       return $c/first-name, $c/last-name');
--
-- by using a pair of () the return clause return a sequence.
--all customers' first-name and last-name that have at least one orders
SELECT @customers_orders.query('
       for $c in customers/customer,                                 
       $o in orders/order                           
       where $c/@custid = $o/custid                             
       return ($c/first-name, $c/last-name)');
-- 
-- In this example, we use xml constructor in the return clause to construct the output.
--all customers first-name and last-name that have at least one orders
SELECT @customers_orders.query(
      'for $c in customers/customer,
       $o in orders/order
       where $c/@custid = $o/custid
       return 
       <customer>
         {$c/first-name}
         {$c/last-name}
       </customer>');

--
--all customers first-name, last-name, and orders info
--error: XML well-formedness check: Attribute cannot appear outside of element declaration. 
--       Rewrite your XQuery so it returns well-formed XML.
SELECT @customers_orders.query(
      'for $c in customers/customer                                
       for $o in orders/order
       where $c/@custid = $o/custid
       return
       <order>
         {$c/first-name}
         {$c/last-name}
         {$o/@orderid}
         {$o/orderdate}
       </order>');

--
--all customers' first-name, last-name, and orders info
--to fix the error above, since orderid is an attribute, list {$o/@orderid} first in the return clause
SELECT @customers_orders.query(
      'for $c in customers/customer
       for $o in orders/order
       where $c/@custid = $o/custid
       return
       <order>
         {$o/@orderid}
         {$c/first-name}
         {$c/last-name}
         {$o/orderdate}
       </order>');

--all customers' first-name, last-name, and orders info
--another option, include a tag for orderid
SELECT @customers_orders.query(
      'for $c in customers/customer,
       $o in orders/order
       where $c/@custid = $o/custid
       return
       <order>
        {$c/first-name}
        {$c/last-name} 
        {$o/@orderid}
        {$o/orderdate}
        </order>');
-- another option:
-- specify the join conditions in the for clause
SELECT @customers_orders.query(
      'for $c in //customers/customer
       for $o in //orders/order[custid = $c/@custid]
       return 
       <order>
         {$o/@orderid}
         {$c/first-name}
         {$c/last-name}
         {$o/orderdate}
      </order>');
In XQuery FLWOR expression, the join conditions can be specified in the where clause. This is similar to SQL ANSI-89 join syntax. Also join conditions can be part of the for clause predicate as shown in example above.

No comments:

Post a Comment