Saturday, June 28, 2014

Inner Join XML Data Using XQuery

Today we'll learn how to join XML data using XQuery.
--
-- assume that we have 2 XML documents, customers and orders.
-- load xml files to SQL Server as CLOB = Character Large Object 
DECLARE @customers XML = 
(SELECT * FROM OPENROWSET(BULK 'C:\Self_Dev\XML\customers.xml', SINGLE_CLOB) AS customers);

DECLARE @orders XML =
(SELECT * FROM OPENROWSET(BULK 'C:\Self_Dev\XML\orders.xml', SINGLE_CLOB) AS orders);

-- Concatenate XML documents using XML PATH mode.
DECLARE @customers_orders XML = (SELECT @customers, @orders
                                 FOR XML PATH(''));
SELECT @customers_orders;
Here's the XML data that we'll use for the examples below.
--cross join
--When multiple for clauses are specified, the result is similar as a nested loops.
--In this case, the return clause is evaluated once for each of the combination of the $c and $o variable's value.
--Since there're 3 customers and 2 orders, the return clause is evaluate 3 * 2 = 6 times.
SELECT @customers_orders.query(
       'for $c in customers/customer
        for $o in orders/order
        return $c');
--Inner Join.
--In this example: the join conditions are specified in the where clause like SQL ANSI-89 join syntax.
--The non matching nodes of the cross join result are filtered.
--This query returns all customers that have at least one order
SELECT @customers_orders.query(
        'for $c in customers/customer
         for $o in orders/order
         where $c/@custid = $o/custid
         return $c');
--
-- what if we only want the customer's first and last name?.
--all customers' first-name and last-name that have at least one orders
--this won't work. Error: XQuery [query()]: The variable '$c' was not found in the scope in which it was referenced.
SELECT @customers_orders.query('
       for $c in customers/customer,                                 
       $o in orders/order                           
       where $c/@custid = $o/custid                             
       return $c/first-name, $c/last-name');
--
-- by using a pair of () the return clause return a sequence.
--all customers' first-name and last-name that have at least one orders
SELECT @customers_orders.query('
       for $c in customers/customer,                                 
       $o in orders/order                           
       where $c/@custid = $o/custid                             
       return ($c/first-name, $c/last-name)');
-- 
-- In this example, we use xml constructor in the return clause to construct the output.
--all customers first-name and last-name that have at least one orders
SELECT @customers_orders.query(
      'for $c in customers/customer,
       $o in orders/order
       where $c/@custid = $o/custid
       return 
       <customer>
         {$c/first-name}
         {$c/last-name}
       </customer>');

--
--all customers first-name, last-name, and orders info
--error: XML well-formedness check: Attribute cannot appear outside of element declaration. 
--       Rewrite your XQuery so it returns well-formed XML.
SELECT @customers_orders.query(
      'for $c in customers/customer                                
       for $o in orders/order
       where $c/@custid = $o/custid
       return
       <order>
         {$c/first-name}
         {$c/last-name}
         {$o/@orderid}
         {$o/orderdate}
       </order>');

--
--all customers' first-name, last-name, and orders info
--to fix the error above, since orderid is an attribute, list {$o/@orderid} first in the return clause
SELECT @customers_orders.query(
      'for $c in customers/customer
       for $o in orders/order
       where $c/@custid = $o/custid
       return
       <order>
         {$o/@orderid}
         {$c/first-name}
         {$c/last-name}
         {$o/orderdate}
       </order>');

--all customers' first-name, last-name, and orders info
--another option, include a tag for orderid
SELECT @customers_orders.query(
      'for $c in customers/customer,
       $o in orders/order
       where $c/@custid = $o/custid
       return
       <order>
        {$c/first-name}
        {$c/last-name} 
        {$o/@orderid}
        {$o/orderdate}
        </order>');
-- another option:
-- specify the join conditions in the for clause
SELECT @customers_orders.query(
      'for $c in //customers/customer
       for $o in //orders/order[custid = $c/@custid]
       return 
       <order>
         {$o/@orderid}
         {$c/first-name}
         {$c/last-name}
         {$o/orderdate}
      </order>');
In XQuery FLWOR expression, the join conditions can be specified in the where clause. This is similar to SQL ANSI-89 join syntax. Also join conditions can be part of the for clause predicate as shown in example above.

Tuesday, June 24, 2014

Import & Concatenate XML Documents in SQL Server

To prepare for next blog post (XML Inner Join using XQuery), today, we'll learn how to import and concatenate XML documents in SQL Server.
USE tempdb;
GO

-- Import XML documents as CLOB = Character Large Object. 
-- Note: AS of SQL 2012, XML data type can hold 2GB of data max. 
DECLARE @customers XML = 
(SELECT * FROM OPENROWSET(BULK 'C:\Self_Dev\XML\customers.xml', SINGLE_CLOB) AS customers);

DECLARE @orders XML =
(SELECT * FROM OPENROWSET(BULK 'C:\Self_Dev\XML\orders.xml', SINGLE_CLOB) AS orders);

SELECT @customers AS customers, @orders AS orders;
-- Concatenate XML document.
-- This query will fails. Operand data type xml is invalid for add operator.
DECLARE @customers_orders XML = @customers + @orders;


-- Concatenate XML document using XML PATH mode.
-- Add the default row tag as root node
DECLARE @customers_orders2 XML = (SELECT @customers, @orders
                                 FOR XML PATH);
SELECT @customers_orders;

-- Concatenate XML document,
-- an empty string as argument in the PATH() will remove the default row tag  
DECLARE @customers_orders3 XML = (SELECT @customers, @orders
                                 FOR XML PATH(''));
SELECT @customers_orders3;

-- Concatenate XML document,
-- and provide a specific root node.
DECLARE @customers_orders4 XML = (SELECT @customers, @orders
                                 FOR XML PATH('customersorders'));
SELECT @customers_orders4;
GO
Result for @customers_orders4
--
-- Another option: using VARCHAR data type. 
DECLARE @customers VARCHAR(MAX) = 
(SELECT * FROM OPENROWSET(BULK 'C:\Self_Dev\XML\customers.xml', SINGLE_CLOB) AS customers);

DECLARE @orders VARCHAR(MAX) =
(SELECT * FROM OPENROWSET(BULK 'C:\Self_Dev\XML\orders.xml', SINGLE_CLOB) AS orders);

DECLARE @customers_orders XML = @customers + @orders;

SELECT @customers_orders;
GO


Next couple posts, we'll learn how inner and outer join work in XQuery.

Reference:
OPENROWSET
XML PATH mode
XML Data Type
LOBs, BLOBs, CLOBs

Friday, June 13, 2014

XQuery FLWOR Expression Part II

In this post we will learn how to filter and sort data in XQuery when querying XML.
-- let's assume that we have a table contain owners
-- and all the detail about the stores that they own.
USE TEMPDB;
GO
IF OBJECT_ID(N'#business', 'U') IS NOT NULL
BEGIN 
    DROP TABLE #business;
END
GO

CREATE TABLE #business (
  OwnerId INT NOT NULL,
  OwnerName VARCHAR(50) NOT NULL,
  stores XML --note: this is an XML data type
);
GO
--generate some XML data
DECLARE @bookstore1 XML = 
N'
<stores>
  <bookstore specialty="novel" address_city="Seattle">
    <book style="autobiography">
      <title>Create a Vision</title>
      <author>
        <first-name>Joe</first-name>
        <last-name>Smith</last-name>
      </author>
      <price>12</price>
    </book>
      <book style="autobiography">
      <title>Face the Sun</title>
      <author>
        <first-name>Joe</first-name>
        <last-name>Cantrell</last-name>
      </author>
      <price>20</price>
    </book>
  </bookstore>
  <bookstore specialty="science" address_city="London">
    <book style="textbook">
      <title>XML Reference</title>
      <author first-name="Mary" last-name="Jane" />
      <price>55</price>
    </book>
    <book style="novel" id="myfave">
      <title>Paths to Success</title>
      <author>
        <first-name>Toni</first-name>
        <last-name>Wilson</last-name>
      </author>
      <price intl="Canada" exchange="0.7">6.50</price>
    </book>
  </bookstore>
</stores>';

DECLARE @bookstore2 XML = 
N'
<stores>
  <bookstore specialty="computing" address_city="Seattle">
    <book style="textbook">
      <title>Advanced Game Testing</title>
      <author>
        <first-name>Sam</first-name>
        <last-name>Heng</last-name>
      </author>
      <price>20</price>
    </book>
    <book style="textbook">
      <title>XML Basic</title>
      <author>
        <first-name>Najwa</first-name>
        <last-name>Ebanks</last-name>
      </author>
      <price>30</price>
    </book>
  </bookstore>
</stores>';
--insert xml data to the table.
INSERT INTO #business(OwnerId, OwnerName, stores)
VALUES (1, 'Hai Ton', @bookstore1)
      ,(2, 'Phuong Tran', @bookstore2);

-- all books with price <= 20
-- note since there's no schema XML specified for the XML documents,
-- the price is treated as string.
-- Implicit casting occurs for the price element when the predicate $book/price <= 20 evaluates.
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     where $book/price <= 20
                     return $book') AS books
FROM #business;
-- Filter with multiple predicates
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     where $book/price <= 20 and $book/author/first-name = "Toni"
                     return $book')
FROM #business;
-- filter using attribute
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     where  $book/@style = "textbook"
                     return $book')
FROM #business;
-- sort
-- this query will returns an error.
-- order by clause only works with a sequence of one item (a singleton),
--  thus we need to specify a number inside a [].
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     where $book/price <= 20
                     order by $book/price
                     return $book')
FROM #business;
Msg 2389, Level 16, State 1, Line 8 XQuery [#business.stores.query()]: 'order by' requires a singleton (or empty sequence), found operand of type 'xdt:untypedAtomic *'
-- all books with price <= 20 and sort books in ascending order by price.
-- Default sorting order is ascending. Specify descending keyword to revert sort order.
--   example: order by $price[1] descending
-- Since there's no XML schema specified for the XML documents,
--  the price is treated string thus: 6.50 > 20 > 12
-- Note: the order by clause only works with a sequence of one item (a singleton),
--  thus we need to specify a number inside a [].
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     where $book/price <= 20
                     order by $book/price[1]
                     return $book')
FROM #business;

-- to work around cast price to a numeric type then sort.
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     where $book/price <= 20
                     order by xs:double($book/price[1])
                     return $book')
FROM #business;
-- specify multiple sorting criteria
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     where $book/price <= 20
                     order by $book/@style[1] descending, xs:double($book/price[1]) 
                     return $book')
FROM #business;



Reference:
http://www.w3.org/TR/2014/REC-xquery-30-20140408/
http://msdn.microsoft.com/en-us/library/ms189075%28v=sql.110%29.aspx
http://technet.microsoft.com/en-us/library/ms345122%28v=sql.90%29.aspx
http://msdn.microsoft.com/en-us/library/ms190945.aspx
http://shop.oreilly.com/product/9780596006341.do

Tuesday, June 10, 2014

XQuery FLWOR Expression

In previous several posts, I provided some examples querying XML data mainly using XPath Expressions. Today we'll learn about the FLWOR expressions. The FLWOR expression is the core of the XQuery language. With FLWOR expressions, we can create more complex queries. Also we can sort the result set.
FLWOR(for, let, where, order by, return). Below is a brief description for these 5 clauses.

for: Contains variables to hold each value in the sequence. This set up iterations. Similar like, but not exactly the same, foreach loop in C#. The key different is the iteration in the FLWOR expression is not processed in ordered.
let: contains variables to hold a sequence. The let clause perform the assignment of a sequence to a variable. Only evaluate once (no iteration).
where: filter result. Similar to WHERE clause in SQL. (optional)
order by: sort the sequence. Similar to ORDER BY in SQL. Order by clause requires a singleton or an empty sequence. (optional)
return: returns result. Similar to SELECT clause in SQL. The return clause is evaluated once for each iteration in the for clause. (required)
Note: along with the return clause, a FLWOR expression requires at least one for or one let clause.
Let's go over some examples. Since we're using SQL Server to query our XML data, we'll need to use the query() method in SQL Server. XQuery is case sensitive.
-- let's assume that we have a table contain owners
-- and all the detail about the stores that they own.
USE TEMPDB;
GO
IF OBJECT_ID(N'#business', 'U') IS NOT NULL
BEGIN 
    DROP TABLE #business;
END
GO

CREATE TABLE #business (
  OwnerId INT NOT NULL,
  OwnerName VARCHAR(50) NOT NULL,
  stores XML --note: this is an XML data type
);
GO
--generate some XML data
DECLARE @bookstore1 XML = 
N'
<stores>
  <bookstore specialty="novel" address_city="Seattle">
    <book style="autobiography">
      <title>Create a Vision</title>
      <author>
        <first-name>Joe</first-name>
        <last-name>Smith</last-name>
      </author>
      <price>12</price>
    </book>
      <book style="autobiography">
      <title>Face the Sun</title>
      <author>
        <first-name>Joe</first-name>
        <last-name>Cantrell</last-name>
      </author>
      <price>20</price>
    </book>
  </bookstore>
  <bookstore specialty="science" address_city="London">
    <book style="textbook">
      <title>XML Reference</title>
      <author first-name="Mary" last-name="Jane" />
      <price>55</price>
    </book>
    <book style="novel" id="myfave">
      <title>Paths to Success</title>
      <author>
        <first-name>Toni</first-name>
        <last-name>Wilson</last-name>
      </author>
      <price intl="Canada" exchange="0.7">6.50</price>
    </book>
  </bookstore>
</stores>';

DECLARE @bookstore2 XML = 
N'
<stores>
  <bookstore specialty="computing" address_city="Seattle">
    <book style="textbook">
      <title>Advanced Game Testing</title>
      <author>
        <first-name>Sam</first-name>
        <last-name>Heng</last-name>
      </author>
      <price>20</price>
    </book>
    <book style="textbook">
      <title>XML Basic</title>
      <author>
        <first-name>Najwa</first-name>
        <last-name>Ebanks</last-name>
      </author>
      <price>30</price>
    </book>
  </bookstore>
</stores>';
--insert xml data to the table.
INSERT INTO #business(OwnerId, OwnerName, stores)
VALUES (1, 'Hai Ton', @bookstore1)
      ,(2, 'Phuong Tran', @bookstore2);
-- all books
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     return $book')  AS books
FROM #business;

-- all books
-- note the syntax for let clause. It uses := not =
SELECT OwnerName,
       stores.query('let $book := stores/store/book
                     return $book') AS books
FROM #business;

-- all books
-- notice, in this XQuery, the return clause doesn't use anything from the let clause
SELECT OwnerName,
       stores.query('let $i := something
                     return stores/store/book') AS books
FROM #business;
--this xquery result in an error.
--A FLWOR expression requires at least one for or one let clause 
SELECT OwnerName,
       stores.query('return stores/store/book') AS books
FROM #business;
Result: Msg 2209, Level 16, State 1, Line 3 XQuery [#business.stores.query()]: Syntax error near 'return'
-- all books price
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     let $price := $book/price
                     return $price') AS BooksPrice
FROM #business;
-- all books price with title
-- notice we can specify multiple variables for the let clause,
-- and use a pair of () when return multiple sequences
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     let $price := $book/price
                     let $title := $book/title
                     return ($title, $price)') AS BooksPrice
FROM #business;
-- What if we want the output to be like this:
-- 
--   abc
--   12
-- 
-- this query only return $title and $price as string, not the values.
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     let $price := $book/price
                     let $title := $book/title
                     return ($title, $price)') AS BooksPrice
FROM #business;
--use the curly brace {} to get the variables' values.
--However, this query still doesn't provide the correct output.
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     let $price := $book/price
                     let $title := $book/title
                     return ({$title}, {$price})') AS BooksPrice
FROM #business;
--this query should provide the result in the format below. 
-- 
--   abc
--   12
-- 
--notice, we removed the comma between $title and $price variable.
SELECT OwnerName,
       stores.query('for $book in stores/store/book
                     let $price := $book/price
                     let $title := $book/title
                     return ({$title} {$price})') AS BooksPrice
FROM #business;


That's all for now. Next post we'll learn how to use the where and order by clause to filter and sort result.

Reference:
http://www.w3.org/TR/2014/REC-xquery-30-20140408/
http://msdn.microsoft.com/en-us/library/ms189075%28v=sql.110%29.aspx
http://technet.microsoft.com/en-us/library/ms345122%28v=sql.90%29.aspx
http://msdn.microsoft.com/en-us/library/ms190945.aspx
http://shop.oreilly.com/product/9780596006341.do