Explanations to the XML data type


The cornerstone of SQL Server 2005 and SQL Server 2008 XML functionality is the xml large object (LOB) data type. This data type provides a lot of native XML functionality through its various built-in methods, it can be validated via an XML schema, and its contents can be indexed with XML indexes. When you store XML data using SQL Server's xml data type, SQL Server converts it to an internal binary representation based on the XQuery/XPath Data Model (XDM) recommendation. The XDM recommendation defines a hierarchical representation for XML data. When you retrieve XML data stored as the xml data type, SQL Server literally re-creates the XML text based on the XDM instance contents. The XDM allows for several changes to be made to the textual representation of the XML data, including the following:

The DTD is used to expand entities in the XML data, and then the DTD is stripped from the XML data. The DTD isn't stored with the XML data. See the answer to "What's a DTD?"
CDATA sections are expanded and the CDATA markers are stripped from the XML data.
Insignificant whitespace (spaces, tabs, linefeeds, and carriage returns between markup tags) is removed.
Entities are expanded to their textual representations. In addition, typed data, such as integers, is stored in an internal binary format. An integer such as 00150 can be stored in a numeric representation, returning the textual representation 150 (creating a typed representation of your XML data requires storing the data in a typed xml instance).

If you need to store exact copies of the text of your XML data with DTD, CDATA sections, insignificant whitespace, entities, and exact character-for-character representations of data, store your XML data using a character or binary data type such as varbinary(max), varchar(max), or nvarchar(max). The downside to this method of storing XML data in SQL Server is that you will not have access to the xml data type methods that allow you to query and manipulate your XML data. SQL Server also strips away insignificant whitespace. Insignificant whitespace consists of tab characters, spaces, linefeeds, and carriage returns between XML markup tags. By default, when SQL Server converts your XML data to an internal XDM representation, it strips away insignificant whitespace. You can use the CONVERT function with 3 specified for the style option to convert your text-based XML data to an xml instance and preserve insignificant whitespace in the process. This conversion using CONVERT with style 3 isn't 100 percent exact, and some whitespace characters may not be preserved in the conversion process. XML supports different character encodings, such as ISO-8859-1 or UTF-8 for single-byte character encodings and UTF-16 for Unicode character representations. SQL Server replaces the Chinese characters with question marks to indicate that it couldn't convert them to a single-byte encoding. The fix for this problem is simple: use nvarchar as the source when you need to represent character data that requires Unicode encoding. The problem is that the character data you are converting to XML is single-byte varchar data, whereas the prolog specifies a Unicode UTF-16 encoding. SQL Server's XML parser can't reconcile this conflict. You can fix this by changing your source XML data to nvarchar Unicode data. I recommend always using Unicode, or binary format with the proper encoding specified in the prolog, to accurately represent your XML data and prevent conversion problems.

Sometimes you might want to grab a single scalar value from your XML data instead of entire XML nodes. The xml data type .value() method provides this functionality. As with the .query() method, you pass an XQuery query to the .value() method. You also need to pass a T-SQL data type to the method. After SQL Server retrieves the scalar value from your XML data, it converts it to a T-SQL data type.

Legal Disclaimer

Our website is not responsible for the information contained by this article. Articleinput.com is a free articles resource thus practically any visitor can submit an article. However if you notice any copyrighted material, please contact us and we will remove the article(s) in discussion right away.

Note: This article was sent to us by: Eugene P. Hart at 01092010

Related Articles

1. Associative arrays in Python are dictionaries
This article discusses dictionaries, Python's name for associative arrays, which it implements using hash tables. Dictionaries are amazingly us...

2. What can be used as a key in Python
Python permits more than just strings to be used in this manner. Any Python object that is immutable and hashable can be used as a key to a dic...

3. Definitions and uses of functions in Python
This article assumes you're familiar with function definitions in at least one other computer language and with the concepts that correspond to...

4. Python functions handle variable numbers of arguments
Python functions can also be defined to handle variable numbers of arguments. You can do this two different ways. One way handles the relativel...

5. Lambda expressions and generator functions in Python
Short functions like those you just saw can also be defined using lambda expressions of the form. Lambda expressions are anonymous little funct...

6. How to create a basic program in Python
Up until now, you've been using the Python interpreter mainly in interactive mode. For production use, you'll want to create Python programs or...

7. How to make Python script execution in UNIX and Mac OS X and in Windows
If you're on UNIX, you can easily make a script directly executable. Note that if Python 3.x isn't your default version of Python, you may need...

8. The difference between scripts on Windows scripts on UNIX
The way you call scripts on Windows differs from the way scripts are called on Linux/ UNIX, and that difference can affect what kind of scripts...

9. Python applications are distributed as source files
You can distribute your scripts as source files (as .py files). You can also ship them as byte code (as .pyc or .pyo files). A byte code file w...

10. How to use the Python file system
Working with files involves one of two things: basic I/O and working with the filesystem (for example, naming, creating, moving, or referring t...