1. This site uses cookies. By continuing to use this site, you are agreeing to our use of cookies. Learn More.

Manipulating XML Data

Discussion in 'SQL Server' started by MinalS, Dec 16, 2014.

  1. It is a markup language that is used for describing the data structure in a hierarchical manner. The data can be saved and retrieved in an XML format in SQL Server. Several database applications are supported.

    Storing XML Data in a Table



    The data in an XML format can be XML fragments or XML document. The top level element is not available.

    The following code shows the XML document.

    Code:
    <studentinfo>
    <student ID=101>
        <studname>Ajay</studname>
    <student>
    <student ID=102>
        <studname>Jay</studname>
    </studentinfo>
    
    
    The code is executed and the following output is displayed to the user.

    Code:
    
    <student ID=101><studname>Ajay</studname></student>
    
    
    When the user wants to save the XML data into its original format, the XML data type is used. The variables and tables of the XML data type can be created. The XML data can be shred and the values are stored in different columns. The process of converting the XML data into a rowset is known as shredding.

    The XML data can be stored in the following two types in SQL Server.
    1. Storing the XML data in a rowset
    2. Storing the data in an XML column

    1. Storing XML data in a rowset



    There are requirements when the user needs to store the data in XML format. The data is stored in the database table. The XML data is shred. The OPENXML function is used for shredding the data.

    The steps for shredding the XML document is as mentioned below:
    1. Parse the XML document
    2. The rowset is retrieved from the tree
    3. The data is stored from the rowset
    4. The memory is cleared
    The following diagram demonstrates the shredding of the XML document.

    [​IMG]

    In the above diagram, the sp_xml_preparedocument procedure is called for parsing the document. A handle value is returned to the user. The DOM known as Document Object Model tree is used for the representation. The handle is passed to the OPENXML. The OPENXML then creates a rowset view of the document depending on the parameters. The sp_xml_removedocument stored procedure is used for the memory clean up.

    1. Parse the XML Document

    The SQL Server has the xml_preparedocument stored procedure is used for parsing the XML document. The MSXML parser is used for parsing the xml document. The XML data is validated with the DTD or schema structure. The document is an internal tree presentation of nodes as elements, attributes and comments.

    sp_xml_preparedocument is used for returning the handle used for accessing the internal representation of the XML document. The handle can be used before the sp_xml_removedocument stored procedure is called.

    2. The rowset is retrieved from the tree

    Once user has checked for the structure and correctness of the data, the variables are to be added in XML type. The OPENXML function is used for creating the in-memory rowset from the parsed input.

    The syntax for OPENXML function is as shown below:
    Code:
    OPENXML ( idoc int [ in ] , rowpattern nvarchar [ in ], [ flags byte [ in ] ] )
    [ WITH ( SchemaDeclaration | TableName ) ]
    
    Where,
    • idoc is the document handle for the internal representation of the XML document.
    • rowpattern is the XPATH pattern for identifying the nodes to be used as rows.
    flags is used for mapping between the XML data and the rowset.

    The values that can be used as the parameter are as mentioned below:
    1. 0: It is used when default mapping is used
    2. 1: It is used for retrieving the attribute values
    3. 2: It is used for retrieving the element values
    4. 3: It is used for attribute and element values
    SchemaDeclaration is the rowset schema declaration for the columns to be returned

    TableName is the name of the table used instead of the schema

    3. The data is stored from the rowset

    The rowset created using the OPENXML is used for storing the data in the similar manner for other rowset. The rowset data can be inserted into the permanent tables.

    4. The memory is cleared

    Once the data is saved into the database, user can release the memory where the rowset is stored. The sp_xml_removedocument stored procedure is used for clearing the memory.

    Consider an example to demonstrate the employee details. The data is available to the user in XML format.

    Code:
    
    DECLARE @Doc int
    DECLARE @XMLDOC nvarchar(1000)
    SET @XMLDOC = N’<ROOT>
        <Employee EmpID=”E001” EmpName=”Mick” >
            <EmpInfo RoleID=”R101” Role=”Manager” />
            <EmpInfo RoleID=”R102” Role=”Test Lead” /></Emplnfo>
        </Employee>
    
        <Employee EmpID=”E002” EmpName=”Harry” >
            <EmpInfo RoleID=”R103” Role=”Trainer” /></Emplnfo>
        </Employee>
    
    For displaying the XML document, the following statement is executed.

    Code:
    EXEC sp_xml_preparedocument @Doc OUTPUT, @XMLDOC
    
    The following statement is used for adding the values in the table. The OPENXML function is as shown below:

    Code:
    
    SELECT * INTO empdata
    FROM OPENXML ( @Doc, ‘/ROOT/Employee’, 1 )
    WITH ( EmpID varchar(20), EmpName varchar(50))
    
    
    In the above statement, the OPENXML function contains three parameters. The @Doc is the document handle used for saving the internal representation. The employee element is described as the second parameter. The value 1 is used for retrieving the attribute values of the employee element.

    Code:
    
    EmpID    EmpName
    E001    Mick
    E002    Harry
    
    
    The table containing the employee details is as shown below:

    2. Storing XML Data in XML Columns



    There is a need that the user wants to store the XML data in the similar format. The details for every record are saved into the XML file. The data is saved in the SQL Server. Consider the example of EmpDetails for storing the Employee data.

    Code:
    
        CREATE TABLE EmpDetails
        (
            EmpID int,
            EmpData XML
        )
    
    
    The data can be in the columns with the XML data type. The following ways can be used.
    1. Untyped XML data: It is a well formed data and is not associated with the SQL Server schema. The data is not validated but the data is saved with the XML data type. The data is well formed.
    2. Typed XML data: It is well formed data associated with the schema defining the elements and attributes. The namespace for the data is defined. The XML data is validated against the schema before saving in XML format. The corresponding data type based on the schema is defined.
    Untyped XML data

    For storing XML data, the user can use the columns or variables with the XML data type. The INSERT INTO statement is used for storing the values in the table.

    Code:
    
    INSERT INTO EmpDetails VALUES ( 1, ‘<Emp Name = “Nick” Country=”UK” />’)
    
    
    In the above code snippet, the string value containing the XML fragment is converted implicitly. The CAST or CONVERT function are used for the conversion.

    The following statement shows the conversion of the data before adding to the table.

    Code:
    
    INSERT INTO EmpDetails VALUES ( 1, convert ( XML, ‘< Emp Name=”Nick” Country=”UK” />’) )
    
    
    Typed XML data

    For storing the typed XML data, the XML schema is registered in the schema collection objects in the database. One or more XML schemas can be stored using the collection. The code for creating the XML schema collection is as shown below:

    Code:
    
    CREATE XML SCHEMA COLLECTION <Name> as Expression
    
    
    Where,

    Name is used as an identifier used for the identification of the schema collection

    Expression is the XML value containing one or more XML Schema documents.

    Consider an example for employee details declared in the schema.

    Code:
    
    <schema xmlns=”http://www.w3.org/2001/XMLSchema “>
        <element name=”EmpName” type=”string” />
        <element name=”Country” type=”string” />
    </schema>
    
    
    User can use the following statement for registering the schema with the database.

    Code:
    
    CREATE XML SCHEMA COLLECTION EmployeeSchemaCollection AS 
    ‘<schema xmlns=”http://www.w3.org/2001/XMLSchema”>
        <element name=”EmpName” type=”string” />
        <element name=”Country” type=”string” />
    </schema>’
    
    
    In the above statement, the XML Schema collection object is created.

    Once the schema is registered, the schemas to validate the typed XML values before the records are inserted in the tables. The XML data is to be declared before adding the XML data.

    Code:
    CREATE TABLE StudDetails
    (
        StudID int,
        StudDetail XML ( StudentSchema Collection )
    )
    
    The following statement is used for adding the data into the StudDetails table.

    Code:
    INSERT INTO StudDetails VALUES ( 1, ‘< StudName> John </StudName><Class>X</Class>’)
    

    Retrieving table data into XML format



    There are certain situations when user wants the relational data to be selected from the XML table. The data is extracted in the form of well formed XML fragments. The following clauses are used for retrieving the data in XML format.
    1. 1. The FOR XML clause in the SELECT statement
    2. 2. XQuery
    1. The FOR XML clause in the SELECT statement

    The SELECT statement is used for extracting the data from the relational tables in the XML format along with the FOR XML clause.

    The following modes are useful for retrieving the XML data using the FOR XML clause.
    1. RAW
    2. AUTO
    3. PATH
    4. EXPLICIT
    1. The RAW Mode

    The RAW Mode is used for returning the XML file in which every row is representing an XML element. Every element has an element with element name, row. Every column value that is not NULL is mapped to an attribute with the similar column name.

    Consider the following query to display the data using the RAW mode.

    Code:
    select empid, salary from empdetails where empid=2 or empid=4 for xml raw
    
    The output displayed after performing the query is as shown below:

    [​IMG]

    2. The AUTO Mode

    The AUTO mode is used for returning the query results as nested XML elements. Every column value that is NOT NULL is mapped to an attribute containing the column or alias. The element to which these attributes belong to or the table alias is used in the SELECT statement.

    The query to demonstrate the auto mode is as shown below:

    Code:
    
    select empid, emprole, location from empdetails where empid=1 or empid = 2 for xml auto
    
    
    The output for the query is as shown below:

    [​IMG]

    3. The PATH Mode

    The PATH mode returns the specific values through the column names for data to be retrieved.

    The query to demonstrate the PATH mode is as shown below:

    Code:
    
    select empid, location, salary from empdetails for xml path
    
    
    The output for the query is as shown below:

    [​IMG]

    4. The EXPLICIT Mode

    The EXPLICIT mode is used for returning the XML file that retrieves the format used in the SELECT statement. Every XML statement contains the first tags as Tag and Parent. The nesting of the elements is controlled through the Parent tag. The tag number of the parent element of the current element is present. The value is set to 0 or NULL.

    The following format is used for writing the EXPLICIT mode queries.

    Code:
    
    ElementName ! TagNumber ! AttributeName ! Directive
    
    
    Where,

    ElementName is the name of the element

    TagNumber is the unique tag value that is assigned to the element

    AttributeName is the name of the attribute

    Directive is the type of the AttributeName

    The query to demonstrate the explicit mode is as shown below:

    Code:
    
    select 1 as tag, null as parent, empid as [ emp ! 1 ! id ] from empdetails where empid = 1 or empid = 2 for XML explicit
    
    
    The output for the query is as shown below:

    [​IMG]

    Using XQuery

    The SQL Server helps the user to extract the data stored in the columns or variables through XQuery. XQuery uses the statements and functions provides by the XML data type. The specific data parts are extracted using the XQuery.

    Every XQuery statement contains two divisions as prolog and body. The namespaces are declared in the prolog section. The XML nodes to be retrieved are declared in the body section.

    The following expressions are supported in the XQuery language.
    1. for: It is used for iterating the nodes at the similar level in an XML document.
    2. order by: It is used for specifying the sequence
    3. let: It is used for declaring the variables and assigning the values
    4. return: It is used for specifying the XML returned from the statement
    5. where: It is used to specify the criteria of the data to be extracted
    The functions provided by the XML data type are used in the XQuery statement.

    1. Query: It is used to extract the XML from the XML data type. The value to be accessed is passed as a parameter.

    Consider an example to demonstrate the Query function.

    Code:
    
    INSERT INTO StudentInfo
    VALUES
    ( 1, ‘John’, 
    ‘<xml version =”1.0” ?>
    <StudentData
    xmlns=”http://schemas.newschema.com/StudentInfo”>
        <Student Class=”tenth”>
        <Location>UK</Location>
    </Student>
    
        <Student Class=”Fifth”>
        <Location>USA</Location>
    </Student>
    </StudentData>’)
    
    
    The following query is used for retrieving the id and location of the student.

    Code:
    
    SELECT id, StudentData.query
    (‘declare namespace
    ns = “http://schemas.newschema.com/StudentInfo”;
    ns:StudentData/ns:Student/ns:Location’) as ‘Info’ FROM StudentInfo
    
    
    In the above query, the query() function is used for accessing the student information.

    2. Value: It is used for returning the single value from the document. The XQuery expression is used for identifying the single node and the data type of the value.

    The following query is used for retrieving the class of the student.

    Code:
    
    SELECT StudentData.query
    (‘declare namespace 
    ns=”http://schemas.newschema.com/StudentInfo”;
    ( ns: StudentData/ns: Student/ns: Class) [ 1 ]’) ClassValue FROM StudentInfo
    
    
    In the above query, path represents the index of the first class.

    3. Exist: It is used to check the existence of the node in the XML data. The value 1 is returned if the node exists and 0 if it does not exist.

    The following query is used to demonstrate the values in the specific order.

    Code:
    
    SELECT id FROM StudentInfo
    WHERE StudentData.exist 
    ( ‘declare namespace ns=”http://schemas.newschema.com/StudentInfo”;
    /ns:StudentData/ns:Student [ @Class=tenth ] ‘)
    
    

    Modifying the Data



    User needs to modify the XML data depending on the criteria. The modify function is used to specify the XQuery expression and the statement on which the modification is to be performed.

    The following functions are used for performing the modifications.

    1. Insert

    The function is used for adding nodes to the XML in an XML column or a variable.

    The following query is used to demonstrate the insert function.

    Code:
    
    update Custinfo set CustDetails.modify ( ‘insert attribute Type { “City” } as first into (/customer) [1]’ )
    
    
    The output for the query is as shown below:

    [​IMG]

    2. Replace

    User needs to update the XML data. The Replace function is used for modifying the XML data.

    Consider the following query to demonstrate the replace fuction.

    Code:
    
    update Custinfo set CustDetails.modify ( ‘replace value of (Customer/@Type [1] with “India” ‘) where CustID=2  
    
    
    The output for the query is as shown below:

    [​IMG]

    3. Delete

    It is used for removing the node from the XML data.

    The query to demonstrate the delete function is as shown below:

    Code:
    
    update Custinfo set CustDetails.modify ( ‘delete (/Customer/@City)[1]’)
    
    
    The output for the query is as shown below:

    [​IMG]
     

Share This Page