0
votes

I have a column in SQL Server that is stored as XML, here is an example:

<data>
<pages>4935</pages>
<world_id>88BDBD41-736D-4CF1-4BC9-A6BF4935E481</world_id>
<Url>https://example.com/4935</Url>
<Token>https://example.com/80d71735733e91efc74e71a878eeaee5e84935</Token>
<phone>1122334935</phone>
<email>[email protected]</email>
<alt_number>334554935</alt_number>
</data>


<data>
<book_id>493567</book_id>
<page_id>88BDBD41-736D-4CF1-4BC9-A6BFE4814935</page_id>
<work_number>1114935</work_number>
</data>

The goal is to look through all possible nodes and find the number 4935 but only when it's at the end of its node. If this is located then I want to complete node returned:

The example above should only return the following data:

<pages>4935</pages>
<Url>http://example.org/4935</Url>
<Token>http://example.net/80d71735733e91efc74e71a878eeaee5e84935</Token>
<alt_number>334554935</alt_number>
<page_id>88BDBD41-736D-4CF1-4BC9-A6BFE4814935</page_id>
<work_number>1114935</work_number>

Here is what I have attempted.

I've taken all the data that contains the number 4935 into a #temp table:

SELECT xml_data, xml_id INTO #t
FROM table 
WHERE cast(xml_data as nvarchar(max)) LIKE '%4935%'

I altered XML column to be a string:

ALTER TABLE #t
ALTER COLUMN xml_data varchar(max);

This allows me to manipulate the data as a string. I have used a combination of SUBSTRING, PATINDEX, and LEN to grab what I want but it's not giving me the desired result:

SELECT xml_id, 
       xml_data, 
       Substring (xml_data, PATINDEX( '%4935%',xml_data) - 10,  -- Start Position
LEN(xml_data)           -- End Position.
       ) AS stripped 
FROM   #t

I'm not sure if my attempt in changing the XML to a string is the best approach so I am open to other functions that can accomplish what I am looking to do.

Thank you in advance for taking the time in answering this question.

1
Have you considered using XML query paths to get the nodes instead of treating the XML as a string? sqlshack.com/for-xml-path-clause-in-sql-server - Charleh
@Charleh does this require knowing all the names of the nodes previously? If so, then it might not work because there is wide variance info and I would need to look through thousands of nodes to map with. - ApacheOne
Nope, it doesn't but there are some limitations in that some functions such as contains are not supported. Lookup some articles on SQLXML XPath and you'll get an idea of what's possible - Charleh

1 Answers

4
votes

It is better to stay with the XML data type. XQuery and FLWOR expression make it relatively easy.

It would be even easier if SQL Server would support the fn:ends-with() function. But MS SQL Server is lagging in its w3c XQuery standards support.

I recently asked Microsoft to beef up NoSQL functionality in SQL Server.

Please check it out, add your comments, and vote for it:

https://feedback.azure.com/d365community/idea/153a9604-7025-ec11-b6e6-000d3a4f0da0

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, xml_data XML);
INSERT INTO @tbl (xml_data) VALUES
(N'<root>
    <data>
        <pages>4935</pages>
        <world_id>88BDBD41-736D-4CF1-4BC9-A6BF4935E481</world_id>
        <Url>https://example.com/4935</Url>
        <Token>https://example.com/80d71735733e91efc74e71a878eeaee5e84935</Token>
        <phone>1122334935</phone>
        <email>[email protected]</email>
        <alt_number>334554935</alt_number>
    </data>
    <data>
        <book_id>493567</book_id>
        <page_id>88BDBD41-736D-4CF1-4BC9-A6BFE4814935</page_id>
        <work_number>1114935</work_number>
    </data>
</root>');
-- DDL and sample data population, end

DECLARE @searchFor VARCHAR(20) = '4935';
DECLARE @searchForLength INT = (LEN(@searchFor) - 1);

SELECT ID, xml_data.query('
        for $x in /root/data/*
        let $len := string-length(string($x/text()[1]))
        return $x[substring($x/text()[1], $len - sql:variable("@searchForLength"), $len) eq sql:variable("@searchFor")]
    ')
FROM @tbl;

Output

<pages>4935</pages>
<Url>https://example.com/4935</Url>
<Token>https://example.com/80d71735733e91efc74e71a878eeaee5e84935</Token>
<phone>1122334935</phone>
<alt_number>334554935</alt_number>
<page_id>88BDBD41-736D-4CF1-4BC9-A6BFE4814935</page_id>
<work_number>1114935</work_number>

It is also possible to move XPath predicate to the where clause:

SELECT ID, xml_data.query('
        for $x in /root/data/*
        let $len := string-length(string($x/text()[1]))
        where substring($x/text()[1], $len - sql:variable("@searchForLength"), $len) eq sql:variable("@searchFor")
        return $x
')
FROM @tbl;

A dream where MS SQL Server fully supports w3c XQuery standards

SELECT ID, xml_data.query('
    for $x in /root/data/*[ends-with(./text()[1], sql:variable("@searchFor"))]
    return $x
')
FROM @tbl;

To search in any position

SELECT ID, xml_data.query('
    for $x in /root/data/*[contains(./text()[1], sql:variable("@searchFor"))]
    return $x
')
FROM @tbl;