I am relatively new to XPath and XQuery. I am working with XML stored in a table called tblXML.
XML is stored in this table by caseID. For this example, this value is unique.
The XML itself is stored in a column labeled Data.
If you were to select * from tblXML where caseID = '12345' you would see this:
caseID data
------ ------
12345 <root><patient><diagnosis preferred_name="diagnosis" tier="1">Melanoma</diagnosis> (XML file stored as blob in database)
If you were to dive into the XML, it would look something like this. For example, this is the XML in the data column for caseID '12345':
<root>
<patient>
<diagnosis preferred_name="diagnosis" tier="1">Melanoma</diagnosis>
<month_of_birth preferred_name="month_of_birth" tier="1">02</month_of_birth>
<day_of_birth preferred_name="day_of_birth" tier="2">01</day_of_birth>
<year_of_birth preferred_name="year_of_birth" tier="1">1960</year_of_birth>
<new_tumor_events>
<month_of_nte preferred_name="month_of_nte" tier="1">12</month_of_nte>
<day_of_nte preferred_name="day_of_nte" tier="2">30</day_of_nte>
<year_of_nte preferred_name="year_of_nte" tier="1">1994</year_of_nte>
</new_tumor_events>
<follow_ups>
<follow_up>
<month_of_fu preferred_name="month_of_fu" tier="1">12</month_of_fu>
<day_of_fu preferred_name="day_of_fu" tier="2">31</day_of_fu>
<year_of_fu preferred_name="year_of_fu" tier="1">1995</year_of_fu>
</follow_up>
<follow_up>
<month_of_fu preferred_name="month_of_fu" tier="1">12</month_of_fu>
<day_of_fu preferred_name="day_of_fu" tier="2">31</day_of_fu>
<year_of_fu preferred_name="year_of_fu" tier="1">1996</year_of_fu>
</follow_up>
</follow_ups>
</patient>
</root>
Goals:
I am trying to select all values for all elements at all levels/nodes of the XML document that have a @Tier attribute of "1". I would like to see each value be output in it's own row in a column called 'Value' like this:
caseID Value
------ ------
12345 Melanoma
12345 02
12345 1960
12345 12
12345 1994
12345 12
12345 1995
12345 12
12345 1996
Here's SQL that's analogous to what I'm trying to write:
DECLARE @tier int
SET @tier = '1' --To search by tier explicitly via variable
select
x.caseID,
bar.value('(//*[@tier = sql:variable("@tier")])[1]','varchar(max)') as value
from tblXML as x
CROSS APPLY data.nodes('//*:patient') AS foo(bar)
where
x.caseID = '12345'
and x.data.value('(//@tier)[1]', 'varchar(3)') = '1'
This gives me...
case ID value
------ ------
12345 Melanoma
...but not the additional rows I'm looking for.
I think the my confusion centers around both the constraint imposed by the singleton [1] and the context node in my CROSS APPLY statement. I'm not sure how to get everything in the document that's Tier 1, not just the first instance of Tier 1 that the query finds.
For this example I know my context node in my CROSS APPLY statement is focused on the patient section of the XML, but I would like to be able to search at all levels of the XML tree both up and down. I know the // for @Tier lets me search at all levels of the document, but my context node seems to drive/constrain me to some degree on how deep into the document I can search.
It's my understanding that I can drill deep into the XML (follow_up perhaps?), then search from the top down. However, the deepest node will change from caseID to caseID and XML document to XML document. Is there a way to wildcard my CROSS APPLY statement into the deepest node in a given XML document and then look for any occurrence of @Tier within that document, then retrieve its corresponding value?
I realize that method lacks elegance and is not performant. I have not been able to find clear documentation thus far to give me the results I'm looking for.
Thanks in advance for your help.
EDIT: I should specify that this SQL will be the core of a table-valued function in SQL Server.