
To be clear, I have already found a way of accomplishing what I wanted to using XSLT, but it strikes me as largely inefficient and I would like to see if it is possible to use a different solution because it would aid in writing future stylesheets.

Also, I sincerely apologize for the following paragraph's verbosity.

Minimally, I am trying to extract data for an SVG pie chart from a story written in Russian and encoded in XML ('pavlova.xml', which is unfortunately far too large to post with my question, though it isn't strictly necessary). I'm trying to store this data in a variable called $chartData. A given pie chart pertains to one target character (a parameter), and each slice of the pie represents a speaker (another character), with its size indicating how much that speaker spoke of the target character relative to the other speakers. I began by iterating over all of the speakers (any character who ever mentioned the target character) and then counting how many times the target character's name appears in each speaker's speech. However, these values are only useful in relation to the total number of times the character was spoken of; I need the total in order to calculate the percentages for the pie chart. I am aware that I can calculate this separately in another variable, but I am interested in seeing if I can store a sequence of nodes in a variable. Calculating the sum separately seems wasteful, since it traverses a path nearly identical to the one that finds the individual counts; it would be much more efficient to just sum the individual values after they are retrieved. So far, I have been able to calculate everything I want, but I am trying instead to iterate over a variable using an xsl:for-each and have it recognize my variable as a sequence.

I want the format of the variable to be this:

<count name="{$speaker1}">
    <xsl:value-of select="$spokenCount1"/>
</count>
<count name="{$speaker2}">
    <xsl:value-of select="$spokenCount2"/>
</count>
...

I've tried many ways of solving this issue, and I'm not sure it is even possible to solve. First, I tried defining the content of the xsl:variable more precisely (specifically, trying to define its content as a sequence) by using the @as attribute, but I'm finding it difficult to locate and comprehend documentation on that attribute and how SequenceType's work. Second, I tried changing how I put the content into my variable: either placing the for-each directly within the variable and using xsl:value-of or xsl:copy-of, or declaring the variable and using xsl:copy-of in an external for-each to append data to it.

It seems that xsl:copy-of, as opposed to xsl:sequence (ironically) is what I need to append a sequence of nodes. I was only able to obtain the format I described above with the following code:

<xsl:variable name="chartData"/>
    <xsl:for-each
        select="distinct-values(//speech[.//name[not(parent::nonspeech) and ./@ref eq $character]]/@speaker)">
        <xsl:copy-of select="$chartData"/>
        <xsl:variable name="speaker" select="current()"/>
        <xsl:variable name="spokenCount"
            select="document('pavlova.xml')/count(//speech[@speaker eq $speaker]//name[not(parent::nonspeech) and ./@ref eq $character])"/>
        <count name="{$speaker}">
            <xsl:value-of select="$spokenCount"/>
        </count>
    </xsl:for-each>
    <xsl:variable name="spokenTotal" select="sum($chartData)"/>
    <xsl:value-of select="$chartData"/>
    <!--xsl:for-each select="$chartData/count">
        ...
    </xsl:for-each-->

The end goal is to make the commented out for-each loop at the bottom iterate over each count element instead of giving me an error. My question is, can I treat a variable as XML and iterate over each count element if it looks like the following, or is it always treated as a string?

<?xml version="1.0" encoding="UTF-8"?>
<count name="young-man">4</count>
<count name="olga">24</count>
<count name="cecilija">34</count>
<count name="blonde">1</count>
<count name="vera">32</count>
<count name="servant-fem">6</count>
<count name="valickaja">20</count>
<count name="muse">66</count>
<count name="viktor">4</count>
<count name="dmitrij">15</count>
<count name="anna">11</count>
<count name="iličev">3</count>
<count name="narod">1</count>
<count name="society-man">3</count>

And if so, how?

Also, if someone could explain: why does xsl:copy-of only deep-copy when I have the for-each outside of the variable's scope and set select to '$chartData'? If I try to use xsl:copy-of within the variable, it only copies the text content of the literal <count> elements and creates a string (424341326206641511313).

Please edit the question and provide complete (but small): 1. Source XML document; 2. Wanted result; 3. Explanation of how the wanted result is to be produced from the source XML document. When all this important information is specified, many people may be able to answer the question. In its current state it requires guessing, and almost no one would invest their time in guessing. You may be interested to look at my XSLT implementation of a concordance, as this is closely related to my understanding of what you are trying to do: oxygenxml.com/archives/xsl-list/200511/msg00190.html - Dimitre Novatchev
Providing source XML would not help in answering my question - I have all of the data I want and my question doesn't involve how to extract it. Also, my wanted result is just the ability to iterate over a variable with newly created elements, and I provided what that variable currently looks like. I guess I'm not aware why the nodes being from the document or newly created makes a difference in XSLT, so I apologize for the confusion - I thought it would be clear from the source code that I'm creating the count elements, not drawing them from the document. - Eric Gratta
I'm currently looking at your implementation of a concordance, and I see that you do use a for-each to iterate over certain members of the variable $vverseWords, and that variable consists of a sequence of strings generated by tokenize(). The variable I want to iterate over was constructed using xsl:copy-of, which is copying newly created elements. But when I try to iterate over $chartData, the entire variable is treated as one string, and my program will not recognize that there are multiple elements within $chartData. Why does yours work and mine not? Let me know if you still need more info. - Eric Gratta
@DimitreNovatchev did you ever finish the optimised version of that concordance? - Private

1 Answer


I'm afraid your question is very confusing. What you are trying to do isn't difficult, but it's hard to know where to start in helping you move forward.

I am interested in seeing if I can store a sequence of nodes in a variable

Yes, you can. You need to be clear in your mind whether you want to create new nodes, or whether you want the variable to hold references to existing nodes. In the first case you use the "temporary tree" construct:

<xsl:variable name="tree">
  ... content ...
</xsl:variable>

In the second case you can either use the select attribute, or a contained sequence constructor plus an "as" attribute:

<xsl:variable name="seq" select="expression"/>

or

<xsl:variable name="seq" as="node()*">
  ... content ...
</xsl:variable>

I am trying instead to iterate over a variable using an xsl:for-each and have it recognize my variable as a sequence

Well, the value of every variable is a sequence, the question is, what is in the sequence? If you use the first form of variable above (content, no as attribute) the sequence will be a singleton sequence containing a single document node; you will probably want to navigate to its children using path expressions, for example <xsl:for-each select="$tree/*"/>. In the second and third cases the sequence will typically contain multiple nodes, which can be nodes of any kind, so you are more likely to write <xsl:for-each select="$seq"/>.
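
As a rough sketch of the first form applied to the code in the question: the xsl:for-each goes inside the xsl:variable, so the new count elements become children of the temporary tree's document node. This assumes pavlova.xml is the principal input document and reuses the element and attribute names from the question (speech, name, nonspeech, @speaker, @ref). The default value of $character and the chart/slice output elements are made up purely for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed: the target character is a stylesheet parameter;
       'vera' is only a hypothetical default -->
  <xsl:param name="character" select="'vera'"/>

  <xsl:template match="/">
    <!-- Keep a handle on the source document: inside the for-each below the
         context item is a string, so a bare //speech would fail there -->
    <xsl:variable name="doc" select="."/>

    <!-- Temporary tree: a document node whose children are the new count elements -->
    <xsl:variable name="chartData">
      <xsl:for-each select="distinct-values(//speech[.//name[not(parent::nonspeech)][@ref eq $character]]/@speaker)">
        <xsl:variable name="speaker" select="."/>
        <count name="{$speaker}">
          <xsl:value-of select="count($doc//speech[@speaker eq $speaker]//name[not(parent::nonspeech)][@ref eq $character])"/>
        </count>
      </xsl:for-each>
    </xsl:variable>

    <!-- The total comes from the stored counts, not from a second pass over the source -->
    <xsl:variable name="spokenTotal" select="sum($chartData/count)"/>

    <chart total="{$spokenTotal}">
      <!-- Step from the document node down to its count children -->
      <xsl:for-each select="$chartData/count">
        <slice name="{@name}" percent="{100 * number(.) div $spokenTotal}"/>
      </xsl:for-each>
    </chart>
  </xsl:template>
</xsl:stylesheet>
```

The key point is the path $chartData/count: the variable holds one document node, and the count elements are reached by stepping down to its children.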

I'm finding it difficult to locate and comprehend documentation on that attribute and how SequenceType's work

Where did you look? I can't recommend my book "XSLT 2.0 Programmer's Reference" too highly... I think you are the kind of person who wants to understand how things really work, rather than learning by trial and error from examples. The book was written for that kind of person.

It seems that xsl:copy-of, as opposed to xsl:sequence (ironically) is what I need to append a sequence of nodes.

No, you've got that wrong. If you are creating a variable that is to contain references to existing nodes, then you need xsl:sequence. If you are creating a temporary tree, which will contain copies of nodes, then xsl:sequence and xsl:copy-of have exactly the same effect - both will copy the selected nodes.
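
A small sketch of that difference, assuming a source document that contains speech elements as in the question (the report element and its children are hypothetical names used only to display the result):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- References to existing nodes: "as" attribute plus xsl:sequence -->
    <xsl:variable name="refs" as="element(speech)*">
      <xsl:sequence select="//speech"/>
    </xsl:variable>

    <!-- With "as" present, xsl:copy-of yields new, parentless copies -->
    <xsl:variable name="copies" as="element(speech)*">
      <xsl:copy-of select="//speech"/>
    </xsl:variable>

    <!-- Temporary tree (no "as"): the selected nodes are copied into the
         tree whichever of the two instructions is used -->
    <xsl:variable name="tree">
      <xsl:sequence select="//speech"/>
    </xsl:variable>

    <report>
      <!-- The referenced nodes still live in the source document, so each has a parent -->
      <refs-with-parent><xsl:value-of select="count($refs[..])"/></refs-with-parent>
      <!-- The copies are detached from the source, so none has a parent -->
      <copies-with-parent><xsl:value-of select="count($copies[..])"/></copies-with-parent>
      <!-- In the temporary tree the copies are children of the new document node -->
      <tree-children><xsl:value-of select="count($tree/speech)"/></tree-children>
    </report>
  </xsl:template>
</xsl:stylesheet>
```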

can I treat a variable as XML and iterate over each count element if it looks like the following

Yes you can. But we can't tell from your representation of the sequence whether these nodes are children of a document node, and knowing that is crucial.
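
To illustrate why it matters, here is a sketch of the case where the count elements are not children of a document node. The variable name $counts is hypothetical, and the two values are taken from the listing in the question.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- With an "as" attribute the value is a flat sequence of parentless
         count elements; no document node is wrapped around them -->
    <xsl:variable name="counts" as="element(count)*">
      <count name="olga">24</count>
      <count name="vera">32</count>
    </xsl:variable>

    <!-- The items ARE the count elements, so select the variable directly -->
    <xsl:for-each select="$counts">
      <xsl:value-of select="concat(@name, ': ', ., '&#10;')"/>
    </xsl:for-each>

    <!-- sum() atomizes each element to its numeric content -->
    <xsl:value-of select="concat('total: ', sum($counts), '&#10;')"/>

    <!-- $counts/count would select nothing here: that path looks for
         count children of the count elements themselves -->
  </xsl:template>
</xsl:stylesheet>
```

If the same elements were instead held in a temporary tree (no "as" attribute), the for-each would need select="$counts/count".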

why does xsl:copy-of only deep-copy when I have the for-each outside of the variable's scope and set select to '$chartData'? If I try to use xsl:copy-of within the variable it only copies the text content of the literal elements and creates a string

You've drawn an incorrect inference here. xsl:copy-of always does a deep copy. If you're only seeing the string content of the nodes, it's because a subsequent operation is being used that atomizes the nodes or extracts their string value.
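
A minimal sketch of how that can happen, using the first three count elements from the question (the result, as-string and as-nodes wrapper elements are hypothetical):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Temporary tree holding three of the count elements from the question -->
    <xsl:variable name="chartData">
      <count name="young-man">4</count>
      <count name="olga">24</count>
      <count name="cecilija">34</count>
    </xsl:variable>

    <result>
      <!-- xsl:value-of outputs the string value of the document node, which
           is the concatenation of all descendant text: 42434 -->
      <as-string><xsl:value-of select="$chartData"/></as-string>

      <!-- xsl:copy-of outputs the count elements themselves, attributes and all -->
      <as-nodes><xsl:copy-of select="$chartData"/></as-nodes>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

The nodes are intact inside the variable in both cases; only the instruction used to output them decides whether they are flattened to a string.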