
I'm extracting data from PowerPoint documents by reading the underlying XML.

I want to get the name of the slide master layout that a particular slide is using, but I can't figure out how to get this info from the slide element in question.

For example, my slide master has many layouts, and one is called 1_Title Slide.

I can open the XML and find the list of slide master layouts and their names pretty easily. The layout that I want looks like this:

<p:sldLayout xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main" showMasterSp="0" userDrawn="1"><p:cSld name="1_Title Slide"> ... </p:sldLayout>

Here the layout name is clearly 1_Title Slide.

However, when I search for that string within the <p:sld /> slide element that I know is using that layout, I cannot find it. The layout has a lot of different elements and attributes, presumably because it has several placeholders and shapes, so it's not easy to see how the slide XML references it. What tag or attribute in Open XML stores the "unique id" of the layout within the slide element? How is it mapped?

I wonder if python-pptx can do this for you, or at least inspire you as to how to do it. - Martin Packer

1 Answer


Just so you know, a layout with a title like 1_Title Slide is an accidental layout that got there by pasting in a slide from a different presentation. It's not native to the template or theme.

The slide XML itself does not contain the layout name. Relationships among XML parts are stored in a relationship file in the _rels folder next to the part. So ppt/slides/slide1.xml will have a file called ppt/slides/_rels/slide1.xml.rels. One of the entries will look something like this:

<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideLayout" Target="../slideLayouts/slideLayout7.xml"/>

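Then open slideLayout7.xml (the Target path is relative to the slide's folder) and read the layout name from the name attribute of its <p:cSld> element.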
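The lookup described above can be sketched in Python with only the standard library. This is a minimal sketch, not a definitive implementation: the function name layout_name_for_slide and the default part name ppt/slides/slide1.xml are assumptions made for illustration, and it assumes the .pptx is a normal zip package with the usual _rels layout.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the elements inside every *.rels file.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Relationship type that points from a slide to its layout.
LAYOUT_REL_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideLayout"
)
PML_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"


def layout_name_for_slide(pptx, slide_part="ppt/slides/slide1.xml"):
    """Return the layout name used by slide_part, or None if it cannot be found.

    Hypothetical helper: pptx is a path or file-like object for the package.
    """
    slide_dir, slide_file = posixpath.split(slide_part)
    # ppt/slides/slide1.xml -> ppt/slides/_rels/slide1.xml.rels
    rels_part = posixpath.join(slide_dir, "_rels", slide_file + ".rels")

    with zipfile.ZipFile(pptx) as zf:
        rels = ET.fromstring(zf.read(rels_part))
        for rel in rels.findall(f"{{{REL_NS}}}Relationship"):
            if rel.get("Type") != LAYOUT_REL_TYPE:
                continue
            # Target is relative to the slide's folder,
            # e.g. ../slideLayouts/slideLayout7.xml
            target = posixpath.join(slide_dir, rel.get("Target"))
            layout_part = posixpath.normpath(target).lstrip("/")
            layout = ET.fromstring(zf.read(layout_part))
            c_sld = layout.find(f"{{{PML_NS}}}cSld")
            return c_sld.get("name") if c_sld is not None else None
    return None
```

Calling layout_name_for_slide("deck.pptx", "ppt/slides/slide3.xml") (deck.pptx being a placeholder file name) would return the name attribute of the layout that slide 3 points at.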