6
votes

In an Excel Power Query file the data connection can be from a SQL server. We have a large number of files that specify a SQL server by name and this server is going to be decommissioned. We need to update the connection to replace the older server name with the new server name. This is possible by opening the Excel file, browsing to the query and editing the server name manually. Due to the large number of files it is desired to do this using C#. The image below shows the input fields (with the names removed) where you would update this manually.

SQL Connection Form

First by unzipping the Excel file and browsing the contents under the folder xl > connections.xml I would have expected it to specfiy the connection there but it only says $Workbook$

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<connections xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <connection id="1" keepAlive="1" name="Query" description="Connection to the query in the workbook." type="5" refreshedVersion="6" background="1" saveData="1">
    <dbPr connection="Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=&quot;table&quot;" command="SELECT * FROM [table]"/>
  </connection>
</connections>

On the MDSN forms there is a reference to this topic and the answer provided by Will Gregg says:

External data source connection information is stored in the XLSX package in a custom part. You can locate the custom part under the customXML folder of the package. For example: customXml\iem1.xml.

Contained in item1.xml is a element. The definition for the element can be found in the [MS-QDEFF]: Query Definition File Format document (https://msdn.microsoft.com/en-us/library/mt577220(v=office.12).aspx).

In order to work with the data of the element you will need to decode the contents as described in the [MS-QDEFF]: Query Definition File Format document.

Once the data is decoded, you will need to examine the contents of the PackagePart. Within that package you will find the external data connection information in the Forumlas\Section1.m part.

This is helpful to point me to the item.xml file in the customXml folder but does not give any details on how to decode the information in the DataMashup object. The answer did mention the [MS-QDEFF]: Query Definition File Format document is available at this link from the main article about the query definition format. The information in this document can seem dense and complex at first glance.

On Stack Overflow there are 6 questions that mention DataMashup and 4 of them are related to Power BI, which while similar to this issue are not the same. The links to each of those questions are listed below:

The other 2 questions are more relevant as they ask about Excel not Power BI which I will discuss below:

  1. This question asks how to remove Power Query queries' custom XML data using VBA. I do not want to delete the query but rather update the connection string and I would like to do this in C# not VBA. The questions shows the result using the Macro recorder and I do not want to open each Excel file to run a VBA Macro.
  2. This question asks how to find the query information and comes across the same $Workbook$ that I did. In the comment by Axel Richter he says In *.xlsx/customXml/ you will find a item1.xml which contains a DataMashup element which contains a base64Binary which is the binary query definition file. I have no clue how to work with that. That's why only a comment and not a answer. Over a year later an answer was added by Tom Jebo pointing to the Open Specifications details I found as well but does not offer a solution on how to manipulate the DataMashup object. I am adding this as a new question since the this question is looking to solve a little different problem than I am and it is also looking for a solution in JavaScript.

What is the best way to decode the DataMashup object, change the server name, and then save the updated connection back to the Excel file?

In this blog post by Jeff Atwood on July 1, 2011, asking and answering your own questions is encouraged. In addition this page form the Stack Overflow help center address the same issue. I decided to post a full working solution in C# for others to modify and use, hopefully saving them the time of needing to sludge through all the working out I did.

1

1 Answers

7
votes

As mentioned in the question, the most helpful document is the [MS-QDEFF]: Query Definition File Format. I will include the most relevant parts of this document here but refer to the original document if needed. Below shows the example XML with the DataMashup provided by Microsoft. This is for a short query but expect something similar if you open up the customXml > item1.xml file.

<DataMashup sqmid="7690c5d6-5698-463c-a560-a0093d4f6332"
    xmlns="http://schemas.microsoft.com/DataMashup">
  AAAAAEUDAABQSwMEFAACAAgAta0pR62KRJynAAAA+QAAABIAHABDb25maWcvUGFja2FnZS54bWwgohgA
  KKAUAAAAAAAAAAAAAAAAAAAAAAAAAAAhY9NDoIwGESvQrqnP4jGkI+ycCuJCdG4bUqFRiiGFsvdXHgkr
  yCJYti5nMmb5M3r8YRsbJvgrnqrO5MihikKlJFdqU2VosFdwi3KOByEvIpKBRNsbDJanaLauVtCiPce+
  xXu+opElDJyzveFrFUrQm2sE0Yq9FuV/1eIw+kjwyMcxTimmzVmMWVA5h5ybRbMpIwpkEUJu6FxQ6+4M
  uGxADJHIN8b/A1QSwMEFAACAAgAta0pRw/K6aukAAAA6QAAABMAHABbQ29udGVudF9UeXBlc10ueG1sI
  KIYACigFAAAAAAAAAAAAAAAAAAAAAAAAAAAAG2OSw7CMAxErxJ5n7qwQAg1ZQHcgAtEwf2I5qPGReFsL
  DgSVyBtd4ilZ+Z55vN6V8dkB/GgMfbeKdgUJQhyxt961yqYuJF7ONbV9Rkoihx1UUHHHA6I0XRkdSx8I
  Jedxo9Wcz7HFoM2d90Sbstyh8Y7JseS5x9QV2dq9DSwuKQsr7UZB3Fac3OVAqbEuMj4l7A/eR3C0BvN2
  cQkbZR2IXEZXn8BUEsDBBQAAgAIALWtKUdi3rmEPAAAAEsAAAATABwARm9ybXVsYXMvU2VjdGlvbjEub
  SCiGAAooBQAAAAAAAAAAAAAAAAAAAAAAAAAAAArTk0uyczPUwiG0IbWvFy8XMUZiUWpKQqBpalFlYYKt
  go5qSW8XApAEJxfWpScChQx1Dbk5crMQxa1BgBQSwECLQAUAAIACAC1rSlHrYpEnKcAAAD5AAAAEgAAA
  AAAAAAAAAAAAAAAAAAAQ29uZmlnL1BhY2thZ2UueG1sUEsBAi0AFAACAAgAta0pRw/K6aukAAAA6QAAA
  BMAAAAAAAAAAAAAAAAA8wAAAFtDb250ZW50X1R5cGVzXS54bWxQSwECLQAUAAIACAC1rSlHYt65hDwAA
  ABLAAAAEwAAAAAAAAAAAAAAAADkAQAARm9ybXVsYXMvU2VjdGlvbjEubVBLBQYAAAAAAwADAMIAAABtA
  gAAAAA0AQAA77u/PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiPz48UGVybWlzc2lvb
  kxpc3QgeG1sbnM6eHNpPSJodHRwOi8vd3d3LnczLm9yZy8yMDAxL1hNTFNjaGVtYS1pbnN0YW5jZSIge
  G1sbnM6eHNkPSJodHRwOi8vd3d3LnczLm9yZy8yMDAxL1hNTFNjaGVtYSI+PENhbkV2YWx1YXRlRnV0d
  XJlUGFja2FnZXM+ZmFsc2U8L0NhbkV2YWx1YXRlRnV0dXJlUGFja2FnZXM+PEZpcmV3YWxsRW5hYmxlZ
  D50cnVlPC9GaXJld2FsbEVuYWJsZWQ+PFdvcmtib29rR3JvdXBUeXBlIHhzaTpuaWw9InRydWUiIC8+P
  C9QZXJtaXNzaW9uTGlzdD7LBwAAAAAAAKkHAADvu788P3htbCB2ZXJzaW9uPSIxLjAiIGVuY29kaW5nP
  SJ1dGYtOCI/PjxMb2NhbFBhY2thZ2VNZXRhZGF0YUZpbGUgeG1sbnM6eHNpPSJodHRwOi8vd3d3LnczL
  m9yZy8yMDAxL1hNTFNjaGVtYS1pbnN0YW5jZSIgeG1sbnM6eHNkPSJodHRwOi8vd3d3LnczLm9yZy8yM
  DAxL1hNTFNjaGVtYSI+PEl0ZW1zPjxJdGVtPjxJdGVtTG9jYXRpb24+PEl0ZW1UeXBlPkFsbEZvcm11b
  GFzPC9JdGVtVHlwZT48SXRlbVBhdGggLz48L0l0ZW1Mb2NhdGlvbj48U3RhYmxlRW50cmllcyAvPjwvS
  XRlbT48SXRlbT48SXRlbUxvY2F0aW9uPjxJdGVtVHlwZT5Gb3JtdWxhPC9JdGVtVHlwZT48SXRlbVBhd
  Gg+U2VjdGlvbjEvUXVlcnkxPC9JdGVtUGF0aD48L0l0ZW1Mb2NhdGlvbj48U3RhYmxlRW50cmllcz48R
  W50cnkgVHlwZT0iSXNQcml2YXRlIiBWYWx1ZT0ibDAiIC8+PEVudHJ5IFR5cGU9IlJlc3VsdFR5cGUiI
  FZhbHVlPSJzTnVtYmVyIiAvPjxFbnRyeSBUeXBlPSJGaWxsRW5hYmxlZCIgVmFsdWU9ImwxIiAvPjxFb
  nRyeSBUeXBlPSJGaWxsVG9EYXRhTW9kZWxFbmFibGVkIiBWYWx1ZT0ibDAiIC8+PEVudHJ5IFR5cGU9I
  kZpbGxDb3VudCIgVmFsdWU9ImwxIiAvPjxFbnRyeSBUeXBlPSJGaWxsRXJyb3JDb3VudCIgVmFsdWU9I
  mwwIiAvPjxFbnRyeSBUeXBlPSJGaWxsQ29sdW1uVHlwZXMiIFZhbHVlPSJzQlE9PSIgLz48RW50cnkgV
  HlwZT0iRmlsbENvbHVtbk5hbWVzIiBWYWx1ZT0ic1smcXVvdDtRdWVyeTEmcXVvdDtdIiAvPjxFbnRye
  SBUeXBlPSJGaWxsRXJyb3JDb2RlIiBWYWx1ZT0ic1Vua25vd24iIC8+PEVudHJ5IFR5cGU9IkZpbGxMY
  XN0VXBkYXRlZCIgVmFsdWU9ImQyMDE1LTA5LTEwVDA0OjQ1OjQxLjkyNzU5MDBaIiAvPjxFbnRyeSBUe
  XBlPSJSZWxhdGlvbnNoaXBJbmZvQ29udGFpbmVyIiBWYWx1ZT0ic3smcXVvdDtjb2x1bW5Db3VudCZxd
  W90OzoxLCZxdW90O2tleUNvbHVtbk5hbWVzJnF1b3Q7OltdLCZxdW90O3F1ZXJ5UmVsYXRpb25zaGlwc
  yZxdW90OzpbXSwmcXVvdDtjb2x1bW5JZGVudGl0aWVzJnF1b3Q7OlsmcXVvdDtTZWN0aW9uMS9RdWVye
  TEvQXV0b1JlbW92ZWRDb2x1bW5zMS57UXVlcnkxLDB9JnF1b3Q7XSwmcXVvdDtDb2x1bW5Db3VudCZxd
  W90OzoxLCZxdW90O0tleUNvbHVtbk5hbWVzJnF1b3Q7OltdLCZxdW90O0NvbHVtbklkZW50aXRpZXMmc
  XVvdDs6WyZxdW90O1NlY3Rpb24xL1F1ZXJ5MS9BdXRvUmVtb3ZlZENvbHVtbnMxLntRdWVyeTEsMH0mc
  XVvdDtdLCZxdW90O1JlbGF0aW9uc2hpcEluZm8mcXVvdDs6W119IiAvPjxFbnRyeSBUeXBlPSJGaWxsZ
  WRDb21wbGV0ZVJlc3VsdFRvV29ya3NoZWV0IiBWYWx1ZT0ibDEiIC8+PEVudHJ5IFR5cGU9IkFkZGVkV
  G9EYXRhTW9kZWwiIFZhbHVlPSJsMCIgLz48RW50cnkgVHlwZT0iUmVjb3ZlcnlUYXJnZXRTaGVldCIgV
  mFsdWU9InNTaGVldDIiIC8+PEVudHJ5IFR5cGU9IlJlY292ZXJ5VGFyZ2V0Q29sdW1uIiBWYWx1ZT0ib
  DEiIC8+PEVudHJ5IFR5cGU9IlJlY292ZXJ5VGFyZ2V0Um93IiBWYWx1ZT0ibDEiIC8+PEVudHJ5IFR5c
  GU9Ik5hbWVVcGRhdGVkQWZ0ZXJGaWxsIiBWYWx1ZT0ibDAiIC8+PEVudHJ5IFR5cGU9IkZpbGxUYXJnZ
  XQiIFZhbHVlPSJzUXVlcnkxIiAvPjxFbnRyeSBUeXBlPSJCdWZmZXJOZXh0UmVmcmVzaCIgVmFsdWU9I
  mwxIiAvPjxFbnRyeSBUeXBlPSJGaWxsU3RhdHVzIiBWYWx1ZT0ic0NvbXBsZXRlIiAvPjxFbnRyeSBUe
  XBlPSJRdWVyeUlEIiBWYWx1ZT0iczdlMDQzNjJlLTkyZjUtNGQ4Mi04YjA3LTI3NjFlYWY2OGFlNSIgL
  z48L1N0YWJsZUVudHJpZXM+PC9JdGVtPjxJdGVtPjxJdGVtTG9jYXRpb24+PEl0ZW1UeXBlPkZvcm11b
  GE8L0l0ZW1UeXBlPjxJdGVtUGF0aD5TZWN0aW9uMS9RdWVyeTEvU291cmNlPC9JdGVtUGF0aD48L0l0Z
  W1Mb2NhdGlvbj48U3RhYmxlRW50cmllcyAvPjwvSXRlbT48L0l0ZW1zPjwvTG9jYWxQYWNrYWdlTWV0Y
  WRhdGFGaWxlPhYAAABQSwUGAAAAAAAAAAAAAAAAAAAAAAAA2gAAAAEAAADQjJ3fARXREYx6AMBPwpfrA
  QAAACLWGAG5O6FHjkAGtB+m5EQAAAAAAgAAAAAAA2YAAMAAAAAQAAAAaH8KNe2ciHwfVosIvSCr6gAAA
  AAEgAAAoAAAABAAAAA40fOKWe6kmTAWJSBXs4cYUAAAAPNy7uF6Dtr9PvADu+eZdeV7JutpIQTh41qqT
  3QnFoWPwE0Xyrur5N6Q2s2TEzjlBDfkEmNaGtr3htemOjWZYXKQHP+R5u/90zHWiwOwjjowFAAAAF2UC
  6Jm8C98hVmJBo638e4Qk65V
</DataMashup>

The value of this object is encoded in a Base64 string. If you are not familiar with Base 64, this Wikipedia article would be a good place to start. The first step in the solution will be to open the XML document and convert this into its byte representation. This can be done as follows:

string file = @"\customXml\item1.xml"; // or wherever your xml file is
XDocument doc = XDocument.Load(file);

byte[] dataMashup = Convert.FromBase64String(doc.Root.Value);

NOTE: In the full example provided at the bottom of this answer, all the manipulation is done in memory.

From the Microsoft definition document:

Version (4 bytes): Unsigned integer that MUST be set to 0.

Package Parts Length (4 bytes): Unsigned integer that specifies the length of the Package Parts field.

Package Parts (variable): Variable-length binary stream (section 2.3).

Permissions Length (4 bytes): Unsigned integer that specifies the length of the Permissions field.

Permissions (variable): Variable-length binary stream (section 2.4).

Metadata Length (4 bytes): Unsigned integer that specifies the length of the Metadata field.

Metadata (variable): Variable-length binary stream (section 2.5).

Permission Bindings Length (4 bytes): Unsigned integer that specifies the length of the Permission Bindings field.

Permission Bindings (variable): Variable-length binary stream (section 2.6).

Since each field that defines the length of its content is 4 bytes I defined a constant

private const int FIELDS_LENGTH = 4;

Then each of the values defined in this section (quoted from Microsoft) can be found as shown below:

int version = BitConverter.ToUInt16(dataMashup.Take(FIELDS_LENGTH).ToArray(), 0);

int packagePartsLength = BitConverter.ToUInt16(dataMashup.Skip(FIELDS_LENGTH).Take(FIELDS_LENGTH).ToArray(), 0);
byte[] packageParts = dataMashup.Skip(FIELDS_LENGTH * 2).Take(packagePartsLength).ToArray();

int permissionsLength = BitConverter.ToUInt16(dataMashup.Skip(FIELDS_LENGTH  * 2 + packagePartsLength).Take(FIELDS_LENGTH).ToArray(), 0);
byte[] permissions = dataMashup.Skip(FIELDS_LENGTH * 3).Take(permissionsLength).ToArray();

int metadataLength = BitConverter.ToUInt16(dataMashup.Skip(FIELDS_LENGTH * 3 + packagePartsLength + permissionsLength).Take(FIELDS_LENGTH).ToArray(), 0);
byte[] metadata = dataMashup.Skip(FIELDS_LENGTH * 4 + packagePartsLength + permissionsLength).Take(metadataLength).ToArray();

int permissionsBindingLength = BitConverter.ToUInt16(dataMashup.Skip(FIELDS_LENGTH * 4 + packagePartsLength + permissionsLength + metadataLength).Take(FIELDS_LENGTH).ToArray(), 0);
byte[] permissionsBinding = dataMashup.Skip(FIELDS_LENGTH * 5 + packagePartsLength + permissionsLength + metadataLength).Take(permissionsBindingLength).ToArray();

Using the byte[] for the package parts, it represents a Package object from the System.IO.Packaging namespace.

using (MemoryStream ms = new MemoryStream(packageParts)) {
    using (Package package = Package.Open(ms, FileMode.Open, FileAccess.ReadWrite)) {
        PackagePart section = package.GetParts().Where(x => x.Uri.OriginalString == "/Formulas/Section1.m").FirstOrDefault();

        string query;
        using (StreamReader reader = new StreamReader(section.GetStream())) {
            query = reader.ReadToEnd();
            // do other replacing, removing of query here
        }
        using (BinaryWriter writer = new BinaryWriter(section.GetStream())) {
            // write updated query back to package part
            writer.Write(Encoding.ASCII.GetBytes(query));
        }
    }

    packageParts = ms.ToArray();
}

Finally I need to update the original byte[] with the new information from the updated package.

bytes = BitConverter.GetBytes(version)
            .Concat(BitConverter.GetBytes(packageParts.Length))
            .Concat(packageParts)
            .Concat(BitConverter.GetBytes(permissionsLength))
            .Concat(permissions)
            .Concat(BitConverter.GetBytes(metadataLength))
            .Concat(metadata)
            .Concat(BitConverter.GetBytes(permissionsBindingLength))
            .Concat(permissionsBinding);
doc.Root.Value = Convert.ToBase64String(bytes.ToArray());
entryStream.SetLength(0);
doc.Save(entryStream);

Below is the full example for completeness. It is a console application which takes in a directory of files to update as command line arguments and then replaces the old server name with the new server name.

using System;
using System.Collections.Generic;
using System.Linq;
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;
using System.IO.Packaging;
using System.Text;

namespace MyApp {
    class Program {
        private const int FIELDS_LENGTH = 4;

        static void Main(string[] args) {
            if (args.Length != 1) {
                Console.WriteLine("specify one directory to update");
            }
            if (!Directory.Exists(args[0])) {
                Console.WriteLine("directory does not exist");
            }

            IEnumerable<FileInfo> files = Directory.GetFiles(args[0]).Where(x => Path.GetExtension(x) == ".xlsx").Select(x => new FileInfo(x));

            foreach (FileInfo file in files) {
                using (FileStream fileStream = File.Open(file.FullName, FileMode.OpenOrCreate)) {
                    using (ZipArchive archive = new ZipArchive(fileStream, ZipArchiveMode.Update)) {

                        ZipArchiveEntry entry = archive.GetEntry("customXml/item1.xml");

                        IEnumerable<byte> bytes;
                        using (Stream entryStream = entry.Open()) {
                            XDocument doc = XDocument.Load(entryStream);

                            byte[] dataMashup = Convert.FromBase64String(doc.Root.Value);
                            int version = BitConverter.ToUInt16(dataMashup.Take(FIELDS_LENGTH).ToArray(), 0);

                            int packagePartsLength = BitConverter.ToUInt16(dataMashup.Skip(FIELDS_LENGTH).Take(FIELDS_LENGTH).ToArray(), 0);
                            byte[] packageParts = dataMashup.Skip(FIELDS_LENGTH * 2).Take(packagePartsLength).ToArray();

                            int permissionsLength = BitConverter.ToUInt16(dataMashup.Skip(FIELDS_LENGTH * 2 + packagePartsLength).Take(FIELDS_LENGTH).ToArray(), 0);
                            byte[] permissions = dataMashup.Skip(FIELDS_LENGTH * 3).Take(permissionsLength).ToArray();

                            int metadataLength = BitConverter.ToUInt16(dataMashup.Skip(FIELDS_LENGTH * 3 + packagePartsLength + permissionsLength).Take(FIELDS_LENGTH).ToArray(), 0);
                            byte[] metadata = dataMashup.Skip(FIELDS_LENGTH * 4 + packagePartsLength + permissionsLength).Take(metadataLength).ToArray();

                            int permissionsBindingLength = BitConverter.ToUInt16(dataMashup.Skip(FIELDS_LENGTH * 4 + packagePartsLength + permissionsLength + metadataLength).Take(FIELDS_LENGTH).ToArray(), 0);
                            byte[] permissionsBinding = dataMashup.Skip(FIELDS_LENGTH * 5 + packagePartsLength + permissionsLength + metadataLength).Take(permissionsBindingLength).ToArray();

                            // use double memory stream to solve issue as memory stream will change
                            // size when re-saving the data mashup object
                            using (MemoryStream packagePartsStream = new MemoryStream(packageParts)) {
                                using (MemoryStream ms = new MemoryStream()) {
                                    packagePartsStream.CopyTo(ms);
                                    using (Package package = Package.Open(ms, FileMode.Open, FileAccess.ReadWrite)) {
                                        PackagePart section = package.GetParts().Where(x => x.Uri.OriginalString == "/Formulas/Section1.m").FirstOrDefault();

                                        string query;
                                        using (StreamReader reader = new StreamReader(section.GetStream())) {
                                            query = reader.ReadToEnd();
                                            // do other replacing, removing of query here
                                            query = query.Replace("old-server", "new-server");
                                        }
                                        using (BinaryWriter writer = new BinaryWriter(section.GetStream())) {
                                            writer.Write(Encoding.ASCII.GetBytes(query));
                                        }
                                    }

                                    packageParts = ms.ToArray();
                                }

                                bytes = BitConverter.GetBytes(version)
                                            .Concat(BitConverter.GetBytes(packageParts.Length))
                                            .Concat(packageParts)
                                            .Concat(BitConverter.GetBytes(permissionsLength))
                                            .Concat(permissions)
                                            .Concat(BitConverter.GetBytes(metadataLength))
                                            .Concat(metadata)
                                            .Concat(BitConverter.GetBytes(permissionsBindingLength))
                                            .Concat(permissionsBinding);
                                doc.Root.Value = Convert.ToBase64String(bytes.ToArray());
                                entryStream.SetLength(0);
                                doc.Save(entryStream);
                            }
                        }
                    }
                }
            }
        }
    }
}

NOTE: As I only needed to update the Package Parts part, I can confirm this decoding / encoding works but I did not test the the decoding / encoding of the Permissions, Metadata, or Permissions Binding. If you need to use these this should at least get you started.

NOTE: This code does not catch errors or handle every case. It is meant to be a working example of how to update the connections in a Power Query file. Feel free to adapt this as you need.