XML Parsing in .NET

Posted on October 7, 2007 
Filed Under .Net Code, CSharp, XML

While working on wxDesktop, my first attempt at parsing data from an XML source, I tried to find the best way to read an XML document and pull specific data out of it. Before I even got to the meat of it, I had it in my head that I was going to use a DataSet. Why would I do that, you ask? Well, when that’s all you know how to do, that’s usually what you go with. I had never dealt with the System.Xml namespace, mostly because I hadn’t had to. Now that I have, I know I will never use the crappy DataSet class for this ever again.

Take the following XML document.

<test>
    <data>
      <name>Sample set name</name>
      <value>5</value>
      <value>4</value>
      <value>6</value>
      <value>1</value>
      <value>20</value>
      <value>54</value>
      <value>10</value>
      <value>19</value>
      <value>17</value>
      <value>15</value>
   </data>
   <otherdata>
      <name>Second Sample set name</name>
      <value>10</value>
      <value>12</value>
      <value>14</value>
      <value>2</value>
      <value>4</value>
      <value>6</value>
      <value>8</value>
      <value>16</value>
      <value>18</value>
      <value>20</value>
   </otherdata>
</test>

When you read this document in using the DataSet.ReadXml method like this:

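// Requires: using System.Data;
private void readXMLDataSet(string FileName)
{
    // ReadXml infers the tables and columns from the structure of the XML.
    DataSet ds = new DataSet();
    ds.ReadXml(FileName);
}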

The DataSet will contain three tables: data, value, and otherdata. Not really what you would expect to be in there… It sort of makes sense, except that every value element from both sections is lumped into the single value table and linked back to its parent row via an ID. While this isn’t too bad, and it would probably be usable, it’s not ideal, nor fun to use.
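To see for yourself what the DataSet inferred, you can list its tables and columns. The following is only a minimal sketch: the DataSetInspector class and DescribeTables method are hypothetical names made up for illustration, and it reads the XML from a string rather than a file so it can be tried without test.xml on disk.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

// Hypothetical helper, not part of wxDesktop or the .NET Framework.
public static class DataSetInspector
{
    // Loads XML text into a DataSet and returns one line per inferred table,
    // in the form "tableName: column1, column2, ...".
    public static List<string> DescribeTables(string xml)
    {
        DataSet ds = new DataSet();
        using (StringReader reader = new StringReader(xml))
        {
            ds.ReadXml(reader);
        }

        List<string> lines = new List<string>();
        foreach (DataTable table in ds.Tables)
        {
            List<string> columns = new List<string>();
            foreach (DataColumn column in table.Columns)
            {
                columns.Add(column.ColumnName);
            }
            lines.Add(table.TableName + ": " + string.Join(", ", columns.ToArray()));
        }
        return lines;
    }
}
```

Printing each returned line for the sample document should make the lumping visible: one entry each for data, value, and otherdata, with the linking ID columns listed alongside the real ones.";
List<string> lines = DataSetInspector.DescribeTables(xml);
if (lines.Count != 3) throw new Exception("expected 3 tables, got " + lines.Count);
bool hasData = false, hasValue = false, hasOther = false;
foreach (string line in lines)
{
    if (line.StartsWith("data:")) hasData = true;
    if (line.StartsWith("value:")) hasValue = true;
    if (line.StartsWith("otherdata:")) hasOther = true;
}
if (!hasData || !hasValue || !hasOther) throw new Exception("missing expected table name");
</test>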

Now take this piece of code:

private void readXMLNodes(string FileName)
{
    System.Xml.XmlDocument xd = new System.Xml.XmlDocument();
    xd.Load(FileName);

    // Absolute XPath: the <data> element under the <test> root.
    System.Xml.XmlNode node = xd.SelectSingleNode("/test/data");
    // Relative XPath: evaluated against the <data> node found above.
    System.Xml.XmlNode nameNode = node.SelectSingleNode("name");
    System.Xml.XmlNodeList valueList = node.SelectNodes("value");
}

This uses the XmlDocument, XmlNode, and XmlNodeList classes, which let you query the XML document with XPath expressions and pull out exactly the parts you need. nameNode.InnerText returns “Sample set name”, and valueList contains every value element in the data section, so valueList[0].InnerText is “5”. Keep in mind that InnerText is always a string, so convert it if you need a number.
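If you want the values as numbers rather than strings, a loop over the node list does the job. This is a sketch under a few assumptions: XmlValueReader and ReadValues are hypothetical names, the XML is passed in as a string (LoadXml) instead of being loaded from a file, and every value element is assumed to hold a valid integer.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Xml;

// Hypothetical helper for illustration only.
public static class XmlValueReader
{
    // Returns the integer contents of every <value> child under the element
    // selected by parentXPath (for example "/test/data").
    public static List<int> ReadValues(string xml, string parentXPath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<int> values = new List<int>();
        XmlNode parent = doc.SelectSingleNode(parentXPath);
        if (parent == null)
        {
            return values; // no matching element: return an empty list
        }

        foreach (XmlNode valueNode in parent.SelectNodes("value"))
        {
            // int.Parse throws FormatException if the text is not an integer.
            values.Add(int.Parse(valueNode.InnerText, CultureInfo.InvariantCulture));
        }
        return values;
    }
}
```

Calling ReadValues with "/test/data" and then with "/test/otherdata" keeps the two sets of values separate, which is exactly what the DataSet approach did not do.";
List<int> data = XmlValueReader.ReadValues(xml, "/test/data");
if (data.Count != 3 || data[0] != 5 || data[1] != 4 || data[2] != 6) throw new System.Exception("unexpected data values");
List<int> other = XmlValueReader.ReadValues(xml, "/test/otherdata");
if (other.Count != 2 || other[0] != 10 || other[1] != 12) throw new System.Exception("unexpected otherdata values");
List<int> missing = XmlValueReader.ReadValues(xml, "/test/nothing");
if (missing.Count != 0) throw new System.Exception("expected empty list for missing element");
</test>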

Now, you could shove this information into a correctly laid out DataSet, but why bother? I now see the value in creating a class just to parse a particular kind of XML document. You can set the class up to match the document’s logical structure and expose the nodes you need through its methods and properties.
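Such a class might look something like the sketch below. The names SampleSet and TestDocument are hypothetical and only mirror the sample document above; this is not the code used in wxDesktop, just one possible layout, and it takes the XML as a string to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical wrapper around one <data> or <otherdata> element.
public class SampleSet
{
    private readonly XmlNode _node;

    public SampleSet(XmlNode node)
    {
        if (node == null) throw new ArgumentNullException("node");
        _node = node;
    }

    // Text of the <name> child, or null when the element is missing.
    public string Name
    {
        get
        {
            XmlNode nameNode = _node.SelectSingleNode("name");
            return nameNode == null ? null : nameNode.InnerText;
        }
    }

    // Text of every <value> child, in document order.
    public List<string> Values
    {
        get
        {
            List<string> values = new List<string>();
            foreach (XmlNode valueNode in _node.SelectNodes("value"))
            {
                values.Add(valueNode.InnerText);
            }
            return values;
        }
    }
}

// Hypothetical wrapper around the whole <test> document.
public class TestDocument
{
    private readonly XmlDocument _doc = new XmlDocument();

    public TestDocument(string xml)
    {
        _doc.LoadXml(xml);
    }

    // Throws ArgumentNullException (from SampleSet) if <data> is missing.
    public SampleSet Data
    {
        get { return new SampleSet(_doc.SelectSingleNode("/test/data")); }
    }

    // Throws ArgumentNullException (from SampleSet) if <otherdata> is missing.
    public SampleSet OtherData
    {
        get { return new SampleSet(_doc.SelectSingleNode("/test/otherdata")); }
    }
}
```

With this in place, the calling code reads like the document itself: new TestDocument(xml).Data.Name gives the name of the first set, and OtherData.Values gives the values of the second, with no XPath strings scattered around the rest of the program.";
TestDocument doc = new TestDocument(xml);
if (doc.Data.Name != "Sample set name") throw new Exception("unexpected data name");
if (doc.Data.Values.Count != 2 || doc.Data.Values[0] != "5") throw new Exception("unexpected data values");
if (doc.OtherData.Name != "Second Sample set name") throw new Exception("unexpected otherdata name");
if (doc.OtherData.Values.Count != 3 || doc.OtherData.Values[2] != "14") throw new Exception("unexpected otherdata values");
</test>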
