Introduction
XML, or Extensible Markup Language, is a very popular format for storing and sharing data. In a nutshell, XML stores information in a tree-based text format that both people and computers can read easily. You have probably used XML-based formats directly or indirectly: two popular examples are RSS feeds and XHTML pages.
In this tutorial, I will explain how to read data from an XML file in C#. The .NET Framework provides built-in classes for reading and writing XML, but you still need to know how to use them. Before diving into the code, I want to give a brief overview of XML and cover some terminology, because it will help you understand why the code does some of the things it does.
Here is what a simple XML file looks like:
Code: XML
<?xml version="1.0"?>
<forums>
    <forum name="Web Development">
        <thread>
            <title>ASP/ASP.NET</title>
            <link>http://www.go4expert.com/forumdisplay.php?f=67</link>
        </thread>
        <thread>
            <title>PHP</title>
            <link>http://www.go4expert.com/forumdisplay.php?f=66</link>
        </thread>
        <thread>
            <title>PERL-CGI</title>
            <link>http://www.go4expert.com/forumdisplay.php?f=69</link>
        </thread>
    </forum>
</forums>
Elements can be nested, and the nested content can either be attributes or other elements. With that said, let's revisit the above file and look at two examples of nested content. First, let's look at an example of nested elements:
Code: XML
<thread>
    <title>ASP/ASP.NET</title>
    <link>http://www.go4expert.com/forumdisplay.php?f=67</link>
</thread>
Our example file also contains an example of another type of nested content: attributes. Take a look at the forum element:
Code: XML
<forum name="Web Development">
Basic Approach to Reading XML
The way you read an XML file is similar to using a magnifying glass and looking at each element in the XML file individually. At each element, you determine whether that element has anything valuable to look at, and if it does, you extract the valuable info and move on to the next node.
If you convert the above basic overview into something useful, you will get the following block of code that you can use to read an XML file:
Code: CSHARP
// Namespaces this snippet relies on
using System;
using System.Xml;

XmlTextReader reader = new XmlTextReader("f:\\XML\\MyXML.xml");
while (reader.Read())
{
    XmlNodeType nodeType = reader.NodeType;
    switch (nodeType)
    {
        case XmlNodeType.Element:
            Console.WriteLine("Element name is {0}", reader.Name);
            if (reader.HasAttributes)
            {
                for (int i = 0; i < reader.AttributeCount; i++)
                {
                    reader.MoveToAttribute(i);
                    Console.WriteLine("Attribute is {0} with Value {1}: ", reader.Name, reader.Value);
                }
            }
            break;
        case XmlNodeType.Text:
            Console.WriteLine("Value is: " + reader.Value);
            break;
    }
}
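If you want to try the same loop without creating a file on disk first, here is a minimal sketch that runs it over an XML string held in memory. It swaps XmlTextReader for XmlReader.Create with a StringReader, which is my substitution and not part of the code above, and the class name XmlWalkSketch and the helper Describe are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class XmlWalkSketch
{
    // Hypothetical helper: returns one line per element, attribute, and text node.
    public static List<string> Describe(string xml)
    {
        var lines = new List<string>();
        // Assumption: the XML arrives as a string instead of a file path.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        lines.Add("Element: " + reader.Name);
                        // Visit every attribute of the current element by index.
                        for (int i = 0; i < reader.AttributeCount; i++)
                        {
                            reader.MoveToAttribute(i);
                            lines.Add("Attribute: " + reader.Name + " = " + reader.Value);
                        }
                        break;
                    case XmlNodeType.Text:
                        lines.Add("Text: " + reader.Value);
                        break;
                }
            }
        }
        return lines;
    }

    public static void Main()
    {
        string xml = "<forums><forum name=\"Web Development\"><thread><title>PHP</title></thread></forum></forums>";
        foreach (string line in Describe(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

For the shortened sample in Main, this should print four Element lines (forums, forum, thread, title), one Attribute line for name, and one Text line for PHP, in document order.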
Looking at the Code
Let me go through each line of the code in greater detail:
XmlTextReader reader = new XmlTextReader("f:\\XML\\MyXML.xml");
The XmlTextReader class is what you primarily use to read data from XML files. In the above line of code, I create a reader object of type XmlTextReader, and I pass the path of my XML file to the constructor.
Notice that I am using two \\ backslashes instead of a single \ to designate the path. The reason is that a single \ in a string is interpreted as the start of an escape sequence. By doubling the backslash, you avoid an Unrecognized Escape Sequence error without resorting to less elegant alternatives such as prefixing the string with @.
The final thing to note about the XmlTextReader line is that if you plan on deploying your application to other users with an embedded MyXML.xml file, be sure to check out my tutorial on how to use resources to embed the file in your application. Adapted to this situation, the line becomes:
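To see that the doubled backslashes really produce single ones, here is a small sketch that compares the escaped path with the same path written as a verbatim string literal (a string prefixed with @, which switches off escape processing). The class name PathLiteralSketch and its two constants are hypothetical names used only for this illustration.

```csharp
using System;

public static class PathLiteralSketch
{
    // Regular literal: each \\ in the source becomes one backslash in the string.
    public const string Escaped = "f:\\XML\\MyXML.xml";

    // Verbatim literal: backslashes are taken as-is, so no doubling is needed.
    public const string Verbatim = @"f:\XML\MyXML.xml";

    public static void Main()
    {
        Console.WriteLine(Escaped);             // prints f:\XML\MyXML.xml
        Console.WriteLine(Escaped == Verbatim); // prints True
    }
}
```

Either spelling hands the same 16-character path to the reader, so which one you use is a matter of taste.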
Code: CSHARP
XmlTextReader reader = new XmlTextReader(Assembly.GetExecutingAssembly().GetManifestResourceStream("XMLTest.MyXML.xml"));
Code: CSHARP
while (reader.Read())
{
    // Examine the current node here.
}
The reader.Read() method returns true as long as there is another node to read. Once we reach the end of our XML file, reader.Read() returns false and the loop terminates.
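To make the stopping condition concrete, here is a minimal sketch that records the type of every node Read() visits before it finally returns false. It assumes an in-memory XML string read through XmlReader.Create instead of the file-based XmlTextReader, and the names ReadLoopSketch and NodeTypes are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReadLoopSketch
{
    // Hypothetical helper: lists the node types seen while Read() keeps returning true.
    public static List<string> NodeTypes(string xml)
    {
        var types = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // Each successful Read() positions the reader on the next node.
            while (reader.Read())
            {
                types.Add(reader.NodeType.ToString());
            }
            // Read() returned false, so there is nothing left to visit.
        }
        return types;
    }

    public static void Main()
    {
        // prints Element, Element, Text, EndElement, EndElement
        Console.WriteLine(string.Join(", ", NodeTypes("<a><b>text</b></a>")));
    }
}
```

Notice that closing tags count as nodes too. The tutorial's switch statement simply has no case for them, so they pass by silently.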
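Code: CSHARP
XmlNodeType nodeType = reader.NodeType;
Each time Read() succeeds, the reader is positioned on a single node, and the NodeType property tells you what kind of node it is, such as an element, a piece of text, or a closing tag.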
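Code: CSHARP
switch (nodeType)
{
    case XmlNodeType.Element:
        Console.WriteLine("Element name is {0}", reader.Name);
        if (reader.HasAttributes)
        {
            for (int i = 0; i < reader.AttributeCount; i++)
            {
                reader.MoveToAttribute(i);
                Console.WriteLine("Attribute is {0} with Value {1}: ", reader.Name, reader.Value);
            }
        }
        break;
    case XmlNodeType.Text:
        Console.WriteLine("Value is: " + reader.Value);
        break;
}
The switch statement acts on the two node types this example cares about: elements and text. Every other node type is ignored.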
Code: CSHARP
Console.WriteLine("Element name is {0}", reader.Name);
When the node is an element, its Name property holds the tag name, such as forum or title.
Code: CSHARP
if (reader.HasAttributes)
{
    for (int i = 0; i < reader.AttributeCount; i++)
    {
        reader.MoveToAttribute(i);
        Console.WriteLine("Attribute is {0} with Value {1}: ", reader.Name, reader.Value);
    }
}
An element can carry any number of attributes, so the code first checks HasAttributes and then loops from 0 to AttributeCount, visiting each attribute by its index and printing it inside the loop.
One quirk is that it's not good enough to just know the index position of your next attribute. You need to actually move to that particular attribute by calling the reader object's MoveToAttribute method. To return to the analogy I used earlier, you physically move your magnifying glass to the next node. Once you have moved to the new location, you can access the Name and Value properties like you did before.
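Because the reader really does move, it stays on the last attribute once the loop is done. If you need the element again afterwards, you can move back with MoveToElement. Here is a minimal sketch of that idea, using MoveToNextAttribute in place of the index-based loop. The names AttributeSketch and ReadAttributes are hypothetical, the XML is an in-memory string, and the id attribute is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class AttributeSketch
{
    // Hypothetical helper: collects the attributes of the element the reader is on,
    // then puts the reader back on that element.
    public static Dictionary<string, string> ReadAttributes(XmlReader reader)
    {
        var attributes = new Dictionary<string, string>();
        if (reader.HasAttributes)
        {
            // MoveToNextAttribute returns false once every attribute has been visited.
            while (reader.MoveToNextAttribute())
            {
                attributes[reader.Name] = reader.Value;
            }
            // Return the "magnifying glass" to the element that owns the attributes.
            reader.MoveToElement();
        }
        return attributes;
    }

    public static void Main()
    {
        string xml = "<forum name=\"Web Development\" id=\"7\"/>";
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // skip ahead to the forum element
            foreach (KeyValuePair<string, string> pair in ReadAttributes(reader))
            {
                Console.WriteLine(pair.Key + " = " + pair.Value);
            }
            Console.WriteLine("Back on: " + reader.Name); // prints Back on: forum
        }
    }
}
```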
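Code: CSHARP
case XmlNodeType.Text:
    Console.WriteLine("Value is: " + reader.Value);
    break;
When the reader is on a text node, such as the PHP between the opening and closing title tags, the Value property holds that text.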
Quick Review / Alternate Approach
While it looked like there was a lot of code, what the code actually does is fairly simple. The most important thing to keep in mind is that the above approach loops through each node in your XML file, one at a time. You cannot, at least in the implementation I presented, look at a previous or future node from your current location. That explains why, when you wanted to access an attribute's value, you first had to explicitly move to that attribute.
The code I have provided so far is pretty generic. There may be situations where you want to access only certain elements from your XML file. The following code shows how you can access values from only the elements whose names you specify:
Code: CSHARP
// Namespaces this snippet relies on
using System;
using System.Xml;

XmlTextReader reader = new XmlTextReader("f:\\XML\\MyXML.xml");
while (reader.Read())
{
    XmlNodeType nodeType = reader.NodeType;
    if (nodeType == XmlNodeType.Element)
    {
        switch (reader.Name)
        {
            case "title":
                Console.WriteLine("TITLE: " + reader.ReadString());
                break;
            case "link":
                Console.WriteLine("LINK: " + reader.ReadString());
                break;
            case "forum":
                // The forum element's first (and only) attribute is its name.
                reader.MoveToAttribute(0);
                Console.WriteLine("FORUM: " + reader.Value);
                break;
        }
    }
}
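If you would rather keep the values than print them, the same idea can be sketched so that it collects each thread into a list. This version is my own variation and not the code above: it reads from an in-memory string through XmlReader.Create, it uses ReadElementContentAsString instead of ReadString, and the ForumThread class and ReadThreads helper are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical container for one <thread> entry.
public class ForumThread
{
    public string Title;
    public string Link;
}

public static class ThreadListSketch
{
    public static List<ForumThread> ReadThreads(string xml)
    {
        var threads = new List<ForumThread>();
        ForumThread current = null;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (!reader.EOF)
            {
                bool isElement = reader.NodeType == XmlNodeType.Element;
                if (isElement && reader.Name == "thread")
                {
                    current = new ForumThread();
                    threads.Add(current);
                    reader.Read();
                }
                else if (isElement && reader.Name == "title" && current != null)
                {
                    // ReadElementContentAsString already advances past </title>,
                    // so Read() must not be called again in this branch.
                    current.Title = reader.ReadElementContentAsString();
                }
                else if (isElement && reader.Name == "link" && current != null)
                {
                    current.Link = reader.ReadElementContentAsString();
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return threads;
    }

    public static void Main()
    {
        string xml =
            "<forums><forum name=\"Web Development\">" +
            "<thread><title>PHP</title><link>http://www.go4expert.com/forumdisplay.php?f=66</link></thread>" +
            "</forum></forums>";
        foreach (ForumThread thread in ReadThreads(xml))
        {
            Console.WriteLine(thread.Title + " -> " + thread.Link);
        }
    }
}
```

The loop tests reader.EOF instead of calling Read() on every pass because ReadElementContentAsString leaves the reader on the node after the closing tag. Calling Read() again at that point could skip an element that follows immediately with no whitespace in between.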
