At my latest position, we work with data in XML format. A Lot! (Jump to the code. Download all the code.)

Our results are often offered to outside clients from some of our web services. In most cases the data is provided in XML format (we are beginning to convert our output to JSON, but that’s another story).



The .NET DataSet object offers a “GetXml()” method which is a very useful tool. It converts your DataSet into its XML equivalent. For DataSets with a single table, this is a trivial exercise. Just populate your DataSet, call the method and voila! You have some XML to work with.

(You can also generate nested XML using Transact-SQL. See my discussion of “FOR XML EXPLICIT” (Coming Soon!))

However, in many (most?) cases, one never works with information from a single table. Data is likely stored in a hierarchical arrangement of related tables sometimes with multiple inter-connected relations. The classic Customer-Orders-OrderItems example comes to mind.

In this example we have three tables with a cascading one-to-many relations. Using Transact-SQL and the DataAdapter object, we can populate a DataSet with these tables using a single call to a stored procedure. But getting the data from our DataSet into a useful XML format takes a few steps.

In order to generate a heirarchical/nested XML from a dataset, you need to define relations between your tables. You also need to specify that the relations are nested. With this in place your call to GetXml() will produce a nested XML. However, the result is not “clean” XML. There is not outer node to contain your results. And child relations are not grouped.

Get to the code, already!! (I have provided SQL scripts to generate the database I’m working with as well as all the code for these examples here )


1. Get the raw data. In this example. I make a call to a stored procedure which gets my data and I populate a DataSet using a SqlDataAdapter.

// This a connection string to our database (located in App.Config)
string connStr = ConfigurationManager.ConnectionStrings["default"].ConnectionString;

// Create our DB connection.
SqlConnection conn = new SqlConnection(connStr);

// This stored procedure returns four tables (Customer, Orders,
// OrderItems and CustNotes) [SEE NOTE AT TOP OF FILE]
SqlCommand cmd = new SqlCommand("[sp_GetDataForXml]", conn);
DataSet ds = new DataSet();

// Use a DataAdapter to fill the dataset with four tables at once
SqlDataAdapter a = new SqlDataAdapter(cmd);
a.Fill(ds);

// Set the names of the tables are returned from our stored proc
// (makes it easier to deal with the tables later)
DataTable tblCustomers = ds.Tables[0];
tblCustomers.TableName = "Customer";
DataTable tblOrders = ds.Tables[1];
tblOrders.TableName = "Order";
DataTable tblOrderItems = ds.Tables[2];
tblOrderItems.TableName = "OrderItem";

[NOTE: If you call GetXml() at this point you get XML that is not nested, because we have not created relations between that tables in our dataset, yet. The output from these code snippets is available in a ZIP file: NestedXMLSampleOutput in the file named NoNesting.XML]

2. Create the relations between the tables. I won’t go into detail in creating the relations here. You can see the complete source in the solution files NestedXML. Here is how you would create a relation between the Customers and Orders table.

// This relation will yield an "Orders" node in each Customer node
DataRelation relCustOrders = new DataRelation(
    "relCustOrders",                        // relation name
    tblCustomers.Columns["CustomerId"],     // parent column
    tblOrders.Columns["CustomerId"]); // child column

3. Set the Nested property. This is what allows the XML to be rendered in a useful format.

// Set the "Nested" property on all the relations we created
relCustOrders.Nested = true;

4. Now add the relation to the DataSet.

// Add the relation to the dataset
ds.Relations.Add(relCustOrders);

5. Call the GetXml() method. This will render a useful (but not completely formatted XML string).

    string xml = ds.GetXml();

So far, so good. We have imported our data tables, created relations and exported them into semi-useful XML. [Refer to the sample output file named NestedNoWrapper.XML.]

Note that there are several “Order” nodes within each customer, and that there are several “OrderItem” nodes within each “Order” node. Although, this is perfectly acceptable, I would prefer to see the Order nodes “wrapped” in a single node under Customer, and all the OrderItem nodes wrapped up in a single node under the Order nodes. [Refer to the sample output file named NestedWithWrappers.XML]

This will present our next challenge. Each level of nodes represents a data table. This implies that there is a “Orders” table in addition to an “Order” table, and there is a “Items” table in addition to the OrderItem table.

Since these tables are not in our data set, we will need to create them ourselves and insert them into our DataSet. These tables are not complicated. In fact, they only contain one data colum: the ID column that previously bridged the parent and child tables. This table is simply an intermediary. This is the definition of the “Orders” wrapper data table:

// Will create a "Orders" node under each Customer
DataTable tblOrderWrapper = new DataTable("Orders");
tblOrderWrapper.Columns.Add(
    "CustomerId",
    tblCustomers.Columns["CustomerId"].DataType);

// Populate our "Orders" wrapper tables.
foreach (DataRow currCust in tblCustomers.Rows)
{
    tblOrderWrapper.Rows.Add(currCust["CustomerId"]);
}

// Add our wrapper tables to the dataset
ds.Tables.Add(tblOrderWrapper);

Once these table are in place, we now need to re-create the data relations to insert our wrappers into into the hierarchy. This creates the new Customer-to-OrderWrapper relation and the new OrderWrapper-to-Order relations:

// This relation will yield an "Orders" node in each Customer node
DataRelation relCustOrders = new DataRelation(
    "relCustOrders",                        // relation name
    tblCustomers.Columns["CustomerId"],     // parent column
    tblOrderWrapper.Columns["CustomerId"]); // child column

// This will group all Orders inside the "Orders" node
DataRelation relOrdersOrder = new DataRelation(
    "relOrdersOrder",                       // relation name
    tblOrderWrapper.Columns["CustomerId"],  // parent column
    tblOrders.Columns["CustomerId"]);       // child column

// Finally add the relations to the DataSet
ds.Relations.Add(relCustOrders);
ds.Relations.Add(relOrdersOrder);

It is a similar process to create the tables/relations for the new Order to OrderItem hierarchy (the full source is available in the Solution file located here: NestedXML. Once you have the extra tables and relations, making a call to GetXml() will result in a much more “readable” XML package.

Are these extra steps worth the effort? That’s up to you to decide. In my case, I’m preparing XML for clients to consume in their own applications. So presenting a well organized XML file simply provides a more “professional” product. In the end, the decision is yours.


12 Replies to “Generating Hierarchical (Nested) XML from a DataSet”

  1. it’s gud, but i m getting error “Object reference not set to an instance of an object” in the line “DataRelation POItems = dXML.Relations.Add(“POItems”, dXML.Tables[“PO”].Columns[“POID”], dXML.Tables[“ITEM”].Columns[“POID”]);”

    1. I think you have me confused with someone else.

      The code you referenced is not found in my project (link at the top of the article).

      But thanks for visiting my site!

  2. Brilliant article, thank you.
    Just something I picked up and I might be wrong here, but you never added the relCustOrders to the dataset on the sample code provided here.

    1. Peter, you are correct!

      I wrote out the commands to create the DataRelation but I did not show the code that adds the Relation to the DataSet. I’ve added these lines to the sample code.

      The downloadable code project (top of the page) did have the relations added to the DataSet; it was only the code on the page that was incorrect.

      Thanks for you sharp eye!

  3. Thanks for publishing this detailed example. It saved me a lot of fumbling around … so have a beer on me.
    🙂

    1. Shivani,
      Thanks for your question. Unfortunately, I am not able to suggest a good solution for this problem. I will post an update if I come across a solution for this. If you develop a nice solution, perhaps you would be willing to share the knowledge.
      -tomas

      1. Thank you Tomas for giving your time. What I found is that we have to implicitly create the datatable for all the relationships present in the dataset.

        For example,

        DataRelation customerOrdersRelation =
        customerOrders.Relations.Add(“CustOrders”,
        customerOrders.Tables[“Customers”].Columns[“CustomerID”],
        customerOrders.Tables[“Orders”].Columns[“CustomerID”]);

        DataRelation orderDetailRelation =
        customerOrders.Relations.Add(“OrderDetail”,
        customerOrders.Tables[“Orders”].Columns[“OrderID”],
        customerOrders.Tables[“OrderDetails”].Columns[“OrderID”], false);

        DataRelation orderProductRelation =
        customerOrders.Relations.Add(“OrderProducts”,
        customerOrders.Tables[“Products”].Columns[“ProductID”],
        customerOrders.Tables[“OrderDetails”].Columns[“ProductID”]);

        foreach (DataRow custRow in customerOrders.Tables[“Customers”].Rows)
        {
        Console.WriteLine(“Customer ID: ” + custRow[“CustomerID”]);

        foreach (DataRow orderRow in custRow.GetChildRows(customerOrdersRelation))
        {
        Console.WriteLine(” Order ID: ” + orderRow[“OrderID”]);
        Console.WriteLine(“tOrder Date: ” + orderRow[“OrderDate”]);

        foreach (DataRow detailRow in orderRow.GetChildRows(orderDetailRelation))
        {
        Console.WriteLine(“t Product: ” +
        detailRow.GetParentRow(orderProductRelation)[“ProductName”]);
        Console.WriteLine(“t Quantity: ” + detailRow[“Quantity”]);
        }
        }
        }

  4. Thank you Tomas for giving your time. What I found is that we have to implicitly create the datatable for all the relationships present in the dataset.

    For example,

    DataRelation customerOrdersRelation =
    customerOrders.Relations.Add(“CustOrders”,
    customerOrders.Tables[“Customers”].Columns[“CustomerID”],
    customerOrders.Tables[“Orders”].Columns[“CustomerID”]);

    DataRelation orderDetailRelation =
    customerOrders.Relations.Add(“OrderDetail”,
    customerOrders.Tables[“Orders”].Columns[“OrderID”],
    customerOrders.Tables[“OrderDetails”].Columns[“OrderID”], false);

    DataRelation orderProductRelation =
    customerOrders.Relations.Add(“OrderProducts”,
    customerOrders.Tables[“Products”].Columns[“ProductID”],
    customerOrders.Tables[“OrderDetails”].Columns[“ProductID”]);

    foreach (DataRow custRow in customerOrders.Tables[“Customers”].Rows)
    {
    Console.WriteLine(“Customer ID: ” + custRow[“CustomerID”]);

    foreach (DataRow orderRow in custRow.GetChildRows(customerOrdersRelation))
    {
    Console.WriteLine(” Order ID: ” + orderRow[“OrderID”]);
    Console.WriteLine(“tOrder Date: ” + orderRow[“OrderDate”]);

    foreach (DataRow detailRow in orderRow.GetChildRows(orderDetailRelation))
    {
    Console.WriteLine(“t Product: ” +
    detailRow.GetParentRow(orderProductRelation)[“ProductName”]);
    Console.WriteLine(“t Quantity: ” + detailRow[“Quantity”]);
    }
    }
    }

Leave a Reply

Your email address will not be published. Required fields are marked *