Stuff I learned today – hierarchyid


I’m in training this week. Since I was restricted to a local class, I decided to take one focused on development, rather than dba-related topics, and get some exposure to facets of SQL Server I don’t normally deal with on a day-to-day basis. And I figured I’d share some of my newly garnered knowledge with you.

Today’s nugget of new knowledge: the hierarchyID data type. This is a new data type in SQL 2008 that addresses the challenges of storing parent/child hierarchical relationships inside a table. Previously, developers had to use CTEs and self-joins to handle this kind of scenario. A classic example is the Employee table, where Employees have Managers who are also Employees, seen below.

USE AdventureWorks;
GO

SELECT EmployeeID, LastName, FirstName, ManagerID
INTO EmployeeDemo
FROM HumanResources.Employee
JOIN Person.Contact ON Employee.ContactID = Contact.ContactID;
GO

SELECT
     Mgr.EmployeeID AS ManagerID, Mgr.FirstName + ' ' + Mgr.LastName AS ManagerName,
     Emp.EmployeeID AS EmployeeID, Emp.FirstName + ' ' + Emp.LastName AS EmployeeName
FROM EmployeeDemo AS Emp
LEFT JOIN EmployeeDemo AS Mgr
ON Emp.ManagerID = Mgr.EmployeeID
ORDER BY ManagerID, EmployeeID;

Another way to look at it is via a self-joining CTE. This will allow you to get the hierarchical level of the employee.

 WITH Organization(EmployeeId, LastName, Manager, HierarchyOrder)
 AS
 (
    SELECT emp.EmployeeId, emp.LastName, emp.LastName, 1 AS HierarchyOrder
    FROM EmployeeDemo AS emp
      WHERE emp.ManagerId is Null
    UNION ALL
    SELECT emp.EmployeeId, emp.LastName, Parent.LastName, HierarchyOrder + 1
    FROM EmployeeDemo AS emp
           INNER JOIN Organization AS Parent
                 ON emp.ManagerId = parent.EmployeeId
 )
 SELECT *
 From Organization
GO

That’s great and all, and pretty slick looking, but you could start to see some performance hit with a self-referencing view like that.
So let’s say I want to take my existing EmployeeDemo table and use it to create a hierarchical table, how would I do that? I’d start by creating a new table that contains a hierarchyid column.

CREATE TABLE OrgChart
(
  OrgNode hierarchyid,
  EmployeeID int,
  LastName varchar(50),
  FirstName varchar(50),
  ManagerID int
);
GO

I can then use a slightly modified version of my previous CTE to populate it.

WITH Organization(path, EmployeeID)
AS (
-- This section provides the value for the root of the hierarchy
SELECT hierarchyid::GetRoot() AS OrgNode, EmployeeID
FROM EmployeeDemo AS C
WHERE ManagerID IS NULL
UNION ALL
-- This section provides values for all nodes except the root
SELECT
CAST(p.path.ToString() + CAST(ROW_NUMBER() OVER (PARTITION BY ManagerID ORDER BY ManagerID) AS varchar(30)) + '/' AS hierarchyid),
C.EmployeeID
FROM EmployeeDemo AS C
JOIN Organization AS p
   ON C.ManagerID = P.EmployeeID
)
INSERT INTO OrgChart
SELECT path, o.EmployeeID, LastName, FirstName, ManagerID
FROM Organization o JOIN EmployeeDemo e ON o.EmployeeID = e.EmployeeID

SELECT * FROM OrgChart;
GO

So now what do I do with this new column? Well, having this new column makes the CTE unnecessary. You can traverse the hierarchy branches using built-in functions. Suppose I want to see who works under Peter Krebs, employee 21.

DECLARE @Manager hierarchyid

SELECT @Manager = OrgNode
FROM OrgChart
WHERE EmployeeID = 21 ;

SELECT *
FROM OrgChart
WHERE OrgNode.IsDescendantOf(@Manager) = 1
order by OrgNode;

With that query we can see that Cynthia Randall reports to Peter, and Andy Ruth reports to Cynthia, etc.

I’m definitely going to have to play around with this new datatype and all of the associated functions to get my brain wrapped around it better, but it looks pretty cool.

Some resources for further reading on the hierarchyid data type:
Populating a Table with Existing Hierarchical Data

Creating and Managing Data in a Hierarchical Table

Using hierarchyid Data Types (Database Engine)

Also recommended:

Leave a comment

Your email address will not be published.