I’m in training this week. Since I was restricted to a local class, I decided to take one focused on development, rather than dba-related topics, and get some exposure to facets of SQL Server I don’t normally deal with on a day-to-day basis. And I figured I’d share some of my newly garnered knowledge with you.
Today’s nugget of new knowledge: the hierarchyID data type. This is a new data type in SQL 2008 that addresses the challenges of storing parent/child hierarchical relationships inside a table. Previously, developers had to use CTEs and self-joins to handle this kind of scenario. A classic example is the Employee table, where Employees have Managers who are also Employees, seen below.
USE AdventureWorks; GO SELECT EmployeeID, LastName, FirstName, ManagerID INTO EmployeeDemo FROM HumanResources.Employee JOIN Person.Contact ON Employee.ContactID = Contact.ContactID; GO SELECT Mgr.EmployeeID AS ManagerID, Mgr.FirstName + ' ' + Mgr.LastName AS ManagerName, Emp.EmployeeID AS EmployeeID, Emp.FirstName + ' ' + Emp.LastName AS EmployeeName FROM EmployeeDemo AS Emp LEFT JOIN EmployeeDemo AS Mgr ON Emp.ManagerID = Mgr.EmployeeID ORDER BY ManagerID, EmployeeID;
Another way to look at it is via a self-joining CTE. This will allow you to get the hierarchical level of the employee.
WITH Organization(EmployeeId, LastName, Manager, HierarchyOrder) AS ( SELECT emp.EmployeeId, emp.LastName, emp.LastName, 1 AS HierarchyOrder FROM EmployeeDemo AS emp WHERE emp.ManagerId is Null UNION ALL SELECT emp.EmployeeId, emp.LastName, Parent.LastName, HierarchyOrder + 1 FROM EmployeeDemo AS emp INNER JOIN Organization AS Parent ON emp.ManagerId = parent.EmployeeId ) SELECT * From Organization GO
That’s great and all, and pretty slick looking, but you could start to see some performance hit with a self-referencing view like that.
So let’s say I want to take my existing EmployeeDemo table and use it to create a hierarchical table, how would I do that? I’d start by creating a new table that contains a hierarchyid column.
CREATE TABLE OrgChart ( OrgNode hierarchyid, EmployeeID int, LastName varchar(50), FirstName varchar(50), ManagerID int ); GO
I can then use a slightly modified version of my previous CTE to populate it.
WITH Organization(path, EmployeeID) AS ( -- This section provides the value for the root of the hierarchy SELECT hierarchyid::GetRoot() AS OrgNode, EmployeeID FROM EmployeeDemo AS C WHERE ManagerID IS NULL UNION ALL -- This section provides values for all nodes except the root SELECT CAST(p.path.ToString() + CAST(ROW_NUMBER() OVER (PARTITION BY ManagerID ORDER BY ManagerID) AS varchar(30)) + '/' AS hierarchyid), C.EmployeeID FROM EmployeeDemo AS C JOIN Organization AS p ON C.ManagerID = P.EmployeeID ) INSERT INTO OrgChart SELECT path, o.EmployeeID, LastName, FirstName, ManagerID FROM Organization o JOIN EmployeeDemo e ON o.EmployeeID = e.EmployeeID SELECT * FROM OrgChart; GO
So now what do I do with this new column? Well, having this new column makes the CTE unnecessary. You can traverse the hierarchy branches using built-in functions. Suppose I want to see who works under Peter Krebs, employee 21.
DECLARE @Manager hierarchyid SELECT @Manager = OrgNode FROM OrgChart WHERE EmployeeID = 21 ; SELECT * FROM OrgChart WHERE OrgNode.IsDescendantOf(@Manager) = 1 order by OrgNode;
With that query we can see that Cynthia Randall reports to Peter, and Andy Ruth reports to Cynthia, etc.
I’m definitely going to have to play around with this new datatype and all of the associated functions to get my brain wrapped around it better, but it looks pretty cool.
Some resources for further reading on the hierarchyid data type:
Populating a Table with Existing Hierarchical Data