An Extensive Examination of Data Structures
Part 3: Binary Trees and BSTs
Scott Mitchell
4GuysFromRolla.com
January 2004
Summary: This article, the third in a six-part series on data structures in the .NET Framework, looks at a common data structure that is not included in the .NET Framework Base Class Library: binary trees. Whereas arrays arrange data linearly, binary trees can be envisioned as storing data in two dimensions. A special kind of binary tree, called a binary search tree, or BST, allows for a much more optimized search time than with arrays. (31 printed pages)
Scott Mitchell continues his study of data structures, focusing on binary trees and BSTs, a common data structure that is not include in the .NET Framework Class Library.
Download the BinaryTrees.msi sample file.
Contents
Introduction
Arranging Data in a Tree
Understanding Binary Trees
Improving the Search Time with Binary Search Trees (BSTs)
Binary Search Trees in the Real-World
In Part 1 of this series, we looked at what data structures are, how their performance can be evaluated, and how these performance considerations play into choosing which data structure to utilize for a particular algorithm. In addition to reviewing the basics of data structures and their analysis, we also looked at the most commonly used data structure—the array—and its relative, the ArrayList. In Part 2 we looked at the cousins of the ArrayList—the Stack and the Queue, which store their data like an ArrayList, but limit the means by which their contained data can be accessed. In Part 2 we also looked at the Hashtable, which is essentially an array that is indexed by some arbitrary object as opposed to by an ordinal value.
The ArrayList, Stack, Queue, and Hashtable all use an underlying array as the means by which their data is stored. This means that, under the covers, these four data structures are bound by the limitations imposed by an array. Recall from Part 1 that an array is stored linearly in memory, requires explicit resizing when the array's capacity it reached, and suffers from linear searching time.
In this third installment of the article series, we will examine a new data structure—the binary tree. As we'll see, binary trees store data in a non-linear fashion. After discussing the properties of binary trees, we'll look at a more specific type of binary tree—the binary search tree (BST). A BST imposes certain rules on how the items of the tree are arranged. These rules provide BSTs with a sub-linear search time, making them more efficient for searching than arrays.
If you've ever looked at a genealogy table, or at the chain of command in a corporation, you've seen data arranged in a tree. A tree is composed of a collection of nodes, where each node has some associated data and a set of children. A node's children are those nodes that appear immediately beneath the node itself. A node's parent is the node immediately above it. A tree's root is the single node that contains no parent.
Figure 1 shows an example of the chain of command in a fictitious company.
Figure 1. Tree view of a chain of command in a fictitious company
In this example, the tree's root is Bob Smith, CEO. This node is the root because it has no parent. The Bob Smith node has one child, Tina Jones, President, whose parent is Bob Smith. The Tina Jones node has three children:
·???????????????????? Jisun Lee, CIO
·???????????????????? Frank Mitchell, CFO
·???????????????????? Davis Johnson, VP of Sales
Each of these nodes' parent is the Tina Jones node. The Jisun Lee node has two children:
·???????????????????? Tony Yee
·???????????????????? Sam Maher
The Frank Mitchell node has one child:
·???????????????????? Darren Kulton
The Davis Johnson node has three children:
·???????????????????? Todd Brown
·???????????????????? Jimmy Wong
·???????????????????? Sarah Yates
All trees exhibit the following properties:
·???????????????????? There is precisely one root.
·???????????????????? All nodes except the root have precisely one parent.
·???????????????????? There are no cycles. That is, starting at any given node, there is not some path that can take you back to the starting node. The first two properties—that there exists one root and that all nodes save the root have one parent—guarantee that no cycles exist.
Trees are useful for arranging data in a hierarchy. As we will discuss later on in this article, the time to search for an item can be drastically reduced by intelligently arranging the hierarchy. Before we can arrive at that topic, though, we need to first discuss a special kind of tree, the binary tree.
Note???Throughout this article we will be looking at numerous new terms and definitions. These definitions, along with being introduced throughout the text, are listed in Appendix A.
A binary tree is a special kind of tree, in which all nodes have at most two children. For a given node in a binary tree, the first child is referred to as the left child, while the second child is referred to as the right child. Figure 2 depicts two binary trees.
Figure 2. Illustration of two binary trees
Binary tree (a) has 8 nodes, with node 1 as its root. Node 1's left child is node 2; node 1's right child is node 3. Notice that a node doesn't need to have both a left child and right child. In binary tree (a), node 4, for example, has only a right child. Furthermore, a node can have no children. In binary tree (b), nodes 4, 5, 6, and 7 all have no children.
Nodes that have no children are referred to as leaf nodes. Nodes that have one or two children are referred to as internal nodes. Using these new definitions, the leaf nodes in binary tree (a) are nodes 6 and 8, and the internal nodes are nodes 1, 2, 3, 4, 5, and 7.
Unfortunately, the .NET Framework does not contain a binary tree class, so in order to better understand binary trees, let's take a moment to create our own binary tree class.
The First Step: Creating a Node Class
The first step in designing our binary tree class is to create a Node class. The Node class abstractly represents a node in the tree. Realize that each node in a binary tree contains two things:
1.????????????????? Data
2.????????????????? 0, 1, or 2 children
The data stored in the nodes depends on what it is that you need to store. Just like arrays can be created to hold instances of integers, strings, or other classes, nodes, too, are designed to hold some instance of a class. To make the Node class as general as possible, we will have the node store data of type object, just like the ArrayList, Queue, Stack, and Hashtable use an underlying array of objects to store data of any type.
Note???With C# Version 2.0 you will be able to use Generics to create a strongly typed Node class, rather than having to use a Node class that holds objects. For more information on using Generics, be sure to read Juval Lowy's article An Introduction to C# Generics.
The following is the code for the Node class:
public class Node
{
?? private object data;
?? private Node left, right;
?? #region Constructors
?? public Node() : this(null) {}
?? public Node(object data) : this(data, null, null) {}
?? public Node(object data, Node left, Node right)
?? {
????? this.data = data;
????? this.left = left;
????? this.right = right;
?? }
?? #endregion
?? #region Public Properties
?? public object Value
?? {
????? get
????? {
???????? return data;
????? }
????? set
????? {
???????? data = value;
????? }
?? }
?? public Node Left
?? {
????? get
????? {
???????? return left;
????? }
????? set
????? {
???????? left = value;
????? }
?? }
?? public Node Right
?? {
????? get
????? {
???????? return right;
????? }
????? set
????? {
???????? right = value;
????? }
?? }
?? #endregion
}
Note that the Node class has three private member variables:
1.????????????????? data, of type object. This member variable contains the data stored in the node.
2.????????????????? left, of type Node. This member variable contains a reference to the node's left child.
3.????????????????? right, of type Node. This member variable contains a reference to the node's right child.
4.????????????????? The remainder of the class contains the constructors and the public properties, which provide access to these three member variables. Notice that the left and right member variables are of type Node. That is, the Node class has member variables that are Node class instances themselves.
转 载:http://www.qqread.com/dotnet/q952112002.html
更多内容请看.NET移动与嵌入式技术、.NET开发手册专题,或进入讨论组讨论。
较新的文章:Data Structures with .NET - Part 2: The Queue, Stack, and Hashtable
相关专题
- .NET移动与嵌入式技术 (5950篇文章)
- .NET开发手册 (5652篇文章)
- vb.net入门——OpenFileDialog 组件的使用 (75次浏览)
- vb.net入门——FontDialog 组件的使用 (52次浏览)
- vb.net入门——FolderBrowserDialog 组件的使 (45次浏览)
- vb.net入门——ColorDialog 组件的使用 (41次浏览)
- 用vb.net创建一个鼠标绘图程序 (39次浏览)
- vb.net入门——SaveFileDialog 组件的使用 (38次浏览)
- 在vb.net中用ado.net连接Access (25次浏览)
- ASP.NET缓存:方法分析和实践示例 (23次浏览)
- asp.net动态设置WebService引用 (22次浏览)
- VB.NET关于加密算法 (18次浏览)



