Skip to the content.

Analysis of Binary Search Tree

For analysis of time complexity for operations on BST we explore the following questions.

Search is the critical first step any operation including both insertion and deletion on a binary search tree. A successful or unsuccessful search depends on the length of the tree path from the root to the farthest leaf node. A right-skewed or a left-skewed BSTs exhibit pathological cases for insertions and deletions. These cases occur as follows:

A completely skewed tree requires a time of O(n) for any operation. On the other hand, an operation on a balanced binary search tree takes O(log n). Therefore,

When we have a random input source, all distributions of the values are equally likely. So, it is important to analyze the average case time complexity. The average case analysis of BST depends on the average path length traversed on a BST for performing an operation. We begin with an example. Consider the tree shown below. The total internal path length of the tree is 15.

The first problem we need to address is:

We capture such a dynamic situation by considering the possibilities of insertions and deletions happening in any possible order. We use the following assumptions:

Now consider ith insertion for a fixed i for i = 0, 1, 2, …, n-1. The configuration of BST at this point is illustrated in the figure below:

Consider how the subsequent insertion occurs.

So the average number of probes for i+1the insertion would be:

The average value is obtained by letting i to be any of the n elements. So, the average value is given by the expression:

In the above equation, we use the expression found earlier to replace P(n,i). Since the pair of sums for P(i) and P(n,i) are symmetric, the simplified expression for P(n) becomes:

Now apply induction to prove . The induction requires a base case, an induction hypothesis and a proof for the induction step.

Base case: We know P(1) = 1, and evaluation of the expression for n=1 gives us:

Induction hypothesis: Assume

for any i less than n.

Proof of induction step: With the induction hypothesis in mind we apply to terms of the sum in the expression for P(n) and simplify it further to obtain:

Consider the sum on the right-hand side of previous inequality and apply simplification:

Replacing the sum by the final expression, we finally get

Therefore, the average case complexity of an operation on BST is of the order of log n

Back to Index