TL;DR In this blog post, we will explore examples illustrating why alternate variations of Kosaraju-Sharir’s algorithm for computing strongly connected components fail.
Prerequisites
I assume that reader is already familiar with the below topics:
- Strongly connected component problem
- Rough idea of Kosaraju Sharir Algorithm1.
Finally, a handy reference to clrs book as we will be using the conceptual framework from this book.
Terminology
- Discovery time of a vertex: When a vertex is first discovered. Denoted as
v.discovery
- finish time of a vertex: When the vertex is fully explored and recursive dfs function for the vertex returns. Denoted as
v.finish
Conventions used:
- An edge being explored is in bold red color and an edge in blue color is part of dfs tree.
- Time is monotonically increasing and it starts from 1.
discovery_time/finish_time
times appear below the node.- Nodes are that fully processed are in blue color and currently being explored are in red color.
Consider an example of running dfs from a vertex a
Algorithm
- Run dfs on the given graph
G
to compute finish timev.finish
for each vertex. - Compute the transpose $G^T$ of the graph.
- Process the vertices in the decreasing order of finish time as computed in step 1 and run dfs on $G^T$. Dfs trees, thus generated are strongly connected components. When determining a tree, ignore the nodes that are already finished as part of previous trees.2.
To understand why the above algorithm works, refer to clrs book as it does a better job than I could possibly ever do.
Alternate Versions
Do other variations in step 3 work? Like instead of choosing finish_time
we choose discovery_time
and run dfs in step 3 on original graph. In total, there are 8 possibilities (1 correct and 7 incorrect)3. By using examples, we will see that remaining 7 options don’t work.
1. Discover time ascending order and Original graph
- Here, we start dfs from node
a
because its discovery time is minimum and reachb
viaa->b
edge and wrongly computea,b
must be belong to same component. - Wrongly computed components =
[{a,b}]
- Actual components =
[{a}, {b}]
2. Discover time ascending order and Transpose graph
- Here, we start dfs from node
b
because its discovery time is minumum. - Wrongly computed components =
[{b, a}]
- Actual components =
[{a}, {b}]
3. Discovery time descending order and Original graph
- Here, we start dfs from node
c
because its discovery time is maximum. - Wrongly computed components =
[{c,b}, {a}]
- Actual components =
[{a}, {b}, {c}]
4. Discovery time descending order and Transpose Graph
- Here, we start dfs from node
b
because its discovery time is maximum. - Wrongly computed components =
[{b,a}]
- Actual components =
[{a}, {b}]
5. Finish time ascending order and Original graph
- Here, we start dfs from node
c
because its finish time is minimum. - Wrongly computed components =
[{c,a,b}]
- Actual components =
[{a,c}, {b}]
6. Finish time ascending order and Transpose graph
- Here, we start dfs from node
b
because its finish time is minimum - Wrongly computed components =
[{b,a}]
- Actual components =
[{a}, {b}]
7. Finish time descending order and original graph
- Here, we start dfs from node
a
because its finish time is maximum. - Wrongly computed components =
[{a,b}]
- Actual components =
[{a}, {b}]
8. Finish time descending order and transpose graph
This version is correct and refer to clrs for concrete proof.
-
The reason for not using wikipedia version of Kosaraju algorithm is because is I found it diffcult to understand. ↩︎
-
This is same as many online implementations adding the explored node at the beginning of queue and processing the queue from head. ↩︎
-
discovery or finish time = 2, original or transpose graph = 2, ascending or descending order = 2. Total 8 combinations. ↩︎