Adventures in Data Structures Vol 15: Graphs Algorithms!
Finding Reachable Nodes
Supposing that we have a graph and a starting node, what nodes are reachable?
The pseudocode for the single source reachability algorithm is
- Create a set of reachable vertices, initially empty, called $r$.
- Create a container for vertices known to be reachable and call this $c$.
- Add start vertex to container $c$
- While $c$ is not empty:
- remove first entry from $c$ and assign it to $v$
- if $v$ is not already in $r$, then add $v$ to $r$ and add the neighbors of $v$ which are not already reachable to $c$ (you can add these in some order, like alphabetical order). It is okay to have repeats in the stack.
- At the end, return $r$
The reachable collection $r$ can be any of the bag implementations. For the container $c$, we can use a stack or a deque. The graph can be represented using a dynamic array of linked lists or a HashMap of linked lists where they key is the name of the node and the value is the list of neighbors. This type of search is called a depth first search or DFS.
If we choose to instead use queue for $c$, this will be a breadth first search (BFS)!
- DFS uses a stack to hold vertices to check
- Using the maze example, the stack holds all the locations where we had a choice to make in direction to take
- BFS uses a queue to hold vertices to check
- Using the maze examples, this search emanates out in the wave like fashion (like multiple people working on solving the maze)
DFS can take a “wrong turn” and have to backtrack multiple times but it can also get it right off the abt. It can also get stuck in infinite paths.
BFS is guaranteed to always find the solution with the least steps, even if it’s slower than the DFS in some situations.
Both cases have complexity $O(E)$ where $E$ = number of edges in the graph representation.
Djikstra’s Algorithm
For use in a weighted graph, we might be intersted in finding the route with the lowest cost.
The pseudocode for Djikstra’s algorithm is very similar to the process of the depth first search. However, instead of using a stack, we make use of a priority queue.
- Initialize a map of reachable vertices with source vertex having cost 0 and add source vertex to a priority queue with distance 0
- While priority queue is not empty:
- pop vertex $v$ with the shortest distance from the priority queue
- if $v$ is known to be reachable, discard it. Otherwise, add $v$ with given cost to dictionary of reachable vertices
- For all neighbors $v_j$ of $v$, if $v_j$ is not in the set of reachable vertices, combine the cost of reaching $v$ with cost of travelling from $v$ to $v_j$ and add to the priority queue
This is also called a cost first search since what is returned is the least costly path and also the cumulative cost of taking that path. This algorithm always explores the next node with the cumulative least cost.
The complexity of Djikstra’s algorithm is $O(E log E)$ since the time to add and remove from a priority queue is bounded by log E.