There are other ways, of course, to represent graphs. For example, in the Road Simulator lab we are representing a graph where the traffic lights are nodes and the road segments themselves are weighted edges. In the maze lab, we specify the nodes as a r*c grid of nodes, where nodes may only have edges to neighboring nodes, and edges are specified implicitly (by specifying walls, which indicate where there are not edges between neighboring nodes).
Our graph generation program gen_graph takes two arguments, the number of nodes and the number of edges. It emits the number of nodes and then generates the requested number of random edges. There are two pitfalls in writing gen_graph: first, you don't want to generate an edge from a node to itself; second, you don't want to generate duplicate edges. The first pitfall is easily handled by making sure that the second random node generated does not equal the first.
To address the second pitfall, we use a set. When we generate a random edge, we turn it into a string composed of the id of the smaller node followed by a space and then the id of the larger node. We check the set for that string, and if it is there, then we have a duplicate edge and must throw it out and try again.
The code is in gen_graph.cpp. Note that it does not error check to make sure that e ≤ n(n-1)/2. It should (think about it).
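#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <set>
#include <string>
using namespace std;

int main(int argc, char **argv)
{
  int n1, n2, i, tmp;
  int n;
  int e;
  char key[100];
  string s;
  set <string> st;
  set <string>::iterator sit;

  if (argc != 3) {
    cerr << "usage: gen_graph n e\n";
    exit(1);
  }

  n = atoi(argv[1]);
  e = atoi(argv[2]);
  srand48(time(0));

  cout << "NNODES " << n << endl;

  /* Keep generating random edges until e distinct ones have been printed. */
  i = 0;
  while (i < e) {

    /* Pick two different nodes, so that no edge goes from a node to itself. */
    n1 = lrand48()%n;
    do {
      n2 = lrand48()%n;
    } while (n2 == n1);

    /* Order them so that n1 < n2; then each edge has exactly one key. */
    if (n1 > n2) {
      tmp = n1;
      n1 = n2;
      n2 = tmp;
    }

    /* Only print the edge if its key is not already in the set. */
    sprintf(key, "%d %d", n1, n2);
    s = key;
    sit = st.find(s);
    if (sit == st.end()) {
      st.insert(s);
      cout << "EDGE " << n1 << " " << n2 << endl;
      i++;
    }
  }
  return 0;
}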
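To see why the missing check matters: a graph on n nodes with no self-loops and no duplicate edges has at most n(n-1)/2 edges, so when e is larger than that, the while loop can never find enough new edges and runs forever. Below is a minimal sketch of such a check. It is not taken from gen_graph.cpp, and the names max_edges() and args_ok() are made up for illustration.

```cpp
#include <cassert>

/* Hypothetical helper (not part of gen_graph.cpp): the most edges that a
   graph on n nodes can have with no self-loops and no duplicate edges.
   Computed in long long so that n*(n-1) does not overflow an int. */
long long max_edges(long long n)
{
  assert(n >= 0);
  return n * (n - 1) / 2;
}

/* Hypothetical helper: true when the edge-generating loop is able to
   terminate for n nodes and e edges. */
bool args_ok(long long n, long long e)
{
  if (n < 0 || e < 0) return false;
  return e <= max_edges(n);
}
```

One possible use is to call args_ok(n, e) right after the two atoi() calls and exit with an error message when it returns false. For ten nodes the limit is 45 edges, so args_ok(10, 9) is true and args_ok(10, 46) is false.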
Regardless, it works as it should. Here we generate two random graphs each with ten nodes.
UNIX> gen_graph 10 6 > g1.txt
UNIX> sleep 1
UNIX> gen_graph 10 9 > g2.txt

Here are the graph pictures and files:

g1.txt
NNODES 10
EDGE 4 9
EDGE 4 6
EDGE 4 7
EDGE 6 8
EDGE 3 5
EDGE 1 3

g2.txt
NNODES 10
EDGE 5 9
EDGE 1 2
EDGE 5 8
EDGE 3 7
EDGE 2 7
EDGE 0 3
EDGE 5 7
EDGE 6 8
EDGE 2 9
You'll note that g1 has six edges, four connected components and no cycles, while g2 has nine edges, two connected components and one cycle (2,7,5,9,2).
This maps into a fairly simple algorithm for counting connected components. First, you read in a graph. Then you set all visited fields to zero. Then you traverse all the nodes in the graph, and whenever you encounter one whose visited field is zero, you perform the connected component depth first search on it. The total number of depth first searches is equal to the number of connected components in the graph.
The code is in cc-count.cpp:
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

class Node {
  public:
    int id;
    int visited;
    vector <int> adj;
};

/* Depth first search: mark node id as visited, then recursively visit
   every unvisited node adjacent to it. */
void cc_visit(vector <Node> *g, int id)
{
  int i, n;

  (*g)[id].visited = 1;
  for (i = 0; i < (*g)[id].adj.size(); i++) {
    n = (*g)[id].adj[i];
    if (!(*g)[n].visited) cc_visit(g, n);
  }
}

int main()
{
  int i, size, j, n1, n2;
  vector <Node> graph;
  string s;
  int cc;

  /* Read the number of nodes and create them, all unvisited. */
  cin >> s;
  if (s != "NNODES") {
    cerr << "Bad graph -- first word is not NNODES\n";
    exit(1);
  }
  cin >> size;

  graph.resize(size);
  for (i = 0; i < size; i++) {
    graph[i].id = i;
    graph[i].visited = 0;
  }

  /* Read the edges.  Each edge goes onto both nodes' adjacency lists. */
  while (!cin.fail()) {
    cin >> s;
    if (!cin.fail()) {
      cin >> n1 >> n2;
      graph[n1].adj.push_back(n2);
      graph[n2].adj.push_back(n1);
    }
  }

  /* Each search that starts at an unvisited node finds one new component. */
  cc = 0;
  for (i = 0; i < graph.size(); i++) {
    if (!graph[i].visited) {
      cc_visit(&graph, i);
      cc++;
    }
  }

  cout << "CC: " << cc << endl;
  return 0;
}
As we can see, it works fine on our two example files:
UNIX> cc-count < g1.txt
CC: 4
UNIX> cc-count < g2.txt
CC: 2
UNIX>

It's not a bad idea to copy this file over and put some print statements in so that you can visualize the depth first search.
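As one possible way of adding those print statements, here is a sketch that prints each node as it is visited, indented by its recursion depth. It is not part of cc-count.cpp: the names add_edge(), cc_visit_trace() and count_components_trace() are made up for illustration, and the sketch passes the graph by reference and records the visit order in a vector so that the order can be checked as well as read.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

/* Same shape as the Node class in cc-count.cpp, minus the id field. */
struct Node {
  int visited;
  vector <int> adj;
};

/* Hypothetical helper: add the undirected edge a-b to both adjacency
   lists, as the reading loop in cc-count.cpp does. */
void add_edge(vector <Node> &g, int a, int b)
{
  assert(a >= 0 && a < (int) g.size());
  assert(b >= 0 && b < (int) g.size());
  g[a].adj.push_back(b);
  g[b].adj.push_back(a);
}

/* The depth first search with tracing added: print each node as it is
   visited, indented two spaces per level of recursion, and append it
   to order. */
void cc_visit_trace(vector <Node> &g, int id, int depth, vector <int> &order)
{
  size_t i;
  int n;

  g[id].visited = 1;
  order.push_back(id);
  cout << string(2 * depth, ' ') << "visit " << id << endl;
  for (i = 0; i < g[id].adj.size(); i++) {
    n = g[id].adj[i];
    if (!g[n].visited) cc_visit_trace(g, n, depth + 1, order);
  }
}

/* The counting loop, returning the number of connected components. */
int count_components_trace(vector <Node> &g, vector <int> &order)
{
  size_t i;
  int cc = 0;

  for (i = 0; i < g.size(); i++) g[i].visited = 0;
  for (i = 0; i < g.size(); i++) {
    if (!g[i].visited) {
      cc_visit_trace(g, (int) i, 0, order);
      cc++;
    }
  }
  return cc;
}
```

If you build the graph from g1.txt by calling add_edge() on its edges in file order, the unindented lines of the trace are nodes 0, 1, 2 and 4. Those are the starting points of the four searches, one per connected component.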
Cycle detection is another depth first search. Here we also set a visited field; however, if we now encounter a node whose visited field is set, we know that the node is part of a cycle, and we return that fact. Again, it's a simple search. I include the relevant procedure below (in cccycle-wrong.cpp):
int is_cycle(vector <Node> *g, int id)
{
  int i, n;

  (*g)[id].visited = 1;
  for (i = 0; i < (*g)[id].adj.size(); i++) {
    n = (*g)[id].adj[i];
    if ((*g)[n].visited) {
      return 1;
    } else {
      if (is_cycle(g, n)) return 1;
    }
  }
  return 0;
}

..... later, in main:

  isc = 0;
  for (i = 0; i < graph.size(); i++) graph[i].visited = 0;
  for (i = 0; !isc && i < graph.size(); i++) {
    if (!graph[i].visited) isc = is_cycle(&graph, i);
  }

  cout << "CC: " << cc;
  cout << " Cycle: " << ((isc == 0) ? "No" : "Yes") << endl;
}
Note that unlike connected components, this procedure has a return value, and it uses that return value to truncate the search when a cycle is found.
When we run it, we see that it doesn't work correctly, as it says that g1 has a cycle, when we know that it doesn't:
UNIX> cccycle-wrong < g1.txt
CC: 4 Cycle: Yes
UNIX> cccycle-wrong < g2.txt
CC: 2 Cycle: Yes
UNIX>

If you start putting some print statements in, you'll see what's happening on g1. The program first visits node 0 and finds no cycle. Then it visits node 1 and recursively visits node 3, which recursively visits node 5. Node 5's only edge leads back to node 3, which is already marked as visited, so the program reports a cycle, even though it has only walked back along the edge that it arrived on. How do we fix this bug?
One simple way is to pass the node that called is_cycle() as a parameter, so that is_cycle() does not treat the edge it just arrived on as part of a cycle. Here are the changed procedure and the call from main() (in cccycle.cpp):
int is_cycle(vector <Node> *g, int id, int from)
{
  int i, n;

  (*g)[id].visited = 1;
  for (i = 0; i < (*g)[id].adj.size(); i++) {
    n = (*g)[id].adj[i];
    if (n != from) {
      if ((*g)[n].visited) {
        return 1;
      } else {
        if (is_cycle(g, n, id)) return 1;
      }
    }
  }
  return 0;
}

......

  for (i = 0; !isc && i < graph.size(); i++) {
    if (!graph[i].visited) isc = is_cycle(&graph, i, -1);
  }

.......
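Since cycle detection is easy to get subtly wrong, an independent cross-check can be handy when testing. A graph is acyclic exactly when its number of edges equals its number of nodes minus its number of connected components; any edge beyond that closes a cycle. The sketch below is not from cccycle.cpp. The name has_cycle_by_count() is hypothetical, and it assumes that you already have the component count, for example from cc-count.

```cpp
#include <cassert>

/* Hypothetical cross-check, not part of cccycle.cpp.  A graph with nnodes
   nodes and cc connected components needs at least nnodes - cc edges to
   hold its components together.  Any edge beyond that closes a cycle. */
bool has_cycle_by_count(int nnodes, int nedges, int cc)
{
  assert(cc >= 0 && cc <= nnodes);
  assert(nedges >= nnodes - cc);
  return nedges > nnodes - cc;
}
```

For g1 (10 nodes, 6 edges, 4 components) it returns false, because 6 = 10 - 4. For g2 (10 nodes, 9 edges, 2 components) it returns true, because 9 > 10 - 2. Both results agree with the cycles noted for the two graphs.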
Once again, a lesson from this lecture is to program slowly, incrementally, and test often. That way you find bugs closer to when you make them, making them simpler to fix.