Graph Utilities Package
The Graph Utilities Package contains a number of functions useful for graph theory applications.
Functions in the Graph Utilities Package.
A graph can be represented by a list of rules or by its adjacency matrix. Graphs in the Combinatorica Package format are also supported.
AdjacencyMatrix
Finding the adjacency matrix of a graph
Given a graph in any of the formats supported by the Graph Utilities Package, AdjacencyMatrix returns a matrix representing the graph.
The graph g can be specified by a rule list, an adjacency matrix, or the Combinatorica representation of a graph.
The rows and columns of the SparseArray object correspond to vertices in the order returned by VertexList[g].
This shows the adjacency matrix of a small graph. The 2 on the diagonal represents the two loops at vertex 1. The off-diagonal element 2 represents the two multiedges {b→a, b→a}.
Out[3]//MatrixForm= 
 

The i^{th} row/column of the matrix represents the vertex VertexList[g][[i]]. That is, in the above matrix, the first row corresponds to vertex b, the second to vertex a.
Out[4]=  

This converts a graph into a sparse matrix, with two additional unconnected vertices added. It basically pads the matrix above with two columns and rows of zeros.
Out[5]//MatrixForm= 
 

This shows the adjacency matrix of a graph specified by a Combinatorica object.
Out[6]//MatrixForm= 
 

Bicomponents
Bicomponents[g]  gives the biconnected components of the undirected graph g 
Finding biconnected components.
Bicomponents[g] returns a list of all biconnected components in the undirected graph g.
A vertex v is a cutpoint of a graph g if g is not connected after removing v and all edges incident to it. A graph is biconnected if it has no cutpoint. A biconnected component (bicomponent) is a maximal subgraph that has no cutpoint.
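On small graphs this definition can be checked by brute force. The following Python sketch (illustrative only; the function names are made up and this is not how the package computes bicomponents) finds cutpoints by removing each vertex in turn and testing connectivity:

```python
def is_connected(vertices, edges):
    """Depth-first connectivity test on an undirected graph."""
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == set(vertices)

def cutpoints(vertices, edges):
    """A vertex v is a cutpoint if removing v and its incident edges
    disconnects the remaining graph."""
    cps = []
    for v in vertices:
        rest = [u for u in vertices if u != v]
        kept = [e for e in edges if v not in e]
        if not is_connected(rest, kept):
            cps.append(v)
    return cps
```

For example, two triangles sharing a single vertex have exactly that shared vertex as a cutpoint, so each triangle is a bicomponent.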
This shows that a simple line with two vertices is biconnected.
Out[8]=  

This shows two cycles linked by a line. Each of the cycles forms a bicomponent. The line breaks into two bicomponents.
Out[10]=  
Out[11]=  

ClosenessCentrality
Finding the closeness centrality.
The closeness centrality of a vertex u is defined as the inverse of the sum of the distances from u to all other vertices. The closeness centrality of a vertex in a disconnected graph is based on the closeness centrality of the component to which the vertex belongs.
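For an unweighted graph this definition can be sketched in Python (illustrative, not the package's implementation); a vertex with no paths to other vertices gets centrality zero:

```python
from collections import deque

def closeness_centrality(adj):
    """Closeness centrality: inverse of the sum of unweighted distances
    from each vertex to every reachable vertex. adj maps vertex -> neighbors."""
    result = {}
    for u in adj:
        dist = {u: 0}
        queue = deque([u])
        while queue:  # breadth-first search from u
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        total = sum(dist.values())
        result[u] = 1.0 / total if total > 0 else 0.0
    return result
```

On the path graph a–b–c, the middle vertex b has distance sum 2 and hence the highest centrality, 1/2.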
This calculates the closeness centrality of a grid graph and plots the graph so that vertices of high centrality are red.
Out[13]=  

If a vertex does not have any paths to reach other vertices, the centrality of that vertex is zero.
Out[20]=  

Centrality of a disconnected graph is calculated by treating each component separately.
Out[21]=  
Out[15]=  

Options for ClosenessCentrality
  
Weighted  True  whether edge weight is used in calculating shortest path 
Normalize  True  whether to normalize the output 
Options for ClosenessCentrality.
Weighted
The Weighted option specifies whether edge weights are to be taken into account when calculating distances. Possible values are True (the default) or False. Weighted->False means that edges are assumed to have unit length.
This shows a complete graph of three vertices, with one of the edges 2→3 repeated twice, thus having an edge weight of 2. This reduces the centrality of vertex 2.
Out[22]=  

If it is intended that edge weights should all be one, then Weighted->False should be specified.
Out[23]=  

Normalize
The Normalize option specifies whether the output is normalized to give a maximal centrality of 1. Possible values are True or False (the default).
This calculates the closeness centrality of a complete graph of three vertices. Since the sum of the distance from any vertex to any other vertex is 2, the closeness for each vertex is 1/2 = 0.5.
Out[25]=  

CommunityModularity, CommunityStructureAssignment, and CommunityStructurePartition
Functions for finding and assessing community structure.
CommunityStructurePartition gives the partition of a graph into communities. The partition groups the vertices into communities, such that there is a higher density of edges within communities than between them.
CommunityStructureAssignment gives the result as an assignment.
This finds the partition of the graph into communities.
Out[31]=  

This shows the assignment of vertices into communities.
Out[34]=  

This shows which vertices are assigned into which community.
Out[33]=  

This plots the graph, with each community colored differently.
Out[35]=  

Algorithms for finding community structure
Much relational information can be represented in graphs. One graph feature is community structure, which is the gathering of vertices into groups such that there is a higher density of edges within groups than between them. The discovery and analysis of community structure in graphs is useful in a number of applications. For example, in software engineering, it is sometimes necessary to group source code in directories so that interrelated routines are located in the same directory. In social networks, it is often necessary to identify groups of friends that form a close-knit community. In parallel computing, it is desirable to partition computing tasks so that highly related tasks are placed on the same processor.
A number of algorithms exist that identify community structure in a network. One such algorithm is that of [16]. The key concept is a quantity called community modularity. Suppose the graph of concern is g={V, E}, and the vertex set V is partitioned into k subsets V_{i} such that each subset belongs to one community. Then the community modularity Q of this partition is defined as

Q = Σ_{i} (e_{ii} - a_{i}^{2})

Here, e_{ii} is the fraction of edges that have both ends in community V_{i}, and a_{i} is the fraction of edges that start from community V_{i}. In other words,

e_{ii} = |{(u, v) | u ∈ V_{i}, v ∈ V_{i}, (u, v) ∈ E}| / |E|
a_{i} = |{(u, v) | u ∈ V_{i}, (u, v) ∈ E}| / |E|.

Clearly, Q = Σ_{i} (e_{ii} - a_{i}^{2}) ≤ Σ_{i} e_{ii} ≤ 1. Typically, for a random partition of V, Q is close to zero, while a partition of V that gives a large positive value of Q indicates a significant community structure.

The algorithm of [16] therefore starts by assigning every vertex to its own community, and calculates the change in Q that would result from merging two communities into one. It then merges the two communities that give the highest positive change in Q. This process is repeated on the merged network, and terminates when the highest change found is negative.
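As a concrete check of these formulas, here is a Python sketch (illustrative; the function name and the directed edge-list input convention are assumptions, not package code) that computes Q for a given assignment of vertices to communities:

```python
def community_modularity(edges, community):
    """Community modularity Q = sum_i (e_ii - a_i^2) for a directed edge list.
    community maps each vertex to its community label."""
    m = len(edges)
    labels = set(community.values())
    e = {i: 0 for i in labels}   # count of edges with both ends in community i
    a = {i: 0 for i in labels}   # count of edges starting in community i
    for u, v in edges:
        a[community[u]] += 1
        if community[u] == community[v]:
            e[community[u]] += 1
    return sum(e[i] / m - (a[i] / m) ** 2 for i in labels)
```

For five edges split between two communities A and B with one cross edge, e_A=2/5, a_A=3/5, e_B=2/5, a_B=2/5, giving Q = 0.04 + 0.24 = 0.28.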
The Weighted Option for CommunityStructurePartition, CommunityStructureAssignment, and CommunityModularity
The following option is accepted.
Option for CommunityStructurePartition, CommunityStructureAssignment and CommunityModularity.
The
Weighted option specifies whether edge weights are to be taken into account. Possible values for this option are
True or
False (the default).
This defines a graph with edge weights of 1 except the edge between vertices 3 and 4, which has an edge weight of 2. 
This finds and shows the three communities, ignoring the edge weights.
Out[41]=  

This shows that when edge weight is taken into account, the larger edge weight between vertices 3 and 4 means that these two vertices are now placed in the same community.
Out[44]=  

EdgeList
EdgeList[g]  give a list of edges in the graph g. 
Finding the list of edges.
This gives the list of edges of a simple graph.
Out[45]=  

ExpressionTreePlot
Functions for plotting an expression tree.
ExpressionTreePlot plots the expression tree of an expression. It accepts the same options as TreePlot.
This gives an expression tree plot, with tooltips for each vertex giving the subexpression from that vertex down.
Out[52]=  

This shows an expression tree up to level 2 with its root at the left.
Out[53]=  

GraphCoordinates and GraphCoordinates3D
GraphCoordinates[g]  calculate a visually appealing two-dimensional layout and return the coordinates of the vertices 
GraphCoordinates3D[g]  calculate a visually appealing three-dimensional layout and return the coordinates of the vertices 
Graph drawing functions that return coordinates.
GraphCoordinates returns the coordinates of the vertices as computed using a graph drawing algorithm. It is useful when the coordinates of the vertices are needed rather than a drawing of the graph, or when one needs to draw a graph repeatedly using the same layout but different styles. GraphCoordinates accepts the same options as GraphPlot.
This defines and plots a small graph.
Out[55]=  

This gives the coordinates of the vertices in the previous drawing.
Out[56]=  

This plots the graph with two different styles, using the layout already calculated.
Out[58]=  

The relationship between vertices and coordinates is given below.
Out[8]=  

GraphCoordinates only returns the positions of the vertices, so when used for LayeredGraphPlot, curved edges will not be reproduced, although multiedges and self-loops will still be drawn.
This finds the vertex coordinates of the previous plot.
Out[61]=  

This shows that the curved edges between vertices 1 and 3 are not reproduced.
Out[62]=  

Options for GraphCoordinates and GraphCoordinates3D
GraphDistance, GraphPath, and GraphDistanceMatrix
GraphDistance[g,start,end]  give the distance from vertex start to vertex end in the graph g 
GraphPath[g,start,end]  find a shortest path between vertices start and end in graph g 
GraphDistanceMatrix[g]  give a matrix in which the (i, j)^{th} entry is the length of a shortest path in g between vertices i and j 
GraphDistanceMatrix[g,Parent]  return a threedimensional matrix in which the (1, i, j)^{th} entry is the length of a shortest path from i to j and the (2, i, j)^{th} entry is the predecessor of j in a shortest path from i to j. 
Finding shortest distances, shortest paths, and all-pairs shortest distances and paths.
GraphDistance returns the graph distance from one vertex to another. Infinity is returned if no path exists from start to end. By default every edge is assumed to have an edge weight of 1. GraphPath and GraphDistanceMatrix find shortest paths and all-pairs shortest paths, respectively.
This shows a dodecahedron in two dimensions.
Out[72]=  

This shows that the distance between vertices 7 and 18 is 4, and gives the path.
Out[87]=  
Out[90]=  

This defines a simple directed graph.
Out[96]=  

This shows the predecessors in the shortest path, for example, the path from 1 to 3 has predecessor 4 ( prd[[1, 3]]), the path from 1 to 4 has predecessor 5 ( prd[[1, 4]]), and therefore the path from 1 to 3 is {1, 5, 4, 3}.
Out[81]//MatrixForm= 
 
Out[82]=  

This confirms the shortest path.
Out[97]=  

Algorithms for shortest paths
In the following it is assumed that w(i, j) is the edge weight of edge {i, j}, and d(i, j) is the current estimate of the length of the shortest path from vertex i to vertex j.
Single-source shortest path algorithms
Single-source shortest path algorithms find the shortest path from one vertex u to all other vertices.
Dijkstra's algorithm assumes that edge weights are nonnegative.
S: the set of vertices whose shortest distance to u has been found.
Q: the set of vertices not in S that S "touches", stored in a heap.
1. Pick from Q the vertex i with the smallest d value. This step has complexity O(Log[|V|]) with either a binary heap or a Fibonacci heap, and in all cases it is repeated |V| times.
2. Place vertex i in S, and insert each neighbor j of i into Q if it is not already there. If d[j]>d[i]+w[i, j], set d[j]=d[i]+w[i, j] and update the heap. Each insertion/update has complexity O(Log[|V|]) with a binary heap and O(1) (amortized) with a Fibonacci heap, and in all, |E| update operations and |V| insertions are needed.
So the overall complexity is O((|V|+|E|)Log[|V|]) with a binary heap, and O(|V| Log[|V|]+|E|) with a Fibonacci heap.
The Bellman-Ford algorithm works even if edge weights are negative. It works by looping through each edge {i, j}: if d[j]>d[i]+w[i, j], it sets d[j]=d[i]+w[i, j]. Since this loop over all edges is carried out |V|-1 times, the Bellman-Ford algorithm has a complexity of O(|E||V|), which is much higher than that of Dijkstra's algorithm. Therefore it is only used if the graph contains negative weights. The algorithm can also detect negative weight cycles.
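The procedure above can be sketched compactly in Python (illustrative, not the package's implementation; heapq provides the binary heap, and instead of updating heap entries in place, stale entries are skipped on removal):

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest paths for nonnegative edge weights.
    adj maps a vertex to a list of (neighbor, weight) pairs."""
    dist = {source: 0}
    heap = [(0, source)]  # binary heap keyed by tentative distance
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist.get(i, float('inf')):
            continue  # stale entry: a shorter path to i was already found
        for j, w in adj[i]:
            if dist.get(j, float('inf')) > d + w:
                dist[j] = d + w
                heapq.heappush(heap, (d + w, j))
    return dist
```

With the lazy-deletion heap this runs in O((|V|+|E|) Log[|V|]) time, matching the binary-heap bound above.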
All-pairs shortest path algorithms
All-pairs shortest path algorithms find shortest paths between all vertex pairs.
The Floyd-Warshall algorithm sets all distances to Infinity initially, except for neighboring vertices, for which the distance is set to the edge weight. It then loops over each vertex k and all vertex pairs {i, j}, and relaxes whenever d[i, j]>d[i, k]+d[k, j], in which case it sets d[i, j]=d[i, k]+d[k, j]. This ensures that after iteration k, the distance d[i, j] represents the shortest distance between vertices i and j passing through only vertices 1, 2, ..., k. This algorithm has a complexity of O(|V|^{3}). It works even for negative weights, but does not detect negative weight cycles.
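The triple loop can be sketched in Python (illustrative only; vertices are numbered 0, ..., n-1 and missing edges have infinite weight):

```python
def floyd_warshall(n, weights):
    """All-pairs shortest distances. weights maps (i, j) -> edge weight."""
    INF = float('inf')
    # Initialize: 0 on the diagonal, edge weight for neighbors, Infinity otherwise.
    d = [[0 if i == j else weights.get((i, j), INF) for j in range(n)]
         for i in range(n)]
    for k in range(n):          # allow paths through vertices 0..k
        for i in range(n):
            for j in range(n):
                if d[i][j] > d[i][k] + d[k][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```
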
If the edge weights are nonnegative, Dijkstra's algorithm can be applied repeatedly, once to each vertex, to solve the single-source problem from every source. This gives a complexity of O((|V|+|E|)|V| Log[|V|]) with a binary heap, and O(|V|^{2} Log[|V|]+|E||V|) with a Fibonacci heap.
If there are negative edge weights, Dijkstra's algorithm cannot be applied directly. Johnson's algorithm first makes all edge weights nonnegative by associating a value p[i] with every vertex i, such that w[i, j]+p[i]-p[j]>=0 for every edge {i, j}. The edge weights are then modified by w[i, j]:=w[i, j]+p[i]-p[j].
To compute the values p[i], a virtual vertex s is added that is linked to every vertex with an edge weight of 0. Then for any vertex i, p[i] is defined as the distance from s to i. So for every edge {i, j}, p[j]<=p[i]+w[i, j], since otherwise p[j] would not be the length of a true shortest path. Hence w[i, j]+p[i]-p[j]>=0.
This modification does not change the shortest paths, so Dijkstra's algorithm can be applied to the weight-modified graph. Afterwards, the distances are modified back by d[i, j]:=d[i, j]+p[j]-p[i].
Johnson's algorithm has the same complexity as Dijkstra's algorithm when edge weights are nonnegative; otherwise it has the same complexity as the Bellman-Ford algorithm.
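The reweighting step can be sketched in Python (illustrative; the names `johnson_potentials` and `reweight` are made up). Bellman-Ford from the virtual source yields the potentials p, after which every modified weight w+p[i]-p[j] is nonnegative:

```python
def johnson_potentials(n, edges):
    """Bellman-Ford from a virtual source linked to every vertex with weight 0.
    edges is a list of (i, j, w) triples over vertices 0..n-1.
    Returns p with w + p[i] - p[j] >= 0 for every edge, or None on a
    negative cycle."""
    p = [0] * n            # distances from the virtual source all start at 0
    for _ in range(n):     # n-1 relaxation rounds plus one detection round
        changed = False
        for i, j, w in edges:
            if p[j] > p[i] + w:
                p[j] = p[i] + w
                changed = True
        if not changed:
            return p
    return None            # still relaxing after n rounds: negative cycle

def reweight(edges, p):
    """Modified nonnegative weights w' = w + p[i] - p[j]."""
    return [(i, j, w + p[i] - p[j]) for i, j, w in edges]
```

Dijkstra's algorithm can then be run on the reweighted edges, and the true distances recovered with d[i, j]+p[j]-p[i].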
Options for GraphPath, GraphDistanceMatrix, and GraphDistance
Options for GraphPath and GraphDistanceMatrix.
  
Weighted  False  whether edge weight is used in calculating shortest paths 
Option for GraphDistance.
Method
The option Method specifies the algorithm used to find the shortest paths. By default Method is set to Automatic, which uses Johnson's algorithm. Other possible values for this option are "BellmanFord", "Dijkstra", "FloydWarshall", and "Johnson".
Weighted
The option Weighted specifies whether edge weights are to be taken into account when finding shortest paths. Possible values are True or False.
GraphEdit
GraphEdit[]  interactively create and edit a graph 
GraphEdit[g]  interactively edit the graph g 
Interactive graph editing.
GraphEdit is a J/Link application. On executing this function, a Java window is opened. Users can then enter a new graph or modify an existing graph, and get the resulting graph back in Mathematica.
The graph g may be specified by a rule list, an adjacency matrix, or the Combinatorica representation of a graph.
Vertices and edges can be added by clicking and dragging with the mouse. Vertices can be moved after setting "Drawing mode" to "Move" and dragging the mouse on the vertices. Vertices can be deleted after setting "Drawing mode" to "Delete" and selecting the vertices to be deleted by clicking.
Users can hold the Ctrl key and click a vertex (using the first mouse button) in the GraphEdit window to edit the vertex labels.
On closing the window, a drawing of the graph, a representation of the graph as a list of rules, the coordinates, and the vertex labels are returned.
LineScaledCoordinate
LineScaledCoordinate[{{x_{1},y_{1}},{x_{2},y_{2}},...,{x_{k},y_{k}}},r] 
 give the coordinate of a point in the polyline {{x_{1}, y_{1}}, {x_{2}, y_{2}}, ..., {x_{k}, y_{k}}}, at a scaled distance of r from point {x_{1}, y_{1}} 
LineScaledCoordinate[{{x_{1},y_{1}},{x_{2},y_{2}},...,{x_{k},y_{k}}}] 
 equivalent to LineScaledCoordinate[ {{x_{1}, y_{1}}, {x_{2}, y_{2}}, ..., {x_{k}, y_{k}}}, 0.5] 
Finding the coordinates of a point on a polyline.
LineScaledCoordinate is often used in conjunction with the EdgeRenderingFunction option of GraphPlot, GraphPlot3D, or LayeredGraphPlot, to add text or graphics along the edges of a graph.
This displays the vertices that form the edges 60% along the arrow.
Out[2]=  

MaximalBipartiteMatching
Finding the maximal bipartite matching.
MaximalBipartiteMatching gives a maximal set of nonadjacent edges between the two vertex sets of the bipartite graph.
The bipartite graph represented by an m×n matrix consists of the row and column vertex sets R={1, 2, ..., m} and C={1, 2, ..., n}, with a vertex i∈R and a vertex j∈C connected if the matrix element g_{ij}≠0.
The bipartite graph represented by a rule list {i_{1}->j_{1}, i_{2}->j_{2}, ...} consists of vertex sets R=Union[{i_{1}, i_{2}, ...}] and C=Union[{j_{1}, j_{2}, ...}], with a vertex i∈R and a vertex j∈C connected if the rule i->j is included in the rule list.
MaximalBipartiteMatching returns a list of index pairs {{i_{1}, j_{1}}, ..., {i_{k}, j_{k}}}, where the number of pairs k is not larger than the size of either vertex set.
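For illustration, here is a Python sketch of bipartite matching via augmenting paths (Kuhn's algorithm, which finds a maximum matching; the package's internal algorithm may differ). The input follows the rule-list convention above:

```python
def maximal_bipartite_matching(rules):
    """Matching in a bipartite graph given as (i, j) pairs, found by
    repeatedly searching for augmenting paths."""
    adj = {}
    for i, j in rules:
        adj.setdefault(i, []).append(j)
    match = {}  # column vertex -> matched row vertex

    def augment(i, visited):
        # Try to match row i, re-routing previously matched rows if needed.
        for j in adj.get(i, []):
            if j not in visited:
                visited.add(j)
                if j not in match or augment(match[j], visited):
                    match[j] = i
                    return True
        return False

    for i in adj:
        augment(i, set())
    return sorted((i, j) for j, i in match.items())
```

The number of pairs returned is the structural rank when the input is the bipartite graph of a sparse matrix.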
This finds a maximal matching.
Out[5]=  

Applications
MaximalBipartiteMatching can be used to permute as many entries as possible onto the diagonal of a sparse matrix.
This defines a random 50×60 sparse matrix with approximately 4% of the elements nonzero.
Out[6]=  
Out[7]=  

This finds unmatched rows and columns. 
This orders the matrix by permuting matched rows and columns to the principal diagonal block first.
Out[15]=  

MaximalBipartiteMatching can be used to find the structural rank of a sparse matrix. The structural rank of a matrix is the number of entries in a maximal matching of the bipartite graph of the matrix. It is an upper bound on the numeric rank of the matrix. A matrix has full structural rank if it can be permuted so that the diagonal has no zero entries.
This shows that on average, a random 50×50 matrix with 4% nonzero entries has structural rank of 39.4.
Out[24]=  

MaximalIndependentEdgeSet
Finding the maximal independent edge set.
MaximalIndependentEdgeSet gives an approximate maximal set of pairwise nonadjacent edges of g. A maximal independent edge set of a graph is also called a maximal matching.
This defines a graph and finds a maximal independent edge set.
Out[27]=  

This plots the graph so that edges in the maximal independent edge set are red.
Out[31]=  

This defines a graph with edge weights and finds a maximal independent edge set.
Out[50]=  

This plots the graph so that edges in the maximal independent edge set are red. The edge weights are also shown.
Out[51]=  

This gives preference to edges with higher edge weights in finding the matching.
Out[52]=  

Option for MaximalIndependentEdgeSet
  
Weighted  True  whether edges with higher weights are preferred when forming the maximal independent edge set 
Option for MaximalIndependentEdgeSet.
Weighted
The Weighted option specifies whether edges with higher weights are preferred when forming the maximal independent edge set. Possible values are True (the default) or False.
MaximalIndependentVertexSet
Finding the maximal independent vertex set.
MaximalIndependentVertexSet gives an approximate maximal set of vertices such that no two vertices form an edge. It treats the input as an undirected graph. When a weight vector w is specified, its length must be the same as the number of vertices in g.
This specifies a small graph and finds a maximal independent vertex set.
Out[56]=  

This plots the graph's maximal independent vertex set highlighted in red circles.
Out[57]=  

This finds a maximal independent vertex set with preference given to vertices with even labels.
Out[58]=  
Out[59]=  

This plots the new maximal independent vertex set in red circles.
Out[60]=  

MinCut
MinCut[g,k]  partition the undirected graph g into k parts, where the number of edge cuts is approximately minimized 
Minimizing edge cuts.
MinCut treats the input as an undirected graph and tries to find a partition of the vertices into k parts so that each part has roughly the same number of vertices, and the number of edges between the parts (known as the edge separator) is minimized.
This partitions the vertices into two parts with the minimum number of edge cuts.
Out[77]=  

This plots the graph with partitions, one part colored red and the other colored green.
Out[78]=  

This generates a random sparse matrix with about 2% of the elements nonzero, and symmetrizes it.
Out[96]=  
Out[99]=  

This uses MinCut to try to reorder the matrix into block diagonal form, with the number of off-diagonal elements minimized.
Out[97]=  
Out[98]=  

Applications
One of the important uses of the MinCut function is in partitioning graphs and meshes into subdomains of roughly equal size, so that the number of edges cut by the partition is minimized. This is important in areas such as parallel computing, where each processor works on one subdomain and the communication overhead between processors, which is proportional to the number of edges cut, must be minimized.
Suppose a parallel finite element structural analysis is to be carried out on a three-dimensional structure. An irregular mesh is generated to model the structure.
This shows a graph illustrating the connection between vertices of the mesh.
Out[109]=  

This partitions the graph into four parts. 
This plots the partitioned graph, coloring nodes according to the part to which they belong.
Out[115]=  

MinimumBandwidthOrdering
MinimumBandwidthOrdering[g]  attempt to find a vertex ordering r that minimizes the bandwidth of the underlying undirected graph of the graph g 
MinimumBandwidthOrdering[m]  attempt to find a pair of orderings {r, c} that minimizes the bandwidth of the matrix m[[r, c]], where m is a matrix and r and c are permutations; for symmetric m, r and c are the same 
Finding orderings that minimize the bandwidth of graphs and matrices.
For a graph {V, E} with vertex ordering f, where f is a mapping from V to {1, 2, ..., |V|}, the bandwidth of the graph is defined as the maximum of |f(u)-f(v)| over all edges {u, v}∈E. The bandwidth of a vertex p is defined as the maximum of |f(p)-f(v)| over all edges {p, v}∈E.
For a matrix m={a_{ij}}, the bandwidth is defined as the maximum of |i-j| over all nonzero entries a_{ij}.
For a symmetric matrix, the envelope size is defined as Σ_{i} (i - Min[{j | a_{ij}≠0, j<=i}]), which is the sum of the distance from the first element in each row to the diagonal position. If a row has neither a diagonal element nor elements before the diagonal, the distance is assumed to be zero.
The profile of a matrix is simply the envelope size plus the size of the matrix.
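These definitions can be sketched in Python for a dense square matrix given as a list of rows (illustrative helper functions, not part of the package):

```python
def bandwidth(m):
    """Matrix bandwidth: max |i - j| over nonzero entries a_ij."""
    return max((abs(i - j) for i, row in enumerate(m)
                for j, a in enumerate(row) if a != 0), default=0)

def envelope_size(m):
    """For a symmetric matrix: sum over rows of the distance from the first
    nonzero entry at or before the diagonal to the diagonal position.
    Rows with no such entry contribute zero."""
    total = 0
    for i, row in enumerate(m):
        cols = [j for j in range(i + 1) if row[j] != 0]
        if cols:
            total += i - min(cols)
    return total
```

For example, a tridiagonal matrix has bandwidth 1, and its envelope size is the number of subdiagonal entries.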
This defines a small graph. 
This finds the bandwidth of the graph with the previous ordering.
Out[120]=  

This finds a vertex ordering that attempts to minimize the bandwidth.
Out[121]=  

This renumbers the vertices using the minimum bandwidth ordering.
Out[122]=  

This defines a rectangular matrix. 
This finds the row and column ordering that attempt to minimize the bandwidth of the matrix.
Out[128]=  

Plots of the matrix before and after the ordering show a decrease in bandwidth.
Out[133]=  

This defines a sparse matrix and finds the minimal bandwidth ordering. 
This plots the matrix as a graph, coloring the vertices so that those that are among the first ordered are blue and those ordered last are red. For a symmetric matrix, the row and column orderings are the same ( r=c).
Out[5]=  

This shows the matrix before and after the ordering.
Out[9]=  

Algorithms for minimizing bandwidth/envelope size
The problem of finding the minimum bandwidth ordering of a graph has been proven to be NP-complete; hence, practical algorithms are all heuristics aimed at finding approximately optimal orderings.
Basic algorithms
For a connected undirected graph, the reverse Cuthill-McKee (RCM) algorithm [8] starts by finding a pseudodiameter of the graph. Then, from one end v of the diameter, vertices are ordered and form a number of level sets, where vertices within the same level set have the same distance to v. Within the same level set, priority is given to vertices that are connected to the lowest-ordered vertex possible. Once all vertices are ordered, their order is reversed. While reversing the order has little effect on the bandwidth, it was found [13] that it often reduces the fill-in when the ordering is used for the factorization of sparse matrices. In our implementation we also provide a variant algorithm, RCMD, which breaks ties between vertices that are connected to the same lowest-ordered vertex by ordering vertices with the lowest degree first.
The result from RCM or RCMD can be further refined and improved. The node centroid and hill climbing algorithms in [9, 12] are two possible procedures. In the hill climbing algorithm, critical vertices (those that have a bandwidth equal to the bandwidth of the graph) are identified and attempts are made at swapping the order of a critical vertex with noncritical vertices. This continues until neither the number of critical vertices nor the bandwidth can be reduced. In the node centroid algorithm, near-critical vertices (those with a bandwidth close to the bandwidth of the graph) are identified and their new orders are calculated as the average of their own and their neighbors' orders. The vertices are then sorted and reordered. The node centroid algorithm is always followed by the hill climbing algorithm. The hill climbing refinement, or the combined node centroid and hill climbing algorithms, can be chosen using RefinementMethod->"HillClimbing" or RefinementMethod->"NodeCentroidHillClimbing".
The node centroid and hill climbing procedures can also be combined with a multilevel procedure. Here, a sequence of smaller and smaller graphs is generated from the original graph. An initial ordering is found for the smallest graph using RCM or RCMD. Orderings of the coarser graphs are then interpolated to the finer graphs, and refined using the node centroid and hill climbing procedures. The multilevel method can be chosen with RecursionMethod->"Multilevel".
The Sloan algorithm [10, 11] aims to minimize the envelope size of an ordering of an undirected graph. It works by first finding a pseudodiameter of the graph, with endpoints u and v. Vertices fall into four mutually exclusive classes: ordered (those that are already ordered), active (those that are connected to one or more ordered vertices), preactive (those that are connected to one or more active vertices, but not to ordered vertices), and inactive (neither ordered, active, nor preactive). Starting from u, vertices are prioritized. The priority of a vertex i is given by -w_{1}incr[i]+w_{2}dist[i, v], where incr[i] is the number of preactive and inactive vertices that would become active if i were ordered next, plus 1 if i is currently preactive; dist[i, v] is the distance between i and v; and w_{1} and w_{2} are two positive weights. The algorithm repeatedly selects, among the active and preactive vertices, a vertex with the highest priority. Once a vertex is selected, the priorities of the remaining active and preactive vertices are updated. For efficiency in finding the vertex of highest priority (without using a sort), a binary heap data structure is used. Compared with the RCM algorithm and its variants, the Sloan algorithm tends to give an ordering with a smaller envelope size at the expense of a larger bandwidth.
Algorithms for unsymmetric matrices
For a graph g, MinimumBandwidthOrdering attempts to find the vertex ordering r that minimizes the bandwidth of the underlying undirected graph of g. For an unsymmetric matrix a of dimension m×n, however, following [12], MinimumBandwidthOrdering works with the symmetric matrix b={{0, a}, {Transpose[a], 0}}, which represents the bipartite graph of the rows and columns of a. An ordering for b is first computed, and then converted to an ordering of a. Specifically, suppose the ordering p={r_{1}, r_{2}, ..., r_{m+n}} minimizes the bandwidth of b[[p, p]]. Then the row ordering of a is given by the elements of p that are less than or equal to m, and the column ordering is given by those that are greater than m, minus m.
Options for MinimumBandwidthOrdering
Options for MinimumBandwidthOrdering.
Method
The option
Method specifies the method used to minimize the bandwidth or envelope size. Possible values for this option are
Automatic,
"RCM",
"RCMD", and
"Sloan". By default,
Method is set to
Automatic.
The following example illustrates the difference between bandwidth reduction algorithms and the envelope size reduction algorithm.
This defines a sparse matrix. 
This reorders the matrix using the RCMD method and the Sloan method. 
It is clear from the plots that the RCMD method gives a matrix with much smaller bandwidth.
Out[156]=  

This defines two functions that calculate the bandwidth and envelope size of a matrix. 
This shows that the RCMD method tries to minimize the bandwidth, and thus gives smaller bandwidth.
Out[11]=  

This shows that the Sloan method attempts to minimize the envelope size.
Out[157]=  

RefinementMethod
The RefinementMethod option specifies the refinement method used to further improve the bandwidth following the application of one of the previous methods. Possible values for this option are Automatic (the default), None, "HillClimbing", and "NodeCentroidHillClimbing".
RecursionMethod
The RecursionMethod option specifies whether to employ a multilevel process. Possible values for this option are None (the default) and "Multilevel".
Applications
One of the applications of minimum bandwidth ordering is in optimizing the cache performance of numerical calculations. For example, in multiplying a sparse matrix by a vector, if the matrix is preordered to minimize the bandwidth, then elements of the vector are not accessed randomly, which improves cache performance. Because the ordering itself takes time, this improvement is beneficial only if the matrix-vector product is to be performed repeatedly many times.
This generates a banded matrix, and then randomly permutes it. 
This performs a matrix-vector product 50 times.
Out[15]=  

This permutes the matrix to minimize the bandwidth and performs a corresponding permutation of the vector. 
This shows that the matrix-vector product now takes less time.
Out[19]=  

This checks that the answer is the same, subject to a permutation.
Out[20]=  

PageRanks, PageRankVector, LinkRanks, and LinkRankMatrix
PageRanks[g]  gives the page rank of the graph g as a rule list 
PageRankVector[g]  gives the page rank of the graph g as a vector 
PageRanks and PageRankVector functions.
This shows the page rank and the graph together.
Out[31]=  

This boosts the page rank of "home" by adding more feedback links.
Out[34]=  

PageRanks algorithm
Roughly speaking, for a directed graph, the page rank of a vertex is the probability that a random internet surfer visits that vertex, assuming that the directed graph represents the internet, with a vertex i having a link to a vertex j if there is an edge from i to j.
Imagine a random surfer visiting a page i at time step k. At the next step, the surfer chooses randomly, with equal probability, a node j from among i's out-neighbors {l | i->l}.
Some modification has to be made to the graph to avoid the surfer being trapped at a sink (a node that does not have any out-links). Any node i that has an out-degree of zero is linked to every node, including itself. With this modification, we can define the following.
The page rank of a page i∈V is defined as the probability that the surfer, at time step k, is at page i.
Define a matrix P with the same structure as the graph, with entry values P(i, j)=1/deg(i) for all {j | i->j}, where deg(i) is the out-degree of i. Thus, provided that deg(i)≠0, the sum of the i^{th} row is 1. If deg(i)=0, row i is modified by setting every entry in that row to 1/n, with n the number of vertices. Thus the modified matrix is P'=P+D, with D=de^{T}/n. Here e is the n-dimensional column vector of all 1s, and d is a column vector of 0s, except that d(i)=1 for {i | deg(i)=0}. Thus D is a matrix whose row i has entry values 1/n if deg(i)=0, and entry values 0 for all other rows.
We need one more modification. If the directed graph given by P' is not strongly connected, then the surfer can be trapped in one component of the directed graph. Thus, to ensure that every page can be visited, we assume that the surfer, on visiting a page i, has a small probability (the teleport probability) of instead visiting any page at random. To summarize: from a node i with deg(i) ≠ 0, there is a probability of c/deg(i) of visiting any of its deg(i) neighbors, and a probability of (1-c)/n of visiting any node (including itself); from a node i with deg(i) = 0, there is a probability of c/n + (1-c)/n = 1/n of visiting any node. In matrix terms, a page rank vector x must satisfy
x = P''^{T} x, with P'' = c P' + ((1-c)/n) e e^{T}.
Denoting A = P''^{T}, the task of calculating the page rank is that of finding a fixed point of x = A x.
The fixed-point calculation is done iteratively, starting from a vector of 1s. If the average change in the page ranks is less than a tolerance, the process is assumed to have converged. The final result is an approximation of the surfer's probability, scaled so that the sum is n.
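The iteration just described can be sketched in a few lines of Python. This is an illustrative implementation of the model above, not the package's code; it uses the RemoveSinks->True treatment of sinks and the same normalization (the result sums to n).

```python
import numpy as np

def page_rank(n, edges, teleport=0.15, tol=1e-12, max_iter=2000):
    """Power iteration for the fixed point x = A x described above.

    n        -- number of vertices, labeled 0 .. n-1
    edges    -- directed (i, j) pairs
    teleport -- probability 1 - c of jumping to a random page
    Sinks are linked to every vertex (the RemoveSinks->True model).
    """
    c = 1.0 - teleport
    out = [[] for _ in range(n)]
    for i, j in edges:
        out[i].append(j)
    x = np.ones(n)                                  # start from a vector of 1s
    for _ in range(max_iter):
        new = np.full(n, (1.0 - c) * x.sum() / n)   # teleporting contribution
        for i, nbrs in enumerate(out):
            if nbrs:                                # follow an out-link of i
                share = c * x[i] / len(nbrs)
                for j in nbrs:
                    new[j] += share
            else:                                   # sink: linked to every vertex
                new += c * x[i] / n
        if np.abs(new - x).mean() < tol:            # average change below tolerance
            return new
        x = new
    return x
```

For the two-vertex graph with the single link 1→2 discussed below, this gives page ranks of roughly 0.70 and 1.30.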
An alternative model for dealing with a sink is not to remove it by linking it to every node, but to link it only to itself. Under this model, the only way to get out of a sink is through teleporting. Hence at such a node, the surfer has a probability of c of staying at the node, and (1-c)/n of visiting any node. This model can be accessed through the option RemoveSinks->False.
Options for PageRanks and PageRankVector
The following options are accepted.
Options for PageRanks and PageRankVector.
Tolerance
The Tolerance option specifies the convergence tolerance of the iterative algorithm for the page rank calculation. If the average change is less than the tolerance, the iterative process terminates. Possible values for this option are Automatic or a positive machine-sized number.
TeleportProbability
The TeleportProbability option specifies the probability that the internet surfer may choose to visit nodes randomly, instead of following the outlinks of a vertex. Possible values for this option are positive machine-sized numbers less than 1, with a default value of 0.15. Smaller values of this option make the page ranks a more accurate reflection of a normal internet surfer's behavior, but make the iterative process that calculates the page ranks converge more slowly.
RemoveSinks
The RemoveSinks option specifies whether sinks (nodes with no outlinks) are removed by linking them to every node. The default value is True. Possible values are True or False.
This calculates the PageRanks of the graph without removing the sink (node 2).
Out[3]=  

To understand the results, assume the probability (the page rank) of the surfer being at node i is r_{i}, and the teleport probability is p. With RemoveSinks->True, the surfer visits node 1 either from node 2 by following one of the two links added from node 2 after removing the sink (with a probability of (1-p)r_{2}/2), or from node 1 due to teleporting (with a probability of pr_{1}/2), or from node 2 due to teleporting (with a probability of pr_{2}/2). Hence r_{1} = (1-p)r_{2}/2 + pr_{1}/2 + pr_{2}/2. Similarly, node 2 is visited either due to teleporting from node 1 (probability pr_{1}/2), or from node 1 by following the link (probability (1-p)r_{1}), or from node 2 by following the added self-link (probability (1-p)r_{2}/2), or from node 2 due to teleporting (probability pr_{2}/2). Thus the probabilities satisfy the following equations, normalized so that r_{1}+r_{2}=2.
Out[4]=  

When p=0.15, substituting for p gives the following.
Out[5]=  

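As a check, the normalized balance equations above can be solved directly as a 2×2 linear system. A small sketch, using the simplification (1-p)/2 + p/2 = 1/2:

```python
import numpy as np

p = 0.15
# r1 = (1-p) r2/2 + p r1/2 + p r2/2  rearranges to  (1 - p/2) r1 - r2/2 = 0,
# and the normalization is r1 + r2 = 2.
a = np.array([[1.0 - p / 2.0, -0.5],
              [1.0, 1.0]])
b = np.array([0.0, 2.0])
r1, r2 = np.linalg.solve(a, b)
print(round(r1, 4), round(r2, 4))   # r1 ≈ 0.7018, r2 ≈ 1.2982
```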
When the sink is not removed, the following equations are satisfied.
Out[6]=  
Out[7]=  

PseudoDiameter
PseudoDiameter[g]  gives the pseudodiameter of the undirected graph g, and the two vertices that achieve this diameter 
Finding the pseudodiameter of undirected graphs.
A graph geodesic is a shortest path between two vertices of a graph. The graph diameter is the longest possible length of all graph geodesics of the graph.
PseudoDiameter finds an approximate graph diameter.
The algorithm works by starting from a vertex u and finding a vertex v that is farthest away from u. This process is repeated by treating v as the new starting vertex, and ends when the graph distance no longer increases. A vertex from the last level set that has the smallest degree is then chosen as the final starting vertex u, and a traversal is done to see if the graph distance can be increased. This graph distance is taken to be the pseudodiameter.
If the graph is disconnected, then the diameter and vertices for each connected component are returned.
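The procedure amounts to repeated breadth-first searches. The following Python sketch illustrates the idea on a single connected component; it is a simplified model of the heuristic described above, not the package's exact implementation.

```python
from collections import deque

def pseudo_diameter(adj):
    """Double-sweep pseudo-diameter heuristic, as described above.

    adj -- dict mapping each vertex to a list of neighbors
           (undirected graph, assumed connected)
    Returns the pseudo-diameter, a lower bound on the true diameter.
    """
    def bfs(src):
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        ecc = max(dist.values())                   # eccentricity of src
        return ecc, [v for v, d in dist.items() if d == ecc]

    u = next(iter(adj))
    diam = -1
    while True:
        ecc, last = bfs(u)
        if ecc <= diam:        # the distance no longer increases: stop
            return diam
        diam = ecc
        # restart from a smallest-degree vertex in the farthest level set
        u = min(last, key=lambda w: len(adj[w]))
```

For example, a pentagon graph gives 2 and a 5×5 grid graph gives 8 (corner to corner), matching the examples that follow.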
This shows that a pentagon graph has a diameter of 2.
Out[10]=  

This is a plot showing the graph with the two vertices of the pseudodiameter highlighted in red.
Out[11]=  

This defines a grid graph and finds its pseudodiameter.
Out[13]=  

This finds the graph geodesic between vertices 1 and 25, and highlights the graph geodesic in red.
Out[19]=  

The Aggressive option for PseudoDiameter
  
Aggressive  False  whether to make extra effort in finding the optimal graph diameter 
Option for PseudoDiameter.
The Aggressive option specifies whether to make extra effort in finding the optimal graph diameter. Possible values are True or False (the default). With the setting Aggressive->True, a traversal is carried out from each vertex in the last level set. This can give a larger pseudodiameter on occasion.
This finds the average pseudodiameter of random graphs represented by sparse matrices of dimension 100×100 with 4% nonzeros, with the Aggressive option set to False or True. The largest pseudodiameter is chosen if there is more than one component.
Out[39]=  

StrongComponents and WeakComponents
StrongComponents[g]  returns a list of all strongly connected components in a directed graph 
WeakComponents[g]  returns a list of all weakly connected components in the undirected graph g 
Finding strongly and weakly connected components.
A strongly connected component in a directed graph is a maximal set of vertices such that there is a directed path between any two vertices in the set. StrongComponents[g] returns a list of all strongly connected components in the directed graph g.
A weakly connected component of a directed graph is a maximal set of vertices such that each pair of vertices is connected by a path when the graph g is considered as undirected. WeakComponents[g] returns a list of all weakly connected components in the graph g.
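These definitions can be made concrete with a standard algorithm. Here is an illustrative Python sketch of Kosaraju's two-pass method for strongly connected components (the package may use a different algorithm internally).

```python
def strong_components(n, edges):
    """Kosaraju's algorithm: record DFS finishing order on the graph, then
    DFS on the reversed graph peels off one strong component per root.

    n     -- number of vertices, labeled 0 .. n-1
    edges -- directed (i, j) pairs
    """
    adj = [[] for _ in range(n)]
    radj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        radj[j].append(i)

    # First pass: iterative DFS, recording vertices in order of completion.
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(adj[s]))]
        while stack:
            u, it = stack[-1]
            for v in it:
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, iter(adj[v])))
                    break
            else:               # all neighbors of u are done
                order.append(u)
                stack.pop()

    # Second pass: DFS on the reversed graph in reverse finishing order.
    comps, seen = [], [False] * n
    for s in reversed(order):
        if seen[s]:
            continue
        seen[s] = True
        comp, stack = [], [s]
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in radj[u]:
                if not seen[v]:
                    seen[v] = True
                    stack.append(v)
        comps.append(comp)
    return comps
```

Running the same routine after adding every edge in both directions yields the weakly connected components; this is the symmetrization trick used in the Applications section below.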
This shows a simple directed graph.
Out[45]=  

This shows that no strongly connected component of the graph consists of more than one vertex.
Out[46]=  

If an edge from 3 to 2 is added, then {2, 3, 4, 5} becomes a strongly connected component.
Out[51]=  
Out[52]=  

This defines a disconnected graph.
Out[56]=  

This graph has two weakly connected components.
Out[57]=  

This finds the strongly connected components.
Out[58]=  

Applications
Finding the number of connected components in an undirected graph
StrongComponents can be used to work out how many connected components there are in an undirected graph.
This shows a symmetric matrix representing a disconnected graph. 
This shows that the graph has 2 disconnected components. 
A graph specified as a rule list can first be symmetrized, and then its connected components can be found using StrongComponents.
This specifies a graph using a rule list and plots it as an undirected graph. 
This symmetrizes the graph to make it undirected, then finds the two connected components of the undirected graph. 
This shows how the number of components changes with the average degree of the graph for an undirected graph with 100 vertices. RandomSymmetricSparseMatrix was defined earlier. 
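For the undirected case, the component count can also be obtained directly with a disjoint-set (union-find) structure. A minimal sketch, independent of the package:

```python
def count_components(n, edges):
    """Number of connected components of an undirected graph on
    vertices 0 .. n-1; edges may be given in either direction."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    comps = n
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:                        # merging two components
            parent[ra] = rb
            comps -= 1
    return comps
```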
Ordering a matrix into block triangular form
The output from StrongComponents can also be used to reorder a matrix into block triangular form.
This generates a random directed graph of dimensions 100×100, with average degree 2. The RandomSparseMatrix function was defined earlier. 
This displays the matrix. 
This finds the strongly connected vertices. 
This orders the matrix by symmetrically permuting its rows and columns with the list of strongly connected vertices. The reordered matrix is shown. 
The reordered matrix is block triangular, with one big block. The other components are single vertices. 
To make the matrix block triangular with as many nonzero entries on the diagonal as possible, first use MaximalBipartiteMatching to permute entries to the diagonal, and then use StrongComponents to find the strongly connected components. 
Now the block is much smaller. The rest of the strongly connected components consist of single vertices. 
ToCombinatoricaGraph
ToCombinatoricaGraph[g]  returns the Combinatorica representation of the graph g 
ToCombinatoricaGraph[g,n]  returns the graph g, adding additional unconnected vertices, if necessary, to create a graph with n vertices 
Converting to a Combinatorica graph object.
ToCombinatoricaGraph converts a graph in any format acceptable to the Graph Utilities Package into a Combinatorica graph object.
This defines a simple graph. 
This shows the Combinatorica object.
Out[65]=  

This uses the optional second argument to pad the graph with additional unconnected vertices.
Out[66]=  

The Method Option for ToCombinatoricaGraph
  
Method  Automatic  method used to lay out the graph 
Option for ToCombinatoricaGraph.
The Method option specifies the method used to lay out the graph. The coordinates thus calculated are embedded in the resulting Combinatorica graph object. Possible values are the same as the values for the Method option of GraphPlot. The default is Automatic, which uses the default method of GraphPlot, except when the input is a Combinatorica object, in which case the coordinates that come with the object are not changed.
This defines a simple graph. 
This shows the previous graph converted to a Combinatorica object and contrasts it with the drawing by GraphPlot.
Out[84]=  

This uses the spring embedding method to find the coordinates.
Out[85]=  

This defines a Combinatorica object.
Out[89]=  

This adds extra vertices, using the "SpringEmbedding" method.
Out[92]=  

VertexList
VertexList[g]  gives a list of all vertices of the graph g 
Getting a list of all vertices.
VertexList returns a list of the vertices of the graph g. For graphs specified using a rule list, the vertices are returned in the order in which they first appear. Functions such as GraphCoordinates and PageRankVector return information on vertices in the order given by VertexList.
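The first-appearance ordering is easy to emulate. A hypothetical helper in Python (the name and representation are illustrative), where a rule list is represented as (source, target) pairs:

```python
def vertex_list(rules):
    """Vertices of a rule-list graph, in order of first appearance,
    mirroring the ordering behavior described above."""
    seen, order = set(), []
    for a, b in rules:
        for v in (a, b):        # source is scanned before target
            if v not in seen:
                seen.add(v)
                order.append(v)
    return order
```

Note that, as with VertexList, this ordering generally differs from the lexicographical ordering of the vertex labels.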
This gives a list of vertices of a simple graph.
Out[94]=  

This shows the relationship between the vertex list and the coordinates.
Out[98]=  

This draws the graph using the coordinate information.
Out[99]=  

This shows that the ordering of vertices from VertexList may not be the same as the lexicographical ordering of vertex labels.
Out[93]=  
