# Graph Utilities Package

The Graph Utilities Package contains a number of functions useful for graph theory applications.

Functions in the Graph Utilities Package.

In[1]:= |

A graph can be represented by a list of rules or its adjacency matrix. Graphs in the *Combinatorica* Package format are also supported.

### AdjacencyMatrix

AdjacencyMatrix[g] | give the SparseArray object representing the graph g |

AdjacencyMatrix[g,n] | give the SparseArray object representing the graph g, adding additional unconnected vertices, if necessary, to create a graph with n vertices |

Finding the adjacency matrix of a graph.

Given a graph in any of the formats supported by the Graph Utilities Package, AdjacencyMatrix returns a matrix representing this graph.

The graph g can be specified by a rule list, an adjacency matrix, or the *Combinatorica* representation of a graph.

The rows and columns of the SparseArray object correspond to vertices in the order returned by VertexList[g].

In[1]:= |

In[2]:= |

In[3]:= |

Out[3]//MatrixForm= | |

In[4]:= |

Out[4]= |

In[5]:= |

Out[5]//MatrixForm= | |

In[6]:= |

Out[6]//MatrixForm= | |

In[7]:= |

Out[7]//MatrixForm= | |

### Bicomponents

Bicomponents[g] | gives the biconnected components of the undirected graph g |

Finding biconnected components.

Bicomponents[g] returns a list of all biconnected components in the undirected graph g.

A vertex v is a cutpoint of a graph g if g is not connected after removing the vertex v and all edges. A graph is biconnected if it has no cutpoint. A biconnected component (bicomponent) is a maximal subgraph that has no cutpoint.

In[8]:= |

Out[8]= |

In[9]:= |

In[10]:= |

Out[10]= |

In[11]:= |

Out[11]= |

### ClosenessCentrality

ClosenessCentrality[g] | finds the closeness centrality |

Finding the closeness centrality.

The closeness centrality of a vertex u is defined as the inverse of the sum of the distance from u to all other vertices. The closeness centrality of a vertex in a disconnected graph is based on the closeness centrality of the component where this vertex belongs.

In[10]:= |

Out[13]= |

In[20]:= |

Out[20]= |

In[17]:= |

In[21]:= |

Out[21]= |

In[15]:= |

Out[15]= |

#### Options for ClosenessCentrality

option name | default value | |

Weighted | True | whether edge weight is used in calculating shortest path |

Normalize | True | whether to normalize the output |

Options for ClosenessCentrality.

##### Weighted

The Weighted option specifies whether edge weights are to be taken into account when calculating distances. Possible values are True (the default) or False. Weighted->False means that edges are assumed to have unit length.

In[22]:= |

Out[22]= |

In[23]:= |

Out[23]= |

##### Normalize

The option specifies whether the output is normalized to give a maximal centrality of 1. Possible values are True or False (the default).

In[25]:= |

Out[25]= |

In[26]:= |

Out[26]= |

### CommunityModularity, CommunityStructureAssignment, and CommunityStructurePartition

CommunityStructurePartition[g] | give the partition of a graph g into communities |

CommunityStructureAssignment[g] | give the assignment of the vertices of a graph g into communities |

CommunityModularity[g,partition] | give the community modularity of a partition |

CommunityModularity[g,assignment] | give the community modularity of an assignment |

Functions for finding and assessing community structure.

CommunityStructurePartition gives the partition of a graph into communities. The partition groups the vertices into communities, such that there is a higher density of edges within communities than between them. CommunityStructureAssignment gives the result as an assignment.

In[29]:= |

Out[30]= |

In[31]:= |

Out[31]= |

In[34]:= |

Out[34]= |

In[33]:= |

Out[33]= |

In[35]:= |

Out[35]= |

#### Algorithms for Finding Community Structure

Much relational information can be represented in graphs. One graph feature is community structure, which is the gathering of vertices into groups such that there is a higher density of edges within groups than between them. The discovery and analysis of community structure in graphs is useful in a number of applications. For example, in software engineering, it is sometimes necessary to group source code in directories so that interrelated routines are located in the same directory. In social networks, it is often necessary to identify groups of friends who form a close-knit community. In parallel computing, it is desirable to partition computing tasks, so that highly related tasks are placed on the same processor.

A number of algorithms exist that identify community structure in a network. One such algorithm is that of [16]. The key concept is a quantity called community modularity. Suppose the graph of concern is , and the vertex set is partitioned into subsets such that each subset belongs to one community. Then the community modularity of this partition is defined as .

Here, is the percentage of the number of edges that have both ends in the community , and is the percentage of edges that start from the community In other words,

Clearly, Typically, for a random partition of , is close to zero, while the partition of that gives a large positive value of indicates a significant community structure.

Therefore, the algorithm of [16] starts by assigning every vertex into its own community and calculates the change in that will result from merging two communities into one. It then merges the two communities that give the highest positive change in . This process is repeated on the merged network. The process terminates if the highest change is found to be negative.

#### The Weighted Option for CommunityStructurePartition, CommunityStructureAssignment, and CommunityModularity

The following option is accepted.

The Weighted option specifies whether edge weights are to be taken into account. Possible values for this option are True or False (the default).

In[36]:= |

In[36]:= |

Out[41]= |

In[42]:= |

Out[44]= |

### EdgeList

EdgeList[g] | give a list of edges in the graph g |

In[45]:= |

Out[45]= |

In[102]:= |

Out[102]= |

In[103]:= |

Out[103]= |

In[106]:= |

Out[106]= |

In[104]:= |

Out[104]= |

In[107]:= |

Out[107]= |

### ExpressionTreePlot

ExpressionTreePlot[e] | plot the expression tree of e |

ExpressionTreePlot[e,pos] | plot the expression tree of e with its root placed at position pos |

ExpressionTreePlot[e,pos,lev] | plot the expression tree of e up to level lev with its root placed at position pos |

Functions for plotting an expression tree.

ExpressionTreePlot plots the expression tree of an expression. It has the same options as TreePlot.

In[52]:= |

Out[52]= |

In[53]:= |

Out[53]= |

### GraphCoordinates and GraphCoordinates3D

GraphCoordinates[g] | calculate a visually appealing two-dimensional layout and return the coordinates of the vertices |

GraphCoordinates3D[g] | calculate a visually appealing three-dimensional layout and return the coordinates of the vertices |

Graph drawing functions that return coordinates.

GraphCoordinates returns the coordinates of the vertices as computed using a graph drawing algorithm. This is useful when you need to draw a graph repeatedly, using the same layout but different styles.

GraphCoordinates accepts the same options as GraphPlot.

In[54]:= |

In[55]:= |

Out[55]= |

In[56]:= |

Out[56]= |

In[58]:= |

Out[58]= |

In[8]:= |

Out[8]= |

GraphCoordinates only returns the position of the vertices, so when used for LayeredGraphPlot, curved edges will not be reproduced, although multiedges and self-loops will still be drawn.

In[59]:= |

In[60]:= |

Out[60]= |

In[61]:= |

Out[61]= |

In[62]:= |

Out[62]= |

#### Options for GraphCoordinates and GraphCoordinates3D

GraphCoordinates and GraphCoordinates3D accept the same options as GraphPlot and GraphPlot3D.

### GraphDistance, GraphPath, and GraphDistanceMatrix

GraphDistance[g,start,end] | give the distance from vertex i to vertex j in the graph g |

GraphPath[g,start,end] | find a shortest path between vertices start and end in graph g |

GraphDistanceMatrix[g] | give a matrix in which the entry is the length of a shortest path in g between vertices i and j |

GraphDistanceMatrix[g,Parent] | return a three-dimensional matrix in which the entry is the length of a shortest path from i to j and the entry is the predecessor of j in a shortest path from i to j |

Finding shortest distance, shortest path, and all-pairs shortest distances and paths.

GraphDistance returns the graph distance from one vertex to another. Infinity is returned if no path exists from i to j. By default every edge is assumed to have an edge weight of 1.

GraphPath and GraphDistanceMatrix find a shortest path and all-pairs shortest paths.

In[98]:= |

In[72]:= |

Out[72]= |

In[87]:= |

Out[87]= |

In[90]:= |

Out[90]= |

In[76]:= |

Out[76]= |

In[91]:= |

Out[91]= |

In[93]:= |

Out[93]//MatrixForm= | |

In[95]:= |

In[96]:= |

Out[96]= |

In[79]:= |

Out[79]//MatrixForm= | |

In[80]:= |

Out[81]//MatrixForm= | |

In[82]:= |

Out[82]= |

In[97]:= |

Out[97]= |

#### Algorithms for Shortest Paths

In the following it is assumed that is the edge weight of edge , and is the current estimate of the length of the shortest path from vertex to vertex .

##### Single-Source Shortest Path Algorithms

Single-source shortest path algorithms find the shortest path from one vertex to all other vertices.

Dijkstra's algorithm assumes that edge weights are non-negative.

: the set of vertices whose shortest distance to has been found.

: the set of vertices not in that "touches", stored in a heap.

1. Picks in the vertex with the smallest value. This step has complexity with a binary heap or with a Fibonacci heap, and in all cases it will be repeated times.

2. Places vertex in , inserts all neighbors of in if vertex is not in . If , it sets and updates the heap. Each insertion/update to the heap has complexity with a binary heap, and with a Fibonacci heap, and in all, update operations and insertions are needed.

So the overall complexity is with a binary heap, and with a Fibonacci heap.

The Bellman-Ford algorithm works even if edge weights are negative. It works by looping through each edge : if , set . Since this looping is carried out times, the Bellman-Ford algorithm has a complexity of , which is much higher than Dijkstra's algorithm. Therefore it is only used if the graph contains negative weights. The algorithm can detect negative weight cycles.

##### All-Pairs Shortest Path Algorithms

All-pairs shortest path algorithms find shortest paths between all vertex pairs.

The Floyd-Warshall** **algorithm sets all distances to initially, except for neighboring vertices, for which the distance is set to the edge weights. It then loops over all vertex pairs , and relaxes if , in which case it sets . This is so that at any iteration , the distance represents the shortest distance between vertices and , passing only through vertices This algorithm has a complexity of . It works even for negative weights, but does not detect negative weight cycles.

If the edge weights are non-negative, the Dijkstra algorithm can be applied repeatedly to each vertex, for single-source shortest paths. This gives a complexity of with a binary heap, and with a Fibonacci heap.

If there are negative edge weights, then the Dijkstra algorithm cannot be applied directly. The Johnson algorithm first makes all edge weights positive, by associating a value to every vertex , such that if . The edge weights are then modified by .

To modify the edge weights, a virtual vertex is added that is linked to every vertex, with an edge weight of 0. Then for any vertex , is defined as the distance of to . So for every edge , ; otherwise is not the length of a true shortest path. Hence .

This modification does not change the shortest path; hence the Dijkstra algorithm can then be applied to the weight-modified graph. After that, the distance is modified back by

The Johnson algorithm has the same complexity as the Dijkstra algorithm when edge weights are non-negative; otherwise it has the same complexity as the Bellman-Ford algorithm.

#### Options for GraphPath, GraphDistanceMatrix, and GraphDistance

option name | default value | |

Method | Automatic | algorithm to find shortest paths |

Weighted | True | whether edge weight is used in calculating shortest paths |

Options for GraphPath and GraphDistanceMatrix.

Option for GraphDistance.

##### Method

The option Method specifies the algorithm used to find the shortest paths. By default Method is set to Automatic, which uses Johnson's algorithm. Other possible values for this option are , , , and .

##### Weighted

The option Weighted specifies whether edge weights are to be taken into account when finding shortest paths. Possible values are True or False.

### GraphEdit

GraphEdit is a *J/Link* application. On executing this function, a Java window will be opened. Users can then enter a new graph or modify an existing graph, and get the resulting graph back in *Mathematica*.

The graph g may be specified by a rule list, an adjacency matrix, or the *Combinatorica* representation of a graph.

Vertices and edges can be added by clicking and dragging with the mouse. Vertices can be moved after setting "Drawing mode" to "Move" and dragging the mouse on the vertices. Vertices can be deleted after setting "Drawing mode" to "Delete" and selecting the vertices to be deleted by clicking.

Users can hold the Ctrl key and click the vertex (using the first mouse button) in the GraphEdit window to edit the vertex labels.

On closing the window, a drawing of the graph, a representation of the graph as a list of rules, the coordinates, and the vertex labels are returned.

### HamiltonianCycles and FindHamiltonianCycles

HamiltonianCycles[g,n] | give a list of n Hamiltonian cycles |

HamiltonianCycles[g] | give a list of one Hamiltonian cycle |

FindHamiltonianCycle[g] | attempt to find a Hamiltonian cycle |

In[2]:= |

In[10]:= |

Out[10]= |

In[11]:= |

Out[11]= |

In[9]:= |

Out[9]= |

#### Algorithms for Finding Hamiltonian Cycles

#### Options for FindHamiltonianCycle

Options for FindHamiltonianCycle.

##### MaxIterations

The option MaxIterations specifies the maximum number of iterations to try. Possible values are positive machine integers.

##### RandomSeed

The option specifies the random seed used in choosing random neighbors. Possible values are machine integers.

### LineScaledCoordinate

LineScaledCoordinate[{{x_{1},y_{1}},{x_{2},y_{2}},...,{x_{k},y_{k}}},r] | |

give the coordinate of a point in the polyline , at a scaled distance of r from point | |

LineScaledCoordinate[{{x_{1},y_{1}},{x_{2},y_{2}},...,{x_{k},y_{k}}}] | |

equivalent to LineScaledCoordinate[ |

Finding the coordinates of a point on a polyline.

LineScaledCoordinate is often used in conjunction with the EdgeRenderingFunction option of GraphPlot, GraphPlot3D, or LayeredGraphPlot to add text or graphics along the edge of a graph.

In[2]:= |

Out[2]= |

### MaximalBipartiteMatching

MaximalBipartiteMatching[g] | give the maximal matching of the bipartite graph g |

Finding the maximal bipartite matching.

MaximalBipartiteMatching gives a maximal set of nonadjacent edges between the two vertex sets of the bipartite graph.

The bipartite graph represented by an m×n matrix consists of the row and column vertex sets and , with a vertex and connected if the matrix element .

The bipartite graph represented by a rule list consists of vertex sets r=Union[{i_{1}, i_{2}, ...}] and c=Union[{j_{1}, j_{2}, ...}], with vertices and connected if the rule is included in the rule list.

MaximalBipartiteMatching returns a list of index pairs , where the number of pairs k is not larger than either vertex set.

In[5]:= |

Out[5]= |

#### Applications

MaximalBipartititeMatching can be used to permute as many entries as possible onto the diagonal of a sparse matrix.

In[6]:= |

Out[6]= |

In[7]:= |

Out[7]= |

In[8]:= |

Out[8]= |

In[9]:= |

Out[9]= |

In[12]:= |

Out[12]= |

In[13]:= |

In[15]:= |

Out[15]= |

MaximalBipartititeMatching can be used to find the structural rank of a sparse matrix. The structural rank of a matrix is the number of entries in the maximal matching of the bipartite graph of the matrix. It is an upper bound on the numeric rank of the matrix. A matrix has structural full rank if it can be permuted so that the diagonal has no zero entries.

In[24]:= |

Out[24]= |

### MaximalIndependentEdgeSet

MaximalIndependentEdgeSet[g] | give a maximal independent edge set of an undirected graph g |

Finding the maximal independent edge set.

MaximalIndependentEdgeSet gives an approximate maximal set of pairwise nonadjacent edges of g. A maximal independent edge set of a graph is also called a maximal matching.

In[24]:= |

In[27]:= |

Out[27]= |

In[31]:= |

Out[31]= |

In[47]:= |

In[50]:= |

Out[50]= |

In[51]:= |

Out[51]= |

In[52]:= |

Out[52]= |

In[53]:= |

Out[53]= |

#### Option for MaximalIndependentEdgeSet

option name | default value | |

Weighted | True | whether edges with higher weights are preferred when forming the maximal independent edge set |

Option for MaximalIndependentEdgeSet.

##### Weighted

The Weighted option specifies whether edges with higher weights are preferred when forming the maximal independent edge set. Possible values are True (the default) or False.

### MaximalIndependentVertexSet

MaximalIndependentVertexSet[g] | give a maximal independent vertex set of an undirected graph g |

MaximalIndependentVertexSet[g,w] | give a maximal independent vertex set of g with vertices weighted by w |

Finding the maximal independent vertex set.

MaximalIndependentVertexSet gives an approximate maximal set of vertices such that no two vertices form an edge. It treats the input as an undirected graph.

The length of the vector w must be the same as the number of vertices in g.

In[55]:= |

In[56]:= |

Out[56]= |

In[57]:= |

Out[57]= |

In[58]:= |

Out[58]= |

In[59]:= |

Out[59]= |

In[60]:= |

Out[60]= |

### MinCut

MinCut[g] | partition the undirected graph g into parts, where the number of edge cuts is approximately minimized |

MinCut treats the input as an undirected graph and tries to find a partition of the vertices into parts so that each part has roughly the same number of vertices, and the number of edges between these parts (known as the edge separator) is minimized.

In[75]:= |

In[76]:= |

Out[76]= |

In[77]:= |

Out[77]= |

In[78]:= |

Out[78]= |

In[95]:= |

Out[96]= |

In[99]:= |

Out[99]= |

In[97]:= |

Out[97]= |

In[98]:= |

Out[98]= |

##### Applications

One of the important uses of the MinCut function is in partitioning graphs and meshes into subdomains of roughly equal size, so that the number of edges cut by the partition is minimized. This is important in areas such as parallel computing, where each processor works on one subdomain and the communication overhead between processors must be minimized. The communication overhead is proportional to the number of edges cut, so the number of edges cut by the partition must be minimized.

Suppose a parallel finite element structural analysis is to be carried out on a three-dimensional structure. An irregular mesh is generated to model the structure.

In[108]:= |

Out[109]= |

In[110]:= |

In[115]:= |

Out[115]= |

### MinimumBandwidthOrdering

MinimumBandwidthOrdering[g] | attempt to find a vertex ordering r that minimizes the bandwidth of the underlying undirected graph of the graph g |

MinimumBandwidthOrdering[m] | attempt to find a pair of orderings that minimizes the bandwidth of the matrix , where m is a matrix and r and c are permutations; for symmetric m, r and c are the same |

Finding orderings that minimize the bandwidth of graphs and matrices.

For a graph with vertex ordering , the bandwidth of the graph is defined as

The bandwidth of a vertex is defined as

For a matrix , the bandwidth is defined as

For a symmetric matrix, the envelope size is defined as

which is the sum of the distance from the first element in each row to the diagonal position. If there is neither a diagonal element in a row nor elements before the diagonal, the distance is assumed to be zero.

The profile of a matrix is simply the envelope size plus the size of the matrix.

In[117]:= |

In[118]:= |

Out[118]= |

In[119]:= |

Out[119]= |

In[120]:= |

Out[120]= |

In[121]:= |

Out[121]= |

In[122]:= |

Out[122]= |

In[123]:= |

Out[123]= |

In[127]:= |

In[128]:= |

Out[128]= |

In[133]:= |

Out[133]= |

In[1]:= |

In[3]:= |

In[5]:= |

Out[5]= |

In[9]:= |

Out[9]= |

#### Algorithms for Minimizing Bandwidth/Envelope Size

The problem of finding the minimum bandwidth ordering of a graph has been proven to be NP-complete; hence, practical algorithms are all heuristics aimed at finding approximate optimal orderings.

##### Basic Algorithms

For a connected undirected graph, the reverse Cuthill-McKee (RCM) algorithm [8] starts by finding a pseudo-diameter of the graph. Then from one end of the diameter, vertices are ordered and form a number of level sets, where vertices within the same level set have the same distance to Within the same level set, priority is given to vertices that are connected to the lowest-ordered vertex possible. Once all vertices are ordered, their order is reversed. While reversing the order has little effect on the bandwidth, it was found [13] that reversing the order often reduces the fill-in when the ordering is used for the factorization of sparse matrices. In this implementation a variant algorithm, RCMD, is also provided, which breaks the tie between vertices that are connected to the same lowest-ordered vertex by ordering vertices with the lowest degree first.

The result from RCM or RCMD can be further refined and improved. The node centroid and hill-climbing algorithms in [9, 12] are two possible procedures. In the hill-climbing algorithm, critical vertices (those that have a bandwidth equal to the bandwidth of the graph) are identified and attempts are made at swapping the order of a critical vertex with noncritical vertices. This continues until neither the number of critical vertices nor the bandwidth can be reduced. In the node centroid algorithm, near-critical vertices (those with a bandwidth close to the bandwidth of the graph) are identified and their new orders are calculated as the average of their order and their neighbors' orders. The vertices are then sorted and reordered. The node centroid algorithm is always followed by the hill-climbing algorithm. The hill-climbing refinement, or the combined node centroid and hill-climbing algorithms, can be chosen using RefinementMethod->"HillClimbing" or RefinementMethod->"NodeCentroidHillClimbing".

The node centroid and hill-climbing procedures can also be combined with a multilevel procedure. Here, a sequence of smaller and smaller graphs is generated from the original graph. An initial ordering is found for the smallest graph using RCM or RCMD. Orderings of the coarser graphs are then interpolated into finer graphs and refined using the node centroid and hill-climbing procedures. The multilevel method can be chosen with RecursionMethod->"Multilevel".

The Sloan algorithm [10, 11] aims to minimize the envelope size of an ordering of an undirected graph. It works by first finding the graph's pseudo-diameter, formed by the pair and . Vertices fall into four mutually exclusive classes: ordered (those that are already ordered), active (those that are connected to one more or more active vertices), preactive (those that are connected to one or more active vertices, but not to ordered vertices), and inactive (neither ordered, active, nor preactive). Starting from , vertices are prioritized. The priority of a vertex is given by , where is the number of preactive and inactive vertices that will become active if is ordered next, plus 1 if is currently preactive; is the distance between and ; and and are two positive weights. The algorithm works by selecting from the active and preactive vertices a vertex that has the highest priority. Once a vertex is selected, the priority of the remaining active and inactive vertices is updated. For efficiency in deciding the vertex of the highest priority (without using a sort), a binary heap data structure is used. The Sloan algorithm tends to give an ordering that has a small envelope size at the expense of a larger bandwidth, as compared with the RCM algorithm and variants.

##### Algorithms for Asymmetric Matrices

For a graph g, MinimumBandwidthOrdering attempts to find the vertex ordering that minimizes the bandwidth of the underlying undirected graph of g. However, for an unsymmetric matrix of dimension ×, following [12], MinimumBandwidthOrdering works with the symmetric matrix

which represents the bipartite graph of the rows and columns of . An ordering for is first computed. This is then converted to an ordering of . Specifically, suppose the ordering minimizes the bandwidth of . Then the row ordering of is given by the elements of that are less than or equal to , and the column ordering is given by those that are greater than , minus . That is,

#### Options for MinimumBandwidthOrdering

option name | default value | |

Method | Automatic | method to use |

RefinementMethod | Automatic | algorithms used to further improve the quality of the ordering |

RecursionMethod | None | recursion method to use |

Options for MinimumBandwidthOrdering.

##### Method

The option Method specifies the method used to minimize the bandwidth or envelope size. Possible values for this option are Automatic, , , and . By default, Method is set to Automatic.

The following example illustrates the difference between bandwidth reduction algorithms and the envelope size reduction algorithm.

In[145]:= |

In[151]:= |

In[152]:= |

In[156]:= |

Out[156]= |

In[161]:= |

In[11]:= |

Out[11]= |

In[157]:= |

Out[157]= |

##### RefinementMethod

The RefinementMethod option specifies the refinement method used to further improve the bandwidth following the application of one of the previous methods. Possible values for this option are Automatic (the default), None, , and .

##### RecursionMethod

The RecursionMethod option specifies whether to employ a multilevel process. Possible values for this option are None (the default) and .

#### Applications

One of the applications of minimum bandwidth ordering is in optimizing cache performance of numerical calculations. For example, in multiplying a sparse matrix with a vector, if the matrix is preordered to minimize the bandwidth, then elements of the vector will not be accessed randomly, thus improving cache performance. Because ordering itself takes time, this improvement in cache performance would be beneficial only if the matrix-vector product operation were to be performed repeatedly many times.

In[10]:= |

In[11]:= |

In[14]:= |

Out[15]= |

In[16]:= |

In[19]:= |

Out[19]= |

In[20]:= |

Out[20]= |

### NeighborhoodSubgraph

NeighborhoodSubgraph[g,i,r] | give a subgraph consisting of vertices that can be reached from vertex i within r hops |

NeighborhoodSubgraph[g,i] | give a subgraph consisting of vertices that can be reached from vertex i |

Finding a subgraph in the neighborhood.

In[64]:= |

Out[66]= |

In[67]:= |

Out[67]= |

In[68]:= |

Out[68]= |

In[69]:= |

Out[69]= |

### NeighborhoodVertices

NeighborhoodVertices[g,i,r] | give vertices that can be reached from vertex i within r hops |

NeighborhoodVertices[g,i] | give vertices that can be reached from vertex i |

Finding vertices in the neighborhood.

In[44]:= |

Out[46]= |

In[60]:= |

Out[61]= |

In[62]:= |

Out[62]= |

### PageRanks, PageRankVector, LinkRanks, and LinkRankMatrix

PageRanks[g] | give the page rank of the graph g as a rule list |

PageRankVector[g] | give the page rank of the graph g as a vector |

PageRanks and PageRankVector functions.

In[29]:= |

Out[30]= |

In[31]:= |

Out[31]= |

In[32]:= |

Out[34]= |

#### PageRanks Algorithm

Roughly speaking, for a directed graph, the page rank of a vertex is the probability a random internet surfer visits that vertex, assuming that the directed graph represents the internet, with a vertex having a link to a vertex if there is an edge from to .

Imagine a random surfer visiting a page at time step . At the next step, the surfer will choose randomly, at equal probability, a node from among 's out-neighbors,

Some modification has to be made to the graph to avoid the surfer being trapped at a sink (a node that does not have any outer links). Any node that has a zero outer degree will be linked to every node, including itself. With this modification, the following can be defined.

The page rank of a page is defined as the probability that the surfer, at time step , is at page .

Define a matrix to be of the same structure as the graph, with entry values for all , with the outer degree of . Thus provided that , the sum of the row is 1. If , modify row by setting every entry in that row to be , with the number of vertices. Thus the modified matrix is

with . Here is the -dimensional column vector of all 1s, and is a column vector of 0s, except that for . Thus is a matrix with row of entry values if , and entry values 0 for other rows.

One more modification is needed. If the directed graph given by is not strongly connected, then the surfer will be trapped in a component of the directed graph. Thus, to ensure that every page can be visited, assume that the surfer, on visiting a page , has a small probability (the teleport probability) of visiting any page randomly. Therefore, to summarize, from a node , there is a probability of of visiting any of its neighbors and a probability of visiting any node (including itself); from a node , there is a probability of of visiting any node. In matrix terms, a page rank vector must satisfy

Denoting , the task of calculating page rank is that of finding a fixed point of .

The fixed-point calculation is done iteratively, starting from a vector of 1s. If the average change in the page ranks is less than a tolerance, the process is assumed to have converged. The final result is an approximation of the surfer's probability, scaled so that the sum is

An alternative model for dealing with a sink is not to remove it by linking it to every node, but just to link to itself. Under this model, the only way to get out of a sink is through teleporting. Hence at this node, the surfer has a probability of to stay at the node, and of visiting any node. This model can be accessed through the option RemoveSinks->False.

#### Options for PageRanks and PageRankVector

The following options are accepted.

option name | default value | |

Tolerance | Automatic | tolerance used for convergence check |

TeleportProbability | 0.15 | probability of visiting random nodes |

RemoveSinks | True | whether to remove sinks by linking them with every node |

Options for PageRanks and PageRankVector.

##### Tolerance

The Tolerance option specifies the convergence tolerance of the iterative algorithm for the page rank calculation. If the average change is less than the tolerance, the iterative process terminates. Possible values for this option are Automatic, or a positive machine-size number.

##### TeleportProbability

The TeleportProbablity option specifies the probability that the internet surfer may choose to visit nodes randomly, instead of following the outlinks of a vertex.

Possible values for this option are positive machine-size numbers less than 1, with a default value of 0.15. Smaller values of this option make the page ranks a more accurate reflection of a normal internet surfer's behavior, but make the iterative process that calculates the page ranks converge more slowly.

##### RemoveSinks

The RemoveSinks option specifies whether sinks (nodes with no outlinks) are removed by linking with every node. The default value is True. Possible values are True or False.

In[2]:= |

Out[2]= |

In[3]:= |

Out[3]= |

In[4]:= |

Out[4]= |

In[5]:= |

Out[5]= |

In[6]:= |

Out[6]= |

In[7]:= |

Out[7]= |

### PseudoDiameter

PseudoDiameter[g] | give the pseudo-diameter of the undirected graph g and the two vertices that achieve this diameter |

Finding the pseudo-diameter of undirected graphs.

A graph geodesic is a shortest path between two vertices of a graph. The graph diameter is the longest possible length of all graph geodesics of the graph. PseudoDiameter finds an approximate graph diameter.

The algorithm used works by starting from a vertex u, and finds a vertex v that is farthest away from u. This process is repeated by treating v as the new starting vertex and ends when the graph distance no longer increases. A vertex from the last level set that has the smallest degree is chosen as the final starting vertex u, and a traversal is done to see if the graph distance can be increased. This graph distance is taken to be the pseudo-diameter.

If the graph is disconnected, then the diameter and vertices for each connected component are returned.

In[8]:= |

In[10]:= |

Out[10]= |

In[11]:= |

Out[11]= |

In[12]:= |

In[13]:= |

Out[13]= |

In[15]:= |

In[19]:= |

Out[19]= |

In[20]:= |

Out[20]= |

In[21]:= |

Out[21]= |

In[22]:= |

Out[22]= |

#### The Aggressive Option for PseudoDiameter

option name | default value | |

Aggressive | False | whether to make extra effort in finding the optimal graph diameter |

Option for PseudoDiameter.

The option Aggressive specifies whether to make an extra effort in finding the optimal graph diameter. Possible values are True or False (the default). With the setting Aggressive->True, a traversal is carried out from each vertex in the last level set. This can give a larger pseudo-diameter on occasions.

In[39]:= |

Out[39]= |

### StrongComponents and WeakComponents

StrongComponents[g] | return a list of all strongly connected components in a directed graph |

WeakComponents[g] | return a list of all weakly connected components in the undirected graph g |

Finding strongly connected components.

A strongly connected component in a directed graph is a set of vertices such that there is a path between any two nodes in the set.

StrongComponents[g] returns a list of all strongly connected components in the directed graph g.

A weakly connected component of a directed graph is a set of vertices such that for each pair of vertices, there is a path between them. The graph g is considered as undirected.

WeakComponents[g] returns a list of all weakly connected components in the undirected graph g.

In[44]:= |

Out[45]= |

In[46]:= |

Out[46]= |

In[50]:= |

Out[51]= |

In[52]:= |

Out[52]= |

In[53]:= |

Out[53]= |

In[54]:= |

Out[54]= |

In[55]:= |

In[56]:= |

Out[56]= |

In[57]:= |

Out[57]= |

In[58]:= |

Out[58]= |

#### Applications

##### Finding the Number of Connected Components in an Undirected Graph

StrongComponents can be used to work out how many connected components there are in an undirected graph.

A graph specified as a rule list can be first symmetrized, and then its connected components can be found using StrongComponents.

##### Ordering a Matrix into Block Triangular Form

The output from StrongComponents can also be used to reorder a matrix into block triangular form.

### ToCombinatoricaGraph

ToCombinatoricaGraph[g] | return the Combinatorica representation of the graph g |

ToCombinatoricaGraph[g,n] | return the graph g, adding additional unconnected vertices, if necessary, to create a graph with n vertices |

Converting to a *Combinatorica *graph object.

ToCombinatoricaGraph converts a graph in any format acceptable by the Graph Utilities Package to a *Combinatorica* graph object.

In[64]:= |

In[65]:= |

Out[65]= |

In[66]:= |

Out[66]= |

#### The Method Option for ToCombinatoricaGraph

Option for ToCombinatoricaGraph.

The option Method specifies the method used to lay out the graph. The coordinates thus calculated are embedded in the resulting *Combinatorica* graph object. Possible values are the same as values for the Method option of GraphPlot. The default is Automatic, which uses the default method of GraphPlot, except when the input is a *Combinatorica* object, in which case the coordinates that come with the object are not changed.

In[81]:= |

*Combinatorica*object and contrasts it with the drawing by GraphPlot.

In[84]:= |

Out[84]= |

In[85]:= |

Out[85]= |

In[89]:= |

Out[89]= |

In[90]:= |

Out[90]= |

In[91]:= |

Out[91]= |

In[92]:= |

Out[92]= |

### VertexList

VertexList[g] | give a list of all vertices of the graph g |

Getting a list of all vertices.

VertexList returns a list of vertices of the graph g.

For graphs specified using a rule list, the vertices are returned in the order in which they first appear.

Functions such as GraphCoordinates and PageRankVector return information on vertices in the order given by VertexList.

In[94]:= |

Out[94]= |

In[96]:= |

In[97]:= |

Out[97]= |

In[98]:= |

Out[98]= |

In[99]:= |

Out[99]= |

In[93]:= |

Out[93]= |

## References

[1] G. Di Battista, P. Eades, R. Tamassia, and I. G. Tollis, *Graph Drawing: Algorithms for the Visualization of Graphs*, Englewood Cliffs, NJ: Prentice-Hall, 1999.

[2] T. M. J. Fruchterman and E. M. Reingold, "Graph Drawing by Force-Directed Placement," *Software—Practice and Experience*, **21**(11), 1991 pp. 1129-1164.

[3] P. Eades, "A Heuristic for Graph Drawing," *Congressus Numerantium*, **42**, 1984 pp. 149-160.

[4] N. Quinn and M. Breuer, "A Force-Directed Component Placement Procedure for Printed Circuit Boards," *IEEE Transactions on Circuits and Systems, ** CAS-26*(6), 1979 pp. 377-388.

[5] T. Kamada and S. Kawai, "An Algorithm for Drawing General Undirected Graphs," *Information Processing Letters*, **31**(1), 1989 pp. 7-15.

[6] D. Harel and Y. Koren, "Graph Drawing by High-Dimensional Embedding," in *Proceedings of 10* *Int. Symp. Graph Drawing (GD'02)*;* *Lecture Notes in Computer Science, Vol. 2528, London: Springer Verlag, 2002 pp. 207-219.

[7] C. Walshaw, "A Multilevel Algorithm for Force-Directed Graph-Drawing," *Journal of Graph Algorithms and Applications*, **7**(3), 2003 pp. 253-285.

[8] E. Cuthill and J. McKee, "Reducing the Bandwidth of Sparse Symmetric Matrices," in *Proceedings of the 24** National Conference of ACM*, New York: ACM Publications, 1969 pp. 157-172.

[9] A. Lim, B. Rodrigues, and F. Xiao, "A Centroid-Based Approach to Solve the Bandwidth Minimization Problem," in *Proceedings of the 37* *Annual Hawaii International Conference on System Sciences (HICSS'04)*, Track 3, Vol. 3, Washington D.C.: IEEE Computer Society, 2004 p. 30075.1.

[10] S. T. Barnard, A. Pothen, and H. D. Simon, "A Spectral Algorithm for Envelope Reduction of Sparse Matrices," *Journal of Numerical Linear Algebra with Applications*, **2**(4), 1995 pp. 311-334.

[11] S. Sloan, "A Fortran Program for Profile and Wavefront Reduction," *International Journal for Numerical Methods in Engineering*, **28**(11), 1989 pp. 2651-2679.

[12] J. K. Reid and J. A. Scott, "Ordering Symmetric Sparse Matrices for Small Profile and Wavefront," *International Journal for Numerical Methods in Engineering*, **45**(), 1999 pp. 1737-1755.

[13] J. A. George, "Computer Implementation of the Finite-Element Method," Report STAN CS-71-208, Ph.D. thesis, Department of Computer Science, Stanford University, Stanford, California, 1971.

[14] K. Sugiyama, S. Tagawa, and M. Toda, "Methods for Visual Understanding of Hierarchical Systems," *IEEE Transactions on Systems, Man, and Cybernetics*, **11**(2), 1981 pp. 109-125.

[15] E. R. Gansner, E. Koutsofios, S. C. North and K.-P. Vo, "A Technique for Drawing Directed Graphs," *Software Engineering*, **19**(3), 1993, 214-230.

[16] A. Clauset, "Finding Local Community Structure in Networks," *Physical Review E*, **72**, 026132, 2005.

[17] Y. F. Hu, "Efficient, High-Quality Force-Directed Graph Drawing," *The Mathematica Journal*, **10**(1), 2005 pp. 37-71.