- BMC Bioinformatics
- v.8; 2007
- PMC1821340

# An efficient grid layout algorithm for biological networks utilizing various biological attributes

…^{#,1}, Masao Nagasaki^{✉,#,1}, Euna Jeong^{1}, Mitsuru Kato^{1}, and Satoru Miyano^{1}

^{1}Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan

^{✉}Corresponding author.

^{#}Contributed equally.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## Abstract

### Background

Clearly visualized biopathways greatly help in understanding biological systems. However, manually drawing large-scale biopathways is time consuming. We previously proposed a grid layout algorithm that can handle gene-regulatory networks and signal transduction pathways by considering edge-edge crossings, node-edge crossings, a distance measure between nodes, and subcellular localization information from Gene Ontology. That layout algorithm drastically reduced these crossings in the apoptosis model. However, for larger-scale networks, we encountered three problems: (i) the initial layout is often very far from any local optimum because nodes are initially placed at random; (ii) from a biological viewpoint, human layouts are still easier to understand than automatic layouts because, apart from subcellular localization, the algorithm does not fully utilize the biological information of pathways; and (iii) the algorithm employs a local search strategy in which the neighborhood is obtained by moving one node at each step, whereas automatic layouts suggest that simultaneous movements of multiple nodes are necessary for better layouts, although such an extension may worsen the time complexity.

### Results

We propose a new grid layout algorithm. To address problem (i), we devised a new force-directed algorithm whose output is suitable as the initial layout. For problem (ii), we considered that an appropriate alignment of nodes having the same biological attribute is one of the most important factors for comprehension, and we defined a new score function that rewards such configurations. For problem (iii), we developed a search strategy that considers swapping nodes as well as moving a node, while keeping the order of the time complexity. Although a naïve implementation increases the time complexity by one order, we overcame this difficulty by devising a method that caches the differences between the score of a layout and the scores of its possible updates.

### Conclusion

Layouts of the new grid layout algorithm are compared with those of the previous algorithm and a human layout in an endothelial cell model, which is three times as large as the apoptosis model. The total cost of the result from the new grid layout algorithm is similar to that of the human layout. In addition, its convergence time is drastically reduced (40% reduction).

## Background

Modeling and simulation of large-scale biological pathways are among the most important tasks in bioinformatics. Many applications, e.g., Cell Illustrator [1,2], Cytoscape [3], Pajek [4], PATIKA [5,6], and CADLIVE [7,8], have been developed in this area. Related to these topics, the visualization of biopathways is considered to play a key role in understanding biological systems. However, manual drawing of large-scale biopathways is time consuming; hence, suitable biopathway layout algorithms and their applications are in strong demand.

Biopathways are categorized into three types, i.e., metabolic pathways, signal transduction pathways, and gene-regulatory networks. For metabolic pathways, several algorithms have already been proposed [9-13], and some of them succeeded in capturing the flow of the reactions well. In contrast, few layout algorithms that support convenient biological understanding have been proposed for signal transduction pathways [14,15] and gene-regulatory networks [16,17]. Thus, our new layout algorithm focuses on signal transduction pathways and gene-regulatory networks. For these two types of biopathways, extant layout algorithms can be categorized into two types: force-directed and grid layout algorithms.

Force-directed algorithms are used in [16,17], taking into account directional constraints that follow from the different types of molecules and simple regional constraints from subcellular localizations. These algorithms have been successfully integrated into PATIKA. However, as pointed out in [14], force-directed algorithms may not be suitable for compact layouts of complex biopathways. Furthermore, intricately shaped regions such as torus-shaped regions cannot be handled well as regional constraints in these force-directed algorithms. Hence, they are not suitable for models containing a torus-shaped plasma membrane or nuclear membrane, although such models are common among biopathways.

A grid layout algorithm (referred to as LK-grid layout algorithm) was initially proposed by Li and Kurata. The grid layout algorithm restricts the positions of all nodes to grid points. Li and Kurata defined a cost function for two nodes that depends on a distance between these nodes and the topology of their connections in the graph. They applied LK-grid layout algorithm to a yeast cell-cycle pathway and concluded that this algorithm can geometrically classify the pathway into functional categories without using biological information. Moreover, they noted that the algorithm generates compact layouts while avoiding overlaps between nodes. CB-grid layout algorithm, proposed in [15], adds a penalty to the cost function in order to reduce edge-edge crossings and node-edge crossings. The algorithm can also deal with any complex regional constraints following subcellular localizations; moreover, the search space is reduced by these constraints. As a result, in the apoptosis model, the layout algorithm succeeded in drastically reducing edge-edge crossings and node-edge crossings, while placing nodes in biologically proper regions.

However, in the case of larger-scale networks, this algorithm encountered three problems. First, a layout with randomly placed nodes is used as the initial layout. This random layout contains a large number of edge-edge crossings and node-edge crossings; consequently, many iterations are required to obtain a locally optimal layout. Second, although one of the features of CB-grid layout algorithm is its use of subcellular localization information, it still does not fully utilize biological characteristics. For example, it does not consider biological attributes such as types of entities (protein, mRNA, and microRNA) or types of processes (phosphorylation, binding, and translation), although in human layouts these attributes tend to make biopathways of interest easier to comprehend. Third, following a greedy strategy, CB-grid layout algorithm updates a layout by moving one node at each step until the layout reaches an optimum. However, the resulting layouts are only local optima; hence, their quality fundamentally depends on the initial layout. Although a multi-step CB-grid layout algorithm was also proposed in [15] to address this drawback, it has a higher time complexity and hence is not suitable for practical applications.

To overcome these three problems, we propose a new grid layout algorithm. For the first problem, we propose a new force-directed algorithm whose output is suitable as the initial layout of grid layout algorithms. For the second problem, we introduce a concept that assigns a score, i.e., a negative cost, to a layout depending on how nodes with the same attribute are aligned. This concept is realized with a combo score function, which is combined with the cost function defined in CB-grid layout algorithm. For the third problem, the search strategy of CB-grid layout algorithm is improved by adding the swap operation while preserving the time complexity. With the swap operation, the new grid layout algorithm can also consider, at each step, layouts generated by exchanging the positions of two nodes in the current layout.

The Methods section is organized as follows: (i) first, we introduce the previous grid layout algorithm, i.e., CB-grid layout algorithm; (ii) for the first improvement, concerning the initial layout of CB-grid layout algorithm, we describe the new force-directed algorithm, termed Eades initial layout algorithm; (iii) for the second improvement, we describe CCB-grid layout algorithm, which is CB-grid layout algorithm with the combo score function; (iv) for the third improvement, we present SCCB-grid layout algorithm, which enhances CCB-grid layout algorithm by adding the swap operation. In the Results and Discussion section, the performance of these new algorithms is compared and verified by applying them to the signal transduction pathway of an endothelial cell, which is larger than the pathways in [14] and [15].

## Methods

### CB-grid layout algorithm: Introduction of the grid layout algorithm

Given a graph *G* = (*V*, *E*) with nodes *V* and edges *E*, a *layout* *L* = (*V*, *E*, *U*, *P*) of *G* consists of the underlying graph *G*, grid points *U*, and a function *P* : *V* → *U* such that *P*(*v*_{α}) ≠ *P*(*v*_{β}) for any two distinct nodes *v*_{α}, *v*_{β} ∈ *V*. This definition does not allow overlaps between nodes in the layout. For a layout *L*, this paper uses the following notation.

• *W*_{L}: the set of vacant points of *L*.

• *E*_{v}: the set of all edges connected to node *v*.

• |*V*|: the number of nodes in *V*.

• |*W*|: the number of vacant points in *L*, instead of |*W*_{L}| if there is no confusion possible.

We define the following operations.

• *T*_{v → p} *L*: the layout generated by moving a node *v* to a vacant point *p* ∈ *W*_{L}.

• ${S}_{{v}_{\alpha}\leftrightarrow {v}_{\beta}}$*L*: the layout generated by swapping nodes *v*_{α }and *v*_{β}.

• *D*_{v }*L*: the layout generated by removing a node *v *and all edges connected to *v*.

In addition, we define the following functions.

• $Cros{s}_{{e}_{i},{e}_{j}}$ (*L*): a binary function that returns 1 if an edge *e*_{i }crosses with an edge *e*_{j }and 0 otherwise.

• $Cros{s}_{{v}_{i},{e}_{j}}$ (*L*): a binary function that returns 1 if an edge *e*_{j }crosses with a node *v*_{i }and 0 otherwise.

• $Distanc{e}_{{v}_{i},{v}_{j}}$ (*L*): a function that returns ${w}_{{v}_{i},{v}_{j}}\cdot md({v}_{i},{v}_{j})$, where ${w}_{{v}_{i},{v}_{j}}$ is the weight to the couple of nodes *v*_{i }and *v*_{j}, and *md *(*v*_{i}, *v*_{j}) is the Manhattan distance between *v*_{i }and *v*_{j}.

In our previous approach [15] (mainly referred to as CB-grid layout algorithm), the *layout cost* *C*(*L*) of *L* was defined as follows:

$C(L) = W_{ee} \sum_{e_i, e_j \in E} \mathit{Cross}_{e_i, e_j}(L) + W_{ne} \sum_{v_k \in V,\, e_l \in E} \mathit{Cross}_{v_k, e_l}(L) + W_{dc} \sum_{v_m, v_n \in V} \mathit{Distance}_{v_m, v_n}(L), \qquad (1)$

where *W*_{ee}, *W*_{ne}, and *W*_{dc} are called, respectively, the *edge-edge crossing weight*, the *node-edge crossing weight*, and the *distance cost weight*.
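To make Equation (1) concrete, the following is a minimal sketch of a layout cost on integer grid points. Several details are assumptions made only for illustration and are not taken from [15]: crossings are counted over unordered pairs of edges that share no node, a node-edge crossing means that a node lies on an edge it does not belong to, and the pair weight in the distance term is taken to be 1 for adjacent nodes and 0 otherwise. The names `orient`, `edges_cross`, `node_on_edge`, and `layout_cost` are hypothetical.

```python
from itertools import combinations

def orient(a, b, c):
    """Sign (+1, 0, -1) of the cross product (b - a) x (c - a)."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def edges_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 cross at an interior point of both."""
    return (orient(p1, p2, p3) * orient(p1, p2, p4) < 0
            and orient(p3, p4, p1) * orient(p3, p4, p2) < 0)

def node_on_edge(p, a, b):
    """True if grid point p lies on segment a-b, excluding its endpoints."""
    if p in (a, b) or orient(a, b, p) != 0:
        return False
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))

def layout_cost(pos, edges, w_ee=1, w_ne=1, w_dc=1):
    """Cost in the spirit of Equation (1); pos maps node -> (x, y) grid point."""
    # Edge-edge crossings over unordered pairs of edges without a shared node.
    ee = sum(edges_cross(pos[a], pos[b], pos[c], pos[d])
             for (a, b), (c, d) in combinations(edges, 2)
             if len({a, b, c, d}) == 4)
    # Node-edge crossings: a node lying on an edge it is not an endpoint of.
    ne = sum(node_on_edge(pos[v], pos[a], pos[b])
             for v in pos for (a, b) in edges if v not in (a, b))
    # Distance cost with the assumed weight: 1 for adjacent pairs, 0 otherwise.
    dc = sum(abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])
             for a, b in edges)
    return w_ee * ee + w_ne * ne + w_dc * dc

# Two diagonal edges that cross at (1, 1), where a fifth node also sits.
pos = {"a": (0, 0), "b": (2, 2), "c": (0, 2), "d": (2, 0), "e": (1, 1)}
edges = [("a", "b"), ("c", "d")]
print(layout_cost(pos, edges))  # 1 crossing + 2 node-edge hits + distance 8 = 11
```

Because all coordinates are integers, the orientation test is exact and no floating-point tolerance is needed.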

The CB-grid layout algorithm repeatedly moves a single node to a vacant point until it reaches a locally optimal layout. At each step, the algorithm calculates the costs of all layouts that can be generated by moving any one node to any vacant point. The layout with the lowest cost is selected as the starting layout for the next step. After convergence, the algorithm outputs a locally optimal layout. If the cost calculation for all possible adjacent layouts is implemented in a naïve way, the time complexity is high. To overcome this problem, the previous method [15] introduced a Δ matrix that stores each possible cost difference from the previous step and succeeded in reducing the time complexity at each step from *O*(|*W*|(|*V*|^{2} + |*E*|^{2})) to *O*(|*V*|^{2} + |*E*|^{2} + |*W*||${E}_{{v}_{\beta}}$|(|*V*| + |*E*|)), where *v*_{β} is the node moved at the previous step.
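The greedy step described above can be sketched as follows. This is the naïve variant that recomputes the cost of every candidate layout, not the Δ-matrix version of [15]; the function name `greedy_moves`, the callback `cost`, and the toy cost used in the example are hypothetical and serve only to illustrate the control flow.

```python
def greedy_moves(pos, grid, cost):
    """Repeat the best single-node move until no move lowers the cost."""
    pos = dict(pos)
    current = cost(pos)
    while True:
        occupied = set(pos.values())
        vacant = [p for p in grid if p not in occupied]
        best_delta, best_move = 0, None
        for v in list(pos):
            old = pos[v]
            for p in vacant:
                pos[v] = p                      # try moving v to p
                delta = cost(pos) - current
                if delta < best_delta:
                    best_delta, best_move = delta, (v, p)
            pos[v] = old                        # undo the trial move
        if best_move is None:
            return pos, current                 # local optimum reached
        v, p = best_move
        pos[v] = p
        current += best_delta

def toy_cost(pos):
    """Toy cost: Manhattan distance between the two nodes 'a' and 'b'."""
    (ax, ay), (bx, by) = pos["a"], pos["b"]
    return abs(ax - bx) + abs(ay - by)

grid = [(x, 0) for x in range(4)]
layout, final_cost = greedy_moves({"a": (0, 0), "b": (3, 0)}, grid, toy_cost)
print(layout, final_cost)
```

Each pass evaluates |*V*||*W*| candidate layouts with a full cost computation, which is the expense that the Δ matrix is designed to avoid.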

When CB-grid layout algorithm was applied to several biopathways, we encountered three problems. Thus, we propose new grid layout algorithms that solve these problems. Problems and solutions are summarized as follows:

1. Improving the choice of the initial layout: since a locally optimal layout depends noticeably on the initial layout, we first apply Eades initial layout algorithm to a random layout, and use its output as the initial layout. In the previous approach, a random layout was directly used as the initial layout.

2. Improving the cost function: we introduce the concept of a combo score that gives a good score, i.e., a negative cost, when nodes with the same biological attribute are aligned (CCB-grid layout algorithm). In CB-grid layout algorithm, biological attributes other than subcellular localization were ignored.

3. Improving the search strategy: we propose a better search strategy, which allows us to obtain improved results, keeping the time complexity. For obtaining a better layout, the search space is extended by adding the swap operation. At each step, all layouts obtained by swapping two nodes are also considered (SCCB-grid layout algorithm).

In the remainder of this section, we describe these three new algorithms mentioned above.

### Eades initial layout algorithm: generating a new initial layout for grid layout algorithms

In the previous paper [15], a random layout was used as the initial layout for CB-grid layout algorithm. When the initial layout is far from the global optimum, the local optimum obtained tends to be unacceptable. Therefore, we decided to adapt Eades algorithm [18] and use its output as the initial layout. Eades algorithm is a force-directed algorithm consisting of the following two steps.

1. Two types of forces are defined for each pair of nodes. If two nodes are adjacent, there exists an attractive force *a*_{c1 }log(*d*/*a*_{c2}) between them, where *a*_{c1 }and *a*_{c2 }are constants, and *d *is the distance between the two nodes. On the other hand, if two nodes are not adjacent, there exists a repulsive force *r*_{c}/$\sqrt{d}$ between them, where *r*_{c }is a constant. At each step, the positions of all the nodes are updated according to the sum of the repulsive and attractive forces between them.

2. The above step is iterated a predetermined number of times, and the final result is obtained.
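The two steps above can be sketched as a single update function that is called a fixed number of times. The constants `a_c1`, `a_c2`, and `r_c` correspond to the symbols in step 1; their default values, the `step` scaling factor, and the function name `eades_step` are assumptions chosen only for illustration.

```python
import math

def eades_step(pos, edges, a_c1=2.0, a_c2=1.0, r_c=1.0, step=0.1):
    """One update: every node moves along the sum of the forces acting on it."""
    adjacent = {frozenset(e) for e in edges}
    new_pos = {}
    for v, (x, y) in pos.items():
        fx = fy = 0.0
        for u, (ux, uy) in pos.items():
            if u == v:
                continue
            dx, dy = ux - x, uy - y
            d = math.hypot(dx, dy)
            if d == 0:
                continue                      # coincident nodes: skip this pair
            if frozenset((u, v)) in adjacent:
                f = a_c1 * math.log(d / a_c2)  # attractive when d > a_c2
            else:
                f = -r_c / math.sqrt(d)        # repulsive for non-adjacent nodes
            fx += f * dx / d                   # project onto the unit vector v -> u
            fy += f * dy / d
        new_pos[v] = (x + step * fx, y + step * fy)
    return new_pos

# Step 2: iterate a predetermined number of times.
layout = {"a": (0.0, 0.0), "b": (4.0, 0.0), "c": (0.0, 1.0)}
for _ in range(50):
    layout = eades_step(layout, [("a", "b")])
print(layout)
```

All forces are computed from the old positions before any node is moved, so the result does not depend on the order in which nodes are visited.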

We customized Eades algorithm in two respects. First, nodes in Eades algorithm can be placed anywhere. All the nodes in the initial layout for CB-grid layout algorithm, however, should be placed on grid points that satisfy their subcellular localization. Thus, the output of Eades algorithm cannot be used directly as an input for CB-grid layout algorithm.

To handle this problem, after the nodes are moved at each step, we move each node to the closest vacant grid point that satisfies its subcellular localization.
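A minimal sketch of this snapping step is given below. The processing order of the nodes, the use of Euclidean distance, and the representation of a subcellular localization as a per-node list of permitted grid points (`allowed`) are assumptions made for illustration; the function name `snap_to_grid` is hypothetical.

```python
def snap_to_grid(real_pos, allowed):
    """Place each node on the nearest still-vacant grid point of its region.

    real_pos: node -> (x, y) real-valued position after a force update.
    allowed:  node -> list of grid points permitted by its localization.
    """
    occupied = set()
    placed = {}
    for v, (x, y) in real_pos.items():
        candidates = [p for p in allowed[v] if p not in occupied]
        if not candidates:
            raise ValueError(f"no vacant grid point left for node {v!r}")
        # Squared Euclidean distance; ties are broken by the point itself.
        best = min(candidates,
                   key=lambda p: ((p[0] - x) ** 2 + (p[1] - y) ** 2, p))
        placed[v] = best
        occupied.add(best)
    return placed

cytoplasm = [(0, 0), (1, 0), (0, 1)]
nucleus = [(5, 5)]
real_pos = {"a": (0.2, 0.1), "b": (0.3, 0.0), "c": (0.1, 0.1)}
allowed = {"a": cytoplasm, "b": cytoplasm, "c": nucleus}
print(snap_to_grid(real_pos, allowed))
```

Node `c` is sent to (5, 5) even though it is geometrically closer to the cytoplasm points, because only grid points of its own region are candidates.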

The second customization is as follows. Since Eades algorithm does not consider edge-edge crossings and node-edge crossings, the resulting layout could contain many such crossings. For example, consider a biological pathway with a subcellular localization, such as a membrane, that thinly surrounds other subcellular localizations, as shown in Figure 1(a); the graph in (a) could be a layout resulting from Eades algorithm. In this case, the layout might contain a large number of edge-edge crossings and node-edge crossings because edges cross over other subcellular localizations. To avoid this problem, we propose to gather nodes around a particular grid point for each subcellular localization, as shown in Figure 1(b). Eades algorithm with the above two customizations is called *Eades initial layout algorithm*.

### CCB-grid layout algorithm: utilizing various biological attributes

When humans draw biopathway models, nodes with the same attribute are usually arranged according to a rule. In CB-grid layout algorithm, this type of information is completely ignored. To exploit it, we introduce the concept of combo scores, called **combo1** and **combo2** (see Figure 2). Note that a combo score is applied only to nodes having an attribute, since some nodes do not have any attributes. We denote the set of nodes having an attribute by *V'* ⊆ *V*. In this algorithm, (i) upperGrid(*p*, *i*)/lowerGrid(*p*, *i*) returns the *i*th grid point above/below a grid point *p* ∈ *P*, (ii) Attr(*v*) is the attribute of a node *v* ∈ *V'*, and (iii) *CW*_{a} = (1 + *C*/|${{V}^{\prime}}_{a}$|), where *C* is a constant normally set to |*V*|, and ${{V}^{\prime}}_{a}$ is the set of nodes having an attribute *a*.

**Figure 2. Pseudo codes of combo score functions: combo1 and combo2.** (a) **combo1**: a score function that considers nodes with one vertical grid distance from the target node. (b) **combo2**: a score function that considers nodes with up to two vertical grid distances ...

The combo score is designed so that the more nodes with the same attribute are aligned vertically, the higher the score. The combo score is defined between two nodes, and the combo score of a layout *L* is defined as the sum of all the combo scores occurring in *L*. We say that two nodes have a *combo relation* when a combo score occurs between them. Note that a horizontal alignment score is not implemented: if the combo score supported both the vertical and horizontal directions, the numbers of edge-edge crossings and node-edge crossings would increase considerably. Therefore, only one direction should be chosen for combo scores. In this paper, we define combo scores in the vertical direction.

We considered two types of combo scores, **combo1** and **combo2**, illustrated with the layouts in Figure 3(a) and 3(b). Let nodes *v*_{a} to *v*_{f} in Figure 3 have the same attribute. **combo1** considers only the nodes at a vertical grid distance of one from the target node. In contrast, **combo2** considers the nodes at a vertical grid distance of up to two from the target node. For the layout in Figure 3(a), the numbers of combo relations with **combo1** and **combo2** are 8 and 12, respectively. If node *v*_{f} is moved as shown in Figure 3(b), the number of combo relations with **combo1** is the same as before, whereas that with **combo2** is 14. Thus, only **combo2** improves the combo score when node *v*_{f} is moved from its position in Figure 3(a) to that in Figure 3(b). As shown in the dotted rectangle in Figure 3(a), a pair of vertically aligned nodes often occurs while a layout is being updated. In this case, Figure 3(b) should be a better layout than Figure 3(a). For this reason, we decided to employ **combo2**. Henceforth, for a node *v* ∈ *V* in a layout *L*, *Combo*_{v}(*L*) denotes the combo score **combo2**(*v*, *L*). The total score $\sum _{v\in V}Comb{o}_{v}(L)$ for *L* is denoted by *Combo*(*L*).

**Figure 3. An example that compares the features of combo1 and combo2 score functions.** (a) An intermediate layout of CCB-grid layout algorithm. In this layout, all six nodes have the same attribute. (b) The next candidate layout that is generated from (a) by moving ...

If *CW*_{a} returned the same value for every attribute *a*, nodes with a frequently occurring attribute would be vertically aligned more easily, since they have a greater chance of neighboring one another. To reduce this bias among attributes, we define *CW*_{a} to be inversely related to the total number of nodes whose attribute is *a*.
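The weighting and the vertical scoring can be sketched as follows. The pseudo code of Figure 2 is not reproduced in the text, so this is only one reading of it: a node at vertical distance two is assumed to count only when the node in between also has the same attribute, an assumption suggested by the later discussion of *Adj*_{v}. The names `combo_weights`, `combo2`, and `combo_total` are hypothetical.

```python
from collections import Counter

def combo_weights(attr, c=None):
    """CW_a = 1 + C / |V'_a|, with C defaulting to the number of nodes |V|."""
    counts = Counter(a for a in attr.values() if a is not None)
    c = len(attr) if c is None else c
    return {a: 1 + c / n for a, n in counts.items()}

def combo2(v, pos, attr, cw):
    """Score of v from same-attribute nodes up to two grid points above/below."""
    a = attr[v]
    if a is None:
        return 0.0                      # nodes without an attribute score nothing
    at = {p: u for u, p in pos.items()}
    x, y = pos[v]
    score = 0.0
    for direction in (1, -1):           # upper side, then lower side
        for i in (1, 2):
            u = at.get((x, y + direction * i))
            if u is None or attr[u] != a:
                break                   # assumed: the vertical run must be contiguous
            score += cw[a]
    return score

def combo_total(pos, attr, cw):
    """Half the sum over nodes, since every relation is seen from both ends."""
    return sum(combo2(v, pos, attr, cw) for v in pos) / 2

attr = {"a": "protein", "b": "protein", "c": "protein", "m": "mRNA"}
pos = {"a": (0, 0), "b": (0, 1), "c": (0, 2), "m": (1, 0)}
cw = combo_weights(attr)
print(cw, combo_total(pos, attr, cw))
```

In this example the rare attribute (`mRNA`, one node) receives a larger weight than the common one (`protein`, three nodes), which is the bias correction described above.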

By modifying the layout score of CB-grid layout algorithm, we can define the layout cost *C *(*L*) of a layout *L *with the new concept of the combo score as follows:

$\begin{array}{lll} C(L) & = & W_{ee} \sum_{e_i, e_j \in E} \mathit{Cross}_{e_i, e_j}(L) + W_{ne} \sum_{v_k \in V,\, e_l \in E} \mathit{Cross}_{v_k, e_l}(L) + W_{dc} \sum_{v_m, v_n \in V} \mathit{Distance}_{v_m, v_n}(L) \\ & - & W_{cs} \left( \frac{1}{2} \sum_{v_o \in V'} \mathit{Combo}_{v_o}(L) \right), \end{array} \qquad (2)$

where *W*_{cs }is called *combo score weight*. CB-grid layout algorithm improved by the above modification is named *Combo score, Cross cost and Biological information grid layout algorithm *(CCB-grid layout algorithm). The reason for multiplying the sum of the combo scores by 1/2 is that combo scores are counted twice since a combo score between nodes *v*_{α }and *v*_{β }is included in both $Comb{o}_{{v}_{\alpha}}$ (*L*) and $Comb{o}_{{v}_{\beta}}$ (*L*). The algorithm is the same as *C-optimization *(*L*) step in [15] except for the use of the above layout cost *C *(*L*), i.e., the algorithm for calculating Δ matrix is also the same.

For calculating the combo score for each node, only four nodes need to be checked at most, i.e., its time complexity is constant, while for calculating the edge-edge crossing cost, the node-edge crossing cost, and the distance cost for each node, these time complexities depend on |*E*|, |*V*|, and |*W*|, respectively. Thus, without using Δ matrix, the time complexity related to combo scores is *O *(|*V*||*W*|) at each step.

At each step, we need to calculate the difference between the combo score of the previous layout *L *and that of the current layout that is generated by moving a node *v *to a vacant point *p*, i.e., *Combo*(*T*_{v→p }*L*) – *Combo*(*L*). We can efficiently calculate the difference of the combo score ${\Delta}_{vp}^{cs}$ (*L*) as follows:

$\Delta^{cs}_{vp}(L) = \left\{ \begin{array}{ll} W_{cs} \left( \mathit{Combo}_v(T_{v \to p} L) - \mathit{Combo}_v(L) + \mathit{Adj}_v(T_{v \to p} L) - \mathit{Adj}_v(L) \right) & \text{if } v \in V' \\ 0 & \text{if } v \notin V' \end{array} \right. , \qquad (3)$

where

$\mathit{Adj}_v(L) = \left\{ \begin{array}{ll} CW_{\text{Attr}(v)} & \text{if } \text{isCombo}(v, \text{upperGrid}(P(v), 1)) = \text{true and } \text{isCombo}(v, \text{lowerGrid}(P(v), 1)) = \text{true} \\ 0 & \text{otherwise}. \end{array} \right. \qquad (4)$

We introduced *Adj*_{v}(*L*) for the following reason. Suppose that three nodes with the same attribute are aligned vertically. We call them *v*_{α}, *v*_{β}, and *v*_{γ}, beginning from the bottom. There are three combo relations among the three nodes: one between *v*_{α} and *v*_{β}, another between *v*_{β} and *v*_{γ}, and the third between *v*_{α} and *v*_{γ}. Although *v*_{β} is involved in all three combo relations, the combo relation between *v*_{α} and *v*_{γ} is not counted in $Comb{o}_{{v}_{\beta}}$ (*L*). Therefore, *Adj*_{v}(*L*) is needed to correct this type of undercount.
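The role of the correction term can be checked numerically with a small sketch. It assumes unit weights, a single attribute, and a combo2-style score in which a node at vertical distance two counts only if the node in between has the same attribute; these are assumptions for illustration, and the names `combo`, `adj`, `delta_local`, and `delta_brute` are hypothetical.

```python
def combo(v, pos, attr):
    """Unit-weight score of v: same-attribute run of length up to 2, both sides."""
    if attr[v] is None:
        return 0
    at = {p: u for u, p in pos.items()}
    x, y = pos[v]
    score = 0
    for step in (1, -1):
        for i in (1, 2):
            u = at.get((x, y + step * i))
            if u is None or attr[u] != attr[v]:
                break
            score += 1
    return score

def adj(v, pos, attr):
    """1 if v sits directly between two same-attribute nodes, else 0."""
    if attr[v] is None:
        return 0
    at = {p: u for u, p in pos.items()}
    x, y = pos[v]
    up, low = at.get((x, y + 1)), at.get((x, y - 1))
    return int(up is not None and low is not None
               and attr[up] == attr[v] == attr[low])

def delta_local(v, p, pos, attr):
    """Change in the total score using only quantities local to v."""
    moved = {**pos, v: p}
    return (combo(v, moved, attr) - combo(v, pos, attr)
            + adj(v, moved, attr) - adj(v, pos, attr))

def delta_brute(v, p, pos, attr):
    """Change in half the summed score, recomputed over all nodes."""
    moved = {**pos, v: p}
    total = lambda layout: sum(combo(u, layout, attr) for u in layout) / 2
    return total(moved) - total(pos)

# Five same-attribute nodes stacked in one column; move the middle one away.
attr = {f"n{i}": "protein" for i in range(5)}
pos = {f"n{i}": (0, i) for i in range(5)}
print(delta_local("n2", (3, 3), pos, attr), delta_brute("n2", (3, 3), pos, attr))
```

Without the `adj` term, the local difference would miss the relation between the two neighbours of the moved node, which exists only while the moved node sits between them.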

### SCCB-grid layout algorithm: extension of the search space due to the swap operation

Another drawback of CB-grid layout algorithm is that only one node can be moved to a vacant point at each step. For example, the layout shown in Figure 4(a) is optimal for CB-grid layout algorithm despite the fact that the layout in Figure 4(b) is better. This limitation is due to the search strategy of CB-grid layout algorithm. Thus, we devised a new algorithm that allows swap operations between two nodes while preserving the time complexity. With this improvement, the layout in Figure 4(a) can be rearranged as shown in Figure 4(b). The new algorithm is named CCB-grid layout with the swap operation (SCCB-grid layout algorithm). The layout cost function is the same as in CCB-grid layout algorithm. However, a naïve implementation would increase the time complexity of calculating the layout cost for swapped layouts.

**Figure 4. An optimal layout of CB-grid and improved layout with the swap operation.** (a) An optimal layout for CB-grid layout algorithm. (b) From (a) a better layout will be generated with the swap operation.
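The effect of enlarging the neighborhood with swaps can be illustrated with a small naïve sketch, which recomputes the cost for every candidate rather than using a Δ matrix. The example is not the layout of Figure 4; it is a fully occupied 1 × 3 grid, chosen because no single-node move exists there at all. The names `best_neighbor`, `md`, and `path_cost` are hypothetical.

```python
from itertools import combinations

def best_neighbor(pos, grid, cost, allow_swaps=True):
    """Return (delta, layout) of the best move or swap, or (0, None) if none helps."""
    base = cost(pos)
    occupied = set(pos.values())
    vacant = [p for p in grid if p not in occupied]
    best = (0, None)
    for v in pos:                                # single-node moves
        for p in vacant:
            cand = {**pos, v: p}
            d = cost(cand) - base
            if d < best[0]:
                best = (d, cand)
    if allow_swaps:
        for u, v in combinations(pos, 2):        # exchange two node positions
            cand = {**pos, u: pos[v], v: pos[u]}
            d = cost(cand) - base
            if d < best[0]:
                best = (d, cand)
    return best

def md(p, q):
    """Manhattan distance between two grid points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def path_cost(layout):
    """Toy distance cost for the path a - c - b."""
    return md(layout["a"], layout["c"]) + md(layout["c"], layout["b"])

grid = [(0, 0), (1, 0), (2, 0)]
pos = {"a": (0, 0), "b": (1, 0), "c": (2, 0)}
print(best_neighbor(pos, grid, path_cost, allow_swaps=False))
print(best_neighbor(pos, grid, path_cost))
```

With swaps disabled the search is stuck at cost 3; with swaps enabled, exchanging `b` and `c` lowers the cost to 2.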

In the previous approach [15], the Δ matrix stores cost differences that are induced only by moving nodes to vacant points. As a result, if a grid point of interest was occupied at the previous step, the Δ matrix cannot be used to calculate cost differences for that grid point. Since the grid points involved in a swap operation are necessarily occupied at the previous step, the Δ matrix cannot be used for swaps. However, if the Δ matrix also stores cost differences related to occupied points, it can be exploited in this case, too. We therefore propose an extended Δ matrix, which considers occupied points as well as vacant points. Since the definition of the cost differences for vacant points cannot be applied directly to occupied points, we calculate the cost difference for an occupied point without taking into account the node occupying that grid point and all edges connected to it. In the remainder of this section, we show how to calculate the extended Δ matrix and then compare the time complexity of the extended Δ matrix with that of the original Δ matrix.

Henceforth, let us refer to the extended Δ matrix as Δ matrix. Given a layout *L*, at the first step, we update Δ (*L*) matrix as follows:

$\Delta_{v_\alpha p}(L) = \left\{ \begin{array}{ll} F_{v_\alpha}(T_{v_\alpha \to p} L) - F_{v_\alpha}(L) & \text{if } p \in W_L \\ F_{v_\alpha}(T_{v_\alpha \to p} D_{v_\gamma} L) - F_{v_\alpha}(D_{v_\gamma} L) & \text{if } p = P(v_\gamma). \end{array} \right. \qquad (5)$

${F}_{{v}_{\alpha}}$ is the following function:

$F_{v_\alpha}(L) = W_{ee} \sum_{e_i \in E_{v_\alpha},\, e_j \in E} \mathit{Cross}_{e_i, e_j}(L) + W_{ne} \sum_{e_k \in E} \mathit{Cross}_{v_\alpha, e_k}(L) + W_{dc} \sum_{v_l \in V} \mathit{Distance}_{v_\alpha, v_l}(L). \qquad (6)$

If the previous layout is updated by moving node *v*_{β} to vacant point *q*, Δ(${T}_{{v}_{\beta}\to q}$*L*) can be computed efficiently from Δ(*L*) as follows:

$\Delta_{v_\alpha p}(T_{v_\beta \to q} L) = \left\{ \begin{array}{lll} \Delta_{v_\beta p}(L) - \Delta_{v_\beta q}(L), & \text{if } v_\alpha = v_\beta,\ p \in W_{T_{v_\beta \to q} L} & (\text{case 1}) \\ \Delta_{v_\alpha p}(L) + DIFF_0, & \text{if } v_\alpha \ne v_\beta,\ p \in W_{T_{v_\beta \to q} L} \backslash \{P(v_\beta)\} & (\text{case 2}) \\ \Delta_{v_\alpha p}(L) + DIFF_1, & \text{if } v_\alpha \ne v_\beta,\ p = P(v_\beta) & (\text{case 3}) \\ \Delta_{v_\beta p}(L) - \Delta_{v_\beta q}(L) + DIFF_2, & \text{if } v_\alpha = v_\beta,\ p = P(v_\gamma) & (\text{case 4}) \\ \Delta_{v_\alpha p}(L) + DIFF_3, & \text{if } v_\alpha \ne v_\beta,\ p = P(v_\gamma) & (\text{case 5}) \\ \Delta_{v_\alpha p}(L) + DIFF_4, & \text{if } v_\alpha \ne v_\beta,\ p = q & (\text{case 6}), \end{array} \right. \qquad (7)$

where *DIFF*_{0 }to *DIFF*_{4 }are defined in the following way:

$DIFF_0 = Q_{v_\alpha, v_\beta}(T_{v_\alpha \to p} T_{v_\beta \to q} L) - Q_{v_\alpha, v_\beta}(T_{v_\beta \to q} L) - Q_{v_\alpha, v_\beta}(T_{v_\alpha \to p} L) + Q_{v_\alpha, v_\beta}(L) \qquad (8)$

$DIFF_1 = Q_{v_\alpha, v_\beta}(T_{v_\alpha \to p} T_{v_\beta \to q} L) - Q_{v_\alpha, v_\beta}(T_{v_\beta \to q} L) \qquad (9)$

$DIFF_2 = Q_{v_\beta, v_\gamma}(T_{v_\beta \to q} L) - Q_{v_\beta, v_\gamma}(L) \qquad (10)$

$DIFF_3 = Q_{v_\alpha, v_\beta}(T_{v_\alpha \to p} T_{v_\beta \to q} D_{v_\gamma} L) - Q_{v_\alpha, v_\beta}(T_{v_\beta \to q} D_{v_\gamma} L) - Q_{v_\alpha, v_\beta}(T_{v_\alpha \to p} D_{v_\gamma} L) + Q_{v_\alpha, v_\beta}(D_{v_\gamma} L) \qquad (11)$

$DIFF_4 = -Q_{v_\alpha, v_\beta}(T_{v_\alpha \to p} L) + Q_{v_\alpha, v_\beta}(L), \qquad (12)$

where *Q *shall be defined below.
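As a brief check, case 1 of Equation (7) can be derived directly from the definition in Equation (5). The derivation below is a sketch that assumes *p* is vacant both in *L* and in the updated layout, and that ${F}_{{v}_{\beta}}$ depends only on the resulting positions, so that moving *v*_{β} first to *q* and then to *p* yields the same layout as moving it directly to *p*:

$\begin{array}{lll} \Delta_{v_\beta p}(T_{v_\beta \to q} L) & = & F_{v_\beta}(T_{v_\beta \to p} T_{v_\beta \to q} L) - F_{v_\beta}(T_{v_\beta \to q} L) \\ & = & \left[ F_{v_\beta}(T_{v_\beta \to p} L) - F_{v_\beta}(L) \right] - \left[ F_{v_\beta}(T_{v_\beta \to q} L) - F_{v_\beta}(L) \right] \\ & = & \Delta_{v_\beta p}(L) - \Delta_{v_\beta q}(L). \end{array}$

The row of the moved node can therefore be refreshed from stored entries alone, whereas the other cases need the correction terms *DIFF*_{0} to *DIFF*_{4} because the position of *v*_{β} enters the partial cost of other nodes.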

If the previous layout is updated by swapping two nodes ${v}_{{\beta}_{1}}$ and ${v}_{{\beta}_{2}}$, Δ (${S}_{{v}_{{\beta}_{1}}\leftrightarrow {v}_{{\beta}_{2}}}$*L*) is then updated efficiently by using Δ (*L*) as follows:

$\Delta_{v_\alpha p}(S_{v_{\beta_1} \leftrightarrow v_{\beta_2}} L) = \left\{ \begin{array}{lll} \Delta_{v_{\beta_1} p}(L) - \Delta_{v_{\beta_1} P(v_{\beta_2})}(L) + DIFF_5, & \text{if } v_\alpha = v_{\beta_1},\ p \in W_L & (\text{case 1}) \\ \Delta_{v_\alpha p}(L) + DIFF_6, & \text{if } v_\alpha \ne v_{\beta_1},\ p \in W_L \backslash \{P(v_{\beta_1}), P(v_{\beta_2})\} & (\text{case 2}) \\ \Delta_{v_{\beta_1} p}(L) - \Delta_{v_{\beta_1} P(v_{\beta_2})}(L) + DIFF_7, & \text{if } v_\alpha = v_{\beta_1},\ p = P(v_\gamma) & (\text{case 3}) \\ \Delta_{v_\alpha p}(L) + DIFF_8, & \text{if } v_\alpha \ne v_{\beta_1},\ v_\alpha \ne v_{\beta_2},\ p = P(v_\gamma) & (\text{case 4}) \\ \Delta_{v_\alpha p}(L) + DIFF_9, & \text{if } v_\alpha \ne v_{\beta_1},\ v_\alpha \ne v_{\beta_2},\ p = P(v_{\beta_2}) & (\text{case 5}), \end{array} \right. \qquad (13)$

where *DIFF*_{5 }to *DIFF*_{9 }are defined in the following way:

$DIFF_5 = Q_{v_{\beta_1}, v_{\beta_2}}(T_{v_{\beta_1} \to p} S_{v_{\beta_1} \leftrightarrow v_{\beta_2}} L) - Q_{v_{\beta_1}, v_{\beta_2}}(S_{v_{\beta_1} \leftrightarrow v_{\beta_2}} L) - Q_{v_{\beta_1}, v_{\beta_2}}(T_{v_{\beta_1} \to p} L) + Q_{v_{\beta_1}, v_{\beta_2}}(L) \qquad (14)$

$DIFF_6 = \widehat{Q}_{v_\alpha, v_{\beta_1}, v_{\beta_2}}(T_{v_\alpha \to p} S_{v_{\beta_1} \leftrightarrow v_{\beta_2}} L) - \widehat{Q}_{v_\alpha, v_{\beta_1}, v_{\beta_2}}(S_{v_{\beta_1} \leftrightarrow v_{\beta_2}} L) - \widehat{Q}_{v_\alpha, v_{\beta_1}, v_{\beta_2}}(T_{v_\alpha \to p} L) + \widehat{Q}_{v_\alpha, v_{\beta_1}, v_{\beta_2}}(L) \qquad (15)$

$\begin{array}{lll} DIFF_7 & = & Q_{v_{\beta_1}, v_{\beta_2}}(T_{v_{\beta_1} \to p} S_{v_{\beta_1} \leftrightarrow v_{\beta_2}} D_{v_\gamma} L) - Q_{v_{\beta_1}, v_{\beta_2}}(T_{v_{\beta_1} \to p} D_{v_\gamma} L) - Q_{v_{\beta_1}, v_{\beta_2}}(S_{v_{\beta_1} \leftrightarrow v_{\beta_2}} D_{v_\gamma} L) \\ & + & Q_{v_{\beta_1}, v_{\beta_2}}(D_{v_\gamma} L) + Q_{v_{\beta_1}, v_\gamma}(T_{v_{\beta_1} \to P(v_{\beta_2})} D_{v_{\beta_2}} L) - Q_{v_{\beta_1}, v_\gamma}(D_{v_{\beta_2}} L) \end{array} \qquad (16)$

$\begin{array}{lll} DIFF_8 & = & \widehat{Q}_{v_\alpha, v_{\beta_1}, v_{\beta_2}}(T_{v_\alpha \to p} S_{v_{\beta_1} \leftrightarrow v_{\beta_2}} D_{v_\gamma} L) - \widehat{Q}_{v_\alpha, v_{\beta_1}, v_{\beta_2}}(S_{v_{\beta_1} \leftrightarrow v_{\beta_2}} D_{v_\gamma} L) - \widehat{Q}_{v_\alpha, v_{\beta_1}, v_{\beta_2}}(T_{v_\alpha \to p} D_{v_\gamma} L) \\ & + & \widehat{Q}_{v_\alpha, v_{\beta_1}, v_{\beta_2}}(D_{v_\gamma} L) \end{array} \qquad (17)$

$\begin{array}{lll} DIFF_9 & = & Q_{v_\alpha, v_{\beta_2}}(T_{v_\alpha \to P(v_{\beta_2})} T_{v_{\beta_2} \to P(v_{\beta_1})} D_{v_{\beta_1}} L) - Q_{v_\alpha, v_{\beta_2}}(T_{v_{\beta_2} \to P(v_{\beta_1})} D_{v_{\beta_1}} L) \\ & - & Q_{v_\alpha, v_{\beta_1}}(T_{v_\alpha \to P(v_{\beta_2})} D_{v_{\beta_2}} L) + Q_{v_\alpha, v_{\beta_1}}(D_{v_{\beta_2}} L). \end{array} \qquad (18)$

The case of *v*_{α }= ${v}_{{\beta}_{2}}$ is not considered in Equation (13) because the equations for this case can be obtained by simply replacing ${v}_{{\beta}_{1}}$ with ${v}_{{\beta}_{2}}$ in cases 1 and 3.

${Q}_{{v}_{a},{v}_{b}}$ (·) and ${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}$ (·) in *DIFF*_{0 }to *DIFF*_{9 }are partial cost functions depending on the two nodes *v*_{a }and *v*_{b }and on the three nodes *v*_{a}, *v*_{b}, and *v*_{c}, respectively. They are the weighted sums of the corresponding partial edge-edge crossing costs, node-edge crossing costs, and distance costs, as follows:

${Q}_{{v}_{a},{v}_{b}}(L)={W}_{dc}{Q}_{{v}_{a},{v}_{b}}^{dc}(L)+{W}_{ee}{Q}_{{v}_{a},{v}_{b}}^{ee}(L)+{W}_{ne}{Q}_{{v}_{a},{v}_{b}}^{ne}(L)\text{}\left(19\right)$

${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}(L)={W}_{dc}{\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{dc}(L)+{W}_{ee}{\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{ee}(L)+{W}_{ne}{\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{ne}(L),\text{}\left(20\right)$

where ${Q}_{{v}_{a},{v}_{b}}^{ee}$ (·) and ${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{ee}$ (·) are related to edge-edge crossings, ${Q}_{{v}_{a},{v}_{b}}^{ne}$ (·) and ${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{ne}$ (·) are related to node-edge crossings, and ${Q}_{{v}_{a},{v}_{b}}^{dc}$ (·) and ${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{dc}$ (·) are related to the distance cost. The details are described below.

(a) ${Q}_{{v}_{a},{v}_{b}}^{ee}$ (·) is a partial edge-edge crossing cost function of ${E}_{{v}_{a}}$ and ${E}_{{v}_{b}}$, and is defined as follows:

${Q}_{{v}_{a},{v}_{b}}^{ee}(L)=\{\begin{array}{ll}{\displaystyle \sum _{{e}_{{v}_{a}}\in {E}_{{v}_{a}},{e}_{{v}_{b}}\in {E}_{{v}_{b}}}Cros{s}_{{e}_{{v}_{a}},{e}_{{v}_{b}}}(L)}\hfill & \text{if}({v}_{a},{v}_{b})\notin E\hfill \\ {\displaystyle \sum _{{e}_{{v}_{a}}\in {E}_{{v}_{a}},{e}_{{v}_{b}}\in {E}_{{v}_{b}}}Cros{s}_{{e}_{{v}_{a}},{e}_{{v}_{b}}}(L)}+{\displaystyle \sum _{e\in E}Cros{s}_{e,({v}_{a},{v}_{b})}(L)}\hfill & \text{if}({v}_{a},{v}_{b})\in E\hfill \end{array}.\text{}\left(21\right)$

Similarly, ${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{ee}$ (·) is a partial edge-edge crossing cost function of ${E}_{{v}_{a}}$, ${E}_{{v}_{b}}$, and ${E}_{{v}_{c}}$, and is defined as follows:

${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{ee}(L)=\{\begin{array}{ll}\begin{array}{c}{\displaystyle \sum _{{e}_{{v}_{a}}\in {E}_{{v}_{a}},{e}_{{v}_{b}}\in {E}_{{v}_{b}}}Cros{s}_{{e}_{{v}_{a}},{e}_{{v}_{b}}}(L)}\\ +{\displaystyle \sum _{{e}_{{v}_{a}}\in {E}_{{v}_{a}},{e}_{{v}_{c}}\in {E}_{{v}_{c}}}Cros{s}_{{e}_{{v}_{a}},{e}_{{v}_{c}}}(L)}\end{array}\text{}(={\widehat{Q}}^{ee})\hfill & \text{if}\begin{array}{c}({v}_{a},{v}_{b})\notin E,\\ ({v}_{a},{v}_{c})\notin E\end{array}\hfill \\ {\widehat{Q}}^{ee}+{\displaystyle \sum _{e\in E\backslash {E}_{{v}_{c}}}Cros{s}_{e,({v}_{a},{v}_{b})}(L)}\hfill & \text{if}\begin{array}{c}({v}_{a},{v}_{b})\in E,\\ ({v}_{a},{v}_{c})\notin E\end{array}\hfill \\ {\widehat{Q}}^{ee}+{\displaystyle \sum _{e\in E\backslash {E}_{{v}_{b}}}Cros{s}_{e,({v}_{a},{v}_{c})}(L)}\hfill & \text{if}\begin{array}{c}({v}_{a},{v}_{b})\notin E,\\ ({v}_{a},{v}_{c})\in E\end{array}\hfill \\ {\widehat{Q}}^{ee}+{\displaystyle \sum _{e\in E\backslash {E}_{{v}_{c}}}Cros{s}_{e,({v}_{a},{v}_{b})}(L)}+{\displaystyle \sum _{e\in E\backslash {E}_{{v}_{b}}}Cros{s}_{e,({v}_{a},{v}_{c})}(L)}\hfill & \text{if}\begin{array}{c}({v}_{a},{v}_{b})\in E,\\ ({v}_{a},{v}_{c})\in E\end{array}\hfill \end{array}.\text{}\left(22\right)$

(b) ${Q}_{{v}_{a},{v}_{b}}^{ne}$ is a partial node-edge crossing cost function of *v*_{a}, *v*_{b}, ${E}_{{v}_{a}}$, and ${E}_{{v}_{b}}$, and is defined as follows:

${Q}_{{v}_{a},{v}_{b}}^{ne}(L)=\{\begin{array}{lll}{\displaystyle \sum _{{e}_{{v}_{a}}\in {E}_{{v}_{a}}}Cros{s}_{{v}_{b},{e}_{{v}_{a}}}(L)}+{\displaystyle \sum _{{e}_{{v}_{b}}\in {E}_{{v}_{b}}}Cros{s}_{{v}_{a},{e}_{{v}_{b}}}(L)}\hfill & (={Q}^{ne})\hfill & \text{if}({v}_{a},{v}_{b})\notin E\hfill \\ {Q}^{ne}+{\displaystyle \sum _{v\in V}Cros{s}_{v,({v}_{a},{v}_{b})}(L)}\hfill & \hfill & \text{if}({v}_{a},{v}_{b})\in E\hfill \end{array}\text{}\left(23\right)$

Similarly, ${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{ne}$ (·) is a partial node-edge crossing cost function of *v*_{a}, *v*_{b}, *v*_{c}, ${E}_{{v}_{a}}$, ${E}_{{v}_{b}}$, and ${E}_{{v}_{c}}$, and is defined as follows:

${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{ne}(L)=\{\begin{array}{ll}\begin{array}{c}{\displaystyle \sum _{{e}_{{v}_{a}}\in {E}_{{v}_{a}}}Cros{s}_{{v}_{b},{e}_{{v}_{a}}}(L)}+{\displaystyle \sum _{{e}_{{v}_{a}}\in {E}_{{v}_{a}}}Cros{s}_{{v}_{c},{e}_{{v}_{a}}}(L)}\\ +{\displaystyle \sum _{{e}_{{v}_{b}}\in {E}_{{v}_{b}}}Cros{s}_{{v}_{a},{e}_{{v}_{b}}}(L)}+{\displaystyle \sum _{{e}_{{v}_{c}}\in {E}_{{v}_{c}}}Cros{s}_{{v}_{a},{e}_{{v}_{c}}}(L)}\end{array}\text{}(={\widehat{Q}}^{ne})\hfill & \text{if}\begin{array}{c}({v}_{a},{v}_{b})\notin E,\\ ({v}_{a},{v}_{c})\notin E\end{array}\hfill \\ {\widehat{Q}}^{ne}+{\displaystyle \sum _{v\in V\backslash {v}_{c}}Cros{s}_{v,({v}_{a},{v}_{b})}(L)}\hfill & \text{if}\begin{array}{c}({v}_{a},{v}_{b})\in E,\\ ({v}_{a},{v}_{c})\notin E\end{array}\hfill \\ {\widehat{Q}}^{ne}+{\displaystyle \sum _{v\in V\backslash {v}_{b}}Cros{s}_{v,({v}_{a},{v}_{c})}(L)}\hfill & \text{if}\begin{array}{c}({v}_{a},{v}_{b})\notin E,\\ ({v}_{a},{v}_{c})\in E\end{array}\hfill \\ {\widehat{Q}}^{ne}+{\displaystyle \sum _{v\in V\backslash {v}_{c}}Cros{s}_{v,({v}_{a},{v}_{b})}(L)}+{\displaystyle \sum _{v\in V\backslash {v}_{b}}Cros{s}_{v,({v}_{a},{v}_{c})}(L)}\hfill & \text{if}\begin{array}{c}({v}_{a},{v}_{b})\in E,\\ ({v}_{a},{v}_{c})\in E\end{array}\hfill \end{array}.\text{}\left(24\right)$

(c) ${Q}_{{v}_{a},{v}_{b}}^{dc}$ is a partial distance cost function of *v*_{a }and *v*_{b}, and is defined as follows:

${Q}_{{v}_{a},{v}_{b}}^{dc}(L)=Distanc{e}_{{v}_{a},{v}_{b}}(L).\text{}\left(25\right)$

Similarly, ${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{dc}$ (·) is a partial distance cost function of *v*_{a}, *v*_{b}, and *v*_{c}, and is defined as follows:

${\widehat{Q}}_{{v}_{a},{v}_{b},{v}_{c}}^{dc}(L)=Distanc{e}_{{v}_{a},{v}_{b}}(L)+Distanc{e}_{{v}_{a},{v}_{c}}(L).\text{}\left(26\right)$
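As an illustration of Equation (21), the following Python sketch computes a partial edge-edge crossing cost for two nodes. It is not the authors' implementation. The function names (`orient`, `cross`, `incident`, `q_ee`) and the data layout (edges as node pairs, positions as integer grid coordinates) are hypothetical, and `cross` assumes that *Cross* is 1 when two straight-line edges properly intersect and 0 otherwise, including when they share an endpoint; the actual definition of *Cross* is given elsewhere in the paper.

```python
from itertools import product

def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, 0, or -1."""
    val = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (val > 0) - (val < 0)

def cross(e, f, pos):
    """Assumed Cross: 1 if edges e and f properly intersect, else 0."""
    if set(e) & set(f):
        return 0  # same edge or shared endpoint: not counted (assumption)
    a, b = pos[e[0]], pos[e[1]]
    c, d = pos[f[0]], pos[f[1]]
    return int(orient(a, b, c) * orient(a, b, d) < 0
               and orient(c, d, a) * orient(c, d, b) < 0)

def incident(v, edges):
    """Edges that have v as an endpoint (the set E_v)."""
    return [e for e in edges if v in e]

def q_ee(va, vb, edges, pos):
    """Partial edge-edge crossing cost in the spirit of Equation (21)."""
    total = sum(cross(e, f, pos)
                for e, f in product(incident(va, edges), incident(vb, edges)))
    if (va, vb) in edges or (vb, va) in edges:
        # Extra term when va and vb are adjacent: crossings with edge (va, vb).
        total += sum(cross(e, (va, vb), pos) for e in edges)
    return total

# Toy layout: edges a-c and b-d cross; edge e-f crosses the segment a-b.
pos = {"a": (0, 0), "b": (0, 4), "c": (4, 4), "d": (4, 0),
       "e": (-1, 2), "f": (1, 2)}
edges_no_ab = [("a", "c"), ("b", "d"), ("e", "f")]
edges_with_ab = edges_no_ab + [("a", "b")]
print(q_ee("a", "b", edges_no_ab, pos), q_ee("a", "b", edges_with_ab, pos))
```

In the toy layout, the second call differs from the first only through the extra term of the adjacent case, which picks up the crossing between edge (e, f) and edge (a, b).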

Thus far, we have described a method to efficiently calculate the Δ matrix. The purpose of extending the Δ matrix is to calculate the cost difference of the swap operation. When nodes ${v}_{{\alpha}_{1}}$ and ${v}_{{\alpha}_{2}}$ are swapped, we can calculate $Swa{p}_{{v}_{{\alpha}_{1}},{v}_{{\alpha}_{2}}}$ using these Δ costs as follows:

$Swa{p}_{{v}_{{\alpha}_{1}}{v}_{{\alpha}_{2}}}({S}_{{v}_{{\alpha}_{1}}\leftrightarrow {v}_{{\alpha}_{2}}}L)={\Delta}_{{v}_{{\alpha}_{1}}P({v}_{{\alpha}_{2}})}(L)+{\Delta}_{{v}_{{\alpha}_{2}}P({v}_{{\alpha}_{1}})}(L)+{R}_{{v}_{{\alpha}_{1}},{v}_{{\alpha}_{2}}}({S}_{{v}_{{\alpha}_{1}}\leftrightarrow {v}_{{\alpha}_{2}}}L)-{R}_{{v}_{{\alpha}_{1}},{v}_{{\alpha}_{2}}}(L),\text{}\left(27\right)$

where

${R}_{{v}_{a},{v}_{b}}(L)={W}_{ee}{\displaystyle \sum _{{e}_{{v}_{a}}\in {E}_{{v}_{a}},{e}_{{v}_{b}}\in {E}_{{v}_{b}}}Cros{s}_{{e}_{{v}_{a}},{e}_{{v}_{b}}}(L)}+{W}_{ne}\left({\displaystyle \sum _{{e}_{{v}_{a}}\in {E}_{{v}_{a}}}Cros{s}_{{v}_{b},{e}_{{v}_{a}}}(L)}+{\displaystyle \sum _{{e}_{{v}_{b}}\in {E}_{{v}_{b}}}Cros{s}_{{v}_{a},{e}_{{v}_{b}}}(L)}\right).\text{}\left(28\right)$
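When implementing a cached swap cost such as Equation (27), it can help to have a direct, uncached reference value to compare against. The sketch below is such a brute-force check and not the cached method itself: it recomputes a layout cost before and after the swap and returns the difference. The names `swap_layout`, `toy_cost`, and `swap_difference` are hypothetical, and `toy_cost` (Manhattan edge length) is only a stand-in for the full layout cost, which is defined elsewhere in the paper.

```python
def swap_layout(pos, a, b):
    """Return a copy of the layout with the positions of a and b exchanged."""
    new_pos = dict(pos)
    new_pos[a], new_pos[b] = pos[b], pos[a]
    return new_pos

def toy_cost(pos, edges):
    """Stand-in cost (assumption): total Manhattan length of all edges."""
    return sum(abs(pos[u][0] - pos[v][0]) + abs(pos[u][1] - pos[v][1])
               for u, v in edges)

def swap_difference(pos, edges, a, b, cost=toy_cost):
    """Brute-force value of cost(S_{a<->b} L) - cost(L)."""
    return cost(swap_layout(pos, a, b), edges) - cost(pos, edges)

layout = {"a": (0, 0), "b": (3, 0), "c": (4, 0)}
links = [("a", "c")]
print(swap_difference(layout, links, "a", "b"))
```

Any cost function with the same `(pos, edges)` signature can be passed as `cost`, so the same check applies once the real layout cost is available.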

In the SCCB-grid layout algorithm, the combo score also needs to be considered. When a node *v*_{α }is moved to a vacant point *p*, ${\Delta}_{{v}_{\alpha}p}^{cs}$ can be calculated as shown in Equation (3). In contrast, if two nodes ${v}_{{\alpha}_{1}}$ and ${v}_{{\alpha}_{2}}$ are swapped, the difference of combo scores, *Combo *(${S}_{{v}_{{\alpha}_{1}}\leftrightarrow {v}_{{\alpha}_{2}}}$*L*) – *Combo *(*L*), is efficiently calculated as follows:

$Swa{p}_{{v}_{{\alpha}_{1}},{v}_{{\alpha}_{2}}}^{cs}(L)=pSwa{p}_{{v}_{{\alpha}_{1}}{v}_{{\alpha}_{2}}}(L)+pSwa{p}_{{v}_{{\alpha}_{2}}{v}_{{\alpha}_{1}}}(L),\text{}\left(29\right)$

where

$pSwa{p}_{vu}(L)=\{\begin{array}{cc}{W}_{cs}(Comb{o}_{v}({S}_{v\leftrightarrow u}L)-Comb{o}_{v}(L)+Ad{j}_{v}({S}_{v\leftrightarrow u}L)-Ad{j}_{v}(L))& \text{if}v\in {V}^{\prime}\\ 0& \text{if}v\notin {V}^{\prime}\end{array}.\text{}\left(30\right)$

Pseudocode for the SCCB-grid layout algorithm is given in Figure 5.

If node *v*_{β }is moved at the previous step, the time complexity of calculating the Δ matrix is *O *((|*V*| + |*E*|)|${E}_{{v}_{\beta}}$||*U*|). If two nodes ${v}_{{\beta}_{1}}$ and ${v}_{{\beta}_{2}}$ are swapped at the previous step, the time complexity of calculating the Δ matrix is *O *((|*V*| + |*E*|) (|${E}_{{v}_{{\beta}_{1}}}$| + |${E}_{{v}_{{\beta}_{2}}}$|) |*U*|) = *O *((|*V*| + |*E*|) |${E}_{{v}_{{\beta}^{\prime}}}$||*U*|), where |${E}_{{v}_{{\beta}^{\prime}}}$| = (|${E}_{{v}_{{\beta}_{1}}}$| + |${E}_{{v}_{{\beta}_{2}}}$|)/2. In addition, the time complexity of all the swap operations considered at each step is *O *(|*E*|^{2}). Therefore, the time complexity of the SCCB-grid layout algorithm is *O *(|*E*|^{2 }+ |*U*||${E}_{{v}_{{\beta}^{\prime}}}$| (|*V*| + |*E*|)) at each step.

Since the time complexity of CB-grid layout algorithm is *O *(|*V*|^{2 }+ |*E*|^{2 }+ |*W*||${E}_{{v}_{\beta}}$| (|*V*| + |*E*|)) at each step [15], the time complexity of SCCB-grid layout algorithm is *O*(|*V*||${E}_{{v}_{\beta}}$| (|*V*| + |*E*|)) larger than that of CB-grid layout algorithm (note that *v*_{β }and *v*_{β' }are not distinguished here). Here, we consider two cases, |*V*| ≤ |*W*| (case 1) and |*V*| > |*W*| (case 2), and show that these two algorithms have the same time complexity with high probability. For case 1, the above difference is negligible since *O *(|*V*||${E}_{{v}_{\beta}}$| (|*V*| + |*E*|)) ≤ *O *(|*W*||${E}_{{v}_{\beta}}$|(|*V*| + |*E*|)). In contrast, the *O*(|*V*||${E}_{{v}_{\beta}}$| (|*V*| + |*E*|)) difference cannot be neglected in case 2. However, if we assume that all nodes can be moved to form the next layout with equal probability, the expected value of |${E}_{{v}_{\beta}}$| is the average degree 2 |*E*|/|*V*|, so that |*V*||${E}_{{v}_{\beta}}$| = 2 |*E*|. Consequently, *O*(|*V*||${E}_{{v}_{\beta}}$| (|*V*| + |*E*|)) = *O *(|*E*| (|*V*| + |*E*|)), which is within *O *(|*V*|^{2 }+ |*E*|^{2}). Therefore, the time complexity of SCCB-grid layout algorithm will be the same as that of CB-grid layout algorithm even in case 2. For the above reasons, the time complexities of SCCB-grid and CB-grid layout algorithms are the same in practice.
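The identity |*V*||${E}_{{v}_{\beta}}$| = 2 |*E*| used in case 2 is the handshake lemma applied to the average degree. A minimal Python sketch, with a hypothetical helper `average_degree` and a small made-up graph, checks it numerically and evaluates the average degree for the reduced endothelial model reported in the Data and Parameters section (212 nodes, 274 edges).

```python
from collections import Counter

def average_degree(num_nodes, edges):
    """Mean number of incident edges per node in an undirected graph."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(degree.values()) / num_nodes

# Made-up example graph: 4 nodes, 5 edges.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
toy_avg = average_degree(4, toy_edges)
print(toy_avg, 4 * toy_avg == 2 * len(toy_edges))

# Average degree of the reduced endothelial model (numbers from the paper).
model_avg = 2 * 274 / 212
print(round(model_avg, 3))
```

Under the equal-probability assumption, the expected degree of the moved node equals this average, which is why the extra term reduces to a multiple of |*E*| (|*V*| + |*E*|).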

## Results and Discussion

### Data and Parameters

To evaluate our algorithms on a large-scale signal transduction pathway with a gene regulatory network, we created the pathway model of an endothelial cell with Cell Illustrator [1,2] by extracting information from [19]. The model consists of 309 nodes and 371 edges (roughly three times as large as the apoptosis model in [15], which consists of 117 nodes and 126 edges), and the maximum degree of a node is ten (eight in the apoptosis model). Grid widths and heights are fixed to 100 pixels; the total numbers of vertical and horizontal grid points are 36 and 40, respectively. We used the following information pertaining to seven GO subcellular localizations: extracellular space (GO:0005615), cytoplasm (GO:0005737), nucleus (GO:0005634), mitochondrion (GO:0005739), plasma membrane (GO:0005886), nuclear membrane (GO:0005635), and mitochondrial membrane (GO:0005740). We also used the following information pertaining to sixteen processes and entities used as attributes of nodes: migration, phosphorylation, protein with a modification, ligand, assembly, transcription, translation, mRNA, ligand and receptor, receptor, unknown, protein, exchange, trimer, ubiquitination, and degradation.

Usually, these types of biological models have many degradation nodes, and a degradation process always has exactly one edge. To exploit this property, we apply the layout algorithms after removing the degradation nodes (97 nodes). After applying the layout algorithms, we attach each removed degradation node just below the entity to which it was initially connected. Thus, in practice, the numbers of nodes and edges in the model given to the layout algorithms are 212 and 274, respectively. Note that when the performances of the algorithms are compared in terms of the numbers of edge-edge crossings and node-edge crossings later in this section, crossings caused by degradation nodes and the edges connected to them are not taken into account.
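The removal and reattachment of degradation nodes can be sketched as follows. This is only an illustration under stated assumptions: the function names `strip_degradation` and `reattach`, the `kind` attribute map, and the convention that "just below" means one grid row further down (y increasing downward) are all hypothetical, each degradation node is assumed to be connected to a non-degradation node, and collisions with already occupied grid points are not handled.

```python
def strip_degradation(nodes, edges, kind):
    """Remove degradation nodes and remember the entity each was attached to."""
    removed = {v for v in nodes if kind.get(v) == "degradation"}
    parent = {}
    kept_edges = []
    for u, v in edges:
        if u in removed:
            parent[u] = v          # u is a degradation node attached to v
        elif v in removed:
            parent[v] = u          # v is a degradation node attached to u
        else:
            kept_edges.append((u, v))
    kept_nodes = [v for v in nodes if v not in removed]
    return kept_nodes, kept_edges, parent

def reattach(pos, parent):
    """Place each removed node one grid row below its parent (assumption)."""
    full = dict(pos)
    for node, owner in parent.items():
        x, y = pos[owner]
        full[node] = (x, y + 1)
    return full

# Made-up example: protein A, protein B, and a degradation process of A.
nodes = ["A", "B", "dA"]
edges = [("A", "B"), ("A", "dA")]
kind = {"A": "protein", "B": "protein", "dA": "degradation"}

kept_nodes, kept_edges, parent = strip_degradation(nodes, edges, kind)
laid_out = {"A": (3, 5), "B": (4, 5)}   # stand-in for a layout algorithm's output
final = reattach(laid_out, parent)
print(kept_nodes, kept_edges, final["dA"])
```

Because every degradation node has exactly one edge, removing it also removes exactly one edge, which matches the reported reduction from 309 nodes and 371 edges to 212 nodes and 274 edges.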

We apply the following rule to edge-edge crossing weight *W*_{ee}, node-edge crossing weight *W*_{ne}, combo score weight *W*_{cs}, and distance cost weight *W*_{dc }of a layout cost, in Equation (2), to ensure that the importance of the distance cost is less than those of the others:

$\mathrm{min}({W}_{ee},{W}_{ne},{W}_{cs})>{W}_{dc}\cdot \underset{L}{\mathrm{max}}{\displaystyle \sum _{v,u\in V}Distanc{e}_{v,u}(L)}.\text{}\left(31\right)$

In our study, *W*_{dc}, *W*_{ee}, *W*_{ne}, and *W*_{cs }were set to 1, 70, 150, and 110, respectively. Also, the constant C in *CW*_{a }was set to 12.
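The rule in Equation (31) can be written as a one-line check. In this sketch the function name `weights_respect_rule` and the argument `max_total_distance` are hypothetical; the latter stands for the maximum over layouts of the summed distance cost, whose value depends on the definition of *Distance* given elsewhere in the paper and is therefore supplied by the caller rather than computed here.

```python
def weights_respect_rule(w_ee, w_ne, w_cs, w_dc, max_total_distance):
    """True if every crossing/combo weight outweighs the largest possible
    total distance cost, as required by Equation (31)."""
    return min(w_ee, w_ne, w_cs) > w_dc * max_total_distance

# Weights reported in the paper; the two bounds below are made-up values
# chosen only to show both outcomes of the check.
print(weights_respect_rule(70, 150, 110, 1, max_total_distance=69))
print(weights_respect_rule(70, 150, 110, 1, max_total_distance=70))
```

The inequality is strict, so a bound equal to the smallest weight already violates the rule.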

Using the combo score, many nodes can be aligned vertically. However, in many cases, the nodes cannot be moved once they have combo relations. Plasma membrane, nuclear membrane, and mitochondrial membrane are thin and torus shaped; thus, vertical alignments of the nodes on these subcellular localizations will not be of interest for users (e.g., the width of plasma membrane in our model is only two grids). Therefore, in this paper, we decided to ignore combo scores in plasma membrane, nuclear membrane, and mitochondrial membrane.

### Comparison of layouts

Figure 6 shows the number of edge-edge crossings, the number of node-edge crossings, combo scores, and total costs of the layouts with CB-grid, CCB-grid, and SCCB-grid layout algorithms, and the human layout. We generate ten initial layouts by applying Eades initial layout algorithm to ten random layouts. These initial layouts are commonly used for each layout algorithm (CB Eades, CCB Eades, and SCCB Eades in Figure 6). In addition, we use the ten random layouts directly as initial layouts of CB-grid layout algorithm (CB random in Figure 6, which corresponds to the previous layout algorithm) to confirm the significance of preparing proper initial layouts. Figures 8 and 9 respectively show the best layouts of CB-grid and SCCB-grid layout algorithms, which have the lowest total cost among ten resulting layouts of each algorithm. The human layout is shown in Figure 10.

**Comparisons of edge-edge crossings, node-edge crossings, combo score, and total cost among the results of four grid layout algorithms and the human layout**. Costs and scores of the generated layouts with the CB random, CB Eades, CCB Eades, SCCB Eades, ...

**A resulting layout of CB-grid layout algorithm**. A resulting layout of CB-grid layout algorithm in an endothelial signal transduction pathway. The pathway model is the same as that in Figure 10.

**A resulting layout of SCCB-grid layout algorithm**. A resulting layout of SCCB-grid layout algorithm in an endothelial signal transduction pathway. The pathway model is the same as that in Figure 10.

**The human layout**. The human layout of an endothelial signal transduction pathway. This pathway model is arranged with CB-grid and SCCB-grid layout algorithms in Figure 8 and Figure 9, respectively.

In [15], the initial layout for the CB-grid layout algorithm was a random layout, which had a large number of edge-edge crossings and node-edge crossings, so many iterations were needed until convergence. This prompted us to use the output of the Eades initial layout algorithm as the initial layout. Figure 7 shows the number of iterations until convergence. As shown in this figure, CB-grid Eades reduces the number of iterations compared to CB-grid random (40% reduction on average). Moreover, the total cost of CB-grid Eades is greatly improved over that of CB-grid random (see Figure 6(d)).

The discussion in [15] suggested that reducing edge-edge crossings and node-edge crossings leads to a better approximation of the human layout. However, as shown in Figures 6(a) and 6(b), the human layout also has several edge-edge and node-edge crossings, and it has a higher combo score than the layouts of the CB-grid layout algorithm. Based on these facts, we proposed an additional scoring criterion, the combo score, in the CCB-grid layout algorithm. As the combo scores show (see Figure 6(c)), the CCB-grid layout algorithm drastically improves this score, bringing it closer to that of the human layout. However, the numbers of edge-edge crossings and node-edge crossings in the CCB-grid layout algorithm increase compared to CB-grid Eades (see Figures 6(a) and 6(b)).

In this paper, the swap operation is proposed to increase the number of candidate layouts at each step. As shown in Figures 6(a) and 6(b), the SCCB-grid layout algorithm succeeds in reducing edge-edge crossings and node-edge crossings, i.e., the above drawback of the CCB-grid layout algorithm is partially mitigated. In addition, as shown in Figure 6(c), the combo score of the SCCB-grid layout algorithm is also improved slightly.

**Comparisons of the total numbers of iterations for optimal layouts among four grid layout algorithms**. Total number of iterations for optimal layouts with CB random, CB Eades, and SCCB Eades from the same initial layout. Ten initial layouts are applied ...

We also applied the grid layout algorithms to the Fas-induced apoptosis pathway model [20] and the ASE cell fate simulation model [21] to obtain a more general comparison. The resulting layouts and the number of crossings in each layout are summarized in Additional file 1. These models, together with the endothelial cell model, are also available as Additional file 2, and the application of the SCCB-grid layout algorithm for these models can be downloaded from [22].

## Conclusion

For better biopathway layouts, three improvements to the CB-grid layout algorithm were proposed: (i) improvement of the initial layouts, (ii) improvement of the cost function, and (iii) improvement of the search strategy itself without increasing the time complexity. For (i), the Eades initial layout algorithm was proposed and the improvement was confirmed with a signal transduction pathway of an endothelial cell. For (ii), the CCB-grid layout algorithm, which includes the combo score function, was proposed and the improvement was verified with the same signal transduction pathway. For (iii), the SCCB-grid layout algorithm was proposed. Owing to (i) and (iii), our layout algorithm can start from a better layout and is more robust to the condition of the initial layout than extant methods. In addition, thanks to the combo score, we succeeded in utilizing biological attributes that are not considered in extant methods.

However, our layout algorithm has limitations and problems that should be addressed in future work. Firstly, if the parameters of the combo score are not correctly selected, a node no longer moves to other grid points once it gets a combo relation. Thus, it is important to devise a method that automatically selects suitable parameters for the combo score function, edge-edge crossing function, and node-edge crossing function. Secondly, our algorithm considers only undirected graphs. For metabolic pathways, in contrast, [11,13] proposed layout algorithms that decompose a digraph into hierarchical structural parts and directed cycle parts by considering the direction of edges in order to capture the flow of reactions. Therefore, the grid layout algorithm will also need to handle digraphs, utilizing properties that are especially effective in grid-based layout. Finally, it should be noted that grid layout algorithms, including our new approach, have high time complexity and are not suitable for real-time drawing. Thus, we would like to devise a further optimized grid layout algorithm to enable real-time drawing.

## Authors' contributions

The basic idea was conceived by MK and MN. This idea was developed by KK and MN who then conceived a new idea and developed it. EJ created the endothelial model in Figure 10. SM supervised the whole study. The final manuscript was read and approved by all authors.

## Supplementary Material

**Additional file 1:**

The resulting layouts obtained by applying the LK-grid, CB-grid, and SCCB-grid layout algorithms to the Fas-induced apoptosis pathway model and the ASE cell fate simulation model are shown. A comparison of these results is also included.

^{(3.0M, pdf)}

**Additional file 2:**

Biopathway model files. Endothelial cell model, Fas-induced apoptosis pathway model and ASE cell fate simulation model are included.

^{(76K, zip)}

## Acknowledgements

Computation time was provided by the Super Computer System, Human Genome Center, Institute of Medical Science, University of Tokyo.

## References

- Nagasaki M, Doi A, Matsuno H, Miyano S. Genomic Object Net: I. A platform for modelling and simulating biopathways. Applied Bioinformatics. 2003;2:181–184.
- Doi A, Nagasaki M, Fujita S, Matsuno H, Miyano S. Genomic Object Net: II. Modelling biopathways by hybrid functional Petri net with extension. Applied Bioinformatics. 2003;2:185–188.
- Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Research. 2003;13:2498–2504. doi: 10.1101/gr.1239303.
- Networks/Pajek http://vlado.fmf.uni-lj.si/pub/networks/pajek/
- Demir E, Babur O, Dogrusoz U, Gursoy A, Nisanci G, Atalay RC, Ozturk M. PATIKA: an integrated visual environment for collaborative construction and analysis of cellular pathways. Bioinformatics. 2002;18:996–1003. doi: 10.1093/bioinformatics/18.7.996.
- Dogrusoz U, Erson EZ, Giral E, Demir E, Babur O, Cetintas A, Colak R. PATIKAweb: a Web interface for analyzing biological pathways through advanced querying and visualization. Bioinformatics. 2006;22:374–375. doi: 10.1093/bioinformatics/bti776.
- Kurata H, Matoba N, Shimizu N. CADLIVE for constructing a large-scale biochemical network based on a simulation-directed notation and its application to yeast cell cycle. Nucleic Acids Research. 2003;31:4071–4084. doi: 10.1093/nar/gkg461.
- Kurata H, Masaki K, Sumida Y, Iwasaki R. CADLIVE dynamic simulator: direct link of biochemical networks to dynamic models. Genome Research. 2005;15:590–600. doi: 10.1101/gr.3463705.
- Brandes U, Dwyer T, Schreiber F. Visualizing related metabolic pathways in two and a half dimensions. Proceedings of the 11th International Symposium on Graph Drawing. 2003. pp. 111–122.
- Karp PD, Paley SM. Automated drawing of metabolic pathways. Proceedings of the 3rd International Conference on Bioinformatics and Genome Research. 1994. pp. 225–238.
- Becker MY, Rojas I. A graph layout algorithm for drawing metabolic pathways. Bioinformatics. 2001;17:461–467. doi: 10.1093/bioinformatics/17.5.461.
- Sirava M, Schafer T, Eiglsperger M, Kaufmann M, Kohlbacher O, Bornberg-Bauer E, Lenhof HP. BioMiner-modeling, analyzing, and visualizing biochemical pathways and networks. Bioinformatics. 2002;18:S219–230.
- Wegner K, Kummer U. A new dynamical layout algorithm for complex biochemical reaction networks. BMC Bioinformatics. 2005;6
- Li W, Kurata H. A grid layout algorithm for automatic drawing of biochemical networks. Bioinformatics. 2005;21:2036–2042. doi: 10.1093/bioinformatics/bti290.
- Kato M, Nagasaki M, Doi A, Miyano S. Automatic drawing of biological networks using cross cost and subcomponent data. Genome Informatics. 2005;16:22–31.
- Genc B, Dogrusoz U. A constrained, force-directed layout algorithm for biological pathways. Proceedings of the 11th International Symposium on Graph Drawing. 2003. pp. 314–319.
- Dogrusoz U, Giral E, Cetintas A, Civril A, Demir E. A compound graph layout algorithm for biological pathways. Proceedings of the 12th International Symposium on Graph Drawing. 2004. pp. 442–447.
- Eades P. A heuristic for graph drawing. Congressus Numerantium. 1984;42:149–160.
- Pober JS. Endothelial activation: Intracellular signaling pathways. Arthritis Research. 2002;4:S109–116. doi: 10.1186/ar576.
- Matsuno H, Tanaka Y, Aoshima H, Doi A, Matsui M, Miyano S. Biopathways representation and simulation on hybrid functional Petri net. In Silico Biology. 2003;3:389–404.
- Saito A, Nagasaki M, Doi A, Ueno K, Miyano S. Cell fate simulation model of gustatory neurons with microRNAs double-negative feedback loop by hybrid functional Petri net with extension. Genome Informatics. 2006;17:100–111.
- http://www.csml.org/download/SCCBLayout_BMC_inst.exe
