A General Introduction To Graph Visualization Techniques - PDF

Please download to get full document.

View again

of 14
All materials on our website are shared by users. If you have any questions about copyright issues, please report us to resolve them. We are always happy to assist you.
Information Report
Category:

Documents

Published:

Views: 4 | Pages: 14

Extension: PDF | Download: 0

Share
Related documents
Description
A General Introduction To Graph Visualization Techniques Raga ad M. Tarawneh 1, Patric Keller 2, and Achim Ebert 1 1 Computer Graphics and HCI Group University of Kaiserslautern, Germany {tarawneh,
Transcript
A General Introduction To Graph Visualization Techniques Raga ad M. Tarawneh 1, Patric Keller 2, and Achim Ebert 1 1 Computer Graphics and HCI Group University of Kaiserslautern, Germany {tarawneh, 2 Software Engineering: Dependability Group University of Kaiserslautern, Germany Abstract Generally, a graph is an abstract data type used to represent relations among a given set of data entities. Graphs are used in numerous applications within the field of information visualization, such as VLSI (circuit schematics), state-transition diagrams, and social networks. The size and complexity of graphs easily reach dimensions at which the task of exploring and navigating gets crucial. Moreover, additional requirements have to be met in order to provide proper visualizations. In this context, many techniques already have been introduced. This survey aims to provide an introduction on graph visualization techniques helping the reader to gain a first insight into the most fundamental techniques. Furthermore, a brief introduction about navigation and interaction tools is provided ACM Subject Classification A.1 Introduction and Survey, B.7.2 Design Aids Keywords and phrases Graph Visualization, Layout Algorithms, Graph Drawing, Interaction Techniques Digital Object Identifier /OASIcs.VLUDS Introduction One goal of information visualization is to provide techniques for converting (abstract) information e.g., in form of textual description into visual representations facilitating the perception and handling of hidden structures from underlying data sets [18]. In cases in which corresponding data elements have inherent relationships among each other, graph visualization methods are commonly applied to support the better understanding. Definition 1. Formally, a graph G = (V, E) is a mathematical structure consisting of two sets, V the set of vertices (or nodes) of the graph, and E the set of edges. Each edge has a set of one or two vertices associated to it, which are called endpoints [53]. Many application areas use graphs to represent existing structures: For example, in social networks people of a group my represent the vertices of a graph where the different relations among them are represented by a set of edges. In other areas, like biology and chemistry graphs are widely used to represent molecular and genetic maps, as well as protein production paths. In the field of software engineering, graphs are used e.g., to represent the structure of complex software systems, or to represent the internal behavior/states of compilers. In the object-oriented field, graphs are used to depict the relations among different classes, e.g., UML diagrams. In general, any hierarchical structure may be represented as a tree, which is a subtype of a graph. An example for this sort of structure is the file structure of an Raga ad M. Tarawneh, Patric Keller, and Achim Ebert; licensed under Creative Commons License ND Proceedings of IRTG 1131 Visualization of Large and Unstructured Data Sets Workshop Editors: Christoph Garth, Ariane Middel, Hans Hagen; pp OpenAccess Series in Informatics Schloss Dagstuhl Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany 152 A General Introduction To Graph Visualization Techniques operation system. The organization structure of an institute may also be represented as a tree. For more information we refer to [53, 22]. Although graph visualization techniques are widely used in many application domains, they have some limitations one has to deal with. For example, the size of the represented graph may become an issue, e.g., providing layouts for very large graphs is possible, but often comes along with the loss of readability, at least for untrained users. This is associated with the limited human cognitive power and the screen space constraints given by the visualization devices. Providing a suitable technique helping the user interacting and navigating through the data is another important issue. The goal of graph visualization techniques is to increase the comprehension level of the data by providing intuitive, intelligible layouts as well as suitable interaction mechanisms. This survey is organized as follows: In Section 2 we present a general overview concerning layout algorithms and a set of criteria to generate clean layouts. In order to increase the comprehension level of the visualized information, many interaction techniques have been proposed in the literature. In this context we present a brief introduction in Section 3. We conclude the survey by Section 4. 2 Graph Layout Algorithms As mentioned in Section 1, a graph consists of a set of nodes connected by a set of edges. The trivial way to display this sort of data is to use node-link diagrams. They depict the relations among the data elements in form of lines [53, 22]. In [25], another visualization approach is proposed to display graph structures by exploiting space-filling techniques or space-nested layouts which implicitly represent the relations. This section provides an overview describing both approaches and the used algorithms. 2.1 Node-Link Layouts The basic requirement of the node-link layout concerns the computation of the coordinates of the nodes and the representation of the lines. To increase the readability a clean layout should comply with the following criteria [29]: Nodes and edges should be evenly distributed. Edge-crossings should be minimized. Depict symmetric sub-graphs in the same way. Minimize the edge bending ratio. Minimize the edge lengths, which helps readers detecting the relations among different nodes faster. In cases where the data is inherently structured distribute the nodes into different layers. This increases the understandability of the underlying graph. For example, in data-flow diagrams it is recommended to separate the graph elements into different layers in a way that the final representation reflects the original semantics. Many other criteria have been proposed in the literature, for more details please refer to [53, 29]. It is worth mentioning that it is hard to combine most of the criteria. Some of them conflict with others. In contrast, others are hard to realize in an efficient way. Many solutions have been proposed [29, 53] to overcome these issues. Most algorithms in practice represent a trade-off. Specifying the required criteria is an application dependent process. Prioritizing a set of criteria is an important pre-condition for finding suitable layout algorithms. The work of [41] concentrates on the topic of how to prioritize such criteria. R. M. Tarawneh, P. Keller, and A. Ebert The Spring Layout Algorithm The spring layout algorithm is widely known as force-directed layout, which was originally proposed by Eades in 1984 [12]. Due to its simplicity and its ability to produce a symmetric layout, the force-directed layout is considered to be one of the popular node-link layouts. The spring layout algorithm represents the graph as a physical system, in which the graph nodes are considered to be a charged particles connected to each other using a set of springs. The first model was proposed by [12]. Each node is associated with two types of forces: attraction forces and repulsive forces. Given the node coordinates and the spring attributes the method aims at reducing the total energy of the the spring system by repositioning the nodes. The attraction force f a is applied to the neighbor nodes which are connected by a spring, while the repulsive force f r is applied to all graph nodes. These forces are defined as follows: f a (d) = k a log(d), f r (d) = k r d 2 (1) where k a and k r are constants and d is the current distance between two nodes. Figure 5b shows a small example that emerged from applying this algorithm. Although, the forcedirected approach produces clean, symmetric layouts with respect to graphs having moderate sizes, it is considered to be one of the expensive algorithms. Its time complexity exceeds O(n 3 ) (see [53, 29, 22]), where n is the total number of nodes. Moreover, force-directed layouts lack in terms of predictability ([53, 29, 22, 58]), meaning that running the algorithm twice, produces different results. This leads to problems in maintaining the users mental map during the interaction with unstable layouts [58]. Despite the mentioned disadvantages, the force-directed layout algorithm has been widely used in many visualization frameworks [56]. Furthermore, the algorithm itself has been revisited and optimized many times to overcome its characteristic drawbacks (see [27, 14, 16, 13, 24, 53, 29, 24, 7]) Topological Feature-Based Layout The feature-based graph drawing concept has been proposed by Archambault et al. [2]. The concept is called TopoLayout, which is a multi-level, feature-based approach. The pipeline of this approach consists of four main steps, the first one is called the decomposition phase in which the graph is decomposed into many sub-graphs based on the topological features of each internal sub-graph. For example, if the nodes in one sub-graph are topologically connected among each other in form of a tree, then the set of nodes are grouped together representing a meta-node. Currently, TopoLayout detects seven topological features, including trees, complete graphs, bi-connected components, clusters, and the undefined structure that is called unknown feature. For more details we refer to [2]. The second step called the feature layout phase in which the meta-nodes or the grouped sub-graphs are laid out using one of the layout algorithms (tuned with its topological feature). The third phase called the crossing reduction phase aims to eliminate the crossing ratio in the produced layout. Finally, the overlap elimination phase aims to change the node sizes in the final layout to ensure that no nodes overlap each other. The final result for TopoLayout is a tree representing the graph hierarchy, in which each node represents a sub-graph in the original graph and each meta-edge represents the relation between tow sub-graphs in the original graph. This layout technique helps in drawing relatively large graphs. Also, it provides the user with details about the internal structure of the graph, which can be useful in extracting more information about the graph itself (see Figure 1). GrouseFlocks [3] was introduced to provide an interactive way to explore large input graphs through cuts of a superimposed hierarchy. V L U D S 1 1 154 A General Introduction To Graph Visualization Techniques Figure 1 Layout generated by using the TopoLayout algorithm of [20]. The goal was to provide the user with the ability to see several different possible hierarchies of the same graph. Before we introduce the tree layout concepts, it is worth to mention that both forcedirected algorithms and the TopoLayout algorithm work perfectly with undirected graphs. Unfortunately, not many algorithms were designed for visualizing directed graphs. The Sugiyama algorithm was one of the first algorithm for drawing directed graphs [50]. The basic approach is to first layer the graph nodes, which means assigning a layer for each node and placing the nodes into the corresponding layer. Also, the algorithm includes two steps for reducing the edge-crossings and the node-overlappings. In general, directed graph layout algorithms are difficult to implement, this is due to the complexity of directed graphs. Therefore, many of the Sugiyama algorithm steps are considered to be NP-hard (see [17]), and some of them are NP-complete (see [11]) Planar Graphs Planar graphs are graphs that can be drawn without edge crossings in linear time. They emerge in various fields: CAD systems, circuit schematics, VLSI schematics, entity relationships diagrams and information system design [53, 29, 22]. To generate a planar layout for a general graph, some pre-requisites have to be fulfilled [22]: Testing whether it is possible to draw the given graph without edges crossings or not. Finding a planar layout algorithm satisfying the required application constrains. Drawing a planar graph is supported by two well known algorithms, the first one called Fraysseix [9], Pach [28] and Pollack (FPP) [46] generates a drawing of a graph G on a grid of size (2n 4) (n 2) in n log(n) time. Later, the FPP algorithm was improved to run in linear time [28]. The second algorithm has been proposed by Schnyder [46]. It attempts to find a straight line embedding on a grid of n 2 nodes and runs in linear time. An example of a planar graph is shown in Figure 4b. 2.2 Tree Layout Many layout algorithms have been already proposed. In general, this may be attributed to the tree structure s simplicity and popularity. As a good starting point for tree layout algorithms we refer [53, 29] Node-Link Tree layout Algorithms One of the basic approaches to draw a tree is to use node-link diagrams in which the parent-child relations are depicted as edges (see Section 2.1). The classical tree layout R. M. Tarawneh, P. Keller, and A. Ebert 155 algorithm proposed by [42] is one of the early methods (see Figure 2a), it produces clean trees-representations in 2D and its implementation is straight forward. However, the technique is not declared space efficient because of its preference for one of two dominating growth directions, i.e., horizontal growth or vertical growth. To cope with this problem some compact tree layout algorithms have been implemented to produce a classical tree appearance in more compact fashion [10, 53, 29, 58, 22, 7]. Another example of a node-link tree layout is the radial layout algorithm which was proposed by Eades [10]. A radial layout including its variations, places the root in the middle of co-centric circles and distributes the children of a sub-tree into circular shape according to their depth in the tree recursively. The radial layout uses space in more efficient way than the classical method. But it lacks the understandability of classical tree layouts, e.g. it is difficult to find the root of the tree (see Figure 2b) [53, 10, 39, 59]. As a sibling of the radial layout, the balloon layout has been introduced in [6]. Here, sibling sub-trees are drawn in a circle centered at their parents. This layout is effective in showing the tree structure. The balloon layout can be obtained by projecting a cone tree onto a plane [43, 53, 29, 58, 22] (see Figure 2c). H-Tree produces a classical layout for binary trees and works perfectly for balanced trees. But, again, it is hard to identify the root position [47] (see Figure 2d). All tree layout algorithms produce predictable results in at least linear time (the usual the complexity reaches from O(n log(n)) to O(n)) [53, 29]. As a result of the comparison of different tree layout algorithms, we observed that the classical tree layout perfectly depicts the hierarchy structure of the tree, with sacrificing the screen space. While the radial layout, the h-tree layout, and the balloon tree layout use the screen space more efficiently but with difficulties in finding the root [53, 29, 58, 39]. (a) Classical tree layout, produced with [19]. (b) Radial tree layout Example. (c) Balloon tree layout: produced by [22]. (d) H-Tree layout: produced by [22]. Figure 2 Tree Layout Examples. V L U D S 1 1 156 A General Introduction To Graph Visualization Techniques Space-Filling Techniques Space-filling techniques can be subdivided in two types: the Space-Division layout and the Space-Nested layout. In the Space-Division layout, the parent-child relation is depicted implicitly by attaching the children to their parent. Sunburst algorithm uses radial or circular space-filling techniques. The general belief of the developer community is that radial layout methodology better convey a hierarchy s structure without sacrificing the efficient screen space usage [49, 26]. One of the problems of this layout is that it is difficult to distinguish between the child-parent relationships and the sibling relationships, because both of them are expressed using adjacency. Moreover, due to the different number of children for each parent, the nodes sizes are difficult to control, the final layout might occupy a large space for node, which has many children. While other nodes are represented using a tiny thin rectangle that is not enough to show the node s label or the node s color (see Figure 3a). Whereas, in Space-Nested layouts the child-parent relationship is drawn using nested boxes. The idea is to place the children within their parent node. A good common example is the Treemap, (see Figure 3b) [25, 48]. Nodes are represented as rectangles, each rectangle is subdivided into number of sub-rectangles equal to its children number. The subdivision process is performed recursively. This technique is popular because it uses the screen-space efficiently, and it shows the size of the leaves in a tree. However, this technique lacks the ability of showing the hierarchical structure of the tree. Also, due to the subdivision process it is highly possible to produce long and thin rectangles, which leads to problems in with the interaction (especially in selecting or highlighting the rectangles) [25, 48, 58, 22, 55]. (a) SunBurst layout. (b) TreeMap layout. Figure 3 Examples of space-filling techniques [19]. 2.3 Matrix Visualization The matrix visualization is another technique that represents graph nodes relations implicitly (see Figure 5a). Here, each row and each column represent a node. The edge between two nodes is represented by the cell at which the corresponding row and column intersect. Edge attributes can be shown using different visual parameters such as color, size and shape. The scalability is limited, but the layout can produce clean representations of graphs having moderate size. However, the way the data is represented makes it difficult to detect the graph paths. For more details please refer to [1, 21]. R. M. Tarawneh, P. Keller, and A. Ebert 157 (a) Hyperbolic tree layout, produced with [52]. (b) Planar Graph Example. (c) TreeCube layout, produced by [51]. (d) Cone trees, produced by [43]. Figure 4 Graph layout Examples D Layout In addition to 2D representations, many layout algorithms have been extended to 3D. The reason behind it is that we are familiarized with 3D in the real world. So it is often more natural for us to explore data in 3D space. One example for a 3D layout is Treecube (see Figure 4c), a technique that has been proposed by [51] as an extension for the traditional treemap layout; it uses nested cubes to represent the parent-child relationships. The hyperbolic layout algorithm appeared for the first time in [32, 33], then it has successively been used by many others (e.g., [38, 37, 36]). The idea was to distribute the data entities over the hyperbolic space. Figure 4a shows an hyperbolic tree layout for a walrus-directory graph, which has been generated using the Walrus package [52]. This method can be displayed in 2D and 3D, providing a distorted view of the tree, which makes the interaction with large trees easier [22]. It is worth to mention that most of the force-directed techniques could be generalized easily to three dimensions (see [8]). Conetree [43] is a technique that was originally developed to layout trees in 3D space. It places father nodes at the top of a cone with its children arranged evenly in the cone base. The layout has many layers; each one represents a tree level, in which all cones have the same height. The cone-base diameter for each layer is reduced in bottom-up fashion. This helps the lowest layer to fit into the width of the box containing the full cone tree, see Figure 4d. Based on [34], 3D visualization techniques face multiple challenges: First, objects in 3D may occlude each other which causes an issue while exploring the data set. Second, providing a suitable layout algorithm that assures less object-overlapping and reduces the edge-crossing is also considered as a big challe
We Need Your Support
Thank you for visiting our website and your interest in our free products and services. We are nonprofit website to share and download documents. To the running of this website, we need your help to support us.

Thanks to everyone for your continued support.

No, Thanks