Connecting the Dots in Visual Analysis

Yedendra B. Shrinivasan, Eindhoven University of Technology, The Netherlands
David Gotz, IBM Research, USA
Jie Lu, IBM Research, USA

ABSTRACT

During visual analysis, users must often connect insights discovered at different points in time. This process is often called connecting the dots. When analysts interactively explore complex datasets over multiple sessions, they may uncover a large number of findings. As a result, it is often difficult for them to recall the past insights, views and concepts that are most relevant to their current line of inquiry. This challenge is even greater during collaborative analysis tasks, where they need to find connections between their own discoveries and insights found by others. In this paper, we describe a context-based retrieval algorithm to identify notes, views and concepts from users' past analyses that are most relevant to a view or a note based on their line of inquiry. We then describe a related notes recommendation feature that, based on this algorithm, surfaces the most relevant items to users as they work. We have implemented this recommendation feature in HARVEST, a web-based visual analytic system. We evaluate the related notes recommendation feature of HARVEST through a case study and discuss the implications of our approach.

Index Terms: H.3.3 [Information Search and Retrieval]: Retrieval models

1 INTRODUCTION

Interactive visualizations allow users to investigate various characteristics of a dataset and to reason based on patterns, trends and outliers. During complex visual analyses, users must derive insights by connecting discoveries made at different stages of an investigation. However, during a long investigation process that can span hours, days or even weeks, it becomes difficult for users to recall the details of their past discoveries. Yet these details may form the key connections between their past work and current line of inquiry.
We believe that the difficulty in recalling past work often leads users to overlook important connections. The challenge, therefore, is to develop techniques that assist in connecting the dots by uncovering connections to users' past work that would normally go unnoticed.

To address the challenge of recalling past work, users often externalize interesting findings or new hypotheses, either as annotations on top of visualizations or as bookmarks in electronic notes. These notes help users to manually revisit and review their past analysis. However, as the number of notes and annotations grows, users again have difficulty recalling the details of each previous discovery. Therefore, users must be able to more easily retrieve related views (visualization states with one or more visualizations), notes and concepts (including data characteristics investigated in the views and entities from notes) from their past analyses. These related views, notes and concepts can then help them to find interesting connections within their analysis.

In this paper, we describe a context-based retrieval algorithm that retrieves views, notes and concepts from users' past analyses related to a view or a note based on their line of inquiry. Whenever users create a view or record a note, we derive a context description for the view or note from their line of inquiry. Our algorithm then uses these context descriptions to retrieve the most relevant views, notes and concepts from past analyses.

Using our context-based retrieval algorithm, we have implemented a related notes recommendation feature in HARVEST, a web-based visual analytic system. As users create new views during their analysis, HARVEST dynamically applies our algorithm to recommend the most relevant notes from past analyses. An overview of related notes is presented in the note-taking interface as a ranked list of notes along with thumbnails of the associated views.
An overview of related concepts is also shown using a tag cloud. Both overviews are updated after each exploration action. We evaluate the related notes recommendation feature of HARVEST through a case study and discuss the implications of our approach. Specifically, we believe that the related notes recommendation feature helps users to maintain greater awareness of relevant information and assists in connection discovery during visual analysis.

2 CONNECTING THE DOTS

We encounter a lot of information during daily activities. We process that information to learn new things, perform tasks or make decisions, and store that processed information in our memory. However, our memory is limited in its ability to store and recall relevant information from the past [4]. To overcome these limitations, we have learned to work around them by taking notes, capturing pictures and videos, or associating information with a local environment [12]. In addition, we create to-do lists and automatic reminders using personal information management systems [15]. These external attention pointers help us remember information that would otherwise be forgotten. Thus, we try to connect the dots using these attention pointers and make sense of information encountered in our daily activities.

Similarly, when we read a text, we process information from it to understand the story conveyed by its authors. To do so, we need to connect the dots across various parts of the text and make sense of it. A good text provides relevant attention pointers that help a reader to connect the dots. For example, authors of academic texts use cross-referencing as a reminder that helps readers to locate relevant pieces of information in other locations. Similarly, authors of fiction use sequences of events or people and context descriptions as attention pointers that help readers to connect the dots.
During a visual analysis, analysts encounter much information by interactively exploring large datasets using visualizations. They also formulate interesting findings during this exploration process. Due to the volume of information discovered during a long analysis task, they often externalize interesting findings or new hypotheses, either as annotations on top of visualizations or as bookmarks in electronic notes. They organize those findings into a case and present them to others [11, 17]. They must often connect insights discovered at different points in time and make sense of them [10]. However, during a long investigation process that can span hours, days or even weeks, it becomes difficult for users to recall the details of their past discoveries. Therefore, it is difficult to connect the dots during a visual analysis.

Hence, we think it will be helpful for users to retrieve notes, views and concepts that are related to a given view or note based on their line of inquiry. In addition, during a visual analysis, the most relevant items from past analyses related to the current line of inquiry can be recommended to help users maintain awareness of relevant information and to assist in connection discovery.

3 RELATED WORK

First, we present a number of sense making models that highlight the critical role of connection discovery during information analysis. We then discuss related work that specifically addresses connecting the dots during visual analysis.

3.1 Sense Making Models

Kuhlthau [14] considers a sense making process as an information search process in which a person is forming a personal point of view [5]. She identifies six stages in an information search process from a user's perspective: initiation, selection, exploration, formulation, collection and presentation.
She modeled the cognitive, affective and action aspects involved in these six stages by conducting longitudinal user studies involving various public library users, students and academic researchers. Finding information relevant to the current topic and being aware of related information are some of the important actions during the exploration and collection stages. These actions help to avoid premature closure of an information search process.

Similarly, Ellis [6] classifies information seeking activities into eight categories: starting, chaining, browsing, differentiating, monitoring, extracting, verifying and ending. Two of these categories model the process of connection discovery in the information search process: chaining and monitoring. Chaining involves following a referential connection between information sources. Monitoring involves maintaining awareness by tracking related information sources.

Pirolli and Card [16] identify two major loops in the sense making process during an intelligence analysis task: the information foraging loop and the sense making loop. They found that, from within the sense making loop, analysts look back into the processed information (evidence file) obtained during the information foraging loop to search for evidence or relations that support a hypothesis. If no supporting information is found, analysts continue to forage for new information.

3.2 Visual Analysis

In general, to support the reasoning process in information visualization [17], users are provided with three types of linked views: a data view, a knowledge view and a navigation view. The data view has interactive visualization tools; the navigation view provides an overview of the exploration process, for instance, a history tree and action trails; and the knowledge view helps to record and organize notes. Currently, during an analysis, the connection discovery process is supported by exploiting the relationships either between views and notes or between entities in notes.
Using Links between Views and Notes

Several information visualization tools support links between views and notes. In Aruvi [17], users can externalize findings using notes along with links to the views. They can revisit views via notes and review and revise their analysis. To support the review process, Aruvi also provides an overview of key visualization and data aspects in an exploration process using a user interest model [18]. Users can also retrieve visualizations from past analyses using keyword and similarity search mechanisms. Sense.us [11], a web site supporting asynchronous collaboration across a variety of visualization types, supports view sharing, discussion, graphical annotation, and social navigation. It has a doubly-linked discussion mechanism that supports situated conversation about visualizations. For this, both data and view parameters of visualization states are indexed and associated with the corresponding comments. Thus, during an asynchronous collaboration, all comments associated with a view are retrieved.

Using Entities

A combination of text analytics and information visualization has been widely used to analyze massive textual data. Text analytics is used to extract entities from the text, and the relationships between those entities are visualized. The Have Green framework [20] uses an interactive graph visualization to represent concepts and relationships extracted through its analytical capabilities. In Jigsaw [19], multiple coordinated views are used to visualize the connections between entities extracted from a collection of text documents. A graph view is used to visualize text documents and the entities shared among these documents. In addition to the graph visualization, a list view is used to show the connections between entities. A scatterplot view is used to explore pairwise connections between entities. However, in Have Green and Jigsaw, text analysis is applied to the input data but not to a user's notes.
Analyst's Notebook [13] visualizes the relationships among entities extracted from a user's notes using graph visualization. In Entity Workspace [1], users can record notes or place text snippets; entities and their relationships are extracted from notes and documents, and a document-entity graph is constructed. Using this graph model, analysts can re-find facts quickly, notice connections between entities, abstract information structure and identify documents and entities to explore further. During a collaborative analysis, the most valuable notes from other analysts related to the current topic (text) are recommended to an analyst using an entity graph. Thus, Entity Workspace identifies related entities and helps analysts to connect the dots while investigating a text document corpus. Also, in InsightFinder [2], users' notes are used to build a context model. Using this context model, the most relevant page units are recommended to them while browsing the internet.

During a visual analysis, users formulate findings after some exploration, as identified in Pirolli and Card's sense making model and Kuhlthau's information seeking process model. For connection discovery in visual analysis, approaches based on links between views and notes or on entities in notes are not sufficient. The users' line of inquiry has to be considered in combination with the view and data parameters of views and the entities in notes. We now present our approach to connecting the dots in visual analysis, which considers the users' line of inquiry, the view and data parameters of views, and the entities in notes in an integrated way.

4 APPROACH

To support the connection discovery process in visual analysis, we enable users to retrieve views, notes and concepts from past analyses related to a view or note. Figure 1 shows our approach. Whenever they create a view of their data (in the data view) or record a note (in the knowledge view), we derive a context description for the view or note from their line of inquiry.
Our algorithm then uses these context descriptions to retrieve the most relevant views and notes from past analyses.

The context description is derived from a model of visual analytic activity called action trails [10]. Action trails represent users' analytic activity as graphs of semantic analytic steps, or actions. Actions can be classified into broad categories: exploration actions, insight actions, and meta-actions. An exploration action alters the visualization specifications in a visual analytics system and creates a new view. Insight actions record or organize notes and views, while meta-actions (e.g., revisit, undo, redo) allow users to review and structure their lines of inquiry.

Figure 1: A context-based retrieval system that retrieves related notes, views and concepts for a view or a note based on the users' line of inquiry. This retrieval system is used to support the connecting-the-dots process during a visual analysis.

Action trails contain valuable information about the concepts that are most relevant to a user's analysis and how the user's interests evolve over time. We therefore extract a set of concepts from the action trail to form the context description for each view or note. We extract two types of concepts. Action concepts are derived from the attributes associated with exploration actions (e.g., data and view parameters). Entities are concepts extracted from a user's notes and represent items such as people, places or companies. For each concept associated with a view or note, we derive a concept weight from the user's action trail to determine its degree of salience at the time the view or note was created.

For a view or note that the user focuses on, we compute a relevance score for each existing view and note by comparing its context description with that of the focused view or note. Using the relevance score, the related views and notes are retrieved. An overview of the related concepts is also provided.
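The comparison of context descriptions described above can be illustrated with a minimal sketch. The text so far does not specify how the relevance score is computed, so the cosine similarity used here is an assumption, as are the names `Item`, `relevance` and `retrieve_related` and all example concepts and weights.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class Item:
    """A view or note together with its context description (hypothetical type)."""
    item_id: str
    kind: str  # "view" or "note"
    # Context description: concept -> weight (salience when the item was created).
    context: dict[str, float] = field(default_factory=dict)


def relevance(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two weighted concept sets (an assumed metric)."""
    dot = sum(w * b[c] for c, w in a.items() if c in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_related(focus: Item, past: list[Item], top_k: int = 3) -> list[tuple[str, float]]:
    """Rank past items by relevance to the focused item, dropping unrelated ones."""
    scored = [
        (p.item_id, relevance(focus.context, p.context))
        for p in past
        if p.item_id != focus.item_id
    ]
    scored = [s for s in scored if s[1] > 0.0]
    scored.sort(key=lambda s: s[1], reverse=True)
    return scored[:top_k]


# Illustrative data only; the concepts and weights are invented for this sketch.
past_items = [
    Item("note-1", "note", {"region:east": 1.0, "product category": 0.5}),
    Item("view-2", "view", {"sales > $50,000": 1.0}),
    Item("note-3", "note", {"yearly sales": 1.0}),
]
current_view = Item("view-9", "view", {"region:east": 1.0, "sales > $50,000": 1.0})
related = retrieve_related(current_view, past_items)
```

In this sketch, items that share no concept with the focused view receive a score of zero and are left out of the ranked list, which mirrors the idea of surfacing only related views and notes.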
Thus, this context-based retrieval algorithm surfaces the most relevant information from users' past analyses based on their line of inquiry during a visual analysis.

Using this context-based retrieval algorithm, we have implemented a recommendation feature in HARVEST, a web-based visual analytics system, which is shown in figure 2. For the current view (figure 2(a)), the recommendation feature shows a list of related notes (figure 2(c)) along with thumbnails of the views displayed while those related notes were recorded (figure 2(d)). It also provides an overview of related concepts using a tag cloud (figure 2(e)). In the following sections, we describe the context-based retrieval algorithm (section 5) and present the design considerations (section 6) and implementation details (section 7) of the recommendation feature in HARVEST.

5 CONTEXT-BASED RETRIEVAL ALGORITHM

In this section, we describe the details of our context-based retrieval algorithm. First, we present a visual analysis use case. Next, we use the use case to support our argument for a context description based on action concepts and entities from action trails. We then use the context description as the basis for the relevance metric used to identify related views, notes, and concepts.

5.1 Use Case

Figure 3 shows a portion of an action trail for an analyst investigating product sales data. She starts her analysis by focusing on sales of more than $50,000 (figure 3(1)). She compares the sales of each product using a scatterplot visualization and bookmarks it (figure 3(2)). Then, she studies quarterly sales of the products by aggregating the sales represented on the y-axis of the scatterplot by quarter (figure 3(3)). Next, she uses a tree map to visualize the sales figures in various regions (figure 3(4)). Further, she clusters the products by their category to get an overview of the sales performance by product category in various regions (figure 3(5)).
This view prompts her to reconsider the product sales comparison that she investigated some time back. She therefore revisits the comparison view she bookmarked earlier. Then she narrows down to the east and south regions (figure 3(6)). This revisit and reuse of a view creates a branch in her action trail. She further slices the products on the x-axis of the scatterplot by their category, and slices sales on the y-axis of the scatterplot by quarterly period (figure 3(7)). This slicing creates a scatterplot matrix showing sales of various product categories in different quarters of the year. She finds that product categories A, C and D have shown profit consistently in the east and south regions. She records this finding using a note. Then, she continues her analysis by studying yearly sales (figure 3(8)) and the sales distribution across regions using a map (figure 3(9)).

5.2 Action Concepts as Context

In the product sales use case, the user started her analysis with general sales data and moved on to investigate quarterly and yearly sales trends. Region was another aspect considered in the investigation; she focused on all regions, then narrowed down to the east and south regions, and finally moved on to see the actual geographical sales distribution. She also investigated the sales of individual products as well as product categories (groups of products).

The action concepts associated with this action trail (e.g., the east region and product category) correspond to the user's information interests. However, some of the action concepts were more predominant at certain times than others. For instance, she was interested only in sales of more than $50,000 throughout the investigation. In contrast, she shifted her focus among other action concepts such as quarterly sales, product categories, and regions. Her interest in these action concepts varied over time. Therefore, during an exploration process, users' evolving information interests can be viewed as a time-varying set of weighted

Figure 2: A user investigating a finance dataset in HARVEST, a web-based visual analytics system. (a) The data view shows a visualization created by the steps shown in the user's action trail (f). (b) A note-taking interface. (c) A ranked list of related notes. (d) Thumbnails of the views displayed while recording those related notes. (e) Related concepts overview: an overview of related entities from notes (underlined) and related action concepts from action trails.
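To make the idea of time-varying concept weights concrete, the following sketch encodes part of the product sales action trail as a graph and derives weights for the view in which the note was recorded (step 7). The weighting formula is not given in the text up to this point; the exponential recency decay, the `Action` type, the `concept_weights` function, and the per-step concept lists are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    """One exploration action in an action trail (hypothetical encoding)."""
    step: int
    parent: Optional[int]       # preceding step on the same branch, None for the root
    concepts: tuple[str, ...]   # action concepts active in the resulting view


FILTER = "sales > $50,000"

# Assumed encoding of steps 1 to 7 of the use case. Step 6 revisits the
# bookmarked view from step 2, which creates a branch in the trail.
TRAIL = {
    1: Action(1, None, (FILTER,)),
    2: Action(2, 1, (FILTER, "product")),
    3: Action(3, 2, (FILTER, "product", "quarterly sales")),
    4: Action(4, 3, (FILTER, "region")),
    5: Action(5, 4, (FILTER, "region", "product category")),
    6: Action(6, 2, (FILTER, "product", "region:east", "region:south")),
    7: Action(7, 6, (FILTER, "product category", "quarterly sales",
                     "region:east", "region:south")),
}


def concept_weights(trail: dict[int, Action], step: int, decay: float = 0.5) -> dict[str, float]:
    """Weight concepts along the branch ending at `step`, favoring recent actions.

    Each occurrence of a concept contributes decay ** distance, where distance
    is the number of actions between that occurrence and `step`.
    """
    weights: dict[str, float] = {}
    distance = 0
    current: Optional[int] = step
    while current is not None:
        action = trail[current]
        for concept in action.concepts:
            weights[concept] = weights.get(concept, 0.0) + decay ** distance
        distance += 1
        current = action.parent
    return weights


weights_at_note = concept_weights(TRAIL, 7)
```

Under these assumptions, the sales filter that persists through every step on the branch receives the largest weight, while concepts explored only on the other branch (such as the unrestricted region view of step 4) do not contribute to the context of the note.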