Advanced Review
Graphical user interfaces
Wendy L. Martinez∗
This article provides a brief introduction to graphical user interfaces or GUIs. The first section defines graphical user interfaces, describes interface components, and the different types of GUIs. This is followed by a short discussion of GUI design principles and descriptions of some tools for easily creating GUIs. The article concludes with some examples of GUIs that would be of interest to researchers and statisticians. 2011 John Wiley & Sons, Inc. WIREs Comp Stat 2011 3 119–133 DOI: 10.1002/wics.150
Keywords: MATLAB; CRAN R; GUI design; software development
INTRODUCTION
Graphical user interfaces or GUIs are everywhere today, even on our smart phones. In fact, the major personal computer operating systems like Microsoft Windows and the one used on the Apple Macintosh are GUI based. Thus, it is a good idea to learn more about them and to take advantage of this human–computer interface. The result will be a wider audience for one’s research accomplishments.
This article first provides some background information on graphical interfaces, to include definitions, the historical development of GUIs, user interface components, and different types of graphical interfaces. Most readers of this article are probably researchers and statisticians and are not GUI developers. However, it is possible that some might want to implement their latest algorithms and methods in a graphical interface to make it easier for a wider audience to use them. Thus, the background section is followed by some design principles and guidance in developing GUIs. Next is a description of some tools for creating GUIs, with a focus on software that many data analysts use. This includes the MATLAB and R computing environments. The article concludes with some GUI examples that work with these two analytic packages.
∗Correspondence to: [email protected]. Department of Defense, Fredericksburg, VA 22405, USA. Volume 3, March/April 2011.
BACKGROUND
This section includes a definition of GUIs, followed by a discussion of the historical development of GUI-based mechanisms for human–computer interactions. This is followed by a description of the major types of GUIs, with an emphasis on those used for data analysis. The section concludes with a list of individual components that can be included in a GUI application.
Definition of GUIs
A GUI (pronounced ‘gooey’) provides an easy way for humans and computers to interact and communicate [1,2]. GUIs take advantage of the graphics capabilities of computers to make this communication easier by hiding the details of the programming language from the user. This type of interface uses windows, icons, menus, buttons, drop-down lists, dialog boxes, and more, as a means of communication between humans and computers. These graphical widgets are usually activated when the user manipulates them with a mouse or other pointing device. GUIs are typically event-driven, which means that some task is executed whenever an event is detected, such as the user clicking on a button or a menu item [3]. For example, computer programs can be opened and executed by clicking on their icons, or dialog boxes can be invoked when a menu item is selected.
It might help to compare a GUI with the command line interface. With a command line interface, the user is presented with a blank screen and a prompt. All communications with the computer are accomplished via the keyboard, where commands are entered by the user. The command line interface assumes that the user knows the language required to communicate with the computer and to accomplish the tasks. Some examples of command line interfaces are the MS-DOS and Linux (in console mode) operating systems.
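The event-driven behavior described above, in which a task runs only when an event such as a button click is detected, can be sketched in a few lines. The following is a minimal, toolkit-free illustration in Python (a language chosen here only for illustration; the article itself focuses on MATLAB and R). The `Widget` class, its `bind` and `fire` methods, and the `open_dialog` handler are hypothetical names invented for this sketch and do not belong to any real GUI library.

```python
from collections import defaultdict


class Widget:
    """Toy stand-in for a GUI component such as a button or menu item."""

    def __init__(self, name):
        self.name = name
        self._callbacks = defaultdict(list)  # event name -> list of handlers

    def bind(self, event, callback):
        """Register a function to run when `event` occurs on this widget."""
        self._callbacks[event].append(callback)

    def fire(self, event):
        """Simulate the toolkit detecting `event`; return the handler results."""
        return [callback(self) for callback in self._callbacks[event]]


def open_dialog(widget):
    # In a real GUI this would display a dialog box; here it returns a message.
    return f"dialog opened by {widget.name}"


button = Widget("ok_button")
button.bind("click", open_dialog)

# Nothing runs until an event is detected.
results = button.fire("click")    # the bound handler executes
unhandled = button.fire("hover")  # no handler is bound, so nothing happens

print(results)
print(unhandled)
```

In a real toolkit, the event loop rather than the program itself would trigger the equivalent of `fire` when the user clicks; the sketch only shows how handlers are registered and then invoked in response to events.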
History of GUIs
Some [4] trace the beginning of GUI concepts to Vannevar Bush, who worked at the Massachusetts Institute of Technology (MIT). In 1945, he published an article [5] in The Atlantic Monthly, where he described a tool called memex. This article was based on ideas he had starting in the early 1930s. He envisioned a tool that combined technologies such as high resolution microfilm, multiple graphical screen viewers, cameras, and keyboards, with the goal of storing data in such a way that it is accessible and linkable. The memex laid out ideas that have become reality today with the advent of personal computers and the World Wide Web.
Another early pioneer in developing GUI concepts was Ivan Sutherland. He was a graduate student at MIT and wrote his dissertation [6–8] in 1963 on something he called Sketchpad, a man–machine graphical communication system. Sutherland’s Sketchpad system allowed the human and computer to communicate using line drawings created on a CRT screen with a light pen. He also included the ability to zoom in and out on the display and to store objects in memory for later recall.
In the 1950s, Douglas Engelbart was inspired by Vannevar Bush’s ideas, and he started thinking about how a machine could be used to enhance the human’s ability to exploit information and turn it into knowledge [9,10]. As a radar operator during the war, Engelbart had some experience with processing information graphically, and he saw the computer as a tool for augmenting the human intellect rather than replacing the human. Engelbart and his staff at the Augmentation Research Center at the Stanford Research Institute worked on many innovative ideas for human–computer interaction and demonstrated them at a 1968 computer conference in San Francisco.
One of the interfaces presented at the conference was the first mouse, and it became accepted as the best and most natural way to maneuver a cursor on a viewing screen. The group also demonstrated a system called oN-Line System (or NLS) that included collaboration between multiple networked computers.
The first usable GUI was developed by researchers at the Xerox Palo Alto Research Center (PARC) in 1977 [4]. This was called the Xerox Star. The developers were very careful to design the human–computer interface before they worked on the actual application. However, the GUI turned out to be too slow and was not a commercial success, perhaps because of its high cost.
Steve Jobs of Apple Computer learned of these developments at PARC, including the Xerox Star. He ended up hiring several of the PARC researchers, and this inspired him to eventually use GUIs in Apple computers. A predecessor to the Macintosh (successfully launched in 1984) was the Lisa computer [4]. This computer initially had a command line interface, but was soon converted to a GUI, mostly because of the newly hired PARC people. The Apple Computer company extended the original GUI ideas at PARC to include a menu bar, drop-down menus, overlapping windows, icons, and more. In a broad sense, one might say that the methods for human–computer interaction that were eventually incorporated into this original Apple Macintosh provided the basis for the GUIs we have today.
There were other companies working on creating GUIs for personal computers and workstations in the 1980s. Some examples of the GUI applications include VisiOn (made by VisiCorp), Interface Manager (the original Microsoft Windows), GEM (a GUI for DOS and the Atari computer), Workbench for the Commodore Amiga, and the X Window System for UNIX. See Reimer [4] for an excellent description and screenshots of these applications and more.
Types of GUIs
This section includes a list and description of the general types of GUIs in use on personal computers today [3]. One of the most common types is based on menus and dialog boxes, and it is one we are familiar with in operating systems like Microsoft Windows. It is also commonly found in most commercial statistical data analysis packages, such as SPSS (http://www.spss.com/), Minitab (http://www.minitab.com/en-US/default.aspx), Statistica (http://www.statsoft.com/), JMP (http://www.jmp.com/), Systat (http://www.systat.com/), S-Plus (http://spotfire.tibco.com/products/s-plus/statistical-analysis-software.aspx), and others.
Another type of GUI is based on a spreadsheet or tabular interface [3]. It might seem strange to think of this as a GUI, but a little thought should convince the reader that it is a form of human–computer interface for data analysis. The major part of this type of interface consists of a table of cells, where each cell can contain numbers, text, or formulas. The analysis is accomplished by performing computations on the rows or columns using either internal functions or formulas. Most spreadsheet programs (or interfaces) include other GUI components, such as menus and dialog boxes. Examples of this type of interface are Microsoft Excel, Gnumeric (http://projects.gnome.org/gnumeric/) (part of the GNOME (http://www.gnome.org/) desktop environment), and Open Office (Calc module).
Some data analysis packages have a notebook interface. This can be considered as a type of word processor that also includes the ability to perform computations. The document itself can have text and code that is executed by the underlying computer algebra engine.
Some examples of software that have notebook interfaces are Mathematica (http://www.wolfram.com/), Mathcad (http://www.ptc.com/products/mathcad/), Scientific WorkPlace (http://www.mackichan.com/), and Maple (http://www.maplesoft.com/). Most of these can be thought of as a type of note pad, where one types in algebraic or mathematical expressions, as if one were writing them on the board or typing them using an equation editor like MathType (http://www.dessci.com/en/products/mathtype/). The computational engine then interprets the expressions and provides answers in a similar fashion or in traditional mathematical notation. Most also include the ability to produce images and graphics based on the code.
With the advent of the World Wide Web, we also have web-based interfaces. These consist of web pages or applications that contain forms and possibly other types of interface components. The forms are filled in, and then some data analysis is performed on another server or computer. This type of application has some advantages, such as platform portability, client/server architecture, and availability of browsers.
GUI Components
The actual appearance of a GUI depends on the operating system and platform, as well as the purpose of the GUI. However, most of them have similar components. Some of the main elements or UIs (user interfaces) of GUIs are described below.
• Windows: This is one of the main graphical mechanisms used to group and isolate functionality in GUI-based software. It divides the screen into separate areas, where the user can execute a different program, display directories, show a different plot, etc.
• Desktop: The desktop is the main area (i.e., the computer screen) in most operating systems. One could think of this as the main window where the icons are usually displayed.
• Icons: Icons are small graphics or pictures that represent tasks, programs, folders, files, windows, and more.
These are usually organized on the desktop, toolbars, ribbons, directories, and menus. The main purpose of icons is to convey some helpful information to the user as to the task that will be invoked by clicking on the icon.
• Pointing device: This is typically a mouse, trackball, or touch pad. The pointing device moves the cursor on the screen. It allows the user to select objects on the screen, click to activate user interface components (e.g., buttons, menus), drag files or objects, resize windows, and more. Some GUIs include help information that is tied to the cursor or pointing device. These are sometimes called tool tips. When the user puts the cursor on some icon or UI, a little text area shows up, attached to the cursor, with information about that particular interface.
• Menus: Menus are used in the great majority of programs that have a graphical interface. These are similar to menus one finds in restaurants: a list of available options from which the user can choose.
• Shortcut Menus: This type of menu is very useful. Shortcut menus are activated by right-clicking with the mouse when the cursor is in some area of interest on the GUI. Different menus and options come up, depending on the area where the click happens. Thus, they are tied to the context and are sometimes called context menus.
• Drop-down Menus: When there is not enough GUI real estate, one can use drop-down or pop-up menus. These typically show the current value for the UI and have an arrow pointing down, indicating more choices are available. The user can click on the arrow to get a menu or list of other options. Examples of a drop-down menu are the font type and font size UIs in Microsoft Word.
• Toolbars: Toolbars are horizontal strips that typically run across the top of a GUI. They are sometimes used to provide interface options similar to menus. The toolbars contain small icons (or buttons) that are grouped together according to their functionality.
One could think of the toolbar as a toolkit and the icons as the tools. In many applications, the user can select which toolbars are visible and where they appear on the GUI, and can separate them from the main GUI window.
• Ribbons: Programs in the Microsoft Office 2007 suite have a new type of user interface called a ribbon. A set of toolbars is placed on tabs that are organized across the top of the window. Each tab contains icons and control elements that pertain to the functionality designated by the tab. This eliminates the need for different toolbars and icons that clutter the window.
FIGURE 1 | Example of a blank layout for MATLAB’s GUIDE tool for creating GUIs.
• Dialog Boxes: There are many different types of dialog boxes or pop-up windows, from those that just provide information to those that ask the user to enter something or to make choices. One example of a dialog box is the one that pops up and asks you if you want to overwrite the existing file, and another is the open file dialog box in Microsoft Windows.
• Buttons: Push buttons are used when the developer wants to present the user with discrete choices, where some action will take place immediately once the button is pushed. There might be several buttons offering a range of choices, and the default button is highlighted. The user can hit the return key instead of clicking on a button, in which case the highlighted option is executed.
• Radio Buttons and Check Boxes: These UIs are similar to the push buttons in that they offer ways for the user to make selections from a short list of options. Radio buttons and check boxes are typically grouped together and written in a way that makes the choices mutually exclusive. In other words, the user can only make one selection.
• Text Boxes: GUIs can include interfaces or small areas that allow the user to type in some text. These are useful when the set of possible choices is effectively unlimited. An example of this type of interface is the URL address box in most browsers.
GUI DESIGN PRINCIPLES
This section includes several general GUI design principles. While most readers of this journal are likely data analysts and not software developers, some might be interested in creating GUIs that implement the methodologies they develop for computational statistics. We discuss some of the tools with which GUIs can be developed in the next section.
FIGURE 2 | The top figure shows the Basic Fitting GUI, which is part of the main MATLAB package. This can be invoked via the Tools menu in the figure window. The second figure shows the data, the fit, and the residuals. (In the screen shot, a quadratic fit y = p1*x^2 + p2*x + p3 has coefficients p1 = –0.023216, p2 = 0.91838, p3 = 28.215 and a norm of residuals of 30.054.)
There are several good references and websites that provide guidelines for designing useful GUIs [11]. The following principles were taken and adapted from Jeff Johnson’s book [12] describing the do’s and don’ts of developing GUIs.
Principle 1. ‘Focus on the users and their tasks, not the technology’.
This principle is first, because it is the most important. It will also become apparent that several of the following principles are corollaries to this first one. It means that one should start the GUI development process by answering the following questions:
• Who are the intended customers or users of the GUI?
• What activities should it support?
• What problems does it solve?
• What are the skills and knowledge of the typical user?
• What are the usual processes and tasks that the users need to accomplish?
In the ideal case, GUI developers should meet with the intended users to get answers to these questions, but statisticians and researchers do not usually have specific users or customers in mind when developing a GUI to implement some methodology they created. However, the researcher can and should think about these questions from a user viewpoint, because it will save time in coding and redesign.
Principle 2. ‘Consider function first, presentation later’.
When creating a GUI, one is often tempted to start sketching the look of the GUI using paper or some sort of GUI layout tool. This is still something one must do eventually, but a developer should start with the answers to some of the questions from above. Developers must consider the purpose and role of the GUI first, before any coding. Johnson [12] recommends that one build a conceptual model, which is a model of the analysis (or product) the users need to understand.
Principle 3. ‘Conform to the user’s view of the task, and do not make it more complicated’.
Too often software developers want to make the application what they think it should be, and they tend to include lots of flashy animations and colorful graphics, as well as interface controls that are unnecessary. The whole idea of creating a GUI should be to make things easier for the user, not to make them more complicated. So, go back to the first principle and make sure that the focus is on the user.
Principle 4. ‘Promote learning and deliver information, not just data’.
As stated before, one of the objectives for creating a GUI instead of a command line interface is to eliminate the task of learning a complicated programming language or text-based commands. Making GUIs complicated defeats this purpose, so they should be no more complicated than they have to be. It is also a good idea to use the layout of the GUI to promote ease of use and learning, say by grouping UI controls together by their functionality or by providing helpful information (titles, tool tips, etc.) on the GUI itself.
Once these principles are addressed, especially the first one, it is time to create a sketch of the GUI, along with task flows and processes. It is also a good idea to allow potential users to take a look at the design and offer ideas for improvement. At this point, it is all right to start writing code to create the GUI, possibly using one of the tools described in the next section.
TOOLS FOR CREATING GUIs
This section briefly describes several tools for developing GUIs in MATLAB, R, and other platforms. These two computing environments, MATLAB and R, are emphasized here because researchers and statisticians often implement their methods and algorithms in one of these languages. There are several ways to create GUIs in MATLAB, and there are many GUI creation packages for R. Thus, what follows is just a partial list of what is available.
FIGURE 3 | This shows the distribution fitting GUI from the MATLAB statistics toolbox. (The screen shot shows a density plot of the geyser data, with Data on the horizontal axis and Density on the vertical axis.)
Creating GUIs in MATLAB
MATLAB is a commercial computing environment developed by The MathWorks, Inc. for engineering, simulation, modeling, statistical data analysis, and much more (http://www.mathworks.com/). It is used by many statisticians and researchers to prototype their methods, and a lot of user-written MATLAB programs for state-of-the-art research are available [13,14]. Also, MATLAB has excellent visualization and GUI-development capabilities [15].
All MATLAB graphics are based on low-level functionality called Handle Graphics. One can write GUIs directly using these low-level functions. However, MATLAB also includes something called GUIDE (Graphical User Interface Development Environment). One could think of this as a GUI for building GUIs. The low-level hand-programming approach for creating GUIs will not be discussed in this article. The interested reader can refer to the MATLAB documentation or the book Graphics and GUIs in MATLAB by Marchand and Holland for more information [16]. There is also some online documentation for creating GUIs (with GUIDE and programmatically) (http://www.mathworks.com/help/techdoc/), as well as some helpful video tutorials at MATLAB Central (http://www.mathworks.com/).
FIGURE 4 | This is a screen shot of the random number generation GUI in the MATLAB Statistics Toolbox. (The screen shot shows a histogram of 500 samples from a normal distribution, with Values on the horizontal axis and Counts on the vertical axis.)
Like any good GUI, GUIDE reduces the learning and programming effort required to build useful GUIs in MATLAB. GUIDE is actually a suite of tools for creating GUIs. GUIDE can be invoked in several ways, but the easiest is to type the word guide at the prompt. At this point, the GUIDE Layout Editor is opened, along with a blank GUI figure, as shown in Figure 1. This allows the user to add UI controls (buttons, axes, sliders, drop-down menus, list boxes, etc.) to the GUI. Simply select the component and drag it to the figure area. The Layout Editor also allows one to create toolbars, change the appearance of the controls, set the tab order, and modify other GUI options. Once the GUI layout is saved, MATLAB creates a template that contains MATLAB code for controlling the way the GUI works, so the right things happen when a UI component is activated.
FIGURE 5 | Entrance GUI (optional) for the EDA GUI Toolbox.
In MATLAB, GUI actions are governed by callbacks, which are functions that execute in response to GUI events (e.g., clicking a button or a menu option). The MATLAB programming editor can be used to fill in the rest of the template with the required code.
Tcl/Tk
R is an open-source environment and high-level language for statistical computing and graphics. It has been greatly extended by user-written packages that implement state-of-the-art analysis methodologies. See the Comprehensive R Archive Network (CRAN) to download the software, packages, documentation, and more (http://cran.r-project.org/). Some of the examples described in the next section are based on R.
This section will provide information on a tool that can be used for building GUIs in R, as well as in many other computing environments. Tcl (http://www.tcl.tk/) [17] is a scripting or dynamic language that is extensible and has been used in a wide variety of applications. This type of language is sometimes called ‘high-level’, and it is similar to other languages like Perl [18], Python [19], MATLAB, and R. These can be contrasted with lower-level system programming languages like C++ and Java. Tk is a graphical toolkit that works with Tcl and allows one to easily create GUI applications that work with a variety of operating systems, such as Microsoft Windows, Linux, and the Macintosh OS. Tk can also be used with other languages, like C, Perl, Python, and more.
There is a package in R called tcltk that provides access to the Tk toolkit [20,21]. In addition, there are other packages that are built on the tcltk package, such as rpanel (builds simple UI controls for functions), traitr (builds GUIs using Python functionality), and gWidgetstcltk (toolkit for tcltk).
wxWidgets
wxWidgets is a free cross-platform library that enables the rapid development of GUI applications for desktop computing platforms and more [22]. wxWidgets is written in C++, and it can be used with other languages, such as Python and Perl. For the most part, the wxWidgets toolkit uses native UI widgets, so the applications built using this library have the look users expect for each particular platform.
The R-wxPython package [23] is an R interface for the wxWidgets GUI library of tools. This is achieved by using Python as an intermediary between R and wxWidgets. The process is made simpler through the use of the Boa Constructor cross-platform Python development environment (http://boa-constructor.sourceforge.net/). This GUI development process was described by Wettenhall [23].
FIGURE 6 | This is the Data Tours GUI in the EDA GUI Toolbox. Note that this GUI can be started from the command prompt, as well as from the GUI in Figure 5.
EXAMPLES OF GUIs IN COMPUTATIONAL STATISTICS
There are many examples of GUIs that might be of interest to statisticians and researchers. These include ones developed commercially and by the user community. A very short list of GUI examples in MATLAB and R is provided below, the purpose of which is to show the user what one can do with GUIs and statistical data analysis.
MATLAB GUIs
The main MATLAB package has very powerful visualization and graphics capabilities. One of the basic plots available to analysts is the 2-D scatterplot. If such a plot is created, then the user can choose a menu option in the figure window called Basic Fitting. This brings up a GUI as shown in Figure 2. The user can explore various fits to the data, as well as residuals to assess the goodness of the fit. The MathWorks, Inc. sells a Statistics Toolbox that includes some useful GUIs.
One is the Distribution Fitting Tool, which is invoked at the prompt using the command dfittool. This GUI allows one to fit various distributions to the data and to evaluate the results. A screen shot of the GUI is shown in Figure 3.
FIGURE 7 | Screen shot of the opening Rattle GUI (http://rattle.togaware.com).
A rather fun GUI that comes with the Statistics Toolbox is the Random Number Generation GUI (type randtool at the prompt). This allows one to generate random numbers from many distributions, visualize them in a histogram, and export them to the workspace. This GUI is shown in Figure 4, where we have a random sample from the standard normal distribution.
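What the Distribution Fitting Tool and the Random Number Generation GUI do can be approximated, very roughly, without a GUI. The sketch below is an illustration that uses only Python's standard library; it is not the Statistics Toolbox implementation. It draws 500 values from a standard normal distribution, bins them as a histogram would, and fits a normal distribution by estimating its mean and standard deviation from the sample. The helper `histogram_counts`, the bin range of -4 to 4, the number of bins, and the fixed seed are all assumptions made for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Draw 500 values from the standard normal distribution (mu = 0, sigma = 1).
mu, sigma, n = 0.0, 1.0, 500
sample = [random.gauss(mu, sigma) for _ in range(n)]


def histogram_counts(values, lower, upper, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lower <= v < upper:
            # min() guards against floating-point rounding at the upper edge.
            index = min(int((v - lower) / width), n_bins - 1)
            counts[index] += 1
    return counts


lower, upper, n_bins = -4.0, 4.0, 16
counts = histogram_counts(sample, lower, upper, n_bins)

# Fit a normal distribution by estimating its two parameters from the data.
mu_hat = statistics.mean(sample)
sigma_hat = statistics.stdev(sample)  # sample standard deviation

# Crude text histogram: one '#' for every two observations in a bin.
bin_width = (upper - lower) / n_bins
for i, count in enumerate(counts):
    left_edge = lower + i * bin_width
    print(f"{left_edge:5.1f} | {'#' * (count // 2)}")
print(f"fitted mu = {mu_hat:.3f}, fitted sigma = {sigma_hat:.3f}")
```

The estimates should land near the true values of 0 and 1, which is the same kind of check an analyst would make visually in the GUI by comparing the fitted density with the histogram.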
The above examples are commercial products, but the next one, the EDA GUI Toolbox, is freely available (http://pi-sigma.info/) [24]. It implements many of the tasks that one might want for exploratory data analysis and data mining, including clustering, graphical displays, dimensionality reduction, exploring distributions, data tours, and more. Figure 5 shows a screen shot of the (optional) opening GUI, which provides a menu of buttons, along with descriptions, leading to the other GUIs in the toolbox. Pressing a button brings up the associated GUI, which can also be opened individually. For example, the Data Tours GUI is displayed in Figure 6.

Rattle: A Data Mining GUI in R

Rattle (R Analytical Tool to Learn Easily) is an open source GUI data mining tool (http://rattle.togaware.com/) [25]. It connects to several R packages (described below) that can be used for data mining applications. Like any good GUI, it removes the burden of understanding the R language and using the command line interface. Rattle is based on the Gnome interface and the RGtk2 package in R. Additionally, the GUI itself was developed using Glade [18].

Some of the R packages that Rattle utilizes include ada (stochastic boosting), arules (mining association rules), gplots (R plotting functions), randomForest, rggobi (interface with GGobi), and many more. These packages can be installed separately, and Rattle will tell the user, via a graphical pop-up window, when one of them needs to be installed.

[Figure 8 screen shot: R Commander window with the menu bar (File, Edit, Data, Statistics, Graphs, Models, Distributions, Tools, Help), the Script Window, the Output Window, and the Messages pane.]
FIGURE 8 | Screen shot of R Commander GUI for basic statistics.
[Figures 9 and 10 screen shots: the GGobi main window with its Tools menu open, and a scatterplot of the variables tars1 and tars2 from flea.csv (74 x 7).]
FIGURE 9 | This shows the GUI that appears and some of the options that are available to explore a data set using GGobi.
FIGURE 10 | This is the scatterplot that is being manipulated and explored using the GGobi interface.

Rattle has a UI that is based on tabs, which is similar to the ribbon-tab interface described previously. The tabs are an attempt to inform the user about the task flow in a typical data mining process (design Principle 4). There are tabs for the following tasks (and more):

1. Data: This allows one to load the data and select variables for analysis.
2. Explore: Exploring one’s data is an important task in data mining, since it informs the rest of the analysis and enables better understanding of the results.
3. Test: This area provides tools for parametric and nonparametric statistical tests of distributions, e.g., Kolmogorov–Smirnov and Wilcoxon signed rank tests.
4. Transform: This tab has functions for rescaling, imputation, handling outliers, data cleaning, and more.
5. Model: Several options for building models are available, such as decision trees, support vector machines, generalized linear models, neural networks, etc.
6. Evaluate: In this tab, one can evaluate models using the confusion table, ROC curves, and lift charts.
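To make the Evaluate step concrete, the sketch below computes a two-class confusion table and the two rates that give one point on an ROC curve. It is plain Python written for illustration only. It does not use Rattle or R, and the function names and the toy labels are hypothetical.

```python
from collections import Counter


def confusion_table(actual, predicted, positive=1):
    """Count true/false positives and negatives for a two-class problem."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return {k: counts[k] for k in ("tp", "fp", "fn", "tn")}


def rates(table):
    """Return (true positive rate, false positive rate) for one table."""
    pos = table["tp"] + table["fn"]
    neg = table["fp"] + table["tn"]
    tpr = table["tp"] / pos if pos else 0.0
    fpr = table["fp"] / neg if neg else 0.0
    return tpr, fpr


if __name__ == "__main__":
    actual = [1, 0, 1, 1, 0, 0, 1, 0]       # toy labels, made up for this example
    predicted = [1, 0, 0, 1, 0, 1, 1, 0]    # toy model output, also made up
    table = confusion_table(actual, predicted)
    print(table)
    print(rates(table))
```

Repeating the rate calculation while sweeping the decision threshold of a scoring model would trace out the full ROC curve.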
A screen shot of the opening GUI window for Rattle is shown in Figure 7.

R Commander: A Basic Statistics GUI for R

Some time was spent on GUI design principles in a previous section. The developer of the R Commander GUI had these specific design objectives in mind: (1) to provide an easy-to-use interface that supports the functionality used in a basic-statistics course, (2) to protect the user from doing unreasonable things, and (3) to convey information about the R commands generated by UI events [26]. It is clear from these objectives that the developer spent some time thinking about the users of this GUI application (design Principles 1 and 3). The R Commander GUI is based on the tcltk package in R, which provides an interface to Tcl/Tk (described in a previous section).

The R Commander GUI is a window that has menus, buttons, and information fields. The window also has areas that display the R commands generated by the user’s choices (Script Window) and the resulting output (Output Window). Text in the Script Window is editable, which means that commands can be typed in directly and executed. Note that these windows promote learning (design Principle 4).

The main menus include the usual ones like File and Edit. There are also ones pertaining to statistics, such as Data (reading and manipulating data), Statistics (basic analyses), Graphs, Models (hypothesis tests, confidence intervals, diagnostics, and more), and Distributions (probabilities, quantiles, graphs). A screen shot of the opening R Commander window is shown in Figure 8.

GGobi

GGobi is a stand-alone open source GUI application for dynamic visualization and exploration of high-dimensional data (http://www.ggobi.org/) [27]. The GGobi environment provides access to many useful tools to interactively visualize and explore data sets. Some of the options include data tours, scatter plots, parallel coordinate plots, brushing, transformations, and more.
GGobi will also work with the R computational environment via the rggobi package. This allows the user to exploit the computational power of R to analyze any interesting data structures discovered using GGobi. Examples of the GGobi interface are given in Figures 9 and 10. Figure 9 shows some of the options under the Tools menu for the data in the 2-D scatterplot in Figure 10.

Other R GUIs

This section concludes with a very brief list of some other interesting GUIs for R. They are included here because they are examples of the types of GUIs mentioned previously in this article.

• Red-R: This is an open source GUI that provides a visual programming capability for R. The main goal is to make the programming power of R available to users with little to no experience in programming (http://www.red-r.org/).
• RExcel: This is a package that works with Microsoft Windows and Excel. It enables transferring data between R and Excel, running R code from Excel, and more. Essentially, RExcel provides a spreadsheet GUI for R [28].
• Tinn-R: This is a GUI that replaces the basic code editor that comes with installations of R (i.e., Rgui). It has some nice features that make programming in R easier, such as color coding of reserved words (http://www.sciviews.org/Tinn-R/).
• Rpad: This R package provides an interactive workbook-style interface through a web browser.

CONCLUSION

This article provides a very brief introduction to GUIs and shows some examples of how GUIs can be used in statistical data analysis. Of course, there are many more examples of GUIs in R, MATLAB, and other software applications or platforms. The hope is that this introduction will motivate readers to take advantage of the development tools and to create GUIs, making it easier for the user community to understand and use their methods.

REFERENCES

1. PC Magazine Encyclopedia. Available at: http://www.pcmag.com/encyclopedia/ (Accessed October 12, 2010).
2. Jansen BJ. The graphical user interface: an introduction. SIGCHI Bull 1998, 30:22–26. Available at: http://jimjansen.tripod.com/academic/pubs/chi.html (Accessed October 12, 2010).
3. R GUIs, R Wiki. Available at: http://rwiki.sciviews.org/doku.php (Accessed October 12, 2010).
4. Reimer J. A History of the GUI; 2005. Available at: http://arstechnica.com/old/content/2005/05/gui.ars (Accessed October 12, 2010).
5. Bush V. As we may think. Atlantic Monthly, 101–108. Available at: http://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/3881/ (Accessed October 12, 2010).
6. Sutherland IE. Sketchpad: a man-machine graphical communication system. PhD Dissertation, MIT; 1963.
7. Sutherland IE. Sketchpad: a man-machine graphical communication system. AFIPS Conference Proceedings, 323–328.
8. Blackwell A, Rodden K. UCAM-CL-TR-574. Sketchpad: a man-machine graphical communication system. Technical Report Number 574, University of Cambridge; 2003. Available at: http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-574.pdf (Accessed October 12, 2010).
9. Engelbart DC, English WK. A research center for augmenting human intellect. AFIPS Conference Proceedings Fall Joint Computer Conference; 1968, 395–410. Available at: http://www.dougengelbart.org/pubs/augment-3954.html (Accessed October 12, 2010).
10. Doug Engelbart Institute. Available at: http://www.dougengelbart.org/ (Accessed October 12, 2010).
11. Galitz WO. Essential Guide to User Interface Design. New York, NY: John Wiley & Sons; 2002.
12. Johnson J. GUI Bloopers: Don’ts and Do’s for Software Developers and Web Designers. San Francisco, CA: Morgan Kaufmann; 2000.
13. Martinez WL. Computational statistics in MATLAB. Wiley Interdiscip Rev: Comput Stat 2011, 3:69–74. doi:10.1002/wics.13
14. Martinez WL, Martinez AR. Computational Statistics Handbook with MATLAB. 2nd ed. Boca Raton, FL: CRC Press; 2007.
15. Smith ST. MATLAB Advanced GUI Development. Dog Ear Publishing, LLC; 2006.
16. Marchand P, Holland OT. Graphics and GUIs with MATLAB. Boca Raton, FL: CRC Press; 2002.
17. Flynt C. Tcl/Tk: A Developer’s Guide. 2nd ed. New York, NY: Morgan Kaufmann; 2003.
18. Schwartz RL, Phoenix T, Foy BD. Learning Perl. 5th ed. Sebastopol, CA: O’Reilly Media; 2008.
19. Dawson M. Python Programming for the Absolute Beginner. 3rd ed. Boston, MA: Course Technology; 2010.
20. Dalgaard P. A primer on the R-Tcl/Tk Package. R News 2001, 1:27–31. Available at: http://www.r-project.org/doc/Rnews/.
21. Dalgaard P. The R-Tcl/Tk interface. Proceedings of the 2nd International Workshop on Distributed Statistical Computing, Vienna, Austria; 2001. Available at: http://www.ci.tuwien.ac.at/Conferences/DSC-2001/Proceedings/.
22. Smart J, Hock K, Csomor S. Cross-Platform GUI Programming with wxWidgets. New York, NY: Prentice Hall; 2005.
23. Wettenhall J. GUI development with R-wxPython and BoaConstructor. useR! 2004 Conference, Vienna, Austria. Available at: http://bioinf.wehi.edu.au/folders/james/wxPython/ (Accessed October 12, 2010).
24. Martinez WL, Martinez AR, Solka JL. Exploratory Data Analysis with MATLAB. 2nd ed. Boca Raton, FL: CRC Press; 2010.
25. Williams GJ. Rattle: a data mining GUI for R. The R J 2009, 1:45–55.
26. Fox J. The R commander: a basic-statistics graphical user interface to R. J Stat Softw 2005, 14. http://www.jstatsoft.org/v14/i09.
27. Cook D, Swayne DF. Interactive and Dynamic Graphics for Data Analysis with R and GGobi. New York, NY: Springer; 2007.
28. Heiberger R, Neuwirth E. R Through Excel. New York, NY: Springer; 2009.