VISUAL PROGRAMMING

5 MARK

1. Visual programming language

In computing, a visual programming language (VPL) is any programming language that lets users create programs by manipulating program elements graphically rather than by specifying them textually. A VPL allows programming with visual expressions, spatial arrangements of text and graphic symbols, used either as elements of syntax or as secondary notation. For example, many VPLs (known as dataflow or diagrammatic programming)[1] are based on the idea of "boxes and arrows", where boxes or other screen objects are treated as entities, connected by arrows, lines or arcs which represent relations. VPLs may be further classified, according to the type and extent of visual expression used, into icon-based languages, form-based languages, and diagram languages. Visual programming environments provide graphical or iconic elements which can be manipulated by users in an interactive way according to some specific spatial grammar for program construction. A visually transformed language is a non-visual language with a superimposed visual representation. Naturally visual languages have an inherent visual expression for which there is no obvious textual equivalent. Current developments try to integrate the visual programming approach with dataflow programming languages, either to give immediate access to the program state (resulting in online debugging) or to support automatic program generation and documentation (i.e. visual paradigm). Dataflow languages also allow automatic parallelization, which is likely to become one of the greatest programming challenges of the future.[2] An instructive counterexample for visual programming languages is Microsoft Visual Studio. The languages it encompasses (Visual Basic, Visual C#, Visual J#, etc.) are commonly assumed to be, but are not, visual programming languages: all of these languages are textual, not graphical. MS Visual Studio is a visual programming environment, but not a visual programming language, hence the confusion.

2. Data acquisition

Source

Data acquisition begins with the physical phenomenon or physical property to be measured. Examples of this include temperature, light intensity, gas pressure, fluid flow, and force. Regardless of the type of physical property to be measured, the physical state that is to be measured must first be transformed into a unified form that can be sampled by a data acquisition system. The task of performing such transformations falls on devices called sensors.

DAQ hardware

DAQ hardware is what usually interfaces between the signal and a PC.[1] It can take the form of modules connected to the computer's ports (parallel, serial, USB, etc.) or of cards connected to slots (S-100 bus, AppleBus, ISA, MCA, PCI, PCI-E, etc.) on the motherboard. Usually the space on the back of a PCI card is too small for all the connections needed, so an external breakout box is required. The cable between this box and the PC can be expensive due to the many wires and the required shielding. DAQ cards often contain multiple components (multiplexer, ADC, DAC, TTL I/O, high-speed timers, RAM). These are accessible via a bus by a microcontroller, which can run small programs. A controller is more flexible than hard-wired logic, yet cheaper than a CPU, so it is acceptable to block it with simple polling loops.
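As a rough illustration, such a polling loop might look like the following Python-style sketch, which follows the example sequence given next. The register helpers (read_reg, write_reg) and the register addresses are hypothetical and not taken from any real DAQ card.

import time

# Hypothetical register offsets for a simple DAQ card (for illustration only).
TRIGGER_STATUS = 0x00  # non-zero once an external trigger has fired
ADC_START      = 0x04  # writing 1 starts a conversion on the selected channel
ADC_BUSY       = 0x08  # non-zero while the conversion is in progress
ADC_DATA       = 0x0C  # holds the converted sample
MUX_SELECT     = 0x10  # selects the analog input channel
TTL_INPUT      = 0x14  # digital input lines
DAC_VALUE      = 0x18  # output register used for the voltage ramp

def polling_loop(read_reg, write_reg, ram, channels, ramp_step=16):
    """Block the controller in a simple polling loop. read_reg/write_reg are
    assumed to be low-level register accessors supplied by the board support code."""
    dac = 0
    while True:
        while read_reg(TRIGGER_STATUS) == 0:        # wait for a trigger
            pass
        timestamp = time.monotonic()                # look up the time
        for channel in channels:
            write_reg(MUX_SELECT, channel)          # switch the multiplexer
            write_reg(ADC_START, 1)                 # start the ADC
            while read_reg(ADC_BUSY):               # wait for the ADC to finish
                pass
            ram.append((timestamp, channel,
                        read_reg(ADC_DATA)))        # move the value to RAM
        ttl = read_reg(TTL_INPUT)                   # get TTL input (status flags, unused here)
        dac = (dac + ramp_step) & 0xFFF             # let the DAC proceed with its voltage ramp
        write_reg(DAC_VALUE, dac)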
For example: waiting for a trigger, starting the ADC, looking up the time, waiting for the ADC to finish, moving the value to RAM, switching the multiplexer, getting the TTL input, and letting the DAC proceed with a voltage ramp.

DAQ software

DAQ software is needed in order for the DAQ hardware to work with a PC. The device driver performs low-level register writes and reads on the hardware, while exposing a standard API for developing user applications. A standard API such as COMEDI allows the same user applications to run on different operating systems; e.g. a user application that runs on Windows will also run on Linux.

3. Dataflow programming

Dataflow programming focuses on how things connect, unlike imperative programming, which focuses on how things happen. In imperative programming a program is modeled as a series of operations (things that "happen"), and the flow of data between these operations is of secondary concern to the behavior of the operations themselves. Dataflow programming, by contrast, models programs as a series of (sometimes interdependent) connections, with the operations between these connections being of secondary importance.

One of the key concepts in computer programming is the idea of "state", essentially a snapshot of the measure of various conditions in the system. Most programming languages require a considerable amount of state information in order to operate properly, information which is generally hidden from the programmer. For a real-world example, consider a three-way light switch. Typically a switch turns on a light when moved to the "on" position, but in a three-way circuit that same move may turn the light back off; the result depends on the state of the other switch, which is likely out of view. In fact, the state is often hidden from the computer itself as well, which normally has no idea that this piece of information encodes state while that one is temporary and will soon be discarded. This is a serious problem, because state information needs to be shared across multiple processors in parallel processing machines. Without knowing which state is important and which isn't, most languages force the programmer to add a considerable amount of extra code to indicate which data and parts of the code are important in this respect. This code tends to be both expensive in terms of performance and difficult to debug, and is often downright ugly; most programmers simply ignore the problem. Those that cannot must pay a heavy performance cost, which is paid even in the most common case, when the program runs on one processor. Explicit parallelism is one of the main reasons for the poor performance of Enterprise JavaBeans when building data-intensive, non-OLTP applications.

Dataflow languages promote the data to become the main concept behind any program. It may seem odd that this is not always the case, as programs generally take in data, process it, and then feed it back out. This was especially true of older programs, and it is well represented in the Unix operating system, which pipes data between small single-purpose tools. Programs in a dataflow language start with an input, perhaps the command-line parameters, and illustrate how that data is used and modified. The data is now explicit, often shown on the screen as a line or pipe indicating where the information flows. Operations consist of "black boxes" with inputs and outputs, all of which are always explicitly defined. They run as soon as all of their inputs become valid, as opposed to when the program encounters them.
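A minimal sketch of this run-when-ready behavior is given below. It keeps a table of operations keyed by their named inputs and repeatedly runs whichever operation has all of its inputs available; the operation names and values are hypothetical, and this is not any particular dataflow language or runtime.

# Each operation declares its input names, its output name, and a function.
# A value becomes "valid" once it appears in the `values` dictionary, and an
# operation fires as soon as every one of its inputs is valid.
operations = [
    {"inputs": ["a", "b"],       "output": "sum",     "fn": lambda a, b: a + b},
    {"inputs": ["sum"],          "output": "squared", "fn": lambda s: s * s},
    {"inputs": ["squared", "c"], "output": "result",  "fn": lambda s, c: s - c},
]

def run_dataflow(operations, values):
    pending = list(operations)
    while pending:
        # Scan for the first operation whose inputs are all currently valid.
        ready = next((op for op in pending
                      if all(name in values for name in op["inputs"])), None)
        if ready is None:
            raise RuntimeError("no operation is ready; some inputs are missing")
        args = [values[name] for name in ready["inputs"]]
        values[ready["output"]] = ready["fn"](*args)   # new output may enable other operations
        pending.remove(ready)
    return values

# Initial data, e.g. taken from the command line or a sensor.
print(run_dataflow(operations, {"a": 2, "b": 3, "c": 4}))
# {'a': 2, 'b': 3, 'c': 4, 'sum': 5, 'squared': 25, 'result': 21}

Note that the order in which the operations are written does not matter; they run in whatever order their inputs become valid, which is exactly the property the text describes next.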
Whereas a traditional program essentially consists of a series of statements saying "do this, now do this", a dataflow program is more like a series of workers on an assembly line, who will do their assigned task as soon as the materials arrive. This is why dataflow languages are inherently parallel: the operations have no hidden state to keep track of, and the operations are all "ready" at the same time. Dataflow programs are generally represented very differently inside the computer as well. A traditional program is just what it seems: a series of instructions that run one after the other. A dataflow program might instead be implemented as a hash table, with uniquely identified inputs as the keys, used to look up pointers to the instructions. When any operation completes, the program scans down the list of operations until it finds the first operation whose inputs are all currently valid, and runs it. When that operation finishes, it typically puts data into one or more outputs, thereby making some other operation become valid.

20 MARKS:

1. Visual Language

Visual units in the form of lines and marks are constructed into meaningful shapes and structures or signs. Different areas of the cortex respond to different elements such as colour and form. Semir Zeki[2] has shown the responses in the brain to the paintings of Michelangelo, Rembrandt, Vermeer, Magritte, Malevich and Picasso.

Imaging in the mind

What we have in our minds in a waking state and what we imagine in dreams are very much of the same nature.[3] Dream images might be with or without spoken words, other sounds or colours. In the waking state there is usually, in the foreground, the buzz of immediate perception, feeling and mood, as well as fleeting memory images.[4] In a mental state between dreaming and being fully awake is a state known as 'day dreaming' or a meditative state, during which "the things we see in the sky when the clouds are drifting, the centaurs and stags, antelopes and wolves" are projected from the imagination.[5] Rudolf Arnheim[6] has attempted to answer the question: what does a mental image look like? In Greek philosophy, the school of Leucippus and Democritus believed that a replica of an object enters the eye and remains in the soul as a memory, as a complete image. Berkeley explained that parts, for example a leg rather than the complete body, appear in the mind. Arnheim considers the psychologist Edward B. Titchener's account to be the breakthrough in understanding something of how the vague, incomplete quality of the image is 'impressionistic' and carries meaning as well as form.

Meaning and expression

Abstract art has shown that the qualities of line and shape, proportion and colour convey meaning directly without the use of words or pictorial representation. Wassily Kandinsky in Point and Line to Plane showed how drawn lines and marks can be expressive without any association with a representational image.[7] Throughout history, and especially in ancient cultures, visual language has been used to encode meaning: "The Bronze Age Badger Stone on Ilkley Moor is covered in circles, lines, hollow cups, winged figures, a spread hand, an ancient swastika, an embryo, a shooting star? … It's a story-telling rock, a message from a world before (written) words."[8] Richard Gregory suggests that, "Perhaps the ability to respond to absent imaginary situations," as our early ancestors did with paintings on rock, "represents an …"

Perception

The sense of sight operates selectively.
Perception is not a passive recording of all that is in front of the eyes, but a continuous judgement of scale and colour relationships,[10] and it includes making categories of forms to classify images and shapes in the world.[11] Children of six to twelve months are able, through experience and learning, to discriminate between circles, squares and triangles. From this age onwards the child learns to classify objects, abstracting essential qualities and comparing them to other similar objects. Before objects can be perceived and identified, the child must be able to classify the different shapes and sizes which a single object may appear to have when it is seen in varying surroundings and from different aspects.[12]

Innate structures in the brain

The perception of a shape requires the grasping of the essential structural features, to produce a "whole" or gestalt. The theory of the gestalt was proposed by Christian von Ehrenfels in 1890. He pointed out that a melody is still recognisable when played in different keys, and argued that the whole is not simply the sum of its parts but a total structure. Max Wertheimer researched von Ehrenfels' idea, and in his "Theory of Form" (1923) – nicknamed "the dot essay" because it was illustrated with abstract patterns of dots and lines – he concluded that the perceiving eye tends to bring together elements that look alike (similarity groupings) and will complete an incomplete form (object hypothesis). An array of random dots tends to form configurations (constellations).[13] All these innate abilities demonstrate how the eye and the mind seek pattern and simple whole shapes. When we look at more complex visual images such as paintings, we can see that art has been a continuous attempt to "notate" visual information.

Visual thinking

Thought processes are diffused and interconnected, and are cognitive at a sensory level. The mind thinks at its deepest level in sense material, and the two hemispheres of the brain deal with different kinds of thought.[14] The brain is divided into two hemispheres, and a thick bundle of nerve fibres enables these two halves to communicate with each other. In most people the ability to organize and produce speech is predominantly located in the left side. Appreciating spatial perceptions depends more on the right hemisphere, although there is a left hemisphere contribution.[15] In an attempt to understand how designers solve problems, L. Bruce Archer proposed "that the way designers (and everybody else, for that matter) form images in their mind's eye, manipulating and evaluating ideas before, during and after externalising them, constitutes a cognitive system comparable with but different from, the verbal language system. Indeed we believe that human beings have an innate capacity for cognitive modelling, and its expression through sketching, drawing, construction, acting out and so on, that is fundamental to human thought."[16]

Graphicacy

The development of the visual aspect of language communication has been referred to as graphicacy,[17] a parallel discipline to literacy and numeracy. Michael Twyman[18] has pointed out that the ability to handle ideas visually, which includes both the understanding and the conception of them, should not be confused with the specific talents of an artist. The artist is one very special kind of visual manipulator whose motives are varied and often complex.
The ability to think and communicate in visual terms is part of the learning process and of equal importance within it to literacy and numeracy.

2. Related fields of data visualization

Data acquisition

Data acquisition is the sampling of the real world to generate data that can be manipulated by a computer. Sometimes abbreviated DAQ or DAS, data acquisition typically involves the acquisition of signals and waveforms and the processing of those signals to obtain the desired information. The components of data acquisition systems include appropriate sensors that convert any measurement parameter to an electrical signal, which is acquired by data acquisition hardware.

Data analysis

Data analysis is the process of studying and summarizing data with the intent to extract useful information and develop conclusions. Data analysis is closely related to data mining, but data mining tends to focus on larger data sets, with less emphasis on making inference, and often uses data that was originally collected for a different purpose. In statistical applications, some people divide data analysis into descriptive statistics, exploratory data analysis, and inferential statistics (or confirmatory data analysis), where EDA focuses on discovering new features in the data and CDA on confirming or falsifying existing hypotheses. Types of data analysis are:

Exploratory data analysis (EDA): an approach to analyzing data for the purpose of formulating hypotheses worth testing, complementing the tools of conventional statistics for testing hypotheses. It was so named by John Tukey.

Qualitative data analysis (QDA) or qualitative research: the analysis of non-numerical data, for example words, photographs, observations, etc.

Data governance

Data governance encompasses the people, processes and technology required to create a consistent, enterprise-wide view of an organisation's data in order to:
Increase consistency and confidence in decision making
Decrease the risk of regulatory fines
Improve data security
Maximize the income generation potential of data
Designate accountability for information quality

Data management

Data management comprises all the academic disciplines related to managing data as a valuable resource. The official definition provided by DAMA is that "Data Resource Management is the development and execution of architectures, policies, practices, and procedures that properly manage the full data lifecycle needs of an enterprise." This definition is fairly broad and encompasses a number of professions that may not have direct technical contact with lower-level aspects of data management, such as relational database management.

Data mining

Data mining is the process of sorting through large amounts of data and picking out relevant information. It is usually used by business intelligence organizations and financial analysts, but is increasingly being used in the sciences to extract information from the enormous data sets generated by modern experimental and observational methods.
It has been described as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data"[7] and "the science of extracting useful information from large data sets or databases."[8] In relation to enterprise resource planning, according to Monk (2006), data mining is "the statistical and logical analysis of large sets of transaction data, looking for patterns that can aid decision making".[9]

Data transforms

Data transformation is the process of automatically converting data, both real-time and offline, from one format to another. Standards and protocols provide the specifications and rules, and the transformation usually occurs in a process pipeline of aggregation, consolidation or interoperability. The primary use cases are in systems-integration organizations and among compliance personnel.

3. Initial data analysis

The most important distinction between the initial data analysis phase and the main analysis phase is that during initial data analysis one refrains from any analysis aimed at answering the original research question. The initial data analysis phase is guided by the following four questions:[3]

Quality of data

The quality of the data should be checked as early as possible. Data quality can be assessed in several ways, using different types of analyses: frequency counts, descriptive statistics (mean, standard deviation, median), normality (skewness, kurtosis, frequency histograms, normal probability plots), and associations (correlations, scatter plots). Other initial data quality checks are:

Checks on data cleaning: have decisions during cleaning influenced the distribution of the variables? The distribution of the variables before data cleaning is compared to the distribution after data cleaning to see whether data cleaning has had unwanted effects on the data.

Analysis of missing observations: are there many missing values, and are the values missing at random? The missing observations in the data are analyzed to see whether more than 25% of the values are missing, whether they are missing at random (MAR), and whether some form of imputation is needed.

Analysis of extreme observations: outlying observations in the data are analyzed to see whether they seem to disturb the distribution.

Comparison and correction of differences in coding schemes: variables are compared with coding schemes of variables external to the data set, and possibly corrected if the coding schemes are not comparable.

Test for common-method variance.

The choice of analyses to assess data quality during the initial data analysis phase depends on the analyses that will be conducted in the main analysis phase.[4] A few of the basic checks above are sketched in code below.

Quality of measurements

The quality of the measurement instruments should only be checked during the initial data analysis phase when this is not the focus or research question of the study. One should check whether the structure of the measurement instruments corresponds to the structure reported in the literature. There are two ways to assess measurement quality:

Confirmatory factor analysis

Analysis of homogeneity (internal consistency), which gives an indication of the reliability of a measurement instrument.
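Returning to the data-quality checks listed above, a minimal sketch using pandas and scipy might look like the following. The DataFrame and column names are hypothetical placeholders for the study's own data.

import pandas as pd
from scipy import stats

# A tiny hypothetical data set standing in for the study's real data file.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "age":   [21, 34, 29, None, 41],
    "score": [3.2, 4.1, 2.8, 3.9, 9.5],
})

# Frequency counts and descriptive statistics (mean, standard deviation, median).
print(df["group"].value_counts())
print(df[["age", "score"]].describe())

# Normality checks: skewness and kurtosis per variable.
for col in ["age", "score"]:
    clean = df[col].dropna()
    print(col, "skew:", stats.skew(clean), "kurtosis:", stats.kurtosis(clean))

# Missing observations: fraction missing per variable
# (the 25% rule from the text would flag values above 0.25).
print(df.isna().mean())

# Extreme observations: rows more than 1.5 standard deviations from the mean
# (a deliberately low cutoff so this tiny example flags something).
z = (df["score"] - df["score"].mean()) / df["score"].std()
print(df[z.abs() > 1.5])

# Associations: correlation matrix of the numeric variables.
print(df[["age", "score"]].corr())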
During the homogeneity analysis, one inspects the variances of the items and the scales, the Cronbach's α of the scales, and the change in Cronbach's α when an item would be deleted from a scale.[5]

Initial transformations

After assessing the quality of the data and of the measurements, one might decide to impute missing data or to perform initial transformations of one or more variables, although this can also be done during the main analysis phase.[6] Possible transformations of variables are:[7]

Square root transformation (if the distribution differs moderately from normal)
Log transformation (if the distribution differs substantially from normal)
Inverse transformation (if the distribution differs severely from normal)
Make categorical (ordinal / dichotomous) (if the distribution differs severely from normal, and no transformation helps)

Did the implementation of the study fulfill the intentions of the research design?

One should check the success of the randomization procedure, for instance by checking whether background and substantive variables are equally distributed within and across groups. If the study did not need or use a randomization procedure, one should check the success of the non-random sampling, for instance by checking whether all subgroups of the population of interest are represented in the sample. Other possible data distortions that should be checked are:

Dropout (this should be identified during the initial data analysis phase)
Item nonresponse (whether this is random or not should be assessed during the initial data analysis phase)
Treatment quality (using manipulation checks).[8]

Characteristics of data sample

In any report or article, the structure of the sample must be accurately described. It is especially important to determine the structure of the sample exactly (and specifically the size of the subgroups) when subgroup analyses will be performed during the main analysis phase. The characteristics of the data sample can be assessed by looking at:

Basic statistics of important variables
Scatter plots
Correlations
Cross-tabulations

4. Main data analysis

In the main analysis phase, analyses aimed at answering the research question are performed, as well as any other relevant analysis needed to write the first draft of the research report.[13]

Exploratory and confirmatory approaches

In the main analysis phase either an exploratory or a confirmatory approach can be adopted. Usually the approach is decided before data is collected. In an exploratory analysis no clear hypothesis is stated before analysing the data, and the data is searched for models that describe the data well. In a confirmatory analysis clear hypotheses about the data are tested. Exploratory data analysis should be interpreted carefully. When testing multiple models at once there is a high chance of finding at least one of them to be significant, but this can be due to a type 1 error. It is important to always adjust the significance level when testing multiple models, for example with a Bonferroni correction (sketched below). Also, one should not follow up an exploratory analysis with a confirmatory analysis in the same dataset. An exploratory analysis is used to find ideas for a theory, but not to test that theory as well. When a model is found exploratorily in a dataset, following up that analysis with a confirmatory analysis in the same dataset could simply mean that the results of the confirmatory analysis are due to the same type 1 error that produced the exploratory model in the first place.
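Because the type 1 error argument above hinges on multiple testing, here is a small sketch of the Bonferroni adjustment mentioned earlier; the p-values are hypothetical.

# Hypothetical p-values from testing several models on the same data.
p_values = [0.04, 0.008, 0.03, 0.20]
alpha = 0.05

# Bonferroni: divide the significance level by the number of tests
# (equivalently, multiply each p-value by the number of tests).
adjusted_alpha = alpha / len(p_values)
for i, p in enumerate(p_values):
    significant = p < adjusted_alpha
    print(f"model {i}: p={p:.3f}, significant after correction: {significant}")
# Only the model with p=0.008 survives the corrected threshold of 0.0125.

Statistical packages such as statsmodels offer the same adjustment, along with less conservative alternatives such as Holm's method, as ready-made functions.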
Such a confirmatory analysis therefore will not be more informative than the original exploratory analysis.[14]

Stability of results

It is important to obtain some indication of how generalizable the results are.[15] While this is hard to check, one can look at the stability of the results: are the results reliable and reproducible? There are two main ways of doing this:

Cross-validation: by splitting the data into multiple parts, we can check whether an analysis (like a fitted model) based on one part of the data generalizes to another part of the data as well.

Sensitivity analysis: a procedure to study the behavior of a system or model when global parameters are (systematically) varied. One way to do this is with bootstrapping.

Statistical methods

Many statistical methods have been used for statistical analyses. A very brief list of four of the more popular methods is:

General linear model: a widely used model on which various statistical methods are based (e.g. t test, ANOVA, ANCOVA, MANOVA). Usable for assessing the effect of several predictors on one or more continuous dependent variables.

Generalized linear model: an extension of the general linear model for discrete dependent variables.

Structural equation modelling: usable for assessing latent structures from measured manifest variables.

Item response theory: models for (mostly) assessing one latent variable from several binary measured variables (e.g. an exam).

Free software for data analysis

ROOT - C++ data analysis framework developed at CERN
PAW - FORTRAN/C data analysis framework developed at CERN
JHepWork - Java (multi-platform) data analysis framework developed at ANL
KNIME - the Konstanz Information Miner, a user-friendly and comprehensive data analytics framework
Data Applied - an online data mining and data visualization solution
R - a programming language and software environment for statistical computing and graphics
DevInfo - a database system endorsed by the United Nations Development Group for monitoring and analyzing human development
Zeptoscope Basic[16] - an interactive Java-based plotter developed at Nanomix

Nuclear and particle physics

In nuclear and particle physics the data usually originate from the experimental apparatus via a data acquisition system. They are then processed, in a step usually called data reduction, to apply calibrations and to extract physically significant information. Data reduction is most often, especially in large particle physics experiments, an automatic, batch-mode operation carried out by software written ad hoc. The resulting data n-tuples are then scrutinized by the physicists, using specialized software tools like ROOT or PAW, comparing the results of the experiment with theory. The theoretical models are often difficult to compare directly with the results of the experiments, so they are used instead as input for Monte Carlo simulation software like Geant4 to predict the response of the detector to a given theoretical event, producing simulated events which are then compared to experimental data.
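As a small illustration of the data-reduction step just described, the following numpy sketch applies a linear calibration to raw ADC counts, keeps only the events above a threshold, and bins the result into a histogram of the kind that would later be compared with simulation. The calibration constants, threshold and random input data are invented for the example.

import numpy as np

# Raw ADC counts as they might come from the data acquisition system
# (random numbers stand in for a real event sample here).
rng = np.random.default_rng(0)
raw_counts = rng.integers(0, 4096, size=10_000)

# Hypothetical calibration: convert ADC counts to an energy in MeV.
gain, offset = 0.25, 3.0            # MeV per count, pedestal offset
energy = gain * raw_counts + offset

# Extract the physically significant information: keep events above a threshold.
threshold = 100.0                   # MeV, invented for the example
selected = energy[energy > threshold]

# Reduce the selected events to a histogram that can later be compared
# with Monte Carlo predictions of the detector response.
counts, bin_edges = np.histogram(selected, bins=50, range=(0.0, 1100.0))
print(f"{selected.size} of {energy.size} events pass the selection")
print("first few histogram bins:", counts[:5])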