VISUAL PROGRAMMING
5 MARK
1.Visual programming language
In computing, a visual programming language (VPL) is any programming language that lets users
create programs by manipulating program elements graphically rather than by specifying them textually.
A VPL allows programming with visual expressions, spatial arrangements of text and graphic symbols,
used either as elements of syntax or secondary notation. For example, many VPLs (known
as dataflow or diagrammatic programming)[1] are based on the idea of "boxes and arrows", where boxes
or other screen objects are treated as entities, connected by arrows, lines or arcs which represent
relations.
VPLs may be further classified, according to the type and extent of visual expression used, into icon-based
languages, form-based languages, and diagram languages. Visual programming environments provide
graphical or iconic elements which can be manipulated by users in an interactive way according to some
specific spatial grammar for program construction.
A visually transformed language is a non-visual language with a superimposed visual representation.
Naturally visual languages have an inherent visual expression for which there is no obvious textual
equivalent.
Current developments try to integrate the visual programming approach with dataflow
programming languages, either to provide immediate access to the program state (resulting in online
debugging) or to enable automatic program generation and documentation (i.e. visual paradigm). Dataflow
languages also allow automatic parallelization, which is likely to become one of the greatest programming
challenges of the future.[2]
An instructive counterexample for visual programming languages is Microsoft Visual Studio. The
languages it encompasses (Visual Basic, Visual C#, Visual J#, etc.) are commonly mistaken for visual
programming languages, but they are not: all of these languages are textual, not graphical. MS Visual
Studio is a visual programming environment, but not a visual programming language, hence the confusion.
2.Data acquisition
Source
Data acquisition begins with the physical phenomenon or physical property to be measured. Examples of
this include temperature, light intensity, gas pressure, fluid flow, and force. Regardless of the type of
physical property to be measured, the physical state that is to be measured must first be transformed into
a unified form that can be sampled by a data acquisition system. The task of performing such
transformations falls on devices called sensors.
DAQ hardware
DAQ hardware is what usually interfaces between the signal and a PC [1]. It could be in the form of
modules that can be connected to the computer's ports (parallel, serial, USB, etc.) or cards connected to
slots (S-100 bus, AppleBus, ISA, MCA, PCI, PCI-E, etc.) in the motherboard. Usually the space on the
back of a PCI card is too small for all the connections needed, so an external breakout box is required.
The cable between this box and the PC can be expensive due to the many wires, and the required
shielding.
DAQ cards often contain multiple components (multiplexer, ADC, DAC, TTL-IO, high-speed timers, RAM).
These are accessible via a bus by a microcontroller, which can run small programs. A controller is more
flexible than hard-wired logic, yet cheaper than a CPU, so it is acceptable to block it with simple polling
loops. For example: waiting for a trigger, starting the ADC, reading the time, waiting for the ADC to finish,
moving the value to RAM, switching the multiplexer, reading the TTL input, and letting the DAC proceed
with a voltage ramp.
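Such a polling loop can be sketched as follows. This is a minimal illustration in Python with a simulated hardware object; the method names are hypothetical stand-ins for the register-level routines a real controller would use, not the API of any actual DAQ card.

```python
# Minimal sketch of a DAQ controller polling loop. The "hardware" here is a
# simulated stand-in; the method names are invented for illustration.
import random
import time

class SimulatedHardware:
    def wait_for_trigger(self): time.sleep(0.01)        # pretend a trigger arrived
    def read_timer(self): return time.monotonic()
    def start_adc(self): self._t0 = time.monotonic()
    def adc_done(self): return time.monotonic() - self._t0 > 0.001
    def read_adc(self): return random.randint(0, 4095)  # simulated 12-bit sample
    def switch_multiplexer(self): pass
    def read_ttl(self): return 0b0000
    def step_dac_ramp(self): pass

def acquisition_cycle(hw, ram):
    hw.wait_for_trigger()                    # wait for a trigger
    timestamp = hw.read_timer()              # look up the time
    hw.start_adc()                           # start the ADC
    while not hw.adc_done():                 # poll until the conversion finishes
        pass
    ram.append((timestamp, hw.read_adc()))   # move the value to RAM
    hw.switch_multiplexer()                  # switch the multiplexer
    ttl = hw.read_ttl()                      # get the TTL input
    hw.step_dac_ramp()                       # let the DAC proceed with its voltage ramp
    return ttl

ram = []
hw = SimulatedHardware()
for _ in range(3):
    acquisition_cycle(hw, ram)
print(ram)
```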
DAQ software
DAQ software is needed in order for the DAQ hardware to work with a PC. The device driver performs
low-level register writes and reads on the hardware, while exposing a standard API for developing user
applications. A standard API such as COMEDI allows the same user applications to run on different
operating systems, e.g. a user application that runs on Windows will also run on Linux.
3.Dataflow programming
Dataflow programming focuses on how things connect, unlike imperative programming, which focuses
on how things happen. In imperative programming a program is modeled as a series of operations (things
that "happen"); the flow of data between these operations is of secondary concern to the behavior of the
operations themselves. However, dataflow programming models programs as a series of (sometimes
interdependent) connections, with the operations between these connections being of secondary
importance.
One of the key concepts in computer programming is the idea of "state", essentially a snapshot of the
measure of various conditions in the system. Most programming languages require a considerable
amount of state information in order to operate properly, information which is generally hidden from the
programmer. For a real world example, consider a three-way light switch. Typically a switch turns on a
light by moving it to the "on" position, but in a three-way case that may turn the light back off — the result
is based on the state of the other switch, which is likely out of view.
In fact, the state is often hidden from the computer itself as well, which normally has no idea
that this piece of information encodes state, while that is temporary and will soon be discarded. This is a
serious problem, as the state information needs to be shared across multiple processors in parallel
processing machines. Without knowing which state is important and which isn't, most languages force the
programmer to add a considerable amount of extra code to indicate which data and parts of the code are
important in this respect.
This code tends to be both expensive in terms of performance, as well as difficult to debug and often
downright ugly; most programmers simply ignore the problem. Those that cannot must pay a heavy
performance cost, which is paid even in the most common case when the program runs on one
processor. Explicit parallelism is one of the main reasons for the poor performance of Enterprise Java
Beans when building data-intensive, non-OLTP applications.
Dataflow languages promote the data to become the main concept behind any program. It may be
considered odd that this is not always the case, as programs generally take in data, process it, and then
feed it back out. This was especially true of older programs, and is well represented in the Unix operating
system which pipes the data between small single-purpose tools. Programs in a dataflow language start
with an input, perhaps the command line parameters, and illustrate how that data is used and modified.
The data is now explicit, often illustrated physically on the screen as a line or pipe showing where the
information flows.
Operations consist of "black boxes" with inputs and outputs, all of which are always explicitly defined.
They run as soon as all of their inputs become valid, as opposed to when the program encounters them.
Whereas a traditional program essentially consists of a series of statements saying "do this, now do this",
a dataflow program is more like a series of workers on an assembly line, who will do their assigned task
as soon as the materials arrive. This is why dataflow languages are inherently parallel; the operations
have no hidden state to keep track of, and the operations are all "ready" at the same time.
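As a minimal sketch (not any particular dataflow language or runtime), this "black box" behaviour can be illustrated in a few lines of Python: a node stores the values arriving on its named inputs and fires as soon as every input has become valid.

```python
# Minimal sketch of a dataflow "black box": an operation with named
# inputs and outputs that runs as soon as all of its inputs are valid.

class Node:
    def __init__(self, name, inputs, func):
        self.name = name
        self.func = func                        # the operation itself
        self.values = {port: None for port in inputs}
        self.ready = {port: False for port in inputs}
        self.outputs = []                       # downstream (node, port) pairs

    def connect(self, target, port):
        self.outputs.append((target, port))     # draw an "arrow" to another box

    def receive(self, port, value):
        self.values[port] = value
        self.ready[port] = True
        if all(self.ready.values()):            # fire only when every input has arrived
            result = self.func(**self.values)
            for target, target_port in self.outputs:
                target.receive(target_port, result)

# Usage: add two numbers, then double the sum.
adder = Node("add", ["a", "b"], lambda a, b: a + b)
doubler = Node("double", ["x"], lambda x: print(2 * x))
adder.connect(doubler, "x")
adder.receive("a", 3)
adder.receive("b", 4)   # prints 14 once both inputs are valid
```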
Dataflow programs are generally represented very differently inside the computer as well. A traditional
program is just what it seems, a series of instructions that run one after the other. A dataflow program
might be implemented as a hash table instead, with uniquely identified inputs as the keys, used to look up
pointers to the instructions. When any operation completes, the program scans down the list of operations
until it finds the first operation where all of the inputs are currently valid, and runs it. When that operation
finishes it will typically put data into one or more outputs, thereby making some other operation become
valid.
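The table-driven representation described above can be sketched in the same spirit. In this deliberately naive illustration, each operation is stored together with the named inputs it needs; after every step the scheduler scans for the first operation whose inputs are all valid and runs it.

```python
# Naive sketch of a dataflow scheduler: operations keyed by the inputs
# they need, run whenever all of those inputs hold valid data.

values = {}        # uniquely identified inputs -> current data
operations = [     # (needed inputs, produced output, function)
    (("a", "b"), "sum",    lambda a, b: a + b),
    (("sum",),   "double", lambda s: 2 * s),
]
done = set()

def step():
    """Run the first not-yet-run operation whose inputs are all valid."""
    for i, (needs, produces, func) in enumerate(operations):
        if i not in done and all(n in values for n in needs):
            values[produces] = func(*(values[n] for n in needs))
            done.add(i)
            return True
    return False       # nothing is runnable

values["a"], values["b"] = 3, 4     # feeding the inputs makes operations valid
while step():
    pass
print(values["double"])             # -> 14
```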
20 MARKS:
1.Visual Language
Visual units in the form of lines and marks are constructed into meaningful shapes and structures or
signs. Different areas of the cortex respond to different elements such as colour and form. Semir
Zeki[2] has shown the responses in the brain to the paintings
of Michelangelo, Rembrandt, Vermeer, Magritte, Malevich and Picasso.
Imaging in the mind
What we have in our minds in a waking state and what we imagine in dreams is very much of the same
nature.[3] Dream images might be with or without spoken words, other sounds or colours. In the waking
state there is usually, in the foreground, the buzz of immediate perception, feeling and mood, as well as
fleeting memory images.[4] In a mental state between dreaming and being fully awake is a state known as
'day dreaming' or a meditative state, during which "the things we see in the sky when the clouds are drifting,
the centaurs and stags, antelopes and wolves" are projected from the imagination.[5] Rudolf
Arnheim[6] has attempted to answer the question: what does a mental image look like? In Greek
philosophy, the School of Leucippus and Democritus believed that a replica of an object enters the eye
and remains in the soul as a memory as a complete image. Berkeley explained that parts, for example a
leg rather than the complete body, appear in the mind. Arnheim considers the psychologist, Edward B.
Titchener's account to be the breakthrough in understanding something of how the vague incomplete
quality of the image is 'impressionistic' and carries meaning as well as form.
Meaning and expression
Abstract art has shown that the qualities of line and shape, proportion and colour convey meaning directly
without the use of words or pictorial representation. Wassily Kandinsky in Point and Line to Plane showed
how drawn lines and marks can be expressive without any association with a representational
image.[7] Throughout history and especially in ancient cultures visual language has been used to encode
meaning " The Bronze Age Badger Stone on Ilkly Moor is covered in circles, lines, hollow cups, winged
figures, a spread hand, an ancient swastika, an embryo, a shooting star? … It's a story-telling rock, a
message from a world before (written) words."[8]Richard Gregory suggests that, "Perhaps the ability to
respond to absent imaginary situations," as our early ancestors did with paintings on rock, "represents an
Perception
The sense of sight operates selectively. Perception is not a passive recording of all that is in front of the
eyes, but is a continuous judgement of scale and colour relationships,[10] and includes making categories
of forms to classify images and shapes in the world.[11] Children of six to twelve months are able,
through experience and learning, to discriminate between circles, squares and triangles. The child from
this age onwards learns to classify objects, abstracting essential qualities and comparing them to other
similar objects. Before objects can be perceived and identified the child must be able to classify the
different shapes and sizes which a single object may appear to have when it is seen in varying
surroundings and from different aspects.[12]
Innate structures in the brain
The perception of a shape requires the grasping of the essential structural features, to produce a "whole"
or gestalt. The theory of the gestalt was proposed by Christian von Ehrenfels in 1890. He pointed out that
a melody is still recognisable when played in different keys and argued that the whole is not simply the
sum of its parts but a total structure. Max Wertheimer researched von Ehrenfels' idea, and in his "Theory
of Form" (1923) – nicknamed "the dot essay" because it was illustrated with abstract patterns of dots and
lines – he concluded that the perceiving eye tends to bring together elements that look alike (similarity
groupings) and will complete an incomplete form (object hypothesis). An array of random dots tends to
form configurations (constellations).[13] All these innate abilities demonstrate how the eye and the mind
are seeking pattern and simple whole shapes. When we look at more complex visual images such as
paintings we can see that art has been a continuous attempt to "notate" visual information.
Visual thinking
Thought processes are diffused and interconnected and are cognitive at a sensory level. The mind thinks
at its deepest level in sense material, and the two hemispheres of the brain deal with different kinds of
thought.[14]
The brain is divided into two hemispheres, and a thick bundle of nerve fibres enables these two halves to
communicate with each other. In most people the ability to organize and produce speech is predominantly
located in the left side. Appreciating spatial perceptions depends more on the right hemisphere, although
there is a left hemisphere contribution.[15]
In an attempt to understand how designers solve problems, L. Bruce Archer proposed "that the way
designers (and everybody else, for that matter) form images in their mind's eye, manipulating and
evaluating ideas before, during and after externalising them, constitutes a cognitive system comparable
with but different from, the verbal language system. Indeed we believe that human beings have an innate
capacity for cognitive modelling, and its expression through sketching, drawing, construction, acting out
and so on, that is fundamental to human thought."[16]
Graphicacy
The development of the visual aspect of language communication has been referred to
as graphicacy,[17] as a parallel discipline to literacy and numeracy. Michael Twyman[18] has pointed out
that the ability to handle ideas visually, which includes both the understanding and conception of them,
should not be confused with the specific talents of an artist. The artist is one very special kind of visual
manipulator whose motives are varied and often complex. The ability to think and communicate in visual
terms is part of the learning process, and is of equal importance to literacy and numeracy.
2.Related fields of data visualization
Data acquisition
Data acquisition is the sampling of the real world to generate data that can be manipulated by a
computer. Sometimes abbreviated DAQ or DAS, data acquisition typically involves acquisition of signals
and waveforms and processing the signals to obtain desired information. The components of data
acquisition systems include appropriate sensors that convert any measurement parameter to an electrical
signal, which is acquired by data acquisition hardware.
Data analysis
Data analysis is the process of studying and summarizing data with the intent to extract
useful information and develop conclusions. Data analysis is closely related to data mining, but data
mining tends to focus on larger data sets, with less emphasis on making inference, and often uses data
that was originally collected for a different purpose. In statistical applications, some people divide data
analysis into descriptive statistics, exploratory data analysis, and inferential statistics (or confirmatory data
analysis), where the EDA focuses on discovering new features in the data, and CDA on confirming or
falsifying existing hypotheses.
Types of data analysis are:
 Exploratory data analysis (EDA): an approach to analyzing data for the purpose of formulating hypotheses worth testing, complementing the tools of conventional statistics for testing hypotheses. It was so named by John Tukey.
 Qualitative data analysis (QDA) or qualitative research: the analysis of non-numerical data, for example words, photographs, observations, etc.
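As a small, hypothetical illustration of the difference between the exploratory and confirmatory approaches, the sketch below first browses a data set without a fixed hypothesis and then tests one pre-stated hypothesis; the file and column names are invented.

```python
# Minimal sketch contrasting exploratory and confirmatory data analysis.
# Data set and column names are hypothetical, for illustration only.
import pandas as pd
from scipy import stats

df = pd.read_csv("survey.csv")

# Exploratory (EDA): browse the data without a fixed hypothesis,
# looking for features worth turning into hypotheses.
print(df.describe())                         # summary statistics
print(df.corr(numeric_only=True))            # which variables move together?

# Confirmatory (CDA): test one hypothesis stated in advance,
# e.g. "age and income are positively correlated".
sub = df[["age", "income"]].dropna()
r, p = stats.pearsonr(sub["age"], sub["income"])
print(f"r = {r:.2f}, p = {p:.3f}")
```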
Data governance
Data governance encompasses the people, processes and technology required to create a consistent,
enterprise view of an organisation's data in order to:
 Increase consistency & confidence in decision making
 Decrease the risk of regulatory fines
 Improve data security
 Maximize the income generation potential of data
 Designate accountability for information quality
Data management
Data management comprises all the academic disciplines related to managing data as a valuable
resource. The official definition provided by DAMA is that "Data Resource Management is the
development and execution of architectures, policies, practices, and procedures that properly manage the
full data lifecycle needs of an enterprise." This definition is fairly broad and encompasses a number of
professions that may not have direct technical contact with lower-level aspects of data management, such
as relational database management.
Data mining
Data mining is the process of sorting through large amounts of data and picking out relevant information.
It is usually used by business intelligence organizations, and financial analysts, but is increasingly being
used in the sciences to extract information from the enormous data sets generated by modern
experimental and observational methods.
It has been described as "the nontrivial extraction of implicit, previously unknown, and potentially
useful information from data"[7] and "the science of extracting useful information from large data
sets or databases."[8] In relation to enterprise resource planning, according to Monk (2006), data mining is
"the statistical and logical analysis of large sets of transaction data, looking for patterns that can aid
decision making".[9]
Data transforms
Data transformation is the process of converting data, both real-time and offline, from one format to
another. Standards and protocols provide the specifications and rules, and the transformation usually
occurs in a processing pipeline for aggregation, consolidation or interoperability. The primary users are
integration-systems organizations and compliance personnel.
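As a concrete illustration of a simple format transformation, the sketch below converts records from CSV to JSON using only the Python standard library; the file and field names are invented for the example.

```python
# Minimal sketch of a data transform: CSV records converted to JSON.
# File names and columns are hypothetical, for illustration only.
import csv
import json

def csv_to_json(csv_path, json_path):
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))      # each row becomes a dict keyed by header
    with open(json_path, "w") as f:
        json.dump(rows, f, indent=2)        # same records, different representation

csv_to_json("measurements.csv", "measurements.json")
```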
3.Initial data analysis
The most important distinction between the initial data analysis phase and the main analysis phase is
that during initial data analysis one refrains from any analysis that is aimed at answering the original
research question. The initial data analysis phase is guided by the following four questions:[3]
Quality of data
The quality of the data should be checked as early as possible. Data quality can be assessed in several
ways, using different types of analyses: frequency counts, descriptive statistics (mean, standard
deviation, median), normality (skewness, kurtosis, frequency histograms, normal probability plots),
associations (correlations, scatter plots).
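A minimal sketch of such checks, assuming the data set is available as a pandas DataFrame (the file and column names are invented for illustration):

```python
# Minimal sketch of initial data-quality checks on a pandas DataFrame.
# File and column names are hypothetical; adapt to the data set at hand.
import pandas as pd
from scipy import stats

df = pd.read_csv("survey.csv")                 # hypothetical data set

print(df["age"].value_counts())                # frequency counts
print(df.describe())                           # mean, standard deviation, median, ...
print("skewness:", stats.skew(df["income"].dropna()))
print("kurtosis:", stats.kurtosis(df["income"].dropna()))
df.hist(column="income")                       # frequency histogram (requires matplotlib)
print(df.corr(numeric_only=True))              # associations between numeric variables
```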
Other initial data quality checks are:
 Checks on data cleaning: have decisions influenced the distribution of the variables? The distribution of the variables before data cleaning is compared to the distribution of the variables after data cleaning to see whether data cleaning has had unwanted effects on the data.
 Analysis of missing observations: are there many missing values, and are the values missing at random? The missing observations in the data are analyzed to see whether more than 25% of the values are missing, whether they are missing at random (MAR), and whether some form of imputation is needed (see the sketch after this list).
 Analysis of extreme observations: outlying observations in the data are analyzed to see if they seem to disturb the distribution.
 Comparison and correction of differences in coding schemes: variables are compared with coding schemes of variables external to the data set, and possibly corrected if coding schemes are not comparable.
 Test for common-method variance.
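A minimal sketch of the missing-observation and extreme-observation checks mentioned above, again with invented file and column names and an illustrative outlier rule:

```python
# Minimal sketch of missing-observation and extreme-observation checks.
# Thresholds and column names are illustrative assumptions, not fixed rules.
import pandas as pd

df = pd.read_csv("survey.csv")                       # hypothetical data set

missing_fraction = df["income"].isna().mean()
if missing_fraction > 0.25:                          # the 25% guideline mentioned above
    print("consider imputation; missing fraction:", missing_fraction)

z = (df["income"] - df["income"].mean()) / df["income"].std()
outliers = df[z.abs() > 3]                           # flag observations far from the mean
print(len(outliers), "potentially extreme observations")
```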
The choice of analyses to assess the data quality during the initial data analysis phase depends on the
analyses that will be conducted in the main analysis phase.[4]
Quality of measurements
The quality of the measurement instruments should only be checked during the initial data analysis phase
when this is not the focus or research question of the study. One should check whether the structure of the
measurement instruments corresponds to the structure reported in the literature.
There are two ways to assess measurement quality:
 Confirmatory factor analysis
 Analysis of homogeneity (internal consistency), which gives an indication of the reliability of a measurement instrument. During this analysis, one inspects the variances of the items and the scales, the Cronbach's α of the scales, and the change in the Cronbach's α when an item would be deleted from a scale.[5]
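Cronbach's α can be computed directly from the item variances and the variance of the total score: α = k/(k−1) · (1 − Σ item variances / variance of the sum). The sketch below assumes the columns of a pandas DataFrame are the items of one scale; the file and item names are invented.

```python
# Minimal sketch: Cronbach's alpha for one scale, computed from item scores.
# The DataFrame columns are assumed to be the items of the scale.
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    k = items.shape[1]                         # number of items in the scale
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

scale = pd.read_csv("questionnaire.csv")[["q1", "q2", "q3", "q4"]]  # hypothetical items
print("alpha:", cronbach_alpha(scale))
print("alpha if q1 deleted:", cronbach_alpha(scale.drop(columns="q1")))
```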
Initial transformations
After assessing the quality of the data and of the measurements, one might decide to impute missing
data, or to perform initial transformations of one or more variables, although this can also be done during
the main analysis phase.[6]
Possible transformations of variables are:[7]
 Square root transformation (if the distribution differs moderately from normal)
 Log transformation (if the distribution differs substantially from normal)
 Inverse transformation (if the distribution differs severely from normal)
 Make categorical (ordinal / dichotomous) (if the distribution differs severely from normal, and no transformations help)
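A minimal numpy/pandas sketch of these four options, applied to a hypothetical positively skewed variable (the shift constants and category cut points are illustrative assumptions):

```python
# Minimal sketch of common initial transformations of a skewed variable.
# File name, column name and category cut points are illustrative assumptions.
import numpy as np
import pandas as pd

df = pd.read_csv("survey.csv")                      # hypothetical data set
x = df["income"]

df["income_sqrt"] = np.sqrt(x)                      # moderate departure from normality
df["income_log"] = np.log(x + 1)                    # substantial departure (shifted to avoid log(0))
df["income_inv"] = 1.0 / (x + 1)                    # severe departure (shifted to avoid division by zero)
df["income_cat"] = pd.cut(x, bins=[-np.inf, 2e4, 5e4, np.inf],
                          labels=["low", "medium", "high"])  # fallback: make it ordinal
```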
Did the implementation of the study fulfill the intentions of the research
design?
One should check the success of the randomization procedure, for instance by checking whether
background and substantive variables are equally distributed within and across groups.
If the study did not need and/or use a randomization procedure, one should check the success of the
non-random sampling, for instance by checking whether all subgroups of the population of interest are
represented in the sample.
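One simple way to check a randomization, sketched below, is to compare a background variable between the groups; here a Welch t-test from scipy is used, with the file and column names invented for illustration.

```python
# Minimal sketch of a randomization check: is a background variable
# (here, age) distributed similarly across the two groups?
import pandas as pd
from scipy import stats

df = pd.read_csv("trial.csv")                     # hypothetical study data
treated = df[df["group"] == "treatment"]["age"]
control = df[df["group"] == "control"]["age"]

t, p = stats.ttest_ind(treated, control, equal_var=False)
print(f"t = {t:.2f}, p = {p:.3f}")                # a small p-value suggests imbalance
```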
Other possible data distortions that should be checked are:
 Dropout (this should be identified during the initial data analysis phase)
 Item nonresponse (whether this is random or not should be assessed during the initial data analysis phase)
 Treatment quality (using manipulation checks).[8]
Characteristics of data sample
In any report or article, the structure of the sample must be accurately described. It is especially important
to exactly determine the structure of the sample (and specifically the size of the subgroups) when
subgroup analyses will be performed during the main analysis phase.
The characteristics of the data sample can be assessed by looking at:
 Basic statistics of important variables
 Scatter plots
 Correlations
 Cross-tabulations
4.Main data analysis
In the main analysis phase, analyses aimed at answering the research question are performed, as well as
any other relevant analyses needed to write the first draft of the research report.[13]
Exploratory and confirmatory approaches
In the main analysis phase either an exploratory or confirmatory approach can be adopted. Usually the
approach is decided before data is collected. In an exploratory analysis no clear hypothesis is stated
before analysing the data, and the data is searched for models that describe the data well. In a
confirmatory analysis clear hypotheses about the data are tested.
Exploratory data analysis should be interpreted carefully. When testing multiple models at once there is a
high chance of finding at least one of them to be significant, but this can be due to a Type I error. It is
important to always adjust the significance level when testing multiple models with, for example,
a Bonferroni correction. Also, one should not follow up an exploratory analysis with a confirmatory
analysis in the same dataset. An exploratory analysis is used to find ideas for a theory, but not to test that
theory as well. When a model is found through exploratory analysis in a dataset, then following up that
analysis with a confirmatory analysis in the same dataset could simply mean that the results of the
confirmatory analysis are due to the same Type I error that resulted in the exploratory model in the first
place. The confirmatory analysis therefore will not be more informative than the original exploratory
analysis.[14]
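A minimal sketch of a Bonferroni adjustment when several models (or hypotheses) are tested on the same data; the p-values below are made up for illustration:

```python
# Minimal sketch of a Bonferroni correction for multiple tests.
# The p-values below are invented for illustration.
p_values = [0.012, 0.030, 0.048, 0.210]      # one p-value per tested model
alpha = 0.05
adjusted_alpha = alpha / len(p_values)       # Bonferroni: divide by the number of tests

for i, p in enumerate(p_values):
    verdict = "significant" if p < adjusted_alpha else "not significant"
    print(f"model {i}: p = {p:.3f} -> {verdict} at adjusted alpha = {adjusted_alpha:.4f}")
```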
Stability of results
It is important to obtain some indication of how generalizable the results are.[15] While this is hard to
check, one can look at the stability of the results. Are the results reliable and reproducible? There are two
main ways of doing this:
 Cross-validation: by splitting the data into multiple parts, we can check whether analyses (like a fitted model) based on one part of the data generalize to another part of the data as well.
 Sensitivity analysis: a procedure to study the behavior of a system or model when global parameters are (systematically) varied. One way to do this is with bootstrapping.
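Both ideas can be sketched with numpy: a simple two-part split to see whether a statistic computed on one half of the data roughly agrees with the other half, and a bootstrap of the same statistic. The data here are simulated for illustration.

```python
# Minimal sketch of stability checks: a two-part split and a bootstrap.
# The data are simulated; in practice they would come from the study.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=10.0, scale=2.0, size=200)   # stand-in for a real variable

# Cross-validation flavour: does a statistic computed on one half
# roughly agree with the same statistic on the other half?
half_a, half_b = data[:100], data[100:]
print("mean on part A:", half_a.mean(), "mean on part B:", half_b.mean())

# Bootstrap flavour: resample with replacement to see how much the
# statistic varies across resamples.
boot_means = [rng.choice(data, size=len(data), replace=True).mean()
              for _ in range(1000)]
print("bootstrap 95% interval:", np.percentile(boot_means, [2.5, 97.5]))
```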
Statistical methods
Many statistical methods have been used for statistical analyses. A very brief list of four of the more
popular methods is:
 General linear model: a widely used model on which various statistical methods are based (e.g. t-test, ANOVA, ANCOVA, MANOVA). Usable for assessing the effect of several predictors on one or more continuous dependent variables.
 Generalized linear model: an extension of the general linear model for discrete dependent variables.
 Structural equation modelling: usable for assessing latent structures from measured manifest variables.
 Item response theory: models for (mostly) assessing one latent variable from several binary measured variables (e.g. an exam).
Free software for data analysis
 ROOT - C++ data analysis framework developed at CERN
 PAW - FORTRAN/C data analysis framework developed at CERN
 JHepWork - Java (multi-platform) data analysis framework developed at ANL
 KNIME - the Konstanz Information Miner, a user-friendly and comprehensive data analytics framework
 Data Applied - an online data mining and data visualization solution
 R - a programming language and software environment for statistical computing and graphics
 DevInfo - a database system endorsed by the United Nations Development Group for monitoring and analyzing human development
 Zeptoscope Basic[16] - an interactive Java-based plotter developed at Nanomix
Nuclear and particle physics
In nuclear and particle physics the data usually originate from the experimental apparatus via a data
acquisition system. It is then processed, in a step usually called data reduction, to apply calibrations and
to extract physically significant information. Data reduction is most often, especially in large particle
physics experiments, an automatic, batch-mode operation carried out by software written ad-hoc. The
resulting data n-tuples are then scrutinized by the physicists, using specialized software tools
like ROOT or PAW, comparing the results of the experiment with theory.
The theoretical models are often difficult to compare directly with the results of the experiments, so they
are used instead as input for Monte Carlo simulation software like Geant4 to predict the response of the
detector to a given theoretical event, producing simulated events which are then compared to
experimental data.