Machines Without Screens
Part of the Topics in Computing Series of Lectures
Dr. D. Fitzpatrick
Friday, 2 November 2007

In this Lecture:
• How some of our senses work
• Synthetic Speech and how it works
• Describing Maths using speech
• 3D audio, and Force Feedback

How the Eye Works
• Light rays enter the eye through the cornea.
• The cornea takes widely diverging rays of light and bends them through the pupil.
• The lens of the eye is located immediately behind the pupil. The purpose of the lens is to bring the light into focus upon the retina, the membrane containing photoreceptor nerve cells that lines the inside back wall of the eye.
• The photoreceptor nerve cells of the retina change the light rays into electrical impulses and send them through the optic nerve to the brain.

How People Read
• Not a linear progression
• Use of saccades and fixations
• Consequence? The eye tracks to highlighted or other key portions of the page.

How the Ear Works
• The outer ear, or pinna (plural pinnae), leads to the auditory canal, or meatus.
• The auditory canal terminates with the ear drum, or tympanic membrane.
• Beyond the ear drum lie the middle ear and the inner ear, the hidden parts of the ear encased in bone.

How the Ear Works II
• The semicircular canals are three liquid-filled passages that are associated with equilibrium rather than hearing.
• They tell us about the orientation of the head.
• They cause us to get dizzy when they are malfunctioning.
• They cause some of us to get seasick when the head, body and eyes undergo motional disturbances.
• The three little bones of the air-filled middle ear, which are attached to the eardrum, excite vibrations in the cochlea, the liquid-filled inner ear.

How we hear Sound
• In the cochlea the vibrations of sound are converted into nerve impulses, which travel along the auditory nerve toward the brain.
• The purpose of the auditory canal is to guide sound waves to the ear drum.
• The pinna acts as a collector of sound from the outside world, and also acts as a directional filter.
• The intensity of a sound wave in the auditory canal is proportional to the intensity of the sound wave that approaches the listener.

Sound Waves
• We are immersed in an ocean of air.
• The snapping of fingers, speaking, singing, plucking a string or blowing a horn sets up a vibration in the air.
• The sound wave travels outward from the source as a spherical wavefront
– It is a longitudinal wave
– In contrast, waves in a stretched string are transverse waves

How fast does the sound wave travel?
• If the air temperature is 20 degrees Celsius, a sound wave travels at a velocity of 344 metres (1,128 feet) per second.
• Sound travels in helium almost 3 times as fast as in air, and longitudinal sound waves can travel through metals and other solids far faster.

How Do We Hear?
• The sound waves that travel through the air cause components of our ears to vibrate in a manner similar to those of the sound source.
• What we hear grows weaker with distance from the source, because the area of the spherical wavefront increases as the square of the distance from the source, and the power of the source wave is spread over that increasing surface.
• What actually reaches our ears is complicated by reflections from the ground and other objects.

Role of Speech
• Primary mode of communication
• Conveys emotional content

Problems with synthetic speech
• Monotonous; basically uninflected speech
• Not possible to convey emotional content
• Consequence? Very boring...

What is speech?
• Speech can be decomposed into three primary components:
– frequency, amplitude and time.
• “Frequency is the term used to describe the vibration of air molecules caused by a vibrating object...which are set in motion by an egressive flow of air during phonation.” It is measured in Hertz (Hz).
• Speech is not as simple as other acoustic sounds: it can contain many elements vibrating at different frequencies.
• The frequency of repetition is referred to as the fundamental frequency, f0.

What is Speech? II
• Amplitude: the acoustic component which gives the perception of loudness.
– “the maximal displacement of a particle from its place of rest”
– measured in decibels
• Duration: the third component in the acoustic view.
– The measurement, along the time-line, of the speech signal

Introducing Prosody
• Simple description: inflection
• That set of features which lasts longer than a single speech sound.
• “. . . those auditory components of an utterance which remain, once segmental as well as non-linguistic as well as paralinguistic vocal effects have been removed”

What will it Sound Like?
• The aim is to discard the monotone
• E.g. if emboldened text is found:
1. Rate will slow (Duration)
2. Pitch range will increase (F0)
3. Volume will increase (Amplitude)
• Most structural and font information will be conveyed by prosody

Speaking Text Attributes
• Major headings are read as section x.
– Slower rate, lower average pitch, lower baseline fall
• Minor headings are read as x.x.
– Same slower rate, lower average pitch, lower baseline fall
• Emphasis: increase pitch range, increase accent height, minimise smoothness, maximise richness, and increase amplitude where possible.

Speaking Maths
• We intend to use prosodic changes to convey equations:
1. The prosodic system is already familiar
2. Prosody is capable of expressing mathematical material
3. All we have to do is... match the prosody to the maths!

Mathematical Prosody
• Equations resemble a tree when broken down
• Nested levels are conveyed by:
1. use of parentheses or brackets
2. juxtaposition of symbols; vertical & horizontal

Linearity
• Only the most simple math is linear
– a = b * c
– This is easy to represent in a linear fashion
– Unfortunately, math doesn’t stop here, though many wish it did!
– Now try a = b * c - d
– Still linear, well, sort of!
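
The point that an equation resembles a tree can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: it borrows Python's own parser (the standard `ast` module) and Python's precedence rules as a stand-in for ordinary mathematical convention, and the helper names `tree_lines` and `outline` are made up for this sketch.

```python
import ast


def tree_lines(node, depth=0):
    """Return an indented outline of an arithmetic expression's parse tree."""
    pad = "  " * depth
    if isinstance(node, ast.BinOp):
        # Operator first, then its left and right operands one level deeper.
        lines = [pad + type(node.op).__name__]
        lines += tree_lines(node.left, depth + 1)
        lines += tree_lines(node.right, depth + 1)
        return lines
    if isinstance(node, ast.Name):
        return [pad + node.id]
    raise ValueError("unsupported node: " + type(node).__name__)


def outline(expression):
    """Parse a Python-syntax expression and return its outline as a list of lines."""
    return tree_lines(ast.parse(expression, mode="eval").body)


# The same symbols in the same order, but two different trees.
print("\n".join(outline("b*c-d")))
print("\n".join(outline("b*(c-d)")))
```

In the first outline the subtraction sits at the root with the multiplication beneath it; in the second the roles are swapped. The flat string of symbols hides that difference, which is exactly the structure a spoken rendering has to convey.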
Linearity
• Using implicit hierarchy rules it would be understood to be a = (b * c) - d
• But what if we really wanted a = b * (c - d)?

Linearity
• These simple equations are still considered to be linear in nature
• But linearity has a very short half-life when learning math
• Math rapidly becomes two-dimensional
• Representing that non-visually becomes difficult

Linearity
• This relatively simple equation, $a = \sqrt{\frac{x^{2} - y}{z}}$, could be represented as a = sqrt(((x super 2 base) - y) / z)
• Essentially, using parentheses, we can represent ANY equation in a linear fashion, BUT???
• Speech is basically a linear representation

Designing a Browser?
• What is the goal of a Math Browser?
– To allow users to traverse an equation
  • In whole or in parts
  • Forwards or backwards
  • Upwards or downwards
  • Under user control
  • To convey structure and semantics

3D Audio
• Surround sound has amazed film watchers for years, immersing the audience in a full experience.
• However, surround sound systems do not provide a true 3D reconstruction of a setting.
• Surround sound locates sources within a single plane; a true reconstruction would include all possible planes, that is, the entire sound field.

3D Audio II
• Research has shown that the physical shape of the ears and head affects how we perceive sound.
• To record a sound field perfectly, microphones must either be placed in the listener's ears, or a model of the human head, complete with ear canals, can be built with microphones inside the ear cavities.
• These two audio channels can then be fed to headphones, creating a very life-like reproduction of the 3D sound environment.
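
Returning to the Linearity slides: the fully parenthesised form a = sqrt(((x super 2 base) - y) / z) can be produced mechanically from an equation tree. The sketch below is one possible way to do it, not the lecture's method. The nested-tuple tree format, the operator names and the function `linearise` are all assumptions made for illustration; only the output notation ("super ... base", explicit parentheses) is taken from the slide.

```python
def linearise(node, wrap=True):
    """Flatten a nested-tuple equation tree into a parenthesised linear string.

    A node is either a string (a symbol or number) or a tuple
    (operator, operand, ...). This tree format is hypothetical.
    """
    if isinstance(node, str):
        return node
    op, *args = node
    if op == "sqrt":
        (arg,) = args
        # The function-call parentheses already delimit the argument.
        return "sqrt(" + linearise(arg, wrap=False) + ")"
    if op == "super":
        base, exponent = args
        text = linearise(base) + " super " + linearise(exponent) + " base"
    elif op in {"+", "-", "*", "/"}:
        left, right = args
        text = linearise(left) + " " + op + " " + linearise(right)
    else:
        raise ValueError("unsupported operator: " + op)
    # Every nested sub-expression gets explicit parentheses.
    return "(" + text + ")" if wrap else text


# Tree for the square root of ((x squared minus y) divided by z).
equation = ("sqrt", ("/", ("-", ("super", "x", "2"), "y"), "z"))
print("a = " + linearise(equation))
```

Every level of nesting costs a pair of parentheses that a listener has to hold in memory, which is why the lecture looks to prosody rather than spoken brackets to convey the same structure.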