Machine Vision
• The goal of machine vision is to create a model of the real world from images.
– A machine vision system recovers useful information about a scene from its two-dimensional projections.
– The world is three-dimensional; its projections are two-dimensional digitized images.

Machine Vision (2)
• Knowledge about the objects (regions) in a scene and about the projection geometry is required.
• The information that is recovered differs depending on the application.
– Satellite images, medical images, etc.
• Processing takes place in stages:
– Enhancement, segmentation, image analysis, and matching (pattern recognition).

Machine Vision System
• Illumination → Scene → Image Acquisition → 2D Digital Image → Machine Vision System → Image Description (with feedback).
• The goal of a machine vision system is to compute a meaningful description of the scene (e.g., an object).

Machine Vision Stages
• Image acquisition (by cameras, scanners, etc.): analog-to-digital conversion.
• Image processing (image enhancement, image restoration): remove noise/patterns, improve contrast.
• Image segmentation: find regions (objects) in the image.
• Image analysis (binary image processing): take measurements of objects/relationships.
• Model matching (pattern recognition): match the above description with similar descriptions of known objects (models).

Image Processing
• Input: image → Output: image.
• Image transformation:
– Image enhancement (filtering, edge detection, surface detection, computation of depth).
– Image restoration (remove point/pattern degradation; a mathematical model of the degradation type exists, e.g., additive or multiplicative noise, sin/cos pattern degradation).

Image Segmentation
• Input: image → Output: regions/objects.
• Classify pixels into groups (regions/objects of interest) sharing common characteristics.
– Intensity/color, texture, motion, etc.
• Two types of techniques:
– Region segmentation: find the pixels of a region.
– Edge segmentation: find the pixels of its outline contour.
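Region segmentation by intensity, as described above, can be sketched as a fixed-threshold classification of pixels. This is a minimal illustration only; the function name and the flat row-major image layout are assumptions, not from the slides.

```c
/* Minimal sketch of region segmentation by intensity: classify each
 * pixel of an 8-bit grayscale image as object (255) or background (0)
 * by comparing it against a fixed threshold.  Assumes a flat,
 * row-major pixel array (hypothetical layout). */
void threshold_segment(const unsigned char *in, unsigned char *out,
                       int width, int height, unsigned char threshold)
{
    for (int i = 0; i < width * height; i++)
        out[i] = (in[i] >= threshold) ? 255 : 0;
}
```

Real systems typically choose the threshold from the image histogram rather than fixing it in advance.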
Image Analysis
• Input: segmented image (regions, objects) → Output: measurements.
• Take useful measurements from pixels, regions, spatial relationships, motion, etc.
– Gray-scale/color intensity values;
– Size, distance;
– Velocity.

Pattern Recognition
• Model matching: input image/regions (measurements or a structural description) → Output: class identifier.
• Classify an image (region) into one of a number of known classes.
– Statistical pattern recognition (the measurements form vectors, which are classified into classes);
– Structural pattern recognition (decompose the image into primitive structures).

Relationships to Other Fields
• Image Processing (IP)
• Pattern Recognition (PR)
• Computer Graphics (CG)
• Artificial Intelligence (AI)
• Neural Networks (NN)
• Psychophysics

Computer Graphics (CG)
• Machine vision is the analysis of images, while CG is the synthesis of images:
– CG generates images from geometric primitives (lines, circles, surfaces).
– Machine vision is the inverse: estimate the geometric primitives from an image.
• Visualization and virtual reality bring these two fields closer.

Machine Vision Applications
• Robotics
• Medicine
• Remote sensing
• Cartography
• Meteorology
• Quality inspection
• Reconnaissance

Machine Vision Systems
• There is no universal machine vision system.
– One system for each application.
• Typical assumptions:
– Good lighting;
– Low noise;
– 2D images.
• Passive vs. active environment:
– Changes in the environment call for different actions (e.g., turn left, push the brake, etc.).

Vision by Man and Machine
• What is the mechanism of human vision?
– Can a machine do the same thing?
– There are many studies;
– Most are empirical.
• Humans and machines have different
– software;
– hardware.

Human “Hardware”
• Photoreceptors take measurements of light signals.
– About 10⁸ photoreceptors.
• Retinal ganglion cells transmit electric and chemical signals to the brain.
– Complex 3D interconnections;
– What do the neurons do? In what sequence?
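The statistical pattern recognition idea mentioned above (measurement vectors assigned to known classes) can be sketched with a nearest-class-mean classifier. The function name and data layout are illustrative assumptions; this is one simple classifier among many, not the method of any particular system.

```c
/* Minimal sketch of statistical pattern recognition: assign a
 * measurement vector x (dim features) to the class whose mean vector
 * is closest in squared Euclidean distance.  `means` holds
 * num_classes rows of dim features each (hypothetical layout). */
int classify_nearest_mean(const double *x, const double *means,
                          int num_classes, int dim)
{
    int best = 0;
    double best_d2 = -1.0;
    for (int c = 0; c < num_classes; c++) {
        double d2 = 0.0;
        for (int k = 0; k < dim; k++) {
            double diff = x[k] - means[c * dim + k];
            d2 += diff * diff;   /* accumulate squared distance */
        }
        if (best_d2 < 0.0 || d2 < best_d2) {
            best_d2 = d2;
            best = c;
        }
    }
    return best;   /* index of the nearest class mean */
}
```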
– Algorithms?
• Heavy parallelism.

Vision
• The eye’s image-sensing process:
– Eyeball → cornea → the iris contracts → the lens changes shape → photons converge on the light-sensing organs of the retina → the photoreceptors (rods, cones) convert light intensity, shape, and color into neural impulses → the impulses from the photoreceptor nerves are transmitted to the brain → the brain interprets the neural impulses.
• Photoreceptors:
– Cones
• Color-sensing organs, concentrated in the fovea at the center of the retina; three types (R, G, B); about 7 million.
– Rods
• Intensity-sensing organs, very sensitive to light intensity; about 125 million.

Spectral response of cones (figure)

Machine Vision Hardware
• PCs, workstations, etc.
• Signals: 2D image arrays of gray-level/color values.
• Modules: low-level processing, shape from texture, motion, contours, etc.
• Simple interconnections.
• No parallelism.

Geometrical Model
• Determines where in the image plane the projection of a point will be located.
– The projected image is inverted;
– (x, y, z) is projected onto (x′, y′);
– f: focal length.
• Avoid the inversion by assuming that the image plane is in front of the center of projection.
– Done automatically by cameras or by the human brain.
• Apply Euclidean geometry:
– x′ = x·f/z and y′ = y·f/z.

Sampling and Quantization
• Sampling:
– Sample the image at a finite number of points from the scene.
– Spatial resolution ... image resolution.
• Quantization:
– Represent each sample within the finite word size of the computer.
– 24-bit color, 16-bit color, 8-bit color, 8-bit gray, ....
• Pixel:
– Each image sample.
– An unsigned 8-bit integer in the range [0, 255].
– 0 ... black, 255 ... white, shades of gray ... middle values.
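The perspective-projection equations of the geometrical model above (x′ = x·f/z, y′ = y·f/z, with the image plane in front of the center of projection) can be sketched directly; the function name is an illustrative assumption.

```c
/* Perspective projection of a 3-D scene point (x, y, z) onto the
 * image plane at focal length f, using the non-inverted model with
 * the image plane in front of the center of projection:
 *     x' = x * f / z,   y' = y * f / z
 * z is the distance along the optical axis and is assumed > 0. */
void project_point(double x, double y, double z, double f,
                   double *xp, double *yp)
{
    *xp = x * f / z;
    *yp = y * f / z;
}
```

Note that distant points (large z) project closer to the image center, which is why the model reproduces perspective foreshortening.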
Digital Image Acquisition
• Digitizer:
– Converts an analog picture into digital form.
– Scanner, digital camera, frame grabber.
• Sampling:
– Acquire data points (pixels) from the analog video signal at equal spatial intervals.

Digital Image Acquisition (2)
• Quantization:
– The process of determining the digital value of the analog signal.
– The fewer bits per pixel, the more pronounced the distortion of contours.
• Conversion process:
– Original → sampling → a converter turns the light into a signal expressed as brightness → quantization by an A/D converter.

Digital Image Acquisition (3)
• Image sensor (CCD, CMOS): light energy → image signal.
• A/D signal conversion (sampling, quantization):
– Pixel;
– Gray level.

Gray Level Resolution
• The number of intensity levels representable in one pixel.
• Binary image: 0/1.
• Gray-scale image: 0 to 2ⁿ − 1, where n is the number of bits per pixel.
– 8-bit image: 0–255.
(Example images at 1-bit, 3-bit, and 8-bit resolution.)

Digital Image Storage
• Usually stored as header + data.
• Various formats: BMP, TIF, JPEG, ...
• Portable bitmap family: simple to use.
– PBM (portable bitmap):
• Simple, and easy to convert to other graphics file formats;
• Values 0 and 1; eight pixels stored per byte.
– PPM (portable pixmap):
• Color images; three bytes needed per pixel.
– PGM (portable graymap):
• Gray-level images; one pixel stored per byte.

Using a Single Pointer to Store an Image

#include <stdlib.h>

unsigned char *inputImg;   /* pointer to the input image buffer  */
unsigned char *resultImg;  /* pointer to the output image buffer */
int imageWidth;            /* image width                        */
int imageHeight;           /* image height                       */
int depth;                 /* 1 = grayscale image, 3 = color     */
int x, y, value;

/* allocate memory */
inputImg  = (unsigned char *) malloc(imageWidth * imageHeight * depth);
resultImg = (unsigned char *) malloc(imageWidth * imageHeight * depth);

/* example: arithmetic addition on a grayscale image */
for (y = 0; y < imageHeight; y++)
    for (x = 0; x < imageWidth; x++) {
        value = inputImg[y * imageWidth + x] + 100;
        if (value > 255)   /* clamp to the 8-bit range */
            value = 255;
        resultImg[y * imageWidth + x] = value;
    }

Levels of Computation
• Point level: the output at a pixel depends only on the input at that same pixel:
    f_B[i,j] = O_point{ f_A[i,j] }
• Local level: the output at a pixel depends on a neighborhood N[i,j] of that pixel:
    f_B[i,j] = O_local{ f_A[i_k, j_l] ; [i_k, j_l] ∈ N[i,j] }
• Global level: the output depends on the whole picture.
– E.g., a histogram of intensity values, the Fourier transform.
• Object level:
– Most applications of computer vision require properties to be computed at the object level.
– Size, average intensity, shape, and other characteristics of an object must be computed for the system to recognize it.
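A local-level operation, in contrast to the point-level arithmetic addition shown earlier, computes each output pixel from a neighborhood N[i,j] of the input. A 3×3 mean (box) filter is a standard example; the function name and the border-copying policy are illustrative assumptions.

```c
/* Minimal sketch of a local-level operation: a 3x3 mean (box) filter
 * over an 8-bit grayscale image stored in a flat row-major array.
 * Each interior output pixel is the average of its 3x3 neighborhood;
 * border pixels are copied unchanged for simplicity (an assumption,
 * not the only possible border policy). */
void mean_filter_3x3(const unsigned char *in, unsigned char *out,
                     int width, int height)
{
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++) {
            if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
                out[y * width + x] = in[y * width + x];  /* border */
                continue;
            }
            int sum = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    sum += in[(y + dy) * width + (x + dx)];
            out[y * width + x] = (unsigned char)(sum / 9);
        }
}
```

A histogram of intensity values would instead be a global-level operation, since every output bin depends on the entire image.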