CHREC F3: Target Tracking
Rafael Garcia
11/26/08

F3 Goals, Motivations, & Challenges

Goals
- Analyze & examine available multi-FPGA platforms and tools for scalable system design
- Develop applications & design strategies for scalable architectures from case-study

Motivations
- Meet performance requirements in HPC/HPEC scenarios by mapping across multiple FPGAs
- Exploit multi-FPGA platforms to develop larger, complex designs and algorithms
- Increase understanding of performance prediction, power, and usability for scalable apps

Challenges
- Perform multilevel algorithm partitioning, analysis, and optimization for multi-FPGA systems
- Determine influence of application characteristics on selection of platforms, tools, and languages

[Figure: F3 Insights across the stages Formulation, Translation, Design, Execution]

Kalman Filter Overview
- Traditional Kalman filters estimate the state of a dynamic system in a noisy environment
- Commonly used in target prediction and can be extended to multiple dimensions, targets, and models
- Excellent target tracker when an accurate model is known
- Useful even if an accurate model is not known

Current Architecture
- 4 tightly coupled FPGAs mapped to 4 quadrants
- System is driven by two global clocks
  - 100 MHz inter-FPGA communication links
  - 50 MHz data-processing clock
- Inter-FPGA communication occurs when target crosses a quadrant boundary
  - Current state of target is passed along
- 2-step processing cycle returns results at 25 MSa/s
- Non-pipelined design
  - 2-step cycle where one cycle depends on the previous one and the other cycle depends on pseudo-sensor data from host CPU
  - Low frequency and lack of pipeline registers are expected to lower power consumption
  - 2-cycle design simplifies communication network

Current Architecture (resource usage)

  Resource                      M4K RAMs   DSPs   ALUTs
  Stratix II: EP2S180F1020C3    1%         15%    2%

- Continuously receiving pseudo-sensor data and returning condensed information
- Limited to a single target per quadrant
- Set sensor sampling rate of 25 MSa/s

Simplified Algorithm
- Assumes steady-state operation
- Steady-state operation allows for precomputed covariance and Kalman-gain terms
- Target must closely follow the given movement model for accurate results
- Model tracks four parameters
  - Horizontal position
  - Vertical position
  - Horizontal velocity
  - Vertical velocity

RCML Representation
[Diagram: RCML model of the filter. Blocks: Start/Initialize; Time-Step Advance; Generate Sensor Readings; Time-Update ("Predict"): next-state prediction, update error covariance; Measurement-Update ("Correct"): compute Kalman gain, correct prediction, update error covariance; Report Current Results. Loops: "for each C value in PredictionVector" and "for each D value in MeasurementVector", each marked i=4, joined by BCast and Gather. Data sets: PredictionVector and MeasurementVector, each with element types fixed Position and fixed Acceleration, Num Elements := 4.]

Algorithm Changes
- Remove the hardcoded terms, increasing prediction accuracy during non-steady-state situations
- Modify model to include Z-axis parameters for airborne targets

New Module Types
[Table with columns Sensor, Target, Precision, Resource, Kernel. Rows recoverable from the slide:
  Low Power       Slow   Fixed   Low   Kalman Filter
  Fast Sampling   Fast   Fixed   Low   Kalman Filter
Three further module types (Airborne, Noisy, Multiple) use Floating precision with High, Medium, and High resource use respectively. Their kernel labels read: MKS Kalman Filter, Feature Selection, Multi-Scale, High-Noise, Selective.]

Kalman Filter
- Estimates state of a dynamic system in a noisy environment
  - In this case, the 'dynamic system' is a moving target
- Commonly used in target prediction and can be extended to multiple dimensions, targets, and models
- Assumes sensor noise is white Gaussian noise
- Requires a pre-programmed model describing the target's motion
- Works in a continuous 2-cycle loop
- Developed in 1960 by Rudolf E. Kalman (a UF professor from 1971 to 1992!)
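
The 2-cycle predict/correct loop with precomputed gain terms can be sketched in a few lines. This is a minimal illustration and not the CHREC design: it assumes a constant-velocity model for the four tracked parameters, position-only measurements, and made-up gain values (K_POS, K_VEL) and time step (DT) that would normally come from an offline steady-state covariance computation.

```python
from dataclasses import dataclass

# All constants below are hypothetical values chosen for illustration.
DT = 1.0      # time between sensor samples
K_POS = 0.5   # precomputed steady-state gain applied to position
K_VEL = 0.1   # precomputed steady-state gain applied to velocity (per step)


@dataclass(frozen=True)
class State:
    x: float   # horizontal position
    y: float   # vertical position
    vx: float  # horizontal velocity
    vy: float  # vertical velocity


def predict(s: State) -> State:
    """Time update: advance the state one step with the motion model."""
    return State(s.x + s.vx * DT, s.y + s.vy * DT, s.vx, s.vy)


def correct(p: State, zx: float, zy: float) -> State:
    """Measurement update: blend the prediction with a position sample.

    With fixed gains this step needs only additions and multiplications.
    """
    rx = zx - p.x  # horizontal residual
    ry = zy - p.y  # vertical residual
    return State(p.x + K_POS * rx, p.y + K_POS * ry,
                 p.vx + K_VEL * rx, p.vy + K_VEL * ry)


def track(initial: State, measurements):
    """Run the predict/correct loop over a stream of (x, y) samples."""
    state = initial
    history = []
    for zx, zy in measurements:
        state = correct(predict(state), zx, zy)
        history.append(state)
    return history
```

Because the gains are constants, no covariance or gain computation happens inside the loop, which is the simplification the steady-state assumption buys. A filter without that assumption would recompute the gain every cycle.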

Kalman Filter
[Figure: System Models (examples: "NE wind at 23mph", "-9.8 m/s", "Follows Road") and Noisy Samples feed the Kalman Filter block, which outputs Mostly Accurate Samples]
- Kalman Filter can be viewed as a simple black box
- An input stream of samples measuring a target's position is contaminated with noisy samples
- The output is a stream of samples with most of the noisy samples filtered

Reasons for sensor noise
- Battery power
  - Variable battery voltage
  - Voltage regulators cost money, draw power, and are not perfect
- Sensors
  - Low-quality sensors
    - Cost-cutting for mass production sometimes requires cheap sensors
  - Incorrectly deployed sensors
    - Bad orientation, obstructed sensor
- Environment
  - Environmental conditions
    - Rain, dust, night-time tracking, snow
- Multiple targets
  - Misinterpreted samples from neighboring targets during multiple-target tracking
  - Sensor processing stage must ensure proper target isolation
- Wireless signal
  - Bad data from neighboring sensors due to a weak wireless signal

Kalman Filter example
- Sensor records samples
- Image processing step extracts specific features
  - Target size, vertical position, horizontal position, target bearing, elevation, etc.
- Kalman filters filter out sensor noise
- Results are sent to a central location to be displayed

PR Virtual Architecture with Kalman Filters
[Figure: VLX25 communication architecture. A Sensor Interface and a Display Interface connect through Switches 1 to 5 to Module interfaces, each hosting a Kalman filter]

FPGA and PR benefits for the Kalman Filter
FPGA amenable features
- Low memory requirements
  - Simple filter with streaming inputs and outputs
- Requires only multiplication and addition
  - No complex time-consuming operations such as division, square-root, differentiation, etc.
  - Can be implemented using only logic and MAC units
- Low bandwidth requirements
  - Filter receives/produces a stream of coordinates, not a stream of images
PR amenable features
- Optimum resource usage
  - The right filter type for the right job
- Swapping modules does not halt execution
  - Active filters are never disturbed

Experimental FPGA Power Measurements

Experimental Setup
- GiDEL Host Specifications
  - Dual Xeon 3.00 GHz processors (Pentium 4 era)
  - 2 GB RAM
  - Single 500 GB hard drive
  - CD drive
  - 600 W max power supply (Kappa clone)
- ProcStar II Power Characteristics
  - Main board supply rated at 7.6 A at 3.3 V
  - 7.6 A × 3.3 V = 25.08 W maximum power available to:
    - Stratix II EP2S180 FPGA (4x)
    - 2 GB SODIMM DDR memory (2x) (only 1 used for tests)
    - 64 MB SRAM memory (8x)
    - Miscellaneous oscillators, peripherals, controllers, etc.
  - This means roughly 5 W max available to each FPGA
- Test Design Characteristics
  - Kalman tracking filters
  - Heavy multiplier usage, no block RAMs, minimal logic usage (w/ dedicated multipliers)
  - In all cases, design runs at 33 MHz

Methodology
- GiDEL host system measured without FPGA board
  - P3 Kill-A-Watt AC power meter used for measurements
    - 0.2% documented accuracy
    - Accurate to within 1 Watt
  - 7 different test cases with varying power utilization
- GiDEL host system measured with FPGA board
  - Same 7 test cases were used (without loading an FPGA design)
  - This provides minimum power-use baseline for ProcStar II
- GiDEL board is loaded with FPGA-computationally intensive design
  - CPU is kept idle
  - Power consumption under regular design is measured (@ 33 MHz)
    - 2% logic use (per FPGA)
    - 15% multiplier use (per FPGA)
    - 1 filter instance per FPGA
  - Power consumption under maximum-multiplier-use design is measured (@ 33 MHz)
    - 4% logic use
    - 88% multiplier use
    - 7 filter instances per FPGA
  - Power consumption under maximum-logic-use design is measured (@ 33 MHz)
    - 77% logic use
    - 0% multiplier use
    - 34 filter instances per FPGA

Results: Baseline ProcStar II

  Test case                                    Without ProcStar II   With ProcStar II
  1. Server off (not standby)                  8 W                   8 W
  2. Idle                                      127 W                 137 W
  3. Idle with CDROM spinning                  131 W                 141 W
  4. Full HDD load (defrag)                    132 W                 143 W
  5. Full CPU load (1 thread)                  188 W                 198 W
  6. Full CPU load (4 threads)                 255 W                 257 W
  7. Full CPU/HDD load (3 threads, defrag)     258 W                 264 W

[Chart: GiDEL Server Power Consumption. Power consumed (Watts) per case number, without and with ProcStar II]
[Chart: GiDEL Server Power Consumption (Difference). Difference in power (Watts) per case number]
- Threads are simple while(1) loops
- Although only 2 cores are present, 4 threads were used to bypass Hyper-threading and OS scheduling
  - HDD load is an exception since defrag requires its own thread to be effective

Results: Kalman Filters on ProcStar II
- Power estimates
  - 12.5% toggle rate assumed @ 33 MHz
- Experimental numbers below assume FPGAs consume all power (i.e., ProcStar II memories, glue logic, etc. consume 0 W)
- Design 1
  - 140 W total power
  - ~3.25 W per FPGA
  - 15% mult., 2% logic
  - 1 filter instance, high Fmax
- Design 2
  - 140 W total power
  - ~3.25 W per FPGA
  - 88% mult., 4% logic
  - 7 filter instances, high Fmax
- Design 3
  - 152 W total power
  - ~6.25 W per FPGA
  - 0% mult., 77% logic
  - 34 filter instances, low Fmax

Results: Kalman Filter in ProcStar II
[Chart: Altera EP2S180 FPGA Power Comparison (single FPGA). Power consumption (Watts) for Design 1, Design 2, and Design 3, comparing Early Estimator Spreadsheet, PowerPlay, and Measured Power*]
* Measured power is derived by subtracting baseline power consumption on ProcStar II board from measured power consumption and dividing by 4
- Power consumed by board components is not accounted for, so actual FPGA power consumption is lower

Questions?
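
The per-FPGA figures in the results can be checked with a short calculation. The sketch below assumes the baseline is the 127 W idle reading taken without the ProcStar II board. The slides do not say which baseline reading was used, but that one reproduces the ~3.25 W and ~6.25 W values. The function and constant names are illustrative only.

```python
# Values taken from the slides; names are hypothetical.
BOARD_CURRENT_A = 7.6   # main board supply rating
BOARD_VOLTAGE_V = 3.3
NUM_FPGAS = 4
HOST_IDLE_W = 127.0     # assumed baseline: idle host without ProcStar II


def board_budget_w() -> float:
    """Maximum power the main board supply can deliver."""
    return BOARD_CURRENT_A * BOARD_VOLTAGE_V


def per_fpga_power_w(total_w: float, baseline_w: float = HOST_IDLE_W) -> float:
    """Attribute all power above the baseline equally to the FPGAs."""
    return (total_w - baseline_w) / NUM_FPGAS


if __name__ == "__main__":
    print(f"Board budget: {board_budget_w():.2f} W")
    for name, total in [("Design 1", 140.0), ("Design 2", 140.0), ("Design 3", 152.0)]:
        print(f"{name}: {per_fpga_power_w(total):.2f} W per FPGA")
```

Because everything above the baseline is attributed to the FPGAs, these values are upper bounds on actual FPGA power, as the slides note.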