ECE 477 Digital Systems Senior Design Project Rev 9/12

Homework 11: Reliability and Safety Analysis

Team Code Name: Sports Telemetry Device
Group No. 5
Team Member Completing This Homework: Brendan Claussen
E-mail Address of Team Member: bclauss @ purdue.edu

Evaluation:

SEC  DESCRIPTION                                              MAX  SCORE
1.0  Introduction                                             5
2.0  Reliability Analysis                                     40
3.0  Failure Mode, Effects, and Criticality Analysis (FMECA)  40
4.0  Summary                                                  5
5.0  List of References                                       10
     TOTAL                                                    100

Comments:

1.0 Introduction

The Sports Telemetry Device is a head-mounted sensor that measures forces on the skull to help predict concussions. To do this, many peripherals need to work in unison without fail. Data is produced by analog sensors (gyroscopes and accelerometers) and converted to digital values in the microprocessor. The data is then written to a NAND flash for storage, and at a predetermined time interval the important data is read back from the NAND and passed to a ZigBit IC for transmission to a base station on the sideline. This enables coaches or staff to see real-time data and decide whether to take an injured player out of the game. If any component in that chain fails, a player's safety could be in jeopardy and brain trauma could go undetected in real time.

Another issue arises with the lithium-polymer battery that powers the device. This battery can malfunction in very hazardous ways, such as catching fire or leaking acidic fluids. It requires special attention and precautionary measures to ensure player safety.

2.0 Reliability Analysis

Four ICs on the Sports Telemetry board are considered in the reliability analysis: the MT29F16G08QAAWC NAND flash, the TPS62203 3.3 V regulator, the ATZB-24-A2 ZigBit, and the XMEGA A3BU microcontroller. The ZigBit (48 pins), NAND flash (48 pins), and microcontroller (64 pins) are all fairly complex and have the most functions that could go wrong.
The regulator will be our hottest IC and will be under a great deal of stress, so it is also included.

An assumption made throughout the analysis is that components fall under the Airborne, Uninhabited, Cargo environmental factor: "Environmentally uncontrolled areas which cannot be inhabited by an aircrew during flight. Environmental extremes of pressure, temperature and shock may be severe." [1] While our device will not approach the temperatures reached by planes in the upper atmosphere, this is the only environmental factor that covers shock appropriately. Football tackles can reach hundreds of g, so we want to be prepared for shock-induced failures. Another assumption made across all parts is that they are not military grade; the quality of each IC analyzed is assumed to be commercial.

MT29F16G08QAAWC NAND Flash, modeled as a MOS digital gate array

Parameter  Description           Value  Comments
C1         Die complexity        0.29   MOS digital gate array, 30k-60k gates
πT         Temperature coeff.    2.1    Digital MOS, 70 C
C2         Package fail coeff.   0.024  48 pins, non-hermetic DIP
πE         Environmental coeff.  5      Airborne, Uninhabited, Cargo is used because it takes into consideration the possibility of severe shock and temperature ranges.
πQ         Quality coeff.        10     Commercial part
πL         Learning factor       1.8    Out since Jan 2012

Entire design: λp = 13.1 failures/10^6 hrs, MTTF = 76335 hours until failure

Summary: Most of these factors, such as the die complexity and the temperature coefficient, cannot be lowered. The learning factor, however, will steadily decrease over the course of about 1.5 years. In the future this device may be seen as more reliable because there will be more documentation and source code for it. Also, the environmental factor chosen is harsher than what the device will actually encounter. The Airborne, Uninhabited, Cargo environment experiences much greater swings in temperature and atmospheric pressure.
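As a check on the NAND flash figures above, the failure rate can be recomputed from the tabulated factors. The sketch below assumes the handbook's microcircuit form λp = (C1 πT + C2 πE) πQ πL, which agrees with the reported 13.1 failures/10^6 hours. The function and variable names are illustrative only and are not part of the project's code.

```python
def microcircuit_failure_rate(c1, pi_t, c2, pi_e, pi_q, pi_l):
    """Predicted failure rate in failures per 10^6 hours.

    Assumed model: lambda_p = (C1*piT + C2*piE) * piQ * piL.
    """
    return (c1 * pi_t + c2 * pi_e) * pi_q * pi_l


def mttf_hours(rate_per_million_hours):
    """Mean time to failure for a constant failure rate given per 10^6 hours."""
    return 1e6 / rate_per_million_hours


# Factors taken from the NAND flash table.
nand_rate = microcircuit_failure_rate(
    c1=0.29,    # die complexity
    pi_t=2.1,   # temperature coefficient
    c2=0.024,   # package failure coefficient
    pi_e=5,     # environmental coefficient
    pi_q=10,    # quality coefficient (commercial part)
    pi_l=1.8,   # learning factor
)

print(f"NAND flash: {nand_rate:.3f} failures/10^6 h")
# The report quotes the MTTF from the rate rounded to one decimal place.
print(f"MTTF from 13.1 failures/10^6 h: {mttf_hours(13.1):.1f} h")
```

The same function applies to any part modeled as a microcircuit; only the six factors change.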
The Sports Telemetry Device will only encounter ambient temperatures of roughly -10 C to 50 C. The environmental factor could be lowered, but this is a worst-case analysis.

While the MTTF is fairly acceptable, these factors are not the only ways the NAND flash can fail. It is rated for a minimum of only 10,000 program/erase cycles before it becomes unusable. This is plenty of cycles for what we plan to use it for, but the cycle limit may be reached before the MTTF. Also, as program/erase cycles add up over time, bad blocks may accumulate in the NAND at random. This will slowly render more and more of the 8 GB unusable, so the software must keep track of these blocks and avoid them.

XMEGA A3BU Microcontroller, modeled as a microprocessor microdevice

Parameter  Description           Value  Comments
C1         Die complexity        0.28   16-bit MOS micro
πT         Temperature coeff.    1.6    Digital MOS, 105 C
C2         Package fail coeff.   0.032  64 pins, non-hermetic DIP
πE         Environmental coeff.  5      Airborne, Uninhabited, Cargo is used because it takes into consideration the possibility of severe shock and temperature ranges.
πQ         Quality coeff.        10     Commercial part
πL         Learning factor       1.8    Out since Jan 2012

Entire design: λp = 10.9 failures/10^6 hrs, MTTF = 91374 hours until failure

Summary: Factors such as the package failure coefficient and the learning factor could be lowered to bring the failure rate down. If the package were hermetically sealed instead of a plastic shell, it could withstand much more stress and avoid possible wear and tear. The learning factor will also go down as time goes on and more documentation is created for the part.

ATZB-24-A2 ZigBit, modeled as a microprocessor microdevice

Parameter  Description           Value  Comments
C1         Die complexity        0.14   8-bit MOS micro
πT         Temperature coeff.    0.84   Digital MOS, 85 C
C2         Package fail coeff.   0.02   48 pins, hermetic DIP
πE         Environmental coeff.  5      Airborne, Uninhabited, Cargo is used because it takes into consideration the possibility of severe shock and temperature ranges.
πQ         Quality coeff.        10     Commercial part
πL         Learning factor       1      Out since 2009

Entire design: λp = 2.1 failures/10^6 hrs, MTTF = 476190 hours until failure

Summary: The ATZB-24-A2 is our most reliable device due to its hermetic seal and low-complexity die (compared to the NAND and the 16-bit microcontroller). The ZigBit has its own microprocessor on board, complete with general-purpose I/O, an ADC port, and USART pins. This is why it is modeled the same way as our XMEGA A3BU microcontroller.

TPS62203 3.3 V Regulator, modeled as a low-frequency Si FET transistor

Parameter  Description           Value  Comments
λb         Base failure rate     0.012  MOSFET
πT         Temperature coeff.    5.1    125 C
πA         Application coeff.    2      Power FET under 1 watt
πE         Environmental coeff.  20     Airborne, Uninhabited, Cargo is used because it takes into consideration the possibility of severe shock and temperature ranges.
πQ         Quality coeff.        8      Plastic

Entire design: λp = 19.5 failures/10^6 hrs, MTTF = 51062 hours until failure

Summary: Our 3.3 V regulator could possibly have been modeled as a microprocessor microdevice because of its comparator logic and the small amount of power the device actually handles. With 3.7 V and a couple hundred mA at the regulator input, it only deals with power of around 1 watt. Combined with how tiny the IC is (approximately 3 x 3 mm), the power density is incredibly low. Even so, it is more appropriate to use the low-frequency Si FET transistor model than the microdevice model. Low frequency means below 400 MHz, and the regulator switches at 1.5 MHz at its fastest. While this device has the highest chance of failure among the analyzed parts, it is much better than expected for a regulator.
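The regulator figures can be checked the same way. The sketch below assumes the discrete-transistor form λp = λb πT πA πE πQ and an Arrhenius expression for the low-frequency Si FET temperature factor (the constants 1925 and 298 K are recalled from the handbook and should be verified against it). With the tabulated λb, πA, πE, and πQ, the reported 19.5 failures/10^6 hours corresponds to πT of about 5.1, which is what that expression gives at a 125 C junction temperature. The last lines add the four reported rates, assuming a series system with constant failure rates in which any one part failing fails the device. All names are illustrative.

```python
import math


def fet_temperature_factor(junction_temp_c):
    """Temperature factor for a low-frequency Si FET.

    Assumed Arrhenius form: exp(-1925 * (1/(Tj + 273) - 1/298)).
    """
    return math.exp(-1925.0 * (1.0 / (junction_temp_c + 273.0) - 1.0 / 298.0))


def fet_failure_rate(lambda_b, pi_t, pi_a, pi_e, pi_q):
    """Failure rate in failures per 10^6 hours for the discrete FET model."""
    return lambda_b * pi_t * pi_a * pi_e * pi_q


pi_t_125 = fet_temperature_factor(125.0)  # about 5.07
regulator_rate = fet_failure_rate(
    lambda_b=0.012,             # base failure rate, MOSFET
    pi_t=round(pi_t_125, 1),    # temperature factor rounded to table precision
    pi_a=2,                     # application factor
    pi_e=20,                    # environmental factor
    pi_q=8,                     # quality factor, plastic
)

# Rates as reported in the four tables, in failures per 10^6 hours.
reported_rates = {
    "NAND flash": 13.1,
    "XMEGA A3BU": 10.9,
    "ZigBit": 2.1,
    "TPS62203": 19.5,
}

# Series assumption: the device fails when any one part fails, so rates add.
system_rate = sum(reported_rates.values())
system_mttf = 1e6 / system_rate

print(f"piT at 125 C: {pi_t_125:.2f}")
print(f"Regulator: {regulator_rate:.2f} failures/10^6 h")
print(f"All four ICs: {system_rate:.1f} failures/10^6 h, MTTF = {system_mttf:.0f} h")
```

Under these assumptions the four ICs together come to 45.6 failures/10^6 hours, or an MTTF of roughly 21,900 hours. This covers only the analyzed ICs and leaves out the sensors, passives, and battery.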
However, the quality coefficient could be improved, as plastic is the worst possible quality rating for this model.

3.0 Failure Mode, Effects, and Criticality Analysis (FMECA)

Criticality for each part was decided based on how likely its failure is to keep the device from detecting injury to a player. However, since all of our peripherals are highly dependent on one another, a single device failing could easily render the entire system useless for detecting hard hits and/or concussions. Three criticality levels were chosen to represent the danger:

Low criticality (λ = 10^-6): there is no danger to the user, and the device still functions and completes all five project-specific success criteria (PSCCs).
Medium criticality (λ = 10^-8): there is no danger to the user, but one or more PSCCs cannot be met and some parts of the circuit may need replacing.
High criticality (λ ≤ 10^-9): the device is either useless or may cause harm to the user, which also means that many or all of the PSCCs cannot be met.

A few assumptions were made in assessing each failure mode. First, the device is "rendered useless" when it has lost its ability to transmit data to the sideline to warn coaching staff of possible brain trauma. The device may still have other functionality and retain its ability to store data, but if it cannot transmit, the data is only available when downloaded via USB. This would allow further research after the game, but it does a player no good if he or she suffers brain trauma and is sent back into the game, further exacerbating the injury.

Second, when the input voltage to a device exceeds 3.3 V (the operating voltage), the effects may be random. If the increase is small enough, the circuit may work as intended. However, overpowering a device can sometimes destroy ICs, and in the case of some of our complex ICs, it may destroy only subsections of a chip.
The effects of overvoltage can be unpredictable and would require thorough inspection of the circuit to ensure all parts are still working as intended.

4.0 Summary

Overall, the Sports Telemetry Device requires that very little goes wrong in the circuitry just to operate. There are very few redundant functions, so each peripheral is required for the device to function. To model the worst-case scenario, each part in the reliability analysis was evaluated as if it were subjected to temperatures and shock forces so large that humans would be unable to survive in those conditions. Even under this worst-case model, all of the Sports Telemetry Device ICs that were analyzed returned acceptable failure rates. The most failure-prone part is the 3.3 V regulator at 19.5 failures/10^6 hours, but even that is a tiny failure rate compared to other regulators. The device uses so little power (around one watt at most) that the most failure-prone part is not put under a large amount of stress.

Issues that need special attention are the NAND flash and the lithium-polymer battery. The NAND will require software to avoid bad blocks and the loss of data. The Li-Po battery can only be charged at a certain voltage and only for a given duration. To address these battery issues, the device has a specially designed charger IC that turns on an LED when charging is done. With these issues addressed, the Sports Telemetry Device is expected to operate for tens of thousands of hours, on average, before any part fails.

5.0 List of References

[1] "Military Handbook Reliability Prediction of Electronic Equipment." N.p., 2 Jan. 1990. Web. https://engineering.purdue.edu/ece477/Homework/CommonRefs/MilHdbk-217F.pdf
[2] Micron, "NAND Flash Memory," MT29F16G08CBACAWP datasheet, 2005. https://www.micron.com/~/media/Documents/Products/Data%20Sheet/NAND%20Flash/70%20Series/L72A_Production_Datasheet.pdf
[3] "HIGH-EFFICIENCY, SOT23 STEP-DOWN, DC-DC CONVERTER." N.p., Mar. 2002. Web. http://www.ti.com/lit/ds/symlink/tps62203.pdf
[4] "ZigBit 2.4 GHz Wireless Modules ATZB-24-A2/B0." N.p., June 2009. Web. http://www.atmel.com/Images/doc8226.pdf
[5] Atmel, "8/16-bit Atmel XMEGA A3BU Microcontroller," ATxmega256A3BU datasheet, Jan. 2012. http://www.atmel.com/Images/doc8362.pdf
[6] "SOT23, Dual-Input, USB/AC Adapter, 1-Cell Li+ Battery Chargers." MAX1551, MAX1555. N.p., July 03. Web. http://datasheets.maximintegrated.com/en/ds/MAX1551-MAX1555.pdf
[7] Analog Devices, "Ultra-Low Power, 2-Channel, Capacitance Converter for Proximity Sensing," AD7150 datasheet, 2007. http://www.analog.com/static/imported-files/data_sheets/AD7150.pdf
[8] InvenSense, "Integrated Dual-Axis Gyro," IDG-500 datasheet, 2008. http://invensense.com/mems/gyro/documents/PS-IDG-0500-00-06.pdf
[9] Analog Devices, "3-Axis ±200 g Analog MEMS Accelerometer," ADXL377 datasheet, 2012. http://www.analog.com/static/imported-files/data_sheets/ADXL377.pdf

ECE 477 Digital Systems Senior Design Project Spring 2009

Appendix A: Schematic Functional Blocks

Figure A covers the power supply of the board, including the LiPo charger, the regulator, and the passive components of each.
BATT refers to our 3.7 V battery.

Figure B covers the microprocessor-to-NAND flash connections and the passive components for the microprocessor.

Figure C covers the connections between the microprocessor and the ZigBit IC (and the ZigBit's passive components).

Figure D covers the ADXL377 accelerometer, the IDG-500 gyroscope, and their passive components.

Appendix B: FMECA Worksheet

FMECA CRITICALITY DEFINITIONS:
LOW: λ = 10^-6. No danger to user and still allows device to function and complete ALL PSCCs.
MEDIUM: λ = 10^-8. No danger to user, but 1 or more PSCCs cannot be met.
HIGH: λ ≤ 10^-9. User's safety is put in jeopardy and/or renders the device useless.

FIGURE A FMECA

A1
  Failure mode: Battery output = 0 V
  Possible causes: Dead battery; loose connection; faulty power switch (S1)
  Failure effects: Unusable device if unplugged from USB power source
  Method of detection: Observation
  Criticality: High
  Remarks: Plug in and recharge battery; reconnect wires from battery to board; replace switch

A2
  Failure mode: MAX charge IC output != 4.2 V (with USB plugged in)
  Possible causes: Broken charge IC
  Failure effects: Unable to recharge battery
  Method of detection: Observation
  Criticality: Medium
  Remarks: Replace part

A3
  Failure mode: 3.3 V regulator output = 0 V
  Possible causes: Faulty inductor (L1); shorted diode
  Failure effects: No power to device peripherals
  Method of detection: Observation
  Criticality: High
  Remarks: Device useless

A4
  Failure mode: Regulator output > 3.3 V
  Possible causes: Regulator input out of operating range (3.3-6 V); shorted regulator
  Failure effects: Too much power to device, possibility of broken parts
  Method of detection: Observation
  Criticality: High

A5
  Failure mode: Regulator output < 3.3 V
  Possible causes: Shorted bypass capacitors (C1, C3)
  Failure effects: Possibility of peripherals not turning on
  Method of detection: Observation
  Criticality: High
  Remarks: Criticality depends on how large the voltage drop is; it could be negligible if the drop is small enough

FIGURE B FMECA

B1
  Failure mode: Power input = 0 V
  Possible causes: Failed/open regulator; dead battery
  Failure effects: No device functionality; cannot store data
  Method of detection: Observation
  Criticality: Medium
  Remarks: Unable to store data, renders device almost useless

B2
  Failure mode: Power input > 3.3 V
  Possible causes: Failed/shorted regulator
  Failure effects: Too much power, could break device
  Method of detection: Observation
  Criticality: Medium
  Remarks: Unable to store data, renders device almost useless

B3
  Failure mode: Inoperable microprocessor
  Possible causes: Failed oscillator (F8); incorrect input voltage
  Failure effects: No device functionality
  Method of detection: Observation
  Criticality: High
  Remarks: Would be unable to warn of concussion in real time. User safety in danger

B4
  Failure mode: Microprocessor reset tied to ground permanently
  Possible causes: Switch (S3) soldered in wrong orientation
  Failure effects: Constant reset, loss of functionality of device
  Method of detection: Observation
  Criticality: High
  Remarks: Could be fixed by removing switch

B5
  Failure mode: Unable to read/write to flash (NAND)
  Possible causes: Wrong latch enabled (software error)
  Failure effects: Limited device functionality
  Method of detection: Observation
  Criticality: Medium
  Remarks: If memory has older data, it should still be safe

B6
  Failure mode: Unable to read/write to certain block
  Possible causes: A bad block has formed
  Failure effects: Unable to use specific block (permanent)
  Method of detection: Observation
  Criticality: Low
  Remarks: Still able to read/write elsewhere

FIGURE C FMECA

C1
  Failure mode: ZigBit input = 0 V
  Possible causes: Failed/open regulator; dead battery
  Failure effects: No device functionality; cannot transmit to other devices
  Method of detection: Observation
  Criticality: High
  Remarks: Would be unable to warn of concussion in real time. User safety in danger

C2
  Failure mode: ZigBit input > 3.3 V
  Possible causes: Failed/shorted regulator; insufficient bypass capacitance (C12)
  Failure effects: Too much power, could break device; cannot transmit to other devices
  Method of detection: Observation
  Criticality: High
  Remarks: May require replacement of ZigBit

C3
  Failure mode: Reset tied to ground permanently
  Possible causes: Switch (S4) soldered in wrong orientation
  Failure effects: Constant reset, loss of functionality of device; cannot transmit to other devices
  Method of detection: Observation
  Criticality: High
  Remarks: Could be fixed by removing switch

C4
  Failure mode: Unable to program via JTAG
  Possible causes: Bad JTAG connector/connection (SV1)
  Failure effects: Unable to initially program ZigBit, loss of functionality
  Method of detection: Observation
  Criticality: High
  Remarks: Replace connector, check connection

C5
  Failure mode: USB output = 0 V (assuming it is plugged in)
  Possible causes: Bad connection from USB pin to trace
  Failure effects: Loss of ability to recharge battery
  Method of detection: Observation
  Criticality: Medium
  Remarks: Resolder USB pins to board

FIGURE D FMECA

D1
  Failure mode: Power input < 3.3 V
  Possible causes: Failed/open regulator; dead battery; bad connector (JP1); shorted bypass capacitors (C1)
  Failure effects: No device functionality; cannot measure and create data
  Method of detection: Observation
  Criticality: High
  Remarks: Renders device almost useless

D2
  Failure mode: Power input > 3.3 V
  Possible causes: Failed/shorted regulator
  Failure effects: Too much power, could break devices
  Method of detection: Observation
  Criticality: High
  Remarks: Renders device almost useless

D3
  Failure mode: Invalid gyroscope data
  Possible causes: Broken resistors (R1, R2)
  Failure effects: Invalid data; output data does not match what is physically happening to the board
  Method of detection: Observation
  Criticality: High
  Remarks: Unable to determine concussions, device almost useless

D4
  Failure mode: Invalid accelerometer data
  Possible causes: Broken bypass capacitors (C2, C3, C4)
  Failure effects: Invalid data; output data does not match what is physically happening to the board
  Method of detection: Observation
  Criticality: High
  Remarks: Unable to determine concussions, device almost useless
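The criticality definitions used in this worksheet amount to a small decision rule, which can be sketched in code so that every row is rated the same way. This is only an illustration of that rule. The names `Criticality` and `classify` are hypothetical, and the counts of unmet PSCCs passed in the examples are assumed values, not figures from the worksheet.

```python
from enum import Enum


class Criticality(Enum):
    LOW = "Low"        # no danger to user, all PSCCs still met
    MEDIUM = "Medium"  # no danger to user, one or more PSCCs unmet
    HIGH = "High"      # user endangered and/or device rendered useless


def classify(endangers_user, device_useless, psccs_unmet):
    """Apply the worksheet's criticality definitions to one failure mode."""
    if endangers_user or device_useless:
        return Criticality.HIGH
    if psccs_unmet >= 1:
        return Criticality.MEDIUM
    return Criticality.LOW


# Example rows; the psccs_unmet values are assumptions for illustration.
bad_block = classify(endangers_user=False, device_useless=False, psccs_unmet=0)
no_usb_charging = classify(endangers_user=False, device_useless=False, psccs_unmet=1)
dead_battery = classify(endangers_user=False, device_useless=True, psccs_unmet=5)

for label, level in [
    ("B6 bad block", bad_block),
    ("C5 USB output = 0 V", no_usb_charging),
    ("A1 battery output = 0 V", dead_battery),
]:
    print(f"{label}: {level.value}")
```

The HIGH test is checked first because the definitions make user safety and loss of the device override everything else.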