Status of measurements of FE-I4 SEU and PRD
P.Breugnon, M.Menouni, R.Fei, F.Gensolen, L.Perrot, A.Rozanov
08.06.2011
GR errors at start of run and Malte’s On-Off trick
• At the beginning we very often got GR readout errors, with all bits zero, at the start of the run. Many iterations and many warm startups did not help to suppress these GR errors. The problem is probably amplified by the use of a 3.5 m grey flat cable, as other users do not see it with the usual short flat cables or Ethernet cables.
• Usually Malte’s OFF/ON cycling trick solves the problem.
• After discussion with Malte we understand the trick better: the standard initial configuration switches all or half of the pixels to the active state, and noisy pixels could probably perturb the GR readout. ON/OFF cycling switches off the pixel matrix with default zero parameters, and the GR configuration in the Start-run primitive restores only the GR values. So we decided to switch to a configuration with PR zero values for the GR tests.
• Suppressing the configure-GR option in GRTEST reduces the probability of GR read-back errors. Too small a delay?
• Other measures: reducing chip consumption (zero PrmAmp*, DisVbn*), a cold start at the beginning of each run (to ensure good Vdda from the voltage regulators), and module configuration only outside the beam spill.
• With all these measures together we no longer observe GR errors at the start of the run.
Radiation monitoring and GR
• The previously reported zero GR error rate was due to a wrong option in the GR readout primitive: the configuration was done just before the read-back. Now corrected to read-only.
• In the special runs with the PS beam pointing at the GR we observe high rates of GR errors.
• When the beam is pointing at the GR position, we observe counting-latch SEU errors in counter SR24, independent of GR errors. We observed 16 non-zero counts in 237 spills (6.8%), with a mean of 2.1 SEU in such spills, i.e. on average 0.15 SEU/spill.
• Classification into 6 categories for a typical spill of 25 × 10^10 protons/cm²/spill.
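The quoted rates can be cross-checked with a few lines of Python (a sketch; all inputs are the counts from the bullet above):

```python
# Cross-check of the SR24 counting-latch SEU rates quoted above.
nonzero_spills = 16        # spills with a non-zero SR24 count
total_spills = 237
mean_seu = 2.1             # mean SEU count in the non-zero spills

fraction = nonzero_spills / total_spills
avg_per_spill = fraction * mean_seu

print(f"non-zero fraction: {fraction:.1%}")      # 6.8%
print(f"average SEU/spill: {avg_per_spill:.2f}") # ~0.14, close to the quoted ~0.15
```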
Classification of GR
1) PRD-like resets: all read GR bits zero; ServiceRecord non-zero if the SR read is placed after the GR configure (thanks to Maurice for pointing to it). SR21~100, SR24~0, SR25~0.
2) Write glitch from Decoder: one bit flip in GR, ErrorFlagb (SR24) = 0, WrRegDataErr (SR25) > 0.
3) Internal glitch in Global Memory: one bit flip in GR, SR24=0, SR25=0.
4) Normal GR SEU: one bit in GR is flipped, SR24>0, SR25=0.
5) Efuse errors: the same type of SEU-hard triple memory, but with reset and some logic in ARM cells; GR is reloaded from the PROM (actually 0) if a single SEU bit flips in the triple memory (actually with the bug).
6) "Frozen" bits in 16 words: SR24=16, SR25=0.
PRD-like GR errors
• The biggest effect in GR errors; the SEU-hard triple memory is probably not responsible.
• The typical rate with GR configure in every spill is 5.3% (79/1480 spills).
• In the majority of cases SR21~100, SR24=0, SR25=0/2.
• This rate looks too high for the quoted threshold of 100 Mrad/sec during 20 ns.
• Our beam intensity corresponds to 0.0065 Mrad/spill, or 0.015 Mrad/sec during the spill only.
• Calibration of the PRD wrong?
• Reset glitches on the doubled reset lines?
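The in-spill dose rate above follows from the per-spill dose and the spill length (a sketch; the ~400 ms spill length is taken from the beam-properties slide at the end):

```python
# Dose rate during the spill, from the per-spill dose quoted above.
dose_per_spill = 0.0065    # Mrad per spill
spill_length = 0.4         # seconds, ~400 ms bunch length

rate = dose_per_spill / spill_length
print(f"dose rate in spill: {rate:.3f} Mrad/s")  # ~0.016, vs the quoted 0.015
# Either way, this is orders of magnitude below the quoted PRD threshold
# of 100 Mrad/s over 20 ns, which is why the observed rate looks too high.
```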
PRD-like GR errors, run 83, chip#28
PRD-like GR errors, run 83, chip#28
PRD-like GR errors, run 83, chip#28, 39 events
No GR errors, run 83, chip#28, ~700 events
Write glitch from Decoder to GR errors
• One bit flips in GR.
• SR24=0: no SEU in ConfigMemory.
• SR25>0: error in the Write from the Decoder.
• Typical rate 0.7% (11/1480 spills).
• Unfortunately the singles rate of SR25 with the beam pointing at the GR is very high, ~85%.
• In the normal beam position SR25=0.
• Could it be reduced?
Write glitch from Decoder, run 83, chip#28, 3 events
Internal Global Memory glitch
• One bit flips in GR.
• SR24=0: no single SEU in ConfigMemory.
• SR25=0: no errors in the Write from the Decoder.
• Typical rate 0.7% (10/1480 spills).
• On top of the triple memory cells there is some SEU non-hard logic. Is it responsible for these glitches?
Internal Global Memory glitch, run 83, chip 28, 3 events
Global Memory SEU error
• One bit flips in GR.
• SR24=1: a single SEU in one of the 3 redundant bits of the triple ConfigMemory.
• SR25=0: no errors in the Write from the Decoder.
• Observed rate 0.1% (2/1480 spills).
• Too high: since the experimental rate of a single SEU in SR24 is 6.8%, we gain only a factor of ~70 with the triple latches. A naive extrapolation in the absence of correlations predicts a rate of ~0.003%. Could correlations give the worsening factor of ~30?
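The factors of ~70 and ~30 quoted above follow directly from the measured and extrapolated per-spill rates (a sketch using the numbers from this and the previous slides):

```python
# Gain and worsening factors for the triple-latch protection.
single_rate = 0.068      # per-spill rate of single-latch SEU (SR24 > 0)
triple_rate = 0.001      # observed per-spill GR error rate with triple latches
naive_rate = 0.00003     # naive extrapolation assuming uncorrelated upsets

gain = single_rate / triple_rate       # protection factor of the triple latches
worsening = triple_rate / naive_rate   # excess over the uncorrelated prediction
print(f"gain ~{gain:.0f}, worsening ~{worsening:.0f}")  # ~68 and ~33
```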
Efuse GR errors
• SEU-hard triple memory, but with the surrounding logic in ARM cells and a reload from the PROM in case of a latch SEU.
• The typical rate with GR configure in every spill is 0.4% (5/1480 spills).
• For the run with GR configure only at the run start: a high rate of coherent SEU in 6 bits.
• Efusedc32 "0->1"
• EfusedC35 "0->1"
• EfuseRef "15->0"
• Typical rate of coherent SEU: 3% per spill (5/155 spills).
• Maurice proposes to study it with/without the correction.
Efuse GR Write errors, run 83, chip#28, 2 events
Scotch bits in run 83, chip#27
• After iteration 29, abnormal scotched (stuck) bits appeared.
• SR#24=16 always, SR#25=0.
• Many words got a wrong static value, not erasable by GR configuration; a cold start helps.
• ERRMASK1: 65535
• DISVBN_CPPM: 62
• PRMPVBP: 0
• LVDSDRVIREF: 171
• BONNDAC: 237
• PLLIBIAS: 88
• LVDSDRVVOS: 105
• PLSRLDACRAMP: 64
• PLSRVGOAMP: 255
• VTHIN_ALTFINE: 128
• COLPR_MODE: 1
• COLPR_ADDR: 0
• CHIP_LATENCY: 210
• KILLDC38: 0
• KILLDC36: 0
• KILLDC35: 0
• CMDCNT0_12: 11
• LVDSDRVSET06: 1
• CLK0_S2: 1
• EN_160M: 1
• RISEUPTAO: 7
• PULSERPWR: 1
• PULSERDELAY: 2
• EXTDIGCALSW: 0
• EXTANCALSW: 0
GR error rates, run 84, chip#28, 750 spills
Summary GR error rates
• Average over runs 75, 81, 83, chips #27 and #28:
• PRD-like GR errors: 5.3% (79/1480 spills)
• Write from Decoder: 0.7% (11/1480 spills)
• Internal GlobalMem glitch: 0.7% (10/1480 spills)
• GlobMem SEU: 0.1% (2/1480 spills)
• Efuse (Write): 0.3% (4/1480 spills, 3 times fewer bits)
• Efuse (Int): 0.1% (1/1480 spills, 3 times fewer bits)
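The summary percentages follow from the spill counts (a sketch recomputing them):

```python
# Recompute the summary error rates from the spill counts above.
total_spills = 1480
counts = {
    "PRD-like GR errors": 79,
    "Write from Decoder": 11,
    "Internal GlobalMem glitch": 10,
    "GlobMem SEU": 2,
    "Efuse (Write)": 4,
    "Efuse (Int)": 1,
}
for name, n in counts.items():
    print(f"{name}: {n / total_spills:.1%} ({n}/{total_spills} spills)")
```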
Preliminary PR SEU results, chips #27 and #28
• Runs 50, 51, 52, 53, 54, 76
• Type A (old): DC29, TDAC[1] and FDAC[1]
• Type B (new): DC30, TDAC[1] and FDAC[1]
• Normal SEU (<40 errors/bunch)
• Coherent SEU (>40 errors/bunch)
• Check of DC28/DC31 requested by Abder
Normal SEU, chip#27, run 52, DC29 (old), DC30 (new), Vdda=1.5V-0.1V
Normal SEU, chip#27, run 52, DC28 (old), DC31 (new), Vdda=1.5V-0.1V
SEU asymmetry
• Run 54, chip 27, Vdda=1.4 V-0.1 V, Clk=1 MHz
• Old 0->1: 623 bits flipped
• Old 1->0: 214 bits
• New 0->1: 4 bits
• New 1->0: 14 bits
• Old R01 = 2.9
• New R01 = 0.29
• So the biggest gain is in the 0->1 transitions.
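The R01 values above are simply the 0->1 to 1->0 flip-count ratios (a sketch):

```python
# 0->1 / 1->0 asymmetry ratios for the old and new latch designs (run 54, chip 27).
old_01, old_10 = 623, 214   # bits flipped, old design
new_01, new_10 = 4, 14      # bits flipped, new design

print(f"old R01 = {old_01 / old_10:.1f}")  # 2.9
print(f"new R01 = {new_01 / new_10:.2f}")  # 0.29
```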
Very preliminary SEU results chips# 27,28
• Normal SEU per spill, ~20 × 10^10 protons/cm²
• Vdda1 = 1.5 V - 0.1 V (drop)
• Chip 27, Type A: 2.8 SEU/spill/672 bits (last week 4.6)
• Chip 27, Type B: 0.1 SEU/spill/672 bits
• Chip 28, Type A: 3.0 SEU/spill/672 bits (last week 8.0)
• Chip 28, Type B: 0.1 SEU/spill/672 bits
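From these numbers one can estimate a per-bit SEU cross section, sigma = N_SEU / (N_bits × fluence). This derivation is ours, not from the slides, and assumes the quoted per-spill fluence is uniform over the 672 monitored bits:

```python
# Hedged per-bit SEU cross-section estimate (assumption: uniform fluence).
fluence = 20e10     # protons/cm^2 per spill, as quoted above
n_bits = 672        # monitored latch bits per chip

def cross_section(seu_per_spill):
    # sigma = N_SEU / (N_bits * fluence), in cm^2 per bit
    return seu_per_spill / (n_bits * fluence)

print(f"Type A (old): {cross_section(2.8):.1e} cm^2/bit")  # ~2.1e-14
print(f"Type B (new): {cross_section(0.1):.1e} cm^2/bit")  # ~7.4e-16
```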
Conclusions
• Abnormally high PRD rates: calibration of the PRD, or reset glitches?
• Too-high Write decoder rates.
• Too-high Internal Global Memory glitch rates.
• Good single-latch SEU rate, ~6%.
• High Triple Global Memory rate. Correlations?
• One abnormal run with some scotched GR bits.
• New/old pixel SEU latches verified also in DC28/DC31: much better performance of the new latches.
• The 0->1 versus 1->0 SEU asymmetry is opposite in the new PR latches.
• More runs at the PRD/GR position agreed for this week.
Spare
Measurements
• Two FE-I4 chips installed in the PS beam, Irrad3
• Chip ID28 (PC marslhc)
• Chip ID27 (PC marnach)
• Also chip SEU3D installed
• Order in the beam: ID27, ID28, SEU3D
• Al foils on ID27 and SEU3D
• Orientation: EOC on top; the beam traverses first the PCB, then the chips
• Horizontal beam position in the center (columns 39/40)
• Vertical beam position 5 mm below the center (far from EOC)
• Beam size 12×12 mm
• Beam start Thursday 12 May 2011
Beam properties
• Supercycle with a 36 × 1.2 = 43.2 sec period
• But it sometimes changes
• Typically 2 bunches of 40 × 10^10 protons
• But sometimes 3-4 bunches
• Typical bunch positions: 4, 6, 14, 26
• Bunch length 400 msec
• One iteration: 2 (or 3) supercycles, 2-4 bunches per supercycle
Timing Problems
• Synchronize with the Spill signal instead of the CPS cycle
• Delete the time stamps and the DCS voltages and currents
• Reduce the readout to one DC per spill
• Result: able to read out without superposition with the beam (except in sporadic spills)
Timing Problems
• Joern proposed to write one RootDB file per spill, gaining a factor of two in time.
• Since this weekend we have switched to this mode.
• In this mode we have time to double the readout and write the time stamp, so we can in principle correlate with the beam information files.
• The software is not yet ready for large-volume analysis in this mode (handling hundreds of files per run).
Measurement of beam profile by diode (Maurice Glaser)