EFFICIENCY OF TEST PATTERN DECOMPRESSION TECHNIQUES
ONDREJ NOVAK
TU Liberec, Czech Republic. E-mail: ondrej.novak@vslib.cz
Abstract. In this paper we study different automata from the point of view of their test pattern decompression ability. Many test pattern decompression techniques have appeared recently, and it is difficult to decide which kind of automaton is better for pattern decompression. Unfortunately, test pattern decompression techniques are bound up with the algorithms for test pattern ordering and test data flow control. Some of the methods use a sophisticated sorting algorithm, some may be supported by fault simulation, and some work without any information about the quality of previously generated test patterns. Because of these differences the efficiency of the automata is not directly comparable, and the published results do not give any idea of the best choice of decompressing automaton. We have performed experiments in which we studied the degradation of random input stimuli by the automata. We have found that the medium probability of creating an arbitrary r-bit test vector is correlated with the decompression automaton efficiency. This statement has been verified by experiments with ISCAS circuits.
1. Introduction
A number of test pattern generation techniques have been proposed both for test-per-scan (TPS) and test-per-clock (TPC) BIST. Usually pseudorandom patterns are generated by linear feedback shift registers (LFSRs) and/or by other automata. As the pseudorandom patterns do not cover some random-pattern-resistant (hard) faults, mixed-mode techniques have been proposed. Mixed-mode methods like bit-flipping [5], bit-fixing [18] and the methods of [4] and [10] use the LFSR as a source of pseudorandom vectors, and the LFSR outputs are masked with the help of AND, OR and XOR gates. The modified output vectors detect the random-pattern-resistant faults after or during a limited number of LFSR clock cycles. These methods can be quite hardware consuming, because an extra decoding circuit has to be used for every random-pattern-resistant fault. Another possibility of mixed-mode testing is to store the patterns detecting the random-pattern-resistant faults in a memory and to load them through a scan chain to the circuit under test (CUT). A lot of effort was spent on reducing the memory requirements [8, 7, 13, 9, 12, 16]. In the mentioned methods the LFSR or other types of counters usually generate the pseudorandom patterns. The deterministic vectors have to detect the random-pattern-resistant faults and have to be computed by an ATPG tool. They are stored in a memory in a compressed form and transformed back into patterns with the help of the pattern decompression hardware. The decompressed patterns are shifted through the scan chain to the CUT inputs.
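Since LFSR-generated pseudorandom sequences are the starting point of all these methods, a minimal sketch of a Fibonacci LFSR used as a TPC pattern source may be helpful. The 4-bit width and the polynomial x^4 + x^3 + 1 are illustrative choices, not taken from this paper:

```python
def lfsr_patterns(seed, taps, n_patterns):
    """Generate pseudorandom test patterns from a Fibonacci LFSR.

    seed: list of stage bits; taps: stage indices XORed into the
    feedback bit.  In TPC BIST every clocked state is one pattern,
    the parallel stage outputs driving the CUT inputs directly.
    """
    state = list(seed)
    for _ in range(n_patterns):
        yield tuple(state)              # one new pattern every clock cycle
        fb = 0
        for t in taps:
            fb ^= state[t]              # XOR of the tap stages
        state = [fb] + state[:-1]       # shift; feedback enters stage 0

# 4-bit LFSR, primitive polynomial x^4 + x^3 + 1 (taps at stages 3 and 2):
pats = list(lfsr_patterns([1, 0, 0, 0], taps=[3, 2], n_patterns=15))
assert len(set(pats)) == 15             # maximum-length: all 15 nonzero states
```

Because the polynomial is primitive, the register walks through every nonzero state before repeating, which is what makes the LFSR a usable pseudorandom source.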
Another possibility of mixed-mode testing is to seed the automaton with the first test pattern and to generate the automaton sequence. The generated sequence can be deflected at regular instants by adding a modification input bit to the automaton state [15] or by reseeding all the automaton stages [12], so that patterns covering the hard-to-test faults are generated.
It is possible to use a shift register without any feedback instead of a decompression automaton (DA) with feedback taps [3, 15]. It was shown that even such a simple decompressing automaton can be relatively effective if a sophisticated input sequence is used. As this fact is not very well known, we demonstrate it in Tab. 1, where memory requirements for several ISCAS benchmark circuits [15] are compared with minimized test lengths generated by present-day ATPG tools [6]. Another possibility is to use a multiple-polynomial (MP) LFSR as a DA [7]. This method provides high decompression efficiency, but the hardware overhead is relatively high. Johnson (twisted-ring) counters were used in [9]. This automaton is very effective, because it has better decoding ability than a shift register and it is less hardware consuming than the LFSR. A programmable Johnson counter called a folded counter was used in [8]. Another possibility is to use a specialized decoder. This approach was used in [11], where a Golomb code decoder was designed.
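The trade-off behind the Johnson counter is easy to see in simulation: an n-stage twisted-ring counter cycles through only 2n of the 2^n possible states, which places its decoding ability between a plain shift register and an LFSR while keeping the hardware cheap. A sketch (the stage count is illustrative):

```python
def johnson_cycle(n):
    """Enumerate one full period of an n-stage Johnson (twisted-ring)
    counter: a shift register whose inverted last stage feeds the first."""
    state = [0] * n
    states = []
    for _ in range(2 * n):                      # the counter has period 2n
        states.append(tuple(state))
        state = [1 - state[-1]] + state[:-1]    # invert feedback, then shift
    return states

cycle = johnson_cycle(4)
assert len(set(cycle)) == 8                     # only 2n of the 2**n states occur
```

The restricted state set is exactly why the paper later observes coverage gaps when the interval between injected bits is a multiple of the counter length.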
Table 1. Number of stored bits for a simple scan chain decompression [15] compared with the number of bits in Mintest [6]

Circuit    Stored bits    Scan chain/Mintest
c7552      7354           0.49
s953       715            0.21
s1196      800            0.22
s1238      832            0.21
s1423      659            0.33
s1488      439            0.31
s5378      2147           0.10
s9234      8960           0.35
s13207     4553           0.03
s15850     5158           0.09
In this paper we discuss the efficiency of different DAs. Based on the results of several series of experiments, we show that the efficiency of decompression can be estimated by evaluating the percentage of the (s,r) exhaustive test set created with the help of random bits fed to the DA input.
2. Arrangement of the experiments
We suppose that the DA generates test vectors in the TPC mode, i.e. one new test pattern every clock cycle. It would be possible to consider also the TPS mode, but as the chosen mode does not impact the relative effectiveness of pattern decompression, we constrained our experiments to the TPC mode only. The automaton overlaps the scan chain, i.e. the automaton outputs are directly connected with the CUT inputs. If the automaton has fewer stages than the number of CUT inputs, we complete the automaton with scan chain bits. We used the average percentage of exhaustive sets created on r-tuples (1<r<rmax) within n inputs as a measure of the decompression efficiency. The DA is fed with uniformly distributed random bits at regular instants. This experimental arrangement can indicate whether any part of the test vectors is difficult to obtain from the DA. In order to keep comparability between different types of automata, we have compared the percentage of exhaustive test sets at time instants in which the total number of "consumed" random input bits is the same for every automaton. We have performed the following experiments:
- Comparison of the percentage of created (s,r) exhaustive test set depending on the number of clock cycles performed between feeding a new input random bit into the automaton. The number of automaton stages is fixed.
- Comparison of the percentage of the created (s,r) exhaustive test set depending on the automaton dimensions. The number of clock cycles performed between feeding a new input random bit is fixed.

R&I, 2003, № 3

[Fig. 1 panels: scan chain; binary counter; LFSR with reseeding; LFSR; LFSR with output modification; CA; CA with primitive polynomial; MP LFSR; Johnson counter]

Fig. 1. Medium percentage of an (17,8) exhaustive test set for different automata. The exhaustive test set percentage is checked after performing 16 (the lowest curve), 32, 64, 128, 256, 512, 1024, 2048 and 4096 (the highest curve) clock cycles. The number of clock cycles performed after each new input bit feeding is a parameter on the x-axis of the graph
- Comparison of the relative part of the (s,r) exhaustive test set obtained by the DA for different values of r.
- Evaluation of the number of detected faults on ISCAS circuits.
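The core measurement above can be sketched in a few lines. The helper below runs an LFSR-based DA, XORs one random input bit into the feedback every N clock cycles, and reports the fraction of all 2^r bit combinations observed on a chosen r-tuple of outputs. The function name, the XOR-injection style of sequence modification and the tap set (corresponding to the primitive trinomial x^11 + x^2 + 1) are our assumptions for illustration, not the paper's exact circuits:

```python
import random

def exhaustive_fraction(n_stages, taps, r_positions, cycles, feed_every, rng):
    """Fraction of all 2**r combinations seen on the chosen r-tuple of
    stages while the LFSR runs, with one random bit XORed into the
    feedback every `feed_every` clock cycles (sequence deflection)."""
    state = [rng.randint(0, 1) for _ in range(n_stages)]
    seen = set()
    for c in range(cycles):
        seen.add(tuple(state[i] for i in r_positions))
        fb = 0
        for t in taps:
            fb ^= state[t]                      # LFSR feedback
        if c % feed_every == 0:
            fb ^= rng.randint(0, 1)             # deflect with a random input bit
        state = [fb] + state[:-1]
    return len(seen) / 2 ** len(r_positions)

rng = random.Random(1)
frac = exhaustive_fraction(11, taps=(10, 1), r_positions=(0, 2, 3, 4, 5, 6, 8, 10),
                           cycles=4096, feed_every=13, rng=rng)
assert 0.0 < frac <= 1.0
```

Averaging such fractions over many r-tuples, and checking them at instants where every automaton has consumed the same number of random bits, gives the comparison measure used in the experiments.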
The experimental results do not provide any information about the number of bits that have to be stored in a memory in real decompression schemes, because they are based on random DA inputs and not on compressed test patterns.
3. Experimental results
We have found that if we choose the parameter r smaller than the automaton dimension and the parameter n several times greater than the automaton dimension, the results are representative and similar for different experiments. In other words, the relative DA effectiveness can be verified by one representative experiment; variations of the parameters in the above mentioned range do not impact the final decompression effectiveness.
3.1. Number of performed clock cycles
In this experiment the mathematical model with 17 inputs was fed with a decompressing automaton output sequence. At the beginning the automaton was set to a randomly chosen state. A given number of clock cycles was performed and a new input bit was loaded into the first automaton flip-flop. This sequence was repeated until a given number of input bits was consumed for the sequence modification. We checked the percentage of the (17,8) exhaustive test set on the model inputs at logarithmically distributed instants.
The simplest automaton which can be used for test pattern decompression is a scan chain. A 17-bit scan chain is directly fed from the random bit source and the scan chain flip-flops are connected with the CUT model inputs. In the first graph plotted in Fig. 1 the ability of the scan chain to decompress the input sequence into test patterns is demonstrated. The graph shows the average percentage of an exhaustive test set created on all 8-tuples of scan chain flip-flop outputs during shifting of randomly generated input bits through the scan chain. The obtained exhaustive test set percentage is checked after performing 16 (the lowest curve), 32, 64, 128, 256, 512, 1024, 2048 and 4096 (the highest curve) clock cycles. The number of clock cycles performed after each new input bit feeding is a parameter on the x-axis of the graph.

In the next graph we have simulated the decompression ability of a 17-bit binary counter. The number of counter clock cycles before loading a new 17-bit seed of the counter is 17 times higher than in the previous case, because all 17 flip-flops of the counter have to be reseeded simultaneously. This arrangement enables us to compare the percentage of the exhaustive test set at instants in which the same number of bits from a memory is "consumed" for automaton state modification.

In the next graph we have plotted the decompression ability of an 11-bit LFSR with a primitive characteristic polynomial followed by a 6-bit scan chain. (This arrangement guarantees that the total number of outputs on which we check the exhaustive test set existence is equal to 17, as in the previous case.) The generated test sequence is modified by reseeding all internal flip-flops. The number of clock cycles before loading a new 11-bit seed of the LFSR is 11 times higher than in the first graph, because all 11 flip-flops of the LFSR are reseeded simultaneously.

In the next graph we have used an 11-bit LFSR with a primitive characteristic polynomial followed by a 6-bit scan chain as in the previous case. The test sequence is modified by adding (mod 2) one bit only. In the next graph the 11-bit LFSR sequence is modified on the LFSR output: the 17-bit patterns are obtained by shifting the LFSR content through a XOR gate controlled by the random modification bits.

In the next graph we have plotted the decompression ability of an 11-bit cyclic cellular automaton with rule 60 implemented in all the cells, followed by a 6-bit scan chain. In the next graph we have plotted the decompression ability of an 11-bit cyclic cellular automaton with rules 90 and 60 implemented in such a way that a maximum output sequence length is obtained. In the next graph we have compared an 11-bit MP LFSR with regular switching between 4 different feedback schemes corresponding to 4 different primitive polynomials. The graph shows the average percentage of an exhaustive test set in the same way as in the previous experiments. In the last graph we have used an 11-bit Johnson counter followed by the same scan chain as in the previous case.

[Fig. 2 panels: LFSRs with different lengths; LFSRs with reseeding; MP LFSRs]

Fig. 2. Medium percentage of the (17,8) exhaustive test set generated by LFSRs, LFSRs with reseeding of all internal bits, Johnson counters, cellular automata and MP LFSRs with different lengths. The number of clock cycles between feeding new input bits is equal to 13

[Fig. 3 panels: 5-tuples; 8-tuples; 9-tuples; 10-tuples]

Fig. 3. Relative percentage of the (17,5), (17,8), (17,9) and (17,10) exhaustive test sets generated by the 11-bit LFSR and the 11-bit CA divided by the relative percentage of this test set generated by the same number of randomly chosen 17-bit vectors. N=13. The percentages were checked after 2^b clock cycles; b is a parameter given on the x-axis. The LFSR curves are marked with squares; the CA curves are marked with triangles

[Fig. 4 panels: LFSR and LFSR with output modification fault coverage; LFSR and binary counter fault coverage]

Fig. 4. Relative fault coverage of the S953 circuit with different 11-bit DAs (marked with a square) and with the 11-bit LFSR
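The cyclic cellular automata mentioned above update all cells synchronously from their neighbours. A minimal sketch of one step, assuming the usual definitions (rule 60: next bit = left XOR self; rule 90: next bit = left XOR right) with cyclic boundaries; the particular 90/60 mix below is illustrative, not the paper's maximum-length configuration:

```python
def ca_step(state, rules):
    """One synchronous step of a one-dimensional cyclic cellular automaton.
    rules[i] is 60 or 90: rule 60 sets cell i to left XOR self,
    rule 90 to left XOR right (neighbours taken cyclically)."""
    n = len(state)
    nxt = []
    for i in range(n):
        left, right = state[i - 1], state[(i + 1) % n]
        nxt.append(left ^ state[i] if rules[i] == 60 else left ^ right)
    return nxt

state = [1] + [0] * 10            # 11 cells, nonzero start
rules = [90, 60] * 5 + [60]       # illustrative per-cell rule mix
trace = []
for _ in range(100):
    trace.append(tuple(state))
    state = ca_step(state, rules)
assert trace[0] != trace[1]       # the automaton does not stall
```

Whether a given rule assignment yields a maximum-length state sequence depends on the automaton length and must be checked per configuration.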
We can see that the highest percentage of the (n,r) exhaustive test set is obtained for the CA with a primitive polynomial, followed by the MP LFSR. In the case of the Johnson counter there are gaps in the percentage for those numbers of clock cycles between feeding a new bit which are equal to any multiple of the Johnson counter length.
3.2. Influence of automaton dimension
We arranged another set of experiments in order to get an idea of the necessary number of automaton flip-flops which still keeps the percentage of the generated (17,8) exhaustive test set at a sufficient level.
In Fig. 2 we show the results of these experiments. We have varied the automaton length from 0 to 15. Simultaneously we have concatenated the automaton with a scan chain with a length varying from 17 to 2. We have fixed the number N of clock cycles performed before each input bit change to 13.
These experiments have shown that if the DA dimension is higher than some minimal value, the quality of the test patterns is not decreased.
3.3. Influence of the CUT cones size
In this section we present the results of the decompression efficiency comparison for 5-tuples, 8-tuples, 9-tuples and 10-tuples of the LFSR and the cyclic CA (rule 60) output sequences. In the graphs in Fig. 3 we have plotted the relative numbers obtained by dividing the average percentage of the obtained exhaustive test set by the percentage of the exhaustive test set which can be obtained with a set of the same number of random vectors. The number of flip-flops of the CA and the LFSR was equal to 11 in all experiments. The number of clock cycles between loading a new randomly chosen input bit was equal to 13. The graphs tell us whether the exhaustive test set percentage obtained from the automata is greater or smaller than the one formed by vectors with randomly distributed bit values. We can also quantify the degradation of the input random sequence by the decompressing automata. We have performed other experiments with different automaton dimensions and different r-tuples. The degradation of the input random sequence was similar to the plotted examples.

[Fig. 5 panel: LFSR and scan chain fault coverage; legend: 16, 256, 1024 and 4096 test patterns]

Fig. 5. 11-bit LFSR and 11-bit Johnson counter fault coverage of the ISCAS S953 circuit
We have concluded that for r-tuples whose sizes are close to the automaton dimension the degradation is more significant. This means that the DA can be effectively used only for decompressing patterns with a relatively low portion of care bits.
3.4. Experiments with ISCAS circuits
We have performed several series of experiments with the ISCAS circuits [19]. The experiments were done similarly to the above described experiments, with the only difference of using the percentage of detected faults instead of the medium percentage of the (n,r) exhaustive test set. We have experimented with the circuits C7552, S1238 and S953. We have ordered the tested DAs according to the medium fault coverage during the experiments with a constant number of performed test patterns (16) and a variable number of clock cycles between feeding a new random input bit (1-15). The best results were obtained for the CA with maximum period, then for the MP LFSR, LFSR, CA, Johnson counter, scan chain and binary counter. These results are correlated with the results of the experiments with the mathematical model. Some of the results are shown in Figs. 5 and 6. In Fig. 5 we have plotted the relative fault coverage of the LFSR with output modification, the scan chain and the binary counter after decompressing 16 test patterns while changing the number of performed clock cycles between loading a new random input bit from 1 to 15. In Fig. 6 we can see that for 16 generated test patterns the Johnson counter has similar fault coverage to the LFSR, for 256 and 1024 test patterns the fault coverage is substantially lower, and for 4096 patterns it is comparable with the LFSR. (We do not consider the numbers of clock cycles between feeding a new bit which are equal to any multiple of the Johnson counter length.)
4. Conclusions
We have verified that checking the relative percentage of the (n,r) exhaustive test sets obtained from DAs excited with regularly distributed random bits gives us a general idea of the decompression quality of the DAs.
According to the experimental results given in the graphs, we can choose which DA could be effectively used for the intended compression parameters. We can estimate the necessary DA length and we can choose the maximum number of autonomously generated test patterns for which the degradation of the pattern quality is still not critical.
The high correlation between the fault coverage obtained on the CUT and the percentage of the created (n,r) exhaustive test set makes it possible to choose an appropriate DA without detailed knowledge of the CUT structure.
The experiments have proved that a scan chain has the same pattern decompression effectiveness as other, more complicated automata in the case of using a new input bit every clock cycle. As it is the simplest DA solution, it can be used to advantage.
The second simplest solution is the Johnson counter. The experiments have shown that for a limited number of clock cycles between feeding a new input bit it can also be used for decompression very effectively. For maximum memory savings (maximum numbers of autonomously performed cycles) we have to use more complicated DA hardware structures in order to avoid pattern quality degradation. The experiments have shown that CAs with primitive polynomials are very suitable for this purpose.
Acknowledgment
The research was supported by research grant GACR 102/01/0566 of the Czech Grant Agency.
References: 1. Murray B.T., Hayes J.P. Testing ICs: Getting to the Core of the Problem. IEEE Computer. Vol. 29. P. 32-38, November 1996. 2. Brglez F., Bryan D., Kozminski K. Combinational Profiles of Sequential Benchmark Circuits. Proc. of Int. Symp. on Circuits and Systems, 1989. PP. 1929-1934. 3. Daehn W., Mucha J. Hardware Test Pattern Generation for Built-in Testing. Proc. of IEEE Test Conference, 1981. PP. 110-113. 4. Fagot C., Gascuel O., Girard P., Landrault C. A Ring Architecture Strategy for BIST Test Pattern Generation. Proc. of 8th ATS Symp., 1998. 5. Wunderlich H.-J., Kiefer G. Bit-Flipping BIST. Proc. of ACM/IEEE Inter. Conf. on CAD-96 (ICCAD96), San Jose, California, November 1996. PP. 337-343. 6. Hamzaoglu I., Patel J.H. Test Set Compaction Algorithms for Combinational Circuits. Proc. of International Conf. on Computer-Aided Design, November 1998. 7. Hellebrand S., Rajski J., Tarnick S., Venkataraman S., Courtois B. Built-In Test for Circuits with Scan Based on Reseeding of Multiple-Polynomial Linear Feedback Shift Registers. IEEE Trans. on Comp. Vol. 44, No. 2, February 1995. PP. 223-233. 8. Hellebrand S., Liang H.G., Wunderlich H.-J. A Mixed Mode BIST Scheme Based on Reseeding of Folding Counters. Proc. of IEEE ITC, 2000. 9. Chakrabarty K., Murray B.T., Iyengar V. Built-in Test Pattern Generation for High-Performance Circuits Using Twisted-Ring Counters. Proc. of IEEE VLSI Test Symp., 1999. 10. Chatterjee M., Pradhan D.K. A Novel Pattern Generator for Near-Perfect Fault Coverage. Proc. of IEEE VLSI Test Symp., 1995. 11. Chandra A., Chakrabarty K. Test Data Compression for System-on-a-Chip Using Golomb Codes. Proc. of VTS 2000, 0-7695-0613-5/00. 12. Kalligeros E., Kavousianos X., Bakalis D., Nikolos D. A New Reseeding Technique for LFSR-based Test Pattern Generation. Proc. of 7th IOLTW On-Line Testing Workshop, 2001. PP. 80-86. 13. Koenemann B. LFSR-Coded Test Patterns for Scan Designs. Proc. Europ. Test Conf., Munich, Germany, 1991. PP. 237-242. 14. Novak O., Hlawiczka A., Garbolino T., Guczwa K., Pliva Z., Nosek J. Low Hardware Overhead Deterministic Logic BIST with Zero-Aliasing Compactor. Proc. IEEE DDECS Conf., Gyor, Hungary, 2001. PP. 29-35. 15. Novak O., Nosek J. Test Pattern Decompression Using a Scan Chain. Proc. of the 2001 IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems, 24-26 October 2001, San Francisco, California. P. 110-115. 16. Novak O. Pseudorandom, Weighted Random and Pseudoexhaustive Test Patterns Generated in Universal Cellular Automata. Springer: Lecture Notes in Computer Science 1667, Sept. 1999. PP. 303-320. 17. Garbolino T., Hlawiczka A., Kristof A. Easy Integration of Test Pattern Generators Based on T-Type Flip-Flops into the Scan Path. Proc. of European Test Workshop (ETW'00), Cascais, Portugal, May 2000. PP. 161-166. 18. Touba N.A., McCluskey E.J. Synthesis of Mapping Logic for Generating Transformed Pseudo-Random Patterns for BIST. Proc. of Inter. Test Conf., 1995. PP. 674-682. 19. Brglez F., Bryan D., Kozminski K. Combinational Profiles of Sequential Benchmark Circuits. Proc. IEEE International Symposium on Circuits and Systems (ISCAS'89), Portland, May 1989.