Reliability Testing and Failure Analysis: Lessons Learned
Delegates came from as far afield as Finland, Lithuania, and France to rainy Woking, in Surrey, UK, to attend the first in a series of reliability testing and failure analysis workshops organised jointly by EIPC and IPC. Engineers from process suppliers, PCB fabricators, EMS providers, and OEMs were keen to build their knowledge and understanding, to learn about material selection, current electronics research and failure analysis case studies, to gain an insight into design considerations for advanced assembly processes and modern analytical techniques for materials characterisation, and to discuss and share their own practical experiences.
Workshop leader and presenter was Dr. Martin Anselm, manager of Universal Instruments' Advanced Research in Electronics Assembly (AREA) Consortium, which conducts research into materials, processes and reliability for over 30 member companies, all big names in the electronics assembly industry.
He commented that because electronics OEMs and CEMs were under pressure to adapt quickly to new market trends and technologies, the drive to meet these expectations often resulted in a less-than-thorough understanding of new components, materials and processes, and an inability to implement credible reliability testing and validation procedures. In his opinion, the term “reliability” was frequently misused, since many tests were evaluations based upon internal, customer-driven, or industry-accepted standards which did not always provide meaningful pass/fail criteria and as a result were often no more than best guesses for the reliability of products in the field.
The theme of the first workshop session was the cost of failure, with specific case studies and a review of testing methods. Dr. Anselm set the scene by listing the areas in an EMS environment where failure could originate: process, component, PCB, and stress. He commented that people were not generally interested in hearing the actual facts and were more inclined to pass the blame up the supply chain: “It’s the supplier’s fault!”
There were many opportunities to reduce cost in the product life cycle, to streamline materials selection, to achieve continuous process improvement, to enhance yields, to optimise product reliability, and to modernise testing procedures. Total life cycle costs were largely determined at the design stage, and traditionally OEMs had spent 75% of their development costs on test-fail-fix activities, so it paid to focus on the front end of the product life cycle to reduce cost and time during concept and development.
Reviewing tools and techniques for failure analysis, Dr. Anselm took the phenomenon of CAF--conductive anodic filamentation along glass fibres in PCB laminates--as his first case study. CAF could occur trace-to-trace, hole-to-trace, hole-to-hole or trace-to-hole, and a CAF fault could be difficult to locate, particularly since it could act as a fuse once it had grown to reach the cathodic conductor and hence show a cyclic intermittent short circuit. Electrical testing could indicate an approximate location, then skilful microsectioning and microscopy could reveal the actual CAF defect, which might be in a specific glass-fabric layer in a construction and be batch-related to glass surface treatment or incomplete resin penetration far back along the supply chain--in effect “the supplier’s supplier’s supplier.” But once the problem was isolated and identified, there was an opportunity to work towards eliminating it in the future, and to determine whether other products traceable to a particular batch might show the same problem. “You will never answer the question unless you get to this level of detail!”
Dr. Anselm’s second glass-related example was “wicking”--penetration of plating chemistries along glass yarns resulting from fracturing of the glass in the PCB drilling process. He showed many examples of plated-through-hole defects and invited delegates to suggest how they had originated. Were they a consequence of drilling parameters, tool condition, stack height, backing board? Or were they a consequence of particular features in the board design--were via holes pitched too close together? Did they relate to non-functional pads? There were plenty of questions to ask the PCB supplier.
Moving on, he discussed instrumental analysis techniques, how and where they could be employed, what sort of data could be collected and how this could be used to demonstrate the origin of defects and failures. A case example he used was a problem with high-temperature lead-indium solder investigated by differential scanning calorimetry, which showed unexpected extra melting and solidification points at lower-than-nominal temperatures. Further investigation showed the effect to be caused by tin contamination. There should have been no tin in the system but a component buyer, eager to save money, had purchased tin-finish capacitors instead of the specified gold-finish and unknowingly initiated a serious reliability problem.
Ionic contamination of PCB assemblies was an ongoing area of concern, and ROSE-type tests gave limited information on the constituents of the contamination. Ion chromatography was a technique for separating and analysing complex mixtures of anions and cations so that individual species could be identified and their likely origin determined. Ion chromatography was very useful in investigating cleaning issues with low-standoff components and could be used on individual components if necessary.
The wetting balance, used in accordance with IPC J-STD-003, Solderability Tests for Printed Boards, continued to be an invaluable measuring tool for determination and investigation of soldering problems on PCBs and components.
Dr. Anselm turned the attention of the workshop to a discussion of the various microscopy techniques available and their applicability in defect investigation and failure analysis. Scanning acoustic microscopy was becoming a popular non-destructive analysis method, using ultrasound to detect changes in acoustic impedance in components and packages. Pulses of different frequencies were transmitted to the package through a fluid, and the intensity of reflections recorded and analysed to detect air-gaps, voids and delamination and to allow different interfaces to be examined and their mechanical integrity to be determined. Higher frequencies gave good resolution but low penetration and were suitable for examining flip-chips, whereas lower frequencies gave higher penetration at the expense of resolution and were more suited to BGA examination. White light interferometry was a useful technique for measuring the coplanarity of flip-chip bumps.
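The trade-off between resolution and penetration follows from the acoustic wavelength, which shrinks as frequency rises, while attenuation in the sample grows. As a rough illustration only, the sketch below computes the wavelength in the coupling fluid from wavelength = speed / frequency. The fluid is assumed to be water with a sound speed of about 1480 m/s, and the two transducer frequencies are illustrative choices, not values quoted at the workshop.

```python
# Wavelength of ultrasound in the coupling fluid: wavelength = speed / frequency.
# Assumption: the fluid is water near room temperature (about 1480 m/s).
SPEED_OF_SOUND_WATER_M_S = 1480.0


def wavelength_um(frequency_mhz: float,
                  speed_m_s: float = SPEED_OF_SOUND_WATER_M_S) -> float:
    """Return the acoustic wavelength in micrometres for a frequency in MHz."""
    frequency_hz = frequency_mhz * 1e6
    return speed_m_s / frequency_hz * 1e6


if __name__ == "__main__":
    # Hypothetical low- and high-frequency transducers for comparison.
    for f_mhz in (15.0, 230.0):
        print(f"{f_mhz:6.1f} MHz -> wavelength of about {wavelength_um(f_mhz):.1f} um")
```

The shorter wavelength at the higher frequency is what allows finer features to be resolved; the accompanying loss of penetration is not modelled here.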
Scanning electron microscopy (SEM) was very widely used (and in some cases misused) for studying surface features and, in conjunction with energy dispersive X-ray spectroscopy (EDS), for providing rapid analysis of elemental composition. Dr. Anselm explained the principles of SEM and the differences between secondary electrons, backscattered electrons, and X-rays. Secondary electrons were emitted from atoms at the surface and produced an image of surface topography but could give confusing results in the interpretation of intermetallic areas if used on polished samples. Backscattered electrons were primary beam electrons reflected from atoms in the solid, and contrast in the image was determined by atomic number so it could show the distribution of different elements. Interaction of the primary beam with atoms in the sample caused the emission of X-rays whose energy was characteristic of the parent element. EDS worked by detecting and measuring X-ray energies to provide a means of elemental analysis.
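The idea behind EDS peak identification can be sketched as a simple lookup of measured peak energies against characteristic line energies. The example below is a minimal illustration, not how any particular EDS package works: the line energies are rounded reference values for a few elements relevant to solder joints, and the matching tolerance and the measured peaks are hypothetical.

```python
# Approximate characteristic X-ray line energies in keV (rounded reference values).
CHARACTERISTIC_LINES_KEV = {
    "Ni K-alpha": 7.48,
    "Cu K-alpha": 8.05,
    "Sn L-alpha": 3.44,
    "Au M-alpha": 2.12,
}


def identify_peak(energy_kev: float, tolerance_kev: float = 0.10) -> list:
    """Return the labels of all lines within the tolerance of a measured peak.

    The tolerance is a hypothetical matching window chosen for illustration.
    """
    return sorted(
        label
        for label, line_kev in CHARACTERISTIC_LINES_KEV.items()
        if abs(line_kev - energy_kev) <= tolerance_kev
    )


if __name__ == "__main__":
    measured_peaks_kev = [3.45, 7.47, 8.04]  # hypothetical spectrum peaks
    for peak in measured_peaks_kev:
        print(f"{peak:.2f} keV -> {identify_peak(peak) or 'unidentified'}")
```

Real spectra contain overlapping lines and background, so a lookup like this only conveys the principle that each peak energy points back to a parent element.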
An interesting case example involved the investigation of brittle failure of BGA joints in a laptop computer. The BGA finish was nickel-gold and the PCB finish copper-OSP, with tin-lead solder. But counter-intuitively, the brittle failure occurred at the component, rather than the PCB interface, and the incidence of failure appeared related to time above liquidus in the reflow operation. EDS analysis revealed a nickel-copper intermetallic at the component interface. Where had the copper come from? It had dissolved in the solder at the PCB surface; a longer time above liquidus had allowed it to diffuse into the bulk solder and then precipitate as a brittle intermetallic at the component interface. The problem was overcome by reducing the time above liquidus, using an adhesive to improve mechanical compliance, changing the PCB finish to nickel-gold, and asking the component supplier to consider changing its under-bump metallisation from nickel-gold to copper--for reliability rather than cost purposes.
However elegant and powerful the scanning electron microscope might be, the optical microscope remained one of the most valuable general-purpose inspection tools available, especially when used with an intelligent choice of bright-field, dark-field or cross-polarised illumination to reveal different features and structures. Dr. Anselm stressed the critical importance of proper sample preparation, declaring that microsectioning and cross-sectioning was an art-form, although simple precautions like thorough cleaning between successive grinding and polishing stages could help enormously in achieving good-quality sections. Micro-etching of copper-based structures with peroxide-ammonia revealed different grain structures and intermetallic boundaries. Brute-force techniques such as dye penetration testing offered a quick way of checking which joints had failed under a BGA, or which pads had cratered, as an alternative to laboriously microsectioning every row of interconnections.
To round off the morning workshop session, Dr. Anselm invited suggestions from the audience as to the root cause of a cracked capacitor on a PCB assembly. The customer had supplied a couple of badly focused photographs of the component in question and wanted to know what had happened and whether it had been a defective component. Typically, the customer did not tell the full story and it was clear that without further and better information no conclusion could be reached. Some help was at hand from a component supplier’s white paper on failure mode classification, which gave examples of defects and their causes; the shape and position of the crack suggested that it had been induced by thermal stress. It was not until the actual assembly was made available for examination that it could be demonstrated that the failure was a result of clumsy rework of an adjacent component. “What do you tell the customer?”
The afternoon workshop focused on a discussion of the reliability of lead-free solders, surface finishes, and lead-free-compatible laminates and Dr. Anselm began by reviewing the consequences of RoHS on PCB assembly, with particular reference to solder alloys, PCB processing temperatures, and PCB surface finishes.
Because RoHS restricted the use of lead except in some specific high-reliability applications, a wide range of lead-free solders had become commercially available, of which the most popular were tin-silver-copper (SAC) alloys: SAC105 containing 1% silver and 0.5% copper, SAC305 containing 3% silver and 0.5% copper, and SAC405 containing 4% silver and 0.5% copper. Apart from having higher melting points than eutectic tin-lead solder, SAC alloys showed significant differences in microstructure. A characteristic of tin was its anisotropic tetragonal crystal structure, which resulted in large differences in mechanical properties, depending on the lattice direction. Tin-lead alloys tended to have multiple small grains, which averaged out these differences. But SAC alloys generally crystallised as a few large grains which, depending on their orientation, could have significant effects on reliability.
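The SAC designations encode the nominal composition: the first two digits give the silver content in tenths of a percent and the last digit the copper content in tenths of a percent, with tin making up the balance. The sketch below decodes a designation on that basis; the function name is hypothetical, and it assumes a plain three-digit code with no minor alloying additions.

```python
def decode_sac(name: str) -> dict:
    """Decode a three-digit SAC designation into nominal weight percentages.

    Assumes the form 'SACxyz', where 'xy' is silver in tenths of a percent
    and 'z' is copper in tenths of a percent; tin is taken as the balance.
    """
    label = name.upper().replace(" ", "")
    if not (label.startswith("SAC") and len(label) == 6 and label[3:].isdigit()):
        raise ValueError(f"not a three-digit SAC designation: {name!r}")
    silver = int(label[3:5]) / 10.0
    copper = int(label[5]) / 10.0
    tin = round(100.0 - silver - copper, 1)
    return {"Sn": tin, "Ag": silver, "Cu": copper}


if __name__ == "__main__":
    for alloy in ("SAC105", "SAC305", "SAC405"):
        print(alloy, decode_sac(alloy))
```

For the three alloys named above this reproduces the stated silver and copper contents, with tin at 98.5%, 96.5% and 95.5% respectively.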
In general, SAC alloys were stronger than tin-lead, and their ductility varied with silver content--SAC105 was more ductile than SAC405--but there was not a direct correlation between strength and field reliability; it depended on the conditions. For example, SAC405 gave better results on thermal cycling whereas SAC105 gave better results on drop-shock.
So how to predict reliability? The goal of thermal cycle testing was to accelerate failures with much shorter cycle times than would be seen in service, either to predict life in service or at least to achieve the same rankings as life in service. Thermal expansion mismatch between component and PCB caused stresses in solder joints and the amount of thermal fatigue per cycle was related to the time of the applied stress and the stress relaxation properties of the joint. The performance of lead-free alloys in thermal cycle testing was more significantly dependent on dwell time than that of tin-lead. SAC alloys appeared better at shorter dwell times, whereas when dwell times were longer, and hence more representative of life in service, tin-lead appeared better. Clearly, alloy selection was critical and Dr. Anselm stressed the importance of making intelligent choices of test conditions and careful interpretation of results in attempting to predict reliability. Moreover, because of crystal orientation effects, there tended to be more scatter on lead-free results, so that failure was less predictable. Drop-shock was a separate issue, and tin-lead had been observed to give better reliability than any of the lead-free solders tested, although there were significant variances between different alloys.
“Mixed alloys” was another issue, particularly in “backwards-compatibility” circumstances where lead-free components had been soldered with tin-lead paste. Reflow temperatures could be kept lower than for all-lead-free to minimise thermal damage to components, but if reflow temperature was too low or paste volume was too small, head-on-pillow defects could result and/or thermal cycling life could be reduced.
The overarching message was to choose the solder alloy carefully to fit the application, particularly to decide whether thermal cycling or drop shock resistance was the priority. Solder considerations aside, mechanical reliability could be greatly enhanced by the use of full underfill or even simple corner bonds.
The higher melting points of lead-free alloys, for example 217 °C for SAC305 compared with 183 °C for 63:37 tin-lead, had driven the development of more thermally resistant PCB laminates. These materials already had a whole list of requirements to satisfy: the ability to withstand processing heats and chemicals, the mechanical strength to support components, manufacturability in drilling and machining operations, low thermal expansion, low dielectric constant, plate-ability of the dielectric material, and resistance to the absorption of excessive moisture. With lead-free they also had to withstand more thermally aggressive assembly conditions.
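To put the temperature figures side by side, the short sketch below works out the gap between the two melting points and the margin by which a reflow peak exceeds the SAC305 melting point. The melting points are those quoted above and the peak temperatures are those reported for the reflow experiments (240, 245 and 260 °C); the alloy labels and function name are illustrative, and the calculation is plain arithmetic rather than a process recommendation.

```python
# Melting points in degrees Celsius, as quoted in the report.
MELTING_POINT_C = {"SAC305": 217.0, "Sn63Pb37": 183.0}

# Reflow peak temperatures in degrees Celsius, as reported for the experiments.
PEAK_TEMPERATURES_C = (240.0, 245.0, 260.0)


def margin_above_melting_c(alloy: str, peak_c: float) -> float:
    """Return how far a reflow peak sits above the alloy's melting point."""
    return peak_c - MELTING_POINT_C[alloy]


if __name__ == "__main__":
    gap = MELTING_POINT_C["SAC305"] - MELTING_POINT_C["Sn63Pb37"]
    print(f"SAC305 melts {gap:.0f} degrees C higher than 63:37 tin-lead")
    for peak in PEAK_TEMPERATURES_C:
        margin = margin_above_melting_c("SAC305", peak)
        print(f"peak {peak:.0f} C -> {margin:.0f} degrees C above SAC305 melting point")
```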
Dr. Anselm described experimental apparatus for monitoring PCB samples under a range of reflow profiles and temperatures, typically 240 to 245 °C but maybe as high as 260 °C. It was realistic to expect that in the worst case a double-sided assembly that had been reworked could have experienced up to nine reflow cycles. He showed a whole catalogue of PCB failures after reflow: laminate colour changes indicating material degradation, examples of solder mask damage, surface blistering, many examples of delamination of core layers, prepreg layers and bond interfaces; also examples of inner layer damage, outer layer pad lifting, barrel cracking, and corner cracking. But it was not all bad news--other examples showed no deterioration after nine reflow cycles. Even if the PCB survived the thermal stresses of assembly, the robustness of the different available laminates varied greatly. Pad cratering had become an issue since the introduction of lead-free: alloys were stiffer and stronger and out-of-plane stresses were higher, while high-temperature laminates tended to be more brittle. Another issue was copper dissolution, which was much higher in lead-free than tin-lead alloys, such that the thickness of surface-mount pads could be seriously reduced during soldering or rework operations.
Lead-free solderable finishes for PCBs included immersion tin, immersion silver, electroless nickel immersion gold (ENIG), copper-OSP, lead-free HASL and electroless nickel electroless palladium immersion gold (ENEPIG). Each had its own benefits and reliability risks. Dr. Anselm reviewed whiskering with immersion tin, champagne voiding and silver sulphide whiskering with immersion silver, “black-pad” issues with ENIG and dissolution of copper pads with HASL, illustrated with many real-life examples.
Dr. Anselm’s final topic was a review of recent research aimed at developing a meaningful test for solder joints that allowed the study of intermetallic structures within the joint following component attachment, to investigate the cause of premature failures when BGA components with nickel-gold surface finish were soldered to PCBs with copper-based pads. Whether the finish on the copper was OSP, immersion silver, immersion tin or HASL, some copper would be dissolved into the solder joint, and this could have an important effect on the intermetallic structure at the nickel-gold component pad which was not accurately replicated by current mechanical solder ball testing procedures.
He explained how the presence of copper affected the rate of nickel dissolution in lead-free solder, and the subsequent formation of copper-nickel-tin intermetallics at the component interface. Two ball-attach cycles had been used, with short and long times above liquidus, then hot-bump-pull testing and cold-bump-pull testing had been compared following addition of copper as powder in a second reflow. Intermetallic structures following hot-bump-pull testing were largely identical, and the process appeared to have eliminated the intermetallic variations observed during ball attach. Cold-bump-pull testing showed varying intermetallic behaviour in both load-to-failure distribution and failure mode, and intermetallic morphologies following second reflow with copper powder were vastly different to those produced in hot-bump-pull testing. It had been concluded that the current hot-bump-pull process was not comparable to cold-bump-pull following copper addition, but the benefits of hot-bump-pull were still attractive and additional development was planned, to further examine the procedure and to identify the conditions for ideal intermetallic formation.
The workshop concluded with an energetic session of interactive discussion and comparison of individual experiences. All present had a most interesting and rewarding day, and the benefit of a substantial learning and networking experience. Martin Anselm is to be applauded for the quality and content of his presentations, the reality of his practical examples and for his willingness to share his extensive specialist knowledge.
And the generous hospitality of Alent plc, the parent company of familiar industry names Alpha and Enthone, in hosting the event, is gratefully acknowledged.