The recent advances and trends in fan-out wafer/panel-level packaging (FOW/PLP) are presented in this study. Emphasis is placed on: (A) the package formations such as (a) chip first and die face-up, (b) chip first and die face-down, and (c) chip last or redistribution layer (RDL)-first; (B) the RDL fabrications such as (a) organic RDLs, (b) inorganic RDLs, (c) hybrid RDLs, and (d) laser direct imaging (LDI)/printed circuit board (PCB) Cu plating and etching RDLs; (C) warpage; (D) thermal performance; (E) the temporary wafer versus panel carriers; and (F) the reliability of packages on PCBs subjected to thermal cycling conditions. Some opportunities for FOW/PLP are also presented.
The first fan-out wafer-level packaging (FOWLP) U.S. patent was filed by Infineon on Oct. 31, 2001 [1,2], and the first technical papers were also published (at IEEE/ECTC2006 and IEEE/EPTC2006) by Infineon and their industry partners: Nagase, Nitto Denko, and Yamada [3,4]. At that time, they called it embedded wafer-level ball (eWLB) grid array. This technology eliminates wirebonding or wafer bumping and the lead frame or package substrate, and potentially leads to a lower cost, better performance, and lower profile package. On the other hand, this technology requires a temporary (reconstituted) carrier for the known-good die (KGD), epoxy molding compound (EMC), compression or lamination molding, and the fabrication of the redistribution layers (RDLs).
It should be emphasized that the concept of FOWLP was first proposed by Infineon (Germany). Even though some of the knowledge of this technology had been patented by General Electric [5,6] and EPIC, Infineon's patent specifically pointed out the use of RDLs to fan out the circuitry from the metal pad of the chip on a wafer and solder ball to the metal pads on a printed circuit board (PCB), for example, Fig. 1. Infineon also specifically pointed out that some of the RDLs have a portion that extends beyond (fan-out) the edges of the chip. These are the major claims in Ref. , which were not claimed by GE and EPIC [2,5–7].
The advantages of FOWLP, Fig. 2(a), over flip chip plastic ball grid array package, Fig. 2(b), are [2,8]: (1) lower cost, (2) lower profile, (3) eliminating the substrate, (4) eliminating the wafer bumping, (5) eliminating the flip chip reflow, (6) eliminating the flux cleaning, (7) eliminating the underfill, (8) better electrical performance, (9) better thermal performance, and (10) easier to go for system-in-package (SiP) and three-dimensional (3D) integrated circuit (IC) packaging [9–11].
The advantages of FOWLP over wafer-level chip scale package (WLCSP), Fig. 2(c), are [2,8]: (1) the use of known-good die, (2) better wafer-level yield, (3) using the best of silicon, (4) single or multichip, (5) embedded integrated passive devices, (6) more layer of RDLs, (7) higher pin counts, (8) better thermal performance, (9) easier to go for SiP and 3D IC packaging [9–11], and (10) higher PCB level reliability.
During ECTC2007, Freescale (now NXP) presented a similar technology and called it redistributed chip package. The Institute of Microelectronics extended the FOWLP technology to multidie and stacked multidie in 3D format and presented it at ECTC2008. During ECTC2009, the Institute of Microelectronics presented four papers on: (1) a novel method to predict die shift during compression molding; (2) laterally placed and vertically stacked thin dies; (3) the reliability of 3D FOWLP; and (4) the demonstration of high quality and low-loss millimeter-wave passives on FOWLP. In Refs. [1,3,4,12–17], they used chip-first and die face-down fan-out wafer-level processing.
During IEEE/ESTC2010 and ECTC2011, NEC (now Renesas) presented a couple of papers on system in wafer-level package (SiWLP), and “RDL-first” FOWLP. These papers are based on their SMArt chip connection with feed through interposer packaging technology for interchip wide-band data transfer [21,22] and 3D stacked memory integrated on logic devices [23–27]. The feed through interposer (FTI) used in SMArt chip connection with feed through interposer is a film with ultrafine linewidth and spacing RDLs. The dielectric of the FTI is usually SiO2 or a polymer, and the conductor wiring of the RDLs is Cu. The FTI not only supports the RDLs underneath the chip, it also provides support beyond the edges of the chip. Area array solder balls are mounted at the bottom side of the FTI, which are to be connected to the PCB. EMC is used to embed the chip and support the RDLs and solder balls. Besides the fabrication of the RDLs, their technology requires wafer bumping, fluxing, flip chip assembly, cleaning, and underfill dispensing and curing, and thus is very costly. Their potential applications are for very high-density and high-performance products such as supercomputers, high-end servers, telecommunications, and networking systems. Their technology is chip-last or RDL-first FOWLP processing.
At ECTC2012, Statschippac proposed a package-on-package (PoP) for the application processor (AP) chipset with the FOWLP technology. TSMC [29,30] presented two papers on FOWLP at ECTC2016: one is their integrated fan-out (InFO) wafer-level packaging for housing the most advanced AP for mobile applications, and the other compares the thermal and electrical performance of their InFO technology with that of the conventional flip chip on buildup package substrate technology. During September 2016, TSMC put the PoP of the AP of iPhones with their FOWLP (InFO) technology into high-volume manufacturing. This is very significant since it means that FOWLP is not only for packaging baseband, power management IC, radio frequency (RF) switch/transceiver, RF radar, audio codec, microcontroller unit, connectivity ICs, etc.; it can also be used for packaging high-performance and large (>120 mm²) system-on-chip (SoC) such as APs. TSMC used chip-first and die face-up FOWLP processing.
Recently, through-silicon via (TSV)-less interposer  to support multiple flip chips is a very hot topic in semiconductor packaging. At ECTC2013, Statschippac proposed  using the fan-out flip chip-eWLB to make the RDLs for the chips to perform mostly lateral communications. During ECTC2016, ASE  and Mediatek  used a similar technology to fabricate the RDLs with FOWLP and showed that the TSV interposer, wafer bumping, fluxing, chip-to-wafer bonding, cleaning, and underfill dispensing and curing are eliminated, i.e., TSV-less interposers.
All previously mentioned fan-out papers use round 200- or 300-mm wafers as the reconstituted carriers for supporting the KGDs and making the molds, RDLs, etc. (This is because of the existing equipment for fabricating the device wafers.) In order to increase the throughput, fan-out panel-level packaging (FOPLP) has been proposed. For example, starting from EPTC2011, J-Devices has been presenting its FOPLP (320 mm × 320 mm) called WFOP™ (Wide Strip Fan-Out Package) [37–39]. Starting from ECTC2013, Fraunhofer has been presenting its evaluation results on compression molding of a large area (610 mm × 457 mm) FOPLP [40–42]. At ECTC2014, SPIL published two papers on FOPLP called P-FO: one develops and characterizes a 370 mm × 470 mm panel and the other measures its warpage. One of the bottlenecks for FOPLP is the availability of panel equipment such as spin coating, physical vapor deposition (PVD), electrochemical deposition (ECD), etching, backgrinding, solder ball mounting, and dicing for making the molds, RDLs, and packages, due to the lack of a standard for panel sizes. Thus, the potential FOPLP users are unanimously calling for panel-size industry standards.
In this study, the following important topics of FOW/PLP will be examined, discussed, and updated: (A) the package formations such as (a) chip first and die face-down, (b) chip first and die face-up, and (c) chip last or RDL-first; (B) the RDL fabrications such as (a) organic RDLs, (b) inorganic RDLs, (c) hybrid RDLs, and (d) laser direct imaging (LDI)/PCB RDLs; (C) warpage; (D) thermal performance; (E) the temporary wafer versus panel carriers; and (F) the reliability of the fan-out packages on PCB subjected to thermal cycling.
Fan-Out Wafer/Panel-Level Packaging Formations
There are many FOW/PLP formations. However, basically there are three different kinds, namely chip-first (die face-down), chip-first (die face-up), and chip-last or RDL-first.
Chip-First (Die Face-Down).
Fan-out wafer/panel-level packaging with the chip-first and die face-down processing is actually the eWLB first proposed by Infineon [1,2] and high-volume manufacturing is proposed by STATSChipPac, ASE, STMicroelectronics, Infineon, and NANIUM (now AMKOR). This is the most conventional method to form FOW/PLPs, and most FOW/PLP products manufactured today use this method.
Chip-First (Die Face-Down) Process.
Figure 3 shows, in general, the process flow of chip-first with die face-down FOW/PLP [3,4,12–18,45–73]. First, the device wafer is tested for KGDs and then singulated into individual dies. This is followed by picking up the KGDs and placing them face-down on a temporary carrier (which can be metal, silicon, glass, or organic) that can be round (wafer) or rectangular (panel) with a double-sided thermal release tape. (The most commonly used tape is the REVALPHA provided by Nitto Denko.) Then, the reconstituted carrier with the KGDs is molded with liquid EMC using the compression method + postmold cure (PMC) or the lamination method + postannealing before removing the carrier and peeling off the double-sided tape. Next comes building the RDLs (which will be detailed in Sec. 3) for signals, power, and grounds from the Al or Cu pads of the KGD. Finally, solder balls are mounted and the whole reconstituted carrier (with KGDs, RDLs, and solder balls) is diced into individual packages.
Chip-First (Die Face-Down) With Wafer Carrier.
The chips under consideration are shown in Figs. 4(a) and 4(b), respectively, for the large test chip (5 mm × 5 mm × 150 μm) and the small chip (3 mm × 3 mm × 150 μm). There are 160 pads with a pitch = 100 μm (the inner rows) for the large chip and 80 pads with a 100-μm pitch (the inner rows) for the small chip. For both chips, the SiO2 passivation opening of the Al-pad is 50 μm × 50 μm, and the size of the Al-pad is 70 μm × 70 μm, Fig. 4(c).
The 10 mm × 10 mm package under consideration is shown in Fig. 5(a), which consists of one 5 mm × 5 mm chip, three 3 mm × 3 mm chips, and 4 (0402) capacitors. The spacing between the large chip and the small chip is only 100 μm. One practical application of the package is for the application processor chipset, i.e., the large chip could be a processor and the small chips could be memories.
Figure 5(b) schematically shows the cross-sectional view of the test package. It can be seen that there are two RDLs and the thickness of the metal layer of RDL1 is 3 μm and that of RDL2 is 7.5 μm. The metal linewidth and spacing of RDL1 are 10 μm and those of RDL2 are 15 μm. The dielectric layer thickness of DL1 and DL2 is 5 μm, and DL3 is 10 μm. The opening of the passivation (DL3) is 180 μm. The solder ball size is 200 μm, and the ball pitch is 0.4 mm.
Figure 6 shows a 300 mm reconstituted wafer carrier with 629 (10 mm × 10 mm) packages [45–47]. Each package has 4 (one 5 mm × 5 mm and three 3 mm × 3 mm) chips and 4 (0402) capacitors. The spacing between the large chip and the small chip is 100 μm. There are two RDLs for each package. It should be emphasized that FOWLP is a very high-throughput process. In this case, one shot, it can produce 629 10 mm × 10 mm packages.
Chip-First (Die Face-Down) With Panel Carrier.
Figure 7 shows a special process flow of chip-first with die face-down with a panel carrier [48–50]. Since the panel carrier involves the PCB process, the work must be done on the device wafer by electroplating an 8-μm Cu-pad on top of the Al-pad. (The purpose of the Cu-pad is to stop the laser drilling to the Al-pad.) Also, unlike those processes shown in Fig. 3, this process uses an organic carrier, Fig. 7(a), and forms an ECM-panel first, Fig. 7(d). (In this paper, the EMC with the KGDs embedded is called ECM-panel.) It is followed by attaching the ECM-panels on both sides of a core substrate with epoxy resin, Fig. 7(e). Then, perform the five-layer PCB lamination. It is followed by peeling off the two-sided tapes, Fig. 7(f), and making the RDLs, which will be discussed in Sec. 3.
The 10 mm × 10 mm test package on the panel carrier also consists of 4 chips as shown in Figs. 8 and 9. The chip sizes are also 5 mm × 5 mm and 3 mm × 3 mm. However, there are 88 pads on a pitch = 180 μm (the outer rows) for the large chip and 48 pads on a 180 μm-pitch (the outer rows) for the small chip. The SiO2 passivation opening of the Al-pad is 110 μm × 110 μm, and the size of the Al-pad is 130 μm × 130 μm. The Cu contact-pad is 110 μm in diameter and is 8 μm tall from the Al-pad.
Figure 10 shows a panel carrier of 340 mm × 340 mm with 378 (10 mm × 10 mm) packages [48,49]. Figure 11 shows a panel carrier of 508 mm × 508 mm with 1512 (10 mm × 10 mm) packages . It can be seen that there is not any void (inspected by the C-mode scanning acoustic microscopy) in the EMC even in the 100 μm gap between the large chip and the small chip.
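As a rough illustration of the throughput numbers quoted for Figs. 6, 10, and 11, the following sketch computes how much of each carrier's area is occupied by 10 mm × 10 mm packages. It is plain arithmetic on the counts given in the text; the function name is hypothetical, and the result says nothing about saw streets, edge exclusion, or the particular test layouts, which the text does not specify.

```python
import math

def area_utilization(package_side_mm, n_packages, carrier_area_mm2):
    """Fraction of the carrier area covered by square packages."""
    return n_packages * package_side_mm ** 2 / carrier_area_mm2

# Package counts and carrier sizes as quoted in the text.
carriers = {
    "300 mm wafer (Fig. 6)": (629, math.pi * (300 / 2) ** 2),
    "340 mm x 340 mm panel (Fig. 10)": (378, 340 * 340),
    "508 mm x 508 mm panel (Fig. 11)": (1512, 508 * 508),
}

utilization = {
    name: area_utilization(10, count, area)
    for name, (count, area) in carriers.items()
}

for name, value in utilization.items():
    print(f"{name}: {value:.1%} of the carrier area is package area")
```

The counts reflect the specific test vehicles reported, so the ratios should be read as a description of those layouts rather than as a limit of wafer or panel carriers in general.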
Thermal Cycling of the Chip-First (Die Face-Down) Package Assembly.
The package shown in Figs. 6, 10, and 11 is assembled on a six-layer PCB with 405 (Sn3 wt%Ag0.5 wt%Cu) solder joints. The sample size for the thermal cycling test is 60. The thermal cycling test results of the solder joint (without underfill) reliability are shown in Fig. 12. The thermal cycling test stops at 1300 cycles. It can be seen that the characteristic life (63.2% failed) of the Weibull plot is 1070 cycles, which is more than adequate for the expected life (usually less than 3 yr) of mobile products such as smartphones and tablets. The failure location and mode are shown in Fig. 13. It can be seen that the solder joint cracks near the interface between the bulk solder and the package contact pad, and that the cracking occurs under the chips' corners near the package corners, i.e., at the longest distance to neutral point (DNP).
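To make the term "characteristic life (63.2% failed)" concrete, the sketch below evaluates a two-parameter Weibull cumulative failure function at the characteristic life of 1070 cycles. The shape parameters are hypothetical placeholders, since the text does not report the Weibull slope; the point is only that the failed fraction at the characteristic life is 1 − e⁻¹ ≈ 63.2% regardless of the slope.

```python
import math

def weibull_cdf(cycles, characteristic_life, shape):
    """Two-parameter Weibull cumulative fraction failed at a given cycle count."""
    return 1.0 - math.exp(-((cycles / characteristic_life) ** shape))

CHARACTERISTIC_LIFE = 1070.0  # cycles, as quoted in the text

# Hypothetical shape (slope) values; the text does not give the fitted slope.
for shape in (2.0, 4.0, 8.0):
    failed = weibull_cdf(CHARACTERISTIC_LIFE, CHARACTERISTIC_LIFE, shape)
    print(f"shape={shape}: fraction failed at characteristic life = {failed:.3f}")
```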
The PCB assembly of the fan-out SiP shown in Figs. 6, 10, and 11 is modeled as a 3D strip that captures the construction along a diagonal path from the assembly (Fig. 14) with the proper boundary conditions. Using exclusively hexahedral solid elements, the model can capture the precise shape of the packages' solder joint and potential DNP effects while retaining significant computational efficiency over full octant models. Despite the overall economy of elements in the strip model, selective mesh refinement is used to concentrate highly refined elements in the solder joints where failure is anticipated. In the present PCB assembly, failure would be predicted in the solder joints with the greatest DNP (the package corner) and near the chip corners as shown in Fig. 14. Thus, highly refined meshes are applied to these solder joints. The other solder joints are coarsely meshed. Abaqus 6.12 (C3D8R elements) is used for the model. The Sn3 wt%Ag0.5 wt%Cu solder is assumed to follow the generalized Garofalo creep equation: $\frac{d\varepsilon}{dt} = 500{,}000\,\left[\sinh\left(\frac{\sigma}{1\times 10^{8}}\right)\right]^{5}\exp\left(-\frac{5807}{T}\right)$, where ε is the strain, σ is the stress in Pa, and T is the temperature in kelvin. The other material properties are shown in Table 1.
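For readers who want to evaluate the Garofalo creep equation quoted above outside a finite element code, a minimal sketch follows. The constants are those given in the text; the function name and the sample stress and temperature values are hypothetical and are not taken from the simulation.

```python
import math

def garofalo_creep_rate(stress_pa, temperature_k):
    """Creep strain rate from the generalized Garofalo (hyperbolic-sine) law.

    Constants follow the equation quoted in the text:
    rate = 500000 * sinh(stress / 1e8)**5 * exp(-5807 / T),
    with stress in Pa and T in kelvin.
    """
    return 500_000.0 * math.sinh(stress_pa / 1.0e8) ** 5 * math.exp(-5807.0 / temperature_k)

# Hypothetical operating points, chosen only to show the trend.
for stress_mpa in (5.0, 10.0, 20.0):
    for temperature_k in (233.15, 398.15):
        rate = garofalo_creep_rate(stress_mpa * 1.0e6, temperature_k)
        print(f"stress={stress_mpa} MPa, T={temperature_k} K: creep rate={rate:.3e}")
```

The fifth-power hyperbolic-sine term makes the rate very sensitive to stress, and the exponential term makes it very sensitive to temperature, which is why the accumulated creep strain concentrates in the most highly stressed joints during the hot portion of each cycle.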
The temperature profile shown in Fig. 12 is to be imposed on the PCB assembly. Five temperature cycles are executed. The largest accumulated creep strain occurs at the solder joint under the 3 mm × 3 mm chip corner and the 5 mm × 5 mm chip corner as shown in Fig. 14. The location is at the interface between the bottom of the package and the bulk solder. Thus, any failure should occur at this location. This correlates very well with the thermal cycling test failure location and mode as shown in Fig. 13. For drop test and simulation results, see Ref. .
Applications of Chip-First and Die Face-Down FOW/PLP.
Most of the applications of chip-first and die face-down are for small dies and not so high pin counts. Also, the metal line width and spacing of the RDLs are not small, e.g., 10–15 μm or larger. The semiconductors to be packaged are, e.g., baseband, power management IC, RF switch/transceiver, RF radar, audio codec, microcontroller unit, and connectivity ICs. With the popularity of SiP or heterogeneous integration, fan-out (which can handle multiple dies and discrete components) will be used more because the fan-in WLCSP  can only handle single die.
Chip-First (Die Face-Up).
Chip-First (Die Face-Up) Process.
First, the device wafer must be modified by sputtering a Ti/Cu as a bottom layer of under bump metallurgy (UBM) with a PVD on the Al (or Cu) pad, and a Cu contact-pad is electroplated on the UBM, as shown in Fig. 15(a). (This is unique for chip-first and die face-up and these Cu-contact pads are for building the RDLs later.) This step is followed by spin coating a polymer on top of the device wafer and laminating (at ∼70 °C) with a (∼20 μm) die-attach film (DAF) provided by, e.g., Hitachi at the bottom of the device wafer as shown in Fig. 15(b). The device wafer is then tested and diced into individual KGDs. In this paper, the chip size is 10 mm × 10 mm (Fig. 16) and the package size is 13.42 mm × 13.42 mm (Fig. 17).
On the reconstituted glass carrier, a light-to-heat conversion (LTHC) layer (about 1 μm) provided by, e.g., 3M is spin-coated onto the glass carrier as shown in Fig. 15(c). The individual KGDs are picked and placed face-up on the LTHC carrier as shown in Fig. 15(d). In order to cure the DAF, a bonder with temperature and pressure should be used. The DAF process is carried out at 120 °C (both bond-head and bond-stage) with a bond force of 2 kg for 2 s. Thus, the reconstituted carrier will expand during chip pick and place. However, the patterning/photolithography of the RDLs is performed at room temperature. Thus, pitch compensation due to the DAF heating is needed [76–78].
The EMC used in this study [76–78] is a liquid-like material (Nagase R4507). EMC dispensing is followed by compression molding. After a few experiments, the optimal compression molding parameters are: temperature = 125 °C, pressure = 45 kg/cm², time = 10 min, and removal of trapped air before compression molding. It is followed by PMC with a temperature = 150 °C, time ≥ 60 min, and a dead weight = 15 kg for better warpage control. Because of the DAF, the die shift due to compression molding is very small (≤ ±4 μm) [76–78].
The process for fabricating the RDLs will be discussed in Sec. 3. The thickness of dielectric-layer 1 (DL1), DL2, and DL3 is 5 μm, and that of DL4 is 10 μm (Fig. 17). The thickness of the metal of redistribution-layer 1 (RDL1) and RDL2 is 3 μm, and that of RDL3 is 7.5 μm. (The thicker metal of RDL3 is for the UBM-less thicker Cu-pads to “resist” the Cu consumption from the solder ball reflow and during operation.) The line width and spacing of the RDLs are 5 μm for RDL1, 10 μm for RDL2, and 15 μm for RDL3. The RDLs are shown in Fig. 18(d). Overall, it can be seen that all the RDLs are properly done.
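The need for pitch compensation can be illustrated with a first-order linear thermal expansion estimate: dies placed at 120 °C end up closer together once the carrier cools to the temperature at which the RDLs are patterned. The sketch below is only an illustration of that idea, not the compensation procedure of Refs. [76–78]. The carrier coefficient of thermal expansion, the room temperature of 25 °C, and the design pitch are all hypothetical values that the text does not provide.

```python
def placement_pitch_mm(design_pitch_mm, carrier_cte_ppm_per_c,
                       bond_temp_c=120.0, litho_temp_c=25.0):
    """Pitch to use at the bonding temperature so that, after the carrier
    cools to the lithography temperature, the dies sit on the design pitch.

    First-order model: length scales by (1 + CTE * delta_T).
    """
    delta_t = bond_temp_c - litho_temp_c
    return design_pitch_mm * (1.0 + carrier_cte_ppm_per_c * 1.0e-6 * delta_t)

# Hypothetical inputs: design pitch equal to the 13.42 mm package size
# (saw street ignored) and an assumed carrier CTE of 8 ppm/°C.
design_pitch = 13.42
hot_pitch = placement_pitch_mm(design_pitch, carrier_cte_ppm_per_c=8.0)
offset_um = (hot_pitch - design_pitch) * 1000.0
print(f"placement pitch at 120 °C: {hot_pitch:.4f} mm (+{offset_um:.1f} μm per pitch)")
```

Under these assumed numbers the correction is on the order of 10 μm per package pitch, and it accumulates toward the carrier edge. That is larger than the 5 μm line width and spacing of RDL1, which suggests why the correction cannot be ignored.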
There are two different stencils for the solder ball mounting: one is for stencil printing the flux, and the other is for stencil mounting the solder balls. The solder (Sn3 wt%Ag0.5 wt%Cu) balls (200 μm diameter) used are from Indium. The peak temperature for solder reflow is 245 °C. Figure 18(b) shows the individual package, and Fig. 18(c) shows the close-up view of solder balls on the package.
The debonding of the glass carrier as shown in Fig. 15(h) is done by scanning a laser (a 355-nm diode-pumped solid-state Nd:YAG UV laser source is used) from the glass carrier side. The laser spot-size is 240 μm, the scanning speed is 500 mm/s, and the scanning pitch is 100 μm. When the LTHC layer “sees” the laser light, it is converted into powder, and the glass carrier is easily removed. It is followed by chemical cleaning.
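The laser parameters above imply a specific overlap between adjacent scan lines and a rough debonding time. The sketch below works both out. The overlap follows directly from the quoted spot size and pitch; the scan time additionally assumes a simple raster over a square region whose 300 mm side length is a hypothetical value (the carrier size is not stated here), with no turnaround or acceleration overhead.

```python
def scan_line_overlap(spot_size_um, scan_pitch_um):
    """Fraction of the spot width shared by two adjacent scan lines."""
    return 1.0 - scan_pitch_um / spot_size_um

def raster_scan_time_s(side_um, scan_pitch_um, speed_mm_per_s):
    """Idealized time to raster a square region, ignoring turnaround overhead."""
    n_lines = -(-side_um // scan_pitch_um)  # ceiling division on integers
    line_length_mm = side_um / 1000.0
    return n_lines * line_length_mm / speed_mm_per_s

overlap = scan_line_overlap(spot_size_um=240, scan_pitch_um=100)
# Hypothetical 300 mm x 300 mm scan region.
scan_time = raster_scan_time_s(side_um=300_000, scan_pitch_um=100, speed_mm_per_s=500.0)
print(f"adjacent-line overlap: {overlap:.1%}")
print(f"idealized raster time: {scan_time / 60.0:.0f} min")
```

Because the pitch is less than half the spot size, every point of the LTHC layer is covered by at least two passes under this simple geometric picture.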
Thermal Cycling of Chip-First (Die Face-Up) Package.
The thermal cycling test setup, data acquisition system, and temperature profile are exactly the same as those shown in the left-hand side of Fig. 12. The sample size is also 60. The thermal cycling test stops at 1100 cycles and there are 14 failures (including one early failure at 58 cycles). It can be seen that the characteristic life (63.2% failed) of the Weibull plot is 2382 cycles, which is more than adequate for the expected life (usually less than 3 yr) of mobile products such as smartphones and tablets. The failure mode is shown in the right-hand side of Fig. 19. It can be seen that the solder joint cracks at the interface between the bulk solder and RDL3, and that the cracking occurs near the package corners.
The basic philosophy of simulation of this case (Fig. 20) is the same as the heterogeneous integration case (Fig. 14), except for the geometry of the structure. The boundary (temperature and kinematic) conditions are the same. The simulation (creep strain) results are shown in Fig. 20. It can be seen that the maximum strain occurs in the solder joint near the package corner and the chip corner and the failure mode is the cracking of the solder joint near the interface between the package and the bulk solder. These correlate very well with the experimental observation.
Thermal Performance of Chip-First (Die Face-Up) Package.
Figure 21 schematically shows the top view and cross-sectional view of the FOWLP structure shown in Fig. 18 for thermal analyses [79,80]. It can be seen that the chip size is 10 mm × 10 mm with various thicknesses (10, 25, 50, 100, 150, 200, 250, and 300 μm). There is a 100 μm EMC layer covering the top of the chip and there are 40 μm of RDLs to fan out the circuitry from the bottom of the chip. The package has 1024 (0.2 mm-diameter) solder balls on a 0.4 mm-pitch, which are reflowed on a PCB. The dimensions of the PCB are 25 mm × 25 mm × 0.8 mm.
The ambient temperature is assumed to be 25 °C. The boundary condition on the top side and bottom side of the PCB and the top side of the chip is a convective heat transfer coefficient, h = 10 W/m² K, which imitates a natural convection condition. The heat dissipation of the chip is 5 W. The junction-to-ambient thermal resistance (Rja) of the 10 mm × 10 mm chip with various thicknesses is shown in Fig. 21. It can be seen that the thinner the chip, the higher the Rja (i.e., the lower the thermal performance). This is because of the inferior thermal spreading capability of thinner chips. The thermal performance degrades rapidly as the chip thickness goes below 100 μm, as shown in Fig. 21. A typical temperature contour distribution is shown in Fig. 21 and it can be seen that: (1) the maximum temperature is 101.5 °C; (2) the minimum temperature is 89.9 °C; and (3) the Rja is 15.3 °C/W.
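A quick geometric check of the ball layout: if the 1024 solder balls are assumed to form a full square array (an assumption, since the text gives only the count, diameter, and pitch), the footprint can be compared with the 13.42 mm × 13.42 mm package size given earlier. The function name is hypothetical.

```python
import math

def full_array_footprint_mm(n_balls, pitch_mm, ball_diameter_mm):
    """Balls per side and outer footprint of a full square ball array."""
    per_side = math.isqrt(n_balls)
    if per_side * per_side != n_balls:
        raise ValueError("ball count is not a perfect square")
    footprint = (per_side - 1) * pitch_mm + ball_diameter_mm
    return per_side, footprint

PACKAGE_SIDE_MM = 13.42  # package size quoted for the face-up test vehicle

balls_per_side, footprint_mm = full_array_footprint_mm(1024, 0.4, 0.2)
edge_margin_mm = (PACKAGE_SIDE_MM - footprint_mm) / 2.0
print(f"{balls_per_side} x {balls_per_side} array, footprint {footprint_mm:.2f} mm, "
      f"edge margin {edge_margin_mm:.2f} mm per side")
```

Under this assumption the array fits within the package outline with a margin of roughly 0.4 mm per side, which is consistent with the quoted package and pitch dimensions.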
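The quoted Rja can be reproduced from the other numbers in this paragraph by using the usual definition of junction-to-ambient thermal resistance, with the maximum temperature taken as the junction temperature. The short sketch below does this arithmetic; the function name is hypothetical.

```python
def junction_to_ambient_resistance(t_junction_c, t_ambient_c, power_w):
    """Rja = (Tj - Ta) / P, in °C/W."""
    return (t_junction_c - t_ambient_c) / power_w

# Values quoted in the text: Tmax = 101.5 °C, Ta = 25 °C, P = 5 W.
rja = junction_to_ambient_resistance(101.5, 25.0, 5.0)
print(f"Rja = {rja:.1f} °C/W")
```

The same relation can be used to read the Rja-versus-thickness curve in reverse, i.e., to estimate the maximum chip temperature for a given power once Rja is known.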
Applications of Chip-First and Die Face-Up FOW/PLP.
Figures 22, 23, and 24 show the schematic and scanning electron microscope images of the cross sections of the PoP that houses the Apple A10, A11, A12 AP and the mobile dynamic random access memories (DRAMs) of the iPhones. These PoPs are fabricated by TSMC with its InFO WLP technology [29–32]. It can be seen from the bottom package that the wafer bumping, fluxing, flip-chip assembly, cleaning, underfill dispensing and curing, and build-up package substrate (of the AP A9 shown in Fig. 3 of Ref. ) have been eliminated and are replaced by the RDLs (for the A10, A11, and A12 as shown in Figs. 22, 23, and 24). This results in a lower cost, higher performance, and lower profile package. The InFO is a chip-first and die face-up FOWLP.
The points noted above are very significant because Apple and TSMC are the trendsetters. Once they used the technology, it became likely that many others would follow. Also, this means that FOWLP is not only for packaging baseband, power management-integrated circuits, radio-frequency (RF) switch/transceiver, audio codec, microcontroller unit, RF radar, connectivity ICs, etc.; it can also be used for packaging high-performance and large (>120 mm²) SoC, such as APs. Furthermore, with the popularity of SiP or heterogeneous integration, fan-out chip-first and die face-up will be used more for fine (say 5 μm) metal line width and spacing RDLs.
Chip-Last or Redistribution Layer-First
Reasons for Chip-Last or RDL-First.
According to Refs. [19,20], one of the challenges of chip-first (either die face-up or face-down) FOWLP, and the key reason for introducing the chip-last or RDL-first FOWLP, is that the production yield during the RDL process is low because the KGDs are already embedded. This is true only if the chip-last (RDL-first) FTI is fully functionally tested before the chip-to-wafer bonding. Otherwise, the KGDs still have to be thrown away when an FTI is found to have bad RDLs after a system test. Also, it should be noted that full functional tests of RDLs on an FTI are not only very costly but also very difficult, if not impossible.
Chip-Last or RDL-First Process.
Figure 25 shows the process flow of the chip-last or RDL-first FOWLP. This is very different from the chip-first FOWLP discussed in Secs. 2.1 and 2.2. First of all, RDL-first FOWLP requires: (1) building up the RDLs on a bare silicon wafer (the FTI) or a glass wafer or panel; (2) performing the wafer bumping; (3) performing the fluxing, chip-to-wafer bonding, and cleaning; and (4) performing the underfill dispensing and curing. As to wafer bumping, chip-to-wafer bonding and underfilling, e.g., see Refs. [88,89]. Each of these tasks is a major undertaking and requires additional materials, processes, equipment, manufacturing floor space, and personnel effort. Therefore, compared with chip-first FOWLP, chip-last (RDL-first) FOWLP incurs very high cost and has a higher probability of greater yield losses. It can only be afforded by very high-density and high-performance applications such as high-end servers and computers.
The very first step in RDL-first is to build the RDLs on a bare silicon or glass wafer, Figs. 25(A)–25(F). First, spin coat a sacrificial layer on a glass wafer or panel, Fig. 25(A). Then, build up the Cu-pads and the dielectric layer, and make the openings on the dielectric layer, Fig. 25(B). It is followed by electroplating the Cu metal layer for RDL1, Fig. 25(C). Repeat all the processes to fabricate the other RDLs, Figs. 25(D) and 25(E). Then, make the final dielectric layer (passivation) and the micro-bump pads, Fig. 25(F).
On the device wafer, the first step is to perform wafer bumping as shown in Fig. 4 of Ref. . The next step is to test for KGDs and then dice the wafer into individual KGDs. Next, the KGDs are picked up, flux is applied, and then the KGDs are placed face down on the microbump contact pad (which is on top of the RDLs) of the full-thickness wafer or panel prior to performing chip-to-wafer or panel bonding, Fig. 25(a). That step is followed by cleaning the flux residue and then dispensing the underfill and curing, Fig. 25(b). Next comes molding the whole reconstituted wafer or panel carrier using the compression method with EMC, Fig. 25(c). Then, remove the glass carrier, Fig. 25(d). Finally, the solder balls are mounted on the bottom RDL and the reconstituted wafer or panel is diced into individual packages, Fig. 25(e).
Applications of Chip-Last or RDL-First FOW/PLP.
In contrast to TSMC's chip-first and die face-up InFO PoP for Apple's AP chipset, Samsung proposed to use chip-last or RDL-first for their AP chipset [90–92] as shown in Fig. 26. It can be seen that the AP and the mobile DRAMs are placed side-by-side (not-PoP) with the chip-last FOWLP. The package profile of Samsung's side-by-side should be lower than Apple/TSMC's PoP. On the other hand, Samsung's package horizontal dimensions should be larger. For other chip-last or RDL-first potential applications, see Refs. [93–103].
The chip size with chip-last FOWLP can be very large and the metal line width and spacing of the RDLs can be very small. However, it is very expensive and can only be afforded by very high-density and high-performance applications. On the other hand, for high-density and high-performance applications, why insist on the FOWLP technology when there are many packaging alternatives?
Redistribution Layers Fabrications
Organic Redistribution Layers (Polymer and Electrochemical Deposition Cu + Etching).
This is the oldest method to make RDLs for fan-in WLP, for example, see Ref. . The dielectric layer is made of a polymer, e.g., polyimide (PI), benzocyclobutene, or polybenzobisoxazole and the conductor layer is made by ECD of Cu and etching. The key process steps are as follows: (1) a polymer is spin coated on the whole wafer; (2) a photoresist is spin coated; (3) the photoresist is opened with a mask aligner or stepper; (4) the polymer is etched, and the resist is stripped off; (5) the adhesive/seed layer (Ti/Cu) is sputtered using PVD; (6) a photoresist is spin coated and then opened with a mask aligner or stepper; and (7) the Cu is electroplated. After the resist is stripped off and the Ti/Cu is etched off, the first RDL is obtained. Repeating these processes yields the other RDLs. Today, most outsourced semiconductor assembly and test suppliers (OSATs) use this method to make RDLs for FOWLP with chip-first and chip-last processing.
A better and simpler process is shown in Fig. 27 [46,77]. It can be seen that for the PI development, the whole reconstituted wafer is spin-coated with a photosensitive PI. It is followed by applying a stepper (for high-yield) and then using photolithography techniques to align, expose, and develop the vias of the PI. Finally, the PI is cured at 200 °C for 1 h—this will form a 5 μm-thick PI layer. It is followed by sputtering Ti and Cu by PVD at 175 °C over the entire reconstituted wafer. Then, apply a photoresist and a stepper and use photolithography techniques to open the redistribution trace's locations. Next, electroplate the Cu by ECD at room temperature on the Ti/Cu in the photoresist openings. These steps are followed by stripping off the photoresist and etching off the Ti/Cu; RDL1 is thereby obtained. Finally, repeat all the above steps to obtain other RDLs. For example, the RDLs in Figs. 6, 18, 22, 23, and 24 are fabricated by this method. This can be used for FOWLP with chip-first and chip-last processing. The RDLs made by the polymer (either photosensitive or not) and ECD Cu + etching are called organic RDLs. The line width and spacing of the RDLs can be as little as 5 μm for high yield.
Inorganic Redistribution Layers (Plasma-Enhanced Chemical Vapor Deposition and Cu-Damascene + Chemical-Mechanical Polishing).
This is the oldest back-end semiconductor process. This process uses SiO2 or SiN for the dielectric layer and ECD to deposit the Cu on the whole wafer. That is followed by using chemical-mechanical polishing (CMP) to remove the overburden Cu and seed layer to make the Cu conductor layer of the RDLs. The key process steps are shown in Fig. 28. First, use plasma-enhanced chemical vapor deposition (PECVD) to form a thin layer of SiO2 (or SiN) on a full-thickness bare silicon wafer and then use a spin coater to apply the photoresist. These steps are followed by using a stepper to open the resist and a reactive ion etch to remove the SiO2. Then, a stepper is used to open the resist wider and a reactive ion etch is used to etch more of the SiO2. Next, strip off the resist, sputter the Ti/Cu, and ECD the Cu on the whole wafer. These steps are followed by CMP to remove the overburden Cu and the Ti/Cu, which yields RDL1 and V01 (the via connecting the Si and RDL1) as shown in Fig. 29. This is called the dual Cu-damascene method [104,105]. Finally, repeat all the processes to get the other RDLs. This method can be used for FOWLP with chip-first and chip-last processing. The RDLs made by PECVD and Cu-damascene + CMP are called inorganic RDLs. The line width and spacing of the RDLs can be ≤ 2 μm and down to submicron.
Hybrid-Redistribution Layers (First Inorganic Redistribution Layers and Then Organic Redistribution Layers).
As of today, this hybrid-RDL method only applies to chip-last or RDL-first, i.e., wafer bumping and chip-to-wafer bonding are necessary. The key process steps for chip-last by hybrid RDLs are shown in Fig. 30. It can be seen that a glass carrier-1 is coated with a sacrificial layer, Fig. 30(a). The contact pad and the first RDL (RDL1) are then fabricated by the PECVD for the SiO2 dielectric layer and dual Cu-damascene + CMP for the conductor layer, Fig. 30(b). The remaining RDLs are fabricated by the ordinary polymer (or photosensitive polymer) and Cu-plating + etching method. Another carrier-2 is then attached to the other side of the reconstituted wafer, Fig. 30(c). That step is followed by debonding of the carrier-1 as shown in Fig. 30(d). That, in turn, is followed by fluxing, chip-to-wafer bonding, cleaning, underfill dispensing, and curing as shown in Fig. 30(e). Then, the reconstituted wafer is molded with EMC by the compression method, Fig. 30(f). Next comes debonding of the carrier-2 and solder ball mounting as shown in Fig. 30(g). Figure 31 shows the cross section of a FOWLP with hybrid RDLs published by SPIL in Ref. . For other hybrid-RDLs, see Refs. [110,111].
Redistribution Layers by Pure Printed Circuit Board Technology (Ajinomoto Buildup Film/SAP/Laser Direct Imaging and Printed Circuit Board Cu Plating + Etching).
Figure 32 shows a semi-additive process (SAP) flow for fabricating the RDLs on a panel published in Refs. [48–50]. It starts off by laminating an Ajinomoto buildup film (ABF) on the reconstituted EMC-panel. It is followed by laser drilling, smoothing the ABF surface by desmear, and electroless Cu seed layer plating. Then, follow those steps with dry film lamination, laser direct imaging (LDI) lithography, dry film developing, and PCB Cu plating for RDL1. Follow those steps with stripping off the dry film and etching off the seed layer. These steps are then repeated to get the other RDLs. The final RDL can be used as a contact pad. The next steps are laminating, photolithography, and curing the solder mask (in either a solder mask defined or a nonsolder mask defined format) before mounting the solder balls. In this case, the dielectric layer thickness can be as little as 10 μm and the conductor layer thickness can be as little as 5 μm. For example, the RDLs in Figs. 10 and 11 are fabricated by this method. In general, the line width and spacing of the pure PCB technology RDLs are ≥ 10 μm.
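The line width and spacing figures quoted for the organic, inorganic, and pure PCB RDL methods can be condensed into a simple lookup. The following is only a minimal sketch: the function name and the labels are hypothetical, the thresholds are the rule-of-thumb values stated in the text (≥ 10 μm for PCB/LDI, ≥ 5 μm for organic, finer for inorganic), and hybrid RDLs are left out because no line width and spacing figure is given for them. Cost, chip size, and package formation are not considered.

```python
from typing import List


def candidate_rdl_methods(required_line_space_um: float) -> List[str]:
    """Return the RDL methods whose quoted line width/spacing capability
    covers the requirement, listed from coarsest to finest capability.

    Thresholds are the rule-of-thumb values from the text, not process limits.
    """
    if required_line_space_um <= 0:
        raise ValueError("line width/spacing must be positive")

    candidates = []
    if required_line_space_um >= 10.0:
        # Pure PCB technology: ABF + SAP + LDI, line/space >= 10 um
        candidates.append("PCB/LDI (ABF + SAP)")
    if required_line_space_um >= 5.0:
        # Polymer dielectric + ECD Cu + etching, line/space down to 5 um
        candidates.append("organic (polymer + ECD Cu + etching)")
    # PECVD dielectric + Cu-damascene + CMP, line/space <= 2 um to submicron,
    # so it covers every requirement considered here
    candidates.append("inorganic (PECVD + Cu-damascene + CMP)")
    return candidates


if __name__ == "__main__":
    for ls in (12.0, 5.0, 2.0):
        print(ls, "um ->", candidate_rdl_methods(ls))
```

In this sketch a 2 μm requirement leaves only the inorganic method, whereas a 12 μm requirement can be met by all three, and the choice would then be driven by cost and throughput.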
Warpage is a critical issue for FOW/PLP [112–118]. Depending on the formation of the package and the number of RDLs, there are a few different warpages affecting the FOW/PLP process. Let us use the chip-first and die face-up with 3 RDLs on a 300 mm FOWLP as an example such as those shown in Figs. 15 and 18. In this case, there are at least six different warpages affecting the FOWLP process.
The first warpage is right after PMC (see Fig. 15(e)) of the reconstituted wafer. If the warpage is too large, then the reconstituted wafer cannot be placed and/or operated on the backgrinding equipment to perform the backgrinding of EMC to expose the Cu-contact pads.
The second warpage is right after the backgrinding of the EMC to expose the Cu-contact pads (see Fig. 15(f)). If the warpage is too large, then the reconstituted wafer cannot be placed and/or operated on the RDL equipment such as the stepper, lithographic, PVD, electrochemical deposition, and etching.
The third warpage is right after the fabrication of the first RDL. (The temperature of the PVD is about 200 °C, so there is a thermal expansion mismatch among the EMC, Si chip, and glass carrier.) If the warpage is too large, then there are issues in making the second RDL. The fourth warpage is right after the fabrication of the second RDL. If the warpage is too large, then there are issues in making the third RDL. The fifth warpage is right after the fabrication of the third RDL. If the warpage is too large, then there are issues (such as holding and/or operating of the reconstituted wafer on the equipment and controlling the accuracy of ball drops) in performing the solder ball mounting (see Fig. 15(g)).
The sixth warpage is right after the solder ball mounting. (The lead-free reflow temperature is about 250 °C, so there is a very large thermal expansion mismatch among the EMC, Si chip, and glass carrier.) If the warpage of the diced individual package is too large (see Fig. 15(h)), then there are issues (such as solder joint standoff height variation, stretched solder joints, and tilted components) in PCB assembly.
What are the maximum allowable warpages? The rule of thumb is for a 300-mm reconstituted wafer, the maximum allowable warpage of the first five kinds of warpage is 1 mm, but 0.5 mm is preferred for high yield. The maximum allowable warpage of the individual package (≤ 20 mm × 20 mm) is 0.2 mm, but 0.1 mm is preferred for high yield.
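The rule of thumb above can be written as a small classification routine. This is a minimal sketch under stated assumptions: the function and dictionary names are hypothetical, the sign of the measured warpage (smiling or crying face) is ignored by taking the absolute value, and the limits are the rule-of-thumb values for a 300-mm reconstituted wafer and for an individual package of ≤ 20 mm × 20 mm.

```python
# (preferred, maximum) warpage in mm, taken from the rule of thumb in the text
WARPAGE_LIMITS_MM = {
    "wafer": (0.5, 1.0),    # 300-mm reconstituted wafer
    "package": (0.1, 0.2),  # individual package, <= 20 mm x 20 mm
}


def classify_warpage(warpage_mm: float, kind: str) -> str:
    """Classify a measured warpage against the rule-of-thumb limits.

    kind is "wafer" or "package". The sign of the warpage is ignored
    (an assumption of this sketch).
    """
    preferred, maximum = WARPAGE_LIMITS_MM[kind]
    magnitude = abs(warpage_mm)
    if magnitude <= preferred:
        return "preferred"
    if magnitude <= maximum:
        return "allowable"
    return "exceeds limit"


if __name__ == "__main__":
    print(classify_warpage(0.609, "wafer"))    # allowable
    print(classify_warpage(0.05, "package"))   # preferred
    print(classify_warpage(0.25, "package"))   # exceeds limit
```

For example, a reconstituted wafer with 609 μm of warpage falls in the allowable band but misses the 0.5 mm value preferred for high yield.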
Warpage has been studied by Lin et al.  for wafer-level chip-first and die face-down, Che et al.  for wafer-level mold-first, Che et al.  for wafer-level chip-last or RDL-first, Hou et al.  for panel-level packaging, Shen et al.  for an individual package with a solder-bumped flip chip with underfill on a package substrate with a metal cap, and Lau et al. [117,118] for chip-first and die face-up.
Figure 33(a) shows the shadow Moiré warpage measurements of the 300 mm reconstituted wafer (Fig. 18) right after PMC (609 μm in a smiling face), Fig. 15(e), and right after backgrinding of the EMC to expose the Cu-contact pads (811.9 μm in a crying face), Fig. 15(f). It should be emphasized that the warpage of the reconstituted wafer without RDLs and solder balls changes from a smiling face to a crying face right after backgrinding of the EMC to expose the Cu-contact pads. A similar trend has been found by the simulation method shown in Fig. 33(b) [117,118]. In order to reduce the warpage of the reconstituted wafer after PMC, the coefficients of thermal expansion of the glass carrier and the EMC should be as close as possible. Also, in order to reduce the warpage of the reconstituted wafer after backgrinding, the coefficient of thermal expansion of the EMC should be larger than that of the glass carrier (to create a bigger smiling face). However, it should not be so large that the warpage of the reconstituted wafer right after PMC exceeds 1 mm, because the reconstituted wafer then cannot be placed on the backgrinding machine.
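The role of the thermal expansion mismatch can be illustrated with a first-order estimate of the free thermal mismatch strain, (α_a − α_b)·ΔT, between two materials. The following is a minimal sketch, not a warpage prediction: the function name is hypothetical, the coefficient values in the example are placeholders chosen only for illustration (they are not taken from this study), and the 25 °C reference temperature is an assumption. An actual warpage analysis also needs the moduli, thicknesses, cure shrinkage, and geometry, as in the simulations of Refs. [117,118].

```python
def mismatch_strain(cte_a_ppm_per_c: float,
                    cte_b_ppm_per_c: float,
                    process_temp_c: float,
                    ref_temp_c: float = 25.0) -> float:
    """First-order free thermal mismatch strain between materials a and b.

    strain = (alpha_a - alpha_b) * (T_process - T_ref)
    CTEs are given in ppm/degC; the result is dimensionless.
    The 25 degC reference temperature is an assumption of this sketch.
    """
    delta_alpha = (cte_a_ppm_per_c - cte_b_ppm_per_c) * 1e-6
    delta_t = process_temp_c - ref_temp_c
    return delta_alpha * delta_t


if __name__ == "__main__":
    # Hypothetical CTE values for illustration only (not from the study)
    emc_cte, glass_cte = 10.0, 8.0  # ppm/degC
    for label, temp in (("PVD, about 200 degC", 200.0),
                        ("lead-free reflow, about 250 degC", 250.0)):
        strain = mismatch_strain(emc_cte, glass_cte, temp)
        print(f"{label}: mismatch strain = {strain:.2e}")
```

The estimate scales linearly with both the difference in the coefficients and the temperature excursion, which is consistent with the recommendation to keep the coefficients of the glass carrier and the EMC close to each other.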
According to JEDEC Standard (JESD22-B112A) , the laser reflection method (confocal displacement metrology) is used to measure the warpage of the individual package. Typical measured warpage contours of the present package are shown in the top of Fig. 34. The warpage measurements of two different individual package samples versus temperatures are shown in the bottom of Fig. 34, where the simulation results are also presented. They compare very well [117,118].
Temporary Wafer Versus Panel Carriers
Theoretically speaking, FOPLP will potentially increase throughput and reduce cost. However, in order to achieve these goals, the following issues  for FOPLP need to be noted and/or resolved:
Most OSATs and foundries already have the necessary equipment for FOWLP. For FOPLP, new capital will have to be expended on newly developed equipment.
Inspection of wafers is a well-known process. FOPLP inspection must be developed.
The yield of FOWLP is higher than that of FOPLP (assuming the size of the panel is larger than that of the wafer).
The cost advantages of panel over wafer need to be carefully determined. (Yes, the throughput is higher, but the pick and place and EMC dispensing times are longer, and the yield is lower.)
A fully loaded high-yield wafer line might be cheaper than a partially loaded low-yield panel line.
The panel equipment takes longer to clean than wafer equipment.
Unlike FOWLP, FOPLP is for medium chip size and metal line width and spacing.
If indeed, the panel processing is developed and is high yield for fine line-width and spacing, there is a chance to produce a major oversupply of capacity.
Intellectual property, materials background, equipment automation, and management of the dimensional stability and yield of the panel in a large format are needed.
The lack of a panel standard for FOPLP means equipment suppliers cannot make the equipment.
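The trade-off among throughput, line loading, and yield noted in the issues above can be made concrete with a cost-per-good-package estimate. The following is a minimal sketch with a hypothetical cost model: the function name, its parameters, and every number in the example are assumptions for illustration only and are not data from this study.

```python
def cost_per_good_package(line_cost_per_hour: float,
                          carriers_per_hour_full_load: float,
                          utilization: float,
                          packages_per_carrier: int,
                          yield_fraction: float) -> float:
    """Cost of one good package for a wafer or panel line.

    good packages per hour = full-load carrier rate * utilization
                             * packages per carrier * yield
    The line cost per hour is assumed fixed regardless of loading.
    """
    good_per_hour = (carriers_per_hour_full_load * utilization
                     * packages_per_carrier * yield_fraction)
    if good_per_hour <= 0:
        raise ValueError("line must produce at least some good packages")
    return line_cost_per_hour / good_per_hour


if __name__ == "__main__":
    # All values below are hypothetical placeholders
    wafer_full = cost_per_good_package(1000.0, 10.0, 1.0, 1000, 0.98)
    panel_partial = cost_per_good_package(1500.0, 10.0, 0.4, 3000, 0.85)
    panel_full = cost_per_good_package(1500.0, 10.0, 1.0, 3000, 0.85)
    print(f"fully loaded wafer line:     {wafer_full:.4f} per good package")
    print(f"partially loaded panel line: {panel_partial:.4f} per good package")
    print(f"fully loaded panel line:     {panel_full:.4f} per good package")
```

With these placeholder values the fully loaded, high-yield wafer line is cheaper per good package than the partially loaded, lower-yield panel line, while the panel line becomes cheaper once it is fully loaded. The crossover depends entirely on the assumed inputs.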
Opportunities for Fan-Out Wafer-Level Packaging
The semiconductor industry has identified five main growth engines (applications), namely, mobile, high-performance computing, automotive (especially self-driving cars), Internet of things (IoTs), and Big Data (especially for cloud computing). System-technology drivers such as 5G, artificial intelligence (AI), and machine learning (ML) are boosting the growth of these applications. We, the packaging people, are using various packaging methods such as wirebonding, flip chip, build-up substrate PoP, WLCSP, FOW/PLP, 2.5D/3D IC integration, multichip module/system-in-package/heterogeneous integration, chiplets, high bandwidth memory (HBM), and embedded multidie interconnect bridge to house (package) the semiconductor devices (e.g., central processors, field-programmable gate arrays, graphic processors, application-specific ICs, and memory) for those five main applications.
Because of the drive of AI, ML, and 5G, the density and I/O count of semiconductors such as sliced field-programmable gate arrays increase while the pad pitch decreases. Even an organic package substrate with 12 build-up layers (6-2-6) is not enough to support the sliced chips, and a TSV-interposer is needed [121–139]. TSMC called this kind of structure CoWoS (chip-on-wafer-on-substrate) [137,138]. Leti [121,122] called it SoW (system-on-wafer).
Figure 35 schematically  shows an advanced packaging of an SoC (system-on-chip) such as the central processor unit (CPU) or graphic processor unit (GPU) and HBMs, each of which consists of a stack of DRAMs and a base logic vertically interconnected through TSVs and microbumps. The SoC and HBMs are attached side by side through microbumps on a TSV-interposer with RDLs. The TSV-interposer is attached (through Cu-C4 bumps) on a package build-up substrate, which is then solder-ball-attached to a PCB. NVidia's Pascal 100 GPU is an example. Since the TSV-interposer is very expensive [10,11], there is an opportunity for FOWLP.
STATS ChipPAC proposed [34,140] using the fan-out flip chip-eWLB to make the RDLs for the chips to perform mostly lateral communications as shown in Fig. 36. It can be seen that the TSV interposer, wafer bumping, fluxing, chip-to-wafer bonding, cleaning, and underfill dispensing and curing are eliminated.
ASE  proposed using the FOWLP technology (chip-first and die face-down on a temporary wafer carrier and then overmolded by the compression method) to make the RDLs for the chips to perform mostly lateral communications as shown in Fig. 37. ASE called it fan-out wafer-level chip-on-substrate (FOCoS). The TSV interposer, wafer bumping of the chips, fluxing, chip-to-wafer bonding, cleaning, and underfill dispensing and curing are eliminated. The bottom RDL is connected to the package substrate using UBM and the C4 bump as shown in Fig. 37.
Recently, Samsung [141,142] proposed using chip-last or RDL-first FOWLP to eliminate the TSV-interposer as shown in Fig. 38. First, they build the RDLs on a glass wafer or panel. In parallel, they perform wafer bumping of the logic and HBM. Then, they perform fluxing, chip-to-wafer bonding, cleaning, and underfill dispensing and curing. This is followed by EMC compression molding. Then, they backgrind the EMC, remove the carrier, and perform C4 wafer bumping. This is followed by attaching the whole module on the package substrate. Finally, they perform solder ball mounting and lid attachment. They called it Si-less RDL interposer.
Figure 39 shows the schematic of TSMC's InFO_oS (integrated fan-out on substrate) . The RDLs are fabricated by TSMC's InFO chip-first and die face-up technology. This InFO_oS is for high performance applications but not as high as those with CoWoS or CoWoS-2 technology.
TSMC  proposed to use the FOWLP technology + Cu-pillar/solder bump to eliminate the TSVs in the HBM cube. In the individual chip, the FOWLP is used to make the RDLs to fan out all the circuitries to the periphery of the package. The vertical interconnects of the individual packages are through the Cu-pillars and solder bumps, as shown in Fig. 40. Figure 41 shows another example , where six layers of chips are vertically interconnected without TSVs. FOWLP has been used to build the RDLs to fan out the circuitries to the periphery of the packages, and the vertical interconnects are through the Cu-pillars and microbumps.
Since Intel's proposal [146,147] of using the embedded multidie interconnect bridge to serve as the high-density interconnect between chips in a heterogeneous integration system, the “bridge” has been very popular. (Basically, a bridge is a piece of dummy silicon with RDLs and contact pads but without TSVs.) For example, recently, IMEC proposed  to use the bridges + FOWLP to interconnect the logic chip, wide I/O DRAM, and the flash memory as shown in Fig. 42. Their objective is not to use TSVs for all the device chips.
TSMC  demonstrated that the InFO_AiP (antenna-in-package) for high-performance and compact 5G millimeter-wave system integration is superior to the solder-bumped flip chip AiP on substrate as shown in Fig. 43. It can be seen that: (a) in the 28 GHz frequency range, the transmission loss for InFO RDLs (0.175 dB/mm) is 65% less than that on flip chip substrate trace (0.288 dB/mm), and (b) in the 38 GHz frequency range, the transmission loss for InFO RDLs (0.225 dB/mm) is 53% less than that (0.377 dB/mm) on flip-chip substrate trace.
Summary and Recommendations
The recent advances and trends of FOW/PLP have been presented in this study. Some important results and recommendations are summarized as follows:
First of all, TSMC/Apple saved the FOWLP technology. Before September 2016, most of the chips housed by the FOWLP technology were very small and the metal line width and spacing of the RDLs were very large. People looked down on the FOWLP technology.
Chip-first is a good choice for packaging semiconductor ICs such as baseband, RF/analog, PMIC, AP, and low-end ASICs, CPUs, and GPUs for portable, mobile, and wearable products, whereas chip-last (RDL-first) is potentially suitable for packaging IC devices such as high-end CPUs, GPUs, ASICs, and field-programmable gate arrays for supercomputers, servers, networking, and telecommunication products.
Chip-first and die face-down is the simplest and lowest-cost formation. In general, it applies to smaller chips, and the metal line width and spacing of the RDLs are ≥ 10 μm.
The process steps of chip-first and die face-up are a little more complicated than those of chip-first and die face-down and thus slightly higher in cost. In general, this formation applies to larger chips, and the metal line width and spacing of the RDLs are ≥ 5 μm.
The process steps of chip-last or RDL-first are the most complex and the highest in cost. However, this formation applies to very large chips, and the metal line width and spacing of the RDLs range from < 5 μm down to submicron. Thus, this process can only be afforded by very high-density and high-performance applications. On the other hand, for high-density and high-performance applications, why insist on the FOWLP technology when there are many other packaging alternatives?
Organic RDLs fabricated by polymer (either photosensitive or not) and ECD Cu + etching are the most common choice for FOWLP at OSATs and even foundries. They can be applied to the chip-first and chip-last formations.
Inorganic RDLs fabricated by PECVD and Cu-damascene + CMP are a back-end semiconductor method for the chip-last FOWLP formation. Judging by the change of the line width and spacing (from 5 μm to 10 μm) of the RDLs of the application processor chipsets (from A10 of iPhone 7 to A11 of iPhone 8), the chance of using PECVD and Cu-damascene + CMP in fabricating the RDLs for FOWLP is very slim (maybe only for niche applications). If there is a need for inorganic RDLs, however, why insist on the FOWLP technology?
Hybrid RDLs fabricated by inorganic RDL first and then organic RDLs are a mixed method for the chip-last FOWLP formation. Again, if there is a need for hybrid RDLs, why insist on the FOWLP technology?
RDLs by pure PCB/LDI technology are for chip-first FOPLP. No semiconductor equipment is required, and it is the highest-throughput and lowest-cost technology. However, the chip sizes are small (< 8 mm × 8 mm) and the metal line width and spacing of the RDLs are, in general, large (≥ 10 μm).
Some important issues that should be noted and resolved in order to increase the throughput and yield and reduce the cost with FOPLP have been highlighted.
Warpage is a critical issue for FOW/PLP. Depending on the formation of the package and the number of RDLs, there are a few different warpages affecting the FOW/PLP process. What are the maximum allowable warpages? The rule of thumb is for a 300-mm reconstituted wafer, the maximum allowable warpage of the reconstituted wafer is 1 mm, but 0.5 mm is preferred for high yield. The maximum allowable warpage of the individual package (≤ 20 mm × 20 mm) is 0.2 mm, but 0.1 mm is preferred for high yield.
The junction-to-ambient thermal resistance (Rja) of a 10 mm × 10 mm chip in a 13.42 mm × 13.42 mm package is higher (i.e., the lower the thermal performance) for thinner chips. This is because of the inferior thermal spreading capability of thinner chips.
Because of the drive of 5G, AI, and ML, there are many opportunities for FOWLP to house (package) various semiconductor devices for mobile, high-performance computing, self-driving car, and IoT applications. For example: (1) by using the chip-first FOWLP to eliminate the TSV-interposer for multiple flip chips on package substrate, (2) by using the chip-last FOWLP to eliminate the TSV-interposer for SoC and HBM cubes on package substrate, (3) by using the chip-first FOWLP to construct the HBM cubes without TSVs, (4) by using the bridge + FOWLP to interconnect all the chips (without TSVs) in a 3D IC heterogeneous integration system, and (5) by using the fan-out AiP to reduce the transmission loss.