Electronics on the Frontier
Outer space poses unique challenges to electronics design; the danger of cosmic rays and stray particles is ever present, and the near impossibility of in-flight maintenance means that dependability is the name of the game.
Our planet’s magnetic field, which largely shields us[1], is absent in deep space, exposing spacecraft to harsh radiation from the Sun and beyond. Yet Earth orbits are no safe haven either: the Van Allen radiation belts serve as conduits for charged particles that strike orbiting satellites, wreaking havoc on delicate silicon. Similar belts are present around other planets with a magnetic field, Jupiter’s being particularly intense, with a field strength orders of magnitude greater than Earth’s. Absent too, of course, is an atmosphere, and materials commonplace in terrestrial applications may be unsuitable in the vacuum of space: they might evaporate (only to later condense on an optical lens), trap moisture, or crack under thermal expansion, breaking yet another piece of processed semiconductor packaged inside.
Electronics and Radiation
Electronics in space fail from exposure to radiation in multiple ways, with the underlying cause stemming from its ionizing effects. Charged particles in the form of electrons, protons and ions zip through circuits, leaving ionized tracks in their wake. The particular radiation environment is dictated by a craft’s location in space; the inner and outer Van Allen belts, for instance, contain electrons and protons in different ratios and at different energies. While passive components are largely unaffected, this is not the case for semiconductor devices, and the effects can manifest over time as degraded circuit performance or be immediate.
In the former case, accumulated dose (TID, for total ionizing dose) results in trapped charge in the oxide layer between a transistor’s gate terminal and channel, shifting its threshold voltage, in turn causing increased leakage currents and, in the worst case, an inability to switch the transistor on or off. More abrupt consequences stem from instantaneous single-event effects (SEEs): high-energy ionizing events that can trigger parasitic structures in CMOS devices, causing a self-sustaining short circuit in a process known as latch-up, which can only be halted by cycling the power and, if left unmitigated, may cause catastrophic failure. Single-event gate rupture might also occur, where a conductive path is established straight through the gate oxide. Other effects may or may not ruin the day – upsets in a microcontroller’s program memory might cause the device to enter an invalid state and reset harmlessly, or they might lead to undesirable (and perhaps undetected) behaviour during critical phases; the same applies to an FPGA’s configuration memory. Upsets can interfere with analog circuitry too, for instance the feedback loops or pass transistors in linear regulators, potentially exposing downstream devices to overvoltage.
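To get a feel for the TID mechanism, the classic first-order relation is that the threshold shift equals the oxide-trapped charge divided by the oxide capacitance per unit area. The sketch below is purely illustrative (the charge density and oxide thickness are made-up example numbers, not taken from any real device):

```python
# Illustrative first-order model: delta_Vth = -delta_Q_ox / C_ox,
# where C_ox = eps_ox / t_ox is the oxide capacitance per unit area.
EPS_0 = 8.854e-12        # vacuum permittivity, F/m
EPS_OX = 3.9 * EPS_0     # SiO2 relative permittivity is about 3.9

def vth_shift(trapped_charge_density: float, t_ox: float) -> float:
    """Threshold-voltage shift (V) for a given areal trapped
    charge (C/m^2) and gate-oxide thickness (m)."""
    c_ox = EPS_OX / t_ox               # F/m^2
    return -trapped_charge_density / c_ox

# Hypothetical example: 1e-4 C/m^2 trapped in a 10 nm oxide
shift = vth_shift(1e-4, 10e-9)         # roughly -0.03 V
```

Note that a thicker oxide gives a larger shift for the same trapped charge, which is one reason older, larger process nodes tend to fare worse against TID even as they fare better against single-event upsets.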
Design and Mitigation
So how to mitigate? The radiation environment, mission lifetime and acceptable downtime all play a part in deciding just how much mitigation is needed. Shielding offers a degree of protection, particularly against electrons, but bringing mass to orbit and beyond is costly, and aluminium in practical amounts is unable to stop the more energetic particles. Protection thus also comes in other forms, among which is the use of radiation-hardened circuitry. These devices are often manufactured on processes that eliminate the previously mentioned risk of latch-up, being built on insulated substrates (silicon on insulator) that remove the parasitic thyristor structures, and on larger nodes (>100 nm), which are less susceptible to SEEs because more energy must be deposited in a larger node to actually flip a “1” to a “0”. In terms of memory technologies, ferroelectric and magnetoresistive memories show more promise than flash in terms of their tolerance against upsets (though ECC might be employed in any case), and relatedly, flash- or antifuse-based (i.e., one-time-programmable) FPGAs might be preferred over SRAM-based FPGAs. On top of this come various optimizations in the layout of the IC itself. Importantly for a designer, rad-hard devices are also characterized by the manufacturer, typically up to some level of TID in kilorads and with specified immunity to latch-up and other SEEs up to some level of linear energy transfer (the energy deposited by an ionizing particle in the device per unit length). This gives users some confidence in whether a part will survive for the duration of a mission, and in its susceptibility to various single-event effects.
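A manufacturer’s TID rating is typically weighed against a mission dose estimate with a design margin on top. The toy check below sketches that bookkeeping; the dose rate, mission length, rating and margin factor are all placeholder numbers (real dose estimates come from orbit-specific environment models, not from a calculation like this):

```python
# Hedged sketch: compare an estimated mission dose against a part's
# rated TID, with a radiation design margin (rdm) applied on top.
def tid_margin(dose_rate_krad_per_yr: float,
               mission_years: float,
               rated_tid_krad: float,
               rdm: float = 2.0) -> tuple[float, bool]:
    """Return (margined dose estimate in krad, True if within rating)."""
    expected = dose_rate_krad_per_yr * mission_years * rdm
    return expected, expected <= rated_tid_krad

# Placeholder numbers: 3 krad/yr for 7 years with a 2x margin is 42 krad,
# comfortably inside a hypothetical 100 krad rating.
dose, ok = tid_margin(dose_rate_krad_per_yr=3.0, mission_years=7,
                      rated_tid_krad=100.0)
```

The margin factor exists because both the environment model and the part’s lot-to-lot behaviour carry real uncertainty.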
Simply choosing a rad-hard device is not a guarantee of success by itself. Parts are rarely fully immune to SEEs, and transient excursions on the output of regulators can for instance be a concern, whereas voltage references might drift over time owing to the effects of accumulating TID. Additional mitigating circuitry must therefore often be incorporated in order to handle eventualities, especially if a fault can propagate to other sub-systems. A thorough FMEA and worst-case analysis can often catch these potential failure modes and lead to design changes or other forms of mitigation. Particularly critical circuitry might even be duplicated and left unpowered until needed, and triple modular redundancy with voting is sometimes used for processing systems.
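The voting step in triple modular redundancy is simple enough to show directly. A minimal bitwise majority voter looks like this (Python here for readability; in practice the voter lives in logic, and must itself be protected or assumed reliable):

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority vote: each output bit takes
    the value that at least two of the three inputs agree on. A single
    upset in one copy is masked; simultaneous upsets hitting the same
    bit in two copies are not."""
    return (a & b) | (a & c) | (b & c)

# A bit-flip in one of three redundant copies is voted away:
word = 0b1011_0010
corrupted = word ^ 0b0100_0000   # single upset in bit 6
result = tmr_vote(word, corrupted, word)   # equals word again
```

The same idea scales from single registers up to entire triplicated processors feeding a voting stage.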
Other Aspects
A myriad of other factors must also be taken into account. With no air to convect, thermal design must be handled carefully, and it is worth considering what happens when a 5×5 cm ceramic-packaged BGA has a different coefficient of thermal expansion (CTE) than the PCB material upon which it is mounted – the brittle ceramic might in fact crack under the strain, which is why such devices are sometimes mounted on literal pillars of solder (solder columns) that provide additional stress relief. Ceramic packaging is nonetheless often preferred, as it does not trap moisture, does not outgas, and its CTE matches well with that of certain semiconductors, reducing the risk of cracking the packaged die. When all is said and done, it is likely worthwhile to derate heavily across the board, and one might peruse the open standards published by the European Cooperation for Space Standardization (ECSS) for details on exactly how much, as well as on how several other aspects of electronics design for space can be handled.
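The CTE-mismatch problem can be put in rough numbers with the first-order relation that the shear displacement a joint must absorb is the CTE difference times the temperature swing times the distance from the package’s neutral point. The values below are ballpark illustrations, not design data:

```python
# Illustrative first-order estimate of the in-plane shear displacement
# a solder joint must absorb: delta = (alpha_pcb - alpha_pkg) * dT * DNP,
# where DNP is the joint's distance from the package's neutral point.
def cte_shear_um(alpha_pkg_ppm: float, alpha_pcb_ppm: float,
                 delta_t_k: float, dnp_mm: float) -> float:
    """Relative displacement (micrometres) between package and board."""
    strain = (alpha_pcb_ppm - alpha_pkg_ppm) * 1e-6 * delta_t_k
    return strain * dnp_mm * 1e3   # mm -> um

# Ballpark: ceramic (~7 ppm/K) on FR-4 (~17 ppm/K), a 100 K thermal
# swing, and a corner ball 25 mm from centre gives about 25 um of shear.
d = cte_shear_um(alpha_pkg_ppm=7, alpha_pcb_ppm=17,
                 delta_t_k=100, dnp_mm=25)
```

Tens of micrometres per thermal cycle, concentrated in short, stiff solder balls, is exactly the kind of repeated strain that taller solder columns are there to relieve.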
Of course, requirements must be tailored to each project, and with the advent of what is often referred to as New Space, some of the rigor typically associated with designs for space is occasionally traded away for lower cost, faster project cycles, smaller solution sizes, and greater capability, along with an elevated risk tolerance. Component manufacturers follow these trends too, with some providing devices that sit between standard commercial parts and their truly “rad-hard” counterparts, for instance parts in plastic packaging with moderate radiation tolerance. In terms of increased risk, a 5-in-100 failure rate may be acceptable if “100” refers to the number of satellites in a constellation, and not to the probability of failure of the core processor in a single bus-sized satellite years in the making.
[1] Possibly with the exception of certain Belgian local elections, where a candidate once tallied exactly 2¹⁰ additional and unexplainable votes, although whether this was due to rogue high-energy rays from a distant supernova triggering a bit-flip will remain unclear.