# Towards pW-Class IoT Nodes Using Crystalline Oxide Semiconductor Dynamic Logic

Tobias Kaiser and Friedel Gerfers Chair of Mixed Signal Circuit Design Technische Universität Berlin, Germany Email: kaiser@tu-berlin.de

*Abstract*—Minimum operating power limits possible energy sources in IoT nodes. This paper considers dynamic logic circuits based on c-axis aligned crystalline indium-gallium-zinc-oxide FETs as a design style that reduces the minimum operating power of digital systems. A method for ensuring timing and signal integrity and its integration into a standard-cell design flow are presented. Based on a generated logic library, a RISC-V CPU with an estimated minimum operating power of 6.0 pW is presented. This result shows that systems based on this logic family surpass comparable systems based on CMOS technologies in terms of minimum operating power.

## I. INTRODUCTION

Physical dimensions, functionality and lifetime of smart Internet-of-Things (IoT) objects such as wireless sensor nodes are limited by power requirements and the need to integrate suitable power sources into products. At low operating frequencies, static leakage currents in silicon transistors dominate power consumption and limit the possible energy sources that a system can operate from.

Recent approaches to reduce static power consumption in silicon-based systems include dynamic leakage suppression (DLS) logic [1, 2] and feedforward leakage suppression logic [3], which have pushed leakage power into the sub-nW range. Further reduction in leakage remains ultimately limited by parasitic conduction in silicon FETs.

Oxide semiconductor field effect transistors (OSFETs) based on c-axis aligned crystalline indium-gallium-zinc-oxide (CAAC-IGZO) are considered as a complement and alternative to silicon FETs due to their exceptionally good off-state performance with leakage currents below $10^{-23}$ A/µm [4]. OSFETs have previously been successfully employed in low-power display [5, 6], memory [7–9] and logic applications. Previously implemented normally-off low-power processors [10–12] incorporate the OSFET as an auxiliary device to aid state retention in designs that otherwise rely on silicon devices for computation. In contrast, this paper addresses feasibility, design flow and power consumption of digital systems entirely based on OSFETs using OSFET dynamic logic (OSDL).

Section II introduces OSDL and includes a timing and signal integrity analysis. Section III presents a standard-cell based design flow for complex logic systems based on OSDL. In Section IV, the implementation of a RISC-V processor using OSDL and further system-level considerations are described. In Section V, simulation results are presented and compared to previous work. The paper closes with a conclusion in Section VI.

Fig. 1. OSDL logic family (a) and corresponding clocking scheme (b). PC / EV mark precharge / evaluate transistors and corresponding clocking phases.

## II. OSFET DYNAMIC LOGIC

Because OSFETs are n-channel devices only, logic families such as CMOS that rely on complementary n- and p-channel devices cannot be realized without introducing leakage-prone silicon devices. Dynamic logic circuits do not require both n- and p-channel devices and therefore are the basis for OSDL. A working shift register has previously been demonstrated using this design style [9].

© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

This paper has been accepted for publication as: T. Kaiser and F. Gerfers, "Towards pW-Class IoT Nodes using Crystalline Oxide Semiconductor Dynamic Logic," 2020 IEEE International Symposium on Circuits and Systems (ISCAS), 2020, pp. 1-5, doi: 10.1109/ISCAS45731.2020.9181130.

Fig. 1a shows a circuit schematic of OSDL. The presented OSDL gate consists of a precharge transistor (PC), an evaluate transistor (EV) and a pull-down network (PDN). The PDN can realize arbitrary negative unate Boolean functions. To achieve a valid output signal, a positive clock pulse at the PC input has to be followed by a non-overlapping positive clock pulse at the EV input.

For clocking of PC and EV, a set of non-overlapping clocks  $\phi_0, \ldots, \phi_{n-1}$  is applied as depicted in Fig. 1b. Each logic gate is assigned to one logic phase *i* between 0 and n-1. The PC transistor is connected to clock  $\phi_i$  and the EV transistor to clock  $\phi_{i+1}$ . Hence, EV / PC clocks of gates assigned to adjacent phases are shared. Modulo *n* logic for clock indices is assumed.

The output  $\overline{Z}$  of a given logic gate of phase *i* is unconditionally precharged using the positive pulse of  $\phi_i$ . The following positive pulse of  $\phi_{i+1}$  then produces a valid output signal. Between the falling edge of  $\phi_{i+1}$  and the following rising edge of  $\phi_i$ , the output signal  $\overline{Z}$  is valid but not driven by any transistor. This phase is marked as *hold*. To ensure that input signals  $A_1, \ldots A_m$  are valid during the EV phase, they need to be driven by gates of neither phase *i* nor i + 1.
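The clocking rule above can be illustrated with a minimal Python sketch. The function names are hypothetical and not part of the paper; the sketch only encodes the stated rules that a phase-*i* gate is precharged by $\phi_i$, evaluated by $\phi_{i+1}$ (indices modulo *n*), and may only be driven by gates of phases other than *i* and *i* + 1.

```python
def gate_clocks(i, n):
    """Return the (precharge, evaluate) clock indices of a phase-i gate."""
    return i % n, (i + 1) % n


def allowed_driver_phases(i, n):
    """Phases whose gate outputs are in hold while a phase-i gate evaluates.

    A driver of phase j is precharged during phi_j and evaluated during
    phi_(j+1), so it is only stable during phi_(i+1) if j is neither i
    nor i + 1 (modulo n).
    """
    pc, ev = gate_clocks(i, n)
    return [p for p in range(n) if p not in (pc, ev)]


if __name__ == "__main__":
    for phase in range(4):
        print(phase, gate_clocks(phase, 4), allowed_driver_phases(phase, 4))
```

With four phases, each gate has exactly two admissible driver phases, which is one way to see why very small phase counts leave little freedom for connecting gates.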

## A. OSFET Manufacturing and Device Model

OSFETs have been manufactured and characterized for different target applications. Simulations in this paper are based on characterization data from a recently published 21 nm long and 25 nm wide OSFET [4] with fin-style top-gate and independent back-gate.

Even though OSDL itself utilizes only n-channel OSFETs, auxiliary p-channel devices are desired for integration of a clock generator and helpful for peripheral memory and I/O circuits. For this purpose, OSFETs can be manufactured on top of a standard silicon CMOS process or possibly be combined with organic p-type transistors on top of an insulating substrate.

A compact device model has been implemented for analog SPICE circuit simulation that employs the following equation for drain current  $I_{ds}$  in the subthreshold region:

$$I_{\rm ds} = \frac{W}{L} \cdot I_{\rm d,0} \cdot e^{(V_{\rm gs} + \chi V_{\rm bs} - V_{\rm th,0})/(n \cdot V_{\rm T})} \cdot (1 - e^{-V_{\rm ds}/V_{\rm T}}), \quad (1)$$

where $V_{\rm gs}$ denotes the gate-source voltage, $V_{\rm bs}$ the back-gate-source voltage, $V_{\rm ds}$ the drain-source voltage and $V_{\rm T}$ the thermal voltage. Parameter values of $I_{\rm d,0} = 8.4 \,\mathrm{nA}$, $\chi = 80 \,\mathrm{mV/V}$, $n = 1.51$ and $V_{\rm th,0} = 0.9 \,\mathrm{V}$ were used for the model.
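As an illustration, Eq. (1) can be evaluated numerically with a short Python sketch. This is not the authors' SPICE model: the function names are hypothetical, the default $W/L$ of 25 nm / 21 nm is an assumption taken from the dimensions of the characterized device, and the thermal voltage is computed as $kT/q$.

```python
import math

K_B_OVER_Q = 8.617333e-5  # Boltzmann constant over elementary charge, in V/K


def thermal_voltage(temp_c):
    """Thermal voltage kT/q in volts for a temperature in degrees Celsius."""
    return K_B_OVER_Q * (temp_c + 273.15)


def subthreshold_ids(v_gs, v_ds, v_bs=0.0, temp_c=60.0,
                     w_over_l=25.0 / 21.0,  # assumed single-fin geometry
                     i_d0=8.4e-9, chi=0.080, n=1.51, v_th0=0.9):
    """Drain current in amperes according to Eq. (1), subthreshold only."""
    v_t = thermal_voltage(temp_c)
    gate_term = math.exp((v_gs + chi * v_bs - v_th0) / (n * v_t))
    drain_term = 1.0 - math.exp(-v_ds / v_t)
    return w_over_l * i_d0 * gate_term * drain_term


if __name__ == "__main__":
    for v_gs in (0.0, 0.2, 0.4):
        print(v_gs, subthreshold_ids(v_gs, v_ds=0.9))
```

With $n = 1.51$, the model corresponds to a subthreshold swing $n V_{\rm T} \ln 10$ of roughly 100 mV per decade at 60 °C.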

## B. Output Node Capacitance Model

Dynamic logic relies on the capacitance of each output net $\overline{Z}$ to retain logic states during the hold phase. For pre-layout estimation, the output net capacitance $C_{\overline{Z}}$ due to input gates and interconnect is modeled as a function of fan-out (wire load model) and separated into two components, $C_{\overline{Z}}(\text{FO}) = C_{\overline{Z},G}(\text{FO}) + C_{\overline{Z},C}(\text{FO})$, where $C_{\overline{Z},G}$ contains capacitances to $V_{\text{DD}}$, $V_{\text{SS}}$ and $V_{\text{BB}}$ that can be lumped to ground, and $C_{\overline{Z},C}$ represents the coupling capacitance to other data signals that are possible crosstalk aggressors. Interconnect resistances are not modeled because they are significantly lower than the OSFET on-resistance.

To minimize signal degradation, the ratio of the capacitive divider $C_{\overline{Z},C}/C_{\overline{Z}}$ has to be kept small by shielding the output nets. We assume $\alpha = 0.9$ as the fraction of interconnect capacitance that contributes to $C_{\overline{Z},G}$. For our calculations we have conservatively modeled the interconnect with a capacitance of $0.2 \text{ fF}/\mu\text{m}$ and an approximate wire length of $7 \,\mu\text{m}/\text{FO}$, resulting in a per-fanout interconnect capacitance of $C_{\text{IC}/\text{FO}} = 1.4 \text{ fF}$. For simplicity, we furthermore assume that each fan-out connection is to a maximum-sized gate input of capacitance $C_{\text{g,in}}$. $C_{\text{d,EV}}$ and $C_{\text{s,PC}}$ are the capacitances seen from the drain of the EV transistor and the source of the PC transistor, respectively. A fixed per-cell storage capacitor $C_{\text{storage}}$ is used to enhance retention and reduce the capacitive divider. The capacitances were consequently modeled as

$$C_{\overline{Z},G}(\text{FO}) = C_{\text{s,PC}} + C_{\text{storage}} + \text{FO} \cdot \alpha \cdot C_{\text{IC}/\text{FO}}, \tag{2}$$

$$C_{\overline{Z},C}(\text{FO}) = C_{\text{d,EV}} + \text{FO} \cdot \left((1 - \alpha) \cdot C_{\text{IC}/\text{FO}} + C_{\text{g,in}}\right). \tag{3}$$

Due to an equivalent oxide thickness of 6 nm for the top gate, interconnect capacitances dominate  $C_{\overline{Z}}$ .
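Eqs. (2) and (3) translate directly into a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the 3 fF storage capacitor is the value chosen later in Section III-A, and the device capacitances $C_{\text{s,PC}}$, $C_{\text{d,EV}}$ and $C_{\text{g,in}}$ are not given in the text, so they are required arguments and the values in the demo call are placeholders.

```python
C_IC_PER_FO = 0.2e-15 * 7.0  # 0.2 fF/um times 7 um per fan-out = 1.4 fF
ALPHA = 0.9                  # fraction of interconnect capacitance to ground
C_STORAGE = 3e-15            # per-cell storage capacitor (Section III-A)


def output_capacitances(fo, c_s_pc, c_d_ev, c_g_in,
                        c_storage=C_STORAGE, alpha=ALPHA,
                        c_ic_per_fo=C_IC_PER_FO):
    """Return (C_G, C_C) in farads for fan-out fo, following Eqs. (2), (3)."""
    c_g = c_s_pc + c_storage + fo * alpha * c_ic_per_fo
    c_c = c_d_ev + fo * ((1.0 - alpha) * c_ic_per_fo + c_g_in)
    return c_g, c_c


def divider_ratio(fo, c_s_pc, c_d_ev, c_g_in):
    """Capacitive divider C_C / (C_G + C_C) that should be kept small."""
    c_g, c_c = output_capacitances(fo, c_s_pc, c_d_ev, c_g_in)
    return c_c / (c_g + c_c)


if __name__ == "__main__":
    # Placeholder device capacitances, chosen only to exercise the formulas.
    for fo in (1, 4, 13):
        print(fo, divider_ratio(fo, c_s_pc=0.1e-15, c_d_ev=0.05e-15,
                                c_g_in=0.1e-15))
```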

## C. Timing / Signal Integrity Characterization

For correct operation, it needs to be ensured that output nodes are sufficiently discharged when a continuous PDN path has been turned on by high input signals (*switch-on test*), and that a PDN path with a low input signal does not cause degradation of a precharged output signal beyond an acceptable margin (*switch-off test*). The goal of this characterization is to determine whether reliable logic operation can be achieved at all, and to find a global fan-out limit that ensures reliable operation. Initial characterization is done at the worst-case hot design point ($T = 60 \,^{\circ}$C), a supply voltage of $V_{DD} = 0.9 \,$V, a given back-gate voltage $V_{BB} = 0 \,$V and a minimum clock pulse width of $T_{\phi,min} = 83 \,$ms.

Local clock buffering of the clock phases is not required, since clock transition times are not a critical factor of OSDL timing and clock frequencies allow disregarding interconnect resistances.

Due to the asymmetry of using n-channel devices for both PC and EV, a complete discharge of output nodes corresponding to an output low voltage of  $V_{OL} = V_{SS}$  is achieved, whereas the PC transistor turns itself off as output node voltage rises, resulting in a logarithmic precharge curve and a high output voltage of  $V_{OH} < V_{DD}$  of a previously discharged output node. Fig. 2 shows typical signal behavior.

Fig. 2. Exemplary output signal waveforms for low-to-low, low-to-high, high-to-low and high-to-high transitions.

Fig. 3. Simulation setup for switch-on test.

a) Switch-on test: The switch-on testbench is shown in Fig. 3. It contains a precharge test gate that produces an output voltage $v_{PC}$ at an initially discharged $C_{\overline{Z}}(FO)$-sized capacitor for a given fan-out FO and a minimum-length precharge pulse $\phi_0$. Continuous PDN path inputs of a collection of evaluate test gates are connected to a voltage-dependent voltage source $v_{in}(v_{PC})$ which simulates degradation of the output voltage through the capacitive divider introduced in Section II-B and pre-defined noise margins for hold and evaluate $V_{\text{NM,hold}} = 20 \text{ mV}, V_{\text{NM,EV}} = 50 \text{ mV}$:

$$v_{\rm in}(v_{\rm PC}) = \frac{C_{\overline{Z},\rm G}(\rm FO)}{C_{\overline{Z}}(\rm FO)} \cdot v_{\rm PC} - V_{\rm NM,hold} - V_{\rm NM,EV} \tag{4}$$

After the PC pulse, an equally long EV pulse at $\phi_1$ discharges the initially fully charged $C_{\overline{Z}}$(FO)-sized capacitors at the nodes $v_{\text{out},i}$. In the testbench, the voltages $v_{\text{PC}}$ and $v_{\text{out},i}$ are measured after the clock pulse $\phi_1$. The global fan-out limit is given by the highest FO that allows full discharging of all voltages $v_{\text{out},i}$.

b) Switch-off test: Voltages at output node capacitors that have previously been fully discharged ($V_{OL} = V_{SS}$) during evaluation can also be degraded by the capacitive divider and $V_{\rm NM,hold}$. The maximum low input voltage $V_{\rm IL}$ can thus be calculated. The worst-case pull-down current $I_{\rm IL}$ of a PDN driven by $V_{\rm IL}$ is found through DC simulation and leads to the maximum clock pulse width for which degradation does not exceed $V_{\rm NM,EV}$:

$$V_{\rm IL} = \frac{C_{\overline{Z},\rm C}(\rm FO)}{C_{\overline{Z}}(\rm FO)} \cdot V_{\rm DD} + V_{\rm NM,hold} \tag{5}$$

$$T_{\phi,\max} = \frac{C_{\overline{Z}}(1) \cdot V_{\text{NM,EV}}}{I_{\text{IL}}} \tag{6}$$

This shows that maximum clock pulse width depends on the minimum output capacitance.  $C_{\text{storage}}$  can be used to ensure that  $T_{\phi,\text{max}} > T_{\phi,\text{min}}$  and timing margins are sufficient.
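The switch-off limits of Eqs. (5) and (6) can be sketched in Python as follows. The function names are hypothetical, and the capacitances and current in the demo call are placeholder values chosen for illustration, not values from the paper.

```python
def v_il(c_g, c_c, v_dd, v_nm_hold=0.020):
    """Maximum low input voltage in volts, Eq. (5)."""
    return c_c / (c_g + c_c) * v_dd + v_nm_hold


def t_phi_max(c_z_fo1, i_il, v_nm_ev=0.050):
    """Maximum clock pulse width in seconds, Eq. (6).

    c_z_fo1 is the total output capacitance at fan-out 1, which is the
    smallest and therefore the most easily degraded output node.
    """
    return c_z_fo1 * v_nm_ev / i_il


if __name__ == "__main__":
    # Placeholder inputs: 0.8 fF to ground, 0.2 fF coupling, 1 V supply.
    print(v_il(0.8e-15, 0.2e-15, 1.0))
    # Placeholder inputs: 5 fF output node, 1 fA worst-case PDN current.
    print(t_phi_max(5e-15, 1e-15))
```

Because $T_{\phi,\max}$ scales linearly with $C_{\overline{Z}}(1)$, increasing $C_{\text{storage}}$ directly widens the allowed clock pulse.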

## D. Retention Time

During normal operation in the hold phase, OSDL output node voltages can be degraded by leakage currents through the EV and PC transistors. The current  $I_{off}$  through an EV transistor with  $V_{gs} = 0$  V determines retention time  $T_{ret}$  as shown in Fig. 1b and the associated minimum operating frequency for always-on operation:

$$T_{\rm ret} = \frac{C_{\bar{Z}}(1) \cdot V_{\rm NM, hold}}{I_{\rm off}} \tag{7}$$
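Eq. (7) can be checked numerically with a short Python sketch. The function names are hypothetical. The example inverts Eq. (7) with the 60 °C values of $T_{\rm ret}$ and $I_{\rm off}$ reported in Table I; the resulting capacitance is an inference from those values and is not stated in the paper.

```python
def retention_time(c_z_fo1, i_off, v_nm_hold=0.020):
    """Retention time in seconds, Eq. (7)."""
    return c_z_fo1 * v_nm_hold / i_off


def implied_capacitance(t_ret, i_off, v_nm_hold=0.020):
    """Invert Eq. (7) to obtain the capacitance implied by a retention time."""
    return t_ret * i_off / v_nm_hold


if __name__ == "__main__":
    # Table I at 60 degrees C: T_ret = 10.1 s, I_off = 9.6 aA.
    c_implied = implied_capacitance(10.1, 9.6e-18)
    print(c_implied)
    print(retention_time(c_implied, 9.6e-18))
```

The back-calculated value of about 4.85 fF is of the same order as the 3 fF storage capacitor plus the 1.4 fF per-fan-out interconnect capacitance, which suggests the tabulated retention times follow Eq. (7).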

## E. Back-Gate Temperature Compensation

It is desirable to keep  $I_{ds}$  temperature-invariant for a gate voltage between  $V_{IL}$  and  $V_{IH}$ . This can be achieved with dynamic  $V_{BB}$  biasing. The back-gate biasing proposed here keeps  $I_{IH}$ , the drain current corresponding to  $V_{IH}$ , temperature-invariant. This results in a constant  $T_{\phi,\min}$  and increasing  $T_{\phi,\max}$  and  $T_{ret}$  for decreasing temperatures.
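One way to see how such a bias follows from Eq. (1) is the Python sketch below. It is a simplified estimate, not the authors' biasing circuit: it assumes that only $V_{\rm T}$ changes with temperature ($I_{\rm d,0}$, $n$ and $V_{\rm th,0}$ are treated as constant), that the drain term of Eq. (1) is close to one, that $V_{\rm bs} = V_{\rm BB}$, and that $V_{\rm BB} = 0$ V at the 60 °C reference point. Keeping the exponent of Eq. (1) constant at $V_{\rm gs} = V_{\rm IH}$ then gives $\chi V_{\rm BB} = (V_{\rm th,0} - V_{\rm IH})(1 - T/T_{\rm ref})$. The function name is hypothetical, and $V_{\rm IH} = 449$ mV is the worst-case level reported in Section V.

```python
def v_bb_for_constant_i_ih(temp_c, temp_ref_c=60.0,
                           v_ih=0.449, v_th0=0.9, chi=0.080):
    """Back-gate voltage that keeps the exponent of Eq. (1) constant.

    Assumes V_BB = 0 V at temp_ref_c and that the thermal voltage is the
    only temperature-dependent quantity in the model.
    """
    t = temp_c + 273.15
    t_ref = temp_ref_c + 273.15
    return (v_th0 - v_ih) * (1.0 - t / t_ref) / chi


if __name__ == "__main__":
    for temp_c in (-30.0, 0.0, 30.0, 60.0):
        print(temp_c, v_bb_for_constant_i_ih(temp_c))
```

Under these assumptions the sketch yields values within roughly 10 mV of the $V_{\rm BB}$ row of Table I, i.e. an approximately linear bias of about 0.5 V per 30 K.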

## III. STANDARD-CELL BASED DESIGN FLOW

## A. OSDL Cell Library

A Liberty timing library, SPICE sub-circuits and functional Verilog views of an OSDL library with 51 gates were generated. Each gate implements a single-stage negative unate function such as INV, NOR2, NAND4, AOI333, OAI333. A 10-fin PC transistor, 1-fin EV transistor and 2- to 4-fin PDN transistors (depending on stacking) have been used. A per-cell storage capacitance  $C_{\text{storage}}$  of 3 fF has been chosen.

The generated Liberty model assigns a constant propagation time and the previously discussed library-wide maximum fan-out to all gates. The constant propagation time directs the timing-driven synthesis tool to minimize the number of logic stages.

## B. Logic Synthesis

To enable RTL-to-netlist synthesis with conventional synthesis tools, a dummy flip-flop cell (DFF) has been added to the Liberty timing library. A typical single-clock, synchronous-reset RTL design is then translated into a netlist that contains both OSDL gates and DFFs, where the number of OSDL gates between DFFs is optimized.

This netlist is then transformed by a custom post-processing script into a functioning OSDL design using the following steps:

- 1) Every OSDL gate is assigned to a clock phase *i* where *i* is the farthest distance in number of gates from its inputs to a DFF.
- 2) The number of clock phases *n* is set in accordance with the maximal *i* present in the design. *n* has to be at least four.
- 3) For each input of an OSDL gate G (and every design output port) that is directly connected to a DFF or a chain of DFFs:
  - a) A set of valid (cycle, phase) tuples for the OSDL input signal is determined based on G's phase.
  - b) The (cycle, phase) tuple for the OSDL gate J at the DFF (chain) input is given by J's phase and the number of DFFs between J and G.
  - c) Buffers (INV pairs) are inserted between J and G until the (cycle, phase) tuple of the final buffer output is within the set of valid (cycle, phase) tuples. If possible, existing buffers are reused.
  - d) The possibly buffered output of J now replaces the DFF as driver of G's input.
- 4) All DFFs are removed from the netlist.

TABLE I

PARAMETERS OF SIMULATED OSDL LOGIC FAMILY AND CPU CORE

|                      | $T=-30^{\circ}\mathrm{C}$ | $T=0^{\circ}\mathrm{C}$ | $T=30^{\circ}\mathrm{C}$ | $T=60^{\circ}\mathrm{C}$ |
|----------------------|---------------------------|-------------------------|--------------------------|--------------------------|
| $V_{\rm BB}$         | 1.515 V                   | 1.01 V                  | 0.505 V                  | 0 V                      |
| $I_{\rm IH}$         | 201 fA                    |                         |                          | 204 fA                   |
| $I_{\rm IL}$         | 35 aA                     | 94 aA                   | 210 aA                   | 407 aA                   |
| $I_{\rm off}$        | 0.2 aA                    | 1.0 aA                  | 3.4 aA                   | 9.6 aA                   |
| $T_{\phi,\max}$      | 999 ms                    | 391 ms                  | 184 ms                   | 99 ms                    |
| $T_{\rm ret}$        | 470 s                     | 96.8 s                  | 28.5 s                   | 10.1 s                   |
| $P_{\rm min,avg}$    | 0.36 pW                   | 1.78 pW                 | 6.0 pW                   | 17.0 pW                  |
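Steps 1) and 2) of the post-processing procedure in this subsection can be sketched in Python as a longest-path computation. This is not the authors' script: the netlist representation (a dictionary mapping each gate to the names of its drivers, plus a set of DFF names) and all function names are hypothetical, and the rule $n = \max(4, i_{\max} + 1)$ is one assumed reading of "in accordance with the maximal *i*".

```python
def assign_phases(gate_inputs, dffs):
    """Assign each gate the longest gate distance from a DFF to its inputs.

    gate_inputs maps a gate name to the list of its driver names. A driver
    is either a DFF (listed in dffs) or another gate. The logic between
    DFFs is assumed to be acyclic.
    """
    phases = {}

    def phase(gate):
        if gate not in phases:
            phases[gate] = max(
                0 if driver in dffs else phase(driver) + 1
                for driver in gate_inputs[gate]
            )
        return phases[gate]

    for gate in gate_inputs:
        phase(gate)
    return phases


def number_of_phases(phases, minimum=4):
    """Number of clock phases: enough for all gates, but at least four."""
    return max(minimum, max(phases.values()) + 1)


if __name__ == "__main__":
    netlist = {
        "g1": ["ff_a", "ff_b"],  # fed by DFFs only
        "g2": ["g1", "ff_a"],    # one gate away from a DFF
        "g3": ["g2", "g1"],      # two gates away from a DFF
    }
    phases = assign_phases(netlist, dffs={"ff_a", "ff_b"})
    print(phases, number_of_phases(phases))
```

Step 3) then only has to repair the connections that used to pass through DFFs, since gate-to-gate connections produced by this assignment always link a driver to a gate of a later phase.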

## IV. RISC-V PROCESSOR AND SYSTEM INTEGRATION

The open-source PicoRV32 RISC-V CPU core [13] was synthesized in RV32I two-cycle-ALU configuration using the described standard-cell based OSDL design flow.

An integrated circuit suitable for pW-class IoT nodes would likely include an on-chip energy harvester, an ultra-low-power clock generator, memory and I/O alongside the OSDL CPU core. CMOS oscillators based on DLS logic, such as [14], are possible candidates for clock generation, as they function within the desired pW-level power budget. For main memory, a dynamic RAM based on OSFETs requiring no periodic refresh (DOSRAM) [7–9] could be utilized. To save power, the working registers of the OSDL CPU, which are currently realized using OSDL gates and therefore periodically refreshed, could also be implemented as DOSRAM.

## V. SIMULATION RESULTS

The results of the switch-on test using $T_{\phi,\min} = 83 \text{ ms}$ for five OSDL gates are shown in Fig. 4. Based on those results, the global fan-out limit was set to 13. Resulting worst-case levels are $V_{\text{OH}} = 631 \text{ mV}$ and $V_{\text{IH}} = 449 \text{ mV}$. The switch-off test using a calculated $V_{\text{IL}} = 179 \text{ mV}$ leads to a maximum clock pulse width $T_{\phi,\max} = 99 \text{ ms}$ at $60 \,^{\circ}\text{C}$.

Temperature-dependent timing parameters using the dynamic- $V_{\rm BB}$  temperature compensation scheme from Section II-E are shown in Table I.

The resulting RISC-V CPU core utilizes 12 clock phases and consists of 11736 gates with a total transistor width of  $4520 \,\mu\text{m}$ . The maximum clock frequency is  $1 \,\text{Hz}$ . A test program was executed using gate-level digital simulation and an internal energy consumption of  $62.8 \,\text{pJ/cycle}$  was found.



Fig. 4. Switch-on test simulation results.

 TABLE II

 COMPARISON OF OSDL CPU WITH OTHER LOW-LEAKAGE CPUS

|                          | Proposed<br>OSDL CPU     | ISSCC '18<br>[2]      | VLSI '17<br>[15]    | ISSCC '15<br>[1]    |
|--------------------------|--------------------------|-----------------------|---------------------|---------------------|
| Technology               | 21 nm<br>CAAC-IGZO       | 180 nm Si             | 65 nm Si            | 180 nm Si           |
| Type                     | pre-layout<br>estimation | measurement results   | measurement results | measurement results |
| Architecture             | PicoRV32<br>(RV32I)      | MSP430-<br>compatible | ARM<br>Cortex-M0+   | ARM<br>Cortex-M0+   |
| Clock gen.               | not impl.                | included              | excluded*           | included            |
| Freq. range              | 0.035–1 Hz               | 1 Hz–2.8 MHz          | 12 kHz–60 MHz       | 2–15 Hz             |
| Min. power               | 6.0 pW                   | 595 pW                | 46 nW               | 295 pW              |
| Min. energy<br>per cycle | 171.5 pJ                 | 14 pJ                 | 6.3 pJ              | 44.7 pJ             |
| Memory                   | not impl.                | 2 KB                  | 12 KB               | 128 B               |
| Supply volt.             | 0.9 V                    | 0.2–1.1 V             | 0.29–1.2 V          | 0.16–1.15 V         |

\*Implemented clock generator is excluded to improve comparability.

An additional energy of 108.7 pJ/cycle was calculated for charging and discharging of the clock nets based on the wire load model, leading to a total of 171.5 pJ/cycle. The minimum average operating power values $P_{\min,\text{avg}}$ at retention speed are shown in Table I. A peak power requirement of 35 pW was found using transient simulation for the highest-load phase $\phi_0$ and an optimized 184 ms wide clock pulse.
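The reported power figures can be cross-checked with simple arithmetic, sketched here in Python. Two relations are assumptions that appear consistent with the reported numbers rather than statements from the text: that one CPU cycle consists of $n$ consecutive clock pulses of width $T_{\phi,\min}$ at maximum speed, and that at retention speed the cycle period equals $T_{\rm ret}$. The function names are hypothetical.

```python
E_INTERNAL = 62.8e-12   # internal energy per cycle in joules
E_CLOCK = 108.7e-12     # clock net energy per cycle in joules
N_PHASES = 12
T_PHI_MIN = 0.083       # minimum clock pulse width in seconds
T_RET = {-30: 470.0, 0: 96.8, 30: 28.5, 60: 10.1}  # Table I, in seconds


def energy_per_cycle():
    """Total energy per cycle in joules."""
    return E_INTERNAL + E_CLOCK


def max_frequency(n_phases=N_PHASES, t_phi_min=T_PHI_MIN):
    """Maximum clock frequency, assuming a cycle is n minimum-width pulses."""
    return 1.0 / (n_phases * t_phi_min)


def min_avg_power(temp_c):
    """Minimum average power, assuming one cycle per retention time."""
    return energy_per_cycle() / T_RET[temp_c]


if __name__ == "__main__":
    print(energy_per_cycle(), max_frequency())
    for temp_c in sorted(T_RET):
        print(temp_c, min_avg_power(temp_c))
```

Under these assumptions the sketch reproduces the 171.5 pJ/cycle total, the 1 Hz maximum frequency and the $P_{\min,\text{avg}}$ row of Table I to within rounding.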

Table II compares the proposed OSDL CPU with state-of-the-art conventional [15] and DLS CPUs [1, 2].

## VI. CONCLUSION

In this paper, we introduced a method for ensuring timing and signal integrity of OSDL circuits and a standard-cell based design flow that can be used to port existing designs to crystalline oxide semiconductor technology. Using an OSDL implementation of an open-source RISC-V core, we showed that OSDL can drastically reduce the minimum operating power of digital systems. This enables the design of IoT nodes powered by pW-level on-chip energy harvesting sources.

## REFERENCES

- [1] W. Lim, I. Lee, D. Sylvester, and D. Blaauw, "Batteryless sub-nW Cortex-M0+ processor with dynamic leakage-suppression logic," in 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers, Feb 2015, pp. 1–3.
- [2] L. Lin, S. Jain, and M. Alioto, "A 595 pW 14 pJ/cycle microcontroller with dual-mode standard cells and self-startup for battery-indifferent distributed sensing," in 2018 IEEE International Solid-State Circuits Conference (ISSCC), Feb 2018, pp. 44–46.
- [3] J. P. Cerqueira, J. Li, and M. Seok, "A fW- and kHz-class feedforward leakage self-suppression logic requiring no external sleep signal to enter the leakage suppression mode," *IEEE Solid-State Circuits Letters*, vol. 1, no. 6, pp. 150–153, June 2018.
- [4] H. Kunitake, K. Ohshima, K. Tsuda, N. Matsumoto, T. Koshida, S. Ohshita, H. Sawai, Y. Yanagisawa, S. Saga, R. Arasawa, T. Seki, R. Honda, H. Baba, D. Shimada, H. Kimura, R. Tokumaru, T. Atsumi, K. Kato, and S. Yamazaki, "A *c*-axis-aligned crystalline In-Ga-Zn oxide FET with a gate length of 21 nm suitable for memory applications," *IEEE Journal of the Electron Devices Society*, vol. 7, pp. 495–502, 2019.
- [5] S. Yamazaki, J. Koyama, Y. Yamamoto, and K. Okamoto, "15.1: Research, development, and application of crystalline oxide semiconductor," *SID Symposium Digest of Technical Papers*, vol. 43, no. 1, pp. 183–186, 2012. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/j.2168-0159.2012.tb05742.x
- [6] S. Kawashima, S. Inoue, M. Shiokawa, A. Suzuki, S. Eguchi, Y. Hirakata, J. Koyama, S. Yamazaki, T. Sato, T. Shigenobu, Y. Ohta, S. Mitsui, N. Ueda, and T. Matsuo, "44.1: Distinguished paper: 13.3-in. 8K x 4K 664-ppi OLED display using CAAC-OS FETs," *SID Symposium Digest of Technical Papers*, vol. 45, no. 1, pp. 627–630, 2014. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/j.2168-0159.2014.tb00164.x
- [7] T. Atsumi, S. Nagatsuka, H. Inoue, T. Onuki, T. Saito, Y. Ieda, Y. Okazaki, A. Isobe, Y. Shionoiri, K. Kato, T. Okuda, J. Koyama, and S. Yamazaki, "DRAM using crystalline oxide semiconductor for access transistors and not requiring refresh for more than ten days," in 2012 4th IEEE International Memory Workshop, May 2012, pp. 1–4.

- [8] T. Matsuzaki, T. Onuki, S. Nagatsuka, H. Inoue, T. Ishizu, Y. Ieda, N. Yamade, H. Miyairi, M. Sakakura, Y. Shionoiri, K. Kato, T. Okuda, J. Koyama, and S. Yamazaki, "A 16-level-cell memory with c-axis-aligned a-b-plane-anchored crystal In-Ga-Zn oxide FET using threshold voltage cancel write method," *Japanese Journal of Applied Physics*, vol. 55, no. 4S, p. 04EE02, Feb 2016. [Online]. Available: https://doi.org/10.7567%2Fjjap.55.04ee02
- [9] S. Maeda, S. Ohshita, K. Furutani, Y. Yakubo, T. Ishizu, T. Atsumi, Y. Ando, D. Matsubayashi, K. Kato, T. Okuda, M. Fujita, and S. Yamazaki, "A 20ns-write 45ns-read and $10^{14}$-cycle endurance memory module composed of 60nm crystalline oxide semiconductor transistors," in 2018 IEEE International Solid-State Circuits Conference - (ISSCC), Feb 2018, pp. 484–486.
- [10] H. Tamura, K. Kato, T. Ishizu, W. Uesugi, A. Isobe, N. Tsutsui, Y. Suzuki, Y. Okazaki, Y. Maehashi, J. Koyama, Y. Yamamoto, S. Yamazaki, M. Fujita, J. Myers, and P. Korpinen, "Embedded SRAM and Cortex-M0 core using a 60-nm crystalline oxide semiconductor," *IEEE Micro*, vol. 34, no. 6, pp. 42–53, Nov 2014.
- [11] T. Onuki, W. Uesugi, A. Isobe, Y. Ando, S. Okamoto, K. Kato, T. R. Yew, J. Y. Wu, C. C. Shuai, S. H. Wu, J. Myers, K. Doppler, M. Fujita, and S. Yamazaki, "Embedded memory and ARM Cortex-M0 core using 60-nm c-axis aligned crystalline indium–gallium–zinc oxide FET integrated with 65-nm Si CMOS," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 4, pp. 925–932, April 2017.
- [12] S. Yamazaki and M. Fujita, *Physics and Technology of Crystalline Oxide Semiconductor CAAC-IGZO: Application to LSI.* Wiley, Dec. 2016.
- [13] C. Wolf, "PicoRV32 a size-optimized RISC-V CPU," 2019. [Online]. Available: https://github.com/cliffordwolf/picorv32/
- [14] O. Aiello, P. Crovetti, L. Lin, and M. Alioto, "A pW-power Hz-range oscillator operating with a 0.3–1.8-V unregulated supply," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 5, pp. 1487–1496, May 2019.
- [15] J. Myers, A. Savanth, P. Prabhat, S. Yang, R. Gaddh, S. O. Toh, and D. Flynn, "A 12.4 pJ/cycle sub-threshold, 16 pJ/cycle near-threshold ARM Cortex-M0+ MCU with autonomous SRPG/DVFS and temperature tracking clocks," in 2017 Symposium on VLSI Circuits, June 2017, pp. C332–C333.