With the rapid development of artificial intelligence and edge computing, the Computing-In-Memory (CIM) architecture is widely regarded as a key approach to alleviating the von Neumann bottleneck. While 3D NAND-based CIM schemes offer distinct advantages in storage density and process maturity, their accuracy degrades when they execute analog computing tasks such as Matrix-Vector Multiplication (MVM). The primary cause is the broadening of the string current distribution, which leads to accumulated current deviations. In particular, the polysilicon grain boundaries (GBs) within the Top Select Gate (TSG) channel of 3D NAND strings play a decisive role in determining this current distribution.
To address this challenge, this study uses Technology Computer-Aided Design (TCAD) to construct a mature TSG Deck device model and analyzes how the potential barriers induced by GB traps affect on-state current fluctuations. Simulation results demonstrate that acceptor-like traps at grain boundaries induce local potential barriers, and that the variance of these barriers is the dominant physical source of on-state current instability. Guided by these physical insights, a novel process optimization strategy is proposed that modulates the equivalent hydrogen passivation window by combining polysilicon precursors with distinct nucleation and hydrogen-content characteristics (denoted as NS, MS, and DS). Specifically, inserting a 9-nm DS precursor interlayer between the NS nucleation layer and the MS bulk-fill layer creates a low-defect buffer zone, achieving in-situ hydrogen passivation of deep-level traps without compromising interface smoothness.
Wafer-scale statistical analysis of on-state current distributions across different process splits confirms that the optimal precursor combination reduces the normalized standard deviation of the bit-line current by 50% compared to the baseline process. Furthermore, to evaluate the system-level impact, the measured current distributions were fitted to a skew-t distribution and injected as multiplicative noise into a custom CIM simulation framework. System-level simulations of INT8-quantized inference for the GPT-2 124M model indicate that the optimized device characteristics reduce MVM calculation errors by 14.7% to 66.8%, depending on the weight matrix dimensions. In conclusion, this work bridges device-level process optimization with system-level performance, providing a highly manufacturable design basis for high-precision 3D NAND CIM chips.
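As an illustration of the noise-injection step described above, the following minimal Python sketch perturbs each product of an integer MVM with a multiplicative skew-t factor and reports the resulting relative error. It is not the custom CIM framework used in this work. The function names, the Azzalini-type skew-t construction, and the parameters `nu`, `alpha`, and `scale` are all assumptions chosen for illustration; they are not the fitted values from the measured distributions.

```python
import math
import random


def sample_skew_t(rng, nu, alpha):
    """Draw one standardized skew-t variate (assumed Azzalini-type construction)."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    u0 = abs(rng.gauss(0.0, 1.0))
    u1 = rng.gauss(0.0, 1.0)
    z = delta * u0 + math.sqrt(1.0 - delta * delta) * u1  # skew-normal variate
    w = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu degrees of freedom
    return z / math.sqrt(w / nu)


def normalized_std(samples):
    """Standard deviation divided by the absolute mean (population form)."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return math.sqrt(var) / abs(mean)


def noisy_mvm(weights, x, rng, scale, nu=5.0, alpha=2.0):
    """MVM in which every product w*x is scaled by (1 + scale * skew-t noise)."""
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            gain = 1.0 + scale * sample_skew_t(rng, nu, alpha)
            acc += gain * w * xi
        out.append(acc)
    return out


def relative_mvm_error(weights, x, rng, scale, nu=5.0, alpha=2.0):
    """L2 norm of (noisy - ideal) divided by the L2 norm of the ideal result."""
    ideal = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    noisy = noisy_mvm(weights, x, rng, scale, nu, alpha)
    num = math.sqrt(sum((n - i) ** 2 for n, i in zip(noisy, ideal)))
    den = math.sqrt(sum(i * i for i in ideal))
    return num / den
```

Calling `relative_mvm_error` with the same seed and two values of `scale` (a hypothetical stand-in for the baseline and optimized current spreads) gives a rough sense of how a narrower current distribution lowers the MVM error. In this toy model the error grows linearly with `scale`, so it does not reproduce the dependence on weight matrix dimensions reported above.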