NEWS ARTICLE
Academic News: A Predictive Dual-Stage Neural Framework for Phase-Coherent Auditory Synthesis on Edge Devices
Paper Title: A Predictive Dual-Stage Neural Framework for Phase-Coherent Auditory Synthesis on Edge Devices
Publication Date: 2026-05-25
Authors: Sathit Pairoch, Pattarapong Phasukkit, Teeraporn Suteewong
DOI:10.3390/s26113344
Abstract: Real-time binaural beat synthesis in dynamic acoustic environments is challenged by carrier non-stationarity, interaural phase discontinuities, and processing delay in conventional digital signal processing pipelines. This study proposes a predictive dual-stage neural framework for phase-coherent auditory synthesis under non-stationary acoustic conditions. The framework decouples real-time carrier estimation from phase-coherent signal generation through two specialized modules. An intelligent acoustic sensing module (AI-1) estimates time-varying carrier information across harmonic, fluctuating, and broadband acoustic profiles using a causal neural front-end with an adaptive confidence-driven strategy. A predictive phase-coherent generator (AI-2) then forecasts short-horizon carrier trajectories and drives a discrete-time phase accumulator to maintain continuous phase evolution during binaural beat embedding. Objective evaluation under multiple acoustic profiles and noise conditions shows that the proposed framework maintains strong phase continuity, with a Phase Coherence Factor greater than 0.91, and low artifact levels, with a Signal-to-Artifact Ratio greater than 39.8 dB, under the evaluated conditions. Additional comparisons with conventional DSP baselines, stronger classical F0 estimators, a lightweight neural F0 tracker, and component-wise ablation variants further demonstrate that the performance improvement arises from the combination of adaptive carrier estimation and predictive phase-coherent actuation, rather than from carrier estimation alone. Hardware profiling shows a combined INT8 inference time of 2.4 ms per frame on a resource-constrained Raspberry Pi Zero 2W-class edge device. Importantly, this inference time and the sub-millisecond phase-accumulator resolution should not be interpreted as sub-millisecond end-to-end physical audio latency. The complete system still includes buffering, framing, neural inference, and output processing delay; the proposed method instead reduces effective phase-boundary misalignment through short-horizon predictive compensation. These results support the proposed framework as a lightweight engineering solution for real-time phase-continuous auditory synthesis in dynamic listening environments. The reported PCF and SAR values should be interpreted as signal-level indicators of phase continuity and artifact suppression, rather than as evidence of listener comfort, perceptual preference, or neurophysiological efficacy.
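Editorial illustration: the abstract mentions a discrete-time phase accumulator that keeps phase evolution continuous while the carrier changes. The sketch below shows that general technique only and is not the authors' implementation. The function name `binaural_frames`, the per-frame carrier list, the fixed beat offset, the sample rate, and the frame length are all assumptions chosen for illustration; in the paper the carrier values would come from the AI-1 and AI-2 modules, which are not modeled here.

```python
import math


def binaural_frames(carrier_hz, beat_hz, sample_rate, frame_len):
    """Yield (left, right) sample lists, one pair per frame.

    carrier_hz: hypothetical per-frame carrier estimates in Hz.
    beat_hz:    assumed fixed interaural frequency offset in Hz.
    The running phases persist across frames, so a change in carrier
    frequency alters only the phase increment, never the phase itself.
    """
    two_pi = 2.0 * math.pi
    phase_l = 0.0  # left-channel accumulator, radians
    phase_r = 0.0  # right-channel accumulator, radians
    for f0 in carrier_hz:
        inc_l = two_pi * f0 / sample_rate
        inc_r = two_pi * (f0 + beat_hz) / sample_rate
        left, right = [], []
        for _ in range(frame_len):
            left.append(math.sin(phase_l))
            right.append(math.sin(phase_r))
            # Advance and wrap; wrapping by 2*pi leaves sin() unchanged.
            phase_l = (phase_l + inc_l) % two_pi
            phase_r = (phase_r + inc_r) % two_pi
        yield left, right


if __name__ == "__main__":
    # Three frames with a drifting carrier and a 10 Hz beat offset.
    frames = list(binaural_frames([200.0, 220.0, 210.0], 10.0, 16000, 160))
    left = [s for frame_left, _ in frames for s in frame_left]
    jump = max(abs(b - a) for a, b in zip(left, left[1:]))
    print(f"largest sample-to-sample step in left channel: {jump:.4f}")
```

Because the accumulators are never reset at frame boundaries, the step between consecutive samples stays bounded by the per-sample phase increment even when the carrier estimate changes from one frame to the next. Restarting the oscillator at each frame would instead produce the kind of phase discontinuity the abstract describes.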
Metadata: MDPI Sensors paper indexed by Crossref. DOI: 10.3390/s26113344. Vol. 26, Issue 11. Authors: Sathit Pairoch, Pattarapong Phasukkit, Teeraporn Suteewong.
Open License: https://creativecommons.org/licenses/by/4.0/
Original Link: https://doi.org/10.3390/s26113344
PDF Link: https://www.mdpi.com/1424-8220/26/11/3344/pdf