Reversible Spectral Speech Watermarking with Variable Embedding Locations Against Spectrum-Based Attacks
Guaranteeing the reliability and integrity of audio data has become an essential topic with the fast development of generative AI. Significant progress in machine learning and speech synthesis has increased the potential for audio tampering. In this paper, we focus on the digital waterma...
Main Authors: | Xuping Huang, Akinori Ito |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-01-01 |
Series: | Applied Sciences |
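Subjects: | audio watermarking; modified integer DCT coefficient expansion; reversibility and imperceptibility; likelihood by spectral analysis |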
Online Access: | https://www.mdpi.com/2076-3417/15/1/381 |
_version_ | 1841549340545384448 |
---|---|
author | Xuping Huang; Akinori Ito |
author_facet | Xuping Huang; Akinori Ito |
author_sort | Xuping Huang |
collection | DOAJ |
description | Guaranteeing the reliability and integrity of audio data has become an essential topic with the fast development of generative AI. Significant progress in machine learning and speech synthesis has increased the potential for audio tampering. In this paper, we focus on digital watermarking as a promising method to safeguard the authenticity of audio evidence. Because the original data carry probative importance and their integrity must be preserved, the algorithm requires reversibility, imperceptibility, and reliability. To meet these requirements, we propose a reversible digital watermarking approach that transforms the data from the time domain into the frequency domain and embeds feature data, concentrating them in high-frequency intDCT coefficients. We explore appropriate hiding locations against spectrum-based attacks using newly proposed spectral expansion methodologies for embedding. The drawback of fixed expansion is that the stego signal is prone to detection by spectral analysis. Therefore, this paper proposes two further expansion methodologies that embed the data into variable locations (random expansion and adaptive expansion with distortion estimation), which effectively conceal the watermark’s presence while maintaining high perceptual quality, with an average segSNR better than 21.363 dB and an average MOS better than 4.085. Our experimental results demonstrate the efficacy of the proposed method in terms of both sound quality preservation and a log-likelihood value, indicating the absolute discontinuity of the spectrogram after embedding, which is proposed to evaluate the effectiveness of the reversible spectral expansion watermarking algorithm. The EER results indicate that adaptive hiding performed best against attacks by spectral analysis. |
format | Article |
id | doaj-art-43d4e336cf594980be2da578e0c9d535 |
institution | Kabale University |
issn | 2076-3417 |
language | English |
publishDate | 2025-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj-art-43d4e336cf594980be2da578e0c9d535; 2025-01-10T13:15:21Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-01-01; 15; 1; 381; 10.3390/app15010381; Reversible Spectral Speech Watermarking with Variable Embedding Locations Against Spectrum-Based Attacks; Xuping Huang (0); Akinori Ito (1); Department of Communications Engineering, Graduate School of Engineering, Tohoku University, Sendai 980-8577, Japan; Department of Communications Engineering, Graduate School of Engineering, Tohoku University, Sendai 980-8577, Japan; Guaranteeing the reliability and integrity of audio data has become an essential topic with the fast development of generative AI. Significant progress in machine learning and speech synthesis has increased the potential for audio tampering. In this paper, we focus on digital watermarking as a promising method to safeguard the authenticity of audio evidence. Because the original data carry probative importance and their integrity must be preserved, the algorithm requires reversibility, imperceptibility, and reliability. To meet these requirements, we propose a reversible digital watermarking approach that transforms the data from the time domain into the frequency domain and embeds feature data, concentrating them in high-frequency intDCT coefficients. We explore appropriate hiding locations against spectrum-based attacks using newly proposed spectral expansion methodologies for embedding. The drawback of fixed expansion is that the stego signal is prone to detection by spectral analysis. Therefore, this paper proposes two further expansion methodologies that embed the data into variable locations (random expansion and adaptive expansion with distortion estimation), which effectively conceal the watermark’s presence while maintaining high perceptual quality, with an average segSNR better than 21.363 dB and an average MOS better than 4.085. Our experimental results demonstrate the efficacy of the proposed method in terms of both sound quality preservation and a log-likelihood value, indicating the absolute discontinuity of the spectrogram after embedding, which is proposed to evaluate the effectiveness of the reversible spectral expansion watermarking algorithm. The EER results indicate that adaptive hiding performed best against attacks by spectral analysis.; https://www.mdpi.com/2076-3417/15/1/381; audio watermarking; modified integer DCT coefficient expansion; reversibility and imperceptibility; likelihood by spectral analysis |
spellingShingle | Xuping Huang; Akinori Ito; Reversible Spectral Speech Watermarking with Variable Embedding Locations Against Spectrum-Based Attacks; Applied Sciences; audio watermarking; modified integer DCT coefficient expansion; reversibility and imperceptibility; likelihood by spectral analysis |
title | Reversible Spectral Speech Watermarking with Variable Embedding Locations Against Spectrum-Based Attacks |
title_full | Reversible Spectral Speech Watermarking with Variable Embedding Locations Against Spectrum-Based Attacks |
title_fullStr | Reversible Spectral Speech Watermarking with Variable Embedding Locations Against Spectrum-Based Attacks |
title_full_unstemmed | Reversible Spectral Speech Watermarking with Variable Embedding Locations Against Spectrum-Based Attacks |
title_short | Reversible Spectral Speech Watermarking with Variable Embedding Locations Against Spectrum-Based Attacks |
title_sort | reversible spectral speech watermarking with variable embedding locations against spectrum based attacks |
topic | audio watermarking; modified integer DCT coefficient expansion; reversibility and imperceptibility; likelihood by spectral analysis |
url | https://www.mdpi.com/2076-3417/15/1/381 |
work_keys_str_mv | AT xupinghuang reversiblespectralspeechwatermarkingwithvariableembeddinglocationsagainstspectrumbasedattacks AT akinoriito reversiblespectralspeechwatermarkingwithvariableembeddinglocationsagainstspectrumbasedattacks |
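
The description above reports perceptual quality as an average segSNR in dB. As a rough illustration of that metric, the sketch below computes a segmental SNR between an original and a stego signal. It is not taken from the article: the function name `seg_snr`, the frame length of 256 samples, and the choice to skip frames with zero signal or zero distortion are all assumptions made for this example, and the authors' exact definition may differ.

```python
import math


def seg_snr(original, stego, frame_len=256):
    """Mean per-frame SNR in dB between two equal-length sample sequences.

    Hypothetical helper for illustration; frame_len=256 is an assumed value.
    """
    if len(original) != len(stego):
        raise ValueError("signals must have the same length")
    snrs = []
    # Walk over non-overlapping full frames; a trailing partial frame is ignored.
    for start in range(0, len(original) - frame_len + 1, frame_len):
        ref = original[start:start + frame_len]
        err = [a - b for a, b in zip(ref, stego[start:start + frame_len])]
        signal_energy = sum(a * a for a in ref)
        noise_energy = sum(e * e for e in err)
        # Assumption: skip silent or unmodified frames, where the ratio is undefined.
        if signal_energy == 0 or noise_energy == 0:
            continue
        snrs.append(10.0 * math.log10(signal_energy / noise_energy))
    if not snrs:
        raise ValueError("no frame with both signal and distortion")
    return sum(snrs) / len(snrs)
```

For instance, calling `seg_snr(original, stego)` on two lists of integer PCM samples returns a single value in dB, where a larger value means the embedding distorted the signal less.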