Identification of Weakly Pitch-Shifted Voice Based on Convolutional Neural Network

Pitch shifting is a common voice editing technique in which the original pitch of a digital voice is raised or lowered. It can be abused by malicious attackers to conceal their true identity. Existing forensic detection methods are no longer effective for weakly pitch-shifted voice. In...

Bibliographic Details
Main Authors: Yongchao Ye, Lingjie Lao, Diqun Yan, Rangding Wang
Format: Article
Language: English
Published: Wiley 2020-01-01
Series: International Journal of Digital Multimedia Broadcasting
Online Access: http://dx.doi.org/10.1155/2020/8927031
collection DOAJ
description Pitch shifting is a common voice editing technique in which the original pitch of a digital voice is raised or lowered. It can be abused by malicious attackers to conceal their true identity. Existing forensic detection methods are no longer effective for weakly pitch-shifted voice. In this paper, we propose a convolutional neural network (CNN) that detects not only strongly pitch-shifted voice but also weakly pitch-shifted voice, for which the shifting factor is less than 4 semitones in either direction. Specifically, linear frequency cepstral coefficients (LFCC) are computed from power spectra, and their dynamic coefficients are extracted as the discriminative features. The CNN model is carefully designed, with particular attention to the input feature map, the activation function, and the network topology. We evaluated the algorithm on voices from two datasets processed with three pitch-shifting software tools. The results show that the algorithm achieves high detection rates for both binary and multiclass classification.
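The feature pipeline named in the description (LFCC from a power spectrum, followed by dynamic coefficients) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the frame length, the number of filters and cepstral coefficients, and the delta window are all assumptions chosen for illustration, since the abstract does not state them. The helper `semitones_to_ratio` uses the standard equal-temperament relation, in which 12 semitones correspond to a doubling of frequency.

```python
import cmath
import math


def semitones_to_ratio(semitones):
    """Frequency ratio for a pitch shift given in semitones (12 = one octave)."""
    return 2.0 ** (semitones / 12.0)


def power_spectrum(frame):
    """One-sided power spectrum of a real frame, using a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def linear_filterbank(num_filters, num_bins):
    """Triangular filters whose centres are spaced linearly over the bins."""
    edges = [i * (num_bins - 1) / (num_filters + 1)
             for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for b in range(num_bins):
            if lo <= b <= mid:
                row.append((b - lo) / (mid - lo))
            elif mid < b <= hi:
                row.append((hi - b) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def lfcc(frame, num_filters=8, num_ceps=4):
    """LFCC of one frame: power spectrum -> linear filterbank -> log -> DCT-II.

    num_filters and num_ceps are illustrative values, not taken from the paper.
    """
    spectrum = power_spectrum(frame)
    bank = linear_filterbank(num_filters, len(spectrum))
    log_energy = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
                  for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energy))
            for c in range(num_ceps)]


def deltas(frames, width=2):
    """Dynamic (delta) coefficients by linear regression over +/- width frames.

    Frames beyond either end are replaced by the nearest edge frame.
    """
    denom = 2 * sum(i * i for i in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for d in range(len(frames[0])):
            num = sum(i * (frames[min(last, t + i)][d] - frames[max(0, t - i)][d])
                      for i in range(1, width + 1))
            row.append(num / denom)
        out.append(row)
    return out


# Usage: five 64-sample frames of a synthetic tone whose frequency drifts upward.
signal_frames = [[math.sin(2 * math.pi * (4 + f) * t / 64) for t in range(64)]
                 for f in range(5)]
static = [lfcc(frame) for frame in signal_frames]
dynamic = deltas(static)
features = [s + d for s, d in zip(static, dynamic)]  # static + delta per frame
```

Under these assumptions, each frame yields a vector of static and delta coefficients; stacking the vectors over time gives a two-dimensional map of the kind a CNN could take as input. For scale, a shift of 4 semitones corresponds to a frequency ratio of about 1.26.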
id doaj-art-7fda3998acb147fc84c9008713a0f81c
institution Kabale University
issn 1687-7578
1687-7586
spelling Identification of Weakly Pitch-Shifted Voice Based on Convolutional Neural Network. Yongchao Ye, Lingjie Lao, Diqun Yan, Rangding Wang (all four: Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China). International Journal of Digital Multimedia Broadcasting, Wiley, 2020-01-01. ISSN 1687-7578, 1687-7586. http://dx.doi.org/10.1155/2020/8927031