SpeakerMatchQ

SpeakerMatchQ[audio,ref]

gives True if the speaker features in audio match those of the reference ref, and False otherwise.

SpeakerMatchQ[{audio1,audio2,…},ref]

gives a list of results for each of the audioi.

SpeakerMatchQ[ref]

represents an operator form of SpeakerMatchQ that can be applied to an audio object.

Details and Options

  • SpeakerMatchQ computes speaker features for audio and the reference ref, and returns True if the distance between the speaker features is acceptable.
  • The reference ref can be any of the following:
  • ref	a single reference Audio object
    ref1|ref2|…	several possible references, tried in order
  • The following options can be given:
  • AcceptanceThreshold	0.5	minimum probability to consider acceptable
    Masking	All	interval of interest
    RecognitionPrior	0.5	prior probability for a True result
    TargetDevice	"CPU"	the target device on which to compute
  • Use the Masking option to specify the interval of interest in any of the audioi. Possible settings include:
  • All	uses the whole audio
    {t1,t2}	uses the interval t1 to t2
    {{t11,t12},{t21,t22},…}	uses the interval ti1 to ti2 from audioi
  • SpeakerMatchQ uses machine learning. Its methods, training sets and biases included therein may change and yield varied results in different versions of the Wolfram Language.
  • SpeakerMatchQ may download resources that will be stored in your local object store at $LocalBase, and that can be listed using LocalObjects[] and removed using ResourceRemove.

Examples


Basic Examples  (2)

Check whether two recordings belong to the same speaker:
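A possible illustration; audio1 and audio2 here are assumed Audio objects containing speech (replace them with your own recordings):

```wolfram
(* two assumed recordings of speech *)
audio1 = Import["speaker-take1.wav"];
audio2 = Import["speaker-take2.wav"];

(* True if both recordings appear to come from the same speaker *)
SpeakerMatchQ[audio2, audio1]
```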

Compare the speaker in a recording and a time-stretched version of it:
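One way to sketch this, assuming audio is an Audio object containing speech:

```wolfram
(* slow the recording down by a factor of 1.5 *)
stretched = AudioTimeStretch[audio, 1.5];

(* the speaker should still be recognized as the same *)
SpeakerMatchQ[stretched, audio]
```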

Scope  (3)

Test whether the speaker in a recording matches any of several references:
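A sketch using the Alternatives form of the reference; ref1, ref2 and ref3 are assumed reference recordings of different speakers:

```wolfram
(* True if audio matches any of the references, tried in order *)
SpeakerMatchQ[audio, ref1 | ref2 | ref3]
```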

Test whether any of the speakers from a list of recordings matches a reference:
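A sketch with a list of assumed input recordings compared against a single assumed reference:

```wolfram
(* one True/False result is returned per input recording *)
SpeakerMatchQ[{audio1, audio2, audio3}, ref]
```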

Use SpeakerMatchQ in operator form:
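A sketch of the operator form, with ref and the audioi as assumed recordings:

```wolfram
(* SpeakerMatchQ[ref] is an operator that can be applied to audio *)
matchesRef = SpeakerMatchQ[ref];
matchesRef[audio1]   (* equivalent to SpeakerMatchQ[audio1, ref] *)

(* map the operator over several recordings *)
SpeakerMatchQ[ref] /@ {audio1, audio2}
```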

Options  (4)

AcceptanceThreshold  (1)

By default, 0.5 is used as the acceptance threshold:

Specify the minimum probability to consider acceptable:
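These two settings can be sketched as follows, with audio and ref as assumed recordings:

```wolfram
(* default: a match probability of at least 0.5 is accepted *)
SpeakerMatchQ[audio, ref]

(* require a higher probability before accepting a match *)
SpeakerMatchQ[audio, ref, AcceptanceThreshold -> 0.9]
```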

Masking  (2)

By default, the whole audio recording is compared, which may fail if it contains multiple speakers:

Specify an interval of interest within the recording to compare against the reference:
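A sketch of interval masking, assuming audio contains the speaker of interest between 1 and 3 seconds:

```wolfram
(* compare only the segment from 1 to 3 seconds against the reference *)
SpeakerMatchQ[audio, ref, Masking -> {1, 3}]
```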

Apply separate masking to each input audio in a list of recordings:
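A sketch of per-input masking, with assumed recordings and intervals; the i-th interval applies to the i-th input:

```wolfram
(* use 0-2 s of audio1 and 1-3 s of audio2 *)
SpeakerMatchQ[{audio1, audio2}, ref, Masking -> {{0, 2}, {1, 3}}]
```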

RecognitionPrior  (1)

Specify the prior probability that the speaker in a recording matches a reference:

Use a higher prior probability:
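A sketch of both settings, with audio and ref as assumed recordings:

```wolfram
(* assume a match is a priori unlikely *)
SpeakerMatchQ[audio, ref, RecognitionPrior -> 0.1]

(* assume a match is a priori likely *)
SpeakerMatchQ[audio, ref, RecognitionPrior -> 0.9]
```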

Applications  (3)

Compare the speaker in a recording and a time-stretched version of it:

Compare the speaker in a recording and a pitch-shifted version of it:
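One way to sketch this, assuming audio is an Audio object containing speech:

```wolfram
(* shift the pitch up by three semitones *)
shifted = AudioPitchShift[audio, 2^(3/12)];

(* compare the shifted version against the original *)
SpeakerMatchQ[shifted, audio]
```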

In the Spoken Digit Command dataset, construct a speaker-match matrix for a subset of recordings:

Select 10 random speakers for which the dataset has between 2 and 5 samples:

Extract all recordings corresponding to these speakers and sort them by speaker ID:

Compute and plot the matrix of matching speakers:
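The steps above can be sketched as follows; the resource name and the "SpeakerID" and "Audio" property names are assumptions about the dataset's layout:

```wolfram
data = ResourceData["Spoken Digit Commands"];

(* pick 10 random speakers with between 2 and 5 samples *)
counts = Counts[Normal[data[All, "SpeakerID"]]];
speakers = RandomSample[Keys[Select[counts, 2 <= # <= 5 &]], 10];

(* extract their recordings, sorted by speaker ID *)
subset = SortBy[Normal[data[Select[MemberQ[speakers, #SpeakerID] &]]], #SpeakerID &];
recordings = subset[[All, "Audio"]];

(* pairwise speaker-match matrix; matching speakers cluster on the diagonal *)
matrix = Outer[Boole[SpeakerMatchQ[#1, #2]] &, recordings, recordings, 1];
MatrixPlot[matrix]
```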

Properties & Relations  (1)

SpeakerMatchQ computes speaker features on its input recordings and compares these embeddings.

From the Spoken Digit Command dataset, extract recordings from speakers who only have between 2 and 5 recordings:

Compute speaker features on each recording:

Visualize a sample of the computed features:

Compare the speaker features and plot a distance matrix on them:

Compute a binary distance matrix showing whether the speaker features match:
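A sketch of the comparison, assuming features is a list of speaker-feature vectors (one per recording) and using cosine distance with an assumed cutoff of 0.25:

```wolfram
(* distance matrix between the feature vectors *)
d = DistanceMatrix[features, DistanceFunction -> CosineDistance];
MatrixPlot[d]

(* threshold the distances to get a binary match matrix *)
MatrixPlot[Map[Boole[# < 0.25] &, d, {2}]]
```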

Compare with the result of SpeakerMatchQ; the differences arise because no voice is detected in some of the recordings:

Possible Issues  (1)

SpeakerMatchQ finds voiced intervals first and fails if no voice is detected in any one of the inputs:
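A sketch of this failure mode; ref is an assumed speech recording, while a pure tone contains no voice:

```wolfram
(* a 2-second 440 Hz sine tone has no voiced intervals *)
tone = AudioGenerator[{"Sine", 440}, 2];

(* the comparison cannot proceed without detected voice *)
SpeakerMatchQ[tone, ref]
```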

Introduced in 2020 (12.1)