How do I get the current volume level (amplitude) of playing audio (not the set volume but how loud it is)
There doesn't seem to be an obvious function to get this value. I really just want a simple float value that I can use to drive a mouth blend shape, so I can approximate talking while a character speaks.
I know it might not look perfect, but at least the louder the speech, the more the mouth will open. With most speech patterns the volume goes up and down, so this will approximate the mouth opening and closing.
I second this question. There has to be a way, but the only things I found are the function AudioSource.GetOutputData, which no longer exists, and the function AudioClip.GetData, which delivers what I seek but is too costly on the CPU to be called at regular intervals. A single call already brings the frame rate down noticeably, and I would need it for a bunch of objects that all update at the same time.
/Edit: Ok, scratch that, I had another issue. My code works now, and I will write up an answer accordingly.
Answer by Chris_Entropy · Apr 29, 2016 at 06:30 PM
You can use AudioClip.GetData to get what you need. Just write a script like this:
using UnityEngine;

public class AudioSourceLoudnessTester : MonoBehaviour {

    public AudioSource audioSource;
    public float updateStep = 0.1f;
    public int sampleDataLength = 1024;

    private float currentUpdateTime = 0f;

    private float clipLoudness;
    private float[] clipSampleData;

    // Use this for initialization
    void Awake () {
        if (!audioSource) {
            Debug.LogError(GetType() + ".Awake: there was no audioSource set.");
        }
        clipSampleData = new float[sampleDataLength];
    }

    // Update is called once per frame
    void Update () {
        currentUpdateTime += Time.deltaTime;
        if (currentUpdateTime >= updateStep) {
            currentUpdateTime = 0f;
            // I read 1024 samples, which is about 80 ms on a 44khz stereo clip,
            // beginning at the current sample position of the clip.
            audioSource.clip.GetData(clipSampleData, audioSource.timeSamples);
            clipLoudness = 0f;
            foreach (var sample in clipSampleData) {
                clipLoudness += Mathf.Abs(sample);
            }
            clipLoudness /= sampleDataLength; // clipLoudness is what you are looking for
        }
    }
}
I update the loudness value only once every 100 ms to keep the CPU usage down. Also make sure to declare the float[] for the sample data as a field of the class rather than inside Update, since allocating a large array and discarding it again on every update step creates needless work for the garbage collector.
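To tie this back to the original question (driving a mouth blend shape), here is a minimal sketch of how the loudness value could be turned into a blend shape weight. It is not part of the answer above: the class name MouthMapping, both method names, and the maxLoudness tuning value are hypothetical, and the sketch assumes blend shape weights run from 0 to 100.

```csharp
using System;

// Hypothetical helper, not part of the answer above.
public static class MouthMapping
{
    // Converts an average-absolute-amplitude loudness into a blend shape weight.
    // Assumption: weights run from 0 to 100, and maxLoudness is a hand-tuned
    // loudness at which the mouth should be fully open.
    public static float LoudnessToWeight(float loudness, float maxLoudness)
    {
        if (maxLoudness <= 0f) return 0f;
        float normalized = Math.Min(1f, Math.Max(0f, loudness / maxLoudness));
        return normalized * 100f;
    }

    // Moves the current weight a fraction of the way towards the target,
    // so the mouth does not jump between update steps.
    public static float Smooth(float current, float target, float factor)
    {
        float t = Math.Min(1f, Math.Max(0f, factor));
        return current + (target - current) * t;
    }
}
```

In a MonoBehaviour, the resulting weight would presumably be handed to SkinnedMeshRenderer.SetBlendShapeWeight with the index of the mouth blend shape. Since clipLoudness only changes every updateStep, smoothing the weight each frame may look less jumpy than setting it directly.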
Thanks so much for this answer, Chris_Entropy. I'll add a couple of notes:
If you don't have a specific AudioClip associated with the AudioSource, you can get audio data for whatever combination of clips are playing from AudioSource.GetOutputData(). (https://docs.unity3d.com/ScriptReference/AudioSource.GetOutputData.html)
For AudioSource.GetOutputData(), the returned array appears to be the most recently played audio data, not the upcoming audio data or a mix of the two. So you'll be using slightly stale information.
If you want audio data that's centered on now, you might be able to get it by keeping track of what clips were triggered and when, and then looking at the content of those clips (ignoring the AudioSource object).
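Whichever call fills the buffer, the loudness calculation itself stays the same. As a rough sketch, the helper below computes the RMS (root mean square) of a sample buffer, which is a common alternative to the average of absolute values used in the answer above, not the method from that answer. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class OutputLoudness
{
    // Returns the root mean square of the samples, or 0 for an empty buffer.
    // Squaring makes negative and positive samples count equally.
    public static float Rms(float[] samples)
    {
        if (samples == null || samples.Length == 0) return 0f;
        double sumOfSquares = 0.0;
        foreach (float s in samples)
        {
            sumOfSquares += (double)s * s;
        }
        return (float)Math.Sqrt(sumOfSquares / samples.Length);
    }
}
```

In Unity, the buffer would presumably be a float[] field filled with audioSource.GetOutputData(buffer, 0), where 0 is the channel index, and then passed to Rms.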
Answer by gegagome · Mar 02, 2018 at 05:57 PM
Thanks
Do you know what happens if you have a sample size of 4 or 8 rather than 1024? Do the values match on average?
As you can see in the foreach loop of my code, I average all my samples. A larger sample size of course means that I get a smoother, more representative average. With a sample size of 4 or 8, the averages might fluctuate too much to be of actual use. As I said above, a sample size of 1024 equals about 80 ms on a 44 kHz clip. A sample size of 8 would hardly be one millisecond.
I am trying to detect silence one second ahead, and my implementation (changed to 1024 from 8) now looks like this:
float[] aryOfSamples = new float[1024];
_currentSR = _stems[0].clip.frequency;
as.clip.GetData(aryOfSamples, as.timeSamples + (_currentSR * 2));
I am already averaging the values, but my questions are:
- Why would there be negative values in the array?
- Why are you doing Mathf.Abs? Won't that affect the accuracy of the average value?
Any ideas?
Thanks for your super quick response.
It's an audio oscillation amplitude, so naturally there will be negative values as well. What matters is the magnitude of the amplitude, whether it is negative or positive. If I took the plain average over the whole sample data, I would in most cases get a value near 0. Hence the Abs. Only sample values at or near 0 are actually silent. So if you are checking for silence, the Math.Abs of each value must stay under some small threshold near 0. If you want to peek a full second ahead, you would need a sample size well over 10,000 (depending on the frequency of your clip). The clip frequency is actually the number of samples you need for a full second, since it gives the number of data points, i.e. samples, per second.
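The silence check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the method name, and the threshold parameter are hypothetical, and a sensible threshold has to be tuned per clip.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class SilenceCheck
{
    // Returns true if every sample's absolute value is below the threshold.
    // Comparing absolute values avoids positive and negative samples
    // cancelling each other out, which a plain average would allow.
    public static bool IsSilent(float[] samples, float threshold)
    {
        foreach (float s in samples)
        {
            if (Math.Abs(s) >= threshold) return false;
        }
        return true;
    }
}
```

For example, a buffer filled by AudioClip.GetData could be passed in with a small threshold such as 0.001f. That value is a guess and would need adjusting to the noise floor of the recording.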
float[] aryOfSamples = new float[1024];
_currentSR = _stems[0].clip.frequency;
as.clip.GetData(aryOfSamples, as.timeSamples + (_currentSR * 1));
I am already checking a full second's worth of samples ahead (_currentSR) and it works. The problem is that in Audacity I see a flat line all the way up to the 17-second marker, but after averaging I get values of:
0.04 on second 13
0.09 on second 14
0.0000222 on second 15
0.11 on second 16
I get 90% accurate silence using GetData, except for some points like these.
I'm not sure why this would happen since, as I said, Audacity shows a flat line even when I zoom in.
Thanks for your help
The above problem was caused by a bug in my array implementation. After fixing it I can detect future silences.
I think I've got it now. If aryOfSamples were of size 44100, like this: float[] aryOfSamples = new float[44100]; it would effectively store every sample in a one-second section of the song, assuming the song's frequency is 44100.
So when you use an array of 1024, you are storing 23 ms worth of samples, am I correct? I know you said 80 ms, but using Audacity I selected 1024 samples on a 44.1 kHz file and ended up selecting 0.023 seconds.
Thanks
Yes, you are correct. I don't know why I calculated 80ms the first time.
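The arithmetic behind the 23 ms figure can be sketched as below. The class and method names are hypothetical, and the sketch assumes a single channel, so that one array element corresponds to one sample in time.

```csharp
using System;

// Hypothetical helper for illustration only; assumes mono audio.
public static class SampleMath
{
    // Duration in seconds covered by sampleCount samples at the given frequency (Hz).
    public static double SamplesToSeconds(int sampleCount, int frequency)
    {
        return (double)sampleCount / frequency;
    }

    // Number of samples needed to cover the given duration at the given frequency (Hz).
    public static int SecondsToSamples(double seconds, int frequency)
    {
        return (int)Math.Round(seconds * frequency);
    }
}
```

With these, SamplesToSeconds(1024, 44100) comes out at roughly 0.0232 seconds, matching the Audacity selection, and SecondsToSamples(1.0, 44100) gives the 44100 samples needed for one second.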
Answer by dlstilts · May 15, 2018 at 02:31 PM
Hi, I am having trouble implementing your code with an audio source that has no clip attached, using AudioSource.GetOutputData(). I keep getting errors. The audio is coming from a live source and I need to monitor its volume. @Chris_Entropy Thanks!
You haven't specified what errors you're getting, but if you're trying to use an audio source with no clip, you might try adding the following to the top of your Update function:
    if (audioSource.clip == null)
    {
        return;
    }
which should keep anything from happening until you've actually loaded a clip at runtime. I had the same issue using a file browser to select music, not any particular clip. Hope it helps!