GetOutputData and GetSpectrumData - what do the returned values represent?
Hello forum people
Excuse my English. My question is this: I am using GetOutputData and GetSpectrumData on an AudioSource. From what I've read, they represent the amplitude and frequency of my AudioSource, but my problem is knowing how to convert the returned values to decibels (GetOutputData) or Hertz (GetSpectrumData). If someone can give me a hand with this, I'd be very grateful. Thank you very much.
Answer by aldonaletto · Aug 21, 2011 at 05:39 AM
Getting the audio volume in dB is the easiest part: allocate a float array of suitable size and pass it to GetOutputData, then sum the squared sample values, take the average and get its square root; that's the RMS value, which you can convert to dB with a log operation. On a PC, sound is usually sampled at 48000 Hz (the actual sampling rate can be read from AudioSettings.outputSampleRate); if you set the array size to 1024, for instance, each call gives you the average sound level of the last 21.3 ms (1024/48000). dB values are calculated as 20 * Log10(rmsValue / refValue) - adjust refValue if you need a different 0 dB reference.
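Outside Unity, the same RMS-to-dB math can be sketched in plain Python (the sample buffer here is a made-up stand-in for what GetOutputData would fill in):

```python
import math

def rms_to_db(samples, ref_value=0.1):
    # RMS = square root of the mean of the squared samples
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # dB relative to ref_value: 0 dB when rms == ref_value
    db = 20 * math.log10(rms / ref_value) if rms > 0 else -160.0
    return rms, max(db, -160.0)  # clamp to a -160 dB floor, as in the script

# A buffer filled entirely with the reference level reads 0 dB:
rms, db = rms_to_db([0.1] * 1024)  # rms ≈ 0.1, db ≈ 0.0
```

A buffer of silence (all zeros) hits the -160 dB clamp instead of producing a math error.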
Extracting the sound frequency is more complicated. You must allocate a float array with a power-of-two size and pass it to GetSpectrumData. The size (let's call it Q) defines the frequency resolution: each element shows the relative amplitude (0..1) of the frequency N * 24000 / Q Hertz (where N is the element index, and 24000 is half the PC sampling rate - see AudioSettings.outputSampleRate for other machines). With a 1024-element array, for instance, you will have a resolution of 23.4 Hz (each element refers to a frequency 23.4 Hz higher than the previous one). This array covers all frequencies in the interval 0 - 24000 Hz. To find the dominant frequency, locate the maximum-amplitude element and multiply its index by the frequency resolution. Since the actual frequency may fall between two elements, some kind of interpolation can give a more precise result. This works for simple sounds like sinusoids, square waves, triangle waves, etc., but may fail for more complex real-world sounds like guitar notes or voice, because sometimes the harmonics are higher than the fundamental (harmonics are integral multiples of the fundamental frequency). And it will surely fail if you try to get the pitch of a CD song, because it's composed of several instruments and voices sounding together.
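The bin-to-frequency mapping, the peak search, and the neighbour-based interpolation used below can be sketched in plain Python (assuming a 48 kHz output rate and Q = 1024, as in the answer; the spectrum array here is hand-made for illustration):

```python
def bin_to_hz(n, q=1024, sample_rate=48000):
    # Each of the q spectrum bins spans (sample_rate / 2) / q Hz
    return n * (sample_rate / 2) / q

def dominant_bin(spectrum, threshold=0.02):
    # Index of the largest bin above the threshold (0 if none qualifies)
    max_n, max_v = 0, 0.0
    for i, v in enumerate(spectrum):
        if v > max_v and v > threshold:
            max_n, max_v = i, v
    return max_n

def interpolated_bin(spectrum, max_n):
    # Refine the peak index using the ratio of the neighbouring bins,
    # mirroring the interpolation step in AnalyzeSound()
    if 0 < max_n < len(spectrum) - 1:
        dl = spectrum[max_n - 1] / spectrum[max_n]
        dr = spectrum[max_n + 1] / spectrum[max_n]
        return max_n + 0.5 * (dr * dr - dl * dl)
    return float(max_n)

resolution = bin_to_hz(1)  # ≈ 23.4 Hz per bin, as stated above
```

With symmetric neighbours around the peak, the interpolation leaves the index unchanged; an asymmetric pair shifts it toward the stronger side.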
The function AnalyzeSound below measures the volume of the sound currently being played, in RMS and dB, and finds its pitch in Hertz. Create a GUIText and drag it to the variable display to show these values. Unity generates the audio spectrum incredibly fast, so this whole script consumes less than 1 ms per Update on my machine (a 1.8 GHz Intel Core Duo PC). If you want to try this script with a reference sound, download this - it's a test sound that goes from 200 Hz to 1000 Hz in 100 Hz steps, one step per second; it also reduces the volume by 3 dB each step, then repeats the sequence with a different waveform (square, triangle and sine waves are used).
var qSamples: int = 1024;  // array size
var refValue: float = 0.1; // RMS value for 0 dB
var threshold = 0.02;      // minimum amplitude to extract pitch
var rmsValue: float;       // sound level - RMS
var dbValue: float;        // sound level - dB
var pitchValue: float;     // sound pitch - Hz
private var samples: float[];  // audio samples
private var spectrum: float[]; // audio spectrum
private var fSample: float;

function Start () {
    samples = new float[qSamples];
    spectrum = new float[qSamples];
    fSample = AudioSettings.outputSampleRate;
}

function AnalyzeSound(){
    audio.GetOutputData(samples, 0); // fill array with samples
    var i: int;
    var sum: float = 0;
    for (i = 0; i < qSamples; i++){
        sum += samples[i] * samples[i]; // sum squared samples
    }
    rmsValue = Mathf.Sqrt(sum / qSamples); // rms = square root of average
    dbValue = 20 * Mathf.Log10(rmsValue / refValue); // calculate dB
    if (dbValue < -160) dbValue = -160; // clamp it to -160dB min
    // get sound spectrum
    audio.GetSpectrumData(spectrum, 0, FFTWindow.BlackmanHarris);
    var maxV: float = 0;
    var maxN: int = 0;
    for (i = 0; i < qSamples; i++){ // find max
        if (spectrum[i] > maxV && spectrum[i] > threshold){
            maxV = spectrum[i];
            maxN = i; // maxN is the index of max
        }
    }
    var freqN: float = maxN; // pass the index to a float variable
    if (maxN > 0 && maxN < qSamples - 1){ // interpolate index using neighbours
        var dL = spectrum[maxN - 1] / spectrum[maxN];
        var dR = spectrum[maxN + 1] / spectrum[maxN];
        freqN += 0.5 * (dR * dR - dL * dL);
    }
    pitchValue = freqN * (fSample / 2) / qSamples; // convert index to frequency
}

var display: GUIText; // drag a GUIText here to show results

function Update () {
    if (Input.GetKeyDown("p")){
        audio.Play();
    }
    AnalyzeSound();
    if (display){
        display.text = "RMS: " + rmsValue.ToString("F2") +
            " (" + dbValue.ToString("F1") + " dB)\n" +
            "Pitch: " + pitchValue.ToString("F0") + " Hz";
    }
}
This looks like it could be very useful, but the posted code seems to have been scrambled somehow (at a guess, because of some issue with the posting/quoting). Have a look at the for loops, for instance.
Thanks @boat365! I had not noticed that. UA has the bad habit of eating < and > characters - it "thinks" they form tags and swallows them and the text in between. We must remember to always surround < and > with spaces, or else parts of the code just vanish!
You are a genius Aldo Naletto, thank you so much, this is very complete information with a great code included. Thanks!
Maybe the sample rate isn't 48 kHz in your case. Change the pitchValue line to this:
pitchValue = freqN*AudioSettings.outputSampleRate/qSamples; // convert index to frequency
Answer by AntonQvarfordt · Dec 17, 2015 at 04:53 PM
Converted to C#:
using UnityEngine;

public class AudioMeasureCS : MonoBehaviour
{
    public float RmsValue;
    public float DbValue;
    public float PitchValue;

    private const int QSamples = 1024;
    private const float RefValue = 0.1f;
    private const float Threshold = 0.02f;

    private float[] _samples;
    private float[] _spectrum;
    private float _fSample;

    void Start()
    {
        _samples = new float[QSamples];
        _spectrum = new float[QSamples];
        _fSample = AudioSettings.outputSampleRate;
    }

    void Update()
    {
        AnalyzeSound();
    }

    void AnalyzeSound()
    {
        GetComponent<AudioSource>().GetOutputData(_samples, 0); // fill array with samples
        int i;
        float sum = 0;
        for (i = 0; i < QSamples; i++)
        {
            sum += _samples[i] * _samples[i]; // sum squared samples
        }
        RmsValue = Mathf.Sqrt(sum / QSamples); // rms = square root of average
        DbValue = 20 * Mathf.Log10(RmsValue / RefValue); // calculate dB
        if (DbValue < -160) DbValue = -160; // clamp it to -160dB min
        // get sound spectrum
        GetComponent<AudioSource>().GetSpectrumData(_spectrum, 0, FFTWindow.BlackmanHarris);
        float maxV = 0;
        var maxN = 0;
        for (i = 0; i < QSamples; i++)
        { // find max
            if (!(_spectrum[i] > maxV) || !(_spectrum[i] > Threshold))
                continue;
            maxV = _spectrum[i];
            maxN = i; // maxN is the index of max
        }
        float freqN = maxN; // pass the index to a float variable
        if (maxN > 0 && maxN < QSamples - 1)
        { // interpolate index using neighbours
            var dL = _spectrum[maxN - 1] / _spectrum[maxN];
            var dR = _spectrum[maxN + 1] / _spectrum[maxN];
            freqN += 0.5f * (dR * dR - dL * dL);
        }
        PitchValue = freqN * (_fSample / 2) / QSamples; // convert index to frequency
    }
}
Iggyglass: That's an allowed construct in C#, implicit typing.
Hi @aldonaletto
Why am I getting negative decibels using the above code? The quieter the sound, the more negative the value...
And why is RefValue 0.1f?
Any idea? Thanks in advance...
Negative dB is just the relative dB scale, which is used wherever an absolute value (one representing an actual, real-world sound pressure level) cannot be given or doesn't make sense. For example, in a sound file you just define 0 dB as the maximum/normalized volume, so positive dB values usually mean distorted sound (they can no longer be correctly represented by the data), and negative values mean anything softer than the maximum volume. You will also often find these negative dB scales on your HiFi stereo, etc.
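A quick numeric illustration of that relative scale in plain Python (the full-scale reference of 1.0 is an assumption for the example, matching a normalized sound file):

```python
import math

def relative_db(amplitude, full_scale=1.0):
    # dB relative to full scale: 0 dB at the maximum, negative below it
    return 20 * math.log10(amplitude / full_scale)

full = relative_db(1.0)   # 0 dB at the maximum volume
half = relative_db(0.5)   # ≈ -6.02 dB: halving amplitude loses about 6 dB
tenth = relative_db(0.1)  # -20 dB
```

This is why softer sounds give you more negative numbers: the value measures how far below the 0 dB reference (RefValue in the script above) the signal is.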
Answer by italoha · Oct 22, 2016 at 10:36 PM
Found this at kaappine.fi
public float sensitivity = 100;
public float loudness = 0;
private AudioSource _audio;

void Awake()
{
    _audio = GetComponent<AudioSource>();
}

void Start()
{
    _audio.clip = Microphone.Start(null, true, 10, 44100);
    _audio.loop = true;
    _audio.mute = false;
    while (!(Microphone.GetPosition(null) > 0)) { } // wait until the mic starts recording
    _audio.Play();
}

void Update()
{
    loudness = GetAveragedVolume() * sensitivity;
    if (loudness > 1)
    {
        //DO SOMETHING
    }
}

float GetAveragedVolume()
{
    float[] data = new float[256];
    float a = 0;
    _audio.GetOutputData(data, 0);
    foreach (float s in data)
    {
        a += Mathf.Abs(s);
    }
    return a / 256;
}
I know it's quite an old thread, but here is the explanation of this code:
http://www.kaappine.fi/tutorials/using-microphone-input-in-unity3d/