Converting IntPtr to Audio Data Array
I've been trying to implement WebRTC audio/video communication in Unity. So far, with the help of **this blog post** (and Google Translate) and the very few examples on the internet, I've managed to get quite a lot working in Unity using the WebRTC **Unity Plugin**.
But now I'm stuck. The great Unity sample project by `mhama` sadly doesn't include an example of how to convert the data I get from the native code into something Unity can use as audio data.
The info I get from the callback is this:

```
(IntPtr data, int bitsPerSample, int sampleRate, int numberOfChannels, int numberOfFrames)
```

In the native code, that data is declared as:

```
const void* audio_data
```
I know that to create an AudioClip that Unity can play, I need a `float` array with sample values from -1 to 1. But how to go from that `IntPtr` and all that extra info to such a `float` array is something I have no idea how to do.
Answer by Bunny83 · Jul 25, 2018 at 02:28 AM
Well, an `IntPtr` is just a managed wrapper structure for a native pointer. A native pointer has no type; it's just a pointer to some memory. To access any data behind that pointer you need to use unsafe code (or the `Marshal` class), and the project you've linked does contain an example of this. However, it all depends on what the data actually is. Uncompressed audio data is usually stored as signed 16-bit integer samples, though the exact layout also depends on the number of channels. The data could just as well contain MP3 or some other compressed audio format. It's up to you to figure out what you actually get back from that callback and how to convert it.
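As a side note on reading the memory behind an `IntPtr`: the `Marshal` class lets you copy it into a managed array without an `unsafe` block. A generic sketch, not tied to any particular plugin API (the class name `PointerAccess` is made up for illustration):

```csharp
using System;
using System.Runtime.InteropServices;

static class PointerAccess
{
    // Copies 'count' bytes from the native memory behind an untyped
    // pointer into a managed byte array.
    public static byte[] ReadBytes(IntPtr data, int count)
    {
        var bytes = new byte[count];
        Marshal.Copy(data, bytes, 0, count);
        return bytes;
    }
}
```

Once the bytes are in managed memory, you can reinterpret them according to whatever format the callback actually delivers.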
Based on the additional parameters you get (bitsPerSample, sampleRate, numberOfChannels, numberOfFrames), it seems to be uncompressed PCM samples.
bitsPerSample tells you how many bits (and therefore bytes) each sample uses. The sample rate matters when you actually play the audio: if it doesn't match, the audio plays too fast or too slow. With more than one channel (usually either 1 or 2) the channels are normally interleaved, so you get one sample for the left channel, then one for the right, and so on. That said, "numberOfFrames" could also mean the data contains MPEG Layer-3 (MP3) frames.
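Assuming the callback really delivers interleaved signed 16-bit PCM (an assumption you should verify against the plugin's native code), the conversion to Unity's float format could be sketched like this; the class and method names are made up for illustration:

```csharp
using System;
using System.Runtime.InteropServices;

static class AudioConversion
{
    // Converts interleaved signed 16-bit PCM samples behind a native
    // pointer into float samples in the range [-1, 1], the format
    // Unity expects for AudioClip data.
    public static float[] ToFloatSamples(IntPtr data, int numberOfFrames, int numberOfChannels)
    {
        int sampleCount = numberOfFrames * numberOfChannels;

        // Copy the native memory into a managed Int16 array.
        var raw = new short[sampleCount];
        Marshal.Copy(data, raw, 0, sampleCount);

        // Scale each sample from the Int16 range into [-1, 1].
        var samples = new float[sampleCount];
        for (int i = 0; i < sampleCount; i++)
            samples[i] = raw[i] / 32768f;
        return samples;
    }
}
```

With the resulting array you could then build a clip via `AudioClip.Create` (passing `sampleRate` and `numberOfChannels`) and fill it with `AudioClip.SetData`, or, for a continuous stream, use the streaming overload of `AudioClip.Create` with a `PCMReaderCallback`.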
What the data you get in that callback actually contains is something you have to figure out yourself.
That helped me understand the code a little better, thanks.