11
votes

I have been trying to do real-time audio signal processing with the 'pyAudio' module in Python. As a simple test case, I read audio data from the microphone and play it back through headphones. I tried the following code (both Python and Cython versions). It works, but the audio stalls and is not smooth enough. How can I improve the code so that it runs smoothly? My PC has an i7 CPU and 8 GB of RAM.

Python Version

import pyaudio
import numpy as np

RATE    = 16000
CHUNK   = 256

p               =   pyaudio.PyAudio()

player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, 
frames_per_buffer=CHUNK)
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)

for i in range(int(20*RATE/CHUNK)): #do this for 10 seconds
player.write(np.fromstring(stream.read(CHUNK),dtype=np.int16))
stream.stop_stream()
stream.close()
p.terminate()

Cython Version

import pyaudio
import numpy as np

cdef int RATE   = 16000
cdef int CHUNK  = 1024
cdef int i      
p               =   pyaudio.PyAudio()

player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, frames_per_buffer=CHUNK)
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)

for i in range(500): #do this for 10 seconds
    player.write(np.fromstring(stream.read(CHUNK),dtype=np.int16))
stream.stop_stream()
stream.close()
p.terminate()
3
I don't know what you mean by "stalls" or what you expect. There is nothing to be gained by using Cython: there are no Python calculations, everything is done by C code inside the libraries. You call it real-time, but use blocking I/O. How should that work? Use the non-blocking version: people.csail.mit.edu/hubert/pyaudio/docs/… - ead
By 'stalls', I meant that the audio breaks up in between. How do blocking and non-blocking modes differ? Thank you for the link. - Sajil C K
In your case "blocking" means that while it plays it does not record, and while it records it does not play. - ead
@ead, while non-blocking mode can be used to wire the input (microphone) directly to the output (headset/speaker), you can't do any processing on the audio, as you have no access to or control over it. For any mid-stream processing the OP will need to use the blocking version (which he is using). - Anil_M
@SAJIL C K, please check my answer for a solution. - Anil_M

3 Answers

8
votes

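I believe you are missing CHUNK as the second argument to the player.write call. As far as I can tell, when the frame count is omitted, write infers it from len() of the data, and len() of an int16 NumPy array counts samples rather than bytes, so only part of each chunk would be played.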

player.write(np.frombuffer(stream.read(CHUNK), dtype=np.int16), CHUNK)
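To illustrate why the explicit frame count can matter, here is a minimal sketch that needs no audio hardware. The helper inferred_frames is hypothetical; it assumes that write derives the frame count as len(frames) / (channels * sample width) when the count is omitted, which is my reading of PyAudio's behaviour rather than something confirmed in this thread.

```python
import numpy as np

CHUNK = 256
WIDTH = 2       # bytes per paInt16 sample
CHANNELS = 1

# Stand-in for stream.read(CHUNK): CHUNK frames of silent 16-bit mono audio.
raw = bytes(CHUNK * WIDTH * CHANNELS)
samples = np.frombuffer(raw, dtype=np.int16)


def inferred_frames(frames, channels=CHANNELS, width=WIDTH):
    """Hypothetical helper: frame count guessed from len(frames) alone.

    Assumption: this mirrors what write() does when no count is passed.
    """
    return int(len(frames) / (channels * width))


frames_from_bytes = inferred_frames(raw)      # len() counts bytes
frames_from_array = inferred_frames(samples)  # len() counts samples

print(frames_from_bytes, frames_from_array)
```

Under that assumption the array form is undercounted by the sample width, which would explain choppy playback; passing CHUNK explicitly, or writing the raw bytes, avoids the ambiguity.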

Also, I am not sure whether it is a formatting error, but player.write needs to be indented so that it sits inside the for loop.

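The PyAudio examples write the loop count as RATE / CHUNK * RECORD_SECONDS. Python gives * and / the same precedence and evaluates them left to right, so 20*RATE/CHUNK yields the same count under true division (1250 here); with integer division the two forms can differ by rounding. Note also that a factor of 20 means the loop runs for 20 seconds, not the 10 that the comment in the question claims.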

for i in range(int(20*RATE/CHUNK)): # do this for 20 seconds
    player.write(np.frombuffer(stream.read(CHUNK), dtype=np.int16), CHUNK)

stream.stop_stream()
stream.close()
player.stop_stream()  # also release the output stream
player.close()
p.terminate()
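As a quick check of the loop-count arithmetic, the sketch below compares the two ways of writing the expression. The helper name chunks_for is made up for illustration; the numbers are the question's RATE and CHUNK.

```python
RATE = 16000
CHUNK = 256


def chunks_for(seconds, rate=RATE, chunk=CHUNK):
    """Number of whole chunks needed to cover `seconds` of audio."""
    return int(seconds * rate / chunk)


n_true_division = chunks_for(20)        # 20 * 16000 / 256
n_floor_division = RATE // CHUNK * 20   # floors 62.5 to 62 before scaling

print(n_true_division, n_floor_division)
```

Multiplying before dividing keeps the fractional chunk per second, so it is the safer form if integer division is ever involved.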

Finally, you may want to increase RATE to 44100, CHUNK to 1024 and the number of channels to 2 for better fidelity.

5
votes

The code below will take the default input device, and output what's recorded into the default output device.

import time  # needed for time.sleep below

import pyaudio  # the module name is lowercase
import numpy as np

p = pyaudio.PyAudio()

CHANNELS = 2
RATE = 44100

def callback(in_data, frame_count, time_info, flag):
    # using Numpy to convert to array for processing
    # audio_data = np.frombuffer(in_data, dtype=np.float32)
    return in_data, pyaudio.paContinue

stream = p.open(format=pyaudio.paFloat32,
                channels=CHANNELS,
                rate=RATE,
                output=True,
                input=True,
                stream_callback=callback)

stream.start_stream()

while stream.is_active():
    time.sleep(20)
    stream.stop_stream()
    print("Stream is stopped")

stream.close()

p.terminate()

This will run for 20 seconds and then stop. The callback function is where you can process the signal, for example after converting it to an array: audio_data = np.frombuffer(in_data, dtype=np.float32)

return in_data is where you send back post-processed data to the output device.

Note that the chunk size (the frames_per_buffer argument) has a default of 1024, as noted in the PyAudio docs: http://people.csail.mit.edu/hubert/pyaudio/docs/#pyaudio.PyAudio.open
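As a rough sketch of what the processing step inside the callback could look like, the function below halves the volume of one buffer of float32 audio. The name process_chunk and the gain parameter are illustrative only and not part of PyAudio; the function is kept free of PyAudio so it can be tried without a sound card.

```python
import numpy as np


def process_chunk(in_data: bytes, gain: float = 0.5) -> bytes:
    """Scale one buffer of float32 samples and return it as bytes."""
    # frombuffer gives a read-only view, so the result goes into a new array.
    audio = np.frombuffer(in_data, dtype=np.float32)
    # Clip to the nominal float range so a gain above 1.0 cannot overflow it.
    processed = np.clip(audio * gain, -1.0, 1.0).astype(np.float32)
    return processed.tobytes()


# Small self-check with four made-up samples.
example_in = np.array([0.5, -1.0, 0.25, 1.0], dtype=np.float32).tobytes()
example_out = np.frombuffer(process_chunk(example_in), dtype=np.float32)
print(example_out)
```

Inside the callback this would be used as return process_chunk(in_data), pyaudio.paContinue. The byte length of the output has to match the input, since the output device expects the same number of frames.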

3
votes

I am working on a similar project. I modified your code and the stalls now are gone. The bigger the chunk the bigger the delay. That is why I kept it low.

import pyaudio
import numpy as np

CHUNK = 2**5
RATE = 44100
LEN = 10

p = pyaudio.PyAudio()

stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)
player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, frames_per_buffer=CHUNK)


for i in range(int(LEN*RATE/CHUNK)): # run for LEN seconds
    data = np.frombuffer(stream.read(CHUNK), dtype=np.int16)
    player.write(data, CHUNK)


stream.stop_stream()
stream.close()
player.stop_stream()  # also release the output stream
player.close()
p.terminate()
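To put a number on "the bigger the chunk the bigger the delay": each chunk has to be filled completely before it can be passed on, so it contributes at least CHUNK / RATE seconds of buffering. The helper below is a hypothetical illustration and ignores any extra latency added by the driver or hardware.

```python
def chunk_latency_ms(chunk, rate):
    """Minimum buffering delay of one chunk, in milliseconds."""
    return 1000.0 * chunk / rate


for chunk, rate in [(2**5, 44100), (1024, 44100), (256, 16000), (1024, 16000)]:
    print(f"CHUNK={chunk:5d} RATE={rate:6d} -> {chunk_latency_ms(chunk, rate):6.2f} ms")
```

Very small chunks keep this delay low but mean more loop iterations per second, so there is a trade-off between latency and the risk of the loop falling behind.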