Load audio from Google Storage using Google Colab (Python)

Question

I saved my audio file in Google Storage in WAV format, but when I try to load it from Google Colab, I can't get it to work.

Below is the code I used to load the audio from Google Storage.

import numpy as np
import IPython.display as ipd
import librosa
import soundfile as sf
import io
from google.cloud import storage
import os

from google.colab import auth
auth.authenticate_user()


os.environ["GCLOUD_PROJECT"] = "fundpro" #project_id
BUCKET = 'parli-2020' #bucket_name
gcs = storage.Client()
bucket = gcs.get_bucket(BUCKET)
import speech_recognition as sr

for blob in bucket.list_blobs(prefix='speech/Transcribe'):

    filename = 'gs://parli-2020/' + blob.name
    X, sample_rate = librosa.core.load(filename)

But I get an error saying the file cannot be found: [Errno 2] No such file or directory

My question: how do I load (read) audio from Google Storage?

Does this answer your question? how to load audio from Google Storage/ how to read audio from google storage — Jon Nordby
I tried the suggestion, but got an error that gs is not supported. Is there a way to keep the audio in WAV format? There is another processing step after reading the audio. — que23

Crash0v3rrid3 · Accepted Answer · 2021-04-05T03:15:13

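Librosa opens the path with ordinary local file I/O, which doesn't understand gs:// paths, so it reports that the file doesn't exist. You can use TensorFlow's GFile implementation, which supports Google Cloud Storage paths and gives you a file-like object to pass to librosa.

Something like this: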

import io
import os

import numpy as np
import IPython.display as ipd
import librosa
import soundfile as sf
import speech_recognition as sr
import tensorflow as tf
from google.cloud import storage
from google.colab import auth

# Authenticate the Colab session before touching the bucket.
auth.authenticate_user()

os.environ["GCLOUD_PROJECT"] = "fundpro"  # project_id
BUCKET = 'parli-2020'  # bucket_name
gcs = storage.Client()
bucket = gcs.get_bucket(BUCKET)

for blob in bucket.list_blobs(prefix='speech/Transcribe'):
    filename = 'gs://' + BUCKET + '/' + blob.name
    # GFile can open gs:// paths; librosa reads from the file-like object.
    with tf.io.gfile.GFile(filename, 'rb') as fp:
        X, sample_rate = librosa.load(fp)
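
If you'd rather not pull in TensorFlow, another option is to download each blob into memory and hand the bytes to the audio library as a file-like object. The following is only a minimal sketch: read_wav_blob is a hypothetical helper name, it assumes blob is a google.cloud.storage Blob (anything with a download_as_bytes() method works), and it decodes with the standard-library wave module purely for illustration, which only handles plain PCM WAV files.

```python
import io
import wave


def read_wav_blob(blob):
    """Download a WAV blob into memory and decode it (hypothetical helper).

    Assumes `blob` exposes download_as_bytes(), as google.cloud.storage.Blob does.
    Returns (raw PCM frames as bytes, sample rate in Hz, number of channels).
    """
    data = blob.download_as_bytes()   # fetch the whole object; no gs:// path needed
    buffer = io.BytesIO(data)         # in-memory file-like wrapper around the bytes
    with wave.open(buffer, "rb") as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())
    return frames, sample_rate, channels
```

In the loop above you would call read_wav_blob(blob) instead of building a gs:// filename. The same io.BytesIO buffer should also be usable with sf.read(buffer) or librosa.load(buffer, sr=None); passing sr=None asks librosa to keep the file's original sample rate instead of resampling to its default.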
