3
votes

I couldn't install textract in Google Colab; the error message is shown below.

Some people suggest running sudo apt-get install libasound2-dev, but how do I run a sudo command in Google Colab?

=== error message ==========================================================

Failed building wheel for pocketsphinx
Running setup.py clean for pocketsphinx
Failed to build pocketsphinx
Installing collected packages: pocketsphinx
Running setup.py install for pocketsphinx ... error
Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-03c_ysbm/pocketsphinx/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-6n9ewg9i/install-record.txt --single-version-externally-managed --compile:
running install
running build_ext
building 'sphinxbase._sphinxbase' extension
swigging deps/sphinxbase/swig/sphinxbase.i to deps/sphinxbase/swig/sphinxbase_wrap.c
swig -python -modern -threads -Ideps/sphinxbase/include -Ideps/sphinxbase/include/sphinxbase -Ideps/sphinxbase/include/android -Ideps/sphinxbase/swig -outdir sphinxbase -o deps/sphinxbase/swig/sphinxbase_wrap.c deps/sphinxbase/swig/sphinxbase.i
unable to execute 'swig': No such file or directory
error: command 'swig' failed with exit status 1

===========================================================================

Thank you, Ling

2 Answers

2
votes

In Google Colab, you can run Bash commands by prefixing them with '!'.

Example:

!apt-get update
!apt-get install -y libasound2-dev

2
votes

You don't need sudo in Colab, because commands run with '!' already execute as root.
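
Whether a given runtime actually runs as root can be checked directly rather than assumed. This is a minimal sketch, assuming a Linux runtime such as Colab's (os.geteuid is Unix-only); the helper name running_as_root is hypothetical:

```python
import os


def running_as_root(euid=None):
    """Return True when the effective user ID is 0, i.e. root."""
    if euid is None:
        # os.geteuid() is only available on Unix-like systems.
        euid = os.geteuid()
    return euid == 0


if running_as_root():
    print("Session is root: apt-get can be run without sudo.")
else:
    print("Session is not root: system-wide installs may be refused.")
```

If the first message is printed, the apt-get lines in this thread can be run as written, without any sudo prefix.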

The problem is that you need not just libasound2-dev but a whole set of packages. See the Debian requirements at https://textract.readthedocs.io/en/stable/installation.html

Also, to build pocketsphinx (a requirement for textract), you need libpulse-dev. Here is the updated command list:

!apt-get install -y python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils \
     pstotext tesseract-ocr \
     flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libasound2-dev libpulse-dev
!pip install git+https://github.com/deanmalmgren/textract
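
Since the original failure was a missing swig executable, it may help to confirm after the installs that the command-line tools are actually on the PATH before retrying pip. This is a minimal sketch using Python's shutil.which; the tool list and the helper name missing_tools are illustrative assumptions, not part of textract:

```python
import shutil

# Assumed list of executables: swig comes from the error message, the others
# are binaries shipped by antiword, poppler-utils, and tesseract-ocr.
REQUIRED_TOOLS = ["swig", "antiword", "pdftotext", "tesseract"]


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All listed tools were found on PATH.")
```

If swig is still reported as missing, the pocketsphinx build will most likely fail with the same error as in the question.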