Whisper - An OpenAI automatic speech recognition (ASR) system
The Digital Initiatives team uses Whisper to create transcriptions for audio and video files. Whisper requires Homebrew and ffmpeg to be installed first. All of this is done via Terminal, the macOS command-line interface. The following outlines the steps for a Support Services member to install Whisper on a macOS computer. Because all of these tools are open source, the process may change at any time without notice, so be sure to read each command's output in the Terminal window.
Open Terminal:
Switch to the admin user within Terminal, which requires the local admin password.
su admin
Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Exit the admin user
exit
Create .zprofile, a startup file that the command-line shell (zsh) reads at login, and add Homebrew to it
echo >> /Users/USER.NAME/.zprofile
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> /Users/USER.NAME/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"
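Running the two echo commands above more than once appends the same line to .zprofile each time. A minimal sketch of a guard against that, assuming a POSIX-style shell such as zsh or bash; append_once is a hypothetical helper name, not part of Homebrew:

```shell
# Append LINE to FILE only if FILE does not already contain that exact line.
# append_once is a hypothetical helper, introduced here for illustration.
append_once() {
  file=$1
  line=$2
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
    echo "already present in $file"
  else
    printf '%s\n' "$line" >> "$file"
    echo "added to $file"
  fi
}

# Example call (USER.NAME is the same placeholder used in the steps above):
# append_once /Users/USER.NAME/.zprofile 'eval "$(/opt/homebrew/bin/brew shellenv)"'
```

Unlike the original commands, this sketch does not write a blank line first, so it assumes the file already ends with a newline or does not exist yet.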
To install ffmpeg, you have to modify ownership and write permissions on various directories. Switch back to the admin user within Terminal (su admin, as before), run the following commands, then exit the admin user.
sudo chown -R USER.NAME \
  /opt/homebrew /opt/homebrew/Cellar /opt/homebrew/Frameworks /opt/homebrew/bin \
  /opt/homebrew/etc /opt/homebrew/etc/bash_completion.d /opt/homebrew/include \
  /opt/homebrew/lib /opt/homebrew/opt /opt/homebrew/sbin /opt/homebrew/share \
  /opt/homebrew/share/doc /opt/homebrew/share/man /opt/homebrew/share/man/man1 \
  /opt/homebrew/share/zsh /opt/homebrew/share/zsh/site-functions \
  /opt/homebrew/var/homebrew/linked /opt/homebrew/var/homebrew/locks
chmod u+w \
  /opt/homebrew /opt/homebrew/Cellar /opt/homebrew/Frameworks /opt/homebrew/bin \
  /opt/homebrew/etc /opt/homebrew/etc/bash_completion.d /opt/homebrew/include \
  /opt/homebrew/lib /opt/homebrew/opt /opt/homebrew/sbin /opt/homebrew/share \
  /opt/homebrew/share/doc /opt/homebrew/share/man /opt/homebrew/share/man/man1 \
  /opt/homebrew/share/zsh /opt/homebrew/share/zsh/site-functions \
  /opt/homebrew/var/homebrew/linked /opt/homebrew/var/homebrew/locks
exit
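Before moving on, it may help to confirm that the regular user can actually write to those directories. A rough sketch, run as the regular user rather than admin; check_writable is a hypothetical helper name and the three paths in the example call are only a subset of the list above:

```shell
# Print every directory from the argument list that the current user cannot
# write to. Returns 0 when all are writable, 1 otherwise.
# check_writable is a hypothetical helper, introduced here for illustration.
check_writable() {
  rc=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "missing:      $dir"
      rc=1
    elif [ ! -w "$dir" ]; then
      echo "not writable: $dir"
      rc=1
    fi
  done
  return "$rc"
}

# Example call:
# check_writable /opt/homebrew /opt/homebrew/Cellar /opt/homebrew/bin
```

No output and a zero exit status suggest the ownership change took effect for the directories that were checked.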
Once those changes are complete, you can install ffmpeg
brew install ffmpeg
With all the dependencies installed, we can now install Whisper
pip3 install -U openai-whisper
Once Whisper is installed, pip will prompt you to upgrade pip itself
python3.10 -m pip install --upgrade pip
We now need to install Python, which is done by downloading the macOS install package from https://www.python.org/downloads/release/python-3126/. Once the installation is complete, you get an alert that you need to install SSL certificates. To do so, go back into Terminal, switch to the admin user, run the install command, and then exit the admin user.
su admin
"/Applications/Python 3.12/Install Certificates.command"
exit
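If SSL errors still appear after this step, one way to investigate is to ask Python where it looks for its certificate bundle. The sketch below assumes python3 on PATH is the interpreter Whisper runs under, which may not hold if several Python versions are installed; show_python_ca_paths is a hypothetical helper name.

```shell
# Print the certificate bundle locations Python's ssl module uses by default.
# show_python_ca_paths is a hypothetical helper, introduced for illustration.
show_python_ca_paths() {
  if ! command -v python3 >/dev/null 2>&1; then
    echo "python3 not found on PATH" >&2
    return 2
  fi
  python3 - <<'PY'
import os
import ssl

paths = ssl.get_default_verify_paths()
# cafile is None when the default bundle file does not exist.
print("cafile in use:       ", paths.cafile)
print("default cafile path: ", paths.openssl_cafile)
print("default file exists: ", os.path.exists(paths.openssl_cafile))
PY
}

# Example call:
# show_python_ca_paths
```

A "cafile in use" value of None would be consistent with the certificate install not having taken effect for that interpreter, though other causes are possible.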
And finally, we need to add the following line to "~/.zshrc", which is also read by the command-line shell environment.
· Open Finder and navigate to the home folder, 'CMD+Shift+H'.
· Show hidden items, 'CMD+Shift+.'
· Right-click on '.zshrc' and open with TextEdit
· Add the following text as a new line, then save and close the file
eval "$(/opt/homebrew/bin/brew shellenv)"
Then hide hidden items again, 'CMD+Shift+.'
That's all there is, folks. Whisper and all dependencies are installed and should function.
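A quick way to confirm that claim is to check that each command can be found from a fresh Terminal window. This is only a sketch: check_tools is a hypothetical helper name, and finding a command on PATH does not prove it runs correctly.

```shell
# Report whether each named command can be found on PATH.
# Returns 0 only when every command was found.
# check_tools is a hypothetical helper, introduced here for illustration.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok:      $tool -> $(command -v "$tool")"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# The four commands this guide installs or relies on:
# check_tools brew ffmpeg python3 whisper
```

If brew shows as missing, the shellenv line in .zprofile or .zshrc is the first thing to recheck.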
After-thoughts:
When running through this on Greg Murray's laptop, he continued to run into SSL errors when trying to run Whisper. He did some research and found a workaround: downloading the language model and relocating it to a cache folder so Whisper can access it without trying to download it over HTTPS. I believe the errors occurred because I did not run the Python Install Certificates command as admin. Nevertheless, here was his workaround:
Used a web browser to download the “medium” language model: https://openaipublic.azureedge.net/main/whisper/models/345ae4da62f9b3d59415adc60127b97c714f32e89e936602e85993674d08dcb1/medium.pt
At the command line, move it to the cache:
mv ~/Downloads/medium.pt ~/.cache/whisper
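The mv command above relies on ~/.cache/whisper already existing as a directory; if it does not, the file is either renamed to "whisper" or the move fails. A slightly more defensive sketch, where place_model is a hypothetical helper name:

```shell
# Move a downloaded model file into Whisper's cache directory,
# creating the directory first if it does not exist yet.
# place_model is a hypothetical helper, introduced here for illustration.
place_model() {
  src=$1
  cache_dir=${2:-"$HOME/.cache/whisper"}
  if [ ! -f "$src" ]; then
    echo "model file not found: $src" >&2
    return 1
  fi
  mkdir -p "$cache_dir"
  mv "$src" "$cache_dir/"
  echo "moved $(basename "$src") to $cache_dir"
}

# Example call, matching the workaround above:
# place_model ~/Downloads/medium.pt
```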
At that point, he was able to run Whisper as Maggie Wilkey documented, for example, from his home directory:
whisper temp/EPM-1398.mp3 --model medium --word_timestamps True --output_dir 'temp' --output_format json
This assumes that the spoken-word MP3 file EPM-1398.mp3 has already been copied to ~/temp.
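When several MP3 files are waiting in the same folder, the same command can be wrapped in a loop. The sketch below reuses the flags from the example above; transcribe_dir and the WHISPER_BIN override are hypothetical names introduced only so the loop can be dry-run without Whisper.

```shell
# WHISPER_BIN is a hypothetical override so the loop can be dry-run
# (for example WHISPER_BIN=echo); by default it calls the real whisper command.
WHISPER_BIN=${WHISPER_BIN:-whisper}

# Transcribe every .mp3 file in DIR, writing the JSON output next to the audio.
transcribe_dir() {
  dir=$1
  for audio in "$dir"/*.mp3; do
    # In sh and bash an unmatched pattern is left as-is; skip it.
    [ -e "$audio" ] || continue
    "$WHISPER_BIN" "$audio" --model medium --word_timestamps True \
      --output_dir "$dir" --output_format json
  done
}

# Example call, from the home directory:
# transcribe_dir temp
```

In zsh, a folder with no MP3 files makes the pattern itself fail with a "no matches found" error before the loop starts, so the skip line only matters in sh and bash.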