After experimenting with running Whisper on an i9 laptop with a 4050 GPU, I've decided to drop all the way back to running things on an old Raspberry Pi 4 (8 GB RAM) with an SSD attached over USB 3.0.
Is it slower on an 8 GB arm64 device? Absolutely. I'd guess at least 100% slower with the tiny model and 400% slower with the base model... but it gets literally the same results, since I'm dealing with offline recordings rather than live ones. In practice it takes about a minute to transcribe each minute of recording.
ggml-org/whisper.cpp
- This is the main project repo. As of today, Whisper is well supported for generating .srt subtitles. Both the tiny and base models work just fine. I have no need for a web UI, so this works just peachy and will be easier to automate.
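Since automation is the goal, here is a rough sketch of what a batch wrapper could look like in Python. It is not taken from my actual setup: the binary path (`whisper.cpp/build/bin/whisper-cli`), the model file (`ggml-base.bin`), and the helper names are all assumptions, so adjust them to match your own build. It relies on ffmpeg to convert each recording to the 16 kHz mono WAV that whisper.cpp expects, then asks whisper.cpp for `.srt` output via `-osrt`.

```python
import subprocess
from pathlib import Path

# Assumed locations (hypothetical): change these to match your own build and model.
WHISPER_BIN = Path("whisper.cpp/build/bin/whisper-cli")
MODEL = Path("whisper.cpp/models/ggml-base.bin")


def ffmpeg_cmd(src: Path, wav: Path) -> list[str]:
    """Build the ffmpeg command that converts src to 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y", "-i", str(src),
        "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le",
        str(wav),
    ]


def whisper_cmd(wav: Path, out_base: Path) -> list[str]:
    """Build the whisper.cpp command; -of takes the output path without extension."""
    return [
        str(WHISPER_BIN), "-m", str(MODEL),
        "-f", str(wav), "-osrt", "-of", str(out_base),
    ]


def srt_path(src: Path) -> Path:
    """Where the subtitle file is expected to land for a given recording."""
    return src.with_name(src.stem + ".srt")


def transcribe(src: Path) -> Path:
    """Convert one recording and transcribe it, returning the expected .srt path."""
    wav = src.with_name(src.stem + ".16k.wav")
    out_base = src.with_name(src.stem)
    subprocess.run(ffmpeg_cmd(src, wav), check=True)
    subprocess.run(whisper_cmd(wav, out_base), check=True)
    wav.unlink(missing_ok=True)  # drop the temporary WAV once transcription is done
    return srt_path(src)
```

Something like `for f in sorted(Path("recordings").glob("*.mp3")): transcribe(f)` would then work through a folder one file at a time (the `recordings` folder is just an example name). Older whisper.cpp builds name the binary `main` instead of `whisper-cli`, so check which one your build produced.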