gpgm2srt is a graphical tool for conversion of DVD subtitles into textual form.
The script processes subtitles saved as a sequence of black-and-white PNG pictures, which is the output of the subtitle2pgm application run on a subtitle stream. The other output is an .srtx file containing the subtitle timing. You can get the subtitle stream from a DVD using tcextract (together with MPlayer or tccat). The procedure is described in my previous article on processing DVD subtitles in the console. gpgm2srt offers the same functionality in a few clicks through a graphical interface, and adds more features.
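As a rough reminder of what the ripping step looks like, here is a minimal sketch of the tccat, tcextract and subtitle2pgm chain wrapped in a shell function. The function name rip_subtitles, the device path /dev/dvd, the title number, the stream ID 0x20 and the output base name are all assumptions for illustration; check them against your own disc and against the article mentioned above.

```shell
# Rip one subtitle stream from a DVD and convert it to pictures plus an .srtx file.
#   $1  DVD device or directory (assumed default: /dev/dvd)
#   $2  title number on the disc (assumed default: 1)
#   $3  subtitle stream ID, counted from 0x20 (assumed default: 0x20)
#   $4  base name for the output files (hypothetical default: subs)
rip_subtitles() {
    tccat -i "${1:-/dev/dvd}" -T "${2:-1}" -L \
        | tcextract -x ps1 -t vob -a "${3:-0x20}" \
        | subtitle2pgm -o "${4:-subs}"
}

# Example call, commented out because it needs a disc in the drive:
# rip_subtitles /dev/dvd 1 0x20 english
```

The pictures and the .srtx file written by subtitle2pgm are then the input for gpgm2srt.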
The scripts are still under development and many features are not yet implemented (see the TO DO section at the end), but the functionality is already pretty good. Bug reports and feature requests are highly welcome.
At the moment only an English version of the GUI is available. The recognition algorithm, however, is designed to handle diacritics, so Czech, Slovak etc. should be "no problem".
If the program behaves oddly or does not run at all, start it from the console and watch (or send me) the messages that appear. Remember that the program is only a script, so its speed is limited (although compiled Perl scripts run pretty fast). On my 1.8 GHz Pentium 4 the script is fast enough. If something takes a little longer, please be patient. Of course it can also freeze (an unhandled loop etc.), so watch the messages in the console, the processor load and the hard disk activity (e.g. using gkrellm or a similar tool).
If the processor runs at 100% for a long time, the disk shows no activity and nothing happens in the script's window, the script is stuck in a loop :o) But this shouldn't happen with the current version of my exact match OCR algorithm. The most time-consuming steps are the initial ripping of the subtitle stream from the DVD and then the conversion of the subtitle stream into PGM pictures (using subtitle2pgm).
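To keep the console messages for a bug report, something like the following sketch can help. The helper name run_logged and the log file name are hypothetical; the function simply shows the messages on the terminal and saves a copy to a file.

```shell
# Run a command, showing its messages on the terminal and saving a copy
# (standard output and standard error together) to the file given as $1.
run_logged() {
    logfile=$1
    shift
    "$@" 2>&1 | tee "$logfile"
}

# Typical use, commented out so the sketch runs without the script installed:
# run_logged gpgm2srt.log gpgm2srt.pl
```

The resulting log file can then be attached to a bug report.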
Currently only the standalone scripts gpgm2srt.pl and Exactocr.pm are available (no rpm); you can find them in the download section. The latter is a Perl module containing the exact match OCR (Optical Character Recognition) algorithm, which will in the future be shared by the graphical and console versions, so the two can be upgraded separately. From now on the console script pgm2srt.pl is unmaintained. The Exactocr.pm module is heavily based on pgm2srt.pl.
Download the gpgm2srt.pl and Exactocr.pm scripts together with subtitle2pgm into the ~/bin/ directory, i.e. the bin directory inside your home directory (in my case /home/hanus/bin/). If there is no bin/ directory in your home directory, create it. Then type echo $PATH and check whether /home/user/bin/ (with your own user name in place of user) is in the command search path. If not, use PATH=$PATH:$HOME/bin to add it.
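The directory and PATH steps above can be sketched as a few shell lines. The helper name path_contains is hypothetical, and the chmod step is not mentioned in the text; it is my assumption that the downloaded files need to be executable to be run by name.

```shell
# Create the personal bin directory if it does not exist yet.
mkdir -p "$HOME/bin"

# Succeed if the directory given as $1 is already listed in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Append ~/bin to PATH only if it is missing (current shell session only).
if ! path_contains "$HOME/bin"; then
    PATH="$PATH:$HOME/bin"
    export PATH
fi

# Assumption: make the downloaded files executable, if they are already there.
for f in "$HOME/bin/gpgm2srt.pl" "$HOME/bin/subtitle2pgm"; do
    if [ -f "$f" ]; then
        chmod +x "$f"
    fi
done
```

The PATH change lasts only for the current session; to keep it, the same PATH line can be placed in your shell's startup file.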
Older Perl versions can fail to compile the use encoding ':locale'; statement. In that case either comment out the line with #, or run the following at the prompt:
sed "s/use encoding/#use encoding/" -i gpgm2srt.pl
The same approach can be tried if you have problems saving or loading text subtitles encoded in UTF-8/ISO-2/Win1250.
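A slightly more cautious variant of the sed command is sketched below. It keeps a backup copy of the script and only touches lines that are not commented out yet. The function names are hypothetical, and the sketch assumes a sed that accepts a backup suffix attached to -i.

```shell
# Comment out the "use encoding" statement in the file given as $1,
# keeping the previous version of the file next to it as "$1.bak".
# Only lines that start with "use encoding" (after optional whitespace)
# are changed, so running it again does not add a second "#".
disable_encoding_pragma() {
    sed -i.bak 's/^\([[:space:]]*\)use encoding/\1#use encoding/' "$1"
}

# Put the backup made by the function above back in place.
restore_encoding_pragma() {
    mv "$1.bak" "$1"
}

# Typical use, commented out so the sketch runs without the script installed:
# disable_encoding_pragma "$HOME/bin/gpgm2srt.pl"
```

If the change does not help, restore_encoding_pragma brings back the file as it was before the last run.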