
    captionstransformer 1.2.1

    Developer: JeanMichel FRANCOIS
    License / Price: GPL / FREE
    Last Updated: August 19th, 2012, 13:40 GMT
    Category: ROOT / Programming / Libraries


    captionstransformer description

    A set of tools (API + script) to read, write and transform captions from/to many formats

    captionstransformer is a set of tools to transform captions from one format to another. It provides a Reader and a Writer for each format, plus a script for command-line use.

    Supported formats:

    - sbv Reader and Writer
    - srt Reader and Writer
    - ttml Reader and Writer
    - transcript Reader and Writer

    How to use (API)

    The provided unit tests contain complete examples. A short one, converting SBV to TTML:

    from captionstransformer.sbv import Reader
    from captionstransformer.ttml import Writer
    from StringIO import StringIO  # Python 2 standard library module

    # Sample SBV content: a timing line followed by the caption text.
    test_content = StringIO(u"""
    0:00:03.490,0:00:07.430
    >> FISHER: All right. So, let's begin.
    This session is: Going Social

    0:00:07.430,0:00:11.600
    with the YouTube APIs. I am
    Jeff Fisher,


    0:00:11.600,0:00:14.009
    and this is Johann Hartmann,
    we're presenting today.

    0:00:14.009,0:00:15.889
    [pause]
    """)

    # Read the SBV content into a list of caption objects.
    reader = Reader(test_content)
    captions = reader.read()
    assert len(captions) == 4
    first = captions[0]
    assert isinstance(first.text, unicode)
    assert first.text == u">> FISHER: All right. So, let's begin.\nThis session is: Going Social\n"

    # Next, get a writer and serialize the same captions as TTML.
    filelike = StringIO()
    writer = Writer(filelike)
    writer.set_captions(captions)
    text = writer.captions_to_text()
    assert text.startswith(u"""<tt xml:lang="" xmlns="http://www.w3.org/ns/ttml"><body><div>""")
    writer.write()
    writer.close()
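
The read-then-write sequence above can be wrapped in a small helper. This is only a sketch: `transform` is a hypothetical name, not part of captionstransformer, and it assumes nothing beyond the calls already shown in the example (`Reader(filelike)`, `read()`, `Writer(filelike)`, `set_captions()`, `write()`, `close()`).

```python
def transform(reader_cls, writer_cls, source, target):
    """Read captions from `source` and write them to `target`.

    Hypothetical helper, not part of captionstransformer. It assumes that
    reader_cls and writer_cls behave like the Reader and Writer classes
    used in the example above.
    """
    # Parse the input into a list of caption objects.
    reader = reader_cls(source)
    captions = reader.read()

    # Hand the same captions to the writer of the target format.
    writer = writer_cls(target)
    writer.set_captions(captions)
    writer.write()
    writer.close()
    return captions
```

With the imports from the example, a call would look like `transform(Reader, Writer, test_content, filelike)`.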


    About Formats

    It is quite hard to find simple documentation about existing caption formats. Here is a set of existing named caption formats:

    SubViewer (SUB):

    00:04:35.03,00:04:38.82
    Hello guys... please sit down...

    00:05:00.19,00:05:03.47
    M. Franklin,[br]are you crazy?
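
To make the layout of the sample concrete, here is a minimal parsing sketch for a single SubViewer cue. It is not captionstransformer code: the function names are hypothetical, and treating `[br]` as a line break and the fraction as hundredths of a second are assumptions read off the sample above.

```python
def to_seconds(stamp):
    """Convert an 'HH:MM:SS.cc' timestamp to seconds as a float."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_subviewer_cue(block):
    """Parse one cue into (start_seconds, end_seconds, text).

    Assumes the first line is 'start,end' and the rest is the caption text,
    with '[br]' marking a line break (inferred from the sample).
    """
    timing, _, text = block.strip().partition("\n")
    start, end = (to_seconds(part) for part in timing.split(","))
    return start, end, text.replace("[br]", "\n")
```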


    YouTube (SBV):

    0:00:03.490,0:00:07.430
    FISHER: All right. So, let's begin.
    This session is: Going Social

    0:00:07.430,0:00:11.600
    with the YouTube APIs. I am
    Jeff Fisher,

    0:00:11.600,0:00:14.009
    and this is Johann Hartmann,
    we're presenting today.

    0:00:14.009,0:00:15.889
    [pause]
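
The SBV layout (a `start,end` timing line, the caption text, then a blank line) can be parsed with the standard library alone. The sketch below is an illustration of the format, not the library's Reader; `parse_sbv` and the `(start_ms, end_ms, text)` tuple shape are assumptions made for this example.

```python
import re

# One timing line, e.g. "0:00:03.490,0:00:07.430"
SBV_TIMING = re.compile(
    r"^(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})$"
)


def parse_sbv(content):
    """Return a list of (start_ms, end_ms, text) tuples from SBV text."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        if not lines:
            continue
        match = SBV_TIMING.match(lines[0].strip())
        if not match:
            continue  # skip blocks that do not start with a timing line
        h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
        start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
        end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
        text = "\n".join(line.strip() for line in lines[1:])
        cues.append((start, end, text))
    return cues
```

Times are kept as integer milliseconds here to avoid floating-point rounding when they are written back out.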


    SubRip (SRT):

    1
    00:00:03,490 --> 00:00:07,430
    FISHER: All right. So, let's begin.
    This session is: Going Social

    2
    00:00:07,430 --> 00:00:11,600
    with the YouTube APIs. I am
    Jeff Fisher,

    3
    00:00:11,600 --> 00:00:14,009
    and this is Johann Hartmann,
    we're presenting today.

    4
    00:00:14,009 --> 00:00:15,889
    [pause]
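
SubRip numbers its cues sequentially starting at 1, separates the two timestamps with `-->`, and puts a comma before the milliseconds. A minimal writer sketch under those rules follows; it is not the library's Writer, and `srt_timestamp`, `to_srt` and the `(start_ms, end_ms, text)` input shape are hypothetical choices for this example.

```python
def srt_timestamp(ms):
    """Format integer milliseconds as 'HH:MM:SS,mmm'."""
    hours, rest = divmod(ms, 3600000)
    minutes, rest = divmod(rest, 60000)
    seconds, millis = divmod(rest, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, seconds, millis)


def to_srt(cues):
    """Serialize (start_ms, end_ms, text) tuples as SubRip text."""
    blocks = []
    # Each cue gets its own sequence number, starting at 1.
    for index, (start, end, text) in enumerate(cues, 1):
        blocks.append(
            "%d\n%s --> %s\n%s\n"
            % (index, srt_timestamp(start), srt_timestamp(end), text)
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```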


    Timed Text Markup Language (TTML):

    <tt xml:lang="" xmlns="http://www.w3.org/ns/ttml">
     <body region="subtitleArea">
      <div>
       <p xml:id="subtitle1" begin="0.76s" end="3.45s">
       It seems a paradox, does it not,
       </p>
       <p xml:id="subtitle2" begin="5.0s" end="10.0s">
       that the image formed on<br/>
       the Retina should be inverted?
       </p>
      </div>
     </body>
    </tt>
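
A document with the same `tt` / `body` / `div` / `p` nesting can be produced with `xml.etree.ElementTree`. This is a sketch for Python 3 that only illustrates the structure; it is not how captionstransformer generates TTML, and `to_ttml` and its `(start_seconds, end_seconds, text)` input are assumptions. It omits the `xml:id` and `region` attributes and `<br/>` handling.

```python
import xml.etree.ElementTree as ET

TTML_NS = "http://www.w3.org/ns/ttml"


def to_ttml(cues):
    """Serialize (start_seconds, end_seconds, text) tuples as a TTML string."""
    # Emit the TTML namespace as the default namespace (xmlns="...").
    # Note: this registration is global to ElementTree.
    ET.register_namespace("", TTML_NS)
    tt = ET.Element("{%s}tt" % TTML_NS)
    body = ET.SubElement(tt, "{%s}body" % TTML_NS)
    div = ET.SubElement(body, "{%s}div" % TTML_NS)
    for start, end, text in cues:
        # begin/end use the offset-time form with an "s" suffix, as in the sample.
        p = ET.SubElement(
            div,
            "{%s}p" % TTML_NS,
            {"begin": "%ss" % start, "end": "%ss" % end},
        )
        p.text = text
    return ET.tostring(tt, encoding="unicode")
```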


    Transcript (returned by http://video.google.com/timedtext?lang=en&v=VIDEOID):

    <?xml version="1.0" encoding="utf-8" ?>
    <transcript>
     <text start="10" dur="2">Hi, I'm Emily from Nomensa</text>
     <text start="12" dur="3">and today I'm going to be talking about the order of content on your pages.</text>
     <text start="16" dur="6">Making sure the content on your web pages is presented logically is a really important part of web accessibility.</text>
     <text start="23" dur="2">Page content should be ordered so it makes sense</text>
    </transcript>
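
Unlike the other formats, the transcript stores a start time and a duration rather than a start and an end. A short sketch of the conversion follows. `parse_transcript` is a hypothetical name, not the library's Reader, and reading `start` and `dur` as seconds is an assumption based on the sample values.

```python
import xml.etree.ElementTree as ET


def parse_transcript(xml_text):
    """Return (start_seconds, end_seconds, text) tuples from transcript XML."""
    root = ET.fromstring(xml_text)
    cues = []
    for node in root.iter("text"):
        start = float(node.get("start"))
        # The end time is not stored; derive it from start + duration.
        end = start + float(node.get("dur"))
        cues.append((start, end, node.text or ""))
    return cues
```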


    Microsoft SAMI (.sami, .smi):

    <SAMI>
    <Head>
     <Title>President John F. Kennedy Speech</Title>
     <SAMIParam>
     Copyright {(C)Copyright 1997, Microsoft Corporation}
     Media {JF Kennedy.wav}
     Metrics {time:ms; duration: 73000;}
     Spec {MSFT:1.0;}
     </SAMIParam>
    </Head>

    <Body>
     <SYNC Start=0>
     <P Class=ENUSCC ID=Source>Pres. John F. Kennedy
     <SYNC Start=10>
     <P Class=ENUSCC>Let the word go forth,
     from this time and place to friend and foe
     alike that the torch
    </Body>
    </SAMI>
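
SAMI is not in the list of supported formats above, and its tags are usually left unclosed, so an XML parser will not accept it. The regex-based sketch below only pulls `(start_ms, text)` pairs out of a body like the sample. `parse_sami` is a hypothetical name, and the approach is a rough illustration rather than a complete SAMI parser.

```python
import re

# A <SYNC Start=N> tag and everything up to the next SYNC, </Body>, or the end.
SYNC = re.compile(
    r"<SYNC\s+Start=(\d+)\s*>(.*?)(?=<SYNC|</Body>|\Z)",
    re.IGNORECASE | re.DOTALL,
)
TAG = re.compile(r"<[^>]+>")


def parse_sami(text):
    """Return (start_ms, caption_text) pairs found in SAMI markup."""
    cues = []
    for match in SYNC.finditer(text):
        start_ms = int(match.group(1))
        # Drop the remaining markup (e.g. <P Class=...>) and collapse whitespace.
        caption = " ".join(TAG.sub("", match.group(2)).split())
        cues.append((start_ms, caption))
    return cues
```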



    Requirements:

    · Python

    What's New in This Release:

    · force rawcontent to be unicode

      


