Softpedia

    pdfid_PL 0.0.11b

    Developer: Philippe Lagadec and Didier Stevens
    License / Price: Public Domain / FREE
    Last Updated: September 23rd, 2010, 01:49 GMT
    Category: ROOT / Programming / Libraries


    pdfid_PL description

    A Python module to analyze and sanitize PDF files

    PDF files can carry active content that may be used to trigger malicious behavior. pdfid_PL is a modified version of pdfid.py, a Python tool written by Didier Stevens to analyze and sanitize PDF files.

    Developer comments

    Here is a version that I have slightly modified so that it can be imported as a module in Python applications (originally for ExeFilter).

    Modifications

    The modified version is named pdfid_PL.py. The main differences from the original tool are in the PDFiD function:

    def PDFiD(file, allNames=False, extraData=False, disarm=False, force=False,
     output_file=None, raise_exceptions=False, return_cleaned=False,
     active_keywords=ACTIVE_KEYWORDS):

    The following parameters have been added:

     * output_file: path of the output file to be created.
     * raise_exceptions: raise an exception when a parsing error occurs, instead of ignoring it.
     * return_cleaned: return a tuple (xmlDoc, cleaned), where cleaned is True if the PDF contained active content that has been cleaned.
     * active_keywords: list of PDF keywords to be disabled. Default value: ('/JS', '/JavaScript', '/AA', '/OpenAction', '/JBIG2Decode', '/RichMedia', '/Launch')

    All these parameters are optional, so that pdfid_PL.py runs exactly like the original pdfid.py when they are not set.
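To make the idea of "disabling" an active keyword concrete, here is a minimal, self-contained sketch. It assumes that a keyword can be neutralized by swapping the case of its letters, so that a PDF reader no longer recognizes the name while the file length stays unchanged. Whether pdfid_PL does exactly this internally is an assumption here; the function name disarm_bytes is hypothetical and is not part of pdfid_PL.

```python
import re

# Default keyword list quoted in the parameter description above.
ACTIVE_KEYWORDS = ('/JS', '/JavaScript', '/AA', '/OpenAction',
                   '/JBIG2Decode', '/RichMedia', '/Launch')


def disarm_bytes(data, active_keywords=ACTIVE_KEYWORDS):
    """Hypothetical helper: return (new_data, cleaned).

    Each active keyword found in `data` has the case of its letters
    swapped (for example /JS becomes /js). `cleaned` is True if at
    least one keyword was rewritten.
    """
    # Try longer names first so a short keyword never shadows a longer one.
    names = sorted((k.encode('ascii') for k in active_keywords),
                   key=len, reverse=True)
    # The lookahead keeps /AA from matching inside a longer name such as /AAPL.
    pattern = re.compile(
        b'(?:' + b'|'.join(re.escape(n) for n in names) + b')(?![A-Za-z0-9])')
    new_data, count = pattern.subn(lambda m: m.group(0).swapcase(), data)
    return new_data, count > 0


sample = b'<< /OpenAction 5 0 R /JS (app.alert(1)) >>'
print(disarm_bytes(sample))
# (b'<< /oPENaCTION 5 0 R /js (app.alert(1)) >>', True)
```

This sketch works on raw bytes and ignores details a real tool has to handle, such as hex-escaped names (for example /J#53) and compressed streams. Passing a shorter tuple as active_keywords mirrors the way the active_keywords parameter narrows the list of names to disable.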

    Sample usage

    import pdfid_PL as pdfid

    # Analyze file.pdf, disarm active content, and write the result to cleaned.pdf.
    xmldoc, cleaned = pdfid.PDFiD('file.pdf', disarm=True, output_file='cleaned.pdf',
                                  raise_exceptions=True, return_cleaned=True)
    if cleaned:
        print('PDF has been cleaned.')
    else:
        print('PDF is clean.')
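Because raise_exceptions=True lets parsing errors propagate to the caller, an application embedding the module may want to wrap the call. The sketch below is one possible way to do that. The wrapper name sanitize_pdf and its three status strings are hypothetical, and since the description does not say which exception type is raised, the sketch catches Exception broadly. The PDFiD function is passed in as an argument so the wrapper can be tried without the module installed.

```python
def sanitize_pdf(pdfid_func, src, dst):
    """Hypothetical wrapper: return 'cleaned', 'clean', or 'error'.

    `pdfid_func` is expected to behave like PDFiD from pdfid_PL when
    called with return_cleaned=True, i.e. to return (xmlDoc, cleaned).
    """
    try:
        _xmldoc, cleaned = pdfid_func(src, disarm=True, output_file=dst,
                                      raise_exceptions=True,
                                      return_cleaned=True)
    except Exception:
        # Assumption: the exception type is not documented, so any
        # failure during parsing is reported as 'error'.
        return 'error'
    return 'cleaned' if cleaned else 'clean'


# Stand-in for pdfid.PDFiD, used only to exercise the wrapper.
def fake_pdfid(path, **options):
    return ('<PDFiD/>', path.endswith('dirty.pdf'))


print(sanitize_pdf(fake_pdfid, 'dirty.pdf', 'out.pdf'))   # cleaned
print(sanitize_pdf(fake_pdfid, 'plain.pdf', 'out.pdf'))   # clean
```

With the real module imported as in the sample above, the call would be sanitize_pdf(pdfid.PDFiD, 'file.pdf', 'cleaned.pdf').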




    Requirements:

    · Python

    What's New in This Release:

    · Fixed a bug that happened when using return_cleaned

      

