Strigi Changelog

New in Strigi 0.6.4 (Feb 2, 2009)

  • Path fixes to the build system for the benefit of Windows users (sengels)
  • Clean up of class ArchiveReader
  • Support for LZMA compressed streams in archives, notably .deb and .rpm
  • Remove preceding ./ from file path in tar archives.
  • Make parsing ar and deb files easier to abort: useful in e.g. Dolphin
  • Better method of removing deleted files from the CLucene index
  • Do not tokenize the URL in the index to improve polling speed
  • Fix the bz2 header check: more bz2 archives are recognized (pino)
  • Fix infinite loop on parsing SGI image files
  • Fix reading of zip files without central directory.

New in Strigi 0.6.3 (Jan 13, 2009)

  • Move Strigi::DirLister in archivereader.h to ArchiveReader::DirLister. Two classes with this name were present in the code. The one in archivereader.h was not used in any code outside of Strigi, so we are changing it. Note that this change means that one should not use Strigi 0.6.2.
  • Change type of EntryInfo.mtime from 'unsigned' to time_t.
  • The spec of SDF files was found and used to implement a more precise syntax check for the header of SDF files.
  • Fix memory corruption bug in ArchiveReader.
  • Change type of ontology entry 'exposureTime' to string. In theory a type such as duration would make sense, but in practice xsd:string is the one that is used.
  • Add a default rule to find mail box directories with pattern '.*.directory'. Since these directory names start with a dot, they are normally not found.
  • Add '$HOME/.kde4' to the directories that are indexed by default.
  • Simplify matching of file paths in the rules for including or excluding directories from the index. The code is now more readable and easier to maintain.
  • Fix a big performance problem: Whenever a directory mtime changed, all files inside the directory were re-indexed.
  • Fix bug where a gz archive that contains a file that is identical to the original archive is indexed over and over. The depth of nested files that are indexed is now limited to 127.
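
The nesting limit in the last item can be illustrated with a minimal sketch. This is not Strigi's actual code: the helper countIndexedLevels, the callback hasEmbeddedStream, and the way levels are counted are assumptions made for illustration. Only the limit of 127 comes from the entry above.

```cpp
#include <functional>

// Limit taken from the changelog entry; how Strigi counts levels internally
// is an assumption of this sketch.
constexpr int kMaxNestingDepth = 127;

// Hypothetical helper, not Strigi API. 'hasEmbeddedStream(depth)' reports
// whether the stream at the given nesting depth contains another stream.
// Returns the number of nested levels that are descended into.
int countIndexedLevels(const std::function<bool(int)>& hasEmbeddedStream) {
    int depth = 0;
    // Stop either when nothing is embedded any more or when the cap is hit.
    while (depth < kMaxNestingDepth && hasEmbeddedStream(depth)) {
        ++depth;
    }
    return depth;
}
```

An archive that contains a copy of itself corresponds to a callback that always returns true. Without the cap the loop would never end; with it, descent stops after 127 levels.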

New in Strigi 0.6.2 (Jan 4, 2009)

  • Better support for nice IO priorities on Linux (Sebastian Trueg)
  • Compile with development version of CLucene (Ben van Klinken)
  • Explicitly use 'unsigned char' or 'signed char' instead of 'char' since 'char' can be either signed or unsigned on different processors. E.g. on ARM 'char' means 'unsigned char' and on i386 'char' means 'signed char'. This change makes libstreamanalyzer 0.6.2 binary incompatible with versions < 0.6.0. (Jos van den Oever)
  • Many CMake cleanups (Alexander Neundorf)
  • 6.5x speedup of C++ comment analyzer (Jakub Stachowski)
  • Various stability fixes (Jos van den Oever, Sebastian Trueg)
  • Support for ePub format (Jakub Stachowski)
  • Handle RIFF files with unspecified size for the RIFF packet. (Jos van den Oever)
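
The 'char' signedness item above matters whenever raw bytes are turned into numbers. The following is a minimal sketch of the idea, not code from Strigi; the function name readBigEndianUInt16 is hypothetical. Converting each byte to 'unsigned char' before combining gives the same result whether plain 'char' is signed (as on i386) or unsigned (as on ARM).

```cpp
#include <cstdint>

// Hypothetical helper, not part of Strigi: reads a big-endian 16-bit value
// from a byte buffer of at least two bytes.
std::uint16_t readBigEndianUInt16(const char* data) {
    // Convert explicitly so that a byte such as 0xFF is always 255 and is
    // never sign-extended to a negative int on platforms with signed 'char'.
    const unsigned char hi = static_cast<unsigned char>(data[0]);
    const unsigned char lo = static_cast<unsigned char>(data[1]);
    return static_cast<std::uint16_t>((hi << 8) | lo);
}
```

If the bytes were combined as plain 'char' values, a high byte with the top bit set would be promoted to a negative int on platforms where 'char' is signed, and the result would differ between architectures.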