Xapian and Omega Changelog

New in version 1.2.12 (June 28th, 2012)

  • This version fixes some minor bugs and adds a few new features.

New in version 1.2.4 (December 21st, 2010)

  • This version adds an API for merging and compacting databases, makes many improvements to Omega, includes new SWIG-based Perl bindings, and fixes some memory leaks in Python bindings.

New in version 1.0.22 (October 5th, 2010)

  • This version fixes several minor bugs, and improves portability.

New in version 1.0.21 (June 22nd, 2010)

  • Xapian-core API:
  • Xapian::Stem now recognises "nb" and "nn" as additional codes for the Norwegian stemmer.
  • Xapian::QueryParser now correctly parses a wildcarded term in between two other terms (ticket#484).
  • Matcher:
  • OP_OR could skip a matching document if it decayed to OP_AND or OP_AND_MAYBE during the match in some cases. Fixes ticket#476.
  • OP_XOR with non-leaf subqueries could skip matching documents in some cases, and OP_XOR of three or more sub-queries could return incorrect weights. Fixes ticket#475.
  • OP_OR is now more efficient if a subquery is potentially expensive (e.g. OP_VALUE_RANGE, OP_NEAR, OP_PHRASE, PostingSource). A 10-fold speed-up with OP_VALUE_RANGE has been observed.
  • Build System:
  • Install files to make Xapian easier to use with cmake.
  • Portability:
  • On x86 processors, Xapian now defaults to using SSE2 FP instructions. This avoids issues with excess precision and is a bit faster too. If you need to support processors without SSE2 (for Intel, this means pre-Pentium 4) then configure with --disable-sse. (ticket#387)

New in version 1.0.18 (February 15th, 2010)

  • QueryParser:
  • Improve support for languages such as Burmese which use Unicode enclosing mark and combining spacing mark characters.
  • Flint backend:
  • When updating documents, don't update posting entries which haven't changed (ticket#250).
  • Fix locking code to work if stdin and/or stdout have been closed.

New in version 1.0.13 (May 25th, 2009)

  • API:
  • Xapian::Document no longer ever stores empty values explicitly. This wasn't intentional behaviour, and how this case was handled wasn't documented. The amended behaviour is consistent with how user metadata is handled. This change isn't observable using Document::get_value(), but can be noticed when iterating with Document::values_begin(), using Document::values_count(), or trying to delete the value with Document::remove_value().
  • testsuite:
  • Fix testcase scaleweight4 not to fail on x86 when compiled with -O0. The problem was in the testcase code, and was caused by excess precision in intermediate FP values.
  • Testcases which check that operations have the expected O(...) behaviour now check CPU time instead of wallclock time on most platforms, which should eliminate occasional failures due to load spikes from other processes (ticket#308).
  • Fix test failures due to SKIP_TEST_FOR_BACKEND("inmemory") not skipping when it should due to comparing char strings with == (on trunk the return value being tested is std::string rather than const char *).
  • Improve test coverage in several corner cases.
  • Fix testcase consistency2 to actually be run (fortunately it passes).
  • In the generated testcases, call get_description() on the default constructed object of each class to make sure that works (and doesn't try to dereference NULL, or fail some assertion, etc). All currently checked classes are fine - this is to avoid future regressions or such problems with new classes.
  • In the test coverage build, use "--coverage" instead of "-fprofile-arcs -ftest-coverage".
  • The test harness now has the inmemory backend flagged as supporting user-specified metadata (apart from iteration over metadata keys).
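
The SKIP_TEST_FOR_BACKEND fix above comes down to a common C++ pitfall: applying == to two const char * values compares addresses, not text. The following is a minimal sketch of that pitfall, not code from the Xapian test harness; the names same_pointer, same_text, backend_name and demo are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: this is what `a == b` means for C strings. It
// compares the two addresses, so identical text held in two different
// buffers compares as "not equal".
inline bool same_pointer(const char* a, const char* b) {
    return a == b;
}

// Hypothetical helper: compares the characters themselves.
inline bool same_text(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// Hypothetical stand-in for a function that returns a backend name as
// std::string. std::string's operator== compares contents, so a plain ==
// against a string literal is safe here.
inline std::string backend_name() {
    return "inmemory";
}

// Usage example: two separate buffers holding the same text.
inline void demo() {
    char a[] = "inmemory";
    char b[] = "inmemory";
    assert(!same_pointer(a, b));           // distinct arrays, distinct addresses
    assert(same_text(a, b));               // same characters
    assert(backend_name() == "inmemory");  // std::string compares by content
}
```

Calling demo() from a main() function exercises all three cases. Comparing via std::strcmp or std::string gives the same answer whichever buffers the text happens to live in.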
  • matcher:
  • If a query contains a MatchAll subquery, check for it before checking the other terms so that the loop which checks how many terms match can exit early if they all match.
  • When an OR or AND_MAYBE decayed to an AND, we were carefully swapping the children for maximum efficiency, but the condition was reversed so we were in fact making things worse. This was noticed because it was resulting in the same query running faster when more results were asked for!
  • Only build the termname to termfreq and weight map for the first subdatabase instead of rebuilding it for each one. Also don't copy this map to return it. This should speed up searches a little, especially those over multiple databases.
  • If a submatcher fails but ErrorHandler tells us to continue without it, we just use a NULL pointer to stand in rather than allocating a special dummy place-holder object.
  • Remove AndPostList, in favour of MultiAndPostList. AndPostList was only used as a decay product (by AndMaybePostList and OrPostList), and doesn't appear to be any faster. Removing it reduces CPU cache pressure, and is less code to maintain.
  • Call check() instead of skip_to() on the optional branch of AND_MAYBE.
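
Why the order of the children of an AND matters can be seen in a small generic sketch. This is not Xapian's matcher code, and the changelog does not say which ordering is the efficient one. The sketch assumes the usual arrangement, where the list with fewer entries drives the intersection and the other list is only asked to skip forward. The names AndResult and intersect are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical result type: the matching docids plus the number of
// "skip to" operations that were needed to find them.
struct AndResult {
    std::vector<unsigned> docids;
    std::size_t skips;
};

// Intersect two sorted docid lists. `driver` is walked entry by entry;
// `other` is advanced with a binary-search "skip to" for each driver entry.
inline AndResult intersect(const std::vector<unsigned>& driver,
                           const std::vector<unsigned>& other) {
    AndResult result{{}, 0};
    auto pos = other.begin();
    for (unsigned did : driver) {
        // Skip forward in `other` to the first entry >= did.
        pos = std::lower_bound(pos, other.end(), did);
        ++result.skips;
        if (pos == other.end()) break;  // `other` is exhausted
        if (*pos == did) result.docids.push_back(did);
    }
    return result;
}
```

With a short list and a long list, intersect(short_list, long_list) and intersect(long_list, short_list) return the same docids, but the first performs at most one skip per entry of the short list. Getting the swap condition backwards therefore costs time without changing the results, which is consistent with the symptom described above.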
  • flint backend:
  • Fix a bug in TermIterator::skip_to() over metadata keys.
  • remote backend:
  • Fix xapian-tcpsrv --interface option to work on MacOS X (ticket#373).
  • Fix typo which caused us to return the docid instead of the maximum weight a document from a remote match could return! This could have led to wrong results when searching multiple databases with the remote backend, but probably usually didn't matter as with BM25 the weights are generally small (often all < 1) while docids are inevitably >= 1.
  • inmemory backend:
  • The inmemory backend doesn't support iterating over metadata keys. Trying to do so used to give an empty iteration, but has now been fixed to throw UnimplementedError (and this limitation has now been documented).
  • build system:
  • Remove a lot of unused header inclusions and some unused code which should make the build faster and slightly smaller.
  • Fix to compile under --disable-backend-flint, --disable-backend-remote, and --disable-backend-inmemory.
  • Don't remove any built sources in "make clean" even under --enable-maintainer-mode as that breaks switching a tree away from maintainer-mode with: make distclean;./configure
  • configure: Enable more GCC warnings - "-Woverloaded-virtual" for all versions, "-Wstrict-null-sentinel" for 4.0+, "-Wlogical-op -Wmissing-declarations" for 4.3+. Notably "-Wmissing-declarations" caught that consistency2 wasn't being run.
  • Internally, fix the few places where we pass std::string by value to pass by const reference instead (except where we need a modifiable copy anyway) as benchmarking shows that const reference is slightly faster and generates less code with GCC's reference counted std::string implementation - with a non-reference counted implementation, const reference should be much faster. (ticket#140)
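
The std::string change above follows a general C++ guideline, which the sketch below illustrates. The function names are hypothetical and are not taken from the Xapian source; the point is only the difference between the three signatures.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical: takes the string by value, so every call constructs a
// copy of the argument even though the function only reads it.
inline std::size_t count_spaces_by_value(std::string s) {
    std::size_t n = 0;
    for (char c : s) {
        if (c == ' ') ++n;
    }
    return n;
}

// Hypothetical: same logic, but taking a const reference, so no copy of
// the argument is made.
inline std::size_t count_spaces(const std::string& s) {
    std::size_t n = 0;
    for (char c : s) {
        if (c == ' ') ++n;
    }
    return n;
}

// Hypothetical: the exception mentioned above. The function needs its own
// modifiable copy anyway, so taking the parameter by value is appropriate.
inline std::string lowercased(std::string s) {
    for (char& c : s) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return s;
}
```

Both counting functions return the same result for the same input; only the cost of passing the argument differs. How large that difference is depends on the std::string implementation, as the entry above notes.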
  • documentation:
  • INSTALL: We no longer regularly test build with GCC 2.95.4 and we're raising the minimum GCC version required to 3.1 for Xapian 1.1.x.
  • Document what passing maxitems=0 to Enquire::get_mset() does.
  • docs/queryparser.html: Add examples of using a prefix on a phrase or subexpression.
  • Correct doxygen comments for user metadata functions: Database::get_metadata() can't throw UnimplementedError but WritableDatabase::set_metadata() can.
  • Document that Database::metadata_keys_begin() returns an end iterator if the backend doesn't support metadata.
  • HACKING: Update the list of Debian/Ubuntu packages needed for a development environment.
  • debug code:
  • Fix build with --enable-debug.
  • Added some more assertions.

New in version 1.0.11 (March 18th, 2009)

  • This release fixes several bugs and adds support to Omega for indexing MS Office 2007 formats and XPS files.

New in version 1.0.10 (December 26th, 2008)

  • This release fixes a possible case of database corruption if the disk fills up while writing out changes.
  • The lockfile for a flint database is now created using the umask setting.
  • Previously, it wasn't possible to open a flint database for update if it was owned by another user, even with sufficient permissions via "group" or "other".
  • Composing an OP_NEAR query with two non-term subqueries now throws UnimplementedError instead of AssertionError.

New in version 1.0.9 (November 2nd, 2008)

  • Spelling correction is now even faster. (A 15% speed up was measured.)
  • Two bugs caused by excess precision on x86 Linux have been fixed.
  • Query::MatchAll now gives equal weights to all documents.
  • A crash while compacting the spelling table has been fixed.
  • The copydatabase example now copies user metadata too.
  • The omega CGI binary now catches and reports std::exception.