Recoll Changelog

What's new in Recoll 1.21.2

Oct 9, 2015

Added GUI dialog to perform partial indexing.
Advanced search in "Any Clause" mode: fixed the directory filter, which would not filter but add an ORed clause.
Fixed bogus syntax errors about parentheses around phrases.
Fixed a few boundary conditions detected by VC++.
Misc other small fixes, see commit log.

New in Recoll 1.20.4 (Apr 2, 2015)

New in Recoll 1.20.1 (Dec 24, 2014)

An Open With entry was added to the result list and result table popup menus. This lets you choose an alternative application to open a document. The list of applications is built from the information inside the /usr/share/applications desktop files.
A new way for specifying multiple terms to be searched inside a given field: it used to be that an entry lacking whitespace but splittable, like [term1,term2], was transformed into a phrase search, which made sense in some cases, but not so many. The code was changed so that [term1,term2] now means [term1 AND term2], and [term1/term2] means [term1 OR term2]. This is useful for field searches where you would previously be forced to repeat the field name for every term. [somefield:term1 somefield:term2] can now be expressed as [somefield:term1,term2].
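The equivalence described above can be sketched in a few lines of Python. The function name expand_field_clause is hypothetical and is not part of Recoll, which performs this expansion internally; the sketch only shows what the compact form stands for.

```python
def expand_field_clause(clause):
    """Expand 'field:a,b' or 'field:a/b' into the explicit query it stands for.

    Hypothetical helper for illustration only, not a Recoll function.
    """
    field, _, value = clause.partition(":")
    if "," in value:
        # Comma-separated terms must all match (AND is implicit between clauses).
        return " ".join(f"{field}:{term}" for term in value.split(","))
    if "/" in value:
        # Slash-separated terms are alternatives.
        return " OR ".join(f"{field}:{term}" for term in value.split("/"))
    # A single term needs no expansion.
    return clause


print(expand_field_clause("somefield:term1,term2"))
print(expand_field_clause("somefield:term1/term2"))
```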
(1.20.1) The Query Fragments tool was added to the GUI. This is a window with customizable buttons to add arbitrary query language fragments to the current search. The buttons and fragments are defined in an xml file inside the recoll configuration directory ~/.recoll/fragbuts.xml. This makes it easy to define "pre-cooked" filters for things that you need repeatedly. See the manual for more details.
We changed the way terms are generated from a compound string (e.g. an email address). Previously, for an address like jfd@recoll.org, only the simple terms and the terms anchored at the start were generated (jfd, recoll, org, jfd@recoll, jfd@recoll.org). The new text splitter generates all the other possible terms (here, recoll.org only), so that it is now possible to search for left-truncated versions of the compound, e.g., all emails from a given domain.
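The difference between the two term sets can be reproduced with a small sketch. It is not the Recoll text splitter: the function names are hypothetical, only "@" and "." are treated as separators, and the address jfd@recoll.org is assumed from the term list given above.

```python
import re


def split_compound(text):
    """Return the words and the separators of a compound such as an email address."""
    return re.split(r"[@.]", text), re.findall(r"[@.]", text)


def span(words, seps, first, last):
    """Rebuild the original text covering words[first..last], inclusive."""
    out = words[first]
    for k in range(first + 1, last + 1):
        out += seps[k - 1] + words[k]
    return out


def anchored_terms(text):
    """Old behaviour as described: simple terms plus spans anchored at the start."""
    words, seps = split_compound(text)
    return list(words) + [span(words, seps, 0, j) for j in range(1, len(words))]


def all_terms(text):
    """New behaviour as described: every contiguous span of the compound."""
    words, seps = split_compound(text)
    n = len(words)
    return [span(words, seps, i, j) for i in range(n) for j in range(i, n)]


address = "jfd@recoll.org"  # assumed example address
print(sorted(set(all_terms(address)) - set(anchored_terms(address))))
```

The only term printed is the one missing from the start-anchored set, which is the term that makes a search by domain possible.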
(1.20.1) New keyboard accelerators for the result table: Ctrl+r switches the focus from the search entry to the table, Ctrl+o opens the document for the current line, Ctrl+Shift+o opens document and closes recoll, Ctrl+d previews the document.
(1.20.1) A special term is now indexed for results from the web history: use "-rclbes:BGL" to exclude the web results, "rclbes:BGL" to restrict the results to the web ones. This is difficult to remember, but the Query Fragments feature means that you don't need to (this is in the sample Query Fragments file).
Recoll now indexes #hashtags as such.
It is now possible to configure the GUI in wide form factor by dragging the toolbars to one of the sides (their location is remembered between sessions), and moving the category filters to a menu (can be set in the "Preferences->GUI configuration" panel).
We added the indexedmimetypes and excludedmimetypes variables to the configuration GUI, which was also compacted a bit. A bunch of uninteresting variables were also removed.
When indexing, we no longer add the top container file name as a term for the contained sub-documents (if any). This made no sense in most cases, as it meant that you would get hits on all the sections from a chm or epub when the top file name matched the search, when you probably wanted only the parent document in this case.
However, the container file name was sometimes useful for filtering results, and it is still accessible, in a different way: the top container file name is added as a term to all the sub-documents, only for searching with a prefix. The field name is containerfilename, and no match on the subdocuments will occur if the field is not specified (this is different from previous filename processing, which was indexed as a general term). containerfilename is also set on files without sub-documents (e.g. a pdf).
A new attribute, pfxonly, was created to support the above change. This can be set on any metadata field inside the [prefixes] section of the fields file. The affected field terms will be indexed only with a prefix, so they will cause a hit only for a field search (the general behaviour is that field terms are indexed both prefixed and not, so they can also cause a hit when searched as general terms).
A new [queryaliases] section was created in the fields file, for defining field name aliases to be used only at query time (to avoid unwanted collection of data on random fields during indexing). The section is empty by default, but 2 obvious aliases are commented: filename=fn and containerfilename=cfn. Setting them in your personal file may save you some typing if you search on file names.
You can now use both -e and -i for erasing then updating the index for the given file arguments with the same recollindex command.
We now allow access to the Xapian docid for Recoll documents in recollq and Python API search results. This allows writing scripts which combine Recoll and pure Xapian operations. A sample Python program to find document duplicates, using MD5 terms was added. See src/python/samples/docdups.py
The command used to identify the mime types of files when the internal method fails is file -i by default. It is now possible to customize this command by setting the systemfilecommand variable in the configuration. A suggested value would be xdg-mime, which sometimes works better than file.
The result list has two new elements: %P substitution for printing the parent folder name, and an F link target which will open the parent folder in a file manager window. e.g. Open parent directory
/media was added to the default skippedPaths list mostly as a reminder that blindly processing these with the general indexer is a bad idea (use separate indexes instead).
recollq and recoll -t get a new option -N to print field names between values when -F is used. In addition, -F "" is taken as a directive to print all fields.
Unicode hyphen (0x2010) is now translated to ASCII minus during indexing and searching. There is no good way to handle this character, given the various misuses of minus and hyphen. This choice was deemed "less bad" than the previous one.
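The effect of this translation can be illustrated with plain Python. This is only a sketch of the mapping, not Recoll's code, and the name normalize_hyphen is hypothetical.

```python
# Map U+2010 (HYPHEN) to the ASCII minus sign; everything else is left alone.
HYPHEN_MAP = str.maketrans({"\u2010": "-"})


def normalize_hyphen(text):
    """Return text with every Unicode hyphen replaced by ASCII minus."""
    return text.translate(HYPHEN_MAP)


print(normalize_hyphen("well\u2010known"))
```

Applying the same mapping to both indexed text and query text is what makes the two spellings match.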

New in Recoll 1.19.14 (Jun 10, 2014)

New in Recoll 1.19.13 (May 7, 2014)

New in Recoll 1.19.11 (Nov 30, 2013)

New in Recoll 1.19.9 (Nov 12, 2013)

New in Recoll 1.19.2 (May 14, 2013)

New in Recoll 1.18.1 (Nov 5, 2012)

New in Recoll 1.17.3 (May 25, 2012)

New in Recoll 1.17.2 (May 18, 2012)

New in Recoll 1.17.0 (Mar 26, 2012)

New in Recoll 1.16.2 (Nov 8, 2011)

New in Recoll 1.16.1 (Sep 29, 2011)

New in Recoll 1.16.0 (Sep 21, 2011)

Images are displayed in preview. You can get at the fields and complete extracted text using the popup menu.
The preview window popup menu has a "save to file" entry to write a subdocument (e.g. a mail attachment) to a file.
The GUI advanced search panel allows specifying a field for each entry (e.g. author/recipient, etc).
It is now possible to anchor searches to the beginning or end of the text or field, by using ^ and $ characters at the beginning or the end of a term or phrase. A maximum distance can be specified as a phrase slack either in the advanced search panel, or as a query language modifier. For example, "^beginterm"o10 would search for beginterm within 10 terms of the beginning of the text. This feature was suggested to me (thanks Gökhan), for searching for a name at the beginning of a text (in the author list, as opposed to anywhere in the text). This is useful for example in the very common case where the metadata for the author list was not created. More details about this feature are to be found in the user manual.
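The idea of an anchored match with slack can be modelled on a plain list of terms. This is a simplified sketch, not Recoll's implementation: the function names are hypothetical, and it assumes that a slack of 0 means the term must sit exactly at the first (or last) position.

```python
def near_start(terms, term, slack=0):
    """True if term occurs within slack positions of the start of terms."""
    return term in terms[: slack + 1]


def near_end(terms, term, slack=0):
    """True if term occurs within slack positions of the end of terms."""
    return term in terms[-(slack + 1):]


text = "smith jones a study of indexing".split()
print(near_start(text, "jones", slack=10))    # author name close to the start
print(near_start(text, "indexing", slack=2))  # too far from the start
print(near_end(text, "indexing"))             # exactly at the end
```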
It is possible to configure the result list snippet separator, given as an html fragment. This is an ellipsis by default (…).
We can now perform negative directory filtering (-dir:/some/dir), to return all results except those from the specified directory (recursive). Other attempts at still impossible negative searches (e.g. -mime:) now cause explicit error messages instead of lame results. The inverted directory filtering is accessible from the query language and by checking a checkbox in the advanced search panel.
Result table:
The detail area now has a popup menu similar to the one in the result list (open parent, save to disk etc.).
The result table header popup menu has an entry to save the table as a CSV file.
Estimated result counts are displayed in the status line.
Set row height according to default font size, and better adjust row height and vertical text position in cells.
It is now possible to set an increased weight for indexing some fields. The title field gets a boost by default. See the fields default file for details.
The query language allows setting weights on terms, as in: "important"2.5 .
Improved preservation of indentation for text files displayed in the preview window.
Show hidden (dot) files in the indexing configuration GUI dialogs.
Added filters for .war (Konqueror web archive), .mhtm (other web archive format) and rar archives.
Improved handling for native cjk punctuation signs.
Updated the list of native apps in the default mimeview (e.g. xv->gwenview, rox->dolphin, etc.)
Added -f option to recollindex to ignore skippedPaths/Names when used with -i. Allows the use of a purely external file selection mechanism.
The performance of email indexing has been slightly improved (less CPU usage).
Real time indexer: several configuration parameters allow adjusting the timing of indexing actions:
monauxinterval: the interval between auxiliary databases rebuilds (stemdb, aspell).
monixinterval: The waiting period during which indexing events are accumulated prior to actual indexing (saves work on duplicate events).
mondelaypatterns: a list of file patterns for which indexing should be delayed longer (quick changing files like logs that should be reindexed much slower than they change).
See the default configuration file for more detail.
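Why accumulating events before indexing saves work can be shown with a toy model. This is not the real-time indexer's code: the function name coalesce is hypothetical, and events are reduced to bare file paths collected during one waiting period.

```python
def coalesce(paths):
    """Collapse repeated paths from one waiting period, keeping first-seen order."""
    return list(dict.fromkeys(paths))


# Four change events arrive during the waiting period, three for the same file.
events = ["/home/me/notes.txt", "/home/me/todo.txt",
          "/home/me/notes.txt", "/home/me/notes.txt"]
print(coalesce(events))
```

Only two files are left to index, instead of four indexing passes.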
Fixed bugs:
UTF-8 paths inside ZIP archives were mishandled. Also fixes problem with colons inside archive member paths.
Fixed GUI result list doc parent operations (open/preview) which were broken in 1.15.
Fixed case where indexing could hang or crash after an error occurred while indexing an archive member (which should have affected only the relevant document).
Real time indexer: uncontrolled concurrent access to the global configuration could cause a startup crash (mostly on big file trees because of timing issues).
Fixed sorting by document and file size in the result table.
Email messages for which there would be an error indexing an attachment would not be indexed at all.
Text files bigger than 2 GB could not be indexed.
Fixed the handling of compressed man pages.
Memory usage could grow almost unbounded while deleting documents, because idxflushmb was not used for document deletions.

New in Recoll 1.15.9 (May 30, 2011)

New in Recoll 1.15.8 (May 4, 2011)

New in Recoll 1.15.5 (Mar 7, 2011)

New in Recoll 1.15.2 (Feb 15, 2011)

New in Recoll 1.15.1 (Feb 4, 2011)

New in Recoll 1.14.3 (Nov 25, 2010)

New in Recoll 1.14.0 (Sep 20, 2010)

New in Recoll 1.13.04 (Apr 16, 2010)

New in Recoll 1.13.0 (Jan 6, 2010)

Recoll has a new class of persistent external filters with the capability to process several documents, or multi-document files, in the same instance. Benefits: much faster image tag indexing, and new file formats. Except for the Perl image tag filter (because of ExifTool), the new filters are written in Python.
New file formats: chm (Microsoft help), zip archives, .ics calendar files. Individual pages in chm files are indexed and can be previewed. Zip is quite convenient for maildir archives (for example).
Recoll can now use the output of the Beagle Firefox plugin to index visited web pages and bookmarks. This is only usable if Beagle itself is not running, else Recoll and Beagle will be fighting for the same queue.
Big text files (like application logs) can now be paged for indexing, avoiding excess memory usage during indexing and improving the usability at query time. They can also be altogether skipped by setting a maximum size configuration parameter. These parameters have default values (1 MB and 20 MB) which change Recoll behaviour compared to previous versions. You can set textfilepagekbs and textfilemaxmbs to -1 in the configuration to restore the old behaviour.
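Paging can be pictured as reading the file in fixed-size chunks, each handled as a separate unit. The sketch below is an assumption-laden illustration, not Recoll's code: iter_pages is a hypothetical name, it cuts at raw byte counts without regard to line boundaries, and it treats a non-positive page size as "no paging", in line with the -1 setting mentioned above.

```python
import io


def iter_pages(stream, page_kbs):
    """Yield successive chunks of at most page_kbs kilobytes from a binary stream.

    A non-positive page_kbs disables paging: the whole content is yielded once.
    """
    if page_kbs <= 0:
        yield stream.read()
        return
    size = page_kbs * 1024
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield chunk


data = b"x" * 2500
print([len(page) for page in iter_pages(io.BytesIO(data), 1)])
```

Memory use is bounded by the page size rather than by the file size, which is the point of the feature.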
A cache was implemented for mbox message header offsets. This speeds up message previews for big mbox files.
Miscellaneous usability improvements:
Allow using page-up/down and shift-home to scroll the result list while the focus is in the search entry.
Make 'Use desktop preferences' the default for new Recoll installations, and make this choice more prominent in the external viewer dialog.
^P starts the print dialog on a preview window.
If a search has no result, alternate spellings are suggested. This feature is still a bit raw and will be improved.
If the text of a document is empty, preview will switch to displaying the document fields.
New entry in the result list contextual menu for opening the parent document of a result list hit with its native application. Useful for example for pages inside chm files.
Indentation is now preserved when displaying text documents inside the preview window. This is particularly welcome for program source files.
Allow substituting arbitrary fields in the result paragraph, using a %(fieldname) syntax
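The %(fieldname) substitution can be mimicked with a regular expression. This is a sketch only: substitute_fields is a hypothetical name, field names are assumed to consist of word characters, and unknown fields are assumed to expand to an empty string.

```python
import re

# Matches %(name) and captures the field name.
FIELD_REF = re.compile(r"%\((\w+)\)")


def substitute_fields(fmt, fields):
    """Replace every %(name) in fmt with fields[name], or '' if it is missing."""
    return FIELD_REF.sub(lambda m: fields.get(m.group(1), ""), fmt)


doc = {"title": "Annual report", "author": "jf"}
print(substitute_fields("%(title) by %(author)", doc))
```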
The real-time indexing monitor will now accumulate modifications for 30 seconds before indexing.
The indexer can now split camelCase words, allowing search on component terms. This is not enabled by default as it can confuse phrase searches (e.g. "MySQL manual" is matched by phrase queries for "my sql manual" and "MySQL manual" but not "mysql manual"). Use "configure --enable-camelcase" to activate it.
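The phrase search effect described above follows from how a camelCase word turns into several terms. The sketch below uses a regular expression chosen for illustration; it is not Recoll's splitting algorithm, and split_camel_case is a hypothetical name.

```python
import re

# An uppercase run not followed by lowercase, or an optionally capitalized
# lowercase run, or a run of digits.
WORD_PARTS = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_camel_case(word):
    """Split a camelCase word into lowercased component terms."""
    return [part.lower() for part in WORD_PARTS.findall(word)]


print(split_camel_case("MySQL"))
print(split_camel_case("mysql"))
```

With splitting enabled, "MySQL" is indexed as two consecutive terms, while a query for "mysql" yields a single term, so the two phrases no longer line up.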
The ipath is now printed by default after the url in the default result list format.
recoll_noindex and skippedNames can now be changed at any point in the tree (only for topdirs previously).
Allow using location/application sensitivity in external viewer choice. This uses several new functions:
Allow the substitution of arbitrary document fields inside external viewer command line arguments.
Allow field values to be set on all documents in a file system subtree. For example, you can set an application tag (e.g. rclaptg = gnus) on all mailbox files under a specific directory.
New syntax in mimeview for including the rclaptg field in viewer choice (mimetype|tagvalue = ...).
Allow specifying a specific default character set for mail messages. This is mainly useful for readpst dumps. All reasonable non-ascii messages specify their character set.
Added a --without-gui configure option. Removes all X11 and Qt dependencies and only compiles the command-line interface.
Improved the kio_recoll build. There is no need to run configure manually in the main directory any more. Ubuntu packages for kio_recoll are now built on the recoll-backports PPA on launchpad.net.

New in Recoll 1.12.3 (Oct 29, 2009)

New in Recoll 1.12.2 (Oct 22, 2009)

New in Recoll 1.12.1 (Jul 22, 2009)

New in Recoll 1.12.0 (Feb 11, 2009)

New in Recoll 1.11.0 (Oct 25, 2008)

New in Recoll 1.10.5 (Sep 2, 2008)