Software:Recoll

Short description: Desktop search tool
Recoll
Developer(s): Jean-François Dockes
Stable release: 1.31.0 / March 7, 2022[1]
Written in: C++ and Python
Operating system: Unix-like, Windows, OS/2
Type: Search tool
License: GPL
Website: www.lesbonscomptes.com/recoll/

Recoll is a desktop search tool that provides full-text search (from single-word queries to arbitrarily complex Boolean searches) through a GUI, with few mandatory external dependencies. It runs under many Unix-like operating systems and is mostly independent of the desktop environment. Recoll has been ported to OS/2,[2] and is planned for integration into the OS/2-based ArcaOS.[3]

Recoll was designed not to require a permanent daemon. It updates its index at scheduled intervals (for example through cron jobs), but if desired, the indexing task can run as a file-system monitoring daemon for real-time index updates; on Linux systems this monitoring can make use of inotify.[4]
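
The batch mode can be illustrated with a minimal Python sketch of the kind of pass a cron job might trigger at intervals. This is not Recoll's implementation: the function name `find_changed_files` and the in-memory `seen` dictionary are hypothetical, and a real indexer would keep this state in its index. A monitoring daemon would instead react to change notifications (inotify on Linux) rather than rescanning the tree.

```python
import os


def find_changed_files(top, seen):
    """Return paths under `top` that are new or modified since the last pass.

    `seen` maps path -> (mtime_ns, size) from the previous pass and is
    updated in place. Hypothetical helper, for illustration only.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            # A file counts as changed when its mtime or size differs.
            signature = (st.st_mtime_ns, st.st_size)
            if seen.get(path) != signature:
                seen[path] = signature
                changed.append(path)
    return sorted(changed)
```

Calling the function twice in a row on an unchanged tree returns an empty list the second time, which is why a periodic batch pass stays cheap when few files change.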

Features

  • Qt GUI.
  • Xapian backend.
  • Indexes the contents of many document types: text, HTML, email stores of all kinds, OpenDocument, Microsoft Office and Office Open XML, AbiWord, KWord, Gaim, Lyx, Scribus, PDF, WordPerfect, PostScript, RTF, TeX, DVI, DjVu, MP3 and other audio file formats, JPEG and other image file formats.[5]
  • Recursively processes embedded documents (email attachments, Zip archives) to arbitrary depths.
  • Query facilities with Boolean searches, wildcards, phrases, proximity, and filtering on file types and directory trees. GUI tool for building Boolean searches.
  • Xesam query language support.
  • Word stemming is performed at query time (can switch stemming language after indexing).
  • Multiple indexes selectable at query time (e.g. personal + system indexes).
  • Natively based on Unicode. Supports many languages and character sets, including good support for East Asian texts (CJK).
  • MD5 document hashes for the elimination of duplicates in results.
  • Batch and real-time indexing modes.
  • Python API.
  • GNOME Shell search provider, web interface, and Firefox history extensions.
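
The idea behind hash-based duplicate elimination in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Recoll's code: the function name `dedupe_by_md5` and the `(url, content_bytes)` pair layout are hypothetical.

```python
import hashlib


def dedupe_by_md5(results):
    """Keep the first result for each distinct content hash.

    `results` is an iterable of (url, content_bytes) pairs; both the
    pair layout and this function name are assumptions for the sketch.
    """
    seen_hashes = set()
    unique = []
    for url, content in results:
        digest = hashlib.md5(content).hexdigest()
        # Identical content produces the same digest, so later copies are dropped.
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            unique.append(url)
    return unique
```

Two results with byte-identical content collapse to the first one seen, while results with different content are all kept.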

File types supported

File types indexed natively

  • Text.
  • HTML.
  • Maildir, MH, and mailbox (Mozilla, Thunderbird, and Evolution mail are supported). Evolution note: to index local copies of IMAP mail, be sure to remove .cache from the skippedNames list in the GUI Indexing preferences/Local Parameters pane.
  • Gaim and purple log files.
  • Scribus files.
  • Man pages (needs Groff).
  • Mimehtml web archive format (support based on the mail filter, which introduces some mild weirdness, but is still usable).
  • All the following need Python 3:
      • Dia diagrams.
      • Excel and PowerPoint (pre-Open XML).
      • Tar archives. Tar file indexing is disabled by default (because tar archives don't typically contain the kind of documents that people search for). To enable it explicitly, add the following to your $HOME/.recoll/mimeconf file:
         [index]
         application/x-tar = execm rcltar
      • Zip archives.
      • Konqueror web archive format (uses the tarfile Python standard library module).
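
As a rough illustration of how the `tarfile` standard library module mentioned above can expose the documents inside an archive, here is a minimal sketch. The helper name `iter_tar_documents` is hypothetical, and the code is not taken from Recoll's own handlers.

```python
import io
import tarfile


def iter_tar_documents(data):
    """Yield (member_name, content_bytes) for each regular file in a tar archive.

    `data` holds the raw archive bytes. Hypothetical helper, not Recoll's handler.
    """
    # "r:*" lets tarfile detect the compression (none, gzip, bz2, xz) itself.
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as archive:
        for member in archive:
            # Skip directories, links, and other non-regular members.
            if member.isfile():
                extracted = archive.extractfile(member)
                yield member.name, extracted.read()
```

Each yielded pair could then be handed to whatever handler matches the member's own type, which is how an indexer would reach documents nested inside an archive.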

File types indexed with external helpers

  • PDF files.
  • MS-Word files.
  • WordPerfect files.
  • RTF files.
  • Image and audio file tags.
  • AbiWord files.
  • FB2, EPUB, and CHM ebooks.
  • KWord files.
  • Microsoft Office traditional and Open XML files.
  • OpenOffice files.
  • SVG files.
  • Okular annotation files.
  • HWP files (without page numbering).
