Whistlepig: minimalist realtime full-text search

Whistlepig is a minimalist realtime full-text search index. Its goal is to be as small and feature-free as possible, while still remaining useful, performant and scalable to large corpora. If you want realtime full-text search without the frills, Whistlepig may be for you.

Whistlepig is written in ANSI C99. It currently provides a C API and Ruby bindings.

Latest version: 0.11.1, released 2012-04-19.
Status: beta
News: http://all-thing.net/label/whistlepig/
Homepage: http://masanjin.net/whistlepig/
Bug reports: GitHub issue tracker

Getting it

Tarball: whistlepig-0.11.1.tar.gz
Rubygem: gem install whistlepig
Git: git clone git://github.com/wmorgan/whistlepig.git

Realtime search

Roughly speaking, realtime search means:

Whistlepig takes these principles to an extreme. Features that Whistlepig does provide:

Synopsis (Ruby bindings)

require 'rubygems'
require 'whistlepig'

include Whistlepig

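index = Index.new "index"

# Add two documents. add_entry returns the id of the new document.
entry1 = Entry.new
entry1.add_string "body", "hello there bob"
docid1 = index.add_entry entry1              # => 1

entry2 = Entry.new
entry2.add_string "body", "goodbye bob"
docid2 = index.add_entry entry2              # => 2

# Search the "body" field. Results come back newest first.
q1 = Query.new "body", "bob"
results1 = index.search q1                   # => [2, 1]

# Queries can be combined; this one requires both terms.
q2 = q1.and Query.new("body", "hello")
results2 = index.search q2                   # => [1]

# Labels can be attached after indexing and matched with ~label.
index.add_label docid2, "funny"

q3 = Query.new "body", "bob ~funny"
results3 = index.search q3                   # => [2]

# An entry can carry several fields.
entry3 = Entry.new
entry3.add_string "body", "hello joe"
entry3.add_string "subject", "what do you know?"
docid3 = index.add_entry entry3              # => 3

# A field prefix in the query string overrides the default field ("body").
q4 = Query.new "body", "subject:know hello"
results4 = index.search q4                   # => [3]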

API Documentation

For the Ruby bindings, see the Whistlepig Ruby API documentation.

For the query language, see the Whistlepig Query documentation.

For the C API, see the code.

A note on concurrency

Whistlepig is currently single-process and single-threaded only. However, it is built with multi-process access in mind. Per-segment single-writer, multi-reader support is planned for the near future. Multi-writer support can be accomplished via index striping and is planned for the distant future.
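To make the idea of index striping concrete, here is a minimal sketch in plain Ruby. It is not part of Whistlepig and does not use its API: ToyIndex is a hypothetical in-memory stand-in for an index, and StripedIndex, the round-robin routing, and the [stripe, docid] result pairs are all assumptions made for illustration. Each stripe is an independent index, so each one could in principle be owned by a single writer.

```ruby
# Hypothetical sketch of index striping. None of these classes are part
# of Whistlepig; they only illustrate the idea.

# Stand-in for a real index: it assigns docids starting at 1 and returns
# matches newest first, mirroring the behaviour shown in the synopsis.
class ToyIndex
  def initialize
    @docs = []
  end

  # Stores the text as a list of words and returns its 1-based docid.
  def add_entry(text)
    @docs << text.split
    @docs.size
  end

  # Returns the docids of entries containing the term, newest first.
  def search(term)
    ids = (1..@docs.size).select { |id| @docs[id - 1].include?(term) }
    ids.reverse
  end
end

# Spreads writes across several independent indexes (stripes).
class StripedIndex
  def initialize(stripes)
    @stripes = stripes
    @next = 0
  end

  # Round-robin routing (an assumption; any routing rule would do).
  # Returns [stripe number, docid within that stripe].
  def add_entry(entry)
    stripe = @next
    @next = (@next + 1) % @stripes.size
    [stripe, @stripes[stripe].add_entry(entry)]
  end

  # Queries every stripe and concatenates the per-stripe results.
  def search(query)
    @stripes.each_with_index.flat_map do |index, stripe|
      index.search(query).map { |docid| [stripe, docid] }
    end
  end
end

striped = StripedIndex.new([ToyIndex.new, ToyIndex.new])
a = striped.add_entry "hello there bob"   # routed to stripe 0
b = striped.add_entry "goodbye bob"       # routed to stripe 1
c = striped.add_entry "hello joe"         # routed to stripe 0
hits = striped.search "bob"
```

Because docids are only unique within a stripe, this sketch identifies a document by its stripe number and docid together. The concatenated results are newest first within each stripe but not across stripes; recovering a global ordering would need extra information, such as a timestamp stored with each entry.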

License

Whistlepig is distributed under the terms of the New BSD License. See the file COPYING for details.

Author

Whistlepig is brought to you by William Morgan.