Whistlepig is a minimalist realtime full-text search index. Its goal is to be as small and feature-free as possible, while still remaining useful, performant and scalable to large corpora. If you want realtime full-text search without the frills, Whistlepig may be for you.
Whistlepig is written in ANSI C99. It currently provides a C API and Ruby bindings.
|Latest version:|0.12, released 2012-06-09.|
|Bug reports:|Github issue tracker|
Roughly speaking, realtime search means:
    require 'rubygems'
    require 'whistlepig'
    include Whistlepig

    index = Index.new "index"

    entry1 = Entry.new
    entry1.add_string "body", "hello there bob"
    docid1 = index.add_entry entry1   # => 1

    entry2 = Entry.new
    entry2.add_string "body", "goodbye bob"
    docid2 = index.add_entry entry2   # => 2

    q1 = Query.new "body", "bob"
    results1 = index.search q1        # => [2, 1]

    q2 = q1.and Query.new("body", "hello")
    results2 = index.search q2        # => [1]

    index.add_label docid2, "funny"

    q3 = Query.new "body", "bob ~funny"
    results3 = index.search q3        # => [2]

    entry3 = Entry.new
    entry3.add_string "body", "hello joe"
    entry3.add_string "subject", "what do you know?"
    docid3 = index.add_entry entry3   # => 3

    q4 = Query.new "body", "subject:know hello"
    results4 = index.search q4        # => [3]
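In the synopsis above, each entry can be found by the next search after it is added, and results come back newest first. The toy below imitates that behavior with a plain in-memory inverted index. It is a rough sketch for intuition only: `ToyIndex`, `add` and `search` are hypothetical names, and nothing here reflects Whistlepig's actual API, query parser or on-disk format.

```ruby
# Toy in-memory inverted index. ToyIndex, add and search are hypothetical
# names for illustration; this is not how Whistlepig is implemented.
class ToyIndex
  def initialize
    # Maps [field, term] to the list of doc ids containing that term.
    @postings = Hash.new { |hash, key| hash[key] = [] }
    @next_id = 1
  end

  # Index a document given as { field => text }; returns the new doc id.
  def add(fields)
    id = @next_id
    @next_id += 1
    fields.each do |field, text|
      text.downcase.scan(/\w+/).uniq.each { |term| @postings[[field, term]] << id }
    end
    id
  end

  # Return ids of documents containing every term in the field, newest first.
  def search(field, *terms)
    lists = terms.map { |term| @postings.fetch([field, term.downcase], []) }
    return [] if lists.empty?
    lists.reduce(:&).sort.reverse
  end
end

index = ToyIndex.new
index.add("body" => "hello there bob")   # doc 1
index.add("body" => "goodbye bob")       # doc 2
p index.search("body", "bob")            # prints [2, 1]
p index.search("body", "bob", "hello")   # prints [1]

# A document added later is visible to the very next query; no rebuild step.
index.add("body" => "hello joe")         # doc 3
p index.search("body", "hello")          # prints [3, 1]
```

Appending to posting lists at write time is what makes a new document immediately searchable in this toy; a real index also has to deal with persistence, segments and locking.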
For the Ruby bindings, see the Whistlepig Ruby API documentation.
For the query language, see the Whistlepig Query documentation.
For the C API, see the code.
Recent versions of Whistlepig are multi-process-safe and support concurrent reads and writes. Whistlepig implements this with pthread read/write locks: a single writer locks a segment (and, briefly, the entire index) against readers, while multiple readers can execute in parallel. Services with heavy write traffic will see a degradation in read performance if reads and writes are executed against the same index. Index replication may be useful in this situation.
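To make the locking behavior concrete, here is a small readers-writer lock in plain Ruby. It is only a sketch of the semantics described above (many readers at once, one writer at a time). `ToyRWLock` and its methods are hypothetical names, and the lock works between threads of a single process, whereas Whistlepig's own locks are pthread read/write locks in C that also work across processes.

```ruby
# Minimal readers-writer lock built on Mutex and ConditionVariable.
# ToyRWLock is a hypothetical name; Whistlepig itself uses pthread rwlocks in C.
class ToyRWLock
  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @readers = 0       # number of readers currently inside
    @writing = false   # true while a writer holds the lock
  end

  # Many readers may hold the lock together, but not while a writer holds it.
  def with_read_lock
    @mutex.synchronize do
      @cond.wait(@mutex) while @writing
      @readers += 1
    end
    begin
      yield
    ensure
      @mutex.synchronize do
        @readers -= 1
        @cond.broadcast
      end
    end
  end

  # A writer waits until there are no readers and no other writer.
  def with_write_lock
    @mutex.synchronize do
      @cond.wait(@mutex) while @writing || @readers > 0
      @writing = true
    end
    begin
      yield
    ensure
      @mutex.synchronize do
        @writing = false
        @cond.broadcast
      end
    end
  end
end

lock = ToyRWLock.new
docs = []

# One writer appends documents while several readers inspect the list.
writer = Thread.new do
  3.times { |i| lock.with_write_lock { docs << "doc #{i + 1}" } }
end
readers = 4.times.map do
  Thread.new { lock.with_read_lock { docs.size } }
end

writer.join
readers.each(&:join)
puts "final size: #{docs.size}"   # prints "final size: 3"
```

Every reader has to wait whenever a writer holds the lock, which is why sustained write traffic slows reads against the same index. This sketch makes no attempt at fairness, so a steady stream of readers could also starve the writer.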
Whistlepig is distributed under the terms of the New BSD License. See the file COPYING for details.
Whistlepig is brought to you by William Morgan.