In the last post I talked about some differences between fibers and
continuations. What may not have been clear is
that continuations are more primitive and flexible than fibers are. In fact,
you can implement fibers using continuations.
Here’s how. The basic idea is that we want to maintain two variables with
continuations in them, inside and outside. The first one will transfer
execution into the block of code that forms the fiber. The second will transfer
control back to the outside world.
When the outside world calls #resume, we save our continuation point as
outside, and call the current inside continuation. When, within the block,
#yield is called, we save our current continuation point as inside, and
transfer control back to the current outside.
There are a few more details in terms of passing values from #yield to
#resume, handling the return value of the block, and handling excessive
calls to #resume, but that’s the basic story. Here’s the code:
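require 'continuation'

class CFiber
  class Error < StandardError; end

  def initialize &block
    @block = block
    # Capture the entry point; the first #resume jumps back here.
    callcc do |cc|
      @inside = cc
      return
    end
    # Reached only via @inside.call: run the block to completion.
    @var = @block.call self
    @inside = nil # mark the fiber as dead
    @outside.call
  end

  def resume
    raise Error, "dead cfiber called!" unless @inside
    # Remember where to come back to, then jump into the block.
    callcc do |cc|
      @outside = cc
      @inside.call
    end
    @var
  end

  def yield var
    # Remember where to continue from, then jump back to the caller.
    callcc do |cc|
      @var = var
      @inside = cc
      @outside.call
    end
  end
end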
This is also runnable on Ruby 1.8—just remove the require.
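For reference, here is the native Fiber behavior that those extra details are imitating: a value passed to yield comes out of resume, the block's return value comes out of the final resume, and one resume too many raises an error. This is only a small sketch against the built-in Fiber class; the helper name native_fiber_contract is mine, not part of any library.

```ruby
# Illustrative helper (hypothetical name): collect what a native Fiber
# hands back over its whole life.
def native_fiber_contract
  f = Fiber.new do
    Fiber.yield :yielded   # comes out of the first resume
    :returned              # the block's value comes out of the last resume
  end

  results = [f.resume, f.resume]

  begin
    f.resume               # the fiber has already finished
  rescue FiberError => e
    results << e.class     # native fibers raise FiberError here
  end

  results
end

p native_fiber_contract
```

CFiber follows the same pattern, except that it raises its own CFiber::Error instead of FiberError.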
So why does Ruby 1.9 bother to implement fibers, when we can just use
continuations? I don’t know what the real answer is, but “speed” is at least a
good answer. Let’s do some benchmarking to compare the two:
require 'benchmark'

n = ARGV.shift.to_i

Benchmark.bm do |bm|
  bm.report " fibers" do
    f = Fiber.new do
      x, y = 0, 1
      loop do
        Fiber.yield y
        x, y = y, x + y
      end
    end
    n.times { |i| f.resume }
  end

  bm.report "cfibers" do
    f = CFiber.new do |c|
      x, y = 0, 1
      loop do
        c.yield y
        x, y = y, x + y
      end
    end
    n.times { |i| f.resume }
  end
end
We’ll start by backporting that code to the Ruby 1.8.7 that Ubuntu provides
(ruby 1.8.7 (2008-08-11 patchlevel 72)). Ruby 1.8 has no native fibers, so only
the cfibers case runs. For 10000 Fibonacci numbers, we see:
| | user | system | total | real |
|---|---|---|---|---|
| cfibers | 0.810000 | 0.070000 | 0.880000 | 0.879930 |
That’s roughly 11.4kfps (that’s thousand Fibonacci numbers per second) that we
can produce using continuation-based fibers.
Let’s try the ancient Ruby 1.9.0 that Ubuntu provides (Ruby 1.9.0 (2008-06-20
revision 17482)):
| | user | system | total | real |
|---|---|---|---|---|
| fibers | 0.040000 | 0.000000 | 0.040000 | 0.037583 |
| cfibers | 18.680000 | 1.770000 | 20.450000 | 20.482006 |
Wow, fibers are fast: 250kfps. But things have gotten significantly worse for
cfibers, which clock in at a measly 0.489kfps.
Finally let’s try the latest and greatest Ruby 1.9.1 (ruby 1.9.1p129
(2009-05-12 revision 23412)):
| | user | system | total | real |
|---|---|---|---|---|
| fibers | 0.040000 | 0.000000 | 0.040000 | 0.035148 |
| cfibers | 0.150000 | 0.000000 | 0.150000 | 0.155890 |
Fibers are just as fast as before, but continuations have improved
dramatically: from 11.4kfps on 1.8.7 (and 0.489kfps on 1.9.0) to roughly
66.7kfps. Still, native fibers are more than three times faster.
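If you want to check the kfps figures, they are just the number of Fibonacci values divided by the "total" column from each run. Here is a quick sketch of that arithmetic; the kfps helper and the TOTALS table are names I made up for this snippet, and the numbers are copied from the benchmark output above.

```ruby
# Thousands of Fibonacci numbers per second, from a Benchmark "total" time.
def kfps(count, total_seconds)
  count / total_seconds / 1000.0
end

# "total" column of each run, for 10000 Fibonacci numbers.
TOTALS = {
  "1.8.7 cfibers" => 0.88,
  "1.9.0 fibers"  => 0.04,
  "1.9.0 cfibers" => 20.45,
  "1.9.1 fibers"  => 0.04,
  "1.9.1 cfibers" => 0.15,
}

TOTALS.each do |label, seconds|
  puts format("%-14s %8.3f kfps", label, kfps(10_000, seconds))
end
```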
So perhaps Ruby 1.9.1 is the best of both worlds. When you need fast
non-preemptive concurrency, you can use native fibers; when you need to
implement your own crazy control structures, you can use continuations and be
assured that they’re still pretty darn fast (at least, as far as Ruby
operations are concerned).
Ruby 1.9 has both fibers and continuations. The two are often mentioned in the
same breath. They do vaguely similar-sounding things, and are implemented in
Ruby 1.9 with similar mechanics underneath the hood, much as continuations and
threads were implemented with the same underlying mechanics in Ruby 1.8
[PDF, p. 14].
But implementation similarities aside, continuations and fibers have very
different semantics. A fiber behaves as a thread without preemption. Like a
thread, you create it, and it eventually dies; unlike a thread, you must
manually call resume and yield to transfer control into and out of it,
instead of just letting the runtime call them for you whenever it feels like
it. Like a thread, when you resume a fiber, you have the same call stack and
heap state (local variables) as when you left.
What’s nice about fibers is that, since you keep explicit control of the order
of execution, you can get thread-like behavior without all the hassle of
mutexes and synchronization. Of course you have to deal with the hassle of
ordering all your operations, but you at least have the option of avoiding the
fun race-condition game that always seems to crop up in threaded programming.
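To make that concrete, here is a small sketch of two fibers appending to the same array with no mutex at all. The interleaving is fixed entirely by the order of the resume calls. The method name interleave and the log labels are just illustrative.

```ruby
# Two fibers share one array without any locking; the caller decides
# exactly when each of them runs.
def interleave
  log = []

  a = Fiber.new { 3.times { |i| log << "a#{i}"; Fiber.yield } }
  b = Fiber.new { 3.times { |i| log << "b#{i}"; Fiber.yield } }

  # Strict alternation: a runs one step, then b runs one step.
  3.times do
    a.resume
    b.resume
  end

  log
end

p interleave  # => ["a0", "b0", "a1", "b1", "a2", "b2"]
```

With threads, getting the same guaranteed ordering would take a queue or some other synchronization primitive.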
What about continuations? Instead of fibers’ create, kill, yield, and resume
operations, a continuation only really has two operations: capture and resume.
A continuation is captured once, and may be resumed multiple times. When you
resume a continuation, the call stack is reverted to what it looked like when
it was captured, but the heap state stays the same. There’s no exit point or
death for a continuation (at least until Ruby gets bounded continuations);
execution simply continues from the capture point.
What’s nice about continuations is that you can use them to implement control
structures. Loops, exceptions, cross-procedure gotos… almost every control
structure you can come up with can be implemented with continuations. In fact,
you can implement fibers using continuations!
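As a taste of the control-structure angle, here is a rough sketch of an early exit from nested iteration, in the spirit of an exception or a cross-procedure goto. The method name find_first_negative is hypothetical, and recent Ruby versions may print an obsolescence warning when callcc is used.

```ruby
require 'continuation'

# Returns the first negative number found in an array of arrays, or nil.
# Calling the continuation abandons both loops at once and makes callcc
# return the value that was passed to it.
def find_first_negative(rows)
  callcc do |escape|
    rows.each do |row|
      row.each { |v| escape.call(v) if v < 0 }
    end
    nil # nothing negative: callcc returns the block's value
  end
end

p find_first_negative([[1, 2], [3, -4, -5]])
p find_first_negative([[1, 2], [3, 4]])
```

This only ever jumps outward, so it behaves much like throw and catch; the more interesting tricks come from jumping back into code that has already run.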
Let’s look at an example. Here’s the fiber-based Fibonacci computation from
the InfoQ article on Fibers in Ruby 1.9:
fib = Fiber.new do
  x, y = 0, 1
  loop do
    Fiber.yield y
    x, y = y, x + y
  end
end
20.times { puts fib.resume }
Here we call yield from within the fiber once we’ve computed a number, which
transfers control to the main program. The main program prints out the number
yielded and then calls resume to transfer control back to the fiber. A thread
version looks very similar:
require 'thread'
q = SizedQueue.new 1
fib = Thread.new do
  x, y = 0, 1
  loop do
    q.push y
    x, y = y, x + y
  end
end
20.times { puts q.pop }
Since we don’t have explicit control over the scheduling, we implicitly
scheduled the order of operations by using a synchronized SizedQueue data
structure, which blocks the computation thread from computing a new number
until the printing thread is ready to receive it. (There are many ways we
could’ve accomplished this.)
Here’s the version using continuations:
require 'continuation'
c, x, y, i = callcc { |cc| [cc, 0, 1, 1] }
puts y
c.call c, y, x + y, i + 1 if i < 20
You’ll notice there are no loops, and variables are never changed after
assignment. In fact the code is starting to look suspiciously like an inductive
proof, with one line that looks like a base case and another line that looks like a
recursive case. You can see why continuations make functional-programming
enthusiasts get excited!
This implementation works because resuming the continuation (the call to
c.call) replaces the call stack and point of execution with what they were at
the point it was captured (the call to callcc). In contrast, resuming the
fiber moves us back to the point where the fiber called yield, and so
the outer loop in the fiber implementation was necessary.
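The Fibonacci version also leans on one more detail: whatever you pass to the continuation's call becomes the return value of the original callcc, and several arguments come back as an array. Here is a small sketch of just that behavior; the method name callcc_return_values is made up for the example, and recent Rubies may warn that callcc is obsolete.

```ruby
require 'continuation'

# Records each value that callcc "returns": first the block's own value,
# then whatever is later passed to the continuation's #call.
def callcc_return_values
  seen = []
  k = nil

  v = callcc { |cc| k = cc; :from_block }
  seen << v

  # Each call jumps back to the callcc line above. The seen array is
  # not rewound, which is what stops this from looping forever.
  k.call(:one_arg) if seen.size == 1
  k.call(:two, :args) if seen.size == 2

  seen
end

p callcc_return_values
# => [:from_block, :one_arg, [:two, :args]]
```

That array return is why the Fibonacci one-liner can destructure c, x, y, i on every pass.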
Beyond call stacks, another major difference between fibers and continuations
is the way the heap is treated. Multiple fibers on the same section of code do
not share local variables. Multiple continuations on the same section of code
do. Here’s a brief example. First, the fibers version:
fib = (0 ... 5).map do |i|
  Fiber.new do
    x = 0
    Fiber.yield x
    x += 1
  end
end
fib.each { |f| puts f.resume }
We create five fibers, and call resume on them once each. As you’d expect,
this prints out a series of 0’s. The variable x is not shared between the
multiple fibers. Of course, the fiber constructor here is a block, and blocks
are closures, so we could make them share state by moving the x = 0 line
outside the map line. But that’s a result of having closures, not of fibers
per se.
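Here is roughly what that closure-sharing variant could look like. It is a sketch rather than a line-for-line edit of the code above: I moved the increment before the yield so that the sharing actually shows up in the output, and shared_counter_outputs is just an illustrative name.

```ruby
# Five fibers closing over one x: the sharing comes from the closure,
# not from anything fiber-specific.
def shared_counter_outputs
  x = 0 # declared outside the map, so every block sees the same variable

  fibers = (0...5).map do
    Fiber.new do
      x += 1
      Fiber.yield x
    end
  end

  fibers.map { |f| f.resume }
end

p shared_counter_outputs  # => [1, 2, 3, 4, 5]
```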
Let’s try an example with multiple continuations, all jumping into the same point in the code:
require 'continuation'
x = 0
c = callcc { |cc| cc }
d = callcc { |cc| cc } if c
e = callcc { |cc| cc } if c && d
f = callcc { |cc| cc } if c && d && e
x += 1
puts x
c.call if c
d.call if d
e.call if e
f.call if f
We initialize x to 0, create 4 separate continuations, add one to x, and
call the continuations in order. (The postfix if statements ensure that the
continuation variables aren’t set or called more than once. Calling c.call
without arguments will jump back to the c = callcc line and set c to
nil.)
Silly, but it illustrates the point: the output is “1 2 3 4 5”, meaning that
the four continuations all share the same heap. When d is called, its x is
the same as the x of c, and even though it was 0 when d was captured, it
has since been modified by the resumption of c. When e is called, its x
is also the same x, and so on. (In fact this whole example depends on this
behavior: each of the continuation variables is only set once, and must
“retain” its value across all re-entries to continuations above it.)
In addition to multiple continuations being able to share state, the converse is
true too: multiple resumes on the same continuation will share state:
require 'continuation'
x = 0
c = callcc { |cc| cc }
x += 1
puts x
c.call c while x < 5
This outputs the same thing as the multiple-continuation example above.
Hopefully that clears up some of the confusion. Here’s the summary:
| Fibers | Continuations |
|---|---|
| Four operations: create, exit, yield, resume. | Two operations: capture and resume. |
| Upon resume, call stack is wherever it was at the last yield. | Upon resume, call stack is where it was when captured. |
| Do not share state except via closure. | Multiple continuations and multiple invocations of the same continuation can share state. |