Trollop 1.8.1 is out. This is a minor bugfix
release, but 1.8, released a few weeks ago but not really advertised, adds new
functionality, so I’m describing that here.
The new functionality is subcommand support, as seen in things like git and
svn. This feature is actually trivial to use / implement: you give Trollop a
list of stopwords. When it sees one, it stops parsing. The end. That’s all you
need.
Here’s how you use it:
- Call Trollop::options with your global option specs. Pass it the list of subcommands as the stopwords. It will parse ARGV and stop on the subcommand.
- Parse the next word in ARGV as the subcommand, however you wish. ARGV.shift is the traditional choice.
- Call Trollop::options again with whatever command-specific options you want.
And that’s it. Simple eh?
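To make the stopword idea concrete without depending on the gem, here is a minimal stdlib-only sketch of the three steps above. The helper `parse_flags!` and the `run` wrapper are hypothetical stand-ins, not Trollop's API or its implementation, and the sketch only understands boolean `--flag` options.

```ruby
# Hypothetical helper (NOT Trollop): pull boolean "--flag" words out of argv,
# stopping as soon as a stopword is at the front. Non-flag words that are not
# stopwords are left in argv, in their original order.
def parse_flags!(argv, stopwords = [])
  flags = {}
  leftovers = []
  until argv.empty? || stopwords.include?(argv.first)
    word = argv.shift
    if word.start_with?("--")
      flags[word.sub(/\A--/, "").tr("-", "_").to_sym] = true
    else
      leftovers << word
    end
  end
  argv.unshift(*leftovers)
  flags
end

SUB_COMMANDS = %w(delete copy)

def run(argv)
  argv = argv.dup
  global_opts = parse_flags!(argv, SUB_COMMANDS) # step 1: stops at the subcommand
  cmd         = argv.shift                       # step 2: the subcommand itself
  cmd_opts    = parse_flags!(argv)               # step 3: command-specific options
  { global: global_opts, command: cmd, command_opts: cmd_opts, rest: argv }
end

result = run(%w(--dry-run delete --force file.txt))
```

Here `result[:global]` holds `:dry_run`, `result[:command]` is `"delete"`, and `--force` ends up in the command's own options rather than the global ones, which is the whole point of stopping at the stopword.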
It continually amazes me how hard other people make option parsing. I think
it’s a holdover from their days of using C or Java. Take a look at the synopsis
for optparse: it’s a ridiculous amount of work for something simple. Or better
yet, look at the synopsis for CmdParse. Having to make a class for each command
is a clunky Java-ism. I’m sorry, but it’s true. Subclassing is the one option
for specializing code in Java; in Ruby we can be far more sophisticated. Take a
look at Ditz’s operator.rb for an example of a subcommand DSL.
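As a rough illustration of what a subcommand DSL can look like without one class per command, here is a small sketch. It is not Ditz's actual code; the `Operator` class, the `operation` macro, and the `add`/`drop` commands are all made up for the example.

```ruby
# Hypothetical mini-DSL: each subcommand is declared with one `operation` call
# instead of a dedicated class.
class Operator
  COMMANDS = {}

  # Register a subcommand: remember its description and define a method for it.
  def self.operation(name, description, &body)
    COMMANDS[name.to_s] = description
    define_method("do_#{name}", &body)
  end

  operation :add, "Add an item" do |args|
    "added #{args.join(' ')}"
  end

  operation :drop, "Drop an item" do |args|
    "dropped #{args.join(' ')}"
  end

  # Dispatch on the first word of argv; the remaining words are the arguments.
  def run(argv)
    name, *args = argv
    raise ArgumentError, "unknown command: #{name.inspect}" unless COMMANDS.key?(name)
    send("do_#{name}", args)
  end
end

output = Operator.new.run(%w(add foo bar))
```

Adding a command is one more `operation` block, and the `COMMANDS` table doubles as the data for a help listing.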
One of Git’s defining characteristics is its extreme (some say “ridiculous”)
flexibility. Even with all the fancy porcelain on top, what you get when you
use Git is basically a general DAG builder for patches, and the ability to
apply labels to points within.
It’s interesting to see how this flexibility is put to use in practice. In my
many years (ok, months) of Git usage, across a variety of projects, I’ve
noticed several distinct styles of Git usage.
The most salient differences between styles are:
- How much they care about keeping the development history “beautiful”, i.e.
free of unnecessary merges. Git gives you two tools for adding your commit to a
branch: merge and rebase. A rebase always preserves linearity; a merge has
the potential for introducing non-linearity. Some projects are fanatic about
this. Linus has been known to reject code because there were too many “test
merges” (see the git-rerere man page). Other projects don’t care at all.
- How much they make use of topic branches. Some projects do the majority of
development through them. Some do all development directly onto master,
branching only for long-term divergent development.
- How new commits come into the system: patches to mailing lists, merges from
remote branches performed by the maintainer, or commits directly into the
central repo.
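The merge-versus-rebase point in the first item can be seen in a toy model of the commit graph. This is only a sketch: the `Commit` struct and `linear?` helper are invented for illustration and have nothing to do with how Git stores history.

```ruby
# Toy model only: a commit is a name plus a list of parent commits.
Commit = Struct.new(:name, :parents)

# History is linear when no commit reachable from the tip has more than one parent.
def linear?(tip)
  seen = {}
  stack = [tip]
  until stack.empty?
    commit = stack.pop
    next if seen[commit.name]
    seen[commit.name] = true
    return false if commit.parents.size > 1
    stack.concat(commit.parents)
  end
  true
end

root   = Commit.new("root", [])
master = Commit.new("m1", [root]) # the branch moved on after you started
topic  = Commit.new("t1", [root]) # your commit, still based on root

# Merge: a new commit with both tips as parents.
merged = Commit.new("merge", [master, topic])

# Rebase: your commit is replayed on top of the branch's current tip.
rebased = Commit.new("t1'", [master])

merge_is_linear  = linear?(merged)
rebase_is_linear = linear?(rebased)
```

In this model the merged history is non-linear and the rebased one is linear. A merge only has the potential for non-linearity because, when the branch has not moved, Git can fast-forward and no two-parent commit is created.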
Each of these decisions results in a different style
of development. The styles I’ve encountered in the wild are:
- The just-like-SVN approach. Example project: Rubinius.
Individual contributors have a commit bit, or they don’t. Everyone works from
local clones. If you have a commit bit, you push directly to origin/master.
Non-committers can post patches to a mailing list or to IRC. There are some
published branches, but they’re for long-running lines of development that are
eventually merged in and discarded. There’s no real pickiness about merges in
development history; rebasing is encouraged but not required.
- The Gitorious / Github approach.
Example project: everything on those systems. Only the maintainer can commit to
the central repository. Anyone can create a remote clone, push commits, and
send a formal merge request through the system to the maintainer. All code
additions (except for the maintainer’s own) are done through merges.
- The topic-based approach. Example projects: Git itself, the Linux kernel,
Sup. Patches are submitted to the mailing list. The
maintainer builds topic branches for each feature/bugfix in development and
merges these into different “version branches”, which
correspond to different versions of the project such as
stable/experimental/released version distinctions. Sub-maintainers are used
when the project gets large, and their repositories are merged by the
maintainer upon request.
- The remote topic branch approach. This was an experiment I tried with
Ditz, and is roughly my attempt to do topic-based
Git with Gitorious. In this approach, contributors, instead of submitting
patches to a mailing list, maintain feature branches themselves. When a branch
is updated, a merge request is sent to the maintainer, who
merges the remote branch into a version branch.
I’ve listed the styles in order from least to most overhead. The just-like-SVN
style requires very little knowledge of Git; at the other end of the spectrum,
the topic-based approaches require a fair amount of branch management. For
example, care has to be taken that merging a topic branch into a version branch
doesn’t accidentally merge another version branch in as well. (This type of
complexity spurred me to write tools like git show-merges and the
soon-to-be-released git wtf.)
The advantage of the topic-based approaches, of course, is that it’s possible
to maintain concurrent versions of the same program at different levels of
stability, and to pick and choose which features go where.
Which style is best for you depends on what you’re trying to accomplish. Like
all good tools, what you get out of Git depends on what you’re willing to put
into it, and that’s a decision you’ll have to make.
Name this function:
inject({}) { |h, o| h[yield(o)] = o; h }.values
Hints:
- It’s a variant of a common stdlib function.
- The name has 7 characters, one of which is an underscore.
A survey of my rubyist colleagues suggests this is a hard question. Much harder
than writing the function given the name, which took about 10 seconds.
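If you want to poke at the snippet before guessing, it can be dropped into Enumerable under a placeholder name. The name `mystery` below is deliberately not the answer, and the word list is just sample data.

```ruby
module Enumerable
  # Placeholder name; the body is the one-liner from the puzzle, unchanged.
  def mystery
    inject({}) { |h, o| h[yield(o)] = o; h }.values
  end
end

words  = %w(apple avocado banana blueberry cherry)
picked = words.mystery { |w| w[0] }
# picked == ["avocado", "blueberry", "cherry"]
```

Note that when two elements produce the same block value, the later one overwrites the earlier one, so this variant keeps the last element per key.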
I’ve done some very preliminary benchmarking on the inliner I’ve been hacking
into Rubinius.
For the very simple case it can handle so far—guaranteed dispatch to self,
fixed number of arguments (no splats or defaults), no blocks—here’s what we get
for 10m iterations of a simple function calling another simple function:
| name | user | system | total | real |
| uninlined-no-args | 22.49 | 0 | 22.49 | 22.49 |
| inlined-no-args | 21.74 | 0 | 21.74 | 21.74 |
| uninlined-4-args | 27.74 | 0 | 27.74 | 27.74 |
| inlined-4-args | 24.59 | 0 | 24.59 | 24.59 |
So inlining results in a 3.5% speedup on method dispatch with no arguments, and
a 12.8% speedup when there are four arguments.
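For reference, those percentages can be reproduced from the table by taking the ratio of uninlined to inlined time. That this is the formula behind the figures is an inference from the numbers, not something stated here.

```ruby
# Timings in seconds, copied from the table: [uninlined, inlined].
timings = {
  "no-args" => [22.49, 21.74],
  "4-args"  => [27.74, 24.59],
}

# Speedup as a percentage: how much more work per second the inlined version does.
speedups = timings.map do |name, (uninlined, inlined)|
  [name, ((uninlined / inlined - 1) * 100).round(2)]
end.to_h
```

This gives about 3.45% for the no-argument case and about 12.81% for the four-argument case.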
Of course this is the best possible case for the inliner. Guaranteed dispatch to
self means that I don’t even add any guard code, which would definitely slow
things down. But this actually is a fairly common case that occurs whenever you
use self accessors and any helper functions that don’t have blocks or varargs.
And the real boost of inlining, presumably, is going to be in conjunction with
JIT, since the CPU can pipeline the heck out of everything.
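The benchmark code itself isn't shown here, so the following is only a guess at its shape, using the stdlib Benchmark module: a method that calls a helper on self with a fixed number of arguments and no blocks. The `Subject` class and `run_bench` are hypothetical, and stock Ruby has no inliner to switch on or off, so this only exercises the call pattern.

```ruby
require "benchmark"

# Hypothetical benchmark subject: each call* method dispatches to self with a
# fixed argument count and no block, the case the inliner handles.
class Subject
  def leaf0; 1; end
  def call0; leaf0; end

  def leaf4(a, b, c, d); a; end
  def call4(a, b, c, d); leaf4(a, b, c, d); end
end

# Time both call patterns; prints a user/system/total/real table and returns
# one Benchmark::Tms per report.
def run_bench(iterations)
  subject = Subject.new
  Benchmark.bm(10) do |bm|
    bm.report("no-args") { iterations.times { subject.call0 } }
    bm.report("4-args")  { iterations.times { subject.call4(1, 2, 3, 4) } }
  end
end

results = run_bench(100_000) # the post's runs used 10m iterations
```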
For the past few years I’ve done something silly with my email: I’ve
accepted email for every address at masanjin.net, and then
filtered them for spam before display. This means that, as far as any
spammer is concerned, every email address they tried to send to
masanjin.net was a direct hit. So there’s been a snowball effect:
everything they tried worked, and those addresses stayed on their lists,
and every variant they tried worked, and made it to the lists, etc.
Of course I didn’t see most of it, but it all made the trip from spammer
to mail server and over fetchmail to my poor home computer, which would
have spamassassin crank for 20 minutes every, oh, 25 minutes or so.
I’ve finally changed to a sane situation with my mail server on a VPS
and exim4 calling spamassassin at accept time. I’ve also set up a bunch
of rules for which email addresses I accept. (Just any old string
doesn’t cut it any more.)
The result: over the past 9 days I’ve rejected 209,605 emails as spam.
That’s about 16.17 a minute, or a little more than one every 4 seconds.
How many have I accepted? Including false negatives, 2441, or one every
5 minutes. (I am on several high-volume mailing lists.)
That’s a S/N ratio of 1.16%!
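The arithmetic behind those figures, as a quick check (the variable names are mine):

```ruby
rejected = 209_605
accepted = 2_441
minutes  = 9 * 24 * 60 # 12,960 minutes in 9 days

spam_per_minute     = rejected.to_f / minutes  # about 16.17
seconds_per_spam    = 60 / spam_per_minute     # about 3.7
minutes_per_accept  = minutes.to_f / accepted  # about 5.3
signal_to_noise_pct = 100.0 * accepted / rejected # about 1.16
```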
Hopefully as time goes by, the rejections will start trimming addresses
off spammers’ lists, and that will improve somewhat. Until then… at
least it’s not my home computer doing the work any more.
Managing that old Hobix blog was way more work than it should’ve been. So,
I’ve started over and outsourced the work to someone else. Let’s see how it
goes. [It didn’t go so well. Hence, Whisper.—ed.]