Monday, October 29, 2012

Async in Clojure: Playing with Agents, Part II

In the last post, we took a look at basic usage of Clojure's agent function. In this post, we'll dive a little bit deeper.

Validation
We glossed over the options that you can define when creating an agent. One of them is the validator: a function that checks each proposed new value before the agent's state is updated with it.

If we want to make sure that our read-agent always gets a string value, this is all we have to do:


Similarly, any function that takes a single value as a parameter and returns a truthy or falsey result can be used here. As you can see, we had to change our default value for the agent from nil to "" since there is now a string validator. If we hadn't, nil would fail validation and we'd get a java.lang.IllegalStateException.
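A minimal sketch of an agent with a string validator follows. The name read-agent comes from the text; the anonymous action function and the creation-error var are hypothetical, added only for illustration. It also shows an initial value that fails the validator being rejected.

```clojure
;; read-agent starts at "" so that the initial value passes string?.
(def read-agent (agent "" :validator string?))

;; A valid update: the action ignores the old value and returns a string.
(send-off read-agent (fn [_old] "file contents"))
(await read-agent)            ; block until the action above has run
(println @read-agent)

;; An initial value of nil fails the validator, so creating the agent
;; throws. The exception is captured here instead of propagating.
(def creation-error
  (try
    (agent nil :validator string?)
    (catch IllegalStateException e e)))
(println (class creation-error))
```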

When Things Go Wrong
Another option you can set when defining an agent is the error handler. It is called whenever an action throws an error, including when a new value is rejected by your validator function. Here's an example:
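As a rough sketch of the idea (the names guarded-agent, handle-error, and seen-error are made up for this example), an error handler is a function of two arguments, the agent and the exception. Here the handler just records the exception message in a promise so it can be inspected afterwards:

```clojure
;; Hypothetical place to record what the handler saw.
(def seen-error (promise))

;; An error handler receives the agent and the exception that was thrown.
(defn handle-error [failed-agent ^Throwable ex]
  (deliver seen-error (.getMessage ex)))

(def guarded-agent
  (agent "" :validator string? :error-handler handle-error))

;; This action returns a number, which the string? validator rejects,
;; so the handler fires and the agent keeps its old value.
(send-off guarded-agent (fn [_old] 42))

;; Wait up to five seconds for the handler to run.
(println (deref seen-error 5000 :timed-out))
(println (pr-str @guarded-agent))
```

Supplying :error-handler at creation also makes :continue the default error mode, so the agent stays usable after the failed action.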


You don't have to set either of these options when the agent is defined; you can do it later with a function call, if you so desire (or if needs demand it):
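One way this could look, assuming a hypothetical agent named late-agent and a hypothetical handler named log-error, uses set-validator! and set-error-handler! from clojure.core:

```clojure
;; Created with no options at all.
(def late-agent (agent ""))

(defn log-error [failed-agent ^Throwable ex]
  (println "agent error:" (.getMessage ex)))

;; Attach the validator and the error handler after the fact.
(set-validator! late-agent string?)
(set-error-handler! late-agent log-error)

;; An agent created without a handler defaults to the :fail error mode;
;; switch it explicitly if it should keep running after an error.
(set-error-mode! late-agent :continue)

(println (error-mode late-agent))
```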


Watch This!
So, we've got an error handler but no event handler? Yup. However, you can actually get callback-like behavior using watches. Check this out:


Now, any time our agent's state changes, the function passed to the watch will fire. As described in the docs, the parameters to add-watch are: the agent, a key of your devising (must be unique per agent), and the handler that you want to have fired upon state change. The handler takes as parameters: the key you defined, the agent, the agent's old value, and the agent's new value.
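The parameter order described above can be sketched like this. The names watched-agent, changes, and the key :read-watch are invented for the example; the watch simply appends what it was given to an atom:

```clojure
(def watched-agent (agent ""))

;; Hypothetical log of every state change the watch has seen.
(def changes (atom []))

;; add-watch takes the agent, a key, and a four-argument handler:
;; key, agent, old value, new value.
(add-watch watched-agent :read-watch
           (fn [watch-key the-agent old-value new-value]
             (swap! changes conj [watch-key old-value new-value])))

(send-off watched-agent (fn [_old] "new contents"))
(await watched-agent)         ; the watch has fired by the time this returns
(println @changes)
```

A watch can be detached again with (remove-watch watched-agent :read-watch).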

All Together Now
With all our example code in place, we can now exercise everything at once. Here's the complete code:


For a simple demonstration of the async nature and the callbacks in action, let's run the following:


Eventually, our callback will render output very much like the following:


Do note, however, that if we called a series of send-offs with different times (using the same agent and watch), we wouldn't see the ones with shorter times come back first. We'd see the callback output in the same order we called send-off. This is because an agent runs its actions one at a time, in the order they were sent, and the watch function is called synchronously on the agent's thread before any pending send-offs (or sends) are run. In future posts, I'll cover ways around this (constructing agents on the fly as well as exploring alternative solutions with external libraries).
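The ordering behavior can be seen in a small sketch. All names here (order-agent, finished, fake-read) are hypothetical, and the sleeps are in milliseconds to keep the run short:

```clojure
(def order-agent (agent nil))

;; Records each new agent value in the order the watch sees it.
(def finished (atom []))

(add-watch order-agent :order-watch
           (fn [_key _agent _old new-value]
             (swap! finished conj new-value)))

;; A stand-in for slow I/O: sleep, then return a label as the new state.
(defn fake-read [_old label millis]
  (Thread/sleep (long millis))
  label)

;; The slow action is sent first, the fast one second.
(send-off order-agent fake-read :slow 200)
(send-off order-agent fake-read :fast 10)
(await order-agent)

(println @finished)           ; :slow is recorded before :fast
```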

Regardless, with these primitives, there are all sorts of things one can do. For instance...

Dessert
To close, check out this neat little bit of code that sends 1,000,000 messages in a ring. This code creates a chain of agents, and then actions are relayed through it (taken from the agents doc page):
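For reference, here is a sketch along the lines of the relay example in the official agents documentation; treat the exact shape as an approximation of that listing rather than a verbatim copy. A chain of m agents is built, n messages are sent to the head, and each agent forwards every message to the next one. The last agent reports through a queue when message 0 arrives.

```clojure
(defn relay [x i]
  ;; Forward the message to the next agent in the chain, if there is one.
  (when (:next x)
    (send (:next x) relay i))
  ;; The final agent holds the report queue; signal when message 0 arrives.
  (when (and (zero? i) (:report-queue x))
    (.put ^java.util.concurrent.SynchronousQueue (:report-queue x) i))
  x)

(defn run [m n]
  (let [q  (java.util.concurrent.SynchronousQueue.)
        ;; Build the chain back to front: the innermost agent is the tail.
        hd (reduce (fn [next-agent _] (agent {:next next-agent}))
                   (agent {:report-queue q})
                   (range (dec m)))]
    ;; Send n messages, counting down so that 0 goes last.
    (doseq [i (reverse (range n))]
      (send hd relay i))
    ;; Block until the tail agent has seen the last message.
    (.take q)))

;; 1,000 agents x 1,000 messages = 1,000,000 relayed messages.
(time (run 1000 1000))
```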


Kicking this puppy off, our million messages finish in about 1 second :-)


Sunday, October 28, 2012

Async in Clojure: Playing with Agents, Part I

Clojure has a very interesting async primitive: the agent. There is some good documentation on agents, but for those who come from a background such as mine (Python and Twisted), I thought it might be nice to present one way of using agents to mimic the familiar async + callback reactive-style programming.

Do note, however, that Clojure agents run in one of two threadpools (one intended for CPU-intensive tasks, and the other for I/O-intensive tasks). As such, this is quite different than the event-loop approach that Twisted uses (or async frameworks that utilize libraries such as libevent or libev). Twisted has the deferToThread functionality, which is ... well, not exactly close, really. Regardless, let's get started.

In the following examples, we're going to pretend we have huge files we'll be reading off a local disk.

What to Call
Clojure's agent function is very, very simple: you pass it a value (its initial state) and some options, if needed. That's it.
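In its simplest form, that might look like the sketch below. The name tagged-agent and its metadata map are hypothetical; they are only there to show where options go.

```clojure
;; An agent whose initial state is nil, with no options.
(def read-agent (agent nil))

;; Options follow the initial state as keyword arguments. Here :meta
;; attaches a metadata map to the agent itself.
(def tagged-agent (agent [] :meta {:purpose "demo"}))

(println (pr-str @read-agent))   ; dereferencing returns the current state
(println (meta tagged-agent))
```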


To update its state, you use either the send or send-off functions. If you've got CPU-bound tasks whose state you want to manage with agents, then you should use the send function. If your tasks will be I/O-bound, then you should use the send-off function for updating agent state. (The threadpool dedicated for use by send has a fixed size, based on the number of processors on your system. The threadpool for send-off is expandable with thread caching and keep-alives.) Since our examples are focused on disk I/O, we'll be using send-off. (They have the same signature, though, so the following usage information applies to both.)

When you send-off something to an agent, you pass it a few things:
  • an agent
  • the action or update function
  • any number of additional parameters you want the action function to consume
Here's what that looks like:
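A small sketch of that calling convention, using a hypothetical agent named total and plain arithmetic functions as the actions:

```clojure
(def total (agent 0))

;; (send-off agent action & args): the action is called with the agent's
;; current state followed by the extra arguments, i.e. (+ 0 10 20).
(send-off total + 10 20)

;; send has the same signature but runs on the fixed-size pool: (* 30 2).
(send total * 2)

;; Actions sent to one agent from the same thread run in the order sent.
(await total)
(println @total)              ; 60
```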


What to Write
So, we know what an agent looks like when bound and we know how we're going to send an update to the agent, but how might we construct the update itself? Perhaps like this:


As you can see, the first value that an action function takes is the "old" value of the agent -- the value that the agent has prior to the action that will take place. Once this function returns, the agent's value will be set to the return value of the action function. (What's more, if we needed to access the agent itself inside the action function for any reason, we could do so using the *agent* variable -- accessible within the scope of the action function).
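To make the shape of an action function concrete, here is a sketch of a fake "big read". The name big-read, its seconds parameter, and the string it returns are assumptions for illustration, not the original listing:

```clojure
(def read-agent (agent nil))

;; The first parameter is the agent's value before the action runs; any
;; further parameters are the extra arguments passed to send-off.
(defn big-read [old-value seconds]
  ;; Pretend to read a huge file by sleeping for the requested time.
  (Thread/sleep (long (* 1000 seconds)))
  ;; Whatever is returned becomes the agent's new value.
  (str "read finished after " seconds "s; previous value was "
       (pr-str old-value)))

(send-off read-agent big-read 0.1)
(await read-agent)
(println @read-agent)
```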

Before we go on, let's take a look at this in action from the REPL:


The first thing we do is switch from the default namespace to one dedicated to our examples (this makes managing scope in the REPL much cleaner). Then we load a file that has the agent and action function defined. Then we tell it to run our fake "big read" function, asking it to run for about 10 seconds. As you can see, send-off returns the agent immediately. We then get the current value of the agent by dereferencing it. Finally our big read finishes, and we see it print how long it took. We then look at the agent directly, and then dereference it again -- both showing us what we'd expect: that the value of the agent has been updated to the return value of our big read function. Finally, we shut down the agent threads and exit the REPL.

(The start-clojure script is wrapped with rlwrap so that I have access to a command line history, persistent over different sessions. The script boils down to this: rlwrap java -cp /usr/local/clojure-1.4.0/clojure-1.4.0.jar clojure.main.)

We've seen the agent in action now, but there's a bit more we can do. We'll take a look at that in the next post.

Friday, October 19, 2012

libevent for Lisp: A Signal Example

At MindPool, there are several async I/O options we've been exploring for Lisp:
As luck would have it, I started with cl-event, and it was a fun little adventure (given the fact that it hasn't been maintained). I corresponded with the very nice original author a bit, asking if there were any updates in other locations, but sadly there weren't.

I was ready to dive in and get things current, when one last Google search turned up cl-async. This little bugger was hard to find, as at that point it had not been listed on CLiki. (But it is now :-)). Andrew Lyon has done a tremendous amount of work on cl-async, with a very complete set of bindings for libevent. This is just what I had been looking for, so I jumped in immediately.

As one might imagine from the topic of this post, there's a lot to be explored, uncovered, and developed further around async programming in Lisp. I'll start off slowly with a small example, and add more over the course of time.

I also hope to cover IOlib and SBCL's SERVE-EVENT in some future posts. Time will tell... For now, let's get started with cl-async in SBCL :-)

Dependencies
In a previous post, I discussed getting an environment set up with SBCL; the rest of this post assumes that has been read and done :-)

Getting cl-async and Setting Up an SBCL Environment for Hacking
Now let's download cl-async and install the Libevent bindings :-)


With the Lisp Libevent bindings installed, we're now ready to create a Lisp image to assist us when exploring cl-async. A Lisp image saves the current state of the REPL, with all the loaded libraries, etc., allowing for rapid start-ups and script executions. Just the thing when you're iterating on something :-)


Example: Adding a Signal Handler
Let's dive into some signal handling now! Here is some code I put together as part of an effort to beef up the examples in cl-async:


Note that the as: is a nickname for the package namespace cl-async:.

As one might expect, there is a function to start the event loop. However, what is a little different is that one doesn't initialize the event loop directly, but with a callback. As such, one cannot set up handlers, etc., except within the scope of this callback.

We've got the setup-handler function for that, which adds a callback for a SIGINT event. Let's try it out :-)

Once your script has finished loading the core, you should see output like the above, with no return to the shell prompt.

When we send a SIGINT with ^C, we can watch our callback get fired:


Next up, we'll take a look at other types of handlers in cl-async.

Thursday, October 18, 2012

Getting Started with Steel Bank Common Lisp

As some of you know, I've been a closet Lisp fan for several years. When I first joined Canonical in 2008, I was hacking on Lisp in Python, so that I could do genetic programming in Python. In fact, my first and only lightning talk at a Canonical sprint was on genetic algorithms and programming :-) (This was the same set of lightning talks in which Vincenzo Di Somma gave a wonderful presentation on his photography; completely unrelated: this is one of my favorite pics of Vincenzo :-) ).

A few years later, I talked to Jim Baker about Python's AST, and how one might be able to do genetic programming by manipulating it directly, instead of running a Lisp in Python.

Throughout all this time, I've been touching in with various community projects, hacking on various Lispy Things, reading, etc., but generally doing so quite quietly. Over the past few months, however, I've really gotten into it, and Lisp has become a real force in my life, rapidly playing just as dominant a role as Python.

Similarly, MindPool has become active in several Lisp projects; as such, there are a great many things to share now. However, before I begin all that, I'd like to take an opportunity to get folks up and running with an example Lisp environment.

Future posts will explore various areas of Common Lisp, Scheme dialects, I/O loops, etc., but this one will provide a basis for all future posts that relate to Common Lisp and specifically the Steel Bank implementation.

Installing SBCL
If you don't have SBCL (Steel Bank Common Lisp; a pun on its source parent, CMUCL), you need to install it:
  • For Ubuntu (12.04 LTS has 1.0.55): $ sudo apt-get install sbcl
  • Everyone else: go to the download page.
apt-get for Lisp
Next, you'll need to install Quicklisp (as you might have surmised, it's like Debian apt-get for Common Lisp). The instructions on this page will get you up and running with Quicklisp.

I like having Quicklisp available whenever I run SBCL, so after installing Quicklisp I ran the following from the sbcl prompt (and you might want to as well):
* (ql:add-to-init-file)

Readline Support
The default installation of SBCL doesn't have readline support for the REPL, so using your arrow keys won't give you the expected result (your command history). To remedy that, you can use a readline wrapper. First, install rlwrap:
  • Ubuntu: $ sudo apt-get install rlwrap
  • Mac OS X: $ brew install rlwrap
Then, create the script ~/bin/start-sbcl with the following content, and chmod it to 755 (make sure that ~/bin is in your path):
rlwrap sbcl

At which point you can run the following and have access to a command history in SBCL:
$ start-sbcl
* 

Why Steel Bank?
CMUCL gained an excellent reputation for being a highly performant, optimized implementation of Lisp. Based on CMUCL and continuing this tradition of excellent performance, SBCL's reputation preceded it. Over a range of different types of programs, SBCL not only compares favorably to other Lisp dialects, it seriously kicks ass all over.

SBCL comes in at 8th place in that benchmark ranking, beating out Go in 9th place. Of all the languages that made it into the Top 10, I've only ever touched C, C++, Java, Scala, Lisp, and Go. In my list, SBCL made the Top 5 :-) Regardless, of all of them, Lisp has the syntax I find most pleasurable. Given my background in Python, this is not surprising ;-)

What's next? 
Funny that you should ask... given my background with Twisted, I'll give you one guess ;-)


Tuesday, October 16, 2012

Browser ID/Mozilla Persona, For the Win

We've got more coming about Mozilla Persona (aka "Browser ID"), but we're busy with a thousand things (not the least of which is more posts about Twisted, new ones about Lisp, and some fun post-cloud food-for-thought goodness).

Until then, though, we wanted to say this: Mozilla Persona kicks ass. Seriously. The user experience is phenomenal. The developer experience is painless and quick. How could this possibly be authentication?!? (That was sarcasm.) This is what OpenID should have been ...

More soon!