Wednesday, December 11, 2013

ZeroMQ solves important problems

ZeroMQ solves big problems in concurrent programming. It does this by ensuring that state is never shared between threads or processes; instead, threads communicate by passing messages through queues dressed up as POSIX-style sockets. ZeroMQ is a free download from the project website, zeromq.org.

The trouble with concurrency arises when state or resources are shared among multiple execution threads. Even if the shared state is only a single bit, you immediately run into the test-and-set problem. As more state is shared, the locks proliferate and the possible interleavings grow combinatorially. This business of using locks to mark critical sections of code and protect resources has a vast computer-science literature, which tells you that it's a hard problem.

Attempted solutions to this problem have included locks, monitors, semaphores, and mutexes. Languages (like Python or Java) have assumed the responsibility of packaging these constructs. But if you've actually attempted to write multithreaded programs, you've seen the nightmare it can be. These things don't scale to more than a few threads, and the human mind is unable to consider all the possible failure modes that can arise.

Perhaps the sanest way to handle concurrency is via shared-nothing message passing. The fact that no state is shared means that we can forget about locks. Threads communicate via queues, and it's not so difficult to build a system of queues that hide their mechanics from the threads that use them. This is exactly what ZeroMQ does, providing bindings for C, Java, Python, and dozens of other languages.
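The shape of this is easy to sketch with nothing but Python's standard library; the `queue` module below stands in for ZeroMQ's sockets, which generalize the same pattern across processes and machines:

```python
# Shared-nothing concurrency sketch using only the standard library.
# ZeroMQ generalizes this pattern: the "queue" becomes a socket that
# can cross process and machine boundaries.
import threading
import queue

def worker(inbox, outbox):
    # The worker owns no shared state; it only reads messages from
    # its inbox and writes results to its outbox.
    while True:
        msg = inbox.get()
        if msg is None:          # sentinel: shut down
            break
        outbox.put(msg * msg)

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()

for i in range(5):
    inbox.put(i)
inbox.put(None)
t.join()

results = [outbox.get() for _ in range(5)]
print(results)  # [0, 1, 4, 9, 16]
```

No locks appear anywhere: the queues do all the synchronization, which is exactly the point.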

For decades now, programming languages have attempted to provide concurrency libraries with various strengths and weaknesses. Perhaps concurrency should have been identified as a language-neutral concern long ago. If that's the case, then the mere existence of ZeroMQ is progress.

Here are some ZeroMQ working examples. There's also a nice guide online, available in book form from Amazon and O'Reilly.

Saturday, November 23, 2013

What's up with Docker?

I've been hearing about Docker lately. Fair disclosure: I am no Docker expert. I've just tinkered with it a little so far; this post is part of my process of getting acquainted with it, and I'll probably update it if I learn anything noteworthy.

It seems to be an interesting and useful wrapper for Linux Containers, a relatively new feature of the Linux kernel. Docker tries to solve the problem of providing a uniform execution environment across development and production.

In the Python world, virtualenv and pip create a sandbox with specific version numbers for various Python packages, regardless of what may be installed globally on the machine. Docker creates a sandbox of an entire Linux environment, much like a virtual machine, but much lighter-weight than a virtual machine.

Docker instances start in a fraction of a second and consume far fewer resources on your laptop or server than a virtual machine would. These days I work on a Macbook and run Linux in a virtual machine, and I'm told it's practical to run multiple Docker instances simultaneously inside that VM. So if you've thought it would be fun to write software that runs on multiple computers on a network, but you haven't had the actual computers available, Docker is for you. I'm interested in trying Redis as a distributed task queue across a few such instances.

Thursday, October 31, 2013

Some favorite crowd-funded projects

There is a lot going on with crowd-funding these days. Over the past year or so, it has become huge. One might venture to say that it is a significant source of innovation in science, technology and art. Obviously there will be projects that cannot be crowd-funded; it is difficult to imagine a successfully crowd-funded Mars mission or cancer cure. But the space of feasible new projects is vast, and what follow are a few of my favorite crowd-funded projects.

But first, there is a recent noteworthy development from the SEC: previously allowed only to hand out rewards, crowd-funding campaigns will soon be able to award equity too. Equity had thus far been only for accredited investors, people with buckets of spare money in their garages and garden sheds. The possibility of investing in a successful venture rather than simply receiving a toy and a good feeling might make the already-fascinating crowd-funding scene a much more interesting place. It could play an important role in economic recovery.


The Oculus Rift is a virtual-reality headset representing an enormous improvement in performance-to-price ratio. The head tracking is smooth and the graphics are good. This is one of the first crowd-funded projects I heard about, and the first one I contributed to. For $300, I got a headset with a very entertaining demo, and if I get up the energy I will do something myself with science education.

By getting in early and having a huge success, the Oculus Rift set a precedent for big splashy projects, and probably helped Kickstarter as much as Kickstarter helped Oculus.


CastAR is another virtual-reality gadget, this time a pair of glasses that project an image onto a retroreflective surface in front of the user. One big innovation here is that the virtual reality can be mixed with actual reality, for instance using game pieces or other physical objects. Also, because the user is looking at things some distance away, eye strain is reduced. The head-tracking on CastAR follows both rotation and translation, whereas the Oculus Rift only follows rotation.


This is a Bluetooth-enabled Arduino board. Arduino is a cheap easy-to-use controller board for hobbyist projects and art installations. With Bluetooth, whatever you're building can connect to a phone or tablet.


The Espruino is another Arduino-like controller board. What's unique is that it is programmed in JavaScript, a language that has been used in web browsers for a long time but has slowly been gaining momentum as a hardware-control language.


This is an instructional program to teach yourself Mandarin. There are flashcards and animations to learn the written characters, and audio materials to learn the spoken language.


If you miss the pre-J.J.-Abrams Star Trek franchise, this is for you. This movie brings back Walter Koenig (Chekov from the original series) along with several actors from Star Trek: Voyager. It is set ten years after Voyager's return to human space, and politics and hilarity ensue.


Another big success story, the Pebble can now be purchased for $150 at Best Buy. It connects to your phone over Bluetooth, so the phone can stay in your pocket most of the time, and it runs apps on a very small screen. It has a magnetometer (compass), a three-axis accelerometer, ambient light sensors, a 144x168-pixel screen, and a week of battery life between charges.

My long list below includes some projects that were already funded and have gained significant fame, like the Oculus Rift virtual reality headset, or the Pebble smartwatch now available at Best Buy.

Random projects

Phone and tablet

Electronics and computers

Robots and Flying Things


Maker stuff

Here's an interesting list of crowd-funding resources:

The shortened URL for this post is

Friday, October 25, 2013

Bar Camp Boston 2013 talk on automation of science

This is an outline for a talk I gave at Bar Camp Boston 8 on the automation of science. It's a topic I've blogged and spoken about before. The shortened URL for this post is

In 2004, a robot named Adam became the first machine in history to discover new scientific knowledge independently of its human creators. Without human guidance, Adam can create hypotheses to explain observations, design experiments to test those hypotheses, run the experiments using laboratory robotics, interpret the experimental results, and repeat the cycle to generate new knowledge. The principal investigator on the Adam project was Ross King, now at Manchester University, who published a paper on the automation of science (PDF) in 2009.

Adam works in a very limited domain, in nearly complete isolation. There is plenty of laboratory automation but (apart from Adam) we don't yet have meaningful computer participation in the theoretical aspect of scientific work. A worldwide scientific collaboration of human and computer theoreticians working with human and computer experimentalists could advance science and medicine and solve human problems faster.

The first step is to formulate a linked language of science that machines can understand. Publish papers in formats like RDF/Turtle, JSON, JSON-LD, or YAML. Link the scientific literature to existing semantic networks (DBpedia, Freebase, the Google Knowledge Graph, etc.). Create schemas for scientific domains and for the scientific method itself (hypotheses, predictions, experiments, data). Provide tutorials, tools and incentives to encourage researchers to publish machine-tractable papers. Create a distributed graph or database of these papers, filling the role of scientific journals, accessible to people and machines everywhere. Maybe use Stack Overflow as a model for peer review.
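As a toy illustration, here is what a machine-tractable fragment of a paper might look like. The schema, URIs, and field names are entirely invented, and the yeast example is only loosely in the spirit of Adam's work:

```python
# A hypothetical machine-readable fragment of a scientific paper,
# expressed as JSON-LD-style data. The vocabulary (URIs and field
# names) is invented for illustration -- no such standard exists yet.
import json

paper = {
    "@context": {"sci": "http://example.org/science-schema#"},
    "@id": "http://example.org/papers/1234",
    "sci:hypothesis": "Gene X encodes the missing transaminase",
    "sci:prediction": "A knockout strain will fail to grow without lysine",
    "sci:experiment": {
        "sci:method": "auxotrophic growth assay",
        "sci:organism": "Saccharomyces cerevisiae",
    },
    "sci:result": "Growth observed only with lysine supplement",
}

# Round-trip through serialization, as a journal or database would.
doc = json.dumps(paper, indent=2)
parsed = json.loads(doc)
print(parsed["sci:hypothesis"])
```

The point is not this particular vocabulary but that hypotheses, predictions, experiments, and results become addressable data rather than prose.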

Begin with very limited scientific domains (high-school physics, high-school chemistry) to avoid the full complexity and political wrangling of the professional scientific community in the initial stages. As the tools approach readiness for professional work, deploy them first in computer science and other domains where they can hope to avoid overwhelming resistance.

Machine learning algorithms (clustering, classification, regression) can find patterns in data and help to identify useful abstractions. Supervised learning algorithms can provide tools of collaboration between people and computers.

The computational chemistry folks have a cool little program called Babel which translates between a large number of different file formats for representing molecular structures. It does this with a rich internal representation of structures, and pluggable read and write modules for each file format. At some point, something like this for different file formats of scientific literature might become useful, and might help to build consensus among different approaches.
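That architecture is worth pinning down: one rich internal representation, with pluggable readers and writers registered per format, so adding a format costs one plugin rather than N pairwise converters. Here's a sketch (the formats and fields are invented for illustration):

```python
# Sketch of a Babel-style translation hub. Every format plugs in a
# reader and/or writer against one rich internal representation.
# The "simple" and "counts" formats are invented for illustration.
from collections import Counter

class Molecule:
    """The rich internal representation."""
    def __init__(self, name, atoms):
        self.name = name
        self.atoms = atoms   # e.g. ["C", "H", "H", "H", "H"]

READERS, WRITERS = {}, {}

def reader(fmt):
    def register(fn):
        READERS[fmt] = fn
        return fn
    return register

def writer(fmt):
    def register(fn):
        WRITERS[fmt] = fn
        return fn
    return register

@reader("simple")
def read_simple(text):
    name, atoms = text.split(":")
    return Molecule(name, atoms.split(","))

@writer("counts")
def write_counts(mol):
    counts = Counter(mol.atoms)
    return mol.name + " " + " ".join(
        f"{a}{n}" for a, n in sorted(counts.items()))

def convert(text, src, dst):
    # Any reader can feed any writer through the internal form.
    return WRITERS[dst](READERS[src](text))

print(convert("methane:C,H,H,H,H", "simple", "counts"))  # methane C1 H4
```

For scientific literature, the internal representation would be a graph of hypotheses, experiments, and results instead of a molecule, but the plugin shape is the same.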

A treasure trove would be available in linked patient data. In the United States this is problematic because of the privacy restrictions of the HIPAA regulations. Countries with universal health care, like Iceland and Norway, may have no direct equivalent of HIPAA, and those could be good places to initiate a linked-patient-data project.

Thursday, October 17, 2013

The first neon sign I've ever wanted to own

This sign appears in the Cambridge UK office of Autonomy Corporation. I want one. I need to talk to the people who make neon signs. There are a few online threads (1, 2) where people express curiosity about this sign.

This equation is Bayes' Law. Thomas Bayes (1701-1761) proposed it as a way to update one's beliefs based on new information. I saw this picture on a blog post by Allen Downey, author of Think Bayes, whom I recently had the pleasure of meeting briefly at a Boston Python meetup. Very interesting guy, also well versed in digital signal processing, another interest of mine. Before the other night, I probably hadn't heard the word "cepstrum" in almost twenty years.
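For reference, the equation on the sign is Bayes' theorem, which in its usual form reads:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

Read it as: the posterior probability of hypothesis A given evidence B equals the likelihood of the evidence under A, times the prior for A, normalized by the overall probability of the evidence.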

Allen's blog is a cornucopia of delicious problems involving Bayes' Law and other statistical delights that I learned to appreciate while taking 6.432, an MIT course on detection and estimation that I'm afraid may have been retired. The online course materials they once posted for it have been taken down.

But imagine my satisfaction upon looking over Think Bayes and realizing that it is the missing textbook for that course! I haven't checked to see that it covers every little thing that was in 6.432, but it definitely covers the most important ideas. At a quick glance, I don't see much about vectors as random variables, but I think he's rightly more concerned with getting the ideas out there without the intimidation of extra mathematical complexity.

Thursday, May 30, 2013

Still plugging away on the FPGA synthesizer

I really bit off a good deal more than I could chew by trying to get that thing running as quickly as I did. A lot of what I'm doing now is going back over minute assumptions about what should work in real hardware, trying to get MyHDL's simulator to agree with Xilinx's ISE simulator (ISE Sim doesn't like datapaths wider than 32 bits) and trying to get the chip to agree with either of them. The chip seems to have a mind of its own. Very annoying.

Anyway I've moved this stuff into its own Github repository so you can show it to your friends and all stand around mocking it without the distraction of other software I've written over the years. So, for as long as it still doesn't work (and with, I hope, the good grace to do it behind my back), y'all can continue with that mocking. Once it actually does what it's supposed to do, all mocking must of course cease.

Saturday, May 18, 2013

My FPGA design skills are a little rustier than I thought

Today I'm going to Makerfaire in the Bay Area. I'd had an idea percolating in my head to use an FPGA to implement fixed-point equivalents of the analog music synthesizer modules of the 1970s, and gave myself a couple of weeks to design and build a simple synthesizer. I'd been a synthesizer enthusiast in high school and college, having attended high school with the late David Hillel Wilson and had many interesting discussions with him about circuit design for synthesizers, a passion he shared with his father. While he taught me what he knew about synthesizers, I taught him what I knew about electronics, and we both benefitted.

Now I have to confess that since my switch to software engineering in the mid-90s, I haven't really done that much with FPGAs, but I've fooled around a couple of times with Xilinx's ISE WebPack software and stumbled across MyHDL, which dovetailed nicely with my long-standing interest in Python. So I ordered a Papilio board and started coding up Python which would be translated into Verilog. My humble efforts appear on Github.
There was a lot of furious activity over the two weeks before Makerfaire, which I hoped would produce something of interest, and I learned some new things, such as how delta-sigma DACs work. Being an impatient reader, I designed my delta-sigma DAC from scratch and ended up diverging from how it's usually done. My design maintains a register holding an estimate of the capacitor voltage on the RC lowpass driven by the output bit, and updates that register (requiring a multiplier, because of the exp(-dt/RC) term) as it supplies bits. It works, but it has a failure mode: small audible high-frequency artifacts, particularly when the output voltage is close to its minimum or maximum. On the long boring flight out I had plenty of time to think about that failure mode, and it seems to me the classic delta-sigma design would almost certainly suffer from it too. I think it could be reduced by injecting noise to break up the repetitive patterns that appear in the bitstream.
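Here is a rough software model of that scheme, just to pin down the idea. The parameter values are arbitrary and this is not the actual FPGA code:

```python
# Software model of the DAC idea described above: keep a running
# estimate of the RC-lowpass capacitor voltage, output a 1 whenever
# the estimate is below the target, and decay the estimate toward the
# driven rail using the exp(-dt/RC) term. Parameters are arbitrary.
import math

def dac_bits(target, nbits, dt_over_rc=0.05):
    """Yield the output bitstream for a constant target in [0, 1]."""
    alpha = math.exp(-dt_over_rc)   # per-step decay of the RC estimate
    estimate = 0.0
    for _ in range(nbits):
        bit = 1 if estimate < target else 0
        # The capacitor relaxes toward the rail the bit drives (0 or 1).
        estimate = alpha * estimate + (1.0 - alpha) * bit
        yield bit

bits = list(dac_bits(0.25, 2000))
density = sum(bits) / len(bits)
print(density)  # close to the 0.25 target
```

Running it also makes the failure mode visible: for a constant target the bitstream settles into a short repeating pattern, which is exactly the kind of periodicity that turns into an audible tone, and which injected noise would break up.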

I like Python a lot, but I'm not sure I'm going to stay with the MyHDL approach. As I learn a little more about Verilog, it seems like a better idea to design directly in Verilog. Studying MyHDL's output, the language doesn't look that difficult, and while books on Verilog tend to be expensive, some are affordable on the Kindle and a couple of others are affordable in paper form.

MyHDL-translated designs do not implement Verilog modularity well, and I think it would be good to build up a library of Verilog modules in which I have high confidence. MyHDL's simulation doesn't always completely agree with what the Xilinx chip will do. And while the MyHDL documentation talks a lot about how great it is to write tests in Python, the Verilog language itself provides substantial support for testing. Verilog supports signed integers, but as far as I've seen, MyHDL doesn't (this is INCORRECT, please see addendum below), and for the fixed-point math in the synth modules, that alone would have steered me toward straight Verilog a lot sooner had I been aware of it.

It appears the world of Verilog is much bigger and much more interesting than I'd originally thought. I've started to take a look at GPL Cver, a Verilog interpreter that (I think) has debugger-like functions of setting breakpoints and single-stepping your design. I had been thinking about what features I'd put into a Verilog interpreter if I were writing one, and a little googling showed me that such a thing already existed. So I look forward to tinkering with CVer when I get home from Makerfaire.

EDIT: Many thanks to Jan Decaluwe, the developer of MyHDL, for taking the time to personally respond to the challenges I encountered with it. Having had a couple of days to relax after the hustle and bustle of Makerfaire, and get over the disappointment of not getting my little gadget working in time, I can see that I was working in haste and neglected to give MyHDL the full credit it deserves. At the very least it explores territory that is largely uncharted, bringing modern software engineering to the HDL world where (like all those computational chemists still running Fortran code) things have tended to lag behind the times a bit.

In my haste, I neglected the documentation specifically addressing signed arithmetic in MyHDL. I didn't take the time to read the docs carefully. As Jan points out in his writings and in the comment to this blog, MyHDL's approach to signed arithmetic is in fact simpler and more consistent than that of Verilog. What does signed arithmetic look like in MyHDL? It looks like this.

    >>> x = Signal(intbv(0)[8:])
    >>> x.next = -1
    Traceback (most recent call last):
        ...blah blah blah...
    ValueError: intbv value -1 < minimum 0

    # CORRECT, range is from min to max-1 inclusive
    >>> x = Signal(intbv(0, min=-128, max=128))
    >>> x.next = -1      # happy as a clam

In the case where MyHDL's behavior appeared to diverge from that of the physical FPGA: my numerically-controlled amplifier circuit uses one of the hardware multipliers in the XC3S500E, which multiplies two 18-bit unsigned numbers to produce a 36-bit unsigned product. When my music synthesizer was at one point unable to make any sound, I tracked the problem down to the amplifier circuit, which was working fine in simulation. There was already a hardware multiplier working in the delta-sigma DAC. I poked at things with a scope probe, scratched my head, studied my code and other people's code, and ultimately determined that I needed to latch the factors in registers just before the multiplier. Whether that was exactly the fix, I still can't say, but the amp circuit finally worked correctly.
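In Verilog terms, the fix amounted to something like the following sketch. The signal names are mine, and this is an illustration of the idea rather than my actual synthesizer code:

```verilog
// Register the factors before feeding the hardware multiplier.
// Signal names are invented; this sketches the fix, not the
// actual synthesizer code.
module latched_mult (
    input             clk,
    input      [17:0] a_in,
    input      [17:0] b_in,
    output reg [35:0] product
);
    reg [17:0] a_reg, b_reg;

    always @(posedge clk) begin
        a_reg   <= a_in;            // latch the factors first...
        b_reg   <= b_in;
        product <= a_reg * b_reg;   // ...then multiply registered values
    end
endmodule
```

Registering the inputs gives the multiplier a full clock period of stable operands, which is also how Xilinx recommends using the dedicated multiplier blocks for speed.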

I wrongly concluded that it indicated some fault in MyHDL's veracity as a simulator. If it didn't work in the chip, it shouldn't have worked in simulation. But with more careful thought I can see that it's really an idiosyncrasy of the FPGA itself, or perhaps the ISE Webpack software. I would expect to run into the same issue if I'd been writing in Verilog. I might have seen it coming if I'd done post-layout simulation in Webpack, and I should probably look at doing that. Once the bits are actually inside the chip, you can only see the ones that appear on I/O pins.

Monday, March 04, 2013

The Digi-Comp 1 rides again?

As computer people go, I'm rather an old fart, and my favorite childhood toy was this plastic computer, the Digi-Comp 1. See the three horizontal red things that run almost the full width? Those are flipflops, and the window on the left shows whether they are in the zero or one position. The six vertical metal bars in front are AND gates, and the little white tubes stuck onto the pegs on the fronts of the flipflops tell whether that bit is factored into the AND term. The six red plastic things on the top, together with similar stuff on the back, form three OR gates, which drive the values of the flipflops on the next clock edge. The two white sliders on the bottom worked in opposition, providing a hand-powered two-phase clock to drive all this stuff.

Over the past couple of days I placed an order with danger!awesome, a laser cutter shop in Cambridge MA. They have a nifty collection of laser cutters and were happy to hear that design files are available on the Thingiverse website. So I ordered some stuff and picked it up this evening, and that was fun. I had hoped they could make me this marble binary adder, but the designer didn't supply design files they could use. So no marble adding machine for me. Darn.
Thinking about that, my mind inevitably went back to the Digi-Comp 1. I started wondering whether I could build a Digi-Comp 1 using laser cut plywood, like the other trinkets I picked up this evening (a Companion Cube, a desktop trebuchet, a Shrimpbot, and a few little animals). Could that be feasible? The Digi-Comp 1 was basically a programmable logic array, which consists of two rectangular regions, one for AND gates and one for OR gates. On the Digi-Comp 1 these are respectively the front surface and the back surface of the device.

I thought about this for a while and came up with some very incomplete rough sketches to solve the problems of how the gates would work and how the binary values should be latched for one clock cycle. As with the Digi-Comp 1, this would be a rectangular thing with the AND plane on the front and the OR plane on the back. The flipflops would be horizontal bars along the front with two positions (left=0, right=1) and possibly the same window display that appears on the original Digi-Comp 1. The AND gates are vertical bars also on the front, connecting to vertical bars on the back. The OR bars on the back can move left and right if permitted. At a certain point in the clock cycle, the horizontal position of each OR bar is inverted with a little lever and latched as the new position for the corresponding flipflop. There is still a lot of mechanical engineering to think about. Should the bars be retracted with springs or rubber bands? There needs to be a lot of machinery to get everything to move when and where it's supposed to, and there needs to be a crank on the side to drive it. So there will be cams and gears and all sorts of fun stuff.
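The logic itself is easy to model in software. Here's a sketch of a Digi-Comp-style PLA; the plane encoding is my own invention, and I've programmed it here as a 3-bit binary up-counter:

```python
# Model of a Digi-Comp-style programmable logic array: each AND term
# senses chosen flipflop values (true or complemented), and each
# flipflop's next state is the OR of chosen terms. The encoding is
# my own; the planes below are programmed as a 3-bit up-counter.

# Each AND term: (flipflops that must be 1, flipflops that must be 0)
AND_TERMS = [
    (set(), {0}),     # t0: !b0
    ({1}, {0}),       # t1: b1 & !b0
    ({0}, {1}),       # t2: !b1 & b0
    ({2}, {1}),       # t3: b2 & !b1
    ({2}, {0}),       # t4: b2 & !b0
    ({0, 1}, {2}),    # t5: b0 & b1 & !b2
]

# OR plane: which terms feed each flipflop's next value
OR_PLANE = {
    0: {0},           # b0' = !b0
    1: {1, 2},        # b1' = b1 XOR b0
    2: {3, 4, 5},     # b2' = b2 XOR (b1 & b0), as a sum of products
}

def clock(state):
    """One pull of the sliders: evaluate the AND plane, then the OR plane."""
    terms = [all(state[i] for i in ones) and not any(state[i] for i in zeros)
             for ones, zeros in AND_TERMS]
    return [any(terms[t] for t in OR_PLANE[b]) for b in range(3)]

state = [False, False, False]
seen = []
for _ in range(8):
    seen.append(sum(bit << i for i, bit in enumerate(state)))
    state = clock(state)
print(seen)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Three flipflops and six AND bars is exactly the capacity of the original toy, so anything this model can do, the plywood version should be able to do too.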

UPDATE: I found a place called Minds-On Toys that is selling a Digi-Comp kit which reproduces the exact mechanical design of the original. Looks very nice, except for the labor-intensive-looking bit at the bottom about fabricating your own plastic tubes.

Saturday, February 02, 2013

Ruby and Rails and all that stuff

At the suggestion of a recruiter, I'm learning Ruby on Rails this weekend. I was active on the comp.lang.python mailing list when Matz came around talking about Ruby. It seemed like a good thing, but I mostly ignored Ruby for years because it seemed to be solving problems that I already had solutions for with Python. Likewise, Rails seemed to retread the same ground already covered by Django.

Motivated to take another look because of the wild popularity of Rails, I see there's something in Ruby that deserves attention: blocks, which are anonymous closures, roughly what Lispers would call lambda expressions. Any language that makes closures first-class objects is a good language. There are a ton of good Ruby tutorials; I myself am partial to Ruby in Twenty Minutes. There is also a good Rails tutorial (and I'm sure there are several others). Another notable thing in the Ruby community is RDoc, an unusually good documentation tool.
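A tiny example of what makes blocks and closures attractive (this is generic Ruby, not Rails code):

```ruby
# A block is an anonymous closure passed to a method. Here
# make_counter returns a lambda that closes over `count`, so the
# state lives in the closure rather than in any object or global.
def make_counter
  count = 0
  lambda { count += 1 }
end

c = make_counter
3.times { c.call }
puts c.call          # prints 4

# Blocks also drive iteration: map yields each element to the block.
squares = [1, 2, 3].map { |n| n * n }
p squares            # prints [1, 4, 9]
```

That `3.times { ... }` idiom, where even a loop is a method taking a block, is the flavor that pervades the whole language.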

I'll be installing Ruby and Rails on an Ubuntu 12.04 machine. I don't like how old the packages are in the official Ubuntu repository, so I'll install from Ruby websites instead. The first thing to do is install RVM with these two commands:
$ curl -L | bash -s stable --ruby
$ source /home/wware/.rvm/scripts/rvm
Later I reversed my decision about the official repositories, when I discovered I could install "ruby1.9.3". What RVM offers is an ability to run multiple Ruby environments on the same machine, like Python's virtualenv.

So now Ruby is installed and you can type "irb" to begin an interactive Ruby session. Next install Rails, and some additional things you'll need soon:
$ gem install -V rails
$ sudo apt-get install libsqlite3-dev nodejs nodejs-dev
Now you can jump to step 3.2 of the Rails tutorial and you should be good to go. Or you can go to the Github repository (README) which I cloned from the folks, and that's where I'm going to be tinkering for a while.

When debugging Rails controller code, you'll want to uncomment the "gem 'debugger'" line in your Gemfile, insert a call to "debugger" into your code, and then reload the page in your browser. That stops the development server and puts you into an interactive shell with all the variables available in mid-request. You also get GDB-style commands like "step", "next", and "continue", plus breakpoints.

When you're ready to deploy, consider Heroku, a Rails hosting service that lets you use one virtual machine for free. I've deployed my Zombie Twitter app there, and after a few initial bumps, things have gone pretty smoothly.


Saturday, January 26, 2013

Setting up an RDF server in VirtualBox, part 2

This is the second part of a two-part (maybe N-part?) series about setting up an RDF server in a VirtualBox instance with the Ubuntu 12.04 server distribution. In this part, I'll set up Mediawiki as a place to conveniently edit RDF/Turtle documents.

You might be thinking, what about Semantic Mediawiki? Doesn't this already exist? My experience with SMW was disappointing. The source syntax for creating links is pretty straightforward, and the silly naming scheme for importing external ontologies doesn't seem too bad. But when you want to do any real work with external ontologies, it gets difficult. After a few days of hacking around I couldn't find a way to say that a predicate defined in my SMW instance was owl:sameAs some predicate defined externally. At that point, I decided to strike out on my own.
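For the record, the statement I wanted to make is a one-liner in plain Turtle. The predicate names below are hypothetical:

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix my:  <http://example.org/mywiki/ontology#> .
@prefix ext: <http://example.org/external/ontology#> .

# Assert that my locally defined predicate means the same thing
# as a predicate defined in an external ontology.
my:hasAuthor  owl:sameAs  ext:author .
```

That a days-long fight with SMW comes down to one triple is exactly why I'd rather edit Turtle documents directly in the wiki.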

The results of that effort are on GitHub at The setup script is to be run in the VirtualBox instance after the Ubuntu 12.04 server installation (with LAMP and SSH servers enabled) has completed.

This is a work in progress. When I've got it beaten into presentable shape, I'll put it up at with more explanatory material.