Tuesday, April 15, 2014

Espruino: JavaScript for embedded devices

My last post was really written primarily as background for this post. TL;DR: there is a lot we're doing right as software engineers nowadays that we were screwing up badly 20 to 30 years ago.

I am moved to write this post because I've been playing with the Espruino board (having pre-ordered one during the Kickstarter campaign), which runs JavaScript close to the metal on an ARM processor.

JavaScript is a cool language because it draws upon a lot of the experience that engineers have gained over decades. Concurrent programming is hard, so JavaScript has a bullet-proof concurrency model. Every function runs undisturbed to completion, and communication is done by shared-nothing message passing. In the browser, JavaScript gracefully handles external events with unpredictable timing. That skill set is exactly what is required for embedded hardware control.
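
Here is a minimal sketch of what that event-driven style looks like in Espruino's JavaScript, using its documented setInterval and setWatch functions (LED1 and BTN1 are the board's built-in LED and button). Each callback runs to completion before the next event is handled, so no locks are needed:

    // Toggle an LED twice a second; the timer callback always runs
    // undisturbed to completion.
    var on = false;
    setInterval(function () {
      on = !on;
      digitalWrite(LED1, on);
    }, 500);

    // React to button presses, external events with unpredictable timing.
    // Only one callback ever runs at a time, so 'on' can't be corrupted.
    setWatch(function () {
      console.log("Button pressed, LED is " + (on ? "on" : "off"));
    }, BTN1, { repeat: true, edge: "rising" });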

Crockford once referred to JavaScript as "Scheme with C syntax". JavaScript has many of the features that distinguish Lisp and its derivatives, and which set apart Lisp as probably the most forward-looking language ever designed.

This video shows Gordon Williams, the creator of the Espruino board, speaking at a NodeJS conference in the UK. One of the great things he did was to write the code so it can run on several different boards. I'm particularly interested in installing it on the STM32F4 Discovery board, which costs very little and looks pretty powerful as ARM microcontrollers go. There is a YouTube playlist that includes these videos and others relating to the Espruino board.

A few decades' progress in computers and electronics

I got my bachelor's degree in EE in 1981 and went to work doing board-level design. Most circuit assembly was wire-wrap back then, and we had TTL and CMOS and 8-bit microcontrollers. Most of the part numbers I remember from my early career are still available at places like Digikey. The job of programming the microcontroller or microprocessor on a board frequently fell to the hardware designer. In some ways it was a wonderful time to be an engineer. Systems were trivially simple by modern standards, but they were mysterious to most people, so you felt like a genius to be involved with them at all.

After about 15 years of hardware engineering with bits of software development on an as-needed basis, I moved into software engineering. The change was challenging but I intuited that the years to come would see huge innovation in software engineering, and I wanted to be there to see it.

Around that time, the web was a pretty recent phenomenon. People were learning to write static HTML pages and CGI scripts in Perl. One of my big hobby projects was a Java applet. Some people talked about CGI scripts that talked to databases. When you wanted to search the web, you used AltaVista. At one point I purchased a thick book listing the websites known to exist at the time, I kid you not. Since many websites were run by individuals as hobbies, the typical lifespan of any given website was short.

Software development in the 80s and 90s was pretty hit-or-miss. Development schedules were almost completely unpredictable. Bugs were hard to diagnose. The worst bugs were the intermittent ones, things that usually worked but still failed often enough to be unacceptable. Reproducing bugs was tedious, and more than once I remember setting up logging systems to collect data about a failure that occurred during an overnight run. Some of the most annoying bugs involved race conditions and other concurrency issues.

Things are very different now. I've been fortunate to see an insane amount of improvement. These improvements are not accidents or mysteries. They are the results of hard work by thousands of engineers over many years. With several years of hindsight, I can detail with some confidence what we're doing right today that we did wrong in the past.

One simple thing is that we have an enormous body of open source code to draw upon, kernels and web servers and compilers and languages and applications for all kinds of tasks. These can be studied by students everywhere, and anybody can review and improve the code, and with rare exceptions they can be freely used by businesses. Vast new areas of territory open up every few years and are turned to profitable development.

In terms of concurrent programming, we've accumulated a huge amount of wisdom and experience. We now know which patterns work and which fail, and when I forget, I can search Stack Overflow or Google to remind myself. And we now embed that experience into the design of our languages; for instance, JavaScript uses message queues for inter-thread communication.
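
Browsers make this concrete with Web Workers, which can only talk to the main thread by posting messages. A minimal sketch (the file name worker.js is just for illustration):

    // main.js: spawn a worker and communicate only through messages.
    var worker = new Worker("worker.js");
    worker.onmessage = function (e) {
      console.log("result:", e.data);
    };
    worker.postMessage({ cmd: "square", value: 7 });

    // worker.js: shares no state at all with the main thread.
    onmessage = function (e) {
      if (e.data.cmd === "square") {
        postMessage(e.data.value * e.data.value);
      }
    };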

Testing and QA have seen huge progress over the last 20 years. Ad hoc manual tests, usually written as an afterthought by the developer, were the norm when I began my career, and many managers frowned upon "excessive" testing that we would now consider barely adequate. Now we have solid widespread expertise about how to write and manage bug reports and organize workflows to resolve them. We have test-driven development and test automation and unit testing and continuous integration. If I check in bad code today, I break the build, suffer a bit of public humiliation, and fix it quickly so my co-workers can get on with their work.

I miss the simplicity of the past, and the feeling of membership in a priesthood, but it's still better to work in a field that can have a real positive impact on human life. In today's work environment that impact is enormously more feasible.

Sunday, April 13, 2014

The whole Heartbleed thing

Lots of times, these reports are much more about theoretical possibilities than real events. The nature of this vulnerability is that attackers can get peeks at blocks of memory in servers. That memory is changing all the time as the server is doing stuff. It's a possibility that the block of memory happens to have a copy of your password or other information when the attacker grabs it.

Criminals who do these kinds of attacks act on anything they find very quickly, because they know that victims will respond quickly by changing passwords, shutting off credit cards, and so on. Attacks on any significant scale would therefore have left evidence by now, and there are places where that evidence would reliably be discussed (Bruce Schneier's blog, the EFF website). So far the only mention is that certain big government agencies probably exploited the vulnerability; criminals apparently have not. The vulnerability existed for two years before it was publicly announced.

So the NSA has your passwords, but they probably had them anyway. The big question is: what information do you feel you MUST protect? Financials: online banking, credit cards, your PayPal account, or online access to your IRA or H&R Block. Medical: your doctor's patient portal website, any logins you have with hospitals or medical centers. Dating websites? Sexual fetish websites? I think I've exhausted the limits of my paltry imagination.

Those would be good passwords to change. You probably don't need to change your Facebook password, unless you're worried that NSA employees will get drunk and trash your Farmville farm.

More information is available at http://heartbleed.com/, though it tends toward the technical. There is a list of vulnerable sites, and you can test any website for the vulnerability using the tool at http://filippo.io/Heartbleed/.

Heartbleed is a buffer over-read exploit, nicely illustrated at http://xkcd.com/1354/. You tell the server you want some information back that should be X letters long, but you claim a much larger X than the information really warrants, so the server replies with extra content from the memory that follows what you asked for. That extra content might contain passwords and other profitable secrets.

Buffer exploits have been understood for years. What is supposed to happen is that the server's software should reject the too-large X value, but this stuff is programmed by fallible humans. Here is a very good (but pretty technical) explanation of the programming mistake.
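
To make the mistake concrete, here is a toy model in JavaScript. This is not OpenSSL's actual code; the memory contents and function names are invented for illustration:

    // Toy model: pretend the heartbeat payload sits next to secrets
    // in server memory.
    var serverMemory = "bird" + " | SECRET: the admin password is hunter2";

    // Broken heartbeat: echoes back claimedLength characters, trusting
    // the attacker's length field instead of the payload's real length.
    function heartbeatBroken(claimedLength) {
      return serverMemory.substr(0, claimedLength); // no bounds check!
    }

    console.log(heartbeatBroken(4));  // "bird" -- an honest request
    console.log(heartbeatBroken(44)); // "bird | SECRET: ..." -- the over-read

    // The fix: silently discard any request whose claimed length exceeds
    // the payload actually received.
    function heartbeatFixed(payload, claimedLength) {
      if (claimedLength > payload.length) return null;
      return payload.substr(0, claimedLength);
    }

The real fix amounts to the same idea: check the claimed length against the message actually received, and discard the request if it doesn't fit.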

Wednesday, December 11, 2013

ZeroMQ solves important problems

ZeroMQ solves big problems in concurrent programming. It does this by ensuring that state is never shared between threads/processes, and the way it ensures that is by passing messages through queues dressed up as POSIX sockets. You can download ZeroMQ here.

The trouble with concurrency arises when state or resources are shared between multiple execution threads. Even if the shared state is only a single bit, you immediately run into the test-and-set problem. As more state is shared, the locks multiply and the number of possible interleavings grows exponentially. This business of using locks to identify critical sections of code and protect resources has a vast computer science literature, which tells you that it's a hard problem.

Attempted solutions to this problem have included locks, monitors, semaphores, and mutexes. Languages (like Python or Java) have assumed the responsibility of packaging these constructs. But if you've actually attempted to write multithreaded programs, you've seen the nightmare it can be. These things don't scale to more than a few threads, and the human mind is unable to consider all the possible failure modes that can arise.

Perhaps the sanest way to handle concurrency is via shared-nothing message passing. The fact that no state is shared means that we can forget about locks. Threads communicate via queues, and it's not so difficult to build a system of queues that hide their mechanics from the threads that use them. This is exactly what ZeroMQ does, providing bindings for C, Java, Python, and dozens of other languages.
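
As a tiny illustration of the style, here is a work queue as a push/pull socket pair written against the Node.js binding. This is a sketch assuming the npm "zmq" package; the address and port are arbitrary:

    // producer.js: push work items into the queue; nothing is shared
    // with the consumers except the socket itself.
    var zmq = require("zmq");
    var sender = zmq.socket("push");
    sender.bindSync("tcp://127.0.0.1:5557");
    setInterval(function () {
      sender.send("work item " + Date.now());
    }, 1000);

    // consumer.js (a separate process): pull items as they arrive.
    var zmq = require("zmq");
    var receiver = zmq.socket("pull");
    receiver.connect("tcp://127.0.0.1:5557");
    receiver.on("message", function (msg) {
      console.log("got:", msg.toString());
    });

Run several consumers and ZeroMQ round-robins the work among them, with no locks anywhere in sight.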

For decades now, programming languages have attempted to provide concurrency libraries with various strengths and weaknesses. Perhaps concurrency should have been identified as a language-neutral concern long ago. If that's the case, then the mere existence of ZeroMQ is progress.

Here are some ZeroMQ working examples. There's also a nice guide online, available in book form from Amazon and O'Reilly.

Saturday, November 23, 2013

What's up with Docker?

I've been hearing about Docker lately. Fair disclosure: I am no Docker expert. I've just tinkered with it a little so far, and this post is part of my process of getting acquainted with it. I'll probably update it if I learn anything noteworthy.

It seems to be an interesting and useful wrapper for Linux Containers, a relatively new feature of the Linux kernel. Docker tries to solve the problem of providing a uniform execution environment across development and production.

In the Python world, virtualenv and pip create a sandbox with specific version numbers for various Python packages, regardless of what may be installed globally on the machine. Docker creates a sandbox of an entire Linux environment, much like a virtual machine, but much lighter-weight than a virtual machine.

Docker instances start in fractions of a second and consume far fewer resources on your laptop or server than a virtual machine would. These days I work on a Macbook and run Linux in a virtual machine, and I'm told it is practical to run multiple Docker instances simultaneously inside that VM. So if you're somebody who's thought it would be fun to write software that runs on multiple computers on a network but you haven't had the actual computers available, Docker is for you. I'm interested in trying Redis as a distributed task queue.

Thursday, October 31, 2013

Some favorite crowd-funded projects

There is a lot going on with crowd-funding these days. Over the past year or so, it has become huge. One might venture to say that it is a significant source of innovation in science, technology and art. Obviously there will be projects that cannot be crowd-funded; it is difficult to imagine a successfully crowd-funded Mars mission or cancer cure. But the space of feasible new projects is vast, and what follow are a few of my favorite crowd-funded projects.

But first, there is a recent noteworthy development from the SEC: crowd-funding campaigns, previously allowed only to hand out rewards, will soon be able to award equity too. Until now, equity investment had been open only to accredited investors, people with buckets of spare money in their garages and garden sheds. The possibility of investing in a successful venture, rather than simply receiving a toy and a good feeling, might make the already-fascinating crowd-funding scene a much more interesting place. It could play an important role in economic recovery.

OCULUS RIFT

The Oculus Rift is a virtual-reality headset representing an enormous improvement in performance-to-price ratio. The head tracking is smooth and the graphics are good. This is one of the first crowd-funded projects I heard about, and the first one I contributed to. For $300, I got a headset with a very entertaining demo, and if I get up the energy, I will build something myself for science education.

By getting in early and having a huge success, the Oculus Rift set a precedent for big splashy projects, and probably helped Kickstarter as much as Kickstarter helped Oculus.

CastAR from TECHNICAL ILLUSIONS

CastAR is another virtual reality gadget, this time a pair of glasses that project an image onto a retroreflective surface in front of the user. One big innovation here is that the virtual reality can be mixed with actual reality, for instance using game pieces or other physical objects. Also, because the user is looking at things some distance away, eye strain is reduced. The head tracking on CastAR follows both rotation and translation, whereas the Oculus Rift only follows rotation.

BLEDUINO

This is a Bluetooth-enabled Arduino board. Arduino is a cheap, easy-to-use controller board for hobbyist projects and art installations. With Bluetooth, whatever you're building can connect to a phone or tablet.

ESPRUINO

The Espruino is another small controller board in the spirit of the Arduino, but built around an ARM microcontroller. What's unique is that it is designed to be programmed in JavaScript, a language which has been used in web browsers for a long time but has slowly been gaining momentum as a hardware control language.

CHINEASY

This is an instructional program to teach yourself Mandarin. There are flashcards and animations to learn the written characters, and audio materials to learn the spoken language.

STAR TREK: RENEGADES

If you miss the pre-J.J. Abrams Star Trek franchise, this is for you. This movie brings back Walter Koenig (Chekov from the original series) along with several actors from Star Trek: Voyager. It is set ten years after Voyager's return to human space, and politics and hilarity ensue.

PEBBLE SMARTWATCH

Another big success story, the Pebble can now be purchased for $150 at Best Buy. It connects to your phone over Bluetooth, so the phone can stay in your pocket while the watch runs apps on a very small screen. It works with both Android phones and iPhones. It has a magnetometer (compass), a three-axis accelerometer, an ambient light sensor, a 144x168-pixel screen, and a week of battery life between charges.



My long list below includes some projects that were already funded and have gained significant fame, like the Oculus Rift virtual reality headset, or the Pebble smartwatch now available at Best Buy.

Random projects

Phone and tablet

Electronics and computers

Robots and Flying Things

Gaming

Maker stuff

Here's an interesting list of crowd-funding resources: http://crowdfundingforum.com/showthread.php/6748-List-of-All-Crowdfunding-Sites-and-Platforms-Ever-Expanding

The shortened URL for this post is http://goo.gl/ehfBQb.

Friday, October 25, 2013

Bar Camp Boston 2013 talk on automation of science

This is an outline for a talk I gave at Bar Camp Boston 8 on the automation of science. It's a topic I've blogged and spoken about before. The shortened URL for this post is http://goo.gl/rv3Xik.

In 2004, a robot named Adam became the first machine in history to discover new scientific knowledge independently of its human creators. Without human guidance, Adam can create hypotheses to explain observations, design experiments to test those hypotheses, run the experiments using laboratory robotics, interpret the experimental results, and repeat the cycle to generate new knowledge. The principal investigator on the Adam project was Ross King, now at Manchester University, who published a paper on the automation of science (PDF) in 2009. Some of his other publications: 1, 2, 3.

Adam works in a very limited domain, in nearly complete isolation. There is plenty of laboratory automation but (apart from Adam) we don't yet have meaningful computer participation in the theoretical aspect of scientific work. A worldwide scientific collaboration of human and computer theoreticians working with human and computer experimentalists could advance science and medicine and solve human problems faster.

The first step is to formulate a linked language of science that machines can understand. Publish papers in formats like RDF/Turtle, JSON, JSON-LD, or YAML. Link scientific literature to existing semantic networks (DBpedia, Freebase, Google Knowledge Graph, LinkedData.org, Schema.org, etc.). Create schemas for scientific domains and for the scientific method itself (hypotheses, predictions, experiments, data). Provide tutorials, tools and incentives to encourage researchers to publish machine-tractable papers. Create a distributed graph or database of these papers, filling the role of scientific journals, accessible to people and machines everywhere. Maybe use Stack Overflow as a model for peer review.
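
As a sketch of what a machine-tractable claim might look like, here is a hypothesis-experiment record in JSON-LD built from a JavaScript object. The vocabulary, property names, and URLs are invented for illustration; no such schema exists yet:

    // A hypothesis, its prediction, and the experiment that tests it,
    // expressed as a JSON-LD document. The @context URL, types, and
    // property names are hypothetical, not a real published schema.
    var record = {
      "@context": "http://example.org/science-vocab/",
      "@type": "Hypothesis",
      "statement": "Compound X inhibits enzyme Y in yeast",
      "prediction": {
        "@type": "Prediction",
        "observable": "enzyme Y activity after adding compound X",
        "expected": "significantly reduced"
      },
      "testedBy": {
        "@type": "Experiment",
        "protocol": "http://example.org/protocols/enzyme-assay",
        "dataset": "http://example.org/data/run-42.csv"
      }
    };
    // Anything that can parse JSON can now read the claim and follow the links.
    console.log(JSON.stringify(record, null, 2));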

Begin with very limited scientific domains (high school physics, high school chemistry) to avoid the full complexity and political wrangling of the professional scientific community in the initial stages. As this stuff approaches readiness for professional work, deploy it first in the domain of computer science and other scientific domains where it can hope to avoid overwhelming resistance.

Machine learning algorithms (clustering, classification, regression) can find patterns in data and help to identify useful abstractions. Supervised learning algorithms can provide tools of collaboration between people and computers.

The computational chemistry folks have a cool little program called Babel which translates between a large number of different file formats for representing molecular structures. It does this with a rich internal representation of structures, and pluggable read and write modules for each file format. At some point, something like this for different file formats of scientific literature might become useful, and might help to build consensus among different approaches.


Linked patient data would be a treasure trove. In the United States this is problematic because of the privacy restrictions associated with HIPAA regulation. Countries with universal health care, like Iceland and Norway, may have no equivalent of HIPAA, and those would be good places to initiate a Linked Patient Data project.

Thursday, October 17, 2013

The first neon sign I've ever wanted to own

This sign appears in the Cambridge UK office of Autonomy Corporation. I want one. I need to talk to the people who make neon signs. There are a few online threads (1, 2) where people express curiosity about this sign.

This equation is Bayes' Law. Thomas Bayes (1701-1761) proposed it as a way to update one's beliefs based on new information. I saw this picture in a blog post by Allen Downey, author of Think Bayes, whom I recently had the pleasure of meeting briefly at a Boston Python meetup. Very interesting guy, also well versed in digital signal processing, another interest we share. Before the other night, I probably hadn't heard the word "cepstrum" in almost twenty years.
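
For reference, the equation on the sign is the standard statement of Bayes' Law:

    \[ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \]

In words: the probability of hypothesis A given new evidence B equals the likelihood of that evidence if A were true, times the prior probability of A, divided by the overall probability of the evidence.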

Allen's blog is a cornucopia of delicious problems involving Bayes' Law and other statistical delights that I learned to appreciate while taking 6.432, an MIT course on detection and estimation that I'm afraid may have been retired. The online course materials they once posted for it have been taken down.

But imagine my satisfaction upon looking over Think Bayes and realizing that it is the missing textbook for that course! I haven't checked to see that it covers every little thing that was in 6.432, but it definitely covers the most important ideas. At a quick glance, I don't see much about vectors as random variables, but I think he's rightly more concerned with getting the ideas out there without the intimidation of extra mathematical complexity.