Sep 5 2009

How to use Pharo/Squeak from the Command-line

Along the way to measure the performance of a Smalltalk implementation for commodity multi-core systems, I tried to use Pharo as a more convenient development platform. Well, and I failed in the first attempt…

To remind myself and document some of the necessary steps in this environment, I wrote up the following tutorial.

Command-line Scripts with a Headless Pharo

For some tasks like benchmarking and automated testing, an integration with other tools comes in handy. For such use cases, Pharo can be used headless, i.e., without its graphical user interface.

This brief tutorial will demonstrate how to use the Debian Language Shootout benchmarks with a fresh Pharo image.

Step 1: Setup Pharo and a Fresh Image

  • download a Pharo image, the sources file, and a VM from the download page
  • extract all archives in the same folder
  • start Pharo, from the commandline, on a MacOSX it should look like this: "Squeak 4.2.1beta1U.app/Contents/MacOS/Squeak VM Opt" \ pharo1.0-10418-BETAdev09.08.3.image

Step 2: Load OSProcess

For output on the shell, we need an extra package from the SqueakSource repository.

It can be loaded by simply executing the following code in a workspace window:

ScriptLoader loadLatestPackage: 'OSProcess' from:
'http://www.squeaksource.com/OSProcess'

To execute this code snippet, select it and press cmd+d or use the “do it” item in the context menu.

Step 3: Load Common Benchmark Code

Now we can load the common parts of all shootout benchmarks into our image.

  • On way to do this is to grab the code shown here
  • and save it to a file called common.st.
  • Open the file browser from the Menu -> Tools -> File Browser.
  • Select common.st and press filein to load the code.

Now you can close all windows in your image and save and quit it.

Step 4: Run a Benchmark

  • Grab the code of a benchmark like Fannkuch
  • Save it to a file named like fannkuch.st
  • Add a run script to the end of the code in fannkuch.st, like this:
        Tests fannkuch.
        SmalltalkImage current snapshot: false andQuit: true.
    
  • Run it with a headless Pharo:
        "Squeak 4.2.1beta1U.app/Contents/MacOS/Squeak VM Opt" \
            -headless pharo1.0-10418-BETAdev09.08.3.image \
            $PWD/fannkuch.st 6
    

    important is here the absolute path to the script, clearly inconvenient and uncommon on the command-line.

Sep 3 2009

Traits Patch Updated, Backported, and Available on GitHub

PHP 5.3 is already released for a while and starts to settle in.

Now it seems to be a good time to philosophize about the future language development of PHP again. To promote this discussion, I backported my Traits-patch to PHP 5.3, fixed some bugs, and used GitHub to publish it. Lets hope that this will allow a painless maintenance of the patch.

For the moment, there is no uptodate standalone patch file anymore, but only the GitHub repository.

For a basic introduction please refer to the RFC (http://wiki.php.net/rfc/horizontalreuse) and the test cases available in the source folder Zend/tests/traits.


Aug 3 2009

Theory and Practice of Language Implementation, part 2

The second part of the summer school was a bit more applied and more in the direction of my own interests. Chandra Krintz talked about managed runtime environments. Yannis Smaragdakis introduced multi-threaded programming and transactional memory. Sumit Gulwani as the third lecturer taught us symbolic bounds computation.

As a cherry on the cake, Oliver Danvy talked about general tips and tricks for PhD students.

Managed Runtime Environments: Implementations and Opportunities

The content-part of her lecture started with an overview of how VMs execute programs. Well, as usual, it was limited to Java, Python, and Ruby. This gives me the feeling, that there are only very few people looking into the similarities between runtimes for functional and imperative languages. She talked about stack-based VM instruction sets and briefly mentioned register-based as well as register-memory-based designs, but mainly referring to CPU instruction set architectures. Interpretation techniques, just-in-time compilation, problems associated with profiling and instrumentation had been discussed, but neither of them in detail.

One topic, she was spending more time on was her own research, i.e., communication between independent JVM instances via shared memory. It was interesting, especially her approach to share classes between the VM instances by using shared pages and direct pointers from the instances.

The last lecture was a broad overview of cloud computing, available techniques and the current challenges, but not really technical.

Multi-Threaded Programming and Transactional Memory

Yannies gave more or less a hands-on introduction to locks and threads with programming examples/exercises. From the poll in the audience, there are actually people, which never have used threads before. I thought this is just one of the assumptions professors usually get wrong, but well… From my point of view, I have heard a lot of it already before, but it was still nice to be reminded of some of the details we discussed. The primitives for monitor-style/traditional shared-memory concurrency he identified were lock/unlock, wait, signal, and broadcast. They allow building any kind of more complex critical sections on top.

The basic points he emphasized are to use while-loops on all conditions and the rule for single signals vs. broadcast signals. In short, a single signal, i.e., a notify, can only be used when all threads waiting on a condition are equal, but only one at a time can make progress. If more then one can make progress, a broadcast, i.e., a notifyAll has to be used to avoid livelocks.

After some typical examples of problems with locks, for instance, locking order leading to deadlocks, he introduced transactional memory. His focus was on the implications of this programming model, the different variations, performance questions, and the problem of irreversible operations.

Art of Invariant Generation applied to Symbolic Bound Computation

The computation of bounds for e.g. loops has a broad range of applications. The knowledge about bounds allows estimating the necessary runtime or memory utilization, thus in general resource utilization. Furthermore, it can be used to identify secrecy properties in terms of information leakage due to repetitive patterns and problems of robustness due to errors and uncertainty for instance introduced by rounding.

After a short introduction and a simple example, he introduced a countless number of approaches to this problem and explained how to apply them to specific problems. His lectures was stuffed with a lot of content, think much more then any other lecturer tried to convey. Moreover, he managed to speak with an incredible speak. Have never heard someone before, putting an infinite number of words into one minute. Sometimes less is more…

Edutainment for the Night Lecture

Oliver was giving a Tips and Tricks lecture about being a PhD student, writing and reviewing papers, and giving talks. Presented in a quite entertaining style :)


Jul 28 2009

Theory and Practice of Language Implementation, part 1

Thanks to the financial support of FWO and a grant by the summer school itself, I was able to go to Eugene, Oregon and learn about theory and practice of language implementation.

The first part of this summer school had three different topics. Ras Bodik presented highly interesting approaches for program synthesis. Oliver Danvy taught us how to use continuations and how to transform a program into continuation passing style. Matt Might tried to explain control-flow analysis(CFA) for high-order programs, starting with 0-CFA(zero-CFA) and k-CFA.

Algorithmic Program Synthesis

The program synthesis lectures were from my perspective the most comprehensible and interesting ones. Ras gave a high-level overview over different approaches to utilize expert knowledge i.e. domain insights to guide the synthesis of algorithms. He presented two different directions: derivative and selective synthesis. The derivative synthesis has its roots in the formal world. The idea is to define a problem with a formal language and to automatically derive a proof for it. As a by-product of this proof, an algorithm to solve the problem is synthesized. This approach is limited mainly by the formal skills of the programmers, and thus, was not successful in a larger scale.

Selective synthesis tries to circumvent these limitations by trying to preserve the usual tools for the programmers as the basic foundation. One example he explained in more detail was SKETCH. Here, the programmer has to describe a space of possible programs and the synthesizer can chose an exemplar of this space by testing/verifying whether it fulfills a specification, which also has to be given by the programmer. An example could be an optimized sort algorithm, which might be tailored to a specific memory model. The programmer implements the basic sketch and leaves some open holes, which needs to be synthesized. She also can give a simple bubble sort as specification, or as an alternative, a checking algorithm, whether the resulting data is sorted. He concluded his lectures with a demonstration of angelic programming, which uses the idea of sketching a solution and automatically deriving an implementation.

Continuations to Go

Oliver was the first lecturer asking us to do exercises. This was a good way to force us to really think about the content of his lectures. At first he taught us the three basic steps to transform a program to continuation-passing style (CPS): 1. name all immediate values, 2. sequentialize immediate evaluations, and 3. introduce the actual continuation. Then, he gave us some exercises, for instance to build the list of suffixes of a list and the list of prefixes. In a language supporting linked lists, the list of suffixes can be realized by sharing the data between all suffixes. The list of suffixes, thus is comprised of pointers pointing to the according element of the original list, where the suffix starts. For the list of prefixes, it is not possible to share the data, but, by implementing it in a continuation-passing style. You can share the computation of the prefixes similar to sharing the data in the case of the suffixes.

Another example, he ask to implement was a convolution of two lists. Thus, the result is a list of pairs of elements of each list, where the second list is reverse beforehand. The interesting thing here is, that it can be implemented in a recursive way with n steps of recursion for lists with n elements. The solution is simple, if you figure out that you can use the return time of a recursion to traverse the second lists, which is equal to a reverse. Later he pointed out, that an implementation of this, using CPS and defunctionalization will result in a plain usual zip/reverse like it would be done in Haskell.

Control-flow Analysis of Higher-Order Languages

Well, and then there was the control-flow analysis (CFA). At the moment, I have not figured out completely how to read the slides, since my Greek is rather limited. What I was able to take away from this course was, that there are several different techniques where 0-CFA is a robust and efficient one. In general, the k in k-CFAs denotes the amount of history, which is considered in the analysis. Usually, only 0-CFA and 1-CFA are applied in practice, i.e. in compilers, since the state space of the analysis explodes very fast. The concrete semantics of a program would be described by an infinite-CFA. He also talked about possible optimizations, like applying garbage collection, to eliminate unreachable states/possibilities during the CFA and thereby increasing the efficiency of the analysis significantly.


Jan 18 2009

New Traits Patch Ready for Testing

Finally, I managed to complete the changes for the traits patch. Since, PHP 5.3 already entered the release cycle and it is not sure whether there will be a PHP 5.4 with new language features, the new patch is written for the PHP6 codebase.

In its current version, it implements the RFC with the proposed syntax and functionality. There are still some parts missing for a complete integration into the PHP landscape. Most noticeable is the lack of reflection support at the moment. I will add it, when the patch has reached a level of stability to be included in the official language.

For testing purpose, you can use the following patch, archive with test cases.

The patch is build for cvs -d :pserver:cvsread@cvs.php.net:/repository checkout php6. More information on how to build the PHP source from CVS is available at php.net.

Patch:

In case of any problems applying the patch on the CVS checkout, you can use this code snapshot with traits: php6-traits.tar.bz2