I have posted a near final draft (in PDF and PS format) of my paper on the semantics of Cat at http://www.cat-language.com/paper.html, and I wanted to make a public request for comments.
Here is the abstract:
Cat: A Typed Functional Stack Based Language
Stack-based application languages, such as Forth and Postscript, traditionally have provided little or no support for functional programming and have lacked a formal static type system. This changed when Manfred von Thun introduced Joy, a functional stack-based language, and Stephan Becher provided an implementation of StrongForth, a dialect of Forth with an informal type system. This paper introduces the semantics and type system for a pure functional programming language which is entirely stack based.
There is now a new version of the Cat interpreter available for download at http://www.cat-language.com/download.html. I’ve also updated the manual.
Type annotations have been changed to resemble ML more. The new version of the interpreter however doesn’t do type checking or type inference anymore because the type system was rebuilt. Type checking will be reintroduced in version 2.0.
There are hundreds of tests now, and many of the old primitives have been moved into the standard library.
There is now also a document which discusses portability. I am trying to anticipate other people writing their own implementations, and I want to make sure that Cat programs will have some degree of portability between them.
I would welcome any comments or suggestions about features or the documentation. Cat can be a bit of brain-teaser at first, so I want to make sure that I don’t confuse people.
I have been intrigued by two typed-combinators which I haven’t noticed elsewhere that I am considering adding to the Cat primitives. I am calling them producer and consumer.
consumer : (('A -> 'B) -> ('A -> ))
producer : ('A ('A -> 'B) -> 'A ( -> 'B))
The consumer transforms any function into a new function which produces nothing, but consumes what was expected.
The producer transforms any function into a new function which consumes nothing, but produces the result as if it was fed the stack configuration when it was created. Production is like a generalized form of currying.
I wonder whether this would be possible in the general case in Joy? Or Factor, or any other concatenative language for that matter?
One utility is the ability to define a generic dup function. Consider:
define dup_production : ('A ('A -> 'B) -> 'B 'B)
{ producer dup dip eval }
Using this I can write something like:
define abcdabcd { [abcd] dup_production }
At Channel9 on Monday, there was an interesting interview called Software Composability and the Future of Languages with several language design bigwigs at Microsoft: Anders Hejlsberg, Herb Sutter, Erik Meijer, and Brian Beckman.
This is not something that happens everyday. I particularly enjoyed hearing what got them excited, learning what they thought were the important and hard problems, and listening to them interact.
I just found a reference to one of my articles from a presentation by Scott Meyers.
http://www.aristeia.com/TalkNotes/programmer_discretion_jan_2006.pdf
He mentions my article “Programming with Contracts in C++,” Dr. Dobbs Journal, March 2005, http://www.ddj.com/documents/ddj0503f/, as recommended reading on the subject of Design By Contract and assertions.
I’m pleased as punch, this should make good resume fodder. :-)
In a stack language it is a very convenient thing to be able to say “place a copy of the Nth item on the top of the stack”. For example:
5 peek
This makes perfect sense in an untyped language, but in a typed stack language the type of any expression has to be deteminable. This means that the argument to peek has to be a compile-time constant. If you don’t see why this is, consider the following code:
random 5 % peek
This is not well-typed since we can’t determine at compile-time what type this expression will have.
The obvious workaround is to predefine a set of peek functions (e.g. peek0, peek1, …, peek9 or some peekN). However we almost always will run into cases where the number of peek functions are insufficient. The “if I only had one more” scenario.
One possib;e solution would be to make available a special kind of way of passing compile-time constants to functions. For example:
peek$5
I don’t find this very elegant. It leaves me wondering how to pass compile-time constants of other types, such as, strings and floats to instructions. Later I would also want to generalize it to pass compile-time code, etc.
Another solution is to simply allow the compiler to figure out whether parameters are compile-time or run-time determinable. So in other words let the compiler decide that the following is okay:
5 peek
And then the compiler can intelligently reject
random 5 % peek
This is only somewhat satisfactory to me. I will have to give this more thought. The solution might be to enable a macro preparsing phase for dealing with such sticky problems while leaving the core language tight and elegant.
I am writing the standard library and I find that there are certain kinds of weird combinators that I would like to have. What makes them weird is that they modify the behaviour of functions passed on the stack.
For example an interesting function would be one which takes any function, executes it, and then pushes the production again. So the type would be:
double_production : (’A (’A -> ‘B) -> ‘B ‘B)
Another one would be one which eliminates the production altogether:
eliminate_production : (’A (’A -> ‘B) -> )
Combining them is interesting:
double_evaluation : (’A ‘A (’A -> ‘B) -> ‘B ‘B)
{ dup [eliminate_production] dip double_production }
These might be possible from the core primitives. I’ll need to do a bit more investigating.
Consider if we were to define our stack-based (concatenative)
functions as quotations such as:
define Add { [+] }
We can define the open paranthesis, close paranthesis, and comma as follows:
define ( { }
define , { curry }
define ) { curry eval }
This allows us to write expressions like:
Add(2, 3)
Which leaves the value 5 on the stack just as we would like to see.
This is possible now in the most recent version of Cat (0.9.8 which I have posted at http://www.cat-language.com/cat.zip , but haven’t officially announced yet).
A vast number computer algorithms can be expressed in terms of map, filter, and reduce operations.
- Map takes a list and a function and transforms each value by applying the function to each value.
- Filter takes a list and a predicate function (i.e. a function which returns a boolean value) and returns a new list containing only values for which the predicate returns false.
- Reduce takes a list and an associative function which combines two elements to produce a new value. The function is applied consecutively to values in the list pairwise until a single value is reached.
All three functions can be easily implemented in parallel. This observation forms the basis of the Google map-reduce project.
In an intermediate langauge for compiling functional languages, such as Cat, these three operations can be combined to create a single super primitive, which we can call MFR. The MFR primitive is implementation defined to run the algorithms in parallel as makes sense.
This function can be used to implement all three primitives as follows:
- define map { [pop true] [] MFR }
- define filter { [] swap [] MFR }
- define reduce { [] swap [pop true] swap MFR }
Of course it is not a reasonable thing to expect programmers to always use MFR correctly in their programs. We learn to think in terms of map/reduce/filter and using those primitives makes code more readable.
The thing is that it is trivial for programs written in a stack-based language like Cat to be automatically rewritten by the compiler in term of MFR.
This is all fairly straightforward, but AFAIK it differs significantly from the apprach used to compile other functional languages.
I’ve just recently made a rather long-winded post on Lambda-the-Ultimate concerning how to best present the semantics of Cat formally for the paper I am working on. I am confident the resulting discussion will be interesting and illuminating.