cdiggins.com

April 24, 2007

Cat Version 0.11

Filed under: Everything — cdiggins @ 5:27 pm

A new release of Cat, version 0.11, is now ready for download. This is a much more robust release than previous versions, since I am now running extensive unit tests. There have been a lot of bug fixes in the turtle graphics library, below is a screen-shot of the graphics library in action (click it to enlarge):

A lot of the major work in Cat has been behind the scenes, particularly with the list manipulation functions. I’ve spent a lot of time fine-tuning and optimizing the functional list library used by the interpreter. Some of the new features of the list classes include lazy evaluation, and optimized handling of special list cases (e.g. unit lists, empty lists, pair-lists, concatenated-lists, etc.)

Another big new feature is the ability to throw and catch exceptions using the new throw and try instructions.

I’ve also written a brief description about the motivation of Cat at my Artima.com blog.

April 23, 2007

My Functional Lists Design Problem: Continuations vs Traversal Algorithms

Filed under: Everything — cdiggins @ 6:29 pm

A non-trivial issue with the functional lists library I blogged about earlier is partial list traversal. Many common algorithms (such as get the n-th item) are guaranteed to terminate even with infinite lists. This is a flaw in the one ForEach method design I used. The ForEach is guaranteed to traverse the whole list. Not good.

I have found a workaround though, the addition of a CountWhile method. This method takes a predicate and will terminate abruptly when the predicate is no longer true. Alternatively I could have made the accessor passed to ForEach return a boolean value indicating whether to continue or not, or I could have scrapped the whole design and went with continuations (i.e. the yield statement).

The problem with having the ForEach return a boolean, is that it would require all iteration algorithms to have an extra check, whether or not it was needed. I’m not crazy about that, even if the C# compiler was theroetically smart enough to deal with it.

The problem with yield is that continuations can not be optimized as much as passing functions to iteration methods. This is because when calling an iteration method a smart compiler can inline the code. This is easy to force in C++ if we pass functions as template arguments.  

A continuation however is a more flexible construct and can’t be optimized as easily. Of course in theory a really super-smart compiler could identify the usage pattern of a continuation as being the same as passing a function to an iteration algorithm. Unfortunately the C# compiler is years away from this technology.

April 21, 2007

The Enchilada Programming Language

Filed under: Everything — cdiggins @ 5:37 pm

Enchilada is a nifty programming language by Robbert van Dalen based on the notion of everything being a list. I know it sounds like another language we know, but even numbers are lists in Enchilada. It is suprisingly powerful, and can be implemented quite efficiently using lazy evaluation techniques. I am hoping to someday soon create an implementation using Cat.

April 20, 2007

Exceptions and Stack Rollback

Filed under: Everything — cdiggins @ 8:41 pm

Currently I am working on adding type-safe exception handling to Cat. The primitive function for exception handling is simply:

try_catch : (’A try=(’A -> ‘B) catch=(’A ‘c -> ‘B)  -> ‘B)

The type variable ‘c represents the exception data.  

Implementing exceptions means I need to be able to revert the stack back to a previous state. This will be done naively in the Cat interpreter by simply copying the stack and pushing it back if needed. When type checking is reenabled we can compute precisely how deep of the stack needs to be copied, since most of it will remain unchanged.

Functional Lists in C# and in the Cat Interpreter

Filed under: Everything — cdiggins @ 6:12 am

I’ve just finished implementing an optimized set of immutable (functional) list data types for the Cat interpreter (http://cat-language.googlecode.com/svn/trunk/FunctionalList.cs).

The library consists a base list class called CForEach, and a number of specializations of that class. Many operations return a new list of a different type that also inherits from ForEach. From a user standpoint though, you simply write everything in terms of CForEach. For example: if you call TakeN on a CArray you get back a CRangedArray. The classes are:

  • CForEach - the base class.
  • CConcatPair - the result of a concatenation operation.
  • CEmpty - an empty list (there is only one: gNil)
  • CUnit - a list optimized for one item
  • CPair - a list optimized for two items
  • CArray - a list which is implemented as an array (fast indexing, and counting)
  • CRangedArray - a sub-range of a CArray
  • CMappedForEach - the default result of a map operation
  • CFilteredForEach - the default result of a filter operation
  • CConsCell - a naive “cons” implementation of a functional list
  • CRepeater - an infinite list containing the same value over and over
  • CRangeGenerator - a list generated from a function on a range of indexes
  • CGenerator - a list generated from successive application of a generation function (may be infinite).

These classes all provide the following interface:

  • public abstract void ForEach(Accessor a)
  • public virtual bool IsEmpty()
  • public virtual int Count()
  • public virtual object Fold(object init, FoldFxn f)
  • public virtual CForEach Map(MapFxn f)
  • public virtual object Nth(int n)
  • public virtual Object First()
  • public virtual CForEach Rest()
  • public virtual Object Last()
  • public virtual CForEach Filter(FilterFxn f)
  • public virtual CForEach DropN(int n)
  • public virtual CForEach TakeN(int n)
  • public virtual CForEach TakeRange(int first, int count)
  • public virtual CForEach TakeWhile(FilterFxn f)
  • public virtual CForEach DropWhile(FilterFxn f)

Notice that only “ForEach” is abstract. This means that by simply inheriting from CForEach and overriding the ForEach method, you get all of the other functionality for free. I have purposefully left out any generic parameters, for simplicity and the fact that I wanted a universal empty list. Feel free to implement your own.

This file is not bound to any C# module other than the System root module, and is completely uncoupled from the rest of Cat. I am hoping that some people can find the functional lists library useful in its own right.

The current revision of the Cat interpreter source code (155) is also compiling and seems to be working, though I have more testing to do.

April 17, 2007

Embedding Postfix Languages in Haskell

Filed under: Everything — cdiggins @ 10:16 pm

I was recently pointed to the paper “Techniques for Embedding Postfix Languages in Haskell”, from the 2002
Haskell Workshop ( http://www.eecs.usma.edu/webs/people/okasaki/pubs.html#hw02 ) by Chris Okasaki.

It’s an interesting paper that points us down the road of implementing a typed postfix functional language in Haskell. In the paper Chris describes several problems when implementing non-trivial postfix programs in Haskell. I wonder to what degree the issues have been resolved in recent Haskell releases, and whether any clever workarounds are available. Personally I don’t have the interest in fighting with Haskell to get things to work right.

I am looking forward to finishing the Cat type inference engine, so that I can publish the results of a type-inference shootout between Haskell and Cat.
 

Cat Interpreter in Omega

Filed under: Everything — cdiggins @ 10:11 pm

Gabor Greif just shared with my by email a type-checking interpreter for the Cat language written in Tim Sheard’s Omega Language. You can view the source at: http://svn.berlios.de/viewcvs/al4nin/trunk/purgatory/Thrist.omg?rev=308&view=markup.

In other news I am currently working on a JavaScript interpreter (written for web-browsers) for Cat, and I’m planning a new release of the C# Cat interpreter tommorrow.

April 11, 2007

Cat Higher Order Instructions to Assembly

Filed under: Everything — cdiggins @ 7:50 pm

Here are my rough ideas on how assembly instructions might be generated for higher order instructions in Cat: 

Concepts:

  • A function_struct is a structure that contains executable code (whatever size is appropriate for easy stack allocation and manipulation)
  • A return_register is a register reserved to hold return addresses
  • RETURN is a macro for “push return_register, ret” 

 Functions

  • quote - pop a value x from the stack, push a function_struct containing the instructions “push x; return”.
  • eval - pops top of stack into return_register and jumps to stack_top - sizeof(function_struct)
  • [f] - creates a function called _quote_f that pushes a function_struct onto the stack that contains the following instructions “jmp f” where f is the address of the function f.
  • compose - pops two function structs from the stack and creates a new one according to the following rules
    • all “return” instructions are removed
    • the remaining instructions are concatenated and placed in a new function struct which ends with a “return”
    • if the total size is too small a heap allocated version is created and the function struct simply contains a “jmp” to the address of the heap allocated version.

What is left up to the imagination here is how to deal with heap allocated functions when they are no longer needed. A fair amount can be dealt with during static analysis, but in the end some kind of garbage collection scheme is going to be needed.

Type Inference Example in Cat

Filed under: Everything — cdiggins @ 6:52 pm

I am back at work on the type inference engine on Cat. I threw out the old one because the code was just simply atrocious. The observation that stacks and types are different kinds also had an implication on how to design the objects.

I’ve written a paper on the type-system of Cat but it it is very dense and formal. This post is an attempt to explain the type inference algorithm using a step by step example of infering a type for the expression: “[] eval eval” (which in Joy would be “[] i i”). My first step is to ask the question, how do you compute the type of the function:

define ee { eval eval }

The first step is establishing what the type of “eval” is. Using the type notation I developed for Cat, we can write it as:

eval : ('A ('A -> 'B) -> 'B)

That is to say “eval” expects a function on top of the input stack which maps from a stack kind ('A) to another stack kind ('B). Below the function should be the set of types that constitute ('A).  If we want to figure out the type of “ee” we start by writing:

eval : (A (A -> B) -> B) eval : (A (A -> B) -> B)

In this occurence, the A and B in the first occurence of “eval” are not the same as those in the second. In order to avoid naming conflicts we need to give new names to the second occurence:

eval : (A (A -> B) -> B) eval : (C (C -> D) -> D)

When writing ‘eval eval’ we are implicitly composing two functions. This means we need to apply the following type rule for composition:

Given x : (’A -> ‘B) and y : (’C -> ‘D) then x y = (’A -> ‘D) and ‘B = ‘C

So applying the rule to “eval eval” gives us the following type for “ee”:

ee : ('A ('A -> 'B) -> 'D)

and the axiom:

'B = 'C ('C -> 'D)

Substituting for 'B we get:

ee : ('A ('A -> 'C ('C -> 'D)) -> 'D)

This makes sense, the first “eval” requires a function on the top of the stack, the second “eval” also requires a function on the top of the stack, so this implies that the result of calling the first eval must produce a stack with a function on top.

Now let’s consider the following case, which is rather tricky: what is the type of “[] ee”? It should be the same as eval, since the first eval does nothing. Applying the type rules should yield the answer, and will assure that the type we inferred is correct.

The first step is to figure out the function of []. In Cat notation it has the type:

[] : ( -> ( -> ))

In other words it pushes a function on the stack which does not consume any parameters and does not produce any result. If were to fill in the implicit stack variables we would get:

[] : ('A -> 'A ('B -> 'B))

The rule for adding missing stack variables is that for each function if either the production or consumption does not have a stack variable at the bottom (the furthest left) a new stack variable is created and added to both the bottom of the production and the consumption.

Now we can compose this with ee (while renaming conflicting variables):

[] : ('A -> 'A ('B -> 'B)) ee : ('C ('C -> 'E ('E -> 'F)) -> 'F)

Now applying the compose rule we get:

[] ee : ('A -> 'F)

And the axiom:

'A ('B -> 'B) = 'C ('C -> 'E ('E -> 'F))

This directly implies:

'A = 'C

'B -> 'B = 'C -> 'E ('E -> 'F)

'B = 'C

'B = 'E ('E -> 'F)

By deduction :

'C = 'E ('E -> 'F)

And finally : 

'A = 'E ('E -> 'F)

Now we can substitute for 'A :

[] ee : ('E ('E -> 'F) -> 'F)

Renaming 'E to 'A and 'F to 'B gives us the more recognizable:

[] ee : ('A ('A -> 'B) -> 'B)

And voila, the same type as “eval”, just as we hoped.

April 10, 2007

On Higher-Order Opcodes: Machine Code is a Functional Stack Based Language

Filed under: Everything — cdiggins @ 8:25 pm

I’ve started a few threads on higher-order opcodes for intermediate languages.

It seems thought that many people overlook the fact that most machine code is already a functional stack-based language. You can pass instructions as data, and execute data as instructions.

« Previous PageNext Page »

Powered by WordPress