cdiggins.com

April 30, 2007

Generalized Folds with Multiple Results

Filed under: Everything — cdiggins @ 9:59 pm

A fold is usually defined to take a list, an initial value, and a binary function. The function is applied to the first item in the list and the initial value, the result of which is combined with the first item in the rest of the list and so on. In Cat the type of the fold function is (list ‘a (’a ‘b -> ‘a) -> ‘a). 

For example in Cat we can compute the sum of a list of integers as follows:

define sum : (list -> int) { 0 [+] fold }

Stack-based languages have an interesting characteristic, functions can have multiple results. This means that we can generalize the fold so that instead of only accepting binary functions, it accepts any function taking n arguments and returning n - 1 results.

Let’s call the generalization “gfold” for fold, and give it the type: (’A list (’A ‘b -> ‘A) -> ‘A). Here is how you might use a gfold function to define a split function.

define split : (list (’a -> bool) -> list list)
{
  nil bury nil bury
  [dup] rcompose
  [[swap [cons] dip] [cons] if] compose
  gfold
}

The regular fold can be defined in terms of gfold as simply ”swapd gfold”.

Look for the gfold in the next release of Cat.

April 28, 2007

A Cat Interpreter in JavaScript

Filed under: Everything — cdiggins @ 9:30 pm

I’ve posted a public domain interpreter for a subset of Cat online at http://www.cat-language.com/interpreter.html .

It is intended more as a proof of concept than anything else. Since it is public domain, you can modify it and use it for any purpose you want.

Have fun!

A Definition of Concatenative Languages (refined)

Filed under: Everything — cdiggins @ 6:13 pm

My previous attempt at a formal defintion of the syntax a concatenative language was overly broad. Thanks to William Tanksley for helping me understand the problem with the earlier definition. For an informal description of a concatenative languages see the Wikipedia entry.

A concatenative language is any language that can be described using a concatenative grammar. A concatenative grammar has no non-terminals other than the initial one (S) and always has the following production rules:

  • S = SS
  • S = x
  • S = empty

Where x represents one or more terminals. A non-flat concatenative grammar contains at least one production rule of the form:

  • S = aSb

No other production rules are allowed.  

The semantics of a concatenative language are that each term is a state function (e.g. from a set of stacks to a set of stacks), and the concatenation of two terms implies the composition of those functions.

Cat Presentation to SeaFunc Users Group

Filed under: Everything — cdiggins @ 7:03 am

This Wednesday, on April 25th, I had the good fortune and opportunity to give a talk to members of the Seattle Functional Users Group at Bellevue Library. A big thank you to Ryan Elisei for supplying a projector and Frank Krueger for videotaping it (I’ll announce the video on my blog when it is edited and available for download). 

I was fortunate in that the audience wasn’t shy to ask questions and share their thoughts. I found the discussion both interesting and informative. Many of the questions and comments were very helpful and I feel that the next presentation I give will be much better for it.

The powerpoint slides are available for download here: Powerpoint Slides - Cat Presentation on April 25th to SeaFunc Users Group. Note, that only about 2/3rd of the slides were presented.

 A big thank you to everyone who showed up!

April 27, 2007

What is a Concatenative Language?

Filed under: Everything — cdiggins @ 9:10 pm

The following definition was far too broad. A refined definition is now available at: http://cdiggins.com/2007/04/28/a-definition-of-concatenative-languages-refined/

There is so far, to the best of my knowledge, no formal definition of a concatenative language has been proposed, so here is my attempt at providing a formal definition.

A concatenative language is a context-free language that can be expressed using a concatenative grammar. The defining characteristic of phrases in a concatenative language, is that if A and B are valid phrases in a concatenative language L then the concatenation of A and B is also a valid phrase in L.

And of course we need a definition of a concatenative grammar: 

A concatenative grammar is any context free grammar G, where S = S S is a valid production rule.

Sometimes the term “flat concatenative language” is used, however a flat concatenative language is the trivial language where any combination of terminals is a valid phrase. In other words it is the set of all strings over a given alphabet.

Hopefully this will help lend a bit of formalism to discussions about concatenative languages in the future. Please let me know if you can find any logical flaws with these definitions.

Turing-Complete Two Symbol Regular Languages

Filed under: Everything — cdiggins @ 8:47 pm

A short while back (in February) on the concatenative mailing list Brent Kerby developed a flat two-combinator concatenative language in Joy. This was in response to insightful questions by William Tanksley Jr. This work was inspired by work by Chris Okasaki, J. Fokker and others but followed in the footsteps of Chris Barker’s Jot language which is to the best of my knowledge the first turing-complete two symbol regular language.

I’ll return to Brent’s formulation, since he did all the work of converting the language into a point-free form. Brent’s flat two-combinator basis was:

  [B] [A] q == [[B]] [A B] 
  [B] [A] k == A

In Cat this can be constructed as follows: 

   define q { [dup] dip swap compose [qv] dip }
   define k { swap pop eval  }

The types for q and k expressed using the Cat type system are:

  q : (('A -> 'B) ('C -> 'A) -> ( -> ('A -> 'B) ('C -> 'B)))
  k : (('A -> 'B) ('C -> 'D) -> ('C -> 'D))

What is interesting about these two-combinators is that you can construct virtually everything else you want. The following was adapted from Brent’s post and translated into Cat:

  define pop { [] k }
  define eval { [] q k }
  define qv { [] q pop }
  define swat { q qv k } 
  define dip { qv [unit] qv eval qv eval }
  define dup { [] q qv swat eval } 
  define swap { qv dip } 
  define compose {  swap swat }
  define rcurry { qv swat }
  define curry { swap rcurry }
  ... ad infinitum

Some names have been changed to distinguish between list and function operations in Cat. The following are how some of the Joy functions map to Cat function operations.

  rcurry == cons
  compose == cat
  qv == unit
  i == eval

William Tanksley Jr. also developed an alternative flat two-combinator basis: http://tech.groups.yahoo.com/group/concatenative/message/3197

For those interested in more reading about concatenative languages I’d suggest the following readings:

Determinism, Algorithms and Wikipedia

Filed under: Everything — cdiggins @ 5:38 am

Orcmid asked me recently, is a nondeterministic process algorithmic?

My answer: yes.

To answer more satisfactorily I needed a definition of algorithmic. There doesn’t seem to be a good one lying around. I looked at the definition of an algorithm on Wikipedia, and as is often the case I found it inadequate. Wikipedia states: ”an algorithm is a finite list of well-defined instructions for accomplishing some task which, given an initial state, will terminate in a defined end-state.”

Is is widely acknowledged that not all algorithms halt. So clearly termination is not a requirement of an algorithm.

The article goes on to address this: “Some writers restrict the definition of algorithm to procedures that eventually finish”, but then it boldly contradicts itself by saying “The analysis of algorithms for their likelihood of termination is called Termination analysis.”

That is logical nonsense. If an algorithm terminates by definition, then you don’t need to analyze them to see if they halt, because they do. The algorithm is the trival case: return true. Some may argue that I am picking at semantic details, but the question then becomes, what are you analyzing to see if it halts, since it clearly is not an algorithm. Really they should be saying analyzing computational processes to see if they are algorithms, if they are to be self consistent.

Anyway, this all goes to support my thesis that the modern idiomatic usage of the term algorithm includes computational processes that do not halt, and I have no qualms with that.  

In fact almost all describable computational processes are referred to as algorithms, which as far as I am concerned is an adequate defintion. A formal definition of algorithm is apparently still up for grabs according to  http://research.microsoft.com/~gurevich/Opera/164.pdf.

That paper also describes both “deterministic algorithms” and “nondeterministic algorithms”. If one is to trust the scholarship of the paper, which in my opinion seems sound, then you may be satisfied that nondeterminstic algorithms is not a contradiction.

If it looks like a list, and acts like a list …

Filed under: Everything — cdiggins @ 12:38 am

My friend Frank said he found it hard to wrap his head around Cat infinite lists. The “n” operation, for example, returns a list of all the number from 0 to some upper bound (it is implemented using range_gen). However, an implementation of Cat  is free to implement the list in any way as long as all the list operations are available. It doesn’t even have to allocate a block of memory.

So the creation of the list, mapping the list, getting the size of the list, and getting the nth item can all potentially run in constant time regardless of the size of the list. The following is purely interpreted code without any optimizations:

>> 1000000000 n [2 * 1 +] map count 2 / nth
Time elapsed : 0.00 msec
stack: ( 1, 3, 5, 7, …) 1000000001

Yes that is 0.00 msec (give or take 0.15 msec which is the resolution of the internal timer) .

The next logical question would probably be: what if someone writes code which doesn’t use “n” or “range_gen”. For example:

nil 0 [[cons] keep inc] [dup billion <] while pop

Doesn’t that create a list of a billion cons cells? Well only if you don’t have a optimizer, which the current Cat implementation unfortunately doesn’t. :-(

But since I love theory, lets talk about how an optimizer could identify the pattern as being an opportunity to use a generator instead of the loop. The algorithm is pretty simple (with a fair amount of hand-waving):

  1. Establish that there are no side-effects
  2. Get the initial values: $a=nil and $b=0
  3. Check that $a is a list, and $b is an integer, if not abandon hope.
  4. Compute the relation between iterations of the loop ($a:list $b:int -> (cons($a, $b), inc($b)))
  5. Use that to establish the relation beween consecutive list items: $c = [inc]
  6. Compute the terminating relation in terms of $b if it can (otherwise exit): $d = [1000000000 <]
  7. Rewrite the algorithm as: $b $c $d gen

Now you have 

0 [inc] [billion <] gen

This isn’t quite as efficient though as a “range_gen” function. To go from a “gen” to a “range_gen”, you need to notice that inc is the same as [x +] where x=1. You also note that the terminating condition is a simple comparison: [y <]. Therefore you can easily rewrite a “gen” function with those properties as:

0 y range_gen [x *] map

or in terms of the example: 

0 billion range_gen [1 *] map

which is easily identified as beign the same as:

billion n [1 *] map

and finally it is a trivial observation that [1 *] map is superflous:

billion n

 And that’s that.

April 26, 2007

What is Nondeterminism?

Filed under: Everything — cdiggins @ 11:01 pm

Determinism is the property of a process or algorithm that at any point during the process, given the same state and same input, the next step will always be the same and unique.  Nondeterminism is the absence of this property.

April 25, 2007

On Abstract Interpretation

Filed under: Everything — cdiggins @ 7:51 pm

Below is a post  I made to a discussion thread at Lambda-the-Ultimate. I though I would share it here, because it might help give others  some insight as to how abstract interpretation (AI) relates to the real worlds issues of static analyses and compiler optimizations. 

… I am using the definition of AI as “a partial execution of a computer program which gains information about its semantics without performing all the calculations.”.

This is an algorithmic definition of AI as opposed to the formal mathematical one that I realize that others here are apparently more familiar with: :a theory of sound approximation of the semantics of computer programs, based on monotonic functions over ordered sets, especially lattices.”

“You’re speculating about what ought to be possible, rather than what existing abstract interpretation systems do, correct?”

No not really. What I am saying is based on the observation that many static analsysis and compiler optimizations are actually a form of abstract intrepretation (a point which the OP I believe was trying to get at). These usually yield increasingly helpful partial information while they executes whether or not they halts. Thus they aren’t neccessary to diverge, and can be executed in a non-deterministic manner.

So by extension I am saying that convergence isn’t neccessary for AI to be useful, just like other static analyses don’t have to halt to yield useful information. This really shouldn’t be a contentious statement. It may be very hard to model formally using probabilistic functions and lattices (I wouldn’t be able to) but algorithmically it is pretty trivial stuff.

It should be well-established that to deal with any algorithm that doesn’t guarantee halting, but yields useful partial information, is to simply write a nondeterministic multi-threaded version of the algorithm. By nondeterministic I am suggesting that during partial execution you can deal with branches and loops by spawning new threads. This makes it non-deterministic because, you can just halt (whether or not all threads converge) based on some arbitrary heruistic and you can schedule threads arbitrarily.

What I am suggesting shouldn’t be a new idea to other researchers, and I doubt it is. A quick search on ‘nondeterminsism + “abstract interpretation”‘ yields some apparently relevant papers:

Adhoc non-deterministic AI algorithms already exist in many compilers, it’s just that many programmers aren’t really aware of what it is that they are doing in formal terms. So the usefulness is pretty clear, whether the formal models of AI have reached that point or not.

Finally as for an example of a non-deterministic AI algorithm, consider the identification of whether or not a control path executes.

Next Page »

Powered by WordPress