cdiggins.com

June 30, 2007

My Goal: Naive Programming

Filed under: Everything — cdiggins @ 5:39 pm

I want to be able to program naively.

Elegant code in most languages, is very often considered naive. The most natural, compact, and precise way to express an algorithm is often inefficient. I want to write algorithms, not code. When I write algorithms in some language, very often the resulting code is criticized as sub-optimal, or perhaps the worst insult of all, it is “naive”.

Think about this, after 25 years of writing computer programs in over a dozen languages when I write code in a language that I am not fluent in, the code I write is usually considered naive. You are probably thinking “So what? Maybe you are a mediochre programmer, besides, to use a language you need to spend some time familarizing yourself with the idioms of the language”. Maybe I am mediochre, but after studying so many languages I’ve realized almost all programming languages are the same: “iterate”, “branch”, “define function”, “apply function”, “make collection”, “input/output”, “store intermediate result” and maybe if you are lucky “define module”. After enough experience moving between languages you stop thinking about idioms and syntax, and start thinking in terms of algorithms and abstractions.

So what motivates most programming idioms? Efficiency and safety. Now what does that have to say about a language? It indicates very clearly that the language has a problem, the most elegant or natural way to write some type of code is not the most efficient or safe. Time to start reexamining your language and compiler I’d suggest.

In whatever language I happen to be using I try to write code where it is obvious what the input is, what the output is, what the requirements are, and what the transformation is. I don’t like to think about function inlining, intermediate results, result caching, tail-call optimization, or any of that silliness. This just obfuscates the intent of my program and should be transformed by a compiler. However, what I call elegant code is too often considered “naive”, because a particular language implementations can’t figure out how to transform lucid code into efficient code.

I don’t care if call my code naive, just make sure your language implementation can deal with it.

PS: I’ve followed this up with a new post about naive programming containing some code examples.

June 29, 2007

YARD 1.0 - Yet Another Recursive Descent Parsing Framework for C++

Filed under: Everything — cdiggins @ 4:49 pm

YARD is a C++ framework for constructing recursive-descent parsers from parsing expression grammars (PEG) at compile-time. This technique allows YARD parsers to be extremely fast and require minimal executable overhead. A parser constructed with YARD can automatically construct an abstract syntax tree (by placing “Store” actions in the grammar) and eliminates the need for a scanning phase (also known as a lexing phase).

I’ve been developing, testing, and using YARD for over three years and as far as I am concerned it is now stable. I’ve finally posted an official version 1.0 of the YARD framework to SourceForge. Documentation is not yet complete for the current version, but the source code is only 28k and in 7 files. A great example of YARD in action can be found in the Cat to C++ translator. You can view the YARD grammar for Cat online at: http://cat-language.googlecode.com/svn/trunk/cpp/cat_to_cpp/cat_grammar.hpp. All of the source code for the current version of YARD can also be browsed online in the Cat subversion repository at: http://cat-language.googlecode.com/svn/trunk/cpp/yard/. I should point out that YARD will not work on some older compilers that are not compliant to the C++ specification, so if you are one of those lost souls still using or supporting Visual C++ 6.0 you may need to do some rewriting.

I haven’t compared YARD to other parsing frameworks in a while, but it was significantly faster than Boost.Spirit when I last tried a couple of years ago. I expect YARD to still be much faster, because of the fact that YARD parsers can be very heavily optimized at compile-time by the C++ compiler. YARD also has a much smaller footprint on your executable size and compilation times than Boost.Spirit.

Other people have had positive things to say about YARD, and personally I can’t live without it. YARD is public domain, so hack away to your heart’s content.  

June 24, 2007

Cat Version 0.15

Filed under: Everything — cdiggins @ 9:17 am

A major new release of Cat, version 0.15, has just been posted to Google code hosting. Notes on the recent release have been posted to the discussion group/mailing list.

June 23, 2007

Cat: Now Fully Inferable

Filed under: Everything — cdiggins @ 8:33 am

I  probably shouldn’t blog at 1:00 am, but I have some news to share: I now have a fully operational type-checker and type-inference engine that can handle circular/recursive types (the types of functions that accept themselves as a parameter)! So now even strange little recursive programs have their types inferred correctly, such as the following:


  define dolittle { [dolittle] }
  define m { dup apply } 
  define bind_self { dup bind }

Their types inferred are: 


  dolittle : ('A -> 'A self)
  m : ('A ('A self -> 'B) -> 'B)
  bind_self : ('A ('A self -> 'B) -> ('A -> 'B))

Incidentally, the infamous “omega” combinator (omega == [m] m) is also rejected by the type system. That is good news since omega is an infinite loop.

All of this means that Cat is finally a completely inferrable language: all valid Cat programs can have their types inferred! (… or so the theory goes, I’m sure some bugs will turn up as testing kicks into overdrive). Cat can infer the types of simple programs that most Haskell compilers can’t infer (e.g. try entering the “dolittle” program or the “m” combinator into GHCi and see what happens).

Very few languages have this property. Well to be honest, I don’t know any if I don’t count trivial type systems where everything is the same type or the set of types is small and finite. The Cat type system may appear small , but that is only the core primitives. The set of function types that Cat allows is infinite.

The SVN version is fully operational (hopefully I’m not omitting any files), but I won’t be doing a release until tommorrow. I’ve rewritten the graphics library on top of everything else and I still have to rewrite the graphics demo programs.

June 21, 2007

Comment Meta-Data in Cat using JSON

Filed under: Everything — cdiggins @ 4:09 pm

I am interested in having sophisticated meta-data embedded in the comments of Cat source file. In fact I want my meta-data in comments to contain actual code (like contract specifications, unit tests, and example usage) it really seems like JSON is the best choice.

Informative comments and useful meta-data are especially important for sharing and reusing Cat source code, because the language is so succint and minimalist.  Apart from the usual comment information, comments are a good place to store unit-tests and contract specifications — as in Design by Contract (DbC) — in a language which doesn’t have direct language support for such things. Some programming languages shack the specification to provide support for DbC and unit-tests (the D language for example).  I think hard coding support for processes and techniques in a language, rather than using a separate specification for comment meta-data, is inflexible. C# attributes are a step in the right direction. They are however not powerful nor “literate” enough for my purposes.

Seperating unit-tests and contract specifications from actual code and placing them as meta-data allows your tools to do very strange and wonderful things. Imagine automated testing as you code: a red squiggle in your IDE to highlight test breaking usage. Imagine your IDE telling you how often a unit test passed, and what kind of coverage a segment of code is getting. Imagine your tool telling you when you are violating a contract as you type code. There are tools which do this, but implementing them suddenly becomes much easier and standardized with a good meta-comment specification, along with things I haven’t even dreamed of yet! Imagine, people sharing meta-comment parsing libraries, and if I use JSON there are several dozen libraries already available. I want to see tool developers cooperate and share their work. No more monolithic “Eclipse” projects needed to code some basic automated code testing and analysis.

If anyone else is really excited about this stuff, please contact me, I could really use some help. This kind of project is very doable, but I am already overburdened working on the Cat interpreter and a Scheme to Cat translator.

Tonight’s East-Side SeaFunc Meeting

Filed under: Everything — cdiggins @ 7:44 am

Tonight we had a good turn out for the East-side version of the Seattle Functional Users Group (SeaFunc). The theme of the night was tools and IDE’s, but of course discussion was lively and covered many different topics. We had seven attendees in all, including Jeremy, a software developer from Intentional Software which was particularly appropriate and interesting.

We found out that Charles Simonyi, president/founder of Intentional Software recently published this paper on Intentional Software which appeared in OOPSLA 06. Frank Krueger was also kind enough to bring some printed out notes about common problems he faced with software development. What was interesting was how the responsibility for the issues he mentioned often straddled many areas of software development: language, tools, libraries, process, and programmer discipline. One issue Frank brought up that really caught my eye, and that clearly belonged in the domain of tool support, was the suggestion of unit-testing through example-centric programming. Frank and I also discussed the overlap of unit testing with Design by Contract. Design by Contract, Example-Centric Programming, and Extreme Programming are all bedfellows in my opinion.

I also asked at the meeting about how to best represent meta-data in comments in a source file. The topic of literate programming came up, and afterwards I did some research on documentation generation tools. This led me to believe that I probably want to add Doxygen or JavaDoc style meta-comments to the Cat language. Thanks to Frank for steering me away from XML.

June 20, 2007

Modules as Stateful Blackboxes

Filed under: Everything — cdiggins @ 5:50 pm

Much of Cat research has focused on pure Cat programs. At one point a non-trivial program is going to have to deal with the issue of side-effects.

In Cat all functions with side-effects or that are not referentially transparent are called “impure” and are labeled with “~>” instead of “->”. This is a very general meaning of “~>” which could be refined. I am experimenting with the idea that “~>” carries with it an attachment to a particular module. Modules would be stateful blackbox components of a Cat program. For example: input, output, console, instruction sequence, random number generation, date-time query, network sockets, file system, pipes, etc. Each module would be treated as a new type that is monitored by the type system and attached to the squiggle-arrow. What is relevant for me is that certain modules are hard-linked and others are completely unlinked.

Let’s say I have two modules: A and B. If they are unlinked then any code that affects A must be sequenced in order, and any code that affects B must be sequenced in relative order, but the mutual order is unimportant. Consider the following example:

define fA : ( ~> ) { A.f }

define gA : ( ~> ) { A.g }

define fB : ( ~> ) { B.f }

define gB : ( ~> ) { B.g }

define test : ( ~> ) { A.f A.g B.f B.g }

The following rewrite would be illegal:

define test : ( ~> )  { A.g A.f B.g B.f } // no good

 However the following rewrite would be legal:

define test : ( ~> ) { A.f B.f A.g B.g } // good

This is not implemented yet, but looks very much like a direction I want to take Cat.

Parsing Expression Grammars are Good … but Packrat Parsing is not Neccessary

Filed under: Everything — cdiggins @ 5:12 pm

Parsing expression grammars (PEG) are becoming increasingly popular as a useful tool for implementing flexible and efficient parsers. A new language Katahdin was recently announced with runtime customizable syntax that leverages PEG’s and packrat parsing techniques to achieve this.

Packrat parsing uses memoization to guarantee parsing PEG languages with O(1) complexity. I am not convinced that using memoization is really neccessary. As far as I can see it just adds complexity. I have implemented several PEG based parsers (C++, C#, and even in Heron) which all appear to be O(1) but I haven’t provent it. Perhaps someone will look into this in more detail an with some rigour in the future.

Simple Infinite Loop Detection in Cat

Filed under: Everything — cdiggins @ 5:26 am

It is a law of computability that there is no way to guarantee that all infinite loops in a program can be identified, however detecting any infinite loop is always a good thing. Some infinite loops in Cat can be identified using a very simple algorithm: if a reference to the function occurs outside of any quotation, then we know there is an infinite loop. For example:

define f { a b f c d } // definitely at least one infinite loop

define f { a b [f] c d } // can’t say whether or not there is an infinite loop

This simple observation is due to the fact that we know control flow passes through every word outside of a quotation. Isn’t static analysis of Cat fun? I love Cat and Joy because they are chock full of these really nifty properties.

June 18, 2007

Kitten -> LambdaCat

Filed under: Everything — cdiggins @ 4:37 pm

It looks like in the next Cat release Kitten is going to become LambdaCat. I am adding support for typed lambda expressions to the Cat interpreter, which will remain separate from the core Cat language. It is provided for convenience for anyone converting from other languages into Cat, it is also provided as a teaching tool. If I want Cat as a tool for educators, I simply must include lambda expressions.

This means that Cat functions can be defined with parameters, e.g.

define square(x) { x x * }

And can also contain embedded Lambda expressions

define k(x) { \y.[x] }

This is also intended to support my goals to write translators from Scheme to Cat, and from C to Cat.

Next Page »

Powered by WordPress