Version 1, last updated by Pete Kirkham at Apr 08 22:06 2009 UTC
Initial Thoughts
First thoughts – 2007-10-14
Having put kin aside for a couple of years, in which I’ve done part of a pure maths degree and read up quite a bit on abstract algebra, possibly become a better programmer, been exposed to few more interpreters than the Lisp, Java and Prolog I was familiar with, and done a silly amount of little benchmarks and experiments, I’m getting the interpreter bug again.
kin is intended for hpc graph/relational problems. Performance critical graph problems can extend from wire layout on the Airbus 380 to messaging your mates on Twitter to querying all Wikipedia articles that contain the word ‘wibble’ and have link distance greater than three from Rowan Atkinson.
The intention is to write an interactive, dynamic compiled language. Experience with large Javascript projects means it needs some modular system, and the ability to specify types. Open, partially typed objects should also be supported. A lot of the code I’ll throw at it will be numerical, so I believe a mix of type inference on basic types – things which can be known to be primitive integer, floating point and arrays/tuples of such – and cached object dispatch. The Self interpreter (and better C++ virtual dispatch systems) uses a technique of firm-coding the control path for the previous concrete object type to prevent an indirect causing a pipeline stall.
Which brings on the other concern – you should be capable of creating really fast code with it. It’s not necessary that fast code is pretty – I’ve worked on projects where I had to rewrite pretty code to be fast code as it was costing the company half a million a year in CPU time to run it. But fast should be elegant, and the language shouldn’t require manual repetition or obfuscation to be fast. Specifically, kin should let you tune the code generation to get fast code from a simple algorithm, rather than having to add large amounts of performance hacks to the algorithmic code itself.
It also should be bootstrappable – both Javascript and Ruby have a philosophy of moving bottlenecks into libraries written C, so the ‘high level’ user code calls the ‘performance critical’ library. The problem I’ve with that is that it means you hit an abstraction barrier below which you can’t optimise your code in the high level language, and most of the problems which take serious amounts of my programming effort are critical (otherwise I’d get a junior to code them) and so such languages don’t help me.
There are some things I have found that I really like, and will attempt to steal:
- ability to choose between dynamic and static typing (common lisp, ECMAScript 4)
- good performance without static typing (Self)
- type inference where performance should benifit (?)
- static metaprogramming (C++, Fortress, UML MDA, XS, http://www.lshift.net/blog/2007/10/15/your-very-own-32-way-simd-machineLT)
- dynamic metaprogramming (common lisp)
- cuteness (scheme, Javascript sometimes, Python) – SICP should look good in it, or at least the concepts in it should have elegant implementations.
- ability to dynamically link between actors/libraries/services (possibly based on XMPP service discovery and RDF based SOA models) with circular dependencies
- built-in pattern matching (Prolog, XSLT, Relational ML, UML transforms)
- built in parser expression grammars for little languages (Prolog has some capabilities, but not exactly PEGs; Pratt’s operator grammars are related)
- compact internal and on-the-wire object representations (ASN.1, C++)
- expressive object representations for debugging – the representation of an object should also be the code which consttucts that object (Lisp)
- concurrency through actors (Carl Hewitt)
- imperative parallelism → implicit parallelism (Fortran, Fortress, MPI)
It also needs to be able to run of multi-nodal computers. I believe that efficient multi-node code should scale down to multi-core code – where the latency is less and you can use copy-on-write – better that multi-core code scale up to multi-node; though that’s more of a hunch than something that I can quite justify yet. Multi-node code is made efficient by efficient partitioning and minimal messaging (Dymola did some interesting model partitioning last time I looked)
Previous kin attempts targetted JVM byte code; I’m fairly convinced that Java’s object model, the way it handles immutability (copy on create, rather than copy on write) and the lack of SIMD and tail call optimisations in the runtime are enough to justify targetting another low-level system; that the JVM only allows for using OS threads and – though Scala shows you can write something to translate actor model code to inversion-of-control Java – I don’t think that can ever be as efficient as a direct complier. Lack of runtime efficiency pollutes code clarity.
(Javascript should have an upper case S, but then it becomes a wiki link)