Datatypes

Read This First

This page is a design scratchpad. Please see: http://clojure.org/datatype for the current documentation.

Clojure issue tracking now lives at http://dev.clojure.org/jira, and the wiki is at http://dev.clojure.org. These Assembla pages are kept online for historical interest only.

reify, deftype

This page describes work in progress in the 'new' branch of Clojure. All features are subject to change - feedback is welcome.

Motivation

Important code-generation capabilities are locked in fn.

Several parts of Clojure are written in Java for maximum performance. We'd like to be able to write them in Clojure with equal performance.

Clojure is defined in terms of a set of abstractions, currently written in Java. We'd like to be able to define and implement those abstractions in Clojure. The datatype features, when coupled with protocols, will enable that.

Implementation

The basic idea is to harness the code generation of fn, which compiles Clojure code to bytecode and supports lexical closure, and make it available in a form that can create classes other than anonymous classes derived from IFn. This support comes in the form of two new special forms/macros:

1) reify (previously brainstormed as newnew) is the most dynamic. Like proxy, it creates an instance of an anonymous class that implements one or more protocols or interfaces. The method bodies of reify are lexical closures, and can refer to the surrounding local scope. reify differs from proxy in that:

The result is better performance than proxy, both in construction (proxy creates the instance and a fn instance for each method), and invocation. reify is preferable to proxy in all cases where its limitations are not prohibitive.

2) deftype dynamically generates compiled bytecode for an anonymous class with a set of given fields, and, optionally, methods for one or more protocols and/or interfaces. Instances of the resulting type will have a supplied type tag. deftype is suitable for dynamic and interactive development: it need not be AOT compiled, and it can be re-evaluated in the course of a single session. deftype is similar to defstruct in generating data structures with named fields, but differs from defstruct in that:

Like defstruct, deftype can generate classes with (optional) support for the IPersistentMap interface, allowing them to be used anywhere that maps can. Overall, deftypes will be better than structmaps for all purposes, especially for defining your own data abstractions.

3) In addition, when deftype is AOT compiled:

AOT-compiled deftype may be suitable for some of the use cases of gen-class, where its limitations are not prohibitive. In those cases it will have better performance than gen-class.

Details

Note - although an attempt is made to keep this up to date, you should always get the definitive documentation on the version you are using via (doc reify), (doc deftype) etc.

reify is a macro with the following structure:

(reify options* specs*)

where options can be:

:as this-name

and specs are:

protocol-or-interface-or-Object
  (methodName [args*] body)*

Methods should be supplied for all methods of the desired interface(s). You can also define overrides for methods of Object.

(str (let [f "foo"]
       (reify Object
         (toString [] f))))
== "foo"

(seq (let [f "foo"] 
       (reify clojure.lang.Seqable 
         (seq [] (seq f)))))
== (\f \o \o)
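
The comparison with proxy can be sketched side by side. This is only a minimal sketch, and it assumes a released Clojure (1.2 or later), where each reify method takes the instance as an explicit first argument (written `_` here) rather than naming it through the :as option described on this page. The names `greeters` and `call-it` are hypothetical and exist only for illustration. Both objects close over the local `prefix`.

```clojure
(import 'java.util.concurrent.Callable)

;; Hypothetical helper: builds the same Callable twice, once per mechanism.
;; Assumes released Clojure syntax (explicit instance argument in reify).
(defn greeters [prefix]
  {:proxy (proxy [Callable] []
            ;; proxy methods do not declare the instance argument
            (call [] (str prefix "proxy")))
   :reify (reify Callable
            ;; reify methods declare it explicitly; `_` means "unused"
            (call [_] (str prefix "reify")))})

;; Hypothetical helper: the type hint avoids a reflective call.
(defn call-it [^Callable c]
  (.call c))

(call-it (:proxy (greeters "hello, ")))   ;=> "hello, proxy"
(call-it (:reify (greeters "hello, ")))   ;=> "hello, reify"
```

Both forms give the same result at the call site. The differences described above are in how the instance is constructed and how its methods are invoked.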

deftype is a macro with the following structure:

(deftype Name [fields*] options* specs*) ;options and specs same as for reify

that does the following:

(deftype Bar [a b c d e])
(def b (Bar 1 2 3 4 5))

b
#:Bar{:a 1, :b 2, :c 3, :d 4, :e 5}

(:c b)
3

(type b)
:user/Bar

(meta (with-meta b {:foo :bar}))
{:foo :bar}
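
The session above does not show the optional specs. As a hedged sketch of a deftype that also supplies protocol and Object methods, the following assumes a released Clojure (1.2 or later), which differs from this page in two ways: instances are constructed with `Rect.` rather than through a factory fn such as `(Bar 1 2 3 4 5)`, and methods take the instance as an explicit first argument. `Shape`, `area`, and `Rect` are hypothetical names chosen for illustration.

```clojure
;; Hypothetical protocol with a single method.
(defprotocol Shape
  (area [shape]))

;; Fields w and h are visible inside the method bodies.
(deftype Rect [w h]
  Shape
  (area [_] (* w h))
  Object
  (toString [_] (str "Rect " w "x" h)))

(def r (Rect. 2 3))   ; released-Clojure constructor syntax (an assumption here)

(area r)   ;=> 6
(str r)    ;=> "Rect 2x3"
(.w r)     ;=> 2
```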

When deftype is AOT compiled:

Prerelease changes under consideration

New ideas/scratchpad

Old ideas/scratchpad

This is a scratchpad for ideas relating to a new datatypes feature. It is not a promise of any feature or of any attribute of the feature; it is here to invite feedback and input.

There are several motivations for datatypes:

Basics

A datatype is a simple definition of a class that includes only fields with some hints.

(deftype org.myns.Foo
  fred
  #^int ethel
  #^String lucy)

The above would produce a class named org.myns.Foo. The generated class would have the following:

Issues

Other ideas:

Associative/Expando support

Indexed support

Universal field identifiers

A problem with classes, IMO, is that each creates its own micro language. Getting methods out (see protocols) is a first step, but field names are still a problem, since each class defines its own scope. The 'name' field of 5 different classes might have the same semantics, or might not. Duck typing on local names is not good enough. Universal field identifiers would allow you to indicate that a field has some more universally agreed-upon semantics (e.g. an RDF URI).

Constraints

Equality given mutable fields like arrays, and objects with broken equals()