Enhanced Primitive Support
History Key
- New content
Removed content
Recent Versions
Choose two versions to compare, or click the link to view it.
This document describes work in progress in the prim, num, equal and equiv branches. This work is not promised to become part of Clojure, as is or otherwise, now or in the future. The purpose of this document is to explain the work and encourage discussion.
Update redux: equiv, the revenge of num
The latest revision is in the equiv branch. It takes the approach of num, with no auto-promotion, and bigint contagion, but adds several things to better support contagion:
- Uses equiv for =
- but narrows equiv, only same category numbers are =
- so floats not = to integers
- but == still available for that
- new clojure.lang.BigInt class
- as with num, BigInts do not auto-reduce
- BigInts will enable optimizations when fits in long
- optimzations not yet in place
- unlike BigInteger, BigInt has consistent hashcodes with Long, through range of long
- BigInts will enable optimizations when fits in long
- all long-or-smaller integers are boxed as Long
- as with num, you can’t use primitive coercions to force specific box types
- hash maps and sets use = for keys
- this will make maps and sets unify 42 and 42N, since =
- will use stricter .equals when calling through java.util interfaces
- not there yet
- this will require renaming Associative.containsKey, since semantics will now differ
- +’ etc remain available when auto-promotion is required, but they are not needed for polymorphic arithmetic
Re-Update, alternate universe
In an attempt at fairness in evaluation, this version reverses the defaults. If you want arbitrary precision you must use the ‘prime’ versions of the ops.
The current plan is implemented in the equal branch. It includes changes from what is below, which should be looked at for facts other than in the areas stated here. I’ll reorganize soon.
- BigIntegers will fully reduce in all cases.
- The plan is to offer 2 sets of ops for the few ops that might overflow: +, -, *, inc, dec.
-
+, -, *, inc, dechave been enhanced. They will never auto-promote on integer overflow, and will instead throw an exception. Thus, the semantics of these operators are now uniform when used with boxed or primitive arguments. They can return integer primitives.
- ’ is now a constituent character. That means it may appear as a character in symbols in other than the first position, as does #.
-
+', -', *', inc', dec'For all arguments other than long-or-smaller integers, they do exactly what their non-prime counterparts do. For long-or-smaller integers, they will auto-promote in all cases, even when given integer primitives. As such, they will never return integer primitives. They can and do return double primitives when given doubles, as the semantics match. Unless you need to work with arbitrary-precision integers, there is no reason to use them.
- type-specific equality semantics remain in place
- literal numbers are always implicitly primitives, and never need (long 42) style coercion except for host interop overloading disambiguation.
- Note: this means that locals initialized with literals will have primitive type, e.g. (let [x 42] …), and especially: (loop [x 42] …). If you intend to recur with a non-primitive, init like this, (loop [x (num 42)] …)
;;;;;;;;;;;;;;;;;;;;;;;;;
;;equal branch, as of:
;;http://github.com/richhickey/clojure/commit/310534b8e7e7f28c75bb122b4bf1bee320cdae67
(defn fib [n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 3565.579 msecs"
;; hint arg and return
(defn ^:static fib ^long [^long n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 395.365 msecs"
;Use promoting op' ops for arbitrary precision
(defn fib [n]
(if (<= n 1)
1
(+' (fib (dec' n)) (fib (-' n 2)))))
(time (fib 38))
"Elapsed time: 3705.123 msecs"
;; double arg/return
(defn ^:static fib ^double [^double n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 860.326 msecs"
;; double arg/return, double operands
(defn ^:static fib ^double [^double n]
(if (<= n 1.0)
1.0
(+ (fib (dec n)) (fib (- n 2.0)))))
(time (fib 38))
"Elapsed time: 428.116 msecs"
Update
The current plan is implemented in the equal branch. It includes changes from what is below, which should be looked at for facts other than in the areas stated here. I’ll reorganize soon.
- BigIntegers will fully reduce in all cases.
- The plan is to offer 2 sets of ops for the few ops that might overflow: +, -, *, inc, dec.
- +, -, *, inc, dec will auto-promote in all cases, even when given integer primitives. As such, they will never return integer primitives. They can and do return double primitives when given doubles, as the semantics match. Thus, the semantics of these operators are now uniform when used with boxed or primitive arguments.
- ’ is now a constituent character. That means it may appear as a character in symbols in other than the first position, as does #.
- +’, -‘, *’, inc’, dec’ have been added. For all arguments other than long-or-smaller integers, they do exactly what their non-prime counterparts do. They differ in that they will not auto-promote on integer overflow, and will instead throw an exception. They can return integer primitives. Unless you need the aforementioned performance benefit, there is no reason to use them. In particular, it is not necessary to use them for doubles at all.
- type-specific equality semantics remain in place
- literal numbers are always implicitly primitives, and never need (long 42) style coercion except for host interop overloading disambiguation.
- Note: this means that locals initialized with literals will have primitive type, e.g. (let [x 42] …), and especially: (loop [x 42] …). If you intend to recur with a non-primitive, init like this, (loop [x (num 42)] …)
;;; equal branch, as of
;;http://github.com/richhickey/clojure/commit/7652f7e935684d3c7851fbcad8ddce97e510a5a6
(defn fib [n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 3530.085 msecs"
;; hint arg and return
(defn ^:static fib ^long [^long n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 1238.83 msecs"
;hint arg and return, non-promoting op' ops
(defn ^:static fib ^long [^long n]
(if (<= n 1)
1
(+' (fib (dec' n)) (fib (-' n 2)))))
(time (fib 38))
"Elapsed time: 407.476 msecs"
;; double arg/return
(defn ^:static fib ^double [^double n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 804.862 msecs"
;; double arg/return, double operands, note does not use op'
(defn ^:static fib ^double [^double n]
(if (<= n 1.0)
1.0
(+ (fib (dec n)) (fib (- n 2.0)))))
(time (fib 38))
"Elapsed time: 480.442 msecs"
Objective – Support primitives as fn arguments and return values
- Clojure supports primitive arithmetic locally, and primitives in deftype/defrecord
- But not in fn arguments or returns
- Means you can’t break up your logic, or write helper functions, without boxing
Issues
- fn is specified by an interface (IFn), taking/returning objects
- fns must still satisfy that interface
- How will callers know about primitive params/return, and how to invoke?
- New :static support
:static
- defn supports {:static true} metadata
- :static fns can take/return longs and doubles in addition to Objects
- compiler will compile static methods in addition to IFn virtual methods
- When compiling direct invocation of :static fn, compiler will look at var metadata and make call to static method
- Note that because it is not going through the var at runtime, will not see any dynamic modifications to the var
- thus ‘static’
- Caller won’t see redefs, bindings etc
- Like macros, recompile client code to see changes
- Note that because it is not going through the var at runtime, will not see any dynamic modifications to the var
- The var still contains IFn object for HOFs etc
- and you can get dynamic call by using (#’foo …)
Features to support :static and primitive args/returns are introduced in the prim branch
- New ^:keyword metadata
- turns into ^{:keyword true}
- Metadata stacks – ^:static ^:private foo == ^{:static true :private true} foo
- ^long and ^double hints allowed on args
- other reference type hints allowed as well
- unlike for non-statics, these will be enforced
- other reference type hints allowed as well
- hint for return goes on arg vector
- e.g. (defn ^:static foo ^long [x] …)
- this so it supports multi-arity with varied returns
- Static linking – replaces direct binding
Static linking
- Eliminates the overhead of keeping fns in vars
- Normally not an issue, but for small functions vars have 2 overheads:
- volatile can cause cache miss
- volatile inhibits inlining
- Useful even when not using primitive args/returns
- Known limitations
- static fns can’t be closures
- temporary limitation
- static fns can’t be protocol call sites
;;;;;;;;;;;;;;;;;pre-prim master
(defn fib [n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 3798.641 msecs"
;with type hints
(defn fib [n]
(let [n (long n)]
(if (<= n (long 1))
(long 1)
(+ (fib (dec n)) (fib (- n (long 2)))))))
(time (fib 38))
"Elapsed time: 2548.311 msecs"
;;;;;;;;;;;;;;;;; prim branch
(defn fib [n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 3423.861 msecs"
; add primitive arg/return type
(defn ^:static fib ^long [^long n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 3701.73 msecs" ;no benefit, still boxed math
; primitive arg/return type plus hints for primitive math
(defn ^:static fib ^long [^long n]
(if (<= n (long 1))
(long 1)
(+ (fib (dec n)) (fib (- n (long 2))))))
(time (fib 38))
"Elapsed time: 437.96 msecs"
Objective – Unify primitive and boxed integer semantics
- Currently, boxed integer arithmetic promotes to bigint as needed
- but primitive arithmetic throws on overflow
- Means that you always have to be explicit about primitives
- difficulty with representation choices
- difficulty when optimizing due to change in semantics
- prohibits the compiler from presuming, e.g. that literals can be used as primitives
- Would like to have literal numbers be used as primitives automatically, when possible
Features to support unified integer semantics are introduced in the num branch
- Includes all of the features of the prim branch
- Unifies on the semantics of primitive longs, throw on overflow
- For bigints, specify bigints
- new literal bigint format 42N
- Polymorphic boxed numeric tower still in place
- bigints contagious, no more auto-reductions
- (\+ 0N 21 21) => 42N
- Literal non-big numbers are now treated implicitly as primitives (long or double)
- Locals known to be non-big numbers (i.e. inits are either literals or typed returns) are automatically coerced to long or double
- More automatic conversions in argument matching
- int matches long, float matches double
- longs also match ints, and doubles match floats
- long→int is checked for fit
- Consistent canonic Integer/Long boxing
- If it fits in int, is Integer, else Long
- Can’t force boxed representation from primitive coercion:
- (class (long 42)) => Integer
- it was always wrong to presume otherwise
- ditto boxed values coming from arrays of primitives
- and primitive returns from interop
- (class (long 42)) => Integer
;;;;;;;;;;;;;;;;; num branch
(defn fib [n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 3592.971 msecs"
; add primitive arg/return type
(defn ^:static fib ^long [^long n]
(if (<= n 1)
1
(+ (fib (dec n)) (fib (- n 2)))))
(time (fib 38))
"Elapsed time: 392.605 msecs"
Objective – Get rid of equivalence semantics for numeric =
- Current equivalence semantics for numbers introduced early on when Clojure didn’t do consistent boxing
- So people had (Integer 42) and (Long 42) etc due to the results of computation
- Not .equals
- Solution at the time was to make = more lenient
- this is not a good solution
- leaks into = for collections that might contains numbers
- already a semantic mismatch with key comparisons, which Java dictates must be based on .equals
- complicates the logic in =, a heavily used function
- Consistent boxing in num branch opens the door to fixing this
Features to support type-sensitive numeric equality are introduced in the equal branch
- Includes all the features of the num branch
-
= on numbers now includes the type
- matches .equals
- so no more special handling on collections either
- Use == for numeric equivalence
- or better yet, be consistent about types
- uniform boxing should take care of most of that
(doc =)
-------------------------
clojure.core/=
([x] [x y] [x y & more])
Equality. Returns true if x equals y, false if not. Same as Java
x.equals(y) except it also works for nil. Boxed numbers must have
same type. Clojure's immutable data structures define equals() (and
thus =) as a value, not an identity, comparison.
Summary
- New features enable much faster code
- Unified semantics reduce complexity
- All features other than auto-promotion still available
- Optimization more automatic, less work, less subtle