Tuesday, June 7, 2016

Robopsychology, Part 1: What Goes on in the Mind of a Cognitive System?

Dr Susan Calvin in Asimov’s I, Robot is a robopsychologist.  As Asimov explains in the preface, this means a psychologist of robots, and is not to be misunderstood as a robot who is also a psychologist.  Robopsychology should perhaps be a sub-discipline of Cognitive Psychology, but the latter term is firmly entrenched as relating exclusively to human psychology.

With increasing interest in Artificial Intelligence and cognitive systems, it may be time to lay the foundations for robopsychology.  What can we learn from human psychology about cognitive systems?

I believe that a useful analogy can be based on the work of Nobel laureate Daniel Kahneman and his late colleague Amos Tversky, as described in Kahneman’s wonderful book, Thinking, Fast and Slow.  They describe two modes of thought, called “System 1” (“fast thinking”) and “System 2” (“slow thinking”).  System 1 is automatic, instinctive, fast, parallel, and emotional; it enables jumping to conclusions quickly.  In contrast, System 2 is deliberative, conscious, slow, and logical; it involves careful and explicit reasoning.  System 1 is optimized for common cases, and therefore has biases that result in wrong conclusions in unusual situations.

The book explores these biases by presenting a series of problems and asking the reader to respond immediately.  In most cases, your responses will be wrong; I know mine were.  That is because these problems were carefully chosen to expose the various biases of System 1 thinking.  In contrast, System 2 thinking is more accurate, but requires considerably more effort.  It therefore cannot be used for every task, or we would get very little done.  Just imagine having to compute mathematically the trajectories of all the cars you see before being able to cross the street.

Cognition requires both types of thinking.  For example, you would use fast thinking to zero in on a small number of candidate solutions, then use slow thinking to evaluate these solutions carefully, in order to consciously overcome System 1 biases.  To complete the cycle, you can use slow thinking to direct a new burst of fast thinking, leading to a new evaluation step.

The early days of AI were mostly concerned with the mechanization of slow thinking, and resulted in knowledge-based systems (or “expert systems”), planning systems such as STRIPS, and machine learning based on models (for example, explanation-based learning).  Much of this work was based on the Knowledge Representation Hypothesis, which states that:
Any mechanically embodied intelligent process will be comprised of structural ingredients that (a) we as external observers naturally take to represent a propositional account of the knowledge that the overall process exhibits, and (b) independent of such external semantic attribution, play a formal but causal and essential role in engendering the behavior that manifests that knowledge.
According to this hypothesis, in order for us to assign intelligence to a computer system, we need to be able to see how it represents its knowledge explicitly, and how it uses that knowledge to perform tasks that we would consider to be intelligent.  It isn’t sufficient for the task to require intelligence, since machines can use other means (such as computation speed or access to large sources of data) to perform the task without an explicit representation of knowledge.

Knowledge-based technology, originally developed for AI, is widely used today in various industries, such as insurance and finance, and there are several successful commercial and open-source rule systems available for building such applications.  For most AI tasks, unfortunately, knowledge-based systems did not scale well, and were not very successful commercially.

This led to a backlash in AI research (sometimes called the “AI winter”), the abandonment of the Knowledge Representation Hypothesis, and the rise of statistical methods in AI.  These methods, including various forms of neural networks used to support large-scale machine learning, were very successful in many applications, such as deciphering handwriting and understanding speech, which had previously been very difficult to automate.  These methods scale well with increasing computational power, and can solve larger and more difficult problems as more resources become available.

This manifest success of System 1 methods in AI has profoundly changed the field, its methods and goals.  Instead of focusing on understanding, it is now concerned with classification and information retrieval.  These systems learn, but cannot in general externalize what they learn, since it is encoded in a large set of numeric weights in a neural network or similarly opaque representation.

The domain-specific focus of knowledge-based systems was driven by the need to create domain knowledge manually, but the goal was to create universal mechanisms for representing the knowledge, so that different sources could eventually be used together in a single system.  In contrast, statistical methods are now being tuned for each problem separately.  They can work together to the extent that they can be combined through their end results, but they don’t share a common representation.  For example, in this paradigm it is impossible to combine one system that solves geometry problems with another that solves algebraic problems to obtain a system that can solve problems that require both geometrical and algebraic reasoning.  Thus, each of these systems is an island; it can perhaps be pushed further in a single direction, but is unlikely to work with other systems to solve more complex problems (except by combining results rather than mechanisms).

Pat Langley’s paper “The Cognitive Systems Paradigm” (Advances in Cognitive Systems 1, 2012) analyzes this shift in the field of AI and cognitive systems, and proposes six features that distinguish research on fundamental issues of machine intelligence from more recent approaches, which he characterizes as producing “impressive but narrow idiot savants.”  I recommend this paper for anyone interested in the original goals of AI and in getting it back on track.

Friday, July 17, 2015

Programming by Thinking

In my second post (Programmers as Children) I complained about the fact that programmers work with their fingers instead of using their brains.  This is gradually getting worse instead of better, with growing academic research concentrating on the use of examples from the internet, search tools to find such examples for specific questions, and so on.  Of course, there are many websites containing advice for various programming domains, and the examples they provide are often quite useful.  However, copying these into your own code without understanding what they do and what their wider implications are (for example, for performance and security) is a recipe for disaster.

The April 2015 issue of Communications of the ACM contains several articles that should be required reading for every software developer.  The first is Leslie Lamport's viewpoint article, "Who Builds a House Without Drawing Blueprints?"  No engineering discipline proceeds to build anything without first analyzing the requirements, making initial designs, and analyzing them from various points of view (such as cost, safety, standards, and legal implications) -- in other words, thinking deeply about the problem and possible solutions.  But in the (arguably misnamed) discipline of software engineering, starting with coding is all too often the norm, and is even praised as a methodology.  (I am not implying that any of the agile methodologies advocates that; quite the contrary!  However, agile methods are often misunderstood as "don't think before you code.")

Lamport argues that thinking is necessary before coding, and that writing specifications is an excellent way to focus your thinking.  The specifications need not always be formal, although Lamport makes a good case for using formal specifications, and using tools to check them.  Specifically, TLA+ is a formal specification language created and used by Lamport himself in various projects.
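
To give a flavor of what such a specification looks like, here is a minimal sketch along the lines of the hour-clock example from Lamport's book Specifying Systems (the example is mine, not taken from the articles discussed here).  It specifies a clock that displays only the hour, as an initial condition plus a next-state relation:

TLA+
---------------- MODULE HourClock ----------------
EXTENDS Naturals
VARIABLE hr                                  \* the hour currently displayed
HCini == hr \in (1 .. 12)                    \* initial condition
HCnxt == hr' = IF hr # 12 THEN hr + 1 ELSE 1 \* next-state relation
HC == HCini /\ [][HCnxt]_hr                  \* the complete specification
===================================================

Even a toy specification like this forces you to state precisely what the legal states and transitions are, and the TLC model checker can then explore them exhaustively.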

You may be inclined to dismiss this as merely academic; unfortunately, this is the all-too-prevalent view.  However, you may learn otherwise if you read another article in the same issue, "How Amazon Web Services Uses Formal Methods."  This article describes how TLA+ was used in several critical Amazon projects, and discovered bugs that the authors admit they could not have found otherwise.  In some cases, Amazon engineers used TLA+ successfully on their projects after only two weeks of self-learning.

Of course, getting the specifications right does not mean that the code will be correct as well.  Tool support for verifying the code against its specification would be very helpful, and this is an area of active research.  However, not getting the specifications right is almost a guarantee that the code will have bugs.  This is particularly true for large concurrent systems, which are a source of extremely subtle bugs that programmers have great difficulties understanding and predicting.

And while you have the April issue in your hands, I recommend you also read the article "Security Challenges for Medical Devices."  It discusses safety, security, and privacy issues with various kinds of medical devices.  Even hospital information systems have such issues, but they are most critical with implanted devices such as pacemakers.  Modern implantable devices communicate wirelessly to provide the information they collect, and sometimes they can even be programmed wirelessly.  This creates health hazards not only from mistakes but also from deliberate attacks.  Even devices that only provide information are dangerous if this information is modified anywhere along the way from the device to the physician (including in IT systems that store or transmit this information), since this could lead to the wrong treatment being applied.

Security cannot be added as an afterthought; it must be designed-in from the beginning.  While not all systems have the same security implications as medical devices, the whole chain is as strong as the weakest link, which implies that security is crucial to more systems than would be immediately obvious.  This helps drive home Lamport's point.  Think before you code!

#formalmethods #specifications

Sunday, January 11, 2015

Functional Programming in Mainstream Languages, Part 7: Higher-Order Functions in Haskell

(Jump to Table of Contents)
I will finish the survey of higher-order functions in mainstream languages with Haskell, a fully-functional language.  (There's a lot of talk about the imperative features of Haskell, but don't let that mislead you; Haskell is truly functional.)  Haskell comes from a different tradition, and its notation may take some getting used to.  However, if you are really into functional programming, you can do amazing things with very short Haskell programs, which (unlike APL, for example) are still comprehensible if you understand the idioms.

Of all the languages I surveyed before, Scala is the closest in spirit to Haskell, so I will contrast Haskell with Scala rather than with Java 8, as I have done so far.  Unlike in all the languages we saw before, type declarations in Haskell are separate from value definitions, and are optional in most cases, since the compiler performs extensive type inference.  In all the examples below I will give the type declarations explicitly, but they are not required.  In fact, the way I produced them was by copying them from the compiler warnings.  They are shown in a different color to emphasize the fact that you don't need to write them yourself.

In Haskell, curried functions are the norm, and the syntax makes curried functions the easiest to write.  In fact, every function in Haskell takes exactly one argument, although it is possible to simulate multiple arguments using tuples, as I will show below.  But first, here is the curried version:

Scala
def makeAdder = (x: Int) => (y: Int) => x + y
def increment = makeAdder(1)
def res = increment(4)
Haskell
makeAdder :: (Num a) => a -> a -> a
makeAdder x y = x + y

increment :: (Num a) => a -> a
increment = makeAdder 1

res :: Integer
res = increment 4

The most striking syntactic feature is that function arguments do not need to be enclosed in parentheses.  This is actually the norm in the lambda calculus and derivative notations, and is very convenient given that each function can only have a single argument.  It is possible to use lambda notation (which we will see below) to define makeAdder, but it isn't necessary in this case.
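
For comparison, here is a sketch of the lambda form of makeAdder; it is exactly equivalent to the definition above:

Haskell
makeAdder :: (Num a) => a -> a -> a
makeAdder = \x -> \y -> x + y    -- or, equivalently: \x y -> x + y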

The makeAdder function in Haskell, unlike all the other examples we saw before, is exactly equivalent to the function +; both are curried and take one argument.  Thus, the above code is equivalent to the following:

Haskell
increment :: (Num a) => a -> a
increment = (+) 1

res :: Integer
res = increment 4

The parentheses around the plus symbol are required, since by default Haskell binary operators use the common mathematical notation in which the two arguments surround the operator symbol.  (This is just syntax; it doesn't change the fact that the functions are curried.)
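
Incidentally, Haskell also supports "sections," in which a binary operator is partially applied to one of its arguments; here is a sketch of yet another equivalent way to define increment:

Haskell
increment :: (Num a) => a -> a
increment = (+ 1)    -- a section: the operator with one argument already supplied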

Looking at the types, from the simplest (the last) to the most complicated (the first), we see that the type of res is simply Integer.  However, if we give increment an argument of a different type, the result will be the corresponding type:

Haskell
real :: Double
real = increment 0.5

This is possible since the type of the increment function is parameterized.  Before the double arrow (=>) is a type constraint; we'll look at that in a minute.  The type of increment is a -> a, where a is a type parameter.  This means that the function takes one argument of type a, and returns a result of the same type.  However, the function applies the + operator to its argument (indirectly through makeAdder), and therefore the type a must be appropriate as an argument to +.  This is the purpose of the type constraint (Num a); it restricts a to be a numeric type, which (among other operations) supports +.

Both Integer and Double are numeric types, so the type of increment includes both Integer -> Integer and Double -> Double.  When functions of such parameterized types are applied to actual arguments, it may be possible to deduce an actual type for the type parameters, as happened in the two examples above.

The type of makeAdder is more of the same: a -> a -> a is the type of a function that takes a single argument of type a, and returns a function of type a -> a, that is, a function (like increment) that takes a single argument of type a and returns a result of the same type.  Again, this has the constraint that a must be a numeric type.

As is usual in type theory, the -> type constructor is right-associative, so that the type a -> a -> a is the same as a -> (a -> a).  With curried functions, this is often what you want, since curried functions take one argument and return a function.  Of course, if this argument is itself a function (which is not unusual), you will need to put parentheses around its type.

In contrast, function application binds to the left; for example, makeAdder 1 4 is the same as (makeAdder 1) 4, which is the same computation as our increment 4 above and returns 5.  The reason for making function application left-associative is the same as that for making the type operator right-associative: this is the common case with curried functions.

If you really want the uncurried version of makeAdder, you would use a tuple containing the two arguments.  Tuples are written using the conventional mathematical notation of a sequence of comma-separated values inside parentheses: (1, 4).  Here is the uncurried version, which can simply be called add:

Haskell
add :: (Num a) => (a, a) -> a
add (x, y) = x + y

As you can see, the type now represents a function that takes a tuple containing two elements of type a, and returns a result of type a.  A very useful feature of Haskell, common to many other functional languages (and some languages with a strong functional subset, like Scala), is pattern matching.  In Haskell, a formal parameter can have the syntactic structure of a constructor for any type (and, in fact, such constructors can be nested).  In this case, the parameter of add is a constructor for a tuple containing two values, x and y, and has the parenthesized form (x, y).  This means that the function can only accept tuples of this form, and when it does, the first element will be called x and the second will be called y.  This form is deceptively similar to the usual mathematical notation, but keep in mind that this is not natural for Haskell and interferes with various Haskell idioms.  The norm in Haskell is to use curried functions; any other use requires careful justification.
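
In fact, the Haskell Prelude provides the functions curry and uncurry for converting between the two forms.  Here is a short sketch (the primed names are mine):

Haskell
add' :: (Num a) => (a, a) -> a
add' = uncurry makeAdder    -- uncurry :: (a -> b -> c) -> (a, b) -> c

makeAdder' :: (Num a) => a -> a -> a
makeAdder' = curry add      -- curry :: ((a, b) -> c) -> a -> b -> c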

Having gone through the preliminaries, here is the fixed-point function:

Scala
def fixedpoint(f: Double => Double, v: Double): Double = {
  val next = f(v)
  if (Math.abs(next - v) < epsilon) next
  else fixedpoint(f, next)
}
Haskell
fixedpoint :: (Ord a, Fractional a) => (a -> a) -> a -> a
fixedpoint f guess =
  let next = f guess in
    if abs (guess - next) < epsilon
    then next
    else fixedpoint f next

The let syntax binds one or more variables in the context of an expression (which follows the keyword in).  As in Scala, if in Haskell is an expression that returns one of two values depending on the condition.  The parentheses around the argument to the abs function are required; without them, the expression would be parsed as (abs guess) - next.

The type of fixedpoint is (a -> a) -> a -> a, using only the required parentheses; this could equivalently be written as (a -> a) -> (a -> a), which makes it clearer that fixedpoint takes a function from a to a and returns another such function.  The type a is constrained to be ordered (because of the use of <), as well as fractional (because it is compared with epsilon, which is a Double).

In Haskell, of course, recursion is the norm.  It is not impossible to write this as a loop, but it is too cumbersome to even try.  Instead, I want to show a more abstract version of this function, in which the separate parts are given meaningful names; this is actually closer to the original Scheme example.

Haskell
fixedpoint :: (Ord a, Fractional a) => (a -> a) -> a -> a
fixedpoint f guess0 =
  let closeEnough v1 v2 = abs (v1 - v2) < epsilon
      try guess =
        let next = f guess in
          if closeEnough guess next
          then next
          else try next
  in try guess0

Here, the decision of whether the guess is close enough to the desired value is abstracted into the function closeEnough, and the recursion is encapsulated into the function try.  As usual, the internal functions have access to enclosing scopes, so they can refer to f and epsilon.
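
As an aside, many Haskell programmers would write this with a where clause instead of a leading let; here is a sketch of the same function in that style:

Haskell
fixedpoint :: (Ord a, Fractional a) => (a -> a) -> a -> a
fixedpoint f guess0 = try guess0
  where
    closeEnough v1 v2 = abs (v1 - v2) < epsilon
    try guess =
      let next = f guess
      in if closeEnough guess next then next else try next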

Lambda notation in Haskell uses the backslash as the ASCII character typographically closest to lambda.  A single arrow is used instead of Scala's double arrow to separate the body from the argument list.  Using this notation, here is the non-terminating version of sqrt:

Scala
def sqrt(x: Double) = fixedpoint(y => x / y, 1.0)
Haskell
sqrt :: Double -> Double
sqrt x = fixedpoint (\y -> x / y) 1.0

The terminating version is of course very similar:

Scala
def sqrt(x: Double) =
    fixedpoint(y => (y + x / y) / 2.0, 1.0)
Haskell
sqrt :: Double -> Double
sqrt x = fixedpoint (\y -> (y + x / y) / 2.0) 1.0

The generalization of the average-damp procedure and its use should hold no surprises now:

Scala
def averageDamp(f: Double => Double) = (x: Double) => (x + f(x)) / 2.0
def sqrt(x: Double) = fixedpoint(averageDamp(y => x / y), 1.0)
Haskell
averageDamp :: Fractional a => (a -> a) -> a -> a
averageDamp f x = (x + f x) / 2.0

sqrt :: Double -> Double
sqrt x = fixedpoint (averageDamp (\y -> x / y)) 1.0

Haskell takes some getting used to for people accustomed to imperative or object-oriented languages.  However, if you are ready to make the commitment to pure functional programming, Haskell is the obvious choice.

Remember that we have only discussed functional abstraction up to this point; there is much more to functional programming (even in mainstream languages), and I will address that in future posts. #functionalprogramming #haskell

Tuesday, December 9, 2014

Functional Programming in Mainstream Languages, Part 6: Higher-Order Functions in Xtend

(Jump to Table of Contents)
I introduced the Xtend programming language in a previous post.  Xtend is similar to Java, and compiles to Java (not to bytecode), but is much more pleasant to use.  In particular, it supports functional programming in various ways, both in the language itself and in the libraries.  Some of these will have to wait for later posts in the series; for now, as in previous posts, I am only discussing higher-order functions.  Again, I will contrast Xtend with Java 8.

Here is the simple example of the curried addition function:
Java 8
IntFunction<IntUnaryOperator> makeAdder = x -> y -> x + y;
IntUnaryOperator increment = makeAdder.apply(1);
int res = increment.applyAsInt(4);
Xtend
val makeAdder = [int x|[int y|x + y]]
val increment = makeAdder.apply(1)
val res = increment.apply(4)

The syntax for anonymous functions (lambdas) in Xtend is different from all those we saw before; instead of some kind of arrow symbol, it uses square brackets with a vertical bar separating the parameters from the body.  Type inferencing is applied (as in Scala) to infer the types of the expressions.  Unlike Scala, and like Java, Xtend still requires the method-call syntax for function calls.

The recursive fixed-point function looks like this:

Java 8
public double fixedpoint(DoubleUnaryOperator f, double v) {
    double next = f.applyAsDouble(v);
    if (Math.abs(next - v) < epsilon)
        return next;
    else
        return fixedpoint(f, next);
}
Xtend
def double fixedpoint(Function1<Double, Double> f, double v) {
    val next = f.apply(v)
    if ((next - v).abs < epsilon)
        next
    else
        fixedpoint(f, next)
}

This uses the Xtend interface for unary functions, Function1, with the Java wrapper type Double.  In principle, the compiler could have inferred the return type of this function (Scala can do it, see Part 5), but its type-inference mechanism doesn't handle recursion, so it is necessary to specify the type manually.  I hope this will be fixed in a future release.

In this example we can see a nice extension feature, one of those that gave Xtend its name.  Java's double type has no abs method; instead, you call the static method Math.abs with the number as a parameter.  This is awkward and places an unnecessary cognitive burden on programmers, who need to remember which methods each object has and which static methods are available for it.  (This is quite common in Java, which has many classes that offer services as static methods; Arrays and Collections are famous examples.)  In general, having different and mutually exclusive ways of expressing what is essentially the same concept is bad.  Xtend allows developers to define "extension methods," which are static methods that can be called like regular methods on the first argument.  In this example, for a double variable x the two forms Math.abs(x) and x.abs are completely equivalent.  This equivalence (like many others) is defined in the standard Xtend libraries.
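
You can also bring static methods into scope as extension methods yourself, using a static extension import; here is a minimal sketch (the class and method names are mine):

Xtend
import static extension java.lang.Math.*

class AbsDemo {
    def distanceFromZero(double x) {
        x.abs    // resolved to the static call Math.abs(x)
    }
}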

Because this recursive version of fixedpoint compiles in a straightforward manner to Java, tail-recursion optimization is not necessarily applied.  The imperative version is therefore preferred in this case as well:

Java 8
public double fixedpoint(DoubleUnaryOperator f, double v) {
    double prev = v;
    double next = v;
    do {
        prev = next;
        next = f.applyAsDouble(next);
    } while (Math.abs(next - prev) >= epsilon);
    return next;
}
Xtend
def fixedpoint(Function1<Double, Double> f, double v) {
    var next = v
    var prev = v
    do {
        prev = next
        next = f.apply(next)
    } while ((next - prev).abs >= epsilon)
    next
}

A seemingly slight difference between the recursive and imperative versions is the keywords used to define variables.  Unmodifiable variables in Xtend (which are translated into Java final variables) are introduced by the keyword val, whereas modifiable (non-final) variables are introduced by the keyword var.  In Xtend it is therefore just as easy to define final variables as non-final ones.  This is a great incentive for writing functional programs, which don't have modifiable variables.  When you write Xtend code, you should always use val rather than var, unless you can convince yourself that there is a good reason why the variable should be modifiable.  And you should be hard to convince on this point!

At this point you know to expect the non-terminating sqrt; here it is:

Java 8
public double sqrt(double x) {
    return fixedpoint(y -> x / y, 1.0);
}
Xtend
def sqrt(double x) {
    fixedpoint([y|x / y], 1.0)
}

The differences are minor: less type information is required in Xtend, and, like Scala, Xtend doesn't require the access-level designator (public) and the return keyword; however, the braces are still required.

By now you can surely write the terminating version yourselves:

Java 8
public double sqrt(double x) {
    return fixedpoint(y -> (y + x / y) / 2.0, 1.0);
}
Xtend
def sqrt(double x) {
    fixedpoint([y|(y + x / y) / 2.0], 1.0)
}

The general average-damp and the corresponding sqrt again hold no surprises:

Java 8
public DoubleUnaryOperator averageDamp(DoubleUnaryOperator f) {
    return x -> (x + f.applyAsDouble(x)) / 2.0;
}

public double sqrt(double x) {
    return fixedpoint(averageDamp(y -> x / y), 1.0);
}
Xtend
def averageDamp(Function1<Double, Double> f) {
    [double x|(x + f.apply(x)) / 2.0]
}

def sqrt(double x) {
    fixedpoint(averageDamp[y|x / y], 1.0)
}

As I said before, Xtend is a very pleasant alternative to Java; it has many similarities with Java, but also many points in which it improves on Java significantly.  One of those is its support for functional programming (of which we have seen only one part, higher-order functions).  Java 8 has closed some (but not enough) of this distance, and made some choices that are different from Xtend's.  The most obvious is the lambda notation, but perhaps more important are the use of the functional interfaces, more limited type inference, and the inconsistent use of the method name (apply, applyAsDouble, test, etc.).  As we saw in Part 5, Scala is perhaps more pleasant than Xtend in some ways, but it places itself further away from Java (in many more ways than we have seen).

In summary, if you are now programming in Java and want a closely-related but better language, you should take a good look at Xtend.

#FunctionalProgramming #Xtend

Sunday, December 7, 2014

Functional Programming in Mainstream Languages, Part 5: Higher-Order Functions in Scala

(Jump to Table of Contents)
Scala is a relatively new language; it belongs to the bigger-is-better school of programming languages, and includes many kinds of object-oriented features as well as a functional subset.  Scala captured a lot of interest in the academic community, and for several years it featured in many papers in the leading conferences on object-oriented programming.

Scala attempts (among other things) to combine object-oriented and functional programming in a graceful way.  In this post, I will show how higher-order functions are expressed in Scala, using the same examples as previous posts, and contrast it with the Java 8 implementation.

Scala compiles to Java bytecode, and is therefore compatible with Java code.  It extends the Java type system in ways that make it more consistent.  For example, all values in Scala are objects, so it doesn't have the artificial and annoying distinction between primitive types (int, double, ...) and classes (everything else, including Integer and Double).  Like Java, Scala is strongly typed, but contains extensive type-inference mechanisms so that programmers can avoid a lot of type specifications.

Here is the simple example of the curried addition function:

Java 8
IntFunction<IntUnaryOperator> makeAdder = x -> y -> x + y;
IntUnaryOperator increment = makeAdder.apply(1);
int res = increment.applyAsInt(4);
Scala
def makeAdder = (x: Int) => (y: Int) => x + y
def increment = makeAdder(1)
def res = increment(4)

Note that in this example type inference goes in the opposite direction in the two languages: Java infers parameter types from the declaration, whereas Scala infers the function's type from the parameter types and the expression in the body.  Scala infers the following type for the makeAdder function: Int => (Int => Int).  We could specify this type manually, but don't need to.

Clearly, Scala's notation for functional types is much nicer than Java's use of multiple interfaces; and the function application notation doesn't require an artificial method call.  In both aspects, Scala notation is similar to the usual mathematical notation.

Here is the recursive fixed-point function:

Java 8
public double fixedpoint(DoubleUnaryOperator f, double v) {
    double next = f.applyAsDouble(v);
    if (Math.abs(next - v) < epsilon)
        return next;
    else
        return fixedpoint(f, next);
}
Scala
def fixedpoint(f: Double => Double, v: Double): Double = {
  val next = f(v)
  if (Math.abs(next - v) < epsilon) next
  else fixedpoint(f, next)
}

These are quite similar, except for the notations for types and for function application, and the fact that Scala doesn't require explicit return statements.

Scala emphasizes recursion, and will therefore perform tail-recursion optimization whenever possible.  If you want to make sure that the function is indeed optimized in this way, you can add the annotation @tailrec to the function (or method).  This will cause the compiler to produce an error if the tail-recursion optimization can't be applied.  In this case, the compiler accepts this annotation, so that this version is just as efficient as the imperative one.  Because recursion is natural in Scala, this would be the preferred way of writing this method.  For comparison, here is the imperative version:
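
For illustration, here is a sketch of the recursive version with the annotation added (assuming, as before, that epsilon is defined in an enclosing scope):

Scala
import scala.annotation.tailrec

@tailrec
def fixedpoint(f: Double => Double, v: Double): Double = {
  val next = f(v)
  if (Math.abs(next - v) < epsilon) next
  else fixedpoint(f, next)  // a tail call, which the compiler turns into a loop
}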

Java 8
public double fixedpoint(DoubleUnaryOperator f, double v) {
    double prev = v;
    double next = v;
    do {
        prev = next;
        next = f.applyAsDouble(next);
    } while (Math.abs(next - prev) >= epsilon);
    return next;
}
Scala
def fixedpoint(f: Double => Double, v: Double) = {
  var next = v
  var prev = v
  do {
    prev = next
    next = f(next)
  } while (Math.abs(next - prev) >= epsilon)
  next
}

This is not too bad; still, I recommend using recursion instead of assignments in all languages that fully support it (that is, guarantee the tail-recursion optimization), including Scala.

Now for the naive non-terminating version of sqrt that uses fixedpoint:
Java 8
public double sqrt(double x) {
    return fixedpoint(y -> x / y, 1.0);
}
Scala
def sqrt(x: Double) = fixedpoint(y => x / y, 1.0)

The differences are small, and have to do with Java's verbosity more than anything else.  Gone are the access-level designator (public), the return type, the return keyword, and the braces.  The Scala version is more concise, and can easily be written in one line.  These are small savings, but they add up over a large program and reduce the noise-to-signal ratio.  Except for Haskell (forthcoming), Scala is the most concise (for this kind of code) of all the languages I survey in this series.

The terminating version of sqrt that uses average damping should now come as no surprise:

Java 8
public double sqrt(double x) {
    return fixedpoint(y -> (y + x / y) / 2.0, 1.0);
}
Scala
def sqrt(x: Double) = fixedpoint(y => (y + x / y) / 2.0, 1.0)

Here is the general average-damp procedure and the version of sqrt that uses it:

Java 8
public DoubleUnaryOperator averageDamp(DoubleUnaryOperator f) {
    return x -> (x + f.applyAsDouble(x)) / 2.0;
}

public double sqrt(double x) {
    return fixedpoint(averageDamp(y -> x / y), 1.0);
}
Scala
def averageDamp(f: Double => Double) = (x: Double) => (x + f(x)) / 2.0
def sqrt(x: Double) = fixedpoint(averageDamp(y => x / y), 1.0)

Again, the Scala formulation is concise and clear, while still being strongly typed.  By design, Scala is well-suited for expressing functional-programming abstractions, using type inference effectively in order to reduce the burden of specifying types.  Scala is too large for my taste, but functional programming is easy and natural in it.

#functionalprogramming #scala

Friday, November 28, 2014

Functional Programming in Mainstream Languages, Part 4: Higher-Order Functions in JavaScript

(Jump to Table of Contents)
JavaScript is the only universal language for client-side programming for web applications.  If you need to write anything that will run in the browser, JavaScript is currently your only option.  That is unfortunate, since JavaScript is an amalgam of some good parts and some truly horrible parts.  (See the book JavaScript: The Good Parts for more details.)

Among the good parts of JavaScript is its functional part, which is widely used to create callbacks; but it can also be used for functional-programming techniques in general.  For a fascinating but advanced presentation of the elegant way in which Netflix uses functional programming in JavaScript to perform asynchronous processing, see the ACM webinar titled "August 2014: Async JavaScript at Netflix."

In this part of the series, I will show the examples of higher-order functions from Part 3 in JavaScript, and contrast it with the Java 8 implementation.  For the Scheme and Java 7 versions, see Part 3.  First, however, is the simple example from Part 2:

Java 8
IntFunction<IntUnaryOperator> makeAdder = x -> y -> x + y;
IntUnaryOperator increment = makeAdder.apply(1);
int res = increment.applyAsInt(4);
JavaScript
var makeAdder = function(x) {
    return function(y) {
        return x + y;
    }
}
var increment = makeAdder(1);
var res = increment(4);

As you can see, the JavaScript syntax for defining functions is more verbose but not unreasonable.  In contrast, function calls use the familiar mathematical notation without requiring a spurious method call.

Ecma International publishes the standards for the JavaScript language, under the name ECMAScript.  The draft ECMAScript 6 standard (code named Harmony) introduces "arrow functions"; these are very similar to regular JavaScript functions but use an alternative definition syntax (and improve the semantics in small but significant ways).  For comparison, here is the same code using this notation:

ECMAScript Harmony (still in draft status)
var makeAdder = x => y => x + y;
var increment = makeAdder(1);
var res = increment(4);

This is very similar to Java 8, except for the use of the => symbol instead of ->.  As in Java 8, this is the simplest syntax for defining functions; there are variants for more complex cases.  For example, if the function has more (or fewer) than one argument, the argument list must be placed in parentheses.  Unlike Java 8, of course, ECMAScript has no syntax to specify the types of the arguments or the returned value.  There are other differences as well; see the proposed draft for further details.
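
For instance, here is a quick sketch of those variants:

ECMAScript Harmony
var add = (x, y) => x + y;    // two parameters require parentheses
var answer = () => 42;        // as does an empty parameter list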

Here is the recursive fixed-point function:

Java 8
public double fixedpoint(DoubleUnaryOperator f, double v) {
    double next = f.applyAsDouble(v);
    if (Math.abs(next - v) < epsilon)
        return next;
    else
        return fixedpoint(f, next);
}
JavaScript
function fixedpoint(f, v) {
    var next = f(v);
    if (Math.abs(next - v) < epsilon)
        return next;
    else
        return fixedpoint(f, next);
}

These are very similar, except for the lack of types in JavaScript.  Here is the imperative version of fixedpoint.  JavaScript, like Java, doesn't guarantee that tail-recursion optimization will always be performed, and this version is more natural for JavaScript.  For these reasons, I find this acceptable practice even when using functional-programming techniques in JavaScript, provided that the variables being modified aren't used in any other context.

Java 8
public double fixedpoint(DoubleUnaryOperator f, double v) {
    double prev = v;
    double next = v;
    do {
        prev = next;
        next = f.applyAsDouble(next);
    } while (Math.abs(next - prev) >= epsilon);
    return next;
}
JavaScript
function fixedpoint(f, v) {
    var next = v;
    var prev = v;
    do {
        prev = next;
        next = f(next);
    } while (Math.abs(next - prev) >= epsilon);
    return next;
}

Next is the naive non-terminating version of sqrt that uses fixedpoint:

Java 8
public double sqrt(double x) {
    return fixedpoint(y -> x / y, 1.0);
}
JavaScript
function sqrt(x) {
    return fixedpoint(function(y) {
        return x / y;
    }, 1);
}

The boilerplate code is starting to get in the way here (although not as much as in Java 7, due to the lack of types).  Still, it's becoming difficult to recognize the syntactic role of the constant 1.  This annoyance should disappear with the release of ECMAScript Harmony.

The terminating version of sqrt that uses average damping implicitly looks like this:

Java 8
public double sqrt(double x) {
    return fixedpoint(y -> (y + x / y) / 2.0, 1.0);
}
JavaScript
function sqrt(x) {
    return fixedpoint(function(y) {
        return (y + x / y) / 2;
    }, 1);
}

Finally, here is the average-damp operator and the implementation of sqrt that uses it explicitly:

Java 8
public DoubleUnaryOperator averageDamp(DoubleUnaryOperator f) {
    return x -> (x + f.applyAsDouble(x)) / 2.0;
}

public double sqrt(double x) {
    return fixedpoint(averageDamp(y -> x / y), 1.0);
}
JavaScript
function averageDamp(f) {
    return function(x) {
        return (x + f(x)) / 2;
    }
}

function sqrt(x) {
    return fixedpoint(averageDamp(function(y) {
        return x / y;
    }), 1);
}

In summary, higher-order functions are quite natural in JavaScript, with a syntax that is somewhat cumbersome but still manageable.  One of the reasons it is simpler than Java 7 is the lack of types, which is a result of the lack of static typing in JavaScript as a whole.  An interesting compromise is the language TypeScript, which supports gradual typing (see my review in a previous post) but compiles into JavaScript.  TypeScript uses the arrow-function notation proposed for ECMAScript, so it allows very concise function definitions that do contain types.  I will address TypeScript in a future post.

Coming up in this series: Scala, Xtend, and Haskell!

#functionalprogramming #javascript

Thursday, November 27, 2014

Functional Programming in Mainstream Languages, Part 3: Higher-Order Functions in Java 7 and 8

(Jump to Table of Contents)
Having seen (in Part 2 of this series) the basic syntax for expressing functions in Java 8, we will now explore some functional-programming techniques and compare how they are expressed in Scheme, Java 7, and Java 8.  The examples are adapted from the book Structure and Interpretation of Computer Programs (2nd Edition) by Abelson and Sussman.

The following function attempts to compute an approximation to a fixed point of a given function f; that is, a value x such that x = f(x).  The computation starts from an initial guess c, and repeatedly applies the function f to it, to get the series f(c), f(f(c)), f(f(f(c))), ....  If this process converges, the result is close to a fixed point, since convergence means that applying f one more time returns (almost) the same value.  However, the process may not converge at all, either because f has no fixed points at all, or because the process oscillates without finding a fixed point.

Here are the implementations of this function in the three languages:

Scheme
(define (fixed-point f first-guess)
  (define (try guess)
    (let ((next (f guess)))
      (if (< (abs (- guess next)) epsilon)
          next
          (try next))))
  (try first-guess))
Java 7
public double fixedpoint(UFunc<Double, Double> f, double v) {
    double next = f.apply(v);
    if (Math.abs(next - v) < epsilon)
        return next;
    else
        return fixedpoint(f, next);
}
Java 8
public double fixedpoint(DoubleUnaryOperator f, double v) {
    double next = f.applyAsDouble(v);
    if (Math.abs(next - v) < epsilon)
        return next;
    else
        return fixedpoint(f, next);
}

Functional implementations usually use recursion instead of loops; the Scheme implementation defines an internal recursive function try, which computes the next value in the series and checks whether it is already close enough to the previous one to consider the series to have converged.  If so, it returns the last value in the series; if not, it continues recursively to try the next value.

The two Java implementations are quite similar, except for the types.  For the Java 7 implementation, I created the following interface to represent unary functions from a domain D to a range R:

Java 7
public interface UFunc<D, R> {
    R apply(D arg1);
}

As I explained in the previous post, Java 8 supplies a large set of interfaces to describe functions; DoubleUnaryOperator is the type of functions that take a double and return a double.

There is a fundamental difference between functional languages (including Scheme) and most other languages, including Java.  In the former, tail-recursive calls are converted by the compiler into jumps that don't add a frame to the execution stack.  This means that such calls behave like loops in terms of their memory consumption.  (For more details see Abelson and Sussman.)  However, the latter languages don't guarantee this property.  As a result, the Java implementations above may use stack space proportional to the number of applications of the function required for convergence.

A more natural style for implementing these methods in Java is imperative rather than functional:

Java 7
public double fixedpoint(UFunc<Double, Double> f, double v) {
    double prev = v;
    double next = v;
    do {
        prev = next;
        next = f.apply(next);
    } while (Math.abs(next - prev) >= epsilon);
    return next;
}
Java 8
public double fixedpoint(DoubleUnaryOperator f, double v) {
    double prev = v;
    double next = v;
    do {
        prev = next;
        next = f.applyAsDouble(next);
    } while (Math.abs(next - prev) >= epsilon);
    return next;
}

While functional-programming purists frown at this style, which modifies the values of the variables prev and next, this style is more natural for Java, and uses constant space on the stack.  For those reasons, I find it acceptable when using functional techniques in Java, provided that the changes are confined to local variables (rather than fields), and, in particular, locals that aren't referenced in any other method (such as those in inner classes) or function.

We can now try to use the fixedpoint function to compute square roots, using the fact that y is a square root of x if y = x/y; in other words, if y is a fixed point of the function λy.x/y:

Scheme
(define (sqrt x) (fixed-point (lambda (y) (/ x y)) 1.0))
Java 7
public double sqrt(final double x) {
    return fixedpoint(new UFunc<Double, Double>() {
        @Override
        public Double apply(Double y) {
            return x / y;
        }
    }, 1.0);
}
Java 8
public double sqrt(double x) {
    return fixedpoint(y -> x / y, 1.0);
}

Here we need to create an anonymous function ("lambda"), and here the difference between Java 7 and Java 8 becomes very clear.  In Java 7 we need to create the function as an anonymous inner class, with all the necessary boilerplate code, which obscures the essentials of what we are trying to do.  (For example, it is quite difficult to figure out that 1.0 is the second parameter to the fixedpoint method.)  In contrast, in Java 8 we can just use the new lambda notation, and the meaning is immediately clear.

In both versions of Java, inner methods can only refer to variables defined in enclosing scopes if these variables are final; that is, if they can't be modified.  In Java 8 the variable doesn't need to be declared final, but it needs to be "effectively final," which means that it is never changed.  (It is as though the compiler added the final modifier itself; this is similar to the type inference done by the Java 8 compiler.)

The reason for this restriction is that local variables have a lifetime (sometimes called "extent") that coincides with the lifetime of the method in which they are defined.  In other words, when the method returns, these variables are no longer accessible.  This is done so that the variables can be allocated on the stack rather than on the heap, avoiding the need for garbage-collecting them.  However, an object containing an inner method that refers to these variables may be used after the method that generated the object has already returned.  If (and only if) the variables are final is it possible to copy their values to the new object so that references to their values are still valid.  This is a poor-man's version of a closure, which is an object containing a function pointer together with all accessible variables in outer scopes.  As we will see in a later post, Scheme (like other dialects of Lisp) supports full closures, which also allow changing bound variables.
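
Here is a minimal sketch of this restriction in Java 8 (the names are mine):

Java 8
import java.util.function.IntSupplier;

public IntSupplier constantly(int start) {
    int base = start;     // assigned once and never modified: "effectively final"
    return () -> base;    // legal: the lambda captures the value of base
    // Adding an assignment such as "base++;" anywhere in this method would make
    // base no longer effectively final, and the lambda would not compile.
}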

Unfortunately, the method above doesn't work in any language.  The problem is that the computational process oscillates between two values and doesn't converge (try it for x=2).  One value is too high, and the other is too low.  However, if we take their average, the process converges; the following implementations all compute the square root:

Scheme
(define (sqrt x)
  (fixed-point (lambda (y) (/ (+ y (/ x y)) 2.0))
               1.0))
Java 7
public double sqrt(final double x) {
    return fixedpoint(new UFunc<Double, Double>() {
        @Override
        public Double apply(Double y) {
            return (y + x / y) / 2.0;
        }
    }, 1.0);
}
Java 8
public double sqrt(double x) {
    return fixedpoint(y -> (y + x / y) / 2.0, 1.0);
}

It turns out that this is a general technique, called average damping.  It takes any function f and returns the function λx.(x+f(x))/2; this function has exactly the same fixed points as f.  In certain cases, however, the computational process of the fixed-point function converges with the new function when it diverges on the original function.  This technique is easily specified as a program:

Scheme
(define (average-damp f)
  (lambda (x) (average x (f x))))
Java 7
public UFunc<Double, Double> averageDamp(final UFunc<Double, Double> f) {
    return new UFunc<Double, Double>() {
        @Override
        public Double apply(Double x) {
            return (x + f.apply(x)) / 2.0;
        }
    };
}
Java 8
public DoubleUnaryOperator averageDamp(DoubleUnaryOperator f) {
    return x -> (x + f.applyAsDouble(x)) / 2.0;
}

Again, the Java 7 version is obscured by the boilerplate code (not to mention the annoying repetitions of the template type).  Here is the definition of sqrt that uses average damping (this time I will spare you the agony of the Java 7 version):

Scheme
(define (sqrt x) (fixed-point (average-damp (lambda (y) (/ x y))) 1.0))
Java 8
public double sqrt(double x) {
    return fixedpoint(averageDamp(y -> x / y), 1.0);
}

Computationally, this is exactly the same as the previous version, except that we use the original function λy.x/y after applying the average-damp operator to it.

In summary, it is possible to use functional abstraction in Java 7, but it is very painful.  The one case where it is widely used (although not usually thought of as such) is the case of callback functions, which are encapsulated as methods in one-method classes.  Java 8 makes higher-order functions much easier to use, even though the cumbersome type system is still annoying and gets in the way.

In subsequent posts we will see how to express the same ideas in other languages.

#functionalprogramming #java