Review of three Functional Programming papers

Download 60.54 Kb.

Date	28.04.2018
Size	60.54 Kb.
	#41813
Type	Review

Comp432 Essay 1 – A review of three Functional Programming papers.

Daniel Ballinger

300041839
The three separate reviews look at the following papers in regard to functional programming.

Simon Thompson: Higher-order + polymorphic = reusable, unpublished note. [1]
Paul Hudak, Mark P. Jones: Haskell vs Ada vs C++ vs Awk vs …, An Experiment in Software Prototyping Productivity. Unpublished report. Yale University Department of Computer Science, July 4 1994. [2]
Henk Barendregt: The Impact of the Lambda Calculus in Logic and Computer Science, Bulletin of Symbolic Logic, 3(2):181-215, 1997. §§ 1 – 3. [3]

Review of Simon Thompson’s unpublished note, ‘Higher-order + polymorphic = reusable’.

As the title suggests, the driving motivation behind this paper is to show that the dual features of higher-order functions and polymorphism (or generics, in other terminology) support a programming style that encourages re-use. To prove this concept, Simon Thompson shows that using a higher-order, polymorphic interface signature can describe the essential properties of various types, and, using this approach, a programmer can treat one type as if it was another. This form of data abstraction supports a general ‘algorithm oriented’ programming style as introduced by David Musser and Alexander Stepanov in Algorithm-Oriented Generic Libraries.
Thompson starts by giving an introduction to functional programming using Miranda^¹, which supports both higher-order functions and polymorphism. Here, and throughout the paper, he makes good use of concrete code examples to clearly demonstrate the ideas being discussed.

The basics of functional programming are touched on including: selection via alternatives, the local definition of functions, lists, and patterns for case analysis. Even with this introduction, a prior basic understanding of functional programming and its idioms would benefit the reader in following the examples.

With the basic introduction out of the way, Thompson demonstrates how both higher-order functions and polymorphism can encourage and support re-use independently.
Firstly, higher-order functions are used to abstract over a particular function showing that the resulting (more abstract) function can be reused in many situations, both in similar and completely different contexts. He points out how abstracting a particular sort of behaviour from a function increases it generality.
With polymorphism, an example of a generic and a polymorphic function are given. In the former of the two functions, the example given works on lists, and, as it contains no reference to the type of objects in the list, it is applicable to a list of any type. The later function introduces type variables, which can produce parametric polymorphism. So, by itself, polymorphism supports re-use in a number of ways.
Thompson then proceeds to demonstrate how together higher-order functions and polymorphism are general functions of considerable power. To do this he revisits a map function from the higher-order section and uses type variables to increase the generality of the function. Hence giving two dimensions of generality: the type of list is arbitrary (parametric polymorphism) and the operation applied to each element of the list is an argument (a higher-order function). He then gives a worked example in creating a general function – foldr (fold the function from the right). Finally in this section, first-order^² and higher-order functions are examined from the Miranda ‘standard environment’.
Inspired by the iterators of the C++ Standard Template Library, Thompson looks for a characterisation in next few sections that illustrates how to take an abstract view of data structures by treating one type like another. The first phase in doing this was the creation of a signature for list-like types t * -> t * fold :: (** -> * -> t * -> **) -> ** -> t * -> ** -->. From there it was possible to show “that a family of types t * is list-like if we can define all the general functions conforming to the signature. Given such a family, we can define all the general functions over the types in the family.” What Thompson has done is to create a higher-order polymorphic interface signature that can describe the essential properties of various types, and, in using this signature, treat one type as if it were another.
To demonstrate the usefulness of the list type signature, an example is given showing the treatment of binary trees as lists and the creation of a function snoc that redefines the cons function to append to the end of a list rather than the start.
Demonstrating that this technique can be used on structures other than lists, Thompson defines a signature for tree-like types. He uses it to create a quicksort style algorithm on lists and to allow lists to be treated as trees.
The final section looks at the mechanisms by which the abstract view of data structures can be supported in various programming languages. These include:

Parameter passing – This allows for higher-order functions to be given different types or different recursions over the same type, for each invocation within a single scope.

Constructor classes – This allows the classification of type constructors rather than types. These type constructors are very similar in nature to the signatures defined throughout the paper.
Abstract data types - Declared in many languages via a specific signature to which the implementation is bound. Only one implementation is allowed per scope.
Structures and signatures – An approach that binds structures to signatures. It does not allow the overloading of multiple bindings per scope.
Dynamic binding – An example is given of how operations are dynamically bound in a C++ inheritance hierarchy.

Review of Paul Hudak and Mark Jones’ Unpublished report, Haskell vs Ada vs C++ vs Awk vs …

This paper is about the reflections of the authors on a prototyping experiment carried out by the United States Advanced Research Projects Agency (ARPA) in conjunction with the Office of Navel Research (ONR) and the Navel Surface Warfare Center (NSWC). The experiment occurred in the fall of 1993 and involved creating a prototype for a Geometric Region Server, which forms part of a larger weapons system.
Hudak and Jones identified this experiment as an opportunity for functional programming languages, in particular Haskell, to demonstrate their ability to replace mainstream languages like C++ in prototyping. They use the review of the programs and development metrics done by a committee chosen by the Navy to show how Haskell outperformed the other languages involved and to identify reasons why it performed so well. In particular, they believe “the Haskell prototype took significantly less time to develop and was considerably more concise and easier to understand than the corresponding prototypes written in several different imperative languages, including Ada and C++.”
The conclusions drawn in the paper emphasise the emerging opinion that performing prototyping experiments to reduce risk as early as possible actually reduces costs while improving quality. Hudak and Jones emphasise that identifying defects and design problems sooner will reduce costs, as they are much more costly to remove later. Also, prototyping can finalise subsystem interfaces early, while experimentally demonstrating that they are well designed. This allows a large project to be split into several smaller projects, reducing the communication between teams and increasing productivity. They reason that productivity declines exponentially as project size increases; hence, programmers and designers are more effective working in several smaller groups on smaller projects than altogether on one large project.
The primary goal of prototyping is simply to understand better both the problem and solution spaces, thereby reducing the risks of the project. It may also be applied for the design and resource spaces; in fact, it can be applied at any stage where there is a potentially poor understanding of the problem, the solutions, and the available tools and resources. Hudak and Jones advocate using prototyping at many stages in the software life-cycle and that the key product of prototyping is a document that describes what has been learned. Also, the prototype itself can form an executable specification that may later be developed and refined into the final product.
In the second section of the paper, details of the NSWC experiment are given. This includes how the experiment was conducted, what some of the problems with the experiment were, and what some of the virtues were. An interesting point to come out of this section was that metrics were mainly collected by the developers, with no guidelines as to what exactly should be reported. Hence the metrics could be questionable (some cases of this are pointed out in latter sections) as even what constitutes a line of code could be interrupted differently or abused to give more favourable results. Also, the somewhat ambiguous problem specification given, although realistic, allowed for designers to interpret what was required differently. As such, many of the designs provided differing functionality.
The third section contains a brief, but clear, description of how the Geo-server should work. This is mainly done using the output generated by the Haskell prototype. An outline of the objectives for the project, using quotes from problem spec, along with several extras imposed by the authors, are outlined to give criteria that can be used to evaluate designs against. Other than directly meeting the needs of the Geo-server, two important objectives were to minimise work to achieve a first delivery and allow room for incremental enhancement, both desirable characteristics for a prototype.
A very quick introduction to what languages were used is given in the fourth section along with explanations about some of the lesser-known languages. The fifth section then includes brief comments on the resulting prototype for each language and a table containing metrics, including lines of code, lines of documentation, and development time. Many interesting points are made here, including:

The Rapide solution addressing software architecture rather than the essential functionality of the system.
Several languages not even compiling or running due to lack of suitable language specific tools (compilers) at the time of the experiment.
The issue of how the language elements like standard prelude, libraries and data encoding should be included in the line counts.
The amount of documentation produced affecting the development time. Either producing no documentation or not including the documentation time in the development time metric.

I have several issues with the use of lines of code as a metric to compare functional and imperative languages. As a metric, lines of code can turn into a syntax issue. The authors comment, “The syntax issue is a subjective one, but we note, for example, that many of languages used in this experiment have relatively heavy-weight “begin…end” constructions, whereas Haskell uses a convenient ‘’layout’’ rule for block structuring.” Clearly Haskell code will be more compact than the imperative languages.

I believe that mainly this metric can be used to infer how concise the design is. In theory a succinct design should have better clarity, as there is less code to understand. However, although functional code can fit into much fewer lines, it is still possible to write bad code that is just as hard to understand as some imperative style code. Some part of the clarity of the code comes from the programmer’s level of understanding of the language and syntax. There is an example in the experiment of one programmer trying to compress as many statements onto a single line to reduce the line count, defeating the purpose of the metric, as clarity could be greatly reduced. Something interesting about the Haskell code was the possibility to reduce the lines of code from 85 to 36 by having the compiler infer the types.
I also have concerns about what constitutes an experienced programmer, the amount of information provided to each team and the size of the development teams.
Examples of programmers not making use of higher-order functions, even though the language supports them, indicates they may not be experienced programmers, at least with the particular language. One could also question how experienced the designers were for some languages when no compiler existed at the time of writing. It’s hard to be experienced or claim the language to be mature if you can’t write code that executes.
Some developers were provided with solutions done by other developers, leading to questions about how experimentally correct their designs were. The authors point out that a college graduate learnt Haskell in 8 days, with the aid of an experienced Haskell programmer, and then produced their own prototype. They were, however, given the Haskell report written by the Navy, leaving questions about how much of the design could be derived from this document. Not all the developers were able to attend a verbal briefing on the experiment (to their credit, this included the Haskell team), putting them at a disadvantage to those who did. Finally, the size of the development teams varied in size from an individual through to several people. This would effect the development time and the quality of the design that emerges.
During the review it was recognised that the analysis based on metrics, for objectivity, was perhaps not as valuable as carefully examining the code itself. The review panel did this and graded each language based on several criteria. The authors point out that “grade compression” factors probably prevent any declaration of a “clear winner” and then proceed to point out that “Haskell did fare better than any other languages in this analysis”, indicating the authors believe the Haskell results made it the winner. They then look at the features that gave the Haskell result its conciseness, high level of documentation, and extensibility.
In the last section the authors reflect on reactions to the functional model, including preconceptions about programming leading assessors to believe that the Haskell model submitted was incomplete as it appeared to be just a mixture of requirements specification and top level design. Questions were also raised about the use of higher-order functions to capture regions of space as a clever trick that would not work in other contexts.
Final remarks by the author’s point out the experiment was unique in its structure, and noteworthy for functional programming and that in the end, prototyping was the winner at the end of the day.

Review of Henk Barendregt’s, The Impact of the Lambda Calculus in Logic and Computer Science.

Alonzo Church invented the lambda calculus originally trying to construct a formal system for the foundations of mathematics by having a system of functions together with a set of logical notions. When his initial attempts proved inconsistent, he abandoned his original goal and lifted out the lambda calculus that concentrates on computability. Henk Barendregt wrote this paper to honour this great invention and outline some of the contributions it has made.
The paper has four main sections and restricts discussion to within the fields of mathematical logic and computer science. Firstly, a general introduction is given on the notions of computability and how the lambda calculus is involved. This section also includes a preliminary introduction for readers not familiar with the lambda calculus. The remaining three sections then look at the process of formalising the notion of ‘computable’, and its impact on two areas of mathematical logic: the representation of computations and of reasoning. The final section on reasoning is beyond the scope of this review and will be emitted.
The lambda calculus provided the first formalised notion of computability in terms of definability on numerals represented within it. In the first section of the paper, Barendregt points out that Church’s Thesis, on the correct formalisation on computability, has not been seriously challenged in more than 60 years and that it has been extended to include computations on data types other than integers, like trees and syntactic definitions. Of particular interest in this section is that the notion that lambda definability is conceptually the basis for the discipline of functional programming. A very brief history of the lambda calculus and Church’s goals is given including an explanation of how the letter ‘’ was chosen to denote function abstraction.
The latter part of the introductory section is dedicated to giving a preliminary introduction into the lambda calculus for those not familiar with it. This covers a range of concepts from the untyped and typed lambda calculus including:

Syntax
Notation conventions (associativity and brackets)
Free and bound occurrences of variables
The notions of -convertibility, -reduction, -normal form and -redex
The Church-Rosser theorem
The simply typed lambda calculus
Inductive types and recursion

This may prove sufficient for some readers but I found other introductions available that present the material in a form that is easier to comprehend.
In the last three sections of the paper the author uses a writing style that greatly reduces the readability of the paper. The paper contains 119 references spanning over 6 pages, all of which are referred to numerically (e.g., [28]). This is not a problem in itself except that the authors writing style requires the reader to constantly turn back to the references to find what he’s referring to. Take these sentences from page 185 for example, “The system  turned out to be inconsistent, as was shown by Church’s students [71] using a tour de force argument involving all the techniques needed to prove Gödel’s incompleteness theorem. Then [28] isolated the (untyped) lambda calculus from the system  by deleting the part dealing with logic…” At other points in the paper theories are referred to only by reference, often requiring the reference to be read first to understand the point the author is making. An example of this is on page 195, “Eager evaluation, however, is not a normalising reduction strategy in the sense of [6, Chapter 12].”

The constant need to look up references in the appendix, and possibly read them to find what is actually being referred to, combined with the long paragraphs, in some cases almost the length of an A4 page, reduce the overall readability of this paper.

The second section of the paper relates to formalising the notion of ‘computable’. Barendregt explains how Church introduced a formal theory based on the notion of a function to be a foundation of mathematics. This theory was later proved to be inconsistent by one of Church’s students by using a tour de force argument. This led him to then isolate the (untyped) lambda calculus from the original theory by deleting the parts dealing with logic and keeping the essence of the part dealing with functions. From here Church introduced the notion of lambda definability for functions f : N^k  N in order to capture the notion of computability. Barendregt briefly follows through the evolution of lambda definability from being able to prove only elementary functions initially (like addition) to more complicated functions like predecessor (pred(0) = 0, pred(n + 1) = n).
A short paragraph in the middle of this section points out that the notions of lambda definability and Turing computability (an independent alternative formalisation using Turing machines) are equivalent, thereby enlarging the credibility of Church’s Thesis.
As pointed out in an earlier section, Church’s Thesis has not been refuted in more than 60 years. While Church’s Thesis is plausible it cannot be proved, nor even stated in (classical) mathematical terms, since it refers to the undefined notion of intuitive computability. Barendregt comments that “most logicians do believe Church’s Thesis” and that rather than use it to prove something is computable, it is used to show a function is not intuitively computable. Two examples of such undecidable predicates are given: the normalisation problem, whether a lambda term has a normal form, and the halting problem, whether a machine with program p and input x terminates.
The section concludes by precisely formulating Church’s Thesis as a formal statement in intuitionistic mathematics.
The third section of the paper describes the impact the lambda calculus had on the representation of computations and commences with a discussion about the various versions of typed and untyped calculi. A point made by Barendregt here is that Lambda calculi are prototype programming languages. The two versions of the simply typed calculus are divided into either Church’s versions (explicit types) and Curry’s versions (implicit types). Also mentioned at the start of the section are higher order lambda calculi 2 and .
A subsection (3.1) is present to explain how it is possible to represent data types in a very direct manner in various calculi. Barendregt observes that lambda definability was introduced for functions on the set of natural numbers N and that all other domains of input and output have been treated as second class citizens by coding them as natural numbers. He points out that the lambda calculus is just strong enough to code other domains directly as lambda terms while preserving their structure. The results of doing this is that a much more efficient representation of algorithms on these data types can be given. Barendregt then proceeds to give an example of encoding labelled trees and a mirroring function on them via a series of definitions, propositions, proofs and corollaries. He does this twice, once using lambda terms typeable in the second order lambda calculus λ2, and then using an essentially untypeable representation in terms of lambda calculi.
A short history is presented in the next subsection (3.2) of “how lambda calculi (untyped and typed) inspired (either consciously or unconsciously) the creation of functional programming”. Here Barendregt describes how early computers favoured the Turing model of computation, and how the von Neumann computer model (with software) caused imperative languages to develop faster than functional languages based on Church’s lambda calculus. The key reason given for the slow development of Church’s computability model (lambda definability) was that it was harder to interpret in hardware. Some interesting reasons are given as to why the functional paradigm is more convenient than the traditional imperative one, including that “functional programs are closer to the specification of computational problems than imperative ones” and that functional programs are much more natural at expressing parallelism.
Functional languages that were – and in some cases still are – influential in the expansion of functional programming are then discussed and divided into broad classes. These classes included the reduction strategy used in attempting to reach the normal form of a lambda term (which is unique if it exists), typed and untyped languages, and interactive functional languages. The relative merits of each language are pointed out, along with some the undesirable characteristics in some cases.
A brief list of Influential languages in functional programming covered includes:

APL and FP – Early languages that consist of a set of basic functions that can be combined to define operations of data structures. They are, however, less complete than a full functional language.
LISP – The first functional language. It used eager evaluation but was not a pure functional language.
Common LISP – Removed the dynamic binding from LISP but was still not a pure language.
Standard ML – the first important typed functional language with eager evaluation, based on the Curry variant of the  calculus.
SASL – an experimental language based on the idea of graph reductions and using carefully chosen combinators as primitives. One of the first lazy functional languages. It was untyped and much slower than the current imperative languages.
Lazy ML – A typed, lazy version of ML.
Miranda^TM– the first fully developed, typed, functional language.
Clean and Haskell – more modern and very efficient lazy functional languages.

In terms of evaluation strategies, eager and lazy evaluation are discussed. A definite advantage pointed out in favour of eager evaluation is the efficiency of implementation in terms of time and space. Barendregt observes that while all computable functions can be represented in an eager language, not all reductions in the full λK-calculus can be performed. In contrast to eager evaluation, lazy evaluation is evaluated leftmost outermost, which is regarded as being a more desirable normalising reduction strategy, but in order to make lazy evaluation efficient, graph reductions were needed. Lazy evaluation has the advantage of being able to work on infinite objects/structures.

Barendregt describes typing as useful “because many programming bugs (errors) results in a typing error that can be detected automatically prior to running one’s program.” But it can be unnecessary in some situations, as a type reconstruction algorithm can automatically find the type of an untyped but typeable expression.
The final part of this subsection looks at interactive functional languages, which address the autistic nature of many functional languages (they can’t interact with the outside world) by using the concept of a process as opposed to a function. Barendregt looks at how continuations are added to lambda calculus as a satisfactory way to deal with I/O and comments that “The use of continuations is equivalent to that of monads in programming languages like Haskell.” He finishes with the problem of duplication of I/O channels and how Clean’s uniqueness typing system addresses this.

References

[1] Simon Thompson: Higher-order + polymorphic = reusable, Journal of Functional Programming, 1997.

http://www.cs.ukc.ac.uk/pubs/1997/224

[2] Paul Hudak, Mark P. Jones: Haskell vs Ada vs C++ vs Awk vs …, An Experiment in Software Prototyping Productivity. Unpublished report. Yale University Department of Computer Science, July 4 1994.

[3] Henk Barendregt: The Impact of the Lambda Calculus in Logic and Computer Science, Bulletin of Symbolic Logic, 3(2):181-215, 1997. §§ 1 – 3.

http://www.math.ucla.edu/~asl/bsl/0302/0302-003.ps

Glossary

The following definitions are mainly done using relevant quotes from the references.

Higher-order:

Functions that take function valued arguments.

[1] “We call a function higher-order when it takes a function as an argument or returns a function as result”

Polymorphism (algorithmic abstraction):

Parametric, Using the same algorithm for different type classes.

[1] “Polymorphism allows operations to be applied over whole classes of types, whilst function parameters mean that particular operations can be abstracted away, to be passed in as values on application.”

Ad hoc (overloading), One function name, with different algorithms used depending on the type.

Encapsulation (data abstraction):

[1] “… access to a particular type can be given (solely) through a signature of operations, hiding the concrete nature of the type.”

Iterator:

[1] “… an abstract index into a sequential structure…”

[1] “… an abstraction for indices (or pointers ), allowing a walk through a sequential structure.”

Functional Programming Language:

[1] “… based on function definitions as equations or more generally as sequences of equations.”

Type variable:

[1] “… used to signify that the property is valid for any instance of the variable.”

Extensibility:

[2] “ease with which the prototype can be absorbed and altered.”

Understandability:

[2] “rapidity with which the prototype can be absorbed and altered.”

Appropriateness:

[2] “the expressiveness of the language relative to the task at hand,”

Accuracy:

[2] “correct functioning of the prototype.”

Compactness:

[2] “capability per line of code.”

Redex:

A reducible expression.

1 Miranda is a trademark of Research Software Ltd.

2 [1] - “They are ‘structural’, in that their operation is independent of the elements of the list.” Referring to functions that operate on lists but do not examine the actual values.

Daniel Ballinger (300041839) Page of 二〇:五七二十七.四./P.四. 27/04/18

Download 60.54 Kb.

Share with your friends: