Skill 1: Evaluating languages

You will often find yourself in a situation where you have to evaluate a language for a specific task or compare one language to another. Languages come and go. Before studying computer science as an undergraduate, my first programming languages were Perl and C++. While I was an undergraduate, the debate over teaching Java instead of C++ in introductory programming was in full swing. Today, scripting languages, like Ruby and Python, are making headlines. Essentially, every five years or so, we have to evaluate the languages out there and decide which one is right for our task.

No language is perfect for all tasks. I use OCaml a lot because it is great for writing the compiler-like tools that I frequently build. The skills you acquire in this topic (and by taking this course) will help you systematically and objectively evaluate a language or compare one language to another.

  1. You should understand and internalize the different characteristics in Table 1.1. This skill is of course a prerequisite for acquiring the subsequent skills in this section.
  2. You should be able to evaluate a language or feature with respect to the characteristics in Table 1.1.
  3. You should be able to compare two languages or language features with respect to the characteristics in Table 1.1.

Skill 2: Describing syntax (BNF)

Broadly speaking, each language has two parts: the syntax and the semantics. The syntax of most languages is expressed in a notation called BNF (the book describes two notations: BNF and EBNF; we will just use BNF). If you pick up any book on a language or visit a web site on a language, there is a good chance that you will be able to find a chapter or web page that describes the syntax of the language in BNF. Reading this syntax description will give you a clear and unambiguous understanding of the syntax of the language. You will most likely learn many new languages in your career and thus knowing how to read BNF will be valuable. Almost certainly you will also design new languages in your career. Really! Think about the number of domain-specific configuration and data processing languages out there. For example, many games have little languages; if you work for a company that builds computer games, you will most likely design a new language from scratch. For this task, knowing how to write a BNF grammar for a syntax is a prerequisite.

  1. You should be able to determine whether a given aspect of a language is part of the syntax or part of the semantics.
  2. You should be able to describe in your own words the language that a BNF grammar generates.
  3. You should be able to determine if a given sentence is generated by a given BNF grammar.
  4. You should be able to determine whether two grammars generate different languages or the same language.
  5. You should be able to create a BNF grammar that generates a given language.
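
As a minimal sketch of items 3 and 5, here is a small BNF grammar and a Python recognizer for it. The grammar and every function name are hypothetical illustrations, not taken from the book; the recognizer is a hand-written recursive-descent check with one function per nonterminal.

```python
# Hypothetical grammar (not from the course text):
#   <expr> ::= <term> "+" <expr> | <term>
#   <term> ::= "a" | "b" | "(" <expr> ")"

def parse_term(s, i):
    """Return the index just past a <term> starting at i, or None."""
    if i < len(s) and s[i] in "ab":
        return i + 1
    if i < len(s) and s[i] == "(":
        j = parse_expr(s, i + 1)
        if j is not None and j < len(s) and s[j] == ")":
            return j + 1
    return None

def parse_expr(s, i):
    """Return the index just past the longest <expr> starting at i, or None."""
    j = parse_term(s, i)
    if j is None:
        return None
    if j < len(s) and s[j] == "+":
        k = parse_expr(s, j + 1)
        if k is not None:
            return k
    return j

def generated(s):
    """True if the whole string s is a sentence of the grammar."""
    return parse_expr(s, 0) == len(s)
```

For instance, `generated("(a+b)+a")` holds, while `generated("a+")` and `generated("ab")` do not, because no derivation from `<expr>` produces those strings.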

Skill 3: Concrete and abstract syntax (derivations, parse trees, ambiguity, precedence, associativity)

When you implement parsers to recognize sentences, you will need to deal with issues such as ambiguity in the grammar.

  1. You should be able to derive sentences and construct parse trees from a grammar.
  2. You should be able to figure out if a given grammar is ambiguous.
  3. You should be able to figure out what precedence and associativity a given grammar uses.
  4. You should be able to determine what precedence a given language uses by writing and running programs in the language.
  5. You should be able to explain the difference and the roles of concrete syntax and abstract syntax.
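
Item 4 can be tried out directly. The following is a small sketch in Python (chosen only as an example language; the variable names are made up): each test picks operands for which the two possible groupings give different results, so the ungrouped expression reveals which grouping the language uses.

```python
# 10 - 4 - 3 is 3 if grouped to the left and 9 if grouped to the right.
minus_is_left_assoc = (10 - 4 - 3) == ((10 - 4) - 3)

# 2 ** 3 ** 2 is 64 if grouped to the left and 512 if grouped to the right.
power_is_right_assoc = (2 ** 3 ** 2) == (2 ** (3 ** 2))

# 2 + 3 * 4 is 14 if * binds tighter than + and 20 otherwise.
times_binds_tighter = (2 + 3 * 4) == (2 + (3 * 4))
```

In Python all three flags come out true: subtraction is left associative, exponentiation is right associative, and multiplication has higher precedence than addition.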

A related skill to the above is to take a grammar that is ambiguous or does not respect precedence or associativity and to rewrite it so that it is unambiguous and respects precedence and associativity. While this is an important topic, we will not have time for this in our class. This material is normally covered in a compiler construction course. In a compiler construction course, you often also implement parsers and learn about special classes of context-free languages that are amenable to parsing (e.g., LL(1), LR(k), LALR).

Skill 4: Binding

Bindings are a central concept in all programming languages. A binding is just an association between an entity and an attribute. For example, there is a binding between a variable and its type or between a variable and its value. The skills we are going to focus on here have to do with type bindings and storage bindings. Storage bindings, in particular, are very important to understand. Most languages support more than one storage binding and knowing the properties of the different storage bindings will enable you to pick the storage binding that best matches your needs.

  1. You should be able to discuss the relative strengths and weaknesses of static and dynamic type bindings.
  2. You should know how to use variables with each of the four storage bindings: static, stack dynamic, explicit heap dynamic, and implicit heap dynamic.
  3. Given a programming language, you should be able to determine, by writing programs, the storage binding of variables in the language.

Skill 5: Scoping

One of the important principles behind modern software engineering is the "need to know principle". If a client does not need some aspect of our code in order to effectively use our code, then we should hide that aspect from the client. Scoping is one of the several mechanisms that languages provide for hiding data. With scoping a variable can be hidden from code that should not be accessing the variable. The skills here will help you use scopes effectively in order to hide variables.

  1. You should be able to determine how a program will behave with static scoping and with dynamic scoping.
  2. You should be able to write a program that hides specified variables using scoping.
  3. You should be able to write a program that can distinguish between static and dynamic scoping.
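
A program of the kind item 3 asks for can be sketched as follows, here in Python with hypothetical names. The function `show` reads a free variable `x`; under static scoping it sees the binding where `show` was defined, while under dynamic scoping it would see the binding of its caller.

```python
x = "global"

def show():
    # x is free here: which binding does it refer to?
    return x

def caller():
    x = "local to caller"   # visible to show() only under dynamic scoping
    return show()

result = caller()
scoping = "static" if result == "global" else "dynamic"
```

Python uses static (lexical) scoping, so `scoping` ends up as `"static"`.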

Skill 6: Functional concepts

For many problems, one can get a much more elegant and compact solution with functional languages than with imperative languages. Many of the features from functional languages have crept into the more commonly used imperative languages (e.g., garbage collection and closures). Functional concepts also appear in many libraries and frameworks (e.g., the STL in C++, Google's MapReduce). In this skill set, we will learn about the main concepts behind functional languages.

  1. You should know what each of the following concepts means: referential transparency, functional forms, and first-class functions.
  2. You should know how to use recursion instead of iteration. This skill is not really something you will acquire in this course: you should already know how recursion works. We will just get a lot of practice in this course!
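
To make item 2 concrete, here is a short sketch in Python (function names are hypothetical) that computes the length of a list three ways: with a loop, with plain recursion, and with an accumulator so that the recursive call is in tail position. Note that Python itself does not optimize tail calls; the third version only illustrates the shape of such a function.

```python
def length_iter(xs):
    # Iterative version: a loop updating a counter.
    n = 0
    for _ in xs:
        n += 1
    return n

def length_rec(xs):
    # Recursive, but not tail recursive: the addition runs after the call returns.
    if not xs:
        return 0
    return 1 + length_rec(xs[1:])

def length_tail(xs, acc=0):
    # Tail recursive in shape: the recursive call is the last action performed.
    if not xs:
        return acc
    return length_tail(xs[1:], acc + 1)
```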

Skill 7: SML

Standard ML (SML) is a commonly used functional language today. It is also a very interesting language to study since it has many powerful features such as pattern matching, type inference, and parametric polymorphism.

  1. You should know how to read and write SML code using all the concepts covered in the reading.

    Declarations, Functions, and Tuples

    Declarations in SML introduce type and value bindings into environments. The scope of declarations is determined by lexical scoping rules.

  2. You should understand and be able to explain the difference between variables in SML used in value binding and variables in imperative programming languages.
  3. You should be able to describe the environment at any given program point (i.e., you should understand SML's scoping rules).
  4. You should be able to determine the type and value of simple SML expressions (using SML's scoping rules) if they exist.

    Recursion, Pattern Matching, and Datatypes

    Probably the most natural way to express repeated computation in functional languages is through recursion. This observation is particularly clear when working with recursive datatypes, which are one of the most important features of ML. Pattern matching is a particularly convenient feature when working with recursive datatypes.

  5. You should understand the difference between homogeneous and heterogeneous types and be able to explain their relation to exhaustiveness and redundancy in clausal definitions.
  6. You should be able to read and write SML code that uses pattern matching and clausal definitions.
  7. You should be able to determine whether or not a function is tail recursive.
  8. You should be able to write tail recursive functions.
  9. You should be able to define datatypes, as well as read and write SML code that manipulates them.

    Polymorphism

  10. You should understand what parametric polymorphism is in SML and implement polymorphic functions.

    Higher-Order Functions

    The power in functional languages comes largely from higher-order functions: the ability to pass functions as arguments and to return them as values.

  11. You should understand what currying is and be able to convert between functions that are curried and not curried.
  12. You should be able to stage computation using curried functions.
  13. You should understand the role of "fold" or "reduce" functions and how they abstract recursion over a data structure.
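
The ideas in items 11 to 13 are not specific to SML. As a rough sketch in Python (all names here are hypothetical), the code below converts between curried and uncurried forms, stages a computation by fixing the first argument, and uses `functools.reduce` as the "fold" that replaces an explicit recursion over a list.

```python
from functools import reduce

def add_uncurried(x, y):
    return x + y

def curry(f):
    """Turn a two-argument function into its curried form."""
    return lambda x: lambda y: f(x, y)

def uncurry(g):
    """Turn a curried function back into a two-argument function."""
    return lambda x, y: g(x)(y)

add = curry(add_uncurried)
add_five = add(5)   # staging: the first argument is fixed now, the second comes later

def total(xs):
    # reduce walks the list and combines each element with an accumulator,
    # which is exactly the recursion pattern a fold abstracts.
    return reduce(add_uncurried, xs, 0)
```

Here `add(2)(3)` and `uncurry(add)(2, 3)` both give 5, and `add_five(10)` gives 15.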

Skill 8: Induction

Induction is a key technique in computer science and can be naturally applied to prove properties of recursive functions. We first review mathematical induction. Then, we discuss structural induction, which is a generalization of mathematical induction, to prove properties of functions that are defined over recursively-defined types.

  1. You should understand the role of specifications and how to provide pre- and post-conditions for a function.
  2. You should understand and explain the difference between total correctness and partial correctness. You should be able to recognize when a theorem statement talks about total or partial correctness.
  3. You should be able to identify when to apply mathematical induction, complete induction, and structural induction.
  4. You should be able to prove simple properties of SML functions using induction.

Skill 9: Lambda calculus

The lambda calculus shows the essence of functional programming and computation. It serves as the "yeast" or model for programming language design.

  1. You should be able to explain the syntax and semantics of the lambda calculus and discuss informally how it captures the essence of computation. You should be able to evaluate lambda calculus expressions by hand.
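
One way to get a feel for how so small a calculus captures computation is to encode numbers as functions. The sketch below uses Python lambdas to write Church numerals; this encoding is a standard illustration rather than something prescribed by the course, and `to_int` is a helper added only to inspect results.

```python
# Church numerals: the number n is the function that applies f to x n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    """Decode a Church numeral by counting how often it applies its argument."""
    return n(lambda k: k + 1)(0)

one = succ(zero)
two = succ(one)
three = plus(one)(two)
```

Evaluating `to_int(three)` by hand is a good exercise: each application is one beta-reduction step, and the result is 3.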

Skill 10: Type checking and type equality

Types are the most interesting and varied aspect of programming languages. For example, C++ and Java "look" very similar (have similar syntax, control structures, etc.) but differ greatly in their type systems. C++'s type system is very liberal: you can cast anything to anything else and very few kinds of type errors are detected. On the other hand, Java's type system is much more restrictive: you can cast between types only in limited situations, and it catches all type errors either at compile time or at run time. Thus, an in-depth understanding of how types work and their implications is key to understanding how to make the best use of programming languages. If you do not understand or fully appreciate the type system for your language, you will find yourself working around it (which is rarely effective) rather than working with it. This set of skills is our first foray into understanding types; we will spend a lot of time on types this semester.

  1. You should know the relative strengths and weaknesses of strong and weak typing.
  2. You should know the relative strengths and weaknesses of name and structural equality mechanisms.
  3. You should be able to determine if two types are equal by name or structural equality or neither.
  4. You should be able to write programs that determine whether a language uses name or structural equality.
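
As a loose illustration of the two equality mechanisms, the Python sketch below defines two classes with identical structure but different names (all names are hypothetical). Ordinary classes behave nominally: the two types are distinct. A `typing.Protocol` marked `runtime_checkable` behaves structurally instead, although its runtime check only looks at which members exist, so it is a weaker notion than the structural equality discussed in the text.

```python
from typing import Protocol, runtime_checkable

class Meters:
    def __init__(self, amount):
        self.amount = amount
    def value(self):
        return self.amount

class Feet:
    # Same structure as Meters, different name.
    def __init__(self, amount):
        self.amount = amount
    def value(self):
        return self.amount

@runtime_checkable
class HasValue(Protocol):
    def value(self): ...

# Name-based view: the two classes are different types.
same_type_by_name = type(Meters(1.0)) is type(Feet(1.0))
feet_is_meters = isinstance(Feet(1.0), Meters)

# Structure-based view: anything with a value() member matches HasValue.
feet_has_shape = isinstance(Feet(1.0), HasValue)
meters_has_shape = isinstance(Meters(1.0), HasValue)
```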

Skill 11: Data types

In order to enable programmers to model their data as naturally and cleanly as possible, languages provide a number of different data types. If you use the right data types for the job, not only will you end up with cleaner code, but most likely you will also end up with better type checking. Thus, it is good to know what is out there.

You are most likely already very familiar with some data types: array, record (or struct in C/C++ terminology), primitive types (e.g., integers, booleans). There are some that you are probably not as familiar with: subrange types, enumeration types, union types, associative arrays. We will focus primarily on the ones that you don't already know well.

  1. You should know how and when to use subrange types.
  2. You should know how and when to use enumeration types.
  3. You should know how to do address computation to access an element of an array.
  4. You should know how and when to use associative arrays.
  5. You should know how and when to use union types.
  6. You should be able to explain the difference between discriminated and non-discriminated union types.
  7. You should be able to draw the memory layout of values of record and union types.
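
For item 3, the usual address computation for a two-dimensional array stored in row-major order can be sketched as below. The function and parameter names are made up for illustration, and row-major layout is an assumption; a column-major language would swap the roles of the two indices.

```python
def element_address(base, i, j, ncols, elem_size, lower_i=0, lower_j=0):
    """Address of a[i][j] in a row-major 2-D array.

    base      address of the first element
    ncols     number of elements per row
    elem_size size of one element in bytes
    lower_i, lower_j  lower bounds of the two index ranges
    """
    rows_skipped = i - lower_i
    cols_skipped = j - lower_j
    return base + (rows_skipped * ncols + cols_skipped) * elem_size
```

With `base = 1000`, five 4-byte elements per row, and zero-based indices, element `[2][3]` lies 13 elements past the start, at address 1052.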

Skill 12: Pointers

Pointers (or references) are incredibly powerful: they allow programmers to build unbounded recursive data types (such as lists or trees). By "unbounded", I mean that the sizes of the recursive data structures do not need to be known in advance. However, with this power come many difficulties, two of the most common being the dangling pointer and memory leak (also called lost heap-dynamic variable in the text) bugs. Garbage collection eliminates the dangling pointer problem but can cause very subtle memory leaks, which are hard to track. Having a good understanding of how these mechanisms work will help you use pointers more effectively and also debug memory management related bugs (which are the most common kinds of bugs in C/C++ programs).

Note that the book distinguishes between "reference counters" and "garbage collection". The research literature does not distinguish between them: "reference counters" are a form of "garbage collection". So when I use the term "garbage collection", I mean it to include reference counting.

  1. You should be able to recognize if a given (small) program suffers from either the dangling pointer or memory leak bug.
  2. You should be able to rewrite a program with a dangling pointer or memory leak bug so that it does not have such a bug.
  3. You should understand how reference counting works and what its strengths and weaknesses are (i.e., what memory management issues does it address and what are its limitations).
  4. You should understand how mark-and-sweep works and what its weaknesses are (i.e., what memory management issues does it address and what are its limitations).
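
The classic limitation of reference counting, cyclic structures, can be observed in CPython, which combines reference counting with a separate cycle collector. The sketch below assumes the CPython implementation specifically; the class and function names are hypothetical, and CPython's cycle collector is a tracing collector rather than the textbook mark-and-sweep algorithm.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

def make_single():
    a = Node()
    return weakref.ref(a)      # a weak reference does not keep a alive

def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a    # a and b refer to each other
    return weakref.ref(a)

gc.disable()                   # leave only reference counting active

single = make_single()
single_freed = single() is None        # count dropped to zero on return

probe = make_cycle()
cycle_survives = probe() is not None   # counts in the cycle never reach zero

gc.collect()                   # explicitly run the cycle collector
cycle_freed = probe() is None

gc.enable()
```

The lone node is reclaimed as soon as its count reaches zero, while the two-node cycle stays allocated until the tracing collector runs.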

Skill 13: Expressions

Expressions perform the computations in programs. To make them more intuitive, many languages borrow the concepts of precedence and associativity from mathematical conventions, with which programmers are already familiar. Precedence and associativity determine the order of evaluation of operators, that is, given an expression, which operator will we evaluate first and which one after that, and so on. In addition, to understand the semantics of an expression, one also needs to know about operand evaluation order, that is, the order in which one evaluates the operands of an expression. For example, in a+b+c, does one evaluate "a" first, then "b", then "c" or ...? Operand evaluation order is relevant in programming languages but not in mathematics because mathematics does not have side effects: operand evaluation order is relevant only if one has side effects. In this topic, you will acquire the skills to understand what an expression means in a given language.

  1. You should be able to write a program that exploits the language's precedence rules, associativity rules, and operand evaluation order to perform a given computation.
  2. You should be able to determine how a given program will behave when given the precedence rules, associativity rules, and operand evaluation order.
  3. You should be able to write a program that determines what precedence, associativity, and operand evaluation order a language uses.
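
A program that reveals operand evaluation order needs operands with side effects. In this Python sketch (the helper `operand` and the list `log` are hypothetical), each operand records its name when it is evaluated, so the log shows the order in which operands were evaluated, independently of which operator is applied first.

```python
log = []

def operand(name, value):
    log.append(name)    # side effect: record when this operand is evaluated
    return value

# Precedence says * is applied before +, but that does not by itself
# say which operand is evaluated first.
result = operand("a", 1) + operand("b", 2) * operand("c", 3)
```

In Python the log is `["a", "b", "c"]`: operands are evaluated left to right even though the multiplication is performed before the addition.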

Skill 14: Short-circuit evaluation and control constructs

Control constructs in a language determine what computation a program performs and when. Most of the control constructs in the reading are things you are already familiar with and thus we will not spend time on this in the class (but I will expect you to know them). Here are the skills related to this reading:

  1. You should be able to write a program that exploits short-circuit evaluation capabilities of a language.
  2. You should know when and how to use "switch" statements (multiway branch).
  3. You should know how and when to use different flavors of loops.
  4. You should know how to use the guarded command and how it differs from "if" and "while" statements.
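
A typical use of short-circuit evaluation is guarding an operation that would otherwise fail. In this Python sketch (the function name is hypothetical), the division on the right of `and` is evaluated only when the test on the left is true.

```python
def ratio_exceeds(numerator, denominator, limit):
    # If denominator == 0, the left operand is False and the right
    # operand (the division) is never evaluated.
    return denominator != 0 and numerator / denominator > limit
```

Calling `ratio_exceeds(10, 0, 1)` returns `False` instead of raising a division-by-zero error.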

Skill 15: Subtyping and Inclusion Polymorphism

Subtyping and inclusion polymorphism form the backbone of modern object-oriented languages. One cannot understand object-oriented languages without understanding these two concepts. While the skill set for this topic is small, the skills themselves are large and subtle.

  1. You should be able to figure out if and when one type is a subtype of another type.
  2. You should be able to exploit subtyping to obtain inclusion polymorphism.
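
As a small sketch of item 2 in Python (class and function names are hypothetical), `total_area` is written once against `Rectangle` and works unchanged for any subtype of `Rectangle`.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width, self.height = width, height

    def area(self):
        return self.width * self.height

class Square(Rectangle):            # Square is declared as a subtype of Rectangle
    def __init__(self, side):
        super().__init__(side, side)

def total_area(rectangles):
    # Accepts Rectangle values and, by inclusion polymorphism,
    # values of every subtype of Rectangle.
    return sum(r.area() for r in rectangles)
```

Here `Square` is a subtype of `Rectangle` but not the other way around, so a `Square` may be used wherever a `Rectangle` is expected.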

Skill 16: Parameter Passing

Parameter passing is the preferred way for passing information from a caller to a callee and vice versa. Different parameter passing modes provide different capabilities and thus, many languages support more than one parameter passing mode. For example, C++ and Modula-3 support pass-by-value and pass-by-reference. C++ also (weakly) supports pass-by-name through its macro mechanism (#define ...). Here are the skills you need to know for this topic:

    Modes

  1. You should know when and how to use all the parameter passing modes (value, result, value-result, reference, name).
  2. You should be able to write programs that determine which parameter passing mode is used by a language.
  3. You should be able to determine how a program will behave given a parameter passing mode.

    Types and Binding

  4. You should be able to figure out legal argument types for a formal parameter for each parameter passing mode.
  5. You should be able to write programs to determine whether a language uses deep or shallow binding.
  6. You should be able to determine how a program will behave with deep binding and with shallow binding.
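
A probe program in the spirit of item 2 can be sketched as follows, using Python as the language under test (function names are hypothetical). One function rebinds its formal parameter and the other mutates the object it received; what the caller observes afterwards narrows down the passing mode.

```python
def rebind(param):
    param = ["replaced"]       # assigns to the formal parameter only

def mutate(param):
    param.append("added")      # changes the object the parameter refers to

first = ["original"]
rebind(first)
second = ["original"]
mutate(second)

# Under pass-by-reference the caller would see the rebinding.
caller_sees_rebinding = first == ["replaced"]
# Under pass-by-value with copying the caller would not see the mutation.
caller_sees_mutation = second == ["original", "added"]
```

Python shows the mutation but not the rebinding. This behavior is usually described as passing a reference by value (sometimes called call by sharing), which is not exactly any of the five modes listed above.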

Skill 17: Parametric Polymorphism or Generics

Inclusion polymorphism (Skill 15) allows one to reuse code for many different types. Inclusion polymorphism is the backbone behind object-oriented languages. However, it is not the only kind of polymorphism. There is at least one other kind of polymorphism which is very useful: parametric polymorphism. There are some things for which inclusion polymorphism is more suitable and others for which parametric polymorphism is more suitable. Thus, modern object-oriented languages support both kinds of polymorphism: inclusion polymorphism through subtyping of objects and parametric polymorphism through generic subprograms (or templates in C++ parlance). In this skill set we will understand how and when to use generics.

  1. You should be able to write a generic subprogram that can be used for many different types.
  2. You should be able to use a generic subprogram in your code.
  3. You should be able to pick the most suitable kind of polymorphism (parametric or inclusion) for a task.
  4. You should be able to combine inclusion polymorphism and generics to constrain the type arguments passed to generics.
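
Items 1 and 4 can be sketched with Python's type annotations (all names below are hypothetical). `last` is generic in an unconstrained type variable, while `cheapest` uses a type variable bounded by `Priced`, so only subtypes of `Priced` are acceptable type arguments. Python does not enforce these annotations at run time; a static checker such as mypy is assumed for that.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")

def last(items: Sequence[T]) -> T:
    """Generic in T: works for a sequence of any element type."""
    return items[-1]

class Priced:
    def __init__(self, price: int) -> None:
        self._price = price

    def price(self) -> int:
        return self._price

class Book(Priced):
    pass

# The bound combines generics with subtyping: P must be a subtype of Priced,
# which is what makes the call to price() below well typed.
P = TypeVar("P", bound=Priced)

def cheapest(items: Sequence[P]) -> P:
    return min(items, key=lambda item: item.price())
```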

Skill 18: Abstract Data Types

Programmers use abstractions to handle the complexity in their programs. With large programs (e.g., it is not uncommon to have programs that are millions of lines of code), it is essential to break down the program into a number of abstract units, each of which can be understood largely in isolation. Thus, when you have a bug in one unit, you only need to understand that unit to fix the bug. Most modern languages provide two kinds of abstractions: process abstractions (i.e., subprograms) and data abstractions. In this skill set we will understand the data abstraction support in programming languages.

  1. You should be able to use the data abstraction mechanisms in languages such as SML, Java, C++, etc.
  2. You should be able to write generic abstract data types.
  3. You should be able to explain the difference in data abstraction with abstract data types (signatures with abstract types) as in SML and access control (public vs. private) as in Java.

Skill 19: Object-Oriented Concepts

In this skill set you will learn the foundations behind modern object-oriented languages.

  1. You should know how to use inheritance to get code reuse and subtyping.
  2. You should be able to use dynamic dispatching.
  3. You should be able to walk through an object-oriented program and explain which method implementations each method invocation will use.
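
As a small walk-through exercise for items 2 and 3, consider this Python sketch (class names are hypothetical). The method `monthly_charge` is inherited, but the call to `self.fee()` inside it is dispatched on the run-time class of the object.

```python
class Account:
    def fee(self):
        return 5

    def monthly_charge(self):
        # self.fee() is chosen by the run-time class of self,
        # not by the class in which monthly_charge is defined.
        return self.fee() + 1

class PremiumAccount(Account):
    def fee(self):              # overrides Account.fee
        return 0

charges = [account.monthly_charge() for account in (Account(), PremiumAccount())]
```

For the `Account` object the charge is 6; for the `PremiumAccount` object the inherited `monthly_charge` runs but invokes `PremiumAccount.fee`, giving 1.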