| Publications of Jeremy G. Siek |
| Books and proceedings |
| Thesis |
| Annotation: | The past decade of software library construction has demonstrated that the discipline of generic programming is an effective approach to the design and implementation of large-scale software libraries. At the heart of generic programming is a semi-formal interface specification language for generic components. Many programming languages have features for describing interfaces, but none of them match the generic programming specification language, and none are as suitable for specifying generic components. This lack of language support impedes the current practice of generic programming. In this dissertation I present and evaluate the design of a new programming language, named G (for generic), that integrates the generic programming specification language with the type system and features of a full programming language. The design of G is based on my experiences, and those of colleagues, in the construction of generic libraries over the past decade. The design space for programming languages is large, thus this experience is vital in guiding choices among the many tradeoffs. The design of G emphasizes modularity because generic programming is inherently about composing separately developed components. In this dissertation I demonstrate that the design is implementable by constructing a compiler for G (translating to C++) and show the suitability of G for generic programming with prototypes of the Standard Template Library and the Boost Graph Library in G. I formalize the essential features of G in a small language and prove type soundness. |
| Articles in journals or book chapters |
| Conference articles |
| Annotation: | Generic programming has recently emerged as a paradigm for developing highly-reusable software libraries, most notably in C++. We have designed and implemented a constrained generics extension for C++ to support modular type-checking of generic algorithms and to address other issues associated with unconstrained generics. To be as broadly applicable as possible, generic algorithms are defined with minimal requirements on their inputs. At the same time, to not lose potential efficiency, generic algorithms may have multiple implementations that exploit features of specific classes of inputs. This process of algorithm specialization relies on non-local type information and conflicts directly with the local nature of modular type-checking. In this paper, we review the design and implementation of our extensions for generic programming in C++, describe the issues of algorithm specialization and modular type-checking in detail, and discuss the important design tradeoffs in trying to accomplish both. We present the particular design that we chose for our implementation, with the goal of hitting the sweet spot in this interesting design space. |
| Annotation: | The past decade of experience has demonstrated that the generic programming methodology is highly effective for the design, implementation, and use of large-scale software libraries. The fundamental principle of generic programming is the realization of interfaces for entire sets of components, based on their essential syntactic and semantic requirements, rather than for any particular components. Many programming languages have features for describing interfaces between software components, but none completely support the approach used in generic programming. We have recently developed G, a language designed to provide first-class language support for generic programming and large-scale libraries. In this paper, we present an overview of G and analyze the interdependence between language features and library design in light of a complete implementation of the Standard Template Library using G. In addition, we discuss important issues related to modularity and encapsulation in large-scale libraries and how language support for validation of components in isolation can prevent many common problems in component integration. |
| Annotation: | "Concepts" are an essential language feature needed to support generic programming in the large. Concepts allow for succinct expression of bounds on type parameters of generic algorithms, enable systematic organization of problem domain abstractions, and make generic algorithms easier to use. In this paper we present the design of a type system and semantics for concepts that is suitable for non-type-inferencing languages. Our design shares much in common with the type classes of Haskell, though our primary influence is from best practices in the C++ community, where concepts are used to document type requirements for templates in generic libraries. Concepts include a novel combination of associated types and same-type constraints that do not appear in type classes, but that are similar to nested types and type sharing in ML. |
| Annotation: | This paper presents the design of G, a new language specifically created for generic programming. We review and identify important language features of C++ and Haskell in light of the past decade of generic library research and development. Based on this analysis we propose and evaluate relevant language design decisions for G. Generic programming is concerned with the construction of libraries of reusable software components and is inherently about programming "in the large." Thus, the design of G places its greatest emphasis on modularity and safety, while also providing run-time efficiency and programmer convenience. This paper focuses on name scoping and type checking for generic functions, support for dispatching to algorithm specializations, support for type associations among abstractions, and separate compilation. The resulting design for G includes three novel aspects: scoped models declarations, nested types in concepts, and optional type constraints on generic functions. |
| Internal reports |
| Annotation: | "Concepts" are an essential language feature needed to support generic programming in the large. Concepts allow for succinct expression of bounds on type parameters of generic algorithms, enable systematic organization of problem domain abstractions, and make generic algorithms easier to use. In this paper we formalize the design of a type system and semantics for concepts that is suitable for non-type-inferencing languages. Our design shares much in common with the type classes of Haskell, though our primary influence is from best practices in the C++ community, where concepts are used to document type requirements for templates in generic libraries. The technical development in this paper defines an extension to System F and a type-directed translation from the extension back to System F. The translation is proved sound; the proof is written in the human readable but machine checkable Isar language and has been automatically verified by the Isabelle proof assistant. This document was generated directly from the Isar theory files using Isabelle's support for literate proofs. |
| Annotation: | Krivine presents the K machine, which produces weak head normal form results. Sestoft introduces several call-by-need variants of the K machine that implement result sharing via pushing update markers on the stack in a way similar to the TIM and the STG machine. When a sequence of consecutive markers appears on the stack, all but the first cause redundant updates. Improvements related to these sequences have dealt with either the consumption of the markers or the removal of the markers once they appear. Here we present an improvement that eliminates the production of marker sequences of length greater than one. This improvement results in the C machine, a more space and time efficient variant of K. We then apply the classic optimization of short-circuiting operand variable dereferences to create the call-by-need S machine. Finally, we combine the two improvements in the CS machine. On our benchmarks this machine uses half the stack space, performs one quarter as many updates, and executes between 27% faster and 17% slower than our L variant of Sestoft's lazy Krivine machine. More interesting is that on one benchmark L, S, and C consume unbounded space, but CS consumes constant space. Our comparisons to Sestoft's Mark 2 machine are not exact, however, since we restrict ourselves to unpreprocessed closed lambda terms. Our variant of his machine does no environment trimming or conversion to de Bruijn-style variable access, and does not provide basic constants, data type constructors, or the recursive let. (The Y combinator is used instead.) |
| Annotation: | We consider the problem of how best to combine optimizations in imperative compilers. It is known that combined optimizations (or "super-analyses") can be strictly better than iterating separate improvement passes. We propose an explanation of why this is so by drawing connections between program analysis and the algebraic and coalgebraic views of programs and processes. We argue that "optimistic" analyses decide coinductively-defined relations and are based on bisimilarity. We relate combining program improvements to the problem of deciding combinations of theories. Iterating program improvements is similar to the Nelson-Oppen method of deciding combined theories: in Nelson-Oppen decision procedures communicate equalities, and iterated improvement passes implicitly communicate equalities via term replacements. To decide combined theories of bisimilarity, some "co-Nelson-Oppen" procedure is needed that propagates inequalities amongst decision procedures. Hence, iterating optimistic analyses fails to be effective because inequalities cannot be communicated by semantics-preserving rewrites. Superanalysis is conjectured to overcome this failing by behaving like a "co-Nelson-Oppen" decision procedure. |
| Miscellaneous |
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
This document was translated from BibTeX by bibtex2html