Chris Mueller


Open Systems Lab / Computer Science Department
Indiana University
215 Lindley Hall
Bloomington, IN 47405
chemuell@cs.indiana.edu

Synthetic Programming


Problem

High level scripting languages are so far removed from the hardware that they are unable to take direct advantage of processor features. Many scientific and engineering tasks have simple, direct mappings to hardware resources that can result in large performance increases. But, utilizing these requires an intermediate language increasing the size (and thus complexity) of the tool-chain required to perform a task.

Study

In this thesis, we will study the use of runtime-generated instruction sequences for streaming data problems. In our pilot study for this project, we developed a Python-based in-line instruction stream generator for the PowerPC and AltiVec instruction set architectures (ISAs) that conforms to the OS X application binary interface (ABI). Using this system, we implemented a set of abstractions for representing and manipulating bit vectors. Using these, we implemented a sample application that compares a collection of chemical compound bit vectors using an AltiVec-enhanced Jaccard/Tanimito kernel. The runtime generated code was able to attain a 50% performance increase over (non-AltiVec, but -O3 compiled) C++ code.
To continue this project, we will explore different abstractions for representing and combining data, resource, and algorithmic components to form fast and efficient instruction streams based on available runtime information. Using these abstractions, developers can gain direct control over computation resources from high level scripting languages.
The following is a list of execution-level abstractions to explore:
  • Multi-level Parallelism (Distributed -> SMP -> SIMD -> Sequential)
  • Pipelined execution (e.g., fusion/fission, out-of-core)
  • Traditional compiler optimizations (loop fusion/fission/unrolling, etc)
  • Event/Feedback primitives
  • Platform independent (high level building blocks)
  • Platform dependent (low level primitives)
    • PowerPC/AltiVec
    • IA-32/SSE
    • PowerPC/Cell
    • GPU

Research Plan

  • Implement IA-32 engine
  • Implement ABI callback system
  • Identify common data types and algorithms for compounds and genes
  • Identify common optimizations for algorithms
  • Decompose into synthetic components
  • Compare compiled and interpreted implementations against synthetic implementations
  • Compare platform implementations (PPC vs. IA-32)
  • GUI abstractions for building up algorithms
  • Explore applicability to non-PPC/Intel processors (GPU, Cell, etc)

Chem/Bio application

Raw data processing and graph construction/integration:
  • Data stats
  • Data clustering
  • Graph/database construction
  • Queries
  • Image processing (micro-array)
  • Signal processing (HPLC/Mass Spec)
Optimized visualization pipelines (ala CoreImage).

Related Work

  • Meta/Generative programming systems (DyC, et. al.)
  • Just-in-time compilers
  • Python: PyASM, cytpes
  • Compiler infrastructure projects

Risks

Potentially large body of related work that may be missed.