Charles Eric LaForest, PhD Student

Mad Scientist In Training

Why

I want to understand computers from the silicon up to the user interface. I want to make better and beautiful machines, like starships and samurai, not Buicks and bureaucracies. I want to make it possible again for two hairy dropouts named "Steve" to homebrew something insanely great in their garage.

More pragmatically put: I want to be a toolsmith (local copy) for scientists in non-computing disciplines.

Whence

What

  • Computer Architecture
  • Asynchronous Circuits
  • Higher-Order Functions (Runtime Code Generation)
  • Shared-Nothing Message-Passing Concurrency (à la Erlang)

Where

  • laforest at eecg.utoronto.ca

Writing

  • Efficient Multi-Ported Memories for FPGAs
    Charles Eric LaForest and J. Gregory Steffan
    ACM International Symposium on Field-Programmable Gate Arrays (FPGA), February 2010, Monterey, CA.
    The outcome of my M.A.Sc studies. The paper was very well-received and ranked third by the reviewers in a double-blind review.

  • Some Thoughts On Machine Code Organization
    Just a digression on the relationship between iterative, recursive, and threaded machine code.
  • Correlation of Ground-Level Computer Installation Error Rates with Solar Activity
    The data suggests that the amount of solar activity does indeed increase the probability of errors in computer systems.
    Boeing Radiation Effects Laboratory has many papers on this topic.
    There are more links in Berke Durak's "On the need to use error-correcting memory" and its follow-ups.
  • HashLife on GPU
    A mostly-failed attempt at implementing Gosper's HashLife algorithm on a GPU using CUDA.
    Hopefully this will help someone else get it right.
  • Survey of Loop Transformation Techniques (Slides: PDF PPT)
    A report on all the loop transformation techniques I had to learn in Parallelizing Compilers.
    If your algorithm is memory-bound, the speedup can be remarkable: "How We Made Our Face Recognizer 25x Faster"
    In C, in some cases, you'll likely have to use the obscure 'restrict' keyword to give the compiler a chance, else the potential for pointer aliasing makes dependency analysis very hard. Not having array accesses being pointer-based is the main strength of Fortran for such things.
  • Message-Passing Concurrency on Shared-Memory Multiprocessors
    Inspired by Erlang, I tested various locking schemes for a message mailbox.
    Even with thousands of threads contending for a single lock, it turns out that a single, simple Pthread mutex works best due to playing very well with the Linux process scheduler.
  • Fletcher: Final Project Report
    This was a continuation of my undergraduate work in stack machines: how to extend fast subroutine calls into fast context switching.
    The end result was a basic kernel and virtual memory model which granted clean and fast machine virtualization.
    (Don't get too excited. It assumes a very different software universe, where code is routinely dynamically generated.)
    The Project Design Document and my undergrad thesis provide the background details, but are not necessary prior reading.
  • Largest Lyapunov Exponent
    In short: use Rosenstein's algorithm over Wolf's. It's much faster and more accurate. (MATLAB code)
  • Second-Order Wiener-Bose Model
    Finding the number of Laguerre functions is easy enough, but scaling them to the available number of data points is tricky.
    I initially tried to do it all in MATLAB, but broke down and used LYSIS instead for the modelling.
    The MATLAB code contains a fast memoized recursive Nth-order Wiener-Bose filter implementation, if you need such a thing.
  • Canadian Space Gazette (November 2005)
    I wrote "Terrapin Station: An Alternative Approach To Space Exploration" to argue that developing self-sustaining stations should be done sooner than later, and would have great technology transfer potential on Earth.
    Since then, I've been told that the Van Allen belts pose a bit of a problem...
  • Batch job queue client and server.
    I wrote this to enable one-at-a-time batch processing of benchmarks for a class, so the students wouldn't pollute each-other's measurements. It did not get used in the end, and I am skeptical about its resilience.
    At the very least, look at it as an example of Pyro (Python Remote Objects), authentication (and how hard it is to be certain of identity), and faking private class methods in Python.

Ways

  • ECE1747 - Parallel Programming
  • ECE1755 - Parallel Computer Architecture
  • CSC2227 - Topics in the Design and Implementation of Operating Systems
  • ECE1754 - Parallelizing Compilers
  • ECE540   - Optimizing Compilers
  • ECE1764 - Algorithms and Data Structures
  • ECE1636 - Supervisory Control of Discrete-Event Systems
  • CSC2232 - Topics in System Performance and Reliability
  • JEB1444  - Neural Engineering
  • ECE1724 - Sp. Topics in Software Eng: Programming Massively Parallel Graphics Processors

Work

  • Curriculum Vitae
  • Teaching Assistant:
    • ECE1755 - Parallel Computer Architecture
    • ECE540 - Optimizing Compilers
    • ECE385 - Microprocessors Systems
    • APS111 - Engineering Strategies and Practice I
    • APS112 - Engineering Strategies and Practice II

Winnings

  • 2010 Right Track CAD Graduate Scholarship