research projects

Software Tools for Collaborative Science - supervised by Prof. Steve Easterbrook, Department of Computer Science, University of Toronto

This project grew out of an interest in developing software tools as a means of improving the performance of software-related tasks in an open science context. It began as an examination of the methods used by scientists (climate scientists in particular) for recording and sharing their lab notes electronically - in other words, the tools available to them for practicing open and collaborative science online. A case study of the Hadley Centre by Prof. Easterbrook resulted in several research papers on problems at the intersection of software development and scientific research: "Engineering the Software for Understanding Climate Change" and "Configuration Management for Large-Scale Scientific Computing at the UK Met Office".

Among the most common tools for sharing scientific work in open science communities are blogs and wiki pages, with wiki pages often used to describe experiments. Collections of such pages can grow very large, which creates the need for a software tool that eases the process of finding related pages: rather than requiring explicit search terms, it finds pages that are similar to a particular experiment you are interested in. Such a tool can be built on the premise that similar experiments contain similar constituents, and that this is reflected in the way the experiment is described. This information can be extracted from the structural layout of the wiki page. A tool which can recognize similarities and differences in the structural and textual content of pages has many interesting applications, ranging from simple and quick "search engine" functionality for wikis, to more complex analysis of structural classes of pages and standardization of layout for certain classes of pages, as well as detection of commonly occurring structural elements (template suggestions).

MyeLink is my implementation of the software tool described above, which I developed this summer. It is a MediaWiki extension written in PHP and Python, and for now it is a simple tool which suggests a list of related articles based on structural and textual information extracted from the content of a page. It includes several user interface features, such as inline browsing of the results' content and generation of word clouds to visualize the word tokens shared between the page and a suggested result.
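To make the idea concrete, here is a minimal sketch in Python of how such a suggestion step could work. It is not MyeLink's actual code: the function names (tokens, headings, similarity, related, shared_tokens), the use of Jaccard overlap as the similarity measure, and the equal weighting of structure and text are all assumptions made for illustration. Structure is approximated by the set of section headings in the wikitext, and text by the set of word tokens.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercased alphanumeric word tokens of a page's text."""
    return re.findall(r"[a-z0-9]+", text.lower())


def headings(wikitext):
    """Section headings of a page, e.g. '== Materials ==' -> 'materials'."""
    found = re.findall(r"^=+[ \t]*(.+?)[ \t]*=+[ \t]*$", wikitext, flags=re.M)
    return [h.lower() for h in found]


def jaccard(a, b):
    """Jaccard overlap of two collections, treated as sets (0.0 if both empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(page_a, page_b, structure_weight=0.5):
    """Weighted mix of structural (heading) and textual (token) overlap.

    The weighting is a hypothetical choice, not taken from MyeLink.
    """
    structural = jaccard(headings(page_a), headings(page_b))
    textual = jaccard(tokens(page_a), tokens(page_b))
    return structure_weight * structural + (1 - structure_weight) * textual


def related(target, pages, top_n=3):
    """Titles of the top_n pages most similar to target.

    pages maps a page title to its wikitext.
    """
    scored = [(similarity(target, text), title) for title, text in pages.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_n]]


def shared_tokens(page_a, page_b):
    """Tokens common to both pages with their minimum counts (word cloud input)."""
    return Counter(tokens(page_a)) & Counter(tokens(page_b))
```

Calling related(target_wikitext, pages) with a dictionary of page titles and wikitext returns the best-matching titles, and shared_tokens gives the token counts that a word cloud of the intersection could be drawn from.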

You can read more about the process of creating MyeLink on my research blog. A demo server with MyeLink installed is set up at http://www.cs.toronto.edu:40154/mediawiki. You can view a screencast here, and here's the poster which was shown at the UofT Undergrad Summer Research Poster Session. A SourceForge page with the package will be available soon.

coursework - year I

ESC102 - Praxis II
RFP: Service Delays caused by Inefficiency of Passenger Dynamics into and out of Subway Cars
TTC design project. Part I: identified a key problem area associated with the Toronto Transit Commission subway system. Group RFP chosen among ~60 others for the solution stage of the course. Part II: the solution my group proposed involved (a) a light system and countdown display to regulate passenger flow; (b) a sound system to notify passengers of door closure or door blockage; (c) a floor sensor for detecting direction of passenger flow and controlling the display signals appropriately; (d) a door sensor (EM-relay) for detecting obstructions to door closure and triggering the sound system alarm in such an event. Read more about the showcase here.

BME105 - Systems Biology
Designed a procedure for a PCR and restriction digest-based medical diagnostic to help inform therapeutic strategies. Protocol provides means of determining genetic differences between two or more populations. Performed lab with four DNA samples, and analyzed genotype using agarose gel electrophoresis results.

CSC190 - Data Structures and Algorithm Analysis
Assignments and labs (all in C++) on:
  • Implementing the internal decision-making software for the computer game Tetris
  • Implementing the internal software of a simple movie recommendation system using sparse matrices and a sorting algorithm
  • Parse trees for evaluating arithmetic expressions, and make-like parse trees (a simplified version of the parser used by the Unix command make)
  • Implementation of the Huffman tree image compression algorithm using binary heaps
  • Graphs - Travelling Salesman Problem, greedy algorithm implementation
  • Sorting - mergesort (recursive and non-recursive stack implementations), quicksort, shellsort
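As a small illustration of one item from this list, here is a minimal sketch of Huffman coding driven by a binary heap. It is written in Python rather than the C++ used in the course, it is not the original assignment code, and the function name huffman_codes is made up for this example. Python's heapq module provides the binary min-heap; the input can be any sequence of hashable symbols, such as pixel intensity values.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return a {symbol: bitstring} prefix code for the symbols in data."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}

    # Heap entries are (weight, tie_breaker, tree). The unique tie_breaker
    # keeps tuple comparison from ever reaching the tree itself.
    heap = [(w, i, ("leaf", sym)) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    # Repeatedly merge the two lightest trees until one tree remains.
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, ("node", t1, t2)))
        next_id += 1

    # Walk the tree iteratively: left edges add "0", right edges add "1".
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        tree, prefix = stack.pop()
        if tree[0] == "leaf":
            codes[tree[1]] = prefix
        else:
            stack.append((tree[1], prefix + "0"))
            stack.append((tree[2], prefix + "1"))
    return codes
```

Frequent symbols end up near the root and so receive shorter codes, which is where the compression comes from.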

CIV102 - Structures and Materials: An Introduction to Engineering Design
Analysis of bridges (suspension, truss, box-girder, arch bridges), pressure vessels, towers. Bridge design and construction project: designed and built a segment of a model railway bridge, by analyzing axial and flexural stresses (compressive and tensile), shear stress, beam deflection, plate buckling. The model (made of matboard) was able to safely withstand the passage of a 400 N steel train and subsequent loading of two 530 N weights.

reading

Miscellaneous books I'm currently reading.

Python Scripting for Computational Science, by Hans Petter Langtangen
Proofs and Refutations: the Logic of Mathematical Discovery, by Imre Lakatos (this turned out to be a greatly fascinating book, thanks Greg!)
Field Notes from a Catastrophe, by Elizabeth Kolbert (a climate change book)
The Gold of Troy, by Robert Payne (the story of Schliemann's life -- which almost sounds like fiction).