Projects I Have Authored


An Automatic Gap-Fill Question Generation system that creates multiple-choice, fill-in-the-blank questions from text corpora such as textbooks, factoid archives, news articles, reports, lecture notes, and legal proceedings. The minimum viable input is a small to moderately sized collection of coherent, well-formed English.



An efficient Scala implementation of the Sequential Minimal Optimization algorithm for training Support Vector Machines.


Functional programming for machine learning

fp4ml: A library of machine learning algorithms implemented using principles of functional programming.



A type class for data of all sizes: write an algorithm once and run on local Scala collections or a Spark cluster.
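To illustrate the "write once, run on either backend" idea, here is a minimal sketch of such a type class. It is not the library's actual API: the names `DataSketch`, `Data`, `vectorData`, and `sumOfSquares` are hypothetical, and only a local `Vector` instance is shown so the example has no dependencies. The `ClassTag` bound is included on the assumption that a Spark-backed instance would need it, since `RDD.map` requires one.

```scala
import scala.reflect.ClassTag

object DataSketch {

  // Hypothetical type class: the operations an algorithm needs from a
  // collection-like container D, whether it is local or distributed.
  trait Data[D[_]] {
    def map[A, B: ClassTag](data: D[A])(f: A => B): D[B]
    def reduce[A](data: D[A])(op: (A, A) => A): A
  }

  // Instance for a local, in-memory Scala collection.
  // The ClassTag is unused here; it exists so a distributed instance
  // could satisfy the same signature.
  implicit val vectorData: Data[Vector] = new Data[Vector] {
    def map[A, B: ClassTag](data: Vector[A])(f: A => B): Vector[B] =
      data.map(f)
    def reduce[A](data: Vector[A])(op: (A, A) => A): A =
      data.reduce(op)
  }

  // An algorithm written once against the type class. It works with any
  // container D for which an implicit Data[D] instance is in scope.
  def sumOfSquares[D[_]](data: D[Double])(implicit ev: Data[D]): Double =
    ev.reduce(ev.map(data)(x => x * x))(_ + _)
}
```

With this sketch, `DataSketch.sumOfSquares(Vector(1.0, 2.0, 3.0))` evaluates to `14.0`. Supporting a cluster would, under these assumptions, only require a second implicit instance for the distributed container; `sumOfSquares` itself would not change.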



Supervised relation extraction from free-form text on Spark. Complete end-to-end implementation of my Master's thesis.



Critical cell finding algorithm using functional programming. Critical cell finding is a pre-processing step for table data extraction.



Small library of image processing and computer vision algorithms. Depends on the BoofCV Java library.



A functional abstraction for dependent, asynchronous computation.


Projects I Have Collaborated On


Scala code generation from Avro schemas.



Tutorial code from my Fall 2014 Datapalooza session. It covers beginner to advanced material on functional programming in Scala, classification and ranking with nearest neighbors, and clustering with k-means. It also includes some elements of "fp4ml".



A Scala project that extends PDFBox 2.x with a simplified document object model. I wrote the PDFBox 1.x based predecessor of this library as a Nitro internal project. I assisted, mentored, and code-reviewed Sagnik Choudhury on the PDFBox 2.x rewrite and open-source development.



ProPPR (pronounced "proper"): Graph-algorithm inferences over local groundings of first-order logic programs