An abbreviated version of this resume can be downloaded as a PDF.

About

Software engineer with 18 years of experience, focused on developing applications with machine learning functionality.

Specialties: Machine Learning, DevOps, REST, Data Modeling, Python, Java, Linux, AWS.

Experience

Senior Research Software Engineer - Visual Analytics at NCSA

October 2015 - Present: Chicago, Illinois

As a Senior Research Software Engineer, I focus on the backend, DevOps, and machine learning aspects of cloud-based applications designed for scientists by Visual Analytics at the National Center for Supercomputing Applications.

Highlighted projects:

  1. Lead Backend Developer for Nest, a platform for data science and web applications. This platform is based on common data types shared between frontend, backend, and data science jobs. Nest integrates mature but modern open-source components as Docker containers, including CI jobs and one-click deploy to AWS EC2. (2015-)

    Online Whitepaper

    Tech: AWS (EC2, Route53, S3, IAM), Docker, Postgres, Python Flask, SQLAlchemy, Jenkins.

  2. Product Manager and Lead Developer of the Pixsure project, built on Nest. Pixsure is a human-centric web tool for annotating complex medical images that also simplifies data management of the resulting ML training data. (2020-)

    Tech: Nest (described above), Conda, Scikit-image.

  3. Cloud engineer for a proof-of-concept API for Mayo Clinic to store and access a particular subset of genomics data used by clinicians when determining cancer treatments. I evaluated technologies and built the MVP of the data management system and API on Google Cloud while the remaining team of doctors and DBAs created a sophisticated data model in Postgres. (2021)

    Tech: GCP (Compute, Cloud SQL for Postgres), Docker Compose, Traefik (a reverse proxy), PostgreSQL, PostgREST, OpenAPI, RapidDoc, Jupyter Notebooks.

  4. Lead Backend Developer on Phyloflow, a bioinformatics pipeline that computes phylogenetic trees from tumor mutation data, built in collaboration with a computer science professor and Mayo Clinic. This is an up-and-coming area of cancer research. My role was to pull together a collection of bleeding-edge command-line tools, mainly from university lab GitHub repos, package them consistently into Docker containers and WDL tasks, and run the resulting pipeline on both a local HPC system and GCP.

    Phyloflow on GitHub, Phyloflow on Dockstore

    Tech: AWS (ECR, Route53), GCP (CloudFiles, terra.bio), Docker, Workflow Description Language (WDL), MiniWDL, Conda.

  5. Lead Backend Developer for Omix, built on Nest. Omix is a web app that visualizes microbiome analysis results, developed in collaboration with Mayo Clinic’s Center for Individualized Medicine. (2015-2017)

    Project Page

    Tech: Nest (described above), Conda, NumPy/SciPy.

Software Engineer - Groupon

September 2013 - March 2015: Chicago, Illinois

Part of the Automated Merchandising team, responsible for bundling deals (Groupons) into widgets that can be displayed anywhere on the Groupon website or mobile apps. Personal areas of focus:

  • International rollout of core service to 20 countries in Europe and Latin America.
  • Integration test suite for the core system, written in Python, which proved mission-critical for the international rollout.
  • Build automation using Python, Capistrano, Maven, Jenkins, and cron.
  • Machine learning system for adjusting the relative rankings of widgets in different contexts.
  • General software engineering of a Java webapp on an eight-person agile software team.

Independent Contractor

October 2006 - September 2013: Chicago, Illinois

Conducted multiple machine learning and software development projects. Examples:

Nexlp (2013)

An e-Discovery startup in Chicago. Their core product uses a natural language processing toolkit from the University of Illinois and a graph database (Neo4j) to analyze sets of millions of emails at a time. The primary deliverable was a pattern detection module that combined frequent item set analysis and anomaly detection to generate patterns of the form "Bob emailed Sally late at night about Chicago 12 times during the week of Dec 4, 2006, but normally this occurred 0.03 times/week."

rVibe (2013)

A boutique maker of training software for the pharmaceutical industry. Handled DevOps for the company for nine months. Designed and implemented build tools and a performance benchmarking suite. Performed weekly deploys to production servers and adjusted the agile release schedule and methodology as needed.

Fuzzy economics project (2011/2012)

Privately funded by a (stealth) organization, this was a bleeding-edge project to build a system for creating expansive yet detailed ontologies of hypotheses and their supporting evidence. Built the full-stack prototype using Java EE with JSP and the JavaBayes toolkit for Bayesian networks.

Founder - DesignByRobots

October 2006 - September 2013: Chicago, Illinois

Developed an algorithmic trading application in OCaml that uses a genetic algorithm and statistical modeling techniques to find market price patterns on time scales of less than one hour. Trading strategy optimization was the first application of DesignByRobots' data model for machine learning and automated design technology.

https://designbyrobots.com

Research Engineer - National Center for Supercomputing Applications (NCSA)

Jan 2008 - May 2009: Champaign, Illinois

Acted as a core developer and release engineer (a role that would now be called DevOps) for the SEASR project. SEASR is a development platform and analytics toolkit for humanities research communities to develop, share, and deploy analytics-driven web applications, primarily involving digitized document collections. Primary responsibility was the design and execution of the release process, including integration and final QA.

Technical Sales - RiverGlass, Inc.

October 2004 - September 2006: Chicago, Illinois

Acted as technical liaison to sales and marketing. Brought technical expertise to solution consultations, sales calls, investor briefings, and other customer-facing situations. Also responsible for defining marketing messaging by authoring abstracts and white papers, particularly in new markets and early-stage product roll-outs.

Analyst/Developer - RiverGlass, Inc.

January 2004 - October 2004: Chicago, Illinois

First full-time employee of RiverGlass, Inc., a company formed to commercialize data mining technology developed at the National Center for Supercomputing Applications. Designed and implemented statistical and machine learning applications in Java. Primary project involved analysis of groundwater monitoring schemes at chemical and oil spill sites to identify redundancies. Designed and developed software that justified the removal of redundant wells, yielding yearly cost savings of up to 10% over the potentially 50+ years of monitoring required by the EPA at spill sites.

Graduate Research Assistant - National Center for Supercomputing Applications

May 2000 - December 2003: Champaign, Illinois

As part of the Master's in Computer Science program at the University of Illinois, performed research on the implementation of neural networks, genetic algorithms, and other automated learning techniques in Java. Thesis was on the subject of feature selection for machine learning in hyperspectral remote sensing (see Publications). This work involved parallel and distributed computing.

Publications

Blatti, C., Emad, A., Berry, M. J., Gatzke, L., Epstein, M., Lanier, D., Rizal, P., Ge, J., Liao, X., Sobh, O., Lambert, M., Post, C. S., Xiao, J., Groves, P., Epstein, A. T., Chen, X., Srinivasan, S., Lehnert, E., Kalari, K. R., Wang, L. & 8 others, Knowledge-guided analysis of "omics" data using the KnowEnG cloud platform, 2020, In: PLoS Biology. 18, 1, e3000583.

Park, J., Chaney, E., You, S., Abdelrahman, A. M., Leiting, J. L., Yonkus, J. A., Groves, P. D., Harrington, J. J., Spillman, D. R., Lynch, I. T., Marjanovic, M., Tu, H., Bushell, C. B., Nelson, H., Truty, M. J. & Boppart, S. A. Characterizing treatment response of pancreatic tumor patient-derived xenografts in mice by Simultaneous Label-Free Autofluorescence Multi-Harmonic (SLAM) microscopy, 2020, Clinical and Translational Biophotonics, Translational 2020. OSA - The Optical Society, (Optics InfoBase Conference Papers; vol. Part F178-Translational-2020).

Minsker, B. S., Groves, P., and Beckmann, D. Optimizing Long Term Monitoring at a BP Site Using Multi-Objective Optimization, American Society of Civil Engineers (ASCE) Environmental & Water Resources Institute (EWRI) World Water & Environmental Resources Congress 2005 & Related Symposia, Anchorage, AK, 2005.

Groves, P., Bajcsy, P. Rank Ordering with Accuracy Selection (ROWAS) for Hyperspectral Band Selection. M.S. Thesis. December 2003.

Bajcsy, P., Groves, P. Methodology For Hyperspectral Band Selection. Journal of Photogrammetric Engineering and Remote Sensing. Vol 70, No. 7. pp. 793-802. July 2004.

Groves, P., and Bajcsy, P. Methodology For Hyperspectral Band and Classification Model Selection. Proceedings of the IEEE Workshop on Advances in Techniques for Analysis of Remotely Sensed Data. October 27-28, 2003.

Education

Master of Science in Computer Science

University of Illinois Urbana-Champaign
Graduation Date: December 2003

Bachelor of Science in Agricultural Engineering, Minor in Computer Science

University of Illinois Urbana-Champaign
Graduation Date: August 2001