Sr Platform Engineer - ClassPass (2021 - Present)
One of a small team working to maintain the DevOps tooling and base platform/infrastructure for this startup, later acquired by MindBody. Highlights include:
- Maintained and added features to a custom, from-scratch deployment app. The app is written in React and Python and interfaces with AWS ECS to deploy any commit from GitHub; it uses an Airflow job to refresh the list of services. Features include canary deployment, linked service/worker deployment, and streaming logs.
- Wrote a custom chatbot that interfaces with AWS to allow services to scale on demand, and with the above tool to deliver the latest deployed version right into Slack.
- Wrote a custom problem for interviewing platform developers that goes beyond validating their coding ability. This allows more traditional infrastructure engineers to succeed in our competitive interview process.
- Overhauled our analytics database (postgres-on-EC2) to include backups, streaming to more replica nodes, and performance improvements.
- Ran a biweekly training session for the platform team.
- Provided leadership while our team was lacking a team lead, despite being the newest member of said team. Gracefully stepped back once the role was filled to allow better cohesion with the new team lead.
- Heavily involved with DEIB efforts, including being on the DEIB Council and leading an ERG.
SRE II - AXS (2018 - 2021)
Systems engineer and later SRE for this fast-paced ticketing company. Highlights include:
- Maintained both Windows and Linux systems including patching, general maintenance, and on-call duties. Built new environments by hand where Terraform was not yet set up, including working with development and QA to debug the complex systems. Deployed code in overnight windows on behalf of development teams and managed configuration on legacy servers. The team covered multi-region infrastructure, handling issues in the US and Japan. The SRE on-call is typically the first team called when the ICM team gets word of a problem, as our problem-solving skills are legendary within the company.
- Introduced best practices for building environments with Terraform, including Terraform Enterprise (cloud-hosted). Designed a more efficient flow for creating multiple environments from the same template and oversaw an overhaul of the platform. This made infrastructure builds much faster and more repeatable, and documented the infrastructure we did have in reusable modules to ensure standardization.
- Oversaw the transition from TeamCity to Bamboo for building, connecting projects to Octopus for deployment. Maintained the Bamboo system, including remote agents dynamically spun up in AWS. The transition to Bamboo allowed us to move into a build-infrastructure-as-code model, where the pipelines could again be standardized and documented.
- Helped the company transition into a DevOps model from an Agile team-based model. This involved migrating from a classic server-based architecture to Docker containers on AWS ECS, moving from Octopus to Spinnaker, building new infrastructure from the ground up using Terraform Enterprise, and transitioning config into AWS Parameter Store. This move will produce a broader understanding of the environment across developers, allowing them to be more efficient in resolving and debugging problems as they occur without having to reach out to the SRE on-call.
- Scripted common server commands to work with Hubot so they can be executed by SREs or the ICM team via Slack. This includes targeted machine restarts, IIS resets, and stopping and starting Windows services. This reduces the amount of manual maintenance and the response time when problems occur, as these tasks previously required signing on to the VPN and connecting to the box itself.
Co-founder - Sock Drawer (2015 - Present)
Jack of all trades in this open-source organization, which is dormant at present due to the loss of a couple of developers. Hats include:
- Subject matter expert in all things testing. Along with my co-founder Accalia, the subject matter expert in all things development, we created many pieces of software that attained 100% unit testing out of the box. I have run usability studies, integration tests, security scans, and suggested multiple architecture revisions based on testability and maintainability.
- Build Engineer. I introduced many build and deployment tools to ensure that we are constantly getting feedback from production to improve our craft.
- Project Manager: On multiple projects, I have acted as a project manager as well as a backend developer. I have introduced kanban boards, collected status from the distributed team, and charted the milestones for this Agile project.
- Backend developer: I have worked in Node.js using TDD principles to engineer everything from an integration with Slack to the REST API for our new forum project. In particular, I am proud of being the tech lead on the SockMafia module, which is the single most complex Sockbot plugin to date.
Software Engineer (Platform Operations) - Dealer Tire (2017 - 2018)
Moved to Platform Operations team to continue work on build engineering, systems administration, and improved communication between developers and operations (DevOps). Highlights include:
- Continued development and improvement of the build/CI system, including administration of the distributed system, consulting on and at times developing new build plans, and creation of plugins for the Bamboo system.
- With the team, converted the entire three-language ecosystem to run in containers, resulting in reduced cycle times, improved developer involvement in the environment, and faster, less risky deployments. This includes upgrading some of the outdated CF9 code to Lucee, running performance tests, configuring Scratch containers for our built Go binaries, and converting our Node processes from nodemon to a container system. Platform Ops maintains base container images for all languages in our company plus our Nginx base container, and I personally created several of them during the course of the project.
- Pioneered new ways of running automated tests in containers to improve the reliability of our code promotions. This includes running unit and integration tests in the container at build time, as well as running functional tests in our container environment with Nightwatch.js after building but before deploying to our test environment for manual visual verification.
- Worked with Ansible to provision and maintain our legacy servers and our new Docker hosts. We templated out configurations that vary per environment for the containers to consume, and managed certificates and other secrets through this system.
- Used Vagrant to control a similar system for developers to run their own Docker hosts on their laptops, allowing them to develop in a system like the one they will be deploying into.
- Deployed, configured, and maintained multiple monitoring systems, including AppDynamics, New Relic, Site 24x7, Datadog, and FusionReactor. Coordinated various metrics from monitoring systems into alerts that are actionable and include links to SOPs to ensure that our responses to problems are rapid and standardized.
- As part of an on-call rotation, helped ensure that our Operations center has the tools they need to troubleshoot common problems. Platform Operations is responsible for second-tier escalations as well, solving difficult performance issues and gathering data for the developers to resolve more difficult code-related issues.
- Developed custom tooling for Operations to deploy new applications, stop and start them, and obtain updates on the status of applications and of Docker hosts. These tools interface with multiple other tools to ensure that they are coordinated and that all parties are notified when events happen.
- Administered developer tools such as Jira, Bamboo, Bitbucket, Rancher, Graylog, and our custom tooling for building and deploying.
QA Analyst - Dealer Tire (2014 - 2017)
Responsible for developing and implementing QA systems and best practices across our IT organization. Highlights include:
- Development of the company's first CI system, which we used to massively simplify the build and release process. This includes writing custom code to integrate test runners with legacy systems that were not architected for testing, as well as release scripts to deploy onto our various testing and production environments.
- Administration of our Jira system, which we use to track work products and bugs found during pre-production testing
- Weekly "teatime" sessions in which various topics in the quality domain are discussed on a technical level with the development and middleware teams
- Creating functional automation suites in Java for multiple products, integrated with the Sauce Labs service for cross-browser testing
- Advising on test case writing and manual testing best practices, including bringing code coverage metrics to version 6 of the flagship B2B application
- Creating performance tests that can be used during the build process to ensure that performance does not degrade over time
Web Developer (Contractor) - Dealer Tire (2013 - 2014)
Worked on various web development projects, including a revised CMS application and several backend service upgrades.
SQA Contractor - Diebold (2011 - 2013)
Worked on a team that automated and tested ATM software and utility software produced by Diebold using the IBM Rational framework.
- Wrote test framework methods using IBM's Rational Functional Tester library, in Java.
- Worked closely with development to ensure that quality was built in from day one.
- Designed, modeled, and scripted test cases using Critical Logic's DTT and TMX products. These products allow the software to be modeled conceptually, and test cases to be automatically generated from the program model.
- Trained and mentored interns.
Freelance web development (2005 - 2007)
Worked part-time with local small businesses to create web sites and custom applications that would showcase the clients' products, including a gallery site for a photographer with a custom slideshow.