Daniel Siegle — AI/ML Engineer

About

I'm Daniel Siegle, an AI/ML Engineer based in North Carolina. I contract through Possible Futures, LLC, where I focus on LLM training and fine-tuning as well as traditional ML projects across the biotech and life-sciences space.

I earned my M.S. in Pharmaceutical Sciences from North Carolina Central University. My thesis — Cytochrome P450 Inhibitor Classification with Statistical Learning — benchmarked Bayesian binary QSAR against scikit-learn classifiers on HTS luminescence assay data for five CYP isozymes (1A2, 2C9, 2C19, 2D6, 3A4). Before that I completed a B.S. in Biology at UNC-Chapel Hill.

On the industry side, I spent four years at Q² Solutions — first as a Scientific Programmer, then as a Bioinformatics Software Engineer at Q² Genomics, where I built sequencing analysis, QC, and data transfer pipelines for a satellite NGS lab in Beijing. More recently I taught AI for Health and Life Sciences at the North Carolina School of Science and Mathematics (NCSSM). I started out in drug manufacturing at Biogen Idec as a Manufacturing Associate / Associate Scientist.

Outside of work, I run Deep Learning RTP, a 1,700+ member community meetup in the Research Triangle focused on deep learning, with over 400 past events and counting.

Projects

LLM Training Pipeline

Production infrastructure for large language model fine-tuning, with 50K+ training examples and 500+ tokens/sec throughput. The end-to-end pipeline covers data preparation, training orchestration, and evaluation.

gut-typist

A Lua-based typing tutor built for the terminal. Lightweight, distraction-free typing practice with progress tracking.

View on GitHub →

CYP450 Inhibitor Classification

Master's thesis project comparing Bayesian binary QSAR models with scikit-learn classifiers for predicting cytochrome P450 inhibition. Evaluated performance on HTS luminescence assay data across five CYP isozymes (1A2, 2C9, 2C19, 2D6, 3A4), with applications in early-stage drug metabolism screening.

LLM Tuning Demonstration

End-to-end walkthrough of LLM fine-tuning techniques — from data formatting through training to evaluation. Designed as a practical reference for teams adopting LLM customization.

View on GitHub →

NGS Bioinformatics (Q² Genomics)

Developed and maintained genomic analysis pipelines for next-generation sequencing data at Q² Solutions. Supported a satellite lab in Beijing with sequencing QC, analysis workflows, and cross-site data transfer infrastructure. Received the 2019 CEO Team Award as part of the Q² Genomics Global Expansion Team for Beijing lab support.

Neural Network from Scratch

A hands-on implementation of a neural network using only NumPy: no frameworks, just math. The notebook walks through backpropagation, gradient descent, and activation functions step by step.

View the full notebook →

Community

Deep Learning RTP

I founded Deep Learning RTP, a community meetup in the Research Triangle with 1,700+ members and 400+ past events. We meet weekly on Wednesdays from noon to 2 pm at The Frontier in RTP, and monthly on the fourth Tuesday from 6 to 8 pm in downtown Durham.

Workshops & Study Groups

The group runs hands-on workshop series and study groups built around resources like fast.ai, Andrew Ng's courses, and Andrej Karpathy's "Zero to Hero" neural network curriculum. Recent highlight: a multi-session Neural Network Coding Workshop walking through Karpathy's series from scratch.

Co-organizers

Deep Learning RTP is run with fellow organizers Matthew Kenney, Kelly Masters-Melton, Jeff Lee, and William Hill.

Get in Touch

Interested in collaborating or attending a meetup? Reach out.

Writing

Building a Neural Network in NumPy

A step-by-step tutorial walking through the construction of a neural network from scratch — covering forward propagation, loss functions, backpropagation, and training loops, all implemented in pure NumPy.

Read the tutorial →

CYP450 Prediction & Cheminformatics in Pharma

How statistical learning and molecular fingerprints can predict cytochrome P450 inhibition — bridging HTS assay data with QSAR modeling for early ADMET screening in drug development. Coming soon.

Karpathy "Zero to Hero" Workshop Recap

Notes from the Neural Network Coding Workshop series at Deep Learning RTP, working through Andrej Karpathy's "Zero to Hero" curriculum — building language models from scratch, one layer at a time. Coming soon.