Bioinformatics & AI

Computational Biology

Bridging wet-lab science and computational analysis through programming, machine learning, and advanced bioinformatics.

I combine programming expertise with extensive wet-lab experience in genomics and proteomics, which lets me connect laboratory science with computational analysis. Over three years as a Genomics Postdoctoral Fellow at the College of Computer Science and Engineering, University of South Carolina, I have led five machine learning projects, progressing from decision trees in Scikit-learn to neural networks in PyTorch.

5 ML Projects
3+ Years Postdoc
R & Python Primary Languages
8+ Publications

Languages & Environments

R & Bioconductor

Proficient in R for statistical computing and bioinformatics, leveraging the Bioconductor ecosystem for genomic data analysis.

  • DESeq2 — differential gene expression
  • Bioconductor packages
  • ggplot2 / visualization
  • RStudio / Quarto

Python

Advanced Python scripting for machine learning, structural biology, and sequence analysis pipelines.

  • Scikit-learn — classical ML
  • PyTorch — neural networks
  • Biopython — sequence analysis
  • Pandas / NumPy / Matplotlib

Command-Line & Linux

Proficient in command-line environments and shell scripting for pipeline automation and high-performance computing workflows.

  • Bash scripting & automation
  • HPC / cluster computing
  • Nextflow / workflow management
  • Git / version control

Research & Analysis Areas

Genomics

Sequencing Data Analysis & Phylogeny

Extensive wet-lab sequencing experience combined with programming expertise lets me take raw sequencing data through quality control and alignment to advanced phylogenetic analysis.
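
As a small illustration of the quality-control step, the sketch below decodes a FASTQ quality string and applies a mean-quality filter. It assumes Phred+33 encoding (the usual convention for current Illumina output); the function names and the threshold of 20 are illustrative choices, not taken from any specific pipeline of mine, and dedicated tools such as FastQC do far more than this.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into per-base Phred scores.

    Assumes Phred+33 encoding by default (an assumption; older data may use +64).
    """
    return [ord(ch) - offset for ch in quality]


def mean_quality(quality: str) -> float:
    """Arithmetic mean of the Phred scores for one read."""
    if not quality:
        raise ValueError("quality string is empty")
    scores = phred_scores(quality)
    return sum(scores) / len(scores)


def passes_qc(quality: str, min_mean: float = 20.0) -> bool:
    """Keep a read only if its mean Phred score reaches the threshold.

    The default of 20 (about 1 error in 100 bases) is a hypothetical cutoff.
    """
    return mean_quality(quality) >= min_mean


if __name__ == "__main__":
    read_quality = "IIII5555!!!!"  # made-up quality string for demonstration
    print(phred_scores(read_quality))
    print(mean_quality(read_quality), passes_qc(read_quality))
```

In practice the threshold and any trimming strategy depend on the sequencing platform and the downstream analysis.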

Transcriptomics

Gene Expression Data Analysis

RNA-Seq analysis pipelines from raw reads through differential expression, pathway enrichment, and biological interpretation. Includes DESeq2-based factorial modeling and circadian transcriptomics.
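
To make the idea of differential expression concrete, here is a deliberately simplified sketch: counts are scaled to counts per million (CPM) and a log2 fold change is computed per gene. This is not DESeq2's method (DESeq2 uses median-of-ratios normalization and a negative binomial model with shrinkage); the gene names, counts, and pseudocount are made up for illustration.

```python
import math


def counts_per_million(counts: dict[str, int]) -> dict[str, float]:
    """Scale raw read counts by library size to counts per million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has zero total counts")
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def log2_fold_change(
    control: dict[str, int],
    treated: dict[str, int],
    pseudocount: float = 1.0,
) -> dict[str, float]:
    """Per-gene log2(treated / control) on CPM values.

    The pseudocount avoids division by zero for unexpressed genes;
    its value here is an illustrative assumption.
    """
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    return {
        gene: math.log2(
            (cpm_treated[gene] + pseudocount) / (cpm_control[gene] + pseudocount)
        )
        for gene in cpm_control
    }


if __name__ == "__main__":
    # Hypothetical counts for two samples.
    control = {"geneA": 120, "geneB": 300, "geneC": 580}
    treated = {"geneA": 480, "geneB": 310, "geneC": 610}
    for gene, lfc in log2_fold_change(control, treated).items():
        print(f"{gene}: {lfc:+.2f}")
```

A real analysis would use replicates and a statistical test rather than a single pairwise ratio.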

Epigenomics

DNA Methylation Analysis

Genome-wide methylation profiling to explore epigenetic changes and their influence on health. Applied Random Forest models to CpG loci data to identify biomarkers of cognitive impairment.

Structural Biology

Computational Structural Biology

An all-in-one pipeline from gene and construct design to structural analysis of macromolecules. Developing comprehensive tools bridging biology and computation using PyMOL, VMD, and GROMACS.

  • Protein structure prediction (AlphaFold, I-TASSER, ROSETTA)
  • Molecular dynamics (GROMACS, NAMD)
  • Residual dipolar coupling (RDC) analysis
  • Protein-protein interaction networks (Cytoscape)

Primer Design

Custom Primer Design & Assay Development

Deep understanding of PCR techniques combined with advanced programming skills enables custom primer and assay design for innovative diagnostic products, services, and research applications.
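
As a minimal sketch of the kind of check that goes into primer design, the code below computes GC content, a rough melting temperature, and the reverse complement of a candidate primer. The melting temperature uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a coarse estimate for short oligos; the function names and the example sequence are hypothetical, and production assay design would rely on nearest-neighbor thermodynamics and specificity checks.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def _validate(seq: str) -> str:
    """Upper-case the sequence and reject anything other than A, C, G, T."""
    seq = seq.upper()
    if not seq or any(base not in COMPLEMENT for base in seq):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return seq


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = _validate(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); intended only for short oligos.
    """
    seq = _validate(seq)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement, e.g. to derive a reverse primer from the template strand."""
    seq = _validate(seq)
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    primer = "ATGCATGCATGCATGC"  # made-up 16-mer for demonstration
    print(gc_fraction(primer), wallace_tm(primer), reverse_complement(primer))
```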

Machine Learning

Machine Learning in Biology

Led five ML projects in biological research, from classical decision trees to deep neural networks, applied to genomics, proteomics, and medical imaging.

  • Decision trees & random forests (Scikit-learn)
  • Deep learning (PyTorch)
  • Biomarker discovery from high-dimensional omics data
  • Vascular segmentation in CTA images

Computational Tools & Ecosystem

Bioinformatics Platforms

  • DESeq2
  • Bioconductor
  • Biopython
  • BLAST
  • Clustal Omega
  • MUSCLE
  • IQ-TREE
  • FastQC
  • STAR / HISAT2

Structural Biology

  • AlphaFold
  • PyMOL
  • VMD
  • GROMACS
  • NAMD
  • ROSETTA
  • I-TASSER
  • PDB

Machine Learning & Visualization

  • Scikit-learn
  • PyTorch
  • TensorFlow
  • Cytoscape
  • ggplot2
  • Matplotlib
  • Seaborn
  • Plotly

Training & Workshops

Available for workshops and hands-on training sessions in computational biology, bioinformatics, and data analysis for research groups, core facilities, and industry teams.

R for Genomics

Hands-on introduction to R and Bioconductor for RNA-Seq analysis, from data loading and quality control through differential expression and visualization.

Python for Bioinformatics

Practical workshop on Python and Biopython for sequence analysis, data manipulation, and building bioinformatics pipelines.

ML in Life Sciences

Introduction to machine learning approaches applicable to biological data: classification, feature selection, and interpretable models for omics datasets.

Interested in a workshop or collaboration? Get in touch. Visit my GitHub for code repositories and the Data Analysis in R 2024 resource.