PhD @ CMU | Efficient transformer architectures | Scientific ML
Transformer architecture research focused on scaling, memory, and communication structure. Author of FLARE, an attention mechanism that scales to million-token regimes on a single GPU. Implements architectures in PyTorch and Triton. Background in high-performance computing, numerical analysis, and computational fluid dynamics.
Links: Website | LinkedIn | Google Scholar
Below is a dense list of my open-source work (owned repos plus active fork/work branches). The lists are intentionally selective rather than exhaustive. Inclusion means meaningful contribution or ownership, not necessarily sole maintainership.
- FLARE.py
: Fast Low-rank Attention Routing Engine; unified low-rank self-attention with O(NM) memory scaling.
- FastDiffusion.py
: Experiments with a trigonometric noise schedule in the context of few-step diffusion.
- NeuralROMs.jl
: SNF-ROM implementation for projection-based nonlinear reduced-order modeling with smooth neural fields.
- KolmogorovArnold.jl
: Julia implementation of Kolmogorov-Arnold networks with custom gradients.
- SpectralElements.jl
: Julia spectral element method solvers and numerical experiments.
- NekTools
: Fortran 77 turbulence-budget and post-processing tools for NEK5000.
- mlutils.py
: Lightweight PyTorch project template and training utilities.
- PFHubBenchmarks
: Phase-field simulation benchmarks implemented with FEniCS.
- SciML/SciMLOperators.jl
: Operator abstractions for SciML/PDE workflows and matrix-free formulations.
- SciML/LinearSolve.jl
: Unified interface for direct and iterative linear solvers in the SciML stack.
- SciML/OrdinaryDiffEq.jl
: High-performance ODE solvers, including neural ODE workloads.
- SciML/NonlinearSolve.jl
: High-performance, differentiation-enabled nonlinear system solvers.
- SciML/Optimization.jl
: Unified optimization interface across local/global and gradient/derivative-free methods.
- SciML/SciMLBase.jl
: Base interfaces and shared problem abstractions for the SciML ecosystem.
- SciML/SciMLSensitivity.jl
: Sensitivity analysis and adjoint methods for differential equation models.
- SciML/DiffEqFlux.jl
: Neural differential equation tooling and SciML model training.
- SciML/DiffEqBase.jl
: Lightweight shared types/functionality for differential equation and SciML problems.
- FluxML/Flux.jl
: Julia machine learning framework.
- LuxDL/Lux.jl
: Explicitly parameterized neural networks in Julia.
- EnzymeAD/Reactant.jl
: Compiles Julia functions through MLIR and XLA for accelerated execution; used as a compilation backend by Lux.
- JuliaGPU/CUDA.jl
: CUDA programming and GPU kernels in Julia.
- JuliaDiff/ForwardDiff.jl
: Forward-mode automatic differentiation in Julia.
- JuliaGPU/GPUArrays.jl
: Common array abstractions shared across Julia GPU backends.
- FluxML/Zygote.jl
: Source-to-source reverse-mode automatic differentiation for Julia.
- FLARE: HF | arXiv | code
- SNF-ROM: JCP / arXiv | code
- spec: MATLAB spectral element code.
- wavyWallDNS: DNS results for turbulent flow over wavy walls.
- wallMountedCube: CFD simulation setup for flow over a wall-mounted cube.
Public profile snapshot used for this summary: 82 repos total (22 original, 60 forks) as of February 13, 2026.




