# Evolution & Wetware #### Machine Learning Singapore
[Martin Andrews](http://mdda.net) @ [reddragon.ai](http://reddragon.ai/)
;[Sam Witteveen](http://samwitteveen.com) @ [reddragon.ai](http://reddragon.ai/)
22-May-2025
--- ## Today's Line-up * "Evolving GPU Kernels"
- _Martin Andrews_ * "A Deep Learning Approach for Nanomedicine Design"
- _Alvin Chan_ --- # Evolving GPU Kernels #### Machine Learning Singapore
[Martin Andrews](http://mdda.net) @ [reddragon.ai](http://reddragon.ai/)
22-May-2025
--- ## About Me * Machine Intelligence / Startups / Finance + Moved from NYC to Singapore in Sep-2013 * 2014 = 'fun' : + Machine Learning, Deep Learning, NLP + Robots, drones * Since 2015 = 'serious' :: NLP + deep learning + Including Papers... + & GDE ML; ML-Singapore co-organiser... + & Red Dragon AI... -- ## About Red Dragon AI * Deep Learning Consulting & Prototyping (Google Partner) - Education / Training - Research : NeurIPS / EMNLP / NAACL / ICML / ICLR * Please contact us for : - Language model training (eg: on-prem) - Knowledgebase interaction & reasoning - Sales-oriented applications --- ## Outline * GPU Kernels + What's involved? + AMD Challenge * Evolutionary Algorithms + Some History / Ideas * AlphaEvolve + What's new? * Wrap-up & QR-code ;* Head's Up! --- ## GPU Kernels * Complexity of GPU normally hidden + PyTorch, Keras, JAX, TensorFlow * But sometimes the details matter + DeepSeek; FlashAttention; NeRFs; MAMBA + == Writing CUDA (or equivalent) * So what's so difficult? -- ## CPUs vs GPUs
* CPU : Few complex, independent cores * GPU : Many simple cores, tied together in lock-step groups -- ## 2080Ti ( Turing ) [Turing SM diagram
](https://developer.nvidia.com/blog/nvidia-turing-architecture-in-depth/) * So are the boxes the cores? -- ## No... It goes deeper
-- ## ... Tensor Cores
--- ## Matrix Multiply
```java
// Naive GEMM :  C[M*N] = A[M*K] x B[K*N]  (row-major, flat arrays)
for (int m = 0; m < M; m++) {
  for (int n = 0; n < N; n++) {
    float acc = 0.0f;
    for (int k = 0; k < K; k++) {
      acc += A[m * K + k] * B[k * N + n];
    }
    C[m * N + n] = acc;
  }
}
```
[Why GEMM is at the heart of deep learning](https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/)
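A rough, CPU-side illustration (not from the slides) of why these three loops attract so much optimisation effort: the same multiply as a pure-Python loop and as an optimised BLAS call via `numpy`; exact timings depend on the machine.

```python
# Rough CPU-side comparison: pure-Python GEMM loop vs an optimised BLAS call.
# Illustrative only - actual speedups depend on the BLAS build and hardware.
import time
import numpy as np

M = N = K = 128
A = np.random.rand(M, K)
B = np.random.rand(K, N)

def naive_gemm(A, B):
    C = np.zeros((M, N))
    for m in range(M):
        for n in range(N):
            acc = 0.0
            for k in range(K):
                acc += A[m, k] * B[k, n]
            C[m, n] = acc
    return C

t0 = time.time(); C_slow = naive_gemm(A, B); t_slow = time.time() - t0
t0 = time.time(); C_fast = A @ B;            t_fast = time.time() - t0

assert np.allclose(C_slow, C_fast)
print(f"python loops : {t_slow:.3f}s    numpy / BLAS : {t_fast:.5f}s")
```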
-- ## Launching GPU kernels * Each (regular) core is assigned to an area of the output matrix ;Blog post : progressively more involved ;Quite a few slides… -- ## But there's more... * [Excellent Blog Post](https://siboehm.com/articles/22/CUDA-MMM) on DIY Matrix Multiply + NB: Doesn't use the [Tensor Cores...](https://github.com/andylolu2/simpleGEMM/blob/master/gemm.cuh)

| Step | Method          | GFLOPs/sec |
| ---- | --------------- | ---------: |
| 1    | Naïve approach  |        309 |
| 5    | 2D Block Tiling |      15972 |
| 10   | Warptiling      |      21779 |
|      | cuBLAS library  |      23250 |

;https://www.reddit.com/r/MachineLearning/comments/1cqhsln/p_simplegemm_fast_and_minimal_tensor_core_matrix/
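To give a flavour of what the 'Block Tiling' step in the table above means, here is a small pure-Python sketch (an illustrative assumption, not the CUDA from the blog post): the output matrix is cut into tiles, each computed from small blocks of A and B that a GPU thread-block would stage into fast shared memory.

```python
# Sketch of 2D block tiling: compute C one TILE x TILE output tile at a time,
# accumulating over TILE-wide slabs of A and B (the blocks a GPU thread-block
# would stage into shared memory). Pure Python/numpy, for illustration only.
import numpy as np

TILE = 32
M = N = K = 128          # assume multiples of TILE for simplicity
A = np.random.rand(M, K)
B = np.random.rand(K, N)
C = np.zeros((M, N))

for m0 in range(0, M, TILE):          # one iteration ~ one thread-block
    for n0 in range(0, N, TILE):
        acc = np.zeros((TILE, TILE))  # per-tile accumulator ("registers")
        for k0 in range(0, K, TILE):  # march along the K dimension
            a_blk = A[m0:m0+TILE, k0:k0+TILE]   # "shared memory" block of A
            b_blk = B[k0:k0+TILE, n0:n0+TILE]   # "shared memory" block of B
            acc += a_blk @ b_blk
        C[m0:m0+TILE, n0:n0+TILE] = acc

assert np.allclose(C, A @ B)
```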
--- ## AMD GPU kernels [AMD Developer Challenge 2025](https://www.datamonsters.com/amd-developer-challenge-2025) * Registration Deadline ~ last MeetUp + Competition deadlines : 27-May - 2-June + `FP8 GEMM` · `Fused MoE` · `MLA with RoPE` ; https://x.com/pavel_4_ai/status/1915039361655083223 ; AMD software is improving rapidly ; CUDA isn't a moat forever, but Nvidia is building new ones with the Python DSL, Dynamo, and more ; Meanwhile Nvidia hardware advantage is huge this year, but perf/TCO of MI355X has attracted some customers ; MI450X is actually competitive with Rubin -- ## Big Question * Can `Gemini 2.5 Pro` write AMD GPU kernels? + Short answer: YES + Longer answer: YES, but how competitive is it? * Next Question: + How can we automate code optimisation? --- ## Evolutionary Algorithms #### A bit of history * Back in the mid-1990s: + Neural Networks only 'kinda' worked - whereas HMMs and SVMs were on the horizon + But Genetic Algorithms / Programming actually worked - So: My PhD was in an NN lab, but I did GP ; - and did temporarily 'win' for 2000-2010 --- ## [Genetic Algorithms](https://en.wikipedia.org/wiki/Genetic_algorithm) * Basic Ideas: + Individual = Bit String + Population = 100s of Individuals + Fitness = evaluate each Individual + Selection = Choose 'good' individuals + Mutation & Crossover - to generate new individuals - which replace 'bad' individuals -- ## Genetic Algorithms
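As a concrete toy version of the ingredients listed above, a minimal bit-string GA sketch; the 'count the 1-bits' fitness is a stand-in assumption for a real objective.

```python
# Minimal bit-string GA: fitness = number of 1-bits (a toy stand-in objective).
import random

GENES, POP, GENERATIONS = 32, 100, 60
MUTATION_RATE = 1.0 / GENES

def fitness(ind):                 # evaluate each Individual
    return sum(ind)

def select(pop):                  # tournament selection of a 'good' Individual
    return max(random.sample(pop, 3), key=fitness)

def crossover(p1, p2):            # single-point crossover of two parents
    cut = random.randrange(1, GENES)
    return p1[:cut] + p2[cut:]

def mutate(ind):                  # occasional bit-flips (local changes)
    return [g ^ 1 if random.random() < MUTATION_RATE else g for g in ind]

pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
for gen in range(GENERATIONS):    # new individuals replace the old population
    pop = [mutate(crossover(select(pop), select(pop))) for _ in range(POP)]

print("best fitness :", fitness(max(pop, key=fitness)), "/", GENES)
```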
-- ## Genetic Algorithms * Widely believed / observed: + Mutation is a local operator :: Weak + Crossover powers the global search * Note strong parallels with Nature + Mutations are clearly a thing, BUT... + Practically all species have 2 parents --- ## [Genetic Programming](https://en.wikipedia.org/wiki/Genetic_programming)
* Each Individual is a program, represented as a tree -- ## GP Crossover
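In place of the usual crossover diagram, a minimal sketch of subtree crossover on expression trees stored as nested tuples (an illustrative toy, not the talk's example):

```python
# Subtree crossover for GP: each Individual is an expression tree (nested tuples).
# A random subtree of parent 1 is replaced by a random subtree of parent 2.
import random

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node in the tree."""
    yield path, tree
    if isinstance(tree, tuple):                  # ('op', child, child, ...)
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(parts[path[0]], path[1:], new)
    return tuple(parts)

def crossover(p1, p2):
    path, _ = random.choice(list(subtrees(p1)))      # cut point in parent 1
    _, donor = random.choice(list(subtrees(p2)))     # donated subtree from parent 2
    return replace(p1, path, donor)

parent1 = ('add', ('mul', 'x', 'x'), ('add', 'x', 1))   # x*x + (x+1)
parent2 = ('mul', ('add', 'x', 3), 'x')                 # (x+3) * x
print(crossover(parent1, parent2))
```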
* Crossover operation : Appears to be MADNESS! -- ## GP Crossover Madness * Behaviour of Population != Individual x 100 * For Crossover to work at all: + Qualities that propagate need more robust Individuals + We can look at 'dead code' (for instance) - and draw an analogy with Junk DNA -- ## Genetic Programming #### The Field * Genetic Programming Bibliography ... + now surpasses 10k entries * In 2010, Koza listed 77 results ... + where GP was human-competitive + ... in all sorts of fields --- ## Evolution Innovations * [Novelty Search](https://www.semanticscholar.org/paper/NOVELTY-SEARCH-AND-THE-PROBLEM-WITH-OBJECTIVES-TO-Lehman-Stanley/e49d1ee1bddea0922faca358f3fd42474baad300?p2df) - Lehman & Stanley (2011) + "Why Greatness Cannot be Planned" * [MAP-Elites](https://arxiv.org/abs/1504.04909) - Mouret & Clune (2015) + Also : Work by *MLSG speaker* Jenny Zhang * Help to solve "Population Collapse" --- ## Evolution with LLMs * Can use an LLM as the Mutation/Crossover operator + ... and operate on text / prompts / code * Evolving Prompts: + [Promptbreeder](https://arxiv.org/abs/2309.16797) - Fernando _et al_ (2023) - "Self-Referential Self-Improvement via Prompt Evolution" + [Self-Discover](https://arxiv.org/abs/2402.03620) - Zhou _et al_ (2024) - "Large Language Models Self-Compose Reasoning Structures" * Evolving Programs: + [FunSearch](https://www.nature.com/articles/s41586-023-06924-6.pdf) - Romera-Paredes _et al_ (2024) - "Mathematical discoveries from program search with large language models" -- ## Applications ... * ... to GPU Kernel writing should be clear!
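One way to picture the idea (a sketch under assumptions, not AlphaEvolve's actual code): keep a population of candidate kernels as text, ask an LLM to combine or rewrite the most promising ones, and let an automated benchmark decide what survives. `ask_llm` and `benchmark_gflops` are hypothetical stand-ins for a real model API and a real compile-and-time harness.

```python
# Evolution loop with an LLM as the Mutation/Crossover operator (sketch only).
# `ask_llm` and `benchmark_gflops` are hypothetical stand-ins - wire them up to
# a real code-LLM API and a real compile/correctness/timing harness.
import random

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("call a code-writing LLM here")

def benchmark_gflops(kernel_src: str) -> float:
    raise NotImplementedError("compile the kernel, check correctness, time it")

def evolve(seed_kernel: str, generations: int = 20, keep: int = 8) -> str:
    population = [(benchmark_gflops(seed_kernel), seed_kernel)]
    for _ in range(generations):
        population = sorted(population, reverse=True)[:keep]   # selection
        parent_a = random.choice(population)[1]    # 'crossover' via the prompt:
        parent_b = random.choice(population)[1]    # show the LLM two parents
        child = ask_llm(
            "Combine the best ideas from these two GPU kernels into one\n"
            "faster, still-correct kernel:\n"
            f"--- A ---\n{parent_a}\n--- B ---\n{parent_b}\n"
        )
        try:
            population.append((benchmark_gflops(child), child))
        except Exception:
            pass        # candidates that fail to compile/validate are dropped
    return max(population)[1]
```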
--- ## AlphaEvolve
* [_AlphaEvolve_: A coding agent for scientific and algorithmic discovery](https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf) - Novikov _et al_ (2025) + DeepMind [Blog Post](https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/) * The headline: + New AI agent evolves algorithms ... + ... for math and practical applications in computing ... + ... by combining the creativity of LLMs with automated evaluators
-- ## AlphaEvolve
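As a stand-in for the 'automated evaluators' half of the headline above, a toy sketch of how evolved code can be scored automatically (a correctness gate first, then reward speed); the test spec and scoring are illustrative assumptions, and real systems sandbox this step.

```python
# Sketch of an automated evaluator for evolved code: execute the candidate,
# check correctness against known cases, then score by speed. Illustrative only;
# a real system would run candidates in a sandboxed worker, not a bare exec().
import time

TEST_CASES = [((2, 3), 6), ((7, 8), 56), ((12, 12), 144)]   # toy spec: multiply

def evaluate(candidate_src: str) -> float:
    """Return a fitness score; 0.0 for anything broken or wrong."""
    namespace = {}
    try:
        exec(candidate_src, namespace)            # candidate must define solve(a, b)
        solve = namespace["solve"]
        for args, expected in TEST_CASES:         # correctness gate first
            if solve(*args) != expected:
                return 0.0
        start = time.perf_counter()               # then reward speed
        for _ in range(10_000):
            solve(123, 456)
        return 1.0 / (time.perf_counter() - start)
    except Exception:
        return 0.0

print(evaluate("def solve(a, b):\n    return a * b"))
print(evaluate("def solve(a, b):\n    return a + b"))   # wrong -> 0.0
```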
-- ## Open Implementation
* [OpenEvolve Repo on GitHub](https://github.com/codelion/openevolve) (Apache 2) + Super-fast follow-up to the AlphaEvolve announcement! + Developer [Twitter Thread](https://x.com/asankhaya/status/1925153525597982970) & [Reddit Posting](https://www.reddit.com/r/LocalLLaMA/comments/1kr9rvp/openevolve_open_source_implementation_of/)
-- ## AlphaEvolve Key Results
* Faster matrix multiplication + for 4x4 matrices (and others) * Discovering mathematical objects or constructions + that possess optimal (or near-optimal) properties * Optimizing Google's computing ecosystem + recovers ~0.7% of Google's fleet-wide compute resources * Optimizing Gemini kernel tiling strategy + 23% kernel speedup across all kernels + 1% reduction in Gemini's overall training time ;+ (Not in the GPU sense) -- ## Basic Matrix Multiplication
* Multiplying 2x2 matrices "clearly" requires 8 multiplies -- ## Strassen's Method
* Multiplying 2x2 matrices *needs* only 7 multiplies! * AlphaEvolve finds similar tricks for larger matrices
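As a quick check on the claims above: a runnable sketch of the naive 8-multiply product versus Strassen's 7-multiply version for 2x2 matrices (standard textbook identities, not AlphaEvolve's output).

```python
# Strassen's trick for 2x2 matrices: 7 multiplies instead of the naive 8.
import random

def naive_2x2(A, B):                       # 8 multiplies
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    return ((a11*b11 + a12*b21, a11*b12 + a12*b22),
            (a21*b11 + a22*b21, a21*b12 + a22*b22))

def strassen_2x2(A, B):                    # only 7 multiplies
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return ((m1 + m4 - m5 + m7, m3 + m5),
            (m2 + m4,           m1 - m2 + m3 + m6))

A = tuple(tuple(random.randint(-9, 9) for _ in range(2)) for _ in range(2))
B = tuple(tuple(random.randint(-9, 9) for _ in range(2)) for _ in range(2))
assert naive_2x2(A, B) == strassen_2x2(A, B)
print("7-multiply result matches the 8-multiply result")
```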
--- ## New Factor : Better LLMs * Things LLMs can do: + Write / change programs + Compare methods / Compare outcomes + Create / combine instructions + Adding : Human notions of elegance & novelty * Key question: + How do we achieve crossover *everywhere*? ; + Phenotypes Vs Genotypes --- ## Wrap-Up * Everything Old is New again! * LLMs can power larger systems + ... that have surprising capabilities * Experimentally : It's very early days!
NB: MLSG wants to feature Your Talk!
-- ## Link to Slides [QR code
](https://bit.ly/MLSG_2025-05) [https://bit.ly/MLSG_2025-05](https://bit.ly/MLSG_2025-05) --- ## A Deep Learning Approach for Nanomedicine Design #### Alvin Chan * Lipid nanoparticles (LNPs) * COMET : + Predict LNP efficacy and ... + ... accelerate the design of next-generation RNA medicines --- ## Further Study * Field is growing very rapidly * Lots of different things can be done * Easy to find novel methods / applications -- ## Deep Learning Foundations * 3 week-days + online content * Play with real models & Pick-a-Project * Held online, Live Coding, Certificates * Last run : Early September -- ## NLP (Advanced) ### Advanced NLP and Sequence Processing * NLP (eg: Named Entity Recognition) * Transformers : Theory and Practice * Generative AI * Last run : Early October -- ## Vision (Advanced) ### Advanced Computer Vision with Deep Learning * Advanced classification * Other architectures (eg: U-Nets) * Transformer-based vision * Last run : Early November -- ## AI in Production ### Building Real World A.I. Applications * DIY : node-server + task-queue + python-ml * TensorFlow Serving / PyTorch Serve * TF Lite + TF.js : edge device models * Distillation, pruning, quantisation, etc... * Last run : Early February -- ## Deep Learning for PMs ### ( `= Foundations - code`
`+ management` ) * Much more about 'big picture' * Only a few code examples * Project process standardised * Last run : Late January -- ## Also... * Unsupervised methods * Time-series & Deep Learning * Audio Processing (Sounds & Speech) ;-- ; ;## QR code for Courses ; ;
--- ## Machine Learning SG MeetUp Group
* Next Meeting = 19-June-2025 * Topic : TBA * Typical Contents : + Talk for people starting out + Something from the bleeding-edge + Lightning Talks * [MeetUp.com / Machine-Learning-Singapore](https://www.meetup.com/Machine-Learning-Singapore/) -- ## Quick Poll #### Show of hands * What topic(s) would _compel_ you to come? + [Vibe Coding](https://x.com/MatthewBerman/status/1904039128611914144) + LLMs for Science + Agents + Stable-diffusion++ / Video / Gaussian Splatting + Robotics + AI for Education --- # - Questions -
;`Handouts :` [`https://bit.ly/`
`text-similarity-jan-2022`](https://bit.ly/text-similarity-jan-2022)