Knowledge Share

All notes

  1. Machine Learning Self Attention At its core, self-attention is a sequence-to-sequence operation. It takes a sequence of vectors and produces a new sequence of vectors of the same…
  2. Machine Learning Vision Transformer (ViT) The Vision Transformer (ViT) represents a massive paradigm shift in computer vision. Introduced by Google in 2020 ("An Image is Worth 16x16 Words"), it…
  3. Machine Learning Diffusion Transformer (DiT) To understand the math behind the Diffusion Transformer (DiT), we have to separate it into two distinct parts: the mathematical framework (the…
  4. Machine Learning Classifier-Free Guidance (CFG) Classifier-Free Guidance (CFG) is arguably the most critical technique for achieving high-fidelity, strongly aligned generations in modern diffusion…
  5. Machine Learning Backpropagation Backpropagation (short for "backward propagation of errors") is the mathematical engine that allows neural networks to learn. At its core, it is an…
  6. Machine Learning AutoDiff Automatic differentiation (AutoDiff) is the algorithmic foundation that makes modern machine learning frameworks like PyTorch and JAX possible. While…
  7. Machine Learning KL Divergence At its core, Kullback-Leibler (KL) Divergence is a statistical measure of how much one probability distribution differs from a second, reference…
  8. Machine Learning ELBO (Evidence Lower Bound) In Bayesian inference and generative modeling, the Evidence Lower Bound (ELBO) is a crucial quantity used to approximate the marginal likelihood (the…
  9. Machine Learning Diffusion Model Diffusion models, specifically Denoising Diffusion Probabilistic Models (DDPMs), are generative models that learn to create data by reversing a gradual…
  10. Machine Learning Diffusion from Stochastic Differential Equations (SDEs) Perspective From a mathematical perspective, diffusion models are fundamentally about defining a trajectory between a complex, intractable data distribution and a…
  11. Machine Learning VAE vs. Diffusion from ELBO Perspective It is fascinating that two entirely different generative paradigms—Variational Autoencoders (VAEs) and Diffusion Models—are mathematically rooted in…
  12. Machine Learning DDIM (Denoising Diffusion Implicit Models) The fundamental difference between DDPM (Denoising Diffusion Probabilistic Models) and DDIM (Denoising Diffusion Implicit Models) lies entirely in the…
  13. Machine Learning DDPMs, DDIMs, and Score-Based Methods The connection between DDPMs, DDIMs, and Score-Based Generative Models is one of the most elegant unifying theories in modern machine learning…
  14. Machine Learning Flow Matching Flow matching is a highly effective mathematical framework for generative modeling. It serves as an alternative to Diffusion Models and provides a more…
  15. Machine Learning Mean Flow Notes on mean-flow generative modeling and its connection to flow matching.
  16. Machine Learning Improved Mean Flow (iMF) Improved mean flow (iMF) — faster sampling and training for flow-based generative models.
  17. Computer Vision Optical Flow Optical flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene, caused by the relative motion between an observer (an…
  18. Computer Vision Lucas-Kanade Method Example To see exactly how the Lucas-Kanade (LK) method solves for optical flow, we need to walk through the least-squares approximation.
  19. Computer Vision RAFT (Recurrent All-Pairs Field Transforms) RAFT (Recurrent All-Pairs Field Transforms) represents a major paradigm shift from traditional optical flow methods like Lucas-Kanade.
  20. Computer Vision Camera Intrinsic Matrix Intrinsics K, focal length, principal point, and pixel skew.
  21. Computer Vision Camera Extrinsic Matrix Estimating the camera extrinsic matrix—which defines the rigid transformation from the world coordinate system to the camera's local 3D coordinate…
  22. Computer Vision COLMAP COLMAP is an end-to-end pipeline for Structure-from-Motion (SfM) and Multi-View Stereo (MVS). It takes a collection of 2D images and mathematically…
  23. Computer Vision Bundle Adjustment Bundle adjustment is the cornerstone of 3D reconstruction, Structure from Motion (SfM), and visual SLAM. At its core, it is a large-scale, non-linear…
  24. Optimization Proximal Algorithms Proximal algorithms are a class of optimization methods designed to handle objective functions that are non-smooth, constrained, or split into multiple…
  25. Optimization Neural Proximal Operators This is the exact conceptual leap that birthed the Plug-and-Play (PnP) and Regularization by Denoising (RED) frameworks, revolutionizing how we solve…
  26. Optimization Analytical Proximal Operators As a quick refresher, the proximal operator of a scaled convex function \(\lambda f(x)\) evaluated at a point \(v\) is defined as:
  27. Optimization HQS (Half-Quadratic Splitting) It is designed to minimize an objective function that consists of two competing terms: a data fidelity term (how well the solution matches the…
  28. Optimization ADMM (Alternating Direction Method of Multipliers) The Alternating Direction Method of Multipliers (ADMM) is a powerful algorithm that solves convex optimization problems by breaking them into smaller,…
  29. Optimization Lagrangian Method The standard Lagrangian is a mathematical trick to turn a *constrained* problem into an *unconstrained* one. It does this by taking the hard rules…
  30. Computer Graphics 3D Gaussian Splatting 3D Gaussian Splatting (3DGS) is a breakthrough technique in computer graphics and computer vision for novel view synthesis. It emerged as a faster,…
  31. Computer Graphics NeRF Neural Radiance Fields (NeRF) represent a breakthrough approach to synthesizing novel views of complex 3D scenes from a sparse set of 2D images.…
  32. Geometry Generation Geometry Generation 3D shape synthesis, neural implicit surfaces, and mesh generation methods.
  33. Computational Imaging PnP with Diffusion Plug-and-play priors with diffusion models for computational imaging inverse problems.
  34. Computational Imaging Generative Methods for Deconv At its core, deconvolution is fundamentally ill-posed. Information is lost when an image is blurred, meaning multiple different sharp images could…
