Part I  Image Formation and Image Models

1 CAMERAS
  1.1 Pinhole Cameras
    1.1.1 Perspective Projection
    1.1.2 Affine Projection
  1.2 Cameras with Lenses
    1.2.1 Paraxial Geometric Optics
    1.2.2 Thin Lenses
    1.2.3 Real Lenses
  1.3 The Human Eye
  1.4 Sensing
    1.4.1 CCD Cameras
    1.4.2 Sensor Models
  1.5 Notes
  Problems

2 GEOMETRIC CAMERA MODELS
  2.1 Elements of Analytical Euclidean Geometry
    2.1.1 Coordinate Systems and Homogeneous Coordinates
    2.1.2 Coordinate System Changes and Rigid Transformations
  2.2 Camera Parameters and the Perspective Projection
    2.2.1 Intrinsic Parameters
    2.2.2 Extrinsic Parameters
    2.2.3 A Characterization of Perspective Projection Matrices
  2.3 Affine Cameras and Affine Projection Equations
    2.3.1 Affine Cameras
    2.3.2 Affine Projection Equations
    2.3.3 A Characterization of Affine Projection Matrices
  2.4 Notes
  Problems

3 GEOMETRIC CAMERA CALIBRATION
  3.1 Least-Squares Parameter Estimation
    3.1.1 Linear Least-Squares Methods
    3.1.2 Nonlinear Least-Squares Methods
  3.2 A Linear Approach to Camera Calibration
    3.2.1 Estimation of the Projection Matrix
    3.2.2 Estimation of the Intrinsic and Extrinsic Parameters
    3.2.3 Degenerate Point Configurations
  3.3 Taking Radial Distortion into Account
    3.3.1 Estimation of the Projection Matrix
    3.3.2 Estimation of the Intrinsic and Extrinsic Parameters
    3.3.3 Degenerate Point Configurations
  3.4 Analytical Photogrammetry
  3.5 An Application: Mobile Robot Localization
  3.6 Notes
  Problems

4 RADIOMETRY - MEASURING LIGHT
  4.1 Light in Space
    4.1.1 Foreshortening
    4.1.2 Solid Angle
    4.1.3 Radiance
  4.2 Light at Surfaces
    4.2.1 Simplifying Assumptions
    4.2.2 The Bidirectional Reflectance Distribution Function
    4.2.3 Example: The Radiometry of Thin Lenses
  4.3 Important Special Cases
    4.3.1 Radiosity
    4.3.2 Directional Hemispheric Reflectance
    4.3.3 Lambertian Surfaces and Albedo
    4.3.4 Specular Surfaces
    4.3.5 The Lambertian + Specular Model
  4.4 Notes
  Problems

5 SOURCES, SHADOWS, AND SHADING
  5.1 Qualitative Radiometry
  5.2 Sources and Their Effects
    5.2.1 Radiometric Properties of Light Sources
    5.2.2 Point Sources
    5.2.3 Line Sources
    5.2.4 Area Sources
  5.3 Local Shading Models
    5.3.1 Local Shading Models for Point Sources
    5.3.2 Area Sources and Their Shadows
    5.3.3 Ambient Illumination
  5.4 Application: Photometric Stereo
    5.4.1 Normal and Albedo from Many Views
    5.4.2 Shape from Normals
  5.5 Interreflections: Global Shading Models
    5.5.1 An Interreflection Model
    5.5.2 Solving for Radiosity
    5.5.3 The Qualitative Effects of Interreflections
  5.6 Notes
  Problems

6 COLOR
  6.1 The Physics of Color
    6.1.1 Radiometry for Colored Lights: Spectral Quantities
    6.1.2 The Color of Sources
    6.1.3 The Color of Surfaces
  6.2 Human Color Perception
    6.2.1 Color Matching
    6.2.2 Color Receptors
  6.3 Representing Color
    6.3.1 Linear Color Spaces
    6.3.2 Non-linear Color Spaces
    6.3.3 Spatial and Temporal Effects
  6.4 A Model for Image Color
    6.4.1 Cameras
    6.4.2 A Model for Image Color
    6.4.3 Application: Finding Specularities
  6.5 Surface Color from Image Color
    6.5.1 Surface Color Perception in People
    6.5.2 Inferring Lightness
    6.5.3 Surface Color from Finite-Dimensional Linear Models
  6.6 Notes
  Problems

Part II  Early Vision: Just One Image

7 LINEAR FILTERS
  7.1 Linear Filters and Convolution
    7.1.1 Convolution
  7.2 Shift Invariant Linear Systems
    7.2.1 Discrete Convolution
    7.2.2 Continuous Convolution
    7.2.3 Edge Effects in Discrete Convolutions
  7.3 Spatial Frequency and Fourier Transforms
    7.3.1 Fourier Transforms
  7.4 Sampling and Aliasing
    7.4.1 Sampling
    7.4.2 Aliasing
    7.4.3 Smoothing and Resampling
  7.5 Filters as Templates
    7.5.1 Convolution as a Dot Product
    7.5.2 Changing Basis
  7.6 Technique: Normalized Correlation and Finding Patterns
    7.6.1 Controlling the Television by Finding Hands by Normalized Correlation
  7.7 Technique: Scale and Image Pyramids
    7.7.1 The Gaussian Pyramid
    7.7.2 Applications of Scaled Representations
  7.8 Notes
  Problems

8 EDGE DETECTION
  8.1 Noise
    8.1.1 Additive Stationary Gaussian Noise
    8.1.2 Why Finite Differences Respond to Noise
  8.2 Estimating Derivatives
    8.2.1 Derivative of Gaussian Filters
    8.2.2 Why Smoothing Helps
    8.2.3 Choosing a Smoothing Filter
    8.2.4 Why Smooth with a Gaussian?
  8.3 Detecting Edges
    8.3.1 Using the Laplacian to Detect Edges
    8.3.2 Gradient-Based Edge Detectors
    8.3.3 Technique: Orientation Representations and Corners
  8.4 Notes
  Problems

9 TEXTURE
  9.1 Representing Texture
    9.1.1 Extracting Image Structure with Filter Banks
    9.1.2 Representing Texture Using the Statistics of Filter Outputs
  9.2 Analysis (and Synthesis) Using Oriented Pyramids
    9.2.1 The Laplacian Pyramid
    9.2.2 Filters in the Spatial Frequency Domain
    9.2.3 Oriented Pyramids
  9.3 Application: Synthesizing Textures for Rendering
    9.3.1 Homogeneity
    9.3.2 Synthesis by Sampling Local Models
  9.4 Shape from Texture
    9.4.1 Shape from Texture for Planes
  9.5 Notes
  Problems

Part III  Early Vision: Multiple Images

10 THE GEOMETRY OF MULTIPLE VIEWS
  10.1 Two Views
    10.1.1 Epipolar Geometry
    10.1.2 The Calibrated Case
    10.1.3 Small Motions
    10.1.4 The Uncalibrated Case
    10.1.5 Weak Calibration
  10.2 Three Views
    10.2.1 Trifocal Geometry
    10.2.2 The Calibrated Case
    10.2.3 The Uncalibrated Case
    10.2.4 Estimation of the Trifocal Tensor
  10.3 More Views
  10.4 Notes
  Problems

11 STEREOPSIS
  11.1 Reconstruction
    11.1.1 Image Rectification
  11.2 Human Stereopsis
  11.3 Binocular Fusion
    11.3.1 Correlation
    11.3.2 Multi-Scale Edge Matching
    11.3.3 Dynamic Programming
  11.4 Using More Cameras
    11.4.1 Three Cameras
    11.4.2 Multiple Cameras
  11.5 Notes
  Problems

12 AFFINE STRUCTURE FROM MOTION
  12.1 Elements of Affine Geometry
    12.1.1 Affine Spaces and Barycentric Combinations
    12.1.2 Affine Subspaces and Affine Coordinates
    12.1.3 Affine Transformations and Affine Projection Models
    12.1.4 Affine Shape
  12.2 Affine Structure and Motion from Two Images
    12.2.1 Geometric Scene Reconstruction
    12.2.2 Algebraic Motion Estimation
  12.3 Affine Structure and Motion from Multiple Images
    12.3.1 The Affine Structure of Affine Image Sequences
    12.3.2 A Factorization Approach to Affine Structure from Motion
  12.4 From Affine to Euclidean Images
    12.4.1 Euclidean Constraints and Calibrated Affine Cameras
    12.4.2 Computing Euclidean Upgrades from Multiple Views
  12.5 Affine Motion Segmentation
    12.5.1 The Reduced Row-Echelon Form of the Data Matrix
    12.5.2 The Shape Interaction Matrix
  12.6 Notes
  Problems

13 PROJECTIVE STRUCTURE FROM MOTION
  13.1 Elements of Projective Geometry
    13.1.1 Projective Spaces
    13.1.2 Projective Subspaces and Projective Coordinates
    13.1.3 Affine and Projective Spaces
    13.1.4 Hyperplanes and Duality
    13.1.5 Cross-Ratios and Projective Coordinates
    13.1.6 Projective Transformations
    13.1.7 Projective Shape
  13.2 Projective Structure and Motion from Binocular Correspondences
    13.2.1 Geometric Scene Reconstruction
    13.2.2 Algebraic Motion Estimation
  13.3 Projective Motion Estimation from Multilinear Constraints
    13.3.1 Motion Estimation from Fundamental Matrices
    13.3.2 Motion Estimation from Trifocal Tensors
  13.4 Projective Structure and Motion from Multiple Images
    13.4.1 A Factorization Approach to Projective Structure from Motion
    13.4.2 Bundle Adjustment
  13.5 From Projective to Euclidean Images
  13.6 Notes
  Problems

Part IV  Mid-Level Vision

14 SEGMENTATION BY CLUSTERING
  14.1 What Is Segmentation?
    14.1.1 Model Problems
    14.1.2 Segmentation as Clustering
  14.2 Human Vision: Grouping and Gestalt
  14.3 Applications: Shot Boundary Detection and Background Subtraction
    14.3.1 Background Subtraction
    14.3.2 Shot Boundary Detection
  14.4 Image Segmentation by Clustering Pixels
    14.4.1 Segmentation Using Simple Clustering Methods
    14.4.2 Clustering and Segmentation by K-means
  14.5 Segmentation by Graph-Theoretic Clustering
    14.5.1 Terminology for Graphs
    14.5.2 The Overall Approach
    14.5.3 Affinity Measures
    14.5.4 Eigenvectors and Segmentation
    14.5.5 Normalized Cuts
  14.6 Notes
  Problems

15 SEGMENTATION BY FITTING A MODEL
  15.1 The Hough Transform
    15.1.1 Fitting Lines with the Hough Transform
    15.1.2 Practical Problems with the Hough Transform
  15.2 Fitting Lines
    15.2.1 Line Fitting with Least Squares
    15.2.2 Which Point Is on Which Line?
  15.3 Fitting Curves
    15.3.1 Implicit Curves
    15.3.2 Parametric Curves
  15.4 Fitting as a Probabilistic Inference Problem
  15.5 Robustness
    15.5.1 M-estimators
    15.5.2 RANSAC
  15.6 Example: Using RANSAC to Fit Fundamental Matrices
    15.6.1 An Expression for Fitting Error
    15.6.2 Correspondence as Noise
    15.6.3 Applying RANSAC
    15.6.4 Finding the Distance
    15.6.5 Fitting a Fundamental Matrix to Known Correspondences
  15.7 Notes
  Problems

16 SEGMENTATION AND FITTING USING PROBABILISTIC METHODS
  16.1 Missing Data Problems, Fitting, and Segmentation
    16.1.1 Missing Data Problems
    16.1.2 The EM Algorithm
    16.1.3 The EM Algorithm in the General Case
  16.2 The EM Algorithm in Practice
    16.2.1 Example: Image Segmentation, Revisited
    16.2.2 Example: Line Fitting with EM
    16.2.3 Example: Motion Segmentation and EM
    16.2.4 Example: Using EM to Identify Outliers
    16.2.5 Example: Background Subtraction Using EM
    16.2.6 Example: EM and the Fundamental Matrix
    16.2.7 Difficulties with the EM Algorithm
  16.3 Model Selection: Which Model Is the Best Fit?
    16.3.1 Basic Ideas
    16.3.2 AIC - An Information Criterion
    16.3.3 Bayesian Methods and Schwartz's BIC
    16.3.4 Description Length
    16.3.5 Other Methods for Estimating Deviance
  16.4 Notes
  Problems

17 TRACKING WITH LINEAR DYNAMIC MODELS
  17.1 Tracking as an Abstract Inference Problem
    17.1.1 Independence Assumptions
    17.1.2 Tracking as Inference
    17.1.3 Overview
  17.2 Linear Dynamic Models
    17.2.1 Drifting Points
    17.2.2 Constant Velocity
    17.2.3 Constant Acceleration
    17.2.4 Periodic Motion
    17.2.5 Higher Order Models
  17.3 Kalman Filtering
    17.3.1 The Kalman Filter for a 1D State Vector
    17.3.2 The Kalman Update Equations for a General State Vector
    17.3.3 Forward-Backward Smoothing
  17.4 Data Association
    17.4.1 Choosing the Nearest - Global Nearest Neighbours
    17.4.2 Gating and Probabilistic Data Association
  17.5 Applications and Examples
    17.5.1 Vehicle Tracking
  17.6 Notes
  Problems

Part V  High-Level Vision: Geometric Methods

18 MODEL-BASED VISION
  18.1 Initial Assumptions
    18.1.1 Obtaining Hypotheses
  18.2 Obtaining Hypotheses by Pose Consistency
    18.2.1 Pose Consistency for Perspective Cameras
    18.2.2 Affine and Projective Camera Models
    18.2.3 Linear Combinations of Models
  18.3 Obtaining Hypotheses by Pose Clustering
  18.4 Obtaining Hypotheses Using Invariants
    18.4.1 Invariants for Plane Figures
    18.4.2 Geometric Hashing
    18.4.3 Invariants and Indexing
  18.5 Verification
    18.5.1 Edge Proximity
    18.5.2 Similarity in Texture, Pattern, and Intensity
  18.6 Application: Registration in Medical Imaging Systems
    18.6.1 Imaging Modes
    18.6.2 Applications of Registration
    18.6.3 Geometric Hashing Techniques in Medical Imaging
  18.7 Curved Surfaces and Alignment
  18.8 Notes
  Problems

19 SMOOTH SURFACES AND THEIR OUTLINES
  19.1 Elements of Differential Geometry
    19.1.1 Curves
    19.1.2 Surfaces
  19.2 Contour Geometry
    19.2.1 The Occluding Contour and the Image Contour
    19.2.2 The Cusps and Inflections of the Image Contour
    19.2.3 Koenderink's Theorem
  19.3 Notes
  Problems

20 ASPECT GRAPHS
  20.1 Visual Events: More Differential Geometry
    20.1.1 The Geometry of the Gauss Map
    20.1.2 Asymptotic Curves
    20.1.3 The Asymptotic Spherical Map
    20.1.4 Local Visual Events
    20.1.5 The Bitangent Ray Manifold
    20.1.6 Multilocal Visual Events
  20.2 Computing the Aspect Graph
    20.2.1 Step 1: Tracing Visual Events
    20.2.2 Step 2: Constructing the Regions
    20.2.3 Remaining Steps of the Algorithm
    20.2.4 An Example
  20.3 Aspect Graphs and Object Localization
  20.4 Notes
  Problems

21 RANGE DATA
  21.1 Active Range Sensors
  21.2 Range Data Segmentation
    21.2.1 Elements of Analytical Differential Geometry
    21.2.2 Finding Step and Roof Edges in Range Images
    21.2.3 Segmenting Range Images into Planar Regions
  21.3 Range Image Registration and Model Acquisition
    21.3.1 Quaternions
    21.3.2 Registering Range Images Using the Iterative Closest-Point Method
    21.3.3 Fusing Multiple Range Images
  21.4 Object Recognition
    21.4.1 Matching Piecewise-Planar Surfaces Using Interpretation Trees
    21.4.2 Matching Free-Form Surfaces Using Spin Images
  21.5 Notes
  Problems

Part VI  High-Level Vision: Probabilistic and Inferential Methods

22 FINDING TEMPLATES USING CLASSIFIERS
  22.1 Classifiers
    22.1.1 Using Loss to Determine Decisions
    22.1.2 Overview: Methods for Building Classifiers
    22.1.3 Example: A Plug-in Classifier for Normal Class-conditional Densities
    22.1.4 Example: A Nonparametric Classifier Using Nearest Neighbors
    22.1.5 Estimating and Improving Performance
  22.2 Building Classifiers from Class Histograms
    22.2.1 Finding Skin Pixels Using a Classifier
    22.2.2 Face Finding Assuming Independent Template Responses
  22.3 Feature Selection
    22.3.1 Principal Component Analysis
    22.3.2 Identifying Individuals with Principal Components Analysis
    22.3.3 Canonical Variates
  22.4 Neural Networks
    22.4.1 Key Ideas
    22.4.2 Minimizing the Error
    22.4.3 When to Stop Training
    22.4.4 Finding Faces Using Neural Networks
    22.4.5 Convolutional Neural Nets
  22.5 The Support Vector Machine
    22.5.1 Support Vector Machines for Linearly Separable Datasets
    22.5.2 Finding Pedestrians Using Support Vector Machines
  22.6 Notes
  Problems
  22.7 Appendix I: Backpropagation
  22.8 Appendix II: Support Vector Machines for Datasets That Are Not Linearly Separable
  22.9 Appendix III: Using Support Vector Machines with Non-Linear Kernels

23 RECOGNITION BY RELATIONS BETWEEN TEMPLATES
  23.1 Finding Objects by Voting on Relations between Templates
    23.1.1 Describing Image Patches
    23.1.2 Voting and a Simple Generative Model
    23.1.3 Probabilistic Models for Voting
    23.1.4 Voting on Relations
    23.1.5 Voting and 3D Objects
  23.2 Relational Reasoning Using Probabilistic Models and Search
    23.2.1 Correspondence and Search
    23.2.2 Example: Finding Faces
  23.3 Using Classifiers to Prune Search
    23.3.1 Identifying Acceptable Assemblies Using Projected Classifiers
    23.3.2 Example: Finding People and Horses Using Spatial Relations
  23.4 Technique: Hidden Markov Models
    23.4.1 Formal Matters
    23.4.2 Computing with Hidden Markov Models
    23.4.3 Varieties of HMMs
  23.5 Application: Hidden Markov Models and Sign Language Understanding
    23.5.1 Language Models: Sentences from Words
  23.6 Application: Finding People with Hidden Markov Models
  23.7 Notes

24 GEOMETRIC TEMPLATES FROM SPATIAL RELATIONS
  24.1 Simple Relations between Object and Image
    24.1.1 Relations for Curved Surfaces
    24.1.2 Class-Based Grouping
  24.2 Primitives, Templates, and Geometric Inference
    24.2.1 Generalized Cylinders as Volumetric Primitives
    24.2.2 Ribbons
    24.2.3 What Can One Represent with Ribbons?
    24.2.4 Linking 3D and 2D for Cylinders of Known Length
    24.2.5 Linking 3D and Image Data Using Explicit Geometric Reasoning
  24.3 Afterword: Object Recognition
    24.3.1 The Facts on the Ground
    24.3.2 Current Approaches to Object Recognition
    24.3.3 Limitations
  24.4 Notes
  Problems

Part VII  Applications

25 APPLICATION: FINDING IN DIGITAL LIBRARIES
  25.1 Background: Organizing Collections of Information
    25.1.1 How Well Does the System Work?
    25.1.2 What Do Users Want?
    25.1.3 Searching for Pictures
    25.1.4 Structuring and Browsing
  25.2 Summary Representations of the Whole Picture
    25.2.1 Histograms and Correlograms
    25.2.2 Textures and Textures of Textures
  25.3 Representations of Parts of the Picture
    25.3.1 Segmentation
    25.3.2 Template Matching
    25.3.3 Shape and Correspondence
    25.3.4 Clustering and Organizing Collections
  25.4 Video
  25.5 Notes

26 APPLICATION: IMAGE-BASED RENDERING
  26.1 Constructing 3D Models from Image Sequences
    26.1.1 Scene Modeling from Registered Images
    26.1.2 Scene Modeling from Unregistered Images
  26.2 Transfer-Based Approaches to Image-Based Rendering
    26.2.1 Affine View Synthesis
    26.2.2 Euclidean View Synthesis
  26.3 The Light Field
  26.4 Notes
  Problems

BIBLIOGRAPHY

INDEX