
Fundamentals for Computer vision

Introduction

Vision is the perception of the world, and human vision is how people perceive and inspect the world around them. The same idea applies to computers, where it is called Computer Vision (CV).

Let’s take an example. You are sitting in front of your laptop with the webcam switched on. The computer does not know what is happening in front of the camera. If you want the computer to identify you, you have to give it a vision system that can extract features from the camera image and use them to identify you. Taking this a step further, you can set up the CV system so that if anybody other than you sits in front of the laptop, the system recognizes that it is someone else and the computer automatically switches off for security.

Computers work with bits, and we represent the real world to computers in the form of pixels. A pixel is one or more bytes representing a color. An image is a 2-D array of pixels and a video is a 1-D array of frames (images), so a video is ultimately a 3-D array of pixels. If you pass an image or video frames to a computer, it can "see" the data but cannot differentiate or inspect what is in it. So we have to build a vision system and a recognition system for that purpose.
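The array view of images and videos can be sketched in a few lines of Python. This is only an illustration under assumptions: the image is grayscale (one byte per pixel), nested lists stand in for arrays, and the helper name `shape` is made up for this example.

```python
# A tiny grayscale image: 2 rows x 3 columns, one byte (0-255) per pixel.
image = [
    [0, 128, 255],
    [64, 192, 32],
]


def shape(array):
    """Return the dimensions of a rectangular nested list (hypothetical helper)."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return tuple(dims)


# A video is a 1-D sequence of frames; here, 4 copies of the same image.
video = [[row[:] for row in image] for _ in range(4)]

image_shape = shape(image)  # (rows, columns)
video_shape = shape(video)  # (frames, rows, columns)
pixel = video[2][1][0]      # frame 2, row 1, column 0

print(image_shape, video_shape, pixel)
```

Indexing a video therefore takes three numbers (frame, row, column), which is what "3-D array of pixels" means in practice.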

Several terms are related to computer vision: Machine Vision, Pattern Recognition, Image Processing, and Image Understanding. They seem the same at first sight. Let’s look at how they differ and what each one is for.

Computer Vision System : Given an image, the CV system extracts features from it. Which features it extracts depends on the sophistication of the CV system.

  • Input : Image
  • Output : features (Vectors)

Pattern Recognition : Pattern recognition takes the features from the CV system and interprets them. It finds the class of the image.

  • Input : Features
  • Output : Interpretation & Decision
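The two stages can be sketched as a small pipeline: a CV stage that turns an image into a feature vector, and a pattern recognition stage that turns features into a class. Everything here is an assumption made for illustration: the function names `extract_features` and `classify`, the choice of features (mean brightness and contrast), and the nearest-prototype rule are not from any particular system.

```python
import math


def extract_features(image):
    """CV stage: image in, feature vector out (illustrative features only)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)       # average brightness
    contrast = max(pixels) - min(pixels)   # spread between darkest and brightest
    return (mean, contrast)


def classify(features, prototypes):
    """Pattern recognition stage: features in, class label out.

    Picks the label whose prototype feature vector is closest (Euclidean).
    """
    return min(prototypes, key=lambda label: math.dist(features, prototypes[label]))


# One example image per class, used to build the prototypes.
dark = [[10, 20], [30, 20]]
bright = [[200, 220], [240, 220]]
prototypes = {"dark": extract_features(dark), "bright": extract_features(bright)}

unknown = [[190, 210], [230, 250]]
label = classify(extract_features(unknown), prototypes)
print(label)
```

Note that `classify` never sees the pixels; it works only on the features, which is the separation between the two systems described above.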

Image Processing (IP) : It is classified into 2 types, low-level and high-level image processing.

  • Low-level IP : Noise reduction and simple filters
    • Input : Image
    • Output : Image
  • High-level IP : Machine vision and image understanding come under high-level IP. It inspects and understands images, then makes decisions and interpretations from them.
    • Input : Image
    • Output : Decision/Interpretation
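As an example of low-level IP, where both input and output are images, here is a minimal sketch of a 3x3 mean filter for noise reduction. The function name `mean_filter` and the border handling (averaging over the neighbors that exist) are choices made for this sketch, not the only way to do it.

```python
def mean_filter(image):
    """Return a new image where each pixel is the average of its 3x3 neighborhood.

    Border pixels are averaged over the neighbors that actually exist.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = sum(neighbors) // len(neighbors)  # integer pixel value
    return out


# A flat image with one noisy pixel in the middle.
noisy = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]
smoothed = mean_filter(noisy)
print(smoothed)
```

The noisy center value is pulled toward its neighbors, and the result is still an image of the same size.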

Neural nets are computational structures inspired by neurons in the human brain. They are currently one of the most active research topics.

Note: Sometimes the sophistication of the CV system eliminates the need for PR, and the CV system is able to make decisions and interpretations directly.

For example, a CV system with embedded neural nets can classify and make decisions itself, so a separate pattern recognition system is not required.


Essentials of Linear algebra

  1. Products
    • Inner product
    • Outer product
    • Cross product
  2. Space
    • Vector space
    • Null space
    • Function space
  3. Transformation
    • Linear
    • Orthonormal
  4. Decomposition
    • Eigen decomposition
    • Singular value decomposition
    • Eigenvectors and eigenvalues
  5. Function Minimization
    • Gradient descent
    • Local vs Global minima
    • Simulated annealing
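
To make the function minimization topics a little more concrete, here is a minimal sketch of gradient descent on a 1-D function with two minima. The function `f`, the learning rate, and the step count are arbitrary choices for illustration. Depending on the starting point, the same procedure ends in the local or the global minimum.

```python
def f(x):
    """Example function with a global minimum near x = -1.30 and a local one near x = 1.13."""
    return x**4 - 3 * x**2 + x


def grad_f(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x, learning_rate=0.01, steps=2000):
    """Repeatedly step against the gradient, starting from x."""
    for _ in range(steps):
        x -= learning_rate * grad_f(x)
    return x


x_left = gradient_descent(-2.0)   # starts on the left, reaches the global minimum
x_right = gradient_descent(2.0)   # starts on the right, gets stuck in the local minimum

print(x_left, f(x_left))
print(x_right, f(x_right))
```

Plain gradient descent cannot leave the local minimum once it is there, which is the motivation for methods such as simulated annealing.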
This post is licensed under CC BY 4.0 by the author.