An intuitive way to visualize how SVD works

August 23, 2018

SVD is the core algorithm of Latent Semantic Analysis. I demonstrated how to perform LSA/LSI using SVD in an earlier post on this blog. However, SVD itself is not an easy thing to understand. Fully understanding how it works requires quite a lot of linear algebra. We all agree math is important, but not everyone needs to understand the math behind SVD in order to use it. Is there an intuitive way to see how it works? I mean, without any equations and formulas? Yes, there is. Today, I am going to let you SEE how it works ( :

I downloaded a random free picture from the internet: an adorable parrot.

To turn the image into a matrix that SVD can be applied to, I first need to convert it to gray scale, and then perform SVD. The original gray-scale image is 2000 x 3000 (shown below).
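For readers who want to try this step themselves, here is a minimal sketch using NumPy. It is not the code used for this post: the `to_grayscale` helper, the luminance weights, and the small random array standing in for the parrot photo are all assumptions made for illustration.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB array into an (H, W) gray-scale matrix.

    Hypothetical helper; the weights are one common luminance convention,
    not necessarily what was used for the images in this post.
    """
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3] @ weights

# Stand-in for the photo: a small random RGB array (an assumption, so the
# sketch runs without any image file). A real photo could be loaded into
# a NumPy array with an imaging library instead.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(20, 30, 3)).astype(float)

gray = to_grayscale(rgb)  # one number per pixel: a plain 2-D matrix

# Thin SVD: gray = U @ diag(s) @ Vt, with singular values sorted largest first
U, s, Vt = np.linalg.svd(gray, full_matrices=False)

print(gray.shape, U.shape, s.shape, Vt.shape)
```

The singular values in `s` come back sorted from largest to smallest, which is what makes "keep the most important components" a matter of simply keeping the first few.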

This is how it looks after we keep only the 300 most important components (2000 x 300). It doesn't look like it changed much, right?

This is how it looks at 2000 x 50. The image is definitely more smudged, and we can easily spot the difference now. However, we can still see that it is a parrot without any difficulty.

This is how it looks at 2000 x 10. It is a very harsh reduction. A lot of the background and of the bird's facial detail has been lost in this transformation. Even though we can still tell it is a bird, we might not be able to tell it is a parrot.
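The three reductions above can be imitated with a short sketch. This is an assumption about how such pictures can be produced, not the exact code behind the images here: keep the first k components, then multiply the truncated pieces back together so the result can be displayed at the original size. The `low_rank` helper and the synthetic 200 x 300 "image" (a smooth pattern plus noise) are hypothetical stand-ins, with k values scaled down to match.

```python
import numpy as np

def low_rank(matrix, k):
    """Rebuild a matrix from only its k largest SVD components."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    # U[:, :k] is the (rows x k) reduced part; scaling it by s[:k] and
    # multiplying by Vt[:k, :] brings it back to the original shape.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Synthetic gray-scale "image": smooth structure plus a little noise
rng = np.random.default_rng(0)
rows = np.linspace(0, 1, 200)[:, None]
cols = np.linspace(0, 1, 300)[None, :]
gray = 255 * (0.5 + 0.5 * np.sin(6 * rows) * np.cos(4 * cols))
gray = gray + rng.normal(0, 5, size=(200, 300))

errors = {}
for k in (30, 5, 1):
    approx = low_rank(gray, k)
    # Relative error: how much of the picture was lost by the reduction
    errors[k] = np.linalg.norm(gray - approx) / np.linalg.norm(gray)
    print(f"k={k}: shape {approx.shape}, relative error {errors[k]:.4f}")
```

The reconstructed matrix always has the same shape as the original; only the number of components used to build it shrinks, and the error grows as k gets smaller.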

Conclusion

The main idea of LSI is to perform dimension reduction on our documents' features. The more we reduce, the more information we lose. However, getting rid of some unimportant information is not a bad thing at all. For example, the 2000 x 50 image still tells us pretty much everything we could tell from the original image, and the reduction will make future computation much easier.
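To put a rough number on "much easier", here is a back-of-the-envelope sketch. It assumes we store the truncated SVD pieces (k columns of U, k singular values, k rows of Vt) instead of the full 2000 x 3000 matrix; the `stored_values` helper is hypothetical.

```python
def stored_values(rows, cols, k):
    """Count the numbers kept by a rank-k SVD: k columns of U,
    k singular values, and k rows of Vt."""
    return k * (rows + cols + 1)

rows, cols = 2000, 3000
full = rows * cols  # every pixel of the original gray-scale image

for k in (300, 50, 10):
    kept = stored_values(rows, cols, k)
    print(f"k={k}: {kept} values, {kept / full:.1%} of the original")
```

Under that assumption, the k = 50 version needs only a few percent of the numbers in the original image.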
