A Visual Intro to NumPy and Data Representation – Jay Alammar – Visualizing machine learning one concept at a time.

Discussions: Hacker News (366 points, 21 comments), Reddit r/MachineLearning (256 points, 18 comments). Translations: Chinese 1, Chinese 2, Japanese.

The NumPy package is the workhorse of data analysis, machine learning, and scientific computing in the Python ecosystem. It vastly simplifies manipulating and crunching vectors and matrices. Some of Python’s leading packages rely on NumPy as a fundamental piece of their infrastructure (examples include scikit-learn, SciPy, pandas, and TensorFlow). Beyond the ability to slice and dice numeric data, mastering NumPy will give you an edge when dealing with and debugging advanced use cases in these libraries. In this post, we’ll look at some of the main ways to use NumPy and how it can represent different types of data (tables, images, text, etc.) before we can serve them to machine learning models.

Source : A Visual Intro to NumPy and Data Representation – Jay Alammar – Visualizing machine learning one concept at a time.
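
As a small illustration of the vector and matrix handling the excerpt above describes, here is a minimal sketch. The variable names and the example shapes (a 4x3 table, a 28x28 grayscale image) are assumptions chosen for illustration and are not taken from the linked post.

```python
import numpy as np

# A vector and a matrix built from plain Python lists.
vector = np.array([1, 2, 3])
matrix = np.array([[1, 2], [3, 4], [5, 6]])

# Elementwise arithmetic needs no explicit loop.
doubled = vector * 2

# Slicing: the first two rows of the second column.
column = matrix[:2, 1]

# Aggregation along an axis: the maximum of each column.
col_max = matrix.max(axis=0)

# Hypothetical data layouts (assumed shapes, for illustration only):
# a table with 4 rows and 3 features, and a 28x28 grayscale image.
table = np.zeros((4, 3))
image = np.zeros((28, 28), dtype=np.uint8)

print(doubled, column, col_max, table.shape, image.shape)
```

The same array type covers all of these cases; only the shape and dtype change, which is what makes it a convenient common format to hand to other libraries.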

Dealing With Duplicate Files – Control-Alt-Backspace

In the physical world, we encounter much difficulty because it’s hard to create copies of things: objects use finite resources and are expensive to produce, we have to physically repeat tasks over and over to do them multiple times, and so on. Ironically, in the digital world, many problems instead stem from how easy it is to copy things: some people make unauthorized copies of media and anger the distribution companies; others get doxxed and material they never wanted to share with anyone makes its way to millions of people; and all of us end up with four copies of the same set of files wasting our hard drive space and preventing us from remembering where we put things!

Source : Dealing With Duplicate Files – Control-Alt-Backspace