Using unsupervised clustering in a supervised way.

K-means clustering is a method of vector quantization, originally from signal processing, that aims to group observations around their means (centroids). Let's start by clarifying the premise of the clustering case explored here: client segmentation. Client segmentation is the practice of partitioning an organization's clients into groups that reflect similarity among the clients in each group. The objective of such a breakdown is to determine how to relate to the clients in each segment so as to increase the worth of every client to the business.

One of the popular machine learning techniques for this is K-means clustering, one of the…
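As a minimal sketch of the idea, the snippet below segments clients with scikit-learn's K-means on synthetic data; the two features (annual spend and monthly visits) and the cluster count are illustrative assumptions, not taken from the article's dataset.

```python
# Illustrative sketch: segmenting synthetic clients with K-means.
# Features (annual spend, visits per month) are made up for the demo.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Two loose groups of clients: low-value and high-value.
low_value = rng.normal(loc=[200, 2], scale=[50, 1], size=(50, 2))
high_value = rng.normal(loc=[2000, 10], scale=[300, 2], size=(50, 2))
X = np.vstack([low_value, high_value])

# Scale features so that spend does not dominate the distance metric.
X_scaled = StandardScaler().fit_transform(X)

# Fit K-means with k=2 segments and read off one label per client.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

print(labels.shape)
```

In practice the number of segments would be chosen with a diagnostic such as the elbow method or silhouette score rather than fixed at two.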


Scrapy is an open-source framework for extracting data from websites. It is fast, simple, and extensible. Every data scientist should be familiar with it, as they often need to gather data this way. Data scientists usually prefer some sort of computational notebook for managing their workflow. Jupyter Notebook is very popular among data scientists, alongside options such as PyCharm, Zeppelin, VS Code, nteract, Google Colab, and Spyder, to name a few.

Scraping with Scrapy is often done from a .py file, but it can also be initialized from a notebook. …


Different methods for feature selection, why anyone should bother with it, and a comparison of approaches for choosing an optimal feature selection method on a housing dataset.

When working with a large dataset, modelling can be time-consuming to run because of the number of features. It is not uncommon to have hundreds of features for a model, and then it is critical to weed out irrelevant and subpar features. This is where the concept of feature selection comes into play. In this article, I will cover a few of the widely used techniques for feature selection and demo some of them.

Feature selection is an extremely important step in building a computationally efficient model. There are a number of techniques for this. …
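As a sketch of two such techniques, the snippet below compares a filter method (univariate F-scores via `SelectKBest`) with a wrapper method (recursive feature elimination) on synthetic regression data, which stands in here for the article's housing dataset.

```python
# Sketch of two common feature-selection approaches on synthetic
# regression data (a stand-in for the article's housing dataset).
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE, SelectKBest, f_regression
from sklearn.linear_model import LinearRegression

# 100 samples, 20 features, only 5 of which are actually informative.
X, y = make_regression(n_samples=100, n_features=20,
                       n_informative=5, random_state=0)

# Filter method: keep the 5 features with the strongest univariate F-score.
filter_sel = SelectKBest(score_func=f_regression, k=5).fit(X, y)
filter_idx = filter_sel.get_support(indices=True)

# Wrapper method: recursively eliminate features using a linear model.
rfe_sel = RFE(LinearRegression(), n_features_to_select=5).fit(X, y)
rfe_idx = rfe_sel.get_support(indices=True)

print("filter picked:", filter_idx)
print("RFE picked:   ", rfe_idx)
```

Filter methods are cheap and model-agnostic, while wrapper methods account for feature interactions at a higher computational cost; comparing their selections is a quick sanity check.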


Using BeautifulSoup, Scrapy, and Selenium for relatively small projects.

We need data to work on a data science project. Luckily, the Internet is full of data. We can obtain it by fetching readily available data from a source or by calling an API. Sometimes the data is behind a paywall, or it is not up to date. Then the only way to get the data is from the website itself. For a simple task, copying and pasting does the trick, but for a large amount of data spread across several pages that is impractical. In those scenarios, web scraping can help you extract any kind of data that you want…
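A minimal sketch of the idea with BeautifulSoup is shown below, extracting rows from an HTML table; the HTML is inlined here so the example runs without a network call, whereas a real scraper would first fetch the page (for instance with `requests.get(url).text`).

```python
# Minimal BeautifulSoup sketch: pulling rows out of an HTML table.
# The HTML is inlined for the demo; a real scraper would download it first.
from bs4 import BeautifulSoup

html = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>19.99</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = []
for tr in soup.find_all("tr")[1:]:  # skip the header row
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    rows.append({"product": cells[0], "price": float(cells[1])})

print(rows)
```

This copy-paste-scale pattern is exactly what makes scraping worthwhile once the same table spans many pages: the loop stays the same, only the fetched URL changes.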

Tamjid Ahsan
