Types of Sampling and Sampling Techniques

Here’s a scenario I’m sure you are familiar with. You download a relatively big dataset and are excited to get started with analyzing it and building your machine learning model. And snap – your machine gives an “out of memory” error while trying to load the dataset. In such cases, employing efficient data loading techniques and utilizing types of sampling can help manage memory constraints and facilitate analysis.

It’s happened to the best of us. It’s one of the biggest hurdles we face in data science – dealing with massive amounts of data on computationally limited machines (not all of us have Google’s resource power!).

So how can we overcome this perennial problem? Is there a way to pick a subset of the data and analyze that – and that can be a good representation of the entire dataset?

types_of_sampling

Yes! And that method is called types of sampling. I’m sure you’ve come across this term a lot during your school/university days, and perhaps even in your professional career. Sampling is a great way to pick up a subset of the data and analyze that. But then – should we just pick up any subset randomly?

Well, we’ll discuss that in this article. We will talk about eight different types of sampling techniques and where you can use each one. This is a beginner-friendly article but some knowledge about descriptive statistics will serve you well.

If you’re new to statistics and data science, I encourage you to check out our two popular courses:

Table of contents