Machine Learning For Analytics

Introduction to Statistics

Welcome to ML for Analytics! In today’s post, we will talk about statistics for data science. Let’s begin our discussion by paying attention to an important quote:

Data by itself is useless. Data is only useful if you apply it.

-Todd Park

Truly said. Data is everywhere around us. The tweets we write, the videos we like, the posts we publish, the images we click on: all of these generate heaps of data. Today’s world is filled with data, and if this data is used wisely, we can draw important conclusions from it.

But how?

Well, businesses look for patterns in the data they collect. These patterns can then be used to make important inferences, which can generate huge profits for them. Data science is the art of analyzing the data generated by businesses to find meaningful patterns, which can then be interpreted to make inferences and generate profit.

In doing all this, data practitioners rely on an important “grammar of science” called “statistics”.

What is statistics?

Statistics can be thought of as the science of collecting data so that it can be tabulated and analysed to make inferences from it. To understand more about it, consider the following statements:

In the above statements, the numerical figures are nothing but statistics. Statistical interpretation is carried out on a given data set. A data set comprises data elements, variables and observations.
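To make the three parts of a data set concrete, here is a minimal sketch in Python. The people, ages and cities are made-up illustration values, and the definitions used in the comments (an element is the entity being measured, a variable is a characteristic of interest, an observation is the set of measurements for one element) are the common textbook ones, assumed here rather than taken from this post.

```python
# Hypothetical data set: each row describes one element (a surveyed person).
data_set = [
    {"name": "Asha", "age": 34, "city": "Pune"},
    {"name": "Ben", "age": 27, "city": "Leeds"},
    {"name": "Chen", "age": 41, "city": "Wuhan"},
]

# Elements: the entities on which data is collected.
elements = [row["name"] for row in data_set]

# Variables: the characteristics recorded for every element.
variables = [key for key in data_set[0] if key != "name"]

# Observations: the set of measurements collected for each element.
observations = {
    row["name"]: {var: row[var] for var in variables} for row in data_set
}

# A statistic condenses the data set into a single numerical figure.
mean_age = sum(row["age"] for row in data_set) / len(data_set)

print(elements)
print(variables)
print(mean_age)
```

Here the data set has three elements, two variables and three observations, and the mean age is one statistic computed from it.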

Measurement scales

Having talked about the data set, let’s now talk about the different scales of measurement. Measurement scales are of two types: qualitative and quantitative.

Qualitative measurement scales are of two types: nominal and ordinal.

Quantitative measurement scales are also of two types: interval and ratio.
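As a rough illustration, the sketch below assumes the commonly taught taxonomy of four scales (nominal and ordinal as qualitative, interval and ratio as quantitative) and shows which kind of summary is meaningful on each. All values are invented for the example.

```python
from statistics import mean, mode

# Nominal (qualitative): labels with no natural order, so we can only count.
blood_groups = ["A", "O", "B", "O", "AB", "O"]
most_common_group = mode(blood_groups)

# Ordinal (qualitative): labels with an order, so a median makes sense.
rating_order = ["poor", "average", "good", "excellent"]
ratings = ["good", "poor", "excellent", "good", "average"]
ranks = sorted(rating_order.index(r) for r in ratings)
median_rating = rating_order[ranks[len(ranks) // 2]]

# Interval (quantitative): differences are meaningful, but zero is arbitrary
# (0 degrees Celsius does not mean "no temperature").
temps_celsius = [20.0, 25.0, 30.0]
mean_temp = mean(temps_celsius)

# Ratio (quantitative): there is a true zero, so ratios are meaningful.
weights_kg = [40.0, 80.0]
weight_ratio = weights_kg[1] / weights_kg[0]

print(most_common_group, median_rating, mean_temp, weight_ratio)
```

The point of the sketch is that the scale decides the arithmetic: averaging blood groups is meaningless, while saying that 80 kg is twice 40 kg is perfectly valid.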

Samples and population

Now, let’s talk about the samples and population.

Population

A population is the set of all elements that are of interest to us. For example, in a survey related to exit polls, all citizens above 18 years of age are of interest, and hence they form our population.

Sample

A sample is a subset of the population. It is very difficult to ask every person above 18 years of age about their preferred candidate. So, it is wiser to stop a random person every 5 minutes and ask about their preference. The set of people surveyed then becomes our sample. A random sample is the closest approximation of the population.
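Drawing a random sample can be sketched in a few lines of Python. The voter IDs, the population size and the sample size below are hypothetical; the standard library's `random.sample` picks elements without replacement, so nobody is surveyed twice.

```python
import random

# Fixed seed so the sketch is reproducible; any seed would do.
rng = random.Random(42)

# Hypothetical population: 10,000 eligible voters identified by an ID.
population = [f"voter_{i}" for i in range(1, 10001)]

# Simple random sample of 200 voters, drawn without replacement.
sample = rng.sample(population, k=200)

print(len(population), len(sample))
print(sample[:5])
```

Every voter has the same chance of being picked, which is what lets the sample stand in for the whole population.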

Sample survey

An important point to note here is that a survey carried out on the entire population is called a census, while a survey carried out on a sample is called a sample survey. Sample statistics are estimates of the population parameters (characteristics).
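The difference between a census and a sample survey can be sketched as follows. The vote counts are invented: we assume a population of 10,000 voters in which 55% prefer candidate A, so that the population parameter is known and can be compared with the sample statistic.

```python
import random

rng = random.Random(0)

# Hypothetical population: 1 means "prefers candidate A", 0 means otherwise.
population_votes = [1] * 5500 + [0] * 4500

# Census: ask everyone, which gives the population parameter exactly.
parameter = sum(population_votes) / len(population_votes)

# Sample survey: ask only 500 randomly chosen voters.
sample_votes = rng.sample(population_votes, k=500)

# The sample statistic is an estimate of the population parameter.
statistic = sum(sample_votes) / len(sample_votes)

print(f"population parameter (census): {parameter:.3f}")
print(f"sample statistic (survey):     {statistic:.3f}")
```

The sample proportion will usually land close to, but not exactly on, the population proportion. That gap is the price we pay for surveying 500 people instead of 10,000.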

So, guys, stay tuned for more informative tutorials on business analytics. In the next tutorial, we will talk about descriptive statistics. For more updates and news related to this blog as well as to data science, machine learning and data visualization, please follow our Facebook page.
