Significance of Statistics in Data Science

History of statistics

The word statistics appears to derive from the Latin word status, the Italian word statista, or the German word Statistik, each of which refers to a political state. In ancient times, governments collected information about the population and about the property or wealth of the country: the former gave the government an idea of the country's manpower, and the latter gave it a basis for introducing new taxes and levies. The theoretical development of modern statistics began in the mid-seventeenth century with the introduction of the theory of probability. In India, Prof. Prasanta Chandra Mahalanobis is known as the father of Indian statistics.

What is Statistics?

Statistics has been defined differently by different authors over time. One definition describes it as the science that deals with the collection, classification, and tabulation of numerical facts as the basis for the explanation, description, and comparison of phenomena. Another describes it as the discipline concerned with the collection, organization, analysis, interpretation, and presentation of data. Statistics allows companies to collect data and translate it into information, so that decisions can be based on facts rather than on intuition, gut feel, or past experience. It is like a powerful microscope that makes visible what was previously invisible, and it is a tool that separates common-sense reasoning from extraordinary reasoning. Statistics creates a foundation for quality, which translates into profitability and market share. It has been called the science of counting and the science of averages. It has also been described as the science of the measurement of the social organism, regarded as a whole in all its manifestations.

Importance of Statistics



The basics of statistics include the terminology and methods used to apply statistics in data science. Statistics is the essential tool for analyzing data. In modern times, statistics is viewed not as a mere device for collecting numerical data but as a means of developing sound techniques for handling and analyzing data and for drawing valid inferences from it. Statistical concepts provide insight into the data and make quantitative analysis possible. Statistics now finds wide application in almost all sciences, social as well as physical, including biology, psychology, education, economics, and business management. It is hard to name even a single area of human activity in which statistics plays no part. It has become indispensable in all phases of human endeavor.

Statistics for data science


Data scientists use statistical concepts to determine whether observed differences are significant. Suppose you are a bulb manufacturer and you want to test the life of your bulbs. Data scientists can help you decide what sample size to assign to the experimental group to get clear results, and how to run the study while spending as little money as possible. Statistics is also used to build models that predict accurately, with less noise and lower error. It helps in understanding the data, for which data visualization is the main statistical tool, and it helps in making accurate estimates and solving business problems. Most data analysts use statistical concepts such as measures of central tendency (mean, median, mode), measures of variability (range, variance, standard deviation), probability distributions, population and sampling, confidence intervals and hypothesis testing, correlation, and regression analysis.

Many leading companies now use data science, including IBM, Amazon, Google, Oracle, JP Morgan Chase, Facebook, Twitter, Instagram, and Apple.

Today, many people prefer to travel by Uber, a popular and fast-growing ride-sharing service. Booking an Uber is simple: you set the pickup location, request a car with the press of a button, take the ride, and pay with a click. On the product front, Uber's data team is behind all the predictive models that power the service, from predicting that "Your driver will be here in 3 minutes" to estimating fares, showing surge prices, and showing drivers heat maps of where to position themselves within the city. Uber's business success depends on its ability to create a positive user experience through statistical data analysis.
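To make the bulb-testing example from this section more concrete, here is a minimal sketch in Python using only the standard library. The lifetime values are made up for illustration, the helper name `required_sample_size` is hypothetical, and both the confidence interval and the sample-size formula assume a normal approximation (for a sample this small, a t-based interval would usually be more appropriate).

```python
import math
import statistics

# Hypothetical bulb lifetimes in hours (illustrative data, not real measurements).
lifetimes = [1180, 1210, 1195, 1250, 1170, 1210, 1230, 1190, 1205, 1160]

# Measures of central tendency
mean = statistics.mean(lifetimes)
median = statistics.median(lifetimes)
mode = statistics.mode(lifetimes)

# Measures of variability
value_range = max(lifetimes) - min(lifetimes)
variance = statistics.variance(lifetimes)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(lifetimes)      # sample standard deviation

# Approximate 95% confidence interval for the mean lifetime.
# Assumption: normal approximation with the sample standard deviation.
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * std_dev / math.sqrt(len(lifetimes))
ci = (mean - margin, mean + margin)


def required_sample_size(sigma, margin_of_error, confidence=0.95):
    """Smallest n so that the confidence interval for a mean has half-width
    at most margin_of_error, assuming a known standard deviation sigma and
    a normal approximation: n = (z * sigma / E) ** 2, rounded up."""
    z_value = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_value * sigma / margin_of_error) ** 2)


print("mean:", mean, "median:", median, "mode:", mode)
print("range:", value_range, "variance:", variance, "std dev:", std_dev)
print("approx. 95% CI for the mean:", ci)
print("bulbs needed for +/- 10 hours if sigma is about 30:",
      required_sample_size(sigma=30, margin_of_error=10))
```

The sample-size helper shows the cost trade-off mentioned above: halving the acceptable margin of error roughly quadruples the number of bulbs that must be tested.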

Swiggy, founded in 2014, is India's largest online food ordering and delivery platform. It is based in Bangalore, India, and as of March 2019 was operating in 100 Indian cities. In early 2019, Swiggy expanded into general product deliveries under the name Swiggy Stores. Swiggy uses statistical graphs and charts to identify the types of food customers order most often, and to track trends in the business's profits and losses over past years.

