by Param Khakhar



The world is certainly uncertain, and statistics is the subject that estimates this uncertainty! No other field of study deals with uncertainty so directly, although it always does so under certain assumptions. In this and some of the following articles, I’ll write about aspects of statistics that can help you better understand the numbers, claims, etc. that people make.

Statistical thinking can help you make wiser decisions, judge the trustworthiness of different people, and become a better citizen, consumer, voter, etc. Even in academia, most subjects involve the study of chance events, so you need a working knowledge of statistics. It would even help you become a better academic!

Being a statistician [means] you get to play in everyone’s backyard.

John Tukey

To better understand the process of statistical analysis, suppose that you are a statistician at a company and you have collected data on the number of employees absent on each day. You have daily records for over 30 weeks. Now you would probably look for patterns and, even before that, wonder what patterns might exist. A simple organization, such as grouping the number of absentees by day of the week, may give you some insights. Doing that, you notice that Mondays and Fridays have, on average, more absent employees than the other days, so you can split the weekdays into two groups: Monday and Friday in one, and the remaining weekdays in the other. Creating different summaries of the data in this way is known as statistical description. You may also use visualization techniques to discover patterns.
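The grouping described above can be sketched in a few lines of Python. This is only an illustration: the absentee counts are synthetic, and the baseline numbers, the function names, and the choice of five working days are all assumptions made for the example, not real company data.

```python
import random
from collections import defaultdict
from statistics import mean

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]

def simulate_absences(weeks=30, seed=0):
    """Return synthetic (day, absentees) records; not real data."""
    rng = random.Random(seed)
    # Hypothetical baselines, invented so that Mon/Fri run higher.
    baseline = {"Mon": 12, "Tue": 8, "Wed": 8, "Thu": 8, "Fri": 12}
    records = []
    for _ in range(weeks):
        for day in DAYS:
            noise = rng.randint(-3, 3)
            records.append((day, max(0, baseline[day] + noise)))
    return records

def mean_by_day(records):
    """Group the counts by day of the week and average each group."""
    groups = defaultdict(list)
    for day, count in records:
        groups[day].append(count)
    return {day: mean(groups[day]) for day in DAYS}

def mean_by_group(records):
    """Average Mon/Fri together and Tue-Thu together."""
    edge = [count for day, count in records if day in ("Mon", "Fri")]
    middle = [count for day, count in records if day not in ("Mon", "Fri")]
    return mean(edge), mean(middle)

records = simulate_absences()
by_day = mean_by_day(records)
edge_mean, mid_mean = mean_by_group(records)

for day in DAYS:
    print(f"{day}: {by_day[day]:.2f}")
print(f"Mon/Fri: {edge_mean:.2f}  Tue-Thu: {mid_mean:.2f}")
```

Nothing here goes beyond description: the code only summarizes the records it was given and says nothing yet about whether the pattern would hold elsewhere.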

Besides the methods of statistical description, there are also methods that statisticians use to generalize the detected patterns to a wider setting. This is known as statistical inference. What makes it valuable is that a generalization comes with an objective measure of the likelihood that it is correct. However, we can never be sure that a generalization is correct, because uncertainty is so pervasive in the real world.

Now, returning to your original task: your hypothesis of higher absence rates on Mondays and Fridays can be tested statistically, and we can also quantify how uncertain we are about it being correct. The competing hypothesis (called the null hypothesis in statistics) would be that the observed pattern arose just by chance and no such generalization holds. Interestingly, the wider setting of a generalization can also refer to the future, so in a way we can predict the future (with some uncertainty)! If your hypothesis turns out to be supported, you can certainly notify the board and go home.
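One way to put a number on "just by chance" is a permutation test, sketched below. It is a technique swapped in for illustration and not necessarily what a statistician would choose for this problem. The absentee counts, the function name `permutation_p_value`, and the number of shuffles are assumptions made up for the example.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """One-sided p-value: the share of random relabellings whose
    difference in means is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the day labels carry no information
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical absentee counts, invented for illustration only.
mon_fri = [12, 14, 11, 13, 15, 12, 10, 14, 13, 12]
mid_week = [8, 9, 7, 10, 8, 6, 9, 8, 7, 9, 10, 8, 7, 9, 8]

p_value = permutation_p_value(mon_fri, mid_week)
print(f"Estimated p-value: {p_value:.4f}")
```

A small p-value means that shuffling the day labels rarely reproduces a gap as large as the one observed, which counts as evidence against the chance-only explanation. It never proves the hypothesis, which is exactly the leftover uncertainty described above.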

Now, all of the above may seem paradoxical: what rules let you make predictions about the future, given the uncertainties of the world? The thing is that statistics relies on rules of logic that differ from those of traditional mathematics. Essentially, statistical description and statistical inference are the day-to-day roles of statistics.

There are several other byways, some of which aren’t discussed in the curriculum, such as:

  • Paradoxes of probability and statistics
  • Problems of using standard statistical techniques in non-standard situations
  • Social Impact of Statistics

How is Statistics Different from Mathematics?

  • Statisticians solve problems in real-life settings, in contrast to mathematicians. A mathematical problem abstracts away all the uncertainty and then proceeds to report a unique solution. However, once chance is allowed to play its role, the solution derived from mathematics is simply an approximation to the true value. A statistician’s answer is also an approximation, but along with the answer the statistician reports its uncertainty. Thus, the answer is useful only if that uncertainty is small.

  • Mathematicians also need data, which they plug into some general theorem to obtain a result. Statisticians proceed the other way round. They are given a particular sample of information (drawn from the entire population), and their job is to estimate properties of the population from that sample. Thus, for mathematicians, data refers to the values of non-random variables, whereas for statisticians, data refers to the values of random variables. Consequently, statisticians also have to watch out for erroneous data that might be present in the sample they have.
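To make "an answer plus its uncertainty" concrete, here is a minimal sketch that estimates a population mean from a sample and attaches a standard error and an approximate 95% interval. The sample values are hypothetical, the function name is made up, and the 1.96 multiplier assumes a normal approximation that may be too rough for a sample this small.

```python
import math
from statistics import mean, stdev

def estimate_with_uncertainty(sample, z=1.96):
    """Return (estimate, standard_error, (low, high)).

    Uses a normal approximation for the interval, which is an
    assumption and not always appropriate for small samples.
    """
    n = len(sample)
    estimate = mean(sample)
    std_error = stdev(sample) / math.sqrt(n)
    margin = z * std_error
    return estimate, std_error, (estimate - margin, estimate + margin)

# Hypothetical sample drawn from a larger population.
sample = [8, 9, 7, 10, 8, 6, 9, 8, 7, 9]

estimate, std_error, (low, high) = estimate_with_uncertainty(sample)
print(f"Estimate: {estimate:.2f} +/- {std_error:.2f}")
print(f"Approximate 95% interval: ({low:.2f}, {high:.2f})")
```

The interval narrows as the sample grows, which is one way of seeing why the answer is only as good as its uncertainty is small.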

Trivia

Consider three cities A, B, and C. Their annual average temperatures (in degrees Celsius) are 11.7, 25.2, and 27 respectively. Can we infer from this information that B is twice as hot as A in a year? Also, would it be correct to say that B and C have similar temperatures throughout the year?

Well, no. The thing is, we can’t, and we shouldn’t limit ourselves to comparisons of the annual means. For one, the zero of the Celsius scale is arbitrary, so a ratio such as “twice as hot” is not meaningful. It may also turn out that during some months B is only slightly hotter than A, so the gap between them is not preserved throughout the year. Likewise, B and C might have significantly different temperatures during a particular month.
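A small sketch shows how this can happen. The monthly temperatures below are invented for illustration and chosen only so that they average to the 25.2 and 27 given above; they are not real data for any city. The last lines also compare A and B as a ratio on the Celsius and Kelvin scales.

```python
from statistics import mean, stdev

# Hypothetical monthly means (degrees Celsius), invented to match the
# annual averages quoted in the text.
city_b = [24.0, 24.5, 25.0, 25.5, 26.0, 26.2,
          26.0, 25.8, 25.4, 25.0, 24.6, 24.4]  # steady all year
city_c = [15, 18, 23, 29, 34, 37,
          36, 35, 32, 28, 21, 16]              # strongly seasonal

mean_b, mean_c = mean(city_b), mean(city_c)
spread_b, spread_c = stdev(city_b), stdev(city_c)

# Largest difference between the two cities in any single month.
largest_gap = max(abs(b - c) for b, c in zip(city_b, city_c))

print(f"B: mean {mean_b:.1f}, std dev {spread_b:.1f}")
print(f"C: mean {mean_c:.1f}, std dev {spread_c:.1f}")
print(f"Largest monthly gap between B and C: {largest_gap:.1f}")

# "Twice as hot" depends on where the scale puts its zero.
celsius_ratio = 25.2 / 11.7
kelvin_ratio = (25.2 + 273.15) / (11.7 + 273.15)
print(f"B/A ratio in Celsius: {celsius_ratio:.2f}, in Kelvin: {kelvin_ratio:.2f}")
```

Two cities whose annual means differ by less than two degrees can still be far apart in a given month, and the "twice as hot" ratio shrinks to almost one when the same temperatures are expressed in Kelvin.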

It often turns out that the byways are more interesting than the material covered in the curriculum, so stay tuned.

Reference

  • A Panorama of Statistics