The Random Science: Statistics

How to Perform Statistical Computations Using Microsoft Excel: Analysis of Variance for Complete Random Design

Statistical computation is one of the hardest parts of analyzing numerical data. Although specialized software exists for statistical computation and analysis, the same work can be done in Microsoft Excel, a program that is common and familiar to most of us.

A default installation of Microsoft Excel does not immediately present the tools needed for deeper statistical computation. These tools come with Excel as add-ins, but they are not enabled by default, so you have to install them before you can use them. In my previous article (Statistical Computing Using Microsoft Excel: Basic Statistics), I detailed the procedure for installing the add-ins in Microsoft Excel. Please see that article for the installation procedure.

How to Perform Statistical Computations using Microsoft Excel: Basic Statistics

Statistics is all about data, particularly quantitative information and even qualitative facts converted to numerical equivalents. So what is done with these data? After collecting and organizing them, we summarize and present them in a simplified, compressed form that is easy to understand. The much harder task is interpreting the results and the summary you have made.

With regard to summarizing and presenting statistical data, the key activities are computation and the creation of charts or tabular presentations. We can do these manually or with a calculator, but with a large volume of data such work becomes difficult. Microsoft Excel, however, can do the job in the blink of an eye. It is installed on almost every computer, so this simple step-by-step tutorial can be followed by anyone.

In statistics, one way of summarizing data is to compute the measures of central tendency (such as the mean, median, and mode) and the measures of variation (such as the standard deviation and variance). Here is a quick guide to computing them using Microsoft Excel.
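For readers who want to cross-check a spreadsheet result outside Excel, here is a minimal Python sketch of the same five measures. The data set is hypothetical and chosen only for illustration, and the sketch assumes sample (n - 1) formulas for the variance and standard deviation.

```python
import statistics

# Hypothetical data set, used only to illustrate the calculations.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most frequently occurring value

# Measures of variation (sample formulas, n - 1 in the denominator)
variance = statistics.variance(values)
std_dev = statistics.stdev(values)

print("Mean:", mean)
print("Median:", median)
print("Mode:", mode)
print("Variance:", round(variance, 3))
print("Standard deviation:", round(std_dev, 3))
```

If a population (n in the denominator) variance is wanted instead, `statistics.pvariance` and `statistics.pstdev` are the corresponding functions.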

Types of Variables

In studying statistics, we always deal with data, the raw materials of our investigations. To choose the right tool for analyzing them, we must first know what type of variable we are considering. A variable is a characteristic of each individual element of a population or sample. Understanding the nature of variables helps us analyze our data better.

There are two kinds of variables. Here is a brief summary of them and their descriptions:

1. Qualitative, Attribute, or Categorical Variable – a variable that categorizes or describes an element of a population

- classified according to some criterion

a. Nominal Variable – a qualitative variable that categorizes (or describes, or names) an element of a population

b. Ordinal Variable – a qualitative variable that incorporates an ordered position, or ranking

2. Quantitative, or Numerical Variable – a variable that quantifies an element of a population

- values are counts or measurements

a. Discrete Variable – a quantitative variable that can assume a countable number of values (such as counting numbers or integers), meaning there is a gap between any two adjacent values

b. Continuous Variable – a quantitative variable that can assume an uncountable number of values (including numbers with decimal points), meaning it can assume any value along a line interval

Examples:

X (Citizenship), X = American, English, Chinese, Filipino, Arabian, Japanese
Y (Number of rooms), Y = 1, 2, 3, …
Z (Value of house), 0 < Z < infinity
H (Health), H = poor, fair, good, excellent
A (Number of hospital admissions), A = 0, 1, 2, 3, …
V (Age at last birthday), V = 0, 1, 2, 3, …
W (True age), 0 < W < infinity
G (Gender), G = Male, Female
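The examples above can be sorted into the four variable types with a small lookup table. This is only a sketch: the type labels assigned to each example are this sketch's reading of the definitions, and the names `VariableType`, `EXAMPLES`, and `is_qualitative` are made up for illustration.

```python
from enum import Enum


class VariableType(Enum):
    NOMINAL = "qualitative, nominal"
    ORDINAL = "qualitative, ordinal"
    DISCRETE = "quantitative, discrete"
    CONTINUOUS = "quantitative, continuous"


# Each example variable paired with the type its values suggest.
EXAMPLES = {
    "Citizenship": VariableType.NOMINAL,                      # names, no order
    "Number of rooms": VariableType.DISCRETE,                 # counts
    "Value of house": VariableType.CONTINUOUS,                # any positive value
    "Health": VariableType.ORDINAL,                           # ranked categories
    "Number of hospital admissions": VariableType.DISCRETE,   # counts
    "Age at last birthday": VariableType.DISCRETE,            # whole years
    "True age": VariableType.CONTINUOUS,                      # any positive value
    "Gender": VariableType.NOMINAL,                           # names, no order
}


def is_qualitative(variable_type):
    """Return True for nominal and ordinal variables."""
    return variable_type in (VariableType.NOMINAL, VariableType.ORDINAL)


for name, variable_type in EXAMPLES.items():
    print(f"{name}: {variable_type.value}")
```

Note how the same underlying quantity, age, is discrete when recorded as age at last birthday and continuous when treated as true age.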

Exercises:

Identify each of the following as an attribute (qualitative) or numerical (quantitative) variable, and then classify it further as a nominal, ordinal, discrete, or continuous variable.

The length of time until a pain reliever begins to work.
The residence hall for each student in a statistics class.
The temperature in Barrow, Alaska at 12:00 pm on any given day.
The number of staples in a stapler.
Whether or not a 6-volt lantern battery is defective.
The pH level of the water in a swimming pool.
The number of colors in a statistics book.
The weight of a lead pencil.
The number of chocolate chips in a cookie.
The amount of gasoline pumped by the next 10 customers at the local Shell Station.
The color of the baseball cap worn by each of 20 students.
The type of book taken out of the library by an adult.
The make of automobile driven by each faculty member.
The number of files on a computer’s hard disk.
The brand of refrigerator in a home.

Two Major Areas of Statistics

There are several tools in statistics, such as graphs, correlation, analysis of variance, and various mean separation tests. Their specific usage depends upon the nature of the study. Yet whatever tool or method is used, it belongs to one of the two broad areas of statistics.

The first area is Descriptive Statistics. It comprises those methods concerned with collecting and describing a set of numerical data so as to yield meaningful information. Descriptive statistics provides information only about the collected data and in no way draws inferences beyond them. It can be graphical or computational, such as the construction of tables, charts, and graphs, and other relevant computations. It may also include the study of relationships between and among variables.

The second area is Inferential Statistics. Whereas descriptive statistics is concerned only with the presentation of data, inferential statistics comprises those methods concerned with the analysis of a subset of data leading to predictions or inferences about the entire set of data. It involves all the techniques by which decisions about a statistical population are made based only on a sample having been observed or a judgment having been obtained. It is concerned more with generalizing information or making inferences about the population. Considered the central function of modern statistics, inferential statistics deals with two types of problems: (a) estimation of population parameters, and (b) tests of hypotheses.

Example:

Height of Crops as Affected by Different Nitrogen Levels (in inches)

Level 1: 3.45 2.86 3.12 2.95 3.05 Mean – 3.086

Level 2: 4.66 5.54 4.75 5.33 5.24 Mean – 5.104

Level 3: 5.32 7.65 6.87 7.41 7.36 Mean – 6.922

Descriptive Statistics:

The shortest plant has a height of 2.86 inches while the tallest is 7.65 inches.
Plants in Level 1 have a mean height of 3.086 inches, while Levels 2 and 3 produce mean heights of 5.104 and 6.922 inches, respectively.
Level 1 nitrogen produced the shortest plants, while Level 3 produced the tallest.
Plants in the experiment have a grand mean of 5.037 inches.
Chart of Mean Height (in inches)
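The descriptive figures above can be reproduced with a few lines of Python. This is a minimal sketch using only the heights listed in the example; the variable names are chosen for illustration.

```python
from statistics import mean

# Plant heights (in inches) from the nitrogen example above.
heights = {
    "Level 1": [3.45, 2.86, 3.12, 2.95, 3.05],
    "Level 2": [4.66, 5.54, 4.75, 5.33, 5.24],
    "Level 3": [5.32, 7.65, 6.87, 7.41, 7.36],
}

# Pool every observation to describe the experiment as a whole.
all_heights = [h for values in heights.values() for h in values]

shortest = min(all_heights)
tallest = max(all_heights)
level_means = {level: round(mean(values), 3) for level, values in heights.items()}
grand_mean = round(mean(all_heights), 3)

print("Shortest plant:", shortest)
print("Tallest plant:", tallest)
for level, level_mean in level_means.items():
    print(f"Mean height, {level}: {level_mean}")
print("Grand mean:", grand_mean)
```

Everything computed here only describes the 15 plants that were measured; none of it makes a claim about plants outside the experiment.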

Inferential Statistics:

Based on the results, the height of plants increases as the level of nitrogen application increases.

Plant height in Level 1 is statistically different from that in Levels 2 and 3. However, Levels 2 and 3 are not significantly different from each other.

There is a strong correlation between the level of nitrogen application and plant height.
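Inferential statements like these rest on a formal test. As one illustration, the sketch below computes the F ratio of a one-way analysis of variance for the nitrogen data, assuming a completely randomized design with the three levels as treatments. It is not necessarily the test behind the statements above, and it covers only the overall F test, not the pairwise comparison of levels or the correlation.

```python
from statistics import mean

# Plant heights (in inches) from the nitrogen example above.
heights = {
    "Level 1": [3.45, 2.86, 3.12, 2.95, 3.05],
    "Level 2": [4.66, 5.54, 4.75, 5.33, 5.24],
    "Level 3": [5.32, 7.65, 6.87, 7.41, 7.36],
}

groups = list(heights.values())
all_values = [h for group in groups for h in group]
grand_mean = mean(all_values)

# Variation between treatments: how far each level mean sits from the grand mean.
ss_treatment = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# Variation within treatments (experimental error): spread around each level mean.
ss_error = sum((h - mean(g)) ** 2 for g in groups for h in g)

df_treatment = len(groups) - 1             # number of treatments minus 1
df_error = len(all_values) - len(groups)   # observations minus number of treatments

ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error
f_ratio = ms_treatment / ms_error

print("Treatment df:", df_treatment)
print("Error df:", df_error)
print("F ratio:", round(f_ratio, 2))
```

The computed F ratio is then compared with a tabulated F value for the same degrees of freedom at the chosen significance level. A computed value larger than the tabulated one suggests that at least one nitrogen level differs from the others.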

Research and Research Methods

Research aims to find answers to questions through the use of scientific methods. The main purpose of research is either to describe or to explain the phenomena under study. Different research problems imply different research goals, which in turn call for varied methods and techniques. However, not all problems or questions are researchable. The best research design is one that will add to knowledge no matter what the results are (Slavin, 1984).

Types of Research:

a. Pure Research – also called fundamental or basic research

- arises from a desire to know for the sake of knowing; academic in nature

- is driven by the scientist’s curiosity or interest in a scientific question

- main motivation is to expand man’s knowledge, not to create or invent something

- no obvious commercial value to the discoveries that result from this research

- e.g. Floral Biology and Pollination of Ampalaya (Momordica charantia L.),

Identification of Hybrid Sterility Gene Loci in Two Cytoplastic Male Sterile Lines in Rice

b. Applied Research – comes from a desire to gain knowledge for useful ends using and applying the theories derived from pure research; practical in nature

- goal of the applied scientist is to improve the human condition

- e.g. In vitro Regeneration of Sambong (Blumea balsamifera Linn.),

Development and Release in the Philippines of Sweet Potato Variety ‘NSIC Sp-31’

Types of Research Methods

a. Experimental Method – researcher manipulates a variable to see if it produces any changes in the response of interest

- applicable in all scientific disciplines and appropriate whenever practical and ethical

- only method by which the effect of manipulated variable can be isolated to detect cause-and-effect relationships

- done under highly controlled settings which may be considered as artificial and thus may not reflect what really happens in the less controlled and more complex real world

b. Correlation – determines the degree and direction of relationship between two or more variables or measures of behavior

- can be used when it is impractical and/or unethical to manipulate variables

c. Naturalistic Observation – researcher observes and records some behavior or phenomenon over a prolonged period

- researchers observe behavior in the setting in which it normally occurs rather than the artificial and limited setting of the laboratory

- may be used to validate some laboratory finding or theoretical concept

d. Survey – data collected via interviews or questionnaires to which subjects are asked to respond

- used extensively in the social and natural sciences to assess attitudes and opinions on a variety of subjects

- data may be affected by the interviewer or enumerator

e. Case Study – obtains a detailed contextual view of an individual’s life or of a particular phenomenon

- useful when researcher cannot, for practical or ethical reasons, do experimental studies

- involves one or a few individuals and therefore may not represent the population

Three Principles of Experimental Design

In a research study, the proper design and layout of the experiment are crucial to the success of the study. To accomplish this, the following principles must always be remembered and practiced:

a. Replication – repetition of the experiment on many experimental units

Functions:

1. To provide for an estimate of experimental error which is used for tests of significance;

2. To improve the precision of the experiment by reducing the standard error of the mean;

3. To increase the scope of inference of the experiment; and

4. To effect control of error variance.

Precision – the closeness to one another of a set of separate measurements of a quantity

Accuracy – closeness to the absolute or true value of the quantity measured

Coefficient of Variation – the standard deviation expressed as a percentage of the mean; indicates the relative variation within the data set
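Two of the quantities mentioned above lend themselves to a short worked sketch: the standard error of the mean, which shrinks as the number of replications grows, and the coefficient of variation. The function names below are made up for illustration, and the five sample heights are the Level 1 values from the nitrogen example earlier on this page.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error_from_sd(sd, replications):
    """Standard error of the mean for a given standard deviation and replication count."""
    return sd / sqrt(replications)


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return stdev(values) / mean(values) * 100


# With the standard deviation held fixed, more replications give a smaller standard error.
for r in (2, 4, 8, 16):
    print(f"{r} replications: SE = {standard_error_from_sd(2.0, r):.3f}")

# Level 1 plant heights (in inches) from the nitrogen example.
level_1 = [3.45, 2.86, 3.12, 2.95, 3.05]
print(f"CV of Level 1 heights: {coefficient_of_variation(level_1):.2f}%")
```

Because the standard error falls with the square root of the number of replications, quadrupling the replications halves it.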

Factors that determine the number of replications:

1. Uniformity of experimental units

2. Experimental designs

3. Degree of precision required

4. Number of treatments

5. Time allotment

6. Cost and availability of funds or resources

b. Randomization – use of chance to divide experimental units into groups

- a process of assigning the treatments among the experimental units such that every treatment has an equal chance of being assigned to any experimental unit
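One simple way to carry out such an assignment is to write each treatment once per replication, shuffle the list, and hand the labels out to the experimental units in order. The sketch below assumes a completely randomized design with equal replication; the function name `randomize_crd` and the treatment labels are made up for illustration.

```python
import random


def randomize_crd(treatments, replications, seed=None):
    """Randomly assign treatments to experimental units, each treatment equally often."""
    rng = random.Random(seed)  # a seed makes the layout reproducible
    # One label per experimental unit: every treatment repeated `replications` times.
    labels = [t for t in treatments for _ in range(replications)]
    rng.shuffle(labels)        # chance alone decides which unit gets which label
    # Experimental units are numbered starting from 1.
    return {unit: label for unit, label in enumerate(labels, start=1)}


layout = randomize_crd(["Level 1", "Level 2", "Level 3"], replications=5, seed=42)
for unit, treatment in layout.items():
    print(f"Unit {unit}: {treatment}")
```

Shuffling the full list of labels keeps the replication balanced, whereas drawing a treatment independently for each unit would not guarantee five units per treatment.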