Tuesday, October 23, 2012

R Study Note: Understanding Objects and Data

R is a powerful statistical software environment that is widely used to work with large data sets. It is programmable, which means it can be customized.

This is a study note based on material from http://cran.r-project.org/, which hosts very helpful study materials.

R is an object-oriented language with a strong focus on statistics. To master it, I need to work through the following:

1. What the objects in R are, and how to recognize them
2. How to operate on these objects in R
3. A basic understanding of function syntax

In object-oriented programming, each object has attributes and methods. I will first list the objects commonly used in R, and then their attributes. As for the methods, most people already know how to operate on these objects, and if we get it wrong we will get an error message anyway :)
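As a small illustration of attributes and methods, the sketch below inspects a factor and a matrix using base R functions only. The variable names (sizes, m) are made up for this example and do not come from the CRAN material.

```r
# A factor carries two attributes: its levels and its class.
sizes <- factor(c("small", "large", "small"))
attributes(sizes)      # a list with components $levels and $class
class(sizes)           # "factor"

# A matrix is a vector that also has a "dim" attribute.
m <- matrix(1:6, nrow = 2)
attributes(m)          # a list with component $dim

# Generic functions such as summary() choose a method based on the class.
summary(sizes)         # counts per level
summary(m)             # per-column summaries
```

Calling attributes() and class() on an unfamiliar object is usually the quickest way to recognize what kind of object it is.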

I did not include "expression" or "function" as objects, because they are mostly conceptual objects and have little effect on the data analysis stage.

OBJECT TYPES:

1. Vector
    R has six basic ("atomic") vector types: logical, integer, real, complex, string (or character) and raw. Note that typeof() reports real vectors as "double" and string vectors as "character".

2. Matrix
    All columns in a matrix must have the same mode (numeric, character, etc.) and the same length.

3. Array
    Arrays are similar to matrices but can have more than two dimensions.

4. Data Frame
    A data frame is more general than a matrix, in that different columns can have different modes (numeric, character, factor, etc.). It is similar to a SAS or SPSS data set, so it is usually the better fit when dealing with real statistical data.

5. List
    Generic vectors: an ordered collection of objects (components). A list allows you to gather a variety of (possibly unrelated) objects under one name. This is more general than a data frame; in fact, a data frame is a list whose components all have the same length.

6. Factors
    Factors are used to describe items that can have a finite number of values (gender, social class, etc.). A factor has a levels attribute and class "factor".
    Factors are powerful because they group observations that share the same value under the same level.
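
The six object types above can be sketched in a few lines of base R. This is only a minimal illustration: all variable names and the toy data (names, ages, genders) are invented for the example.

```r
# 1. Atomic vectors: typeof() reports the storage type.
v_lgl <- c(TRUE, FALSE)
v_int <- 1:3
v_dbl <- c(1.5, 2)        # "real" values are stored as "double"
v_cpl <- 1 + 2i
v_chr <- c("a", "b")      # "string" values are stored as "character"
v_raw <- as.raw(255)
types <- sapply(list(v_lgl, v_int, v_dbl, v_cpl, v_chr, v_raw), typeof)
types

# 2. Matrix: a single mode, two dimensions.
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)

# 3. Array: same idea, but with more than two dimensions.
a <- array(1:24, dim = c(2, 3, 4))
dim(a)

# 4. Data frame: columns of equal length, possibly of different modes.
df <- data.frame(
  name   = c("Ann", "Bob", "Cy"),
  age    = c(31, 45, 28),
  gender = factor(c("F", "M", "M"))
)
str(df)

# 5. List: components may differ in both type and length.
lst <- list(id = 1L, tags = c("x", "y"), table = df)
names(lst)
is.list(df)               # TRUE: a data frame is itself a list of columns

# 6. Factor: distinct values become levels; table() counts each level.
levels(df$gender)
table(df$gender)
```

Running str() on any of these objects gives a compact view of its type and contents, which helps when deciding which structure a data set should live in.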
