Data Manipulation with R

Springer Science & Business Media, Mar 19, 2008 - Computers - 152 pages

Since its inception, R has become one of the preeminent programs for statistical computing and data analysis. The ready availability of the program, along with a wide variety of packages and the supportive R community, makes R an excellent choice for almost any kind of computing task related to statistics. However, many users, especially those with experience in other languages, do not take advantage of the full power of R. Because of the nature of R, solutions that make sense in other languages may not be very efficient in R. This book presents a wide array of methods for reading data into R and manipulating that data efficiently.

In addition to the built-in functions, a number of readily available packages from CRAN (the Comprehensive R Archive Network) are also covered. All of the methods presented take advantage of the core features of R: vectorization, efficient use of subscripting, and the proper use of the varied functions in R that are provided for common data management tasks.

Most experienced R users discover that, especially when working with large data sets, it may be helpful to use other programs, notably databases, in conjunction with R. Accordingly, the use of databases in R is covered in detail, along with methods for extracting data from spreadsheets and datasets created by other programs. Character manipulation, while sometimes overlooked within R, is also covered in detail, allowing problems that are traditionally solved by scripting languages to be carried out entirely within R. For users with experience in other languages, guidelines for the effective use of programming constructs like loops are provided. Since many statistical modeling and graphics functions need their data presented in a data frame, techniques for converting the output of commonly used functions to data frames are provided throughout the book.

The book uses a variety of examples based on data sets included with R, along with easily simulated data sets, and is recommended for anyone using R who wishes to advance from simple examples to practical real-life data manipulation solutions.

Phil Spector is Applications Manager of the Statistical Computing Facility and Adjunct Professor in the Department of Statistics at the University of California, Berkeley.

  

What people are saying

User Review

Excellent book

Review: Data Manipulation with R

User Review  - Greg - Goodreads

Really handy book. It contains lots of tips and tricks for working with data. It is especially handy for someone who is an old fashioned S user who has come to R as less than a noob, but not familiar with all the extensions added since the original Bell Labs distro of S.


Contents

1 Data in R
1.2 Data Storage in R
1.3 Testing for Modes and Classes
1.5 Conversion of Objects
1.6 Missing Values

2 Reading and Writing Data
read.table
2.3 Comma- and Tab-Delimited Input Files
2.5 Extracting Data from R Objects
2.6 Connections
2.7 Reading Large Data Files
2.8 Generating Data
2.8.2 Random Numbers
2.9 Permutations
2.10 Working with Sequences
2.11 Spreadsheets
2.11.2 The gdata Package (All Platforms)
2.12 Saving and Loading R Data Objects
2.13 Working with Binary Files
2.14 Writing R Objects to Files in ASCII Format
2.14.2 The write.table Function

3 R and Databases
3.1.2 Basics of SQL
3.1.3 Aggregation
3.1.4 Joining Two Databases
3.1.5 Subqueries
3.1.6 Modifying Database Records
3.2 ODBC
3.3 Using the RODBC Package
3.4 The DBI Package
3.6 Performing Queries
3.8 Getting Data into MySQL
3.9 More Complex Aggregations

4 Dates
4.2 The chron Package
4.3 POSIX Classes
4.4 Working with Dates
4.5 Time Intervals
4.6 Time Sequences

5 Factors
5.2 Numeric Factors
5.4 Creating Factors from Continuous Variables
5.5 Factors Based on Dates and Times
5.6 Interactions

6 Subscripting
6.4 Logical Subscripts
6.5 Subscripting Matrices and Arrays
6.6 Specialized Functions for Matrices
6.7 Lists
6.8 Subscripting Data Frames

7 Character Manipulation
7.3 Working with Parts of Character Values
7.4 Regular Expressions in R
7.5 Basics of Regular Expressions
7.6 Breaking Apart Character Values
7.7 Using Regular Expressions in R
7.8 Substitutions and Tagging

8 Data Aggregation
8.2 Road Map for Aggregation
8.3 Mapping a Function to a Vector or List
8.4 Mapping a Function to a Matrix or Array
8.5 Mapping a Function Based on Groups
8.6 The reshape Package
8.7 Loops in R

9 Reshaping Data
9.2 Recoding Variables
9.3 The recode Function
9.4 Reshaping Data Frames
9.5 The reshape Package
9.6 Combining Data Frames
9.7 Under the Hood of merge

Index