## Data Manipulation with R

Since its inception, R has become one of the preeminent programs for statistical computing and data analysis. The ready availability of the program, along with a wide variety of packages and the supportive R community, makes R an excellent choice for almost any kind of computing task related to statistics. However, many users, especially those with experience in other languages, do not take advantage of the full power of R. Because of the nature of R, solutions that make sense in other languages may not be very efficient in R.

This book presents a wide array of methods for reading data into R and efficiently manipulating that data. In addition to the built-in functions, a number of readily available packages from CRAN (the Comprehensive R Archive Network) are also covered. All of the methods presented take advantage of the core features of R: vectorization, efficient use of subscripting, and the proper use of the varied functions in R that are provided for common data management tasks.

Most experienced R users discover that, especially when working with large data sets, it may be helpful to use other programs, notably databases, in conjunction with R. Accordingly, the use of databases in R is covered in detail, along with methods for extracting data from spreadsheets and datasets created by other programs. Character manipulation, while sometimes overlooked within R, is also covered in detail, allowing problems that are traditionally solved by scripting languages to be carried out entirely within R. For users with experience in other languages, guidelines for the effective use of programming constructs like loops are provided. Since many statistical modeling and graphics functions need their data presented in a data frame, techniques for converting the output of commonly used functions to data frames are provided throughout the book.
The book uses a variety of examples based on data sets included with R, along with easily simulated data sets, and is recommended to anyone using R who wishes to advance from simple examples to practical real-life data manipulation solutions.

Phil Spector is Applications Manager of the Statistical Computing Facility and Adjunct Professor in the Department of Statistics at the University of California, Berkeley.

### What people are saying

Excellent book

### Contents

Data in R

1.2 Data Storage in R

1.3 Testing for Modes and Classes

1.5 Conversion of Objects

1.6 Missing Values

Reading and Writing Data

read.table

2.3 Comma- and Tab-Delimited Input Files

2.5 Extracting Data from R Objects

2.6 Connections

2.7 Reading Large Data Files

2.8 Generating Data

2.8.2 Random Numbers

2.9 Permutations

2.10 Working with Sequences

2.11 Spreadsheets

2.11.2 The gdata Package (All Platforms)

2.12 Saving and Loading R Data Objects

2.13 Working with Binary Files

2.14 Writing R Objects to Files in ASCII Format

2.14.2 The write.table Function

R and Databases

3.1.2 Basics of SQL

3.1.3 Aggregation

3.1.4 Joining Two Databases

3.1.5 Subqueries

3.1.6 Modifying Database Records

3.2 ODBC

3.3 Using the RODBC Package

3.4 The DBI Package

3.6 Performing Queries

3.8 Getting Data into MySQL

3.9 More Complex Aggregations

Dates

4.2 The chron Package

4.3 POSIX Classes

4.4 Working with Dates

4.5 Time Intervals

4.6 Time Sequences

Factors

5.2 Numeric Factors

5.4 Creating Factors from Continuous Variables

5.5 Factors Based on Dates and Times

5.6 Interactions

Subscripting

6.4 Logical Subscripts

6.5 Subscripting Matrices and Arrays

6.6 Specialized Functions for Matrices

6.7 Lists

6.8 Subscripting Data Frames

Character Manipulation

7.3 Working with Parts of Character Values

7.4 Regular Expressions in R

7.5 Basics of Regular Expressions

7.6 Breaking Apart Character Values

7.7 Using Regular Expressions in R

7.8 Substitutions and Tagging

Data Aggregation

8.2 Road Map for Aggregation

8.3 Mapping a Function to a Vector or List

8.4 Mapping a Function to a Matrix or Array

8.5 Mapping a Function Based on Groups

8.6 The reshape Package

8.7 Loops in R

Reshaping Data

9.2 Recoding Variables

9.3 The recode Function

9.4 Reshaping Data Frames

9.5 The reshape Package

9.6 Combining Data Frames

9.7 Under the Hood of merge