R Read CSV Function

The R read.csv function imports csv files from the file system or from URLs and stores the data in a Data Frame. In this article, we show how to use the R read.csv function and how to manipulate the csv data in R programming, with examples.

R Read CSV Syntax

The basic syntax to read the data from a csv file using R programming is shown below, with each argument set to its default value:

read.csv(file, header = TRUE, sep = ",", quote = "\"", ...)

The read.csv function supports many arguments. The following are some of the most useful in everyday use:

  • file: The file name, or the full path including the file name. You can also pass the URL of an external (online) csv file. For example, "sample.csv" or "C:/Users/Suresh/Documents/R Programs/sample.csv"
  • header: Specify TRUE if the first row of the csv contains the column names; otherwise, FALSE. In read.csv the default is TRUE
  • sep: Short for separator. It is the character that separates the fields. sep = "," means the data is separated by commas, which is the read.csv default
  • quote: If your character values (FirstName, Education columns, etc.) are enclosed in quotes, specify the quote character. For double quotes, use quote = "\""
  • as.is: A Boolean vector, normally one value per column (shorter vectors are recycled), that controls whether character columns are converted to factors. TRUE keeps a column as character, and FALSE converts it to a factor. For example, with two columns (FirstName, Sales), as.is = c(TRUE, FALSE) keeps FirstName as character (not an implicit factor)
  • nrows: An integer value that restricts the maximum number of rows to read. For example, if you want the first 5 records, use nrows = 5
  • skip: The number of lines to skip at the start of the file before reading begins. For example, skip = 2 ignores the first 2 lines of the file; with header = TRUE, the next line is then read as the column names
  • strip.white: A Boolean value, used only when sep is specified (read.csv sets it to ","). It trims the leading and trailing white space from unquoted character fields.
  • comment.char: If your file contains comments, use this argument to ignore them. Specify the single character that starts a comment. For example, if comments in your data start with $, use comment.char = "$" to skip them. In read.csv, comments are turned off by default (comment.char = "").
  • stringsAsFactors: A Boolean value indicating whether the text fields in the csv file should be converted to factors. The default was TRUE before R 4.0.0 and is FALSE from R 4.0.0 onward.
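As a rough illustration of how several of these arguments combine, the sketch below reads a small inline sample through the text argument of read.csv instead of a file. The sample lines, the # comment marker, and the names csv_lines, csv_text, and sample_rows are made up for this illustration.

```r
# Hypothetical sample: a comment line, a header, and padded names.
csv_lines <- c(
  "# exported by a hypothetical tool",
  "FirstName,Sales",
  " John ,3578.27",
  " Rob ,3399.99",
  " Ruben ,699.0982",
  " Christy ,3078.27"
)
csv_text <- paste(csv_lines, collapse = "\n")

sample_rows <- read.csv(
  text = csv_text,       # read from a string instead of a file
  header = TRUE,         # first non-comment line holds the column names
  sep = ",",
  comment.char = "#",    # ignore everything after a # on a line
  strip.white = TRUE,    # trim the spaces around the unquoted names
  nrows = 2              # read at most two records
)
print(sample_rows)
```

Only the records for John and Rob should be read, with the padding around the names removed.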

We are going to use an employee.csv file to demonstrate the R read.csv function. It has a header row with the column names, 14 data rows, and 7 columns.


If you want to use the same data, copy the data below, paste it into a text editor such as Notepad, and save it as employee.csv

FirstName,LastName,Education,Occupation,YearlyIncome,Sales,HireDate
John,Yang,Bachelors,Professional,90000,3578.27,28-01-06
Rob,Johnson,Bachelors,Management,80000,3399.99,29-12-10
Ruben,Torres,Partial College,Skilled Manual,50000,699.0982,29-12-11
Christy,Zhu,Bachelors,Professional,80000,3078.27,28-12-12
Rob,Huang,High School,Skilled Manual,60000,2319.99,22-09-08
John,Ruiz,Bachelors,Professional,70000,539.99,06-07-09
John,Miller,Masters Degree,Management,80000,2320.49,12-08-09
Christy,Mehta,Partial High School,Clerical,50000,24.99,05-07-07
Rob,Verhoff,Partial High School,Clerical,45000,24.99,15-09-13
Christy,Carlson,Graduate Degree,Management,70000,2234.99,25-01-14
Gail,Erickson,Education,Professional,90000,4319.99,02-10-06
Barry,Johnson,Education,Management,80000,4968.59,15-05-14
Peter,Krebs,Graduate Degree,Clerical,50000,59.53,14-01-13
Greg,Alderson,Partial High School,Clerical,45000,23.5,05-07-13

R Read csv File from Current Working Directory

In this example, we show how to read data from a csv (comma separated values) file that is present in the current working directory in R programming.

# R Read csv File from Current Working Directory

# Locate the Current Working Directory
getwd()

# header = TRUE: the first row holds the column names
employee <- read.csv("employee.csv", header = TRUE, sep = ",")

print(employee)
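If the file might be missing from the working directory, a defensive variant can check for it first. This is only a sketch: it assumes a hypothetical employee_demo.csv, which it writes into a temporary directory so that it runs on its own, and the file.exists check is an addition rather than part of the example above.

```r
# Switch to a temporary directory so the sketch does not touch real files.
old_wd <- setwd(tempdir())

# Hypothetical sample file with two records.
csv_name <- "employee_demo.csv"
writeLines(c("FirstName,Sales", "John,3578.27", "Rob,3399.99"), csv_name)

if (file.exists(csv_name)) {
  employee_demo <- read.csv(csv_name, header = TRUE, sep = ",")
  print(employee_demo)
} else {
  message("Cannot find ", csv_name, " in ", getwd())
}

# Clean up and restore the original working directory.
unlink(csv_name)
setwd(old_wd)
```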

R Read csv File from Custom Directory

In this R read.csv example, we show how to read data from a csv file that is present in a custom directory.

  • getwd(): This function returns the current working directory. Often it is your Documents folder, but that depends on your platform and on how R was started
  • setwd("system address"): This function changes the working directory to the path you specify
  • list.files(): This function lists the files present in the current working directory
# R Read csv File from Custom Working Directory

# Locate the Current Working Directory
getwd()

setwd("R Programs") # Or use the full path, C:/Users/Suresh/Documents/R Programs
list.files()
getwd()

employee <- read.csv("employee.csv", header = TRUE, sep = ",")
print(employee)
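An alternative that leaves the working directory unchanged is to build the full path with file.path and pass it to read.csv directly. The sketch below assumes a hypothetical "R Programs" folder created under the temporary directory and a made-up employee_demo.csv, so it can run anywhere.

```r
# Hypothetical folder under the temporary directory, standing in for "R Programs".
data_dir <- file.path(tempdir(), "R Programs")
dir.create(data_dir, showWarnings = FALSE)

# Hypothetical sample file with two records.
csv_path <- file.path(data_dir, "employee_demo.csv")
writeLines(c("FirstName,Sales", "John,3578.27", "Rob,3399.99"), csv_path)

# List the folder and read the file without calling setwd().
print(list.files(data_dir))
employee_demo <- read.csv(csv_path, header = TRUE, sep = ",")
print(employee_demo)
```

Because the path is explicit, the script behaves the same regardless of where R was started.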

Accessing csv file Data

In R programming, the read.csv function returns the data as a Data Frame, so all the functions that work on a Data Frame can be used on the csv data. Please refer to the Data Frame article for descriptions of those functions.

# Accessing Data from csv in R Programming

# Locate the Current Working Directory
getwd()
employee <- read.csv("employee.csv", header = TRUE, sep = ",")
print(employee)

# Accessing all the Elements (Rows) Present in the 4th Column (i.e., Occupation)
# Index Values: 1 = FirstName, 2 = LastName, 3 = Education, 4 = Occupation,
# 5 = YearlyIncome, 6 = Sales, and 7 = HireDate
employee[[4]]

# Accessing all the Elements (Rows) Present in the Occupation Item (Column)
employee$Occupation

# Accessing Element at 4th Row and 3rd Column
employee[4, 3]

# Accessing Items at 1st, 2nd, 4th Rows and 4th, 5th, 6th, 7th Columns
employee[c(1, 2, 4), 4:7]
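To make the position-based and name-based access patterns easy to try without the file, the sketch below loads the first four records of the sample data from an inline string. The name employee_demo and the use of the text argument are assumptions made for this illustration.

```r
# First four records of the sample data, read from a string instead of a file.
employee_demo <- read.csv(text = paste(
  "FirstName,LastName,Education,Occupation,YearlyIncome,Sales,HireDate",
  "John,Yang,Bachelors,Professional,90000,3578.27,28-01-06",
  "Rob,Johnson,Bachelors,Management,80000,3399.99,29-12-10",
  "Ruben,Torres,Partial College,Skilled Manual,50000,699.0982,29-12-11",
  "Christy,Zhu,Bachelors,Professional,80000,3078.27,28-12-12",
  sep = "\n"
), stringsAsFactors = FALSE)

names(employee_demo)            # column names, in positional order
employee_demo[[4]]              # 4th column by position (Occupation)
employee_demo[["Occupation"]]   # the same column by name
employee_demo[4, 3]             # 4th row, 3rd column: "Bachelors"
employee_demo[c(1, 2, 4), 4:7]  # rows 1, 2, 4 and columns 4 to 7
```

Access by name keeps working if the column order in the file changes, which is one reason to prefer it over a numeric index.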

Common Functions in R read.csv

The following functions are commonly used on data read from csv files in R programming.

  • max: This function returns the maximum value within the column
  • min: This function returns the minimum value within the column
  • mean: This function returns the average of the values within the column
  • subset(data, condition): This function returns the rows of the data that satisfy the condition
# R Read csv - Common Functions 
# Locate the Current Working Directory
getwd()
employee <- read.csv("employee.csv", header = TRUE, sep = ",")
print(employee)

# It returns the Maximum Value within the Yearly Income Column
maximum.salary <- max(employee$YearlyIncome)
print(maximum.salary)

# It returns the Minimum Value within the Sales Column
minimum.sales <- min(employee$Sales)
print(minimum.sales)

# It calculates and returns the Mean Value of the Sales Column
mean.sales <- mean(employee$Sales)
print(mean.sales)

# It returns all the records, whose Education is equal to Bachelors
subdata <- subset(employee, Education == "Bachelors")
print(subdata)

# It returns all the records, whose Education is equal to Bachelors and Yearly Income > 70000
partialdata <- subset(employee, Education == "Bachelors" & YearlyIncome > 70000)
print(partialdata)
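Two details are easy to miss with these functions: missing values, and the fact that subset is a convenience over ordinary logical indexing. The sketch below uses a trimmed, inline copy of the first five records (the name employee_demo and the reduced column set are assumptions) to show both.

```r
# Trimmed copy of the first five records, read from a string.
employee_demo <- read.csv(text = paste(
  "FirstName,LastName,Education,YearlyIncome,Sales",
  "John,Yang,Bachelors,90000,3578.27",
  "Rob,Johnson,Bachelors,80000,3399.99",
  "Ruben,Torres,Partial College,50000,699.0982",
  "Christy,Zhu,Bachelors,80000,3078.27",
  "Rob,Huang,High School,60000,2319.99",
  sep = "\n"
), stringsAsFactors = FALSE)

# na.rm = TRUE ignores missing values; without it, a single NA makes the result NA.
max_income <- max(employee_demo$YearlyIncome, na.rm = TRUE)
mean_sales <- mean(employee_demo$Sales, na.rm = TRUE)
print(max_income)
print(mean_sales)

# subset() and logical indexing select the same rows.
with_subset <- subset(employee_demo, Education == "Bachelors" & YearlyIncome > 70000)
keep <- employee_demo$Education == "Bachelors" & employee_demo$YearlyIncome > 70000
with_index <- employee_demo[keep, ]
print(with_subset)
```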

R Read CSV – Important Functions

The following functions are some of the most useful for inspecting data read from csv files in R programming.

  • typeof: This function returns the internal type of the variable. Since a data frame is a kind of list, it returns "list"
  • class: This function returns the class of the object that holds the csv data, which is "data.frame"
  • names: This function returns the column names
  • length: This function counts the number of items (columns) in the data frame
  • nrow: This function returns the total number of rows read from the CSV file.
  • ncol: This function returns the total number of columns read from the CSV file.
  • dim: This function returns the total number of rows and columns, in that order.
# R Read csv - Important Functions

# Locate the Current Working Directory
getwd()

employee <- read.csv("employee.csv", header = TRUE, sep = ",")
print(employee)

typeof(employee)
class(employee)
names(employee)

length(employee)
nrow(employee)
ncol(employee)
dim(employee)
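For reference, the sketch below runs the same inspection functions on an inline copy of the first four records and notes the value each one returns. The name employee_demo and the use of the text argument are assumptions made so that the example runs without the file.

```r
# First four records of the sample data, read from a string instead of a file.
employee_demo <- read.csv(text = paste(
  "FirstName,LastName,Education,Occupation,YearlyIncome,Sales,HireDate",
  "John,Yang,Bachelors,Professional,90000,3578.27,28-01-06",
  "Rob,Johnson,Bachelors,Management,80000,3399.99,29-12-10",
  "Ruben,Torres,Partial College,Skilled Manual,50000,699.0982,29-12-11",
  "Christy,Zhu,Bachelors,Professional,80000,3078.27,28-12-12",
  sep = "\n"
), stringsAsFactors = FALSE)

typeof(employee_demo)  # "list": a data frame is stored as a list of columns
class(employee_demo)   # "data.frame"
length(employee_demo)  # 7, the number of columns
nrow(employee_demo)    # 4
ncol(employee_demo)    # 7
dim(employee_demo)     # 4 7 (rows first, then columns)
```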

Head and Tail Functions in R read csv

In R programming, the following functions are very useful for working with external data (read csv files). If your csv file is too big to print in full and you only want to look at the first or last few records (say, the top 20), you can use these functions. They return rows in file order, so sort the data first if you want the top performing records.

  • head(Data, limit): This function returns the first six rows if you omit the limit. If you specify the limit as 3, it returns the first three records, much like selecting the top 3 records in a query.
  • tail(Data, limit): This function returns the last six rows if you omit the limit. If you specify the limit as 4, it returns the last four records, much like selecting the bottom 4 records in a query.
# R Read csv - head and Tail

# Locate the Current Working Directory
getwd()

employee <- read.csv("employee.csv", header = TRUE, sep = ",")
print(employee)

# No limit - It will Display Top Six Records 
head(employee)

# Limit is 4 - It will Display Top Four Records
head(employee, 4)

# No limit - It will Display Bottom Six Records 
tail(employee)

# Limit is 3 - It will Display Bottom Three Records
tail(employee, 3)
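Since head and tail follow the row order of the data, picking the best or worst performers needs a sort first. The sketch below is one way to do that with order, using a trimmed inline copy of the first five records; the names employee_demo, by_sales, top_two, and bottom_two are made up for this illustration.

```r
# Trimmed copy of the first five records, read from a string.
employee_demo <- read.csv(text = paste(
  "FirstName,LastName,Sales",
  "John,Yang,3578.27",
  "Rob,Johnson,3399.99",
  "Ruben,Torres,699.0982",
  "Christy,Zhu,3078.27",
  "Rob,Huang,2319.99",
  sep = "\n"
), stringsAsFactors = FALSE)

# Sort by Sales, highest first, then take rows from either end.
by_sales <- employee_demo[order(employee_demo$Sales, decreasing = TRUE), ]
top_two <- head(by_sales, 2)     # the two highest Sales values
bottom_two <- tail(by_sales, 2)  # the two lowest Sales values

print(top_two)
print(bottom_two)
```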

R read CSV Special Functions

The following two functions are very useful when reading csv files in R programming. It is always good to check the structure of external data before manipulating it or inserting new records.

  • str(Data): This function displays the structure of the data read from the csv file: the type and the first few values of each column.
  • summary(Data Frame): This function returns a statistical summary of each column, such as Minimum, 1st Quartile, Median, Mean, 3rd Quartile, and Maximum for numeric columns.
# R Read csv - str and summary functions

# Locate the Current Working Directory
getwd()

employee <- read.csv("employee.csv", header = TRUE, sep = ",")
print(employee)

# str prints the structure itself and returns NULL, so it needs no print
str(employee)
print(summary(employee))
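The summary function also works on a single column, and its result can be indexed by name when you need one statistic rather than the printed table. The sketch below assumes a small inline sample and the hypothetical names employee_demo and sales_summary.

```r
# Trimmed copy of the first four records, read from a string.
employee_demo <- read.csv(text = paste(
  "FirstName,Sales",
  "John,3578.27",
  "Rob,3399.99",
  "Ruben,699.0982",
  "Christy,3078.27",
  sep = "\n"
), stringsAsFactors = FALSE)

# Summary of one numeric column: a named vector of six statistics.
sales_summary <- summary(employee_demo$Sales)
print(names(sales_summary))
print(sales_summary)

# Pick out a single statistic by its name.
median_sales <- sales_summary[["Median"]]
print(median_sales)
```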

stringsAsFactors in R Read csv function

In R versions before 4.0.0, the character variables in a csv file are automatically converted to the factor type. To prevent this automatic conversion, specify stringsAsFactors = FALSE explicitly. From R 4.0.0 onward, FALSE is already the default, and you pass stringsAsFactors = TRUE when you do want factors.

# R Read csv - Factors to String

# Locate the Current Working Directory
getwd()

employee <- read.csv("employee.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)
print(employee)

str(employee)

In the output of str, FirstName is reported as chr (character) rather than as a Factor.
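To see the difference side by side, the sketch below reads the same inline sample three times: with stringsAsFactors = FALSE, with stringsAsFactors = TRUE, and with a per-column as.is vector. The sample text and the names csv_text, as_text, as_factor, and mixed are assumptions made for this illustration.

```r
# Hypothetical three-column sample taken from the first records of the data.
csv_text <- paste(
  "FirstName,Education,Sales",
  "John,Bachelors,3578.27",
  "Rob,Bachelors,3399.99",
  "Ruben,Partial College,699.0982",
  sep = "\n"
)

as_text <- read.csv(text = csv_text, stringsAsFactors = FALSE)
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)

class(as_text$FirstName)     # "character"
class(as_factor$FirstName)   # "factor"
levels(as_factor$Education)  # the distinct Education values

# as.is decides per column: keep FirstName as text, convert Education to a factor.
mixed <- read.csv(text = csv_text, as.is = c(TRUE, FALSE, FALSE))
str(mixed)
```

Passing the argument explicitly keeps the result the same across R versions, whatever the default happens to be.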