R Read table Function

The R read.table function imports data from text files, either on the local file system or at a URL, and stores it in a Data Frame. Let us see how to use the read.table function and manipulate the imported data in R Programming with an example.

R Read table Syntax

The syntax behind R read.table function to read the data from a text file is

read.table(file, header = FALSE, sep = "", quote = "\"'", dec = ".", 
         row.names, col.names, na.strings = "NA", nrows = -1, skip = 0,
         numerals = c("allow.loss", "warn.loss", "no.loss"), colClasses = NA,
         as.is = !stringsAsFactors, check.names = TRUE, strip.white = FALSE,
         fill = !blank.lines.skip, blank.lines.skip = TRUE, comment.char = "#",
         allowEscapes = FALSE, flush = FALSE, fileEncoding = "", text,
         stringsAsFactors = default.stringsAsFactors(), encoding = "unknown",
         skipNul = FALSE)

The list of arguments supported by the read.table in R Programming language is

  • dec: Specify the character used for the decimal points
  • check.names: A Boolean value that specifies whether to check that the column names are syntactically valid R names. Invalid names are adjusted to make them valid.
  • row.names: A character vector that contains the row names for the returned Data Frame.
  • nrows: An integer value that restricts the number of rows to read. For example, if you want the top 5 records, use nrows = 5 in the read.table function.
  • colClasses: A character vector of class names assigned to each column.
  • fill: A Boolean value. If a file contains rows of unequal length, fill = TRUE implicitly adds blank fields to the shorter rows.
  • flush: A Boolean value. If TRUE, read.table skips to the next line after reading the last requested field on a line.
  • encoding: The encoding assumed for character strings in the source file. The default value is "unknown".
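
A minimal sketch of a few of these arguments, assuming a small inline sample passed through the text argument instead of a file. The sample values and the names sample_lines, scores, and first_two are made up for illustration.

```r
# Hypothetical sample: ';' separates fields and ',' is the decimal mark.
# The last row is one field short.
sample_lines <- c("id;name;score",
                  "1;Ann;3,5",
                  "2;Bob;4,25",
                  "3;Cid")

scores <- read.table(text = sample_lines, header = TRUE, sep = ";",
                     dec = ",",     # read 3,5 as 3.5
                     fill = TRUE,   # pad the short row with a missing value
                     colClasses = c("integer", "character", "numeric"))
print(scores)

# nrows limits how many data rows are read (the header is not counted)
first_two <- read.table(text = sample_lines, header = TRUE, sep = ";",
                        dec = ",", nrows = 2)
print(first_two)
```

Here the third record has no score, so fill = TRUE leaves it as NA instead of stopping with an error.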

We use the EmployeeSales.txt file shown below to demonstrate the R read.table function. It has column names in the first row, 14 data rows, and 7 columns.

If you want to use the same data, please copy the data below, paste it into Notepad, and save it as EmployeeSales.txt.

Employee_ID,FirstName,LastName,Education,Occupation,YearlyIncome,Sales
1,"John","Yang","Bachelors","Professional",90000,3578.27
2,"Rob","Johnson","Bachelors","Management",80000,3399.9899999999998
3,"Ruben","Torres","Partial College","Skilled Manual",50000,699.09820000000002
4,"Christy","Zhu","Bachelors","Professional",80000,3078.27
5,"Rob","Huang","High School","Skilled Manual",60000,2319.9899999999998
6,"John","Ruiz","Bachelors","Professional",70000,539.99000000000001
7,"John","Miller","Masters Degree","Management",80000,2320.4899999999998
8,"Christy","Mehta","Partial High School","Clerical",50000,24.989999999999998
9,"Rob","Verhoff","Partial High School","Clerical",45000,24.989999999999998
10,"Christy","Carlson","Graduate Degree","Management",70000,2234.9899999999998
11,"Gail","Erickson","Education","Professional",90000,4319.9899999999998
12,"Barry","Johnson","Education","Management",80000,4968.5900000000001
13,"Peter","Krebs","Graduate Degree","Clerical",50000,59.530000000000001
14,"Greg","Alderson","Partial High School","Clerical",45000,23.5
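
If you would rather not use Notepad, the file can also be written from R. The following is a minimal sketch, assuming a temporary directory is an acceptable place to write; only the first three records are included (the rest follow the same pattern), and sales_path and check are names made up for this example.

```r
# Write the header and the first three records to a temporary file.
# sales_path is a hypothetical name; use "EmployeeSales.txt" instead to
# write into the current working directory.
sales_path <- file.path(tempdir(), "EmployeeSales.txt")
writeLines(c(
  'Employee_ID,FirstName,LastName,Education,Occupation,YearlyIncome,Sales',
  '1,"John","Yang","Bachelors","Professional",90000,3578.27',
  '2,"Rob","Johnson","Bachelors","Management",80000,3399.9899999999998',
  '3,"Ruben","Torres","Partial College","Skilled Manual",50000,699.09820000000002'
), sales_path)

# Read it back to confirm the layout: 3 rows and 7 columns in this sketch
check <- read.table(sales_path, header = TRUE, sep = ",", quote = "\"")
dim(check)
```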

R Read table to read text File from Current Directory

In this example, we show how to use R read.table function to read data from the text file (.txt file) that is present in the current working directory.

  • file: You have to specify the file name or the full path along with the file name. You can also use the URL of an external (online) txt file. For example, "sampleFile.txt" or "C:/Users/Suresh/Documents/R Programs/sampleFile.txt".
  • header: If the text file contains column names as the first row, then specify TRUE; otherwise, FALSE.
  • sep: Short for separator. You have to specify the character that separates the fields. For example, sep = "," means the data is separated by commas.
  • quote: If your character values (LastName, Occupation, Education columns, etc.) are enclosed in quotes, then you have to specify the quote type. For double quotes we use: quote = "\"".
# Read Text File from Current Working Directory

# To Locate the Current Working Directory
getwd()

Company.employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"")

print(Company.employees)

R Read table to read text File from Custom Directory

In this example, we use read.table function to read data from the text file that is present in the custom directory.

  • getwd(): This R function returns the current working directory. Often, it is your Documents folder.
  • setwd("system address"): This function changes the current working directory to the one you specify.
  • list.files(): This function displays the list of files present in the current working directory.
# Read Text File from Custom Working Directory

# To Locate the Current Working Directory
getwd()

setwd("R Programs") # Or use Full path C:/Users/Suresh/Documents/R Programs 
list.files()
getwd()

Company.employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"")

print(Company.employees)
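
As an alternative to changing the working directory, the full path can be built with file.path and passed straight to read.table. This is a minimal sketch, assuming a hypothetical folder under tempdir() stands in for the R Programs directory; data_dir and the two sample records are made up for illustration.

```r
# Hypothetical folder standing in for a custom directory such as "R Programs"
data_dir <- file.path(tempdir(), "R Programs")
dir.create(data_dir, showWarnings = FALSE)

# Create a tiny two-record file there so the example is self-contained
writeLines(c('Employee_ID,FirstName,Sales',
             '1,"John",3578.27',
             '2,"Rob",3399.99'),
           file.path(data_dir, "EmployeeSales.txt"))

list.files(data_dir)   # the new file should be listed

# Read using the full path; the working directory is left unchanged
employees <- read.table(file.path(data_dir, "EmployeeSales.txt"),
                        header = TRUE, sep = ",", quote = "\"")
print(employees)
```

Building the path this way avoids side effects on the rest of a script, since setwd() changes the directory for everything that runs afterwards.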

Arguments

In this section, we use a modified EmployeeSales.txt file to demonstrate the arguments of the R read.table function. The modified file has some empty rows, empty records, and comment lines.

R Read table Function testing arguments

In this R read table example, we show how to read NA records and skip the blank lines and comment lines while reading data from the text file.

  • allowEscapes: A Boolean value that indicates whether C-style escapes (such as \n for a new line) are processed or read verbatim.
  • strip.white: If the sep argument is not equal to "", then you may use this Boolean value to trim the extra leading & trailing white spaces from unquoted character fields.
  • comment.char: If there are any comment lines in your text file, then you can use this argument to ignore those lines. Here, you have to specify the single character that marks the start of a comment. For example, if your text file contains comments starting with $, then use comment.char = "$" to skip them while reading.
  • blank.lines.skip: A Boolean value that specifies whether you want to skip/ignore the blank lines or not.
  • na.strings: A character vector of strings that are read as NA.
# Testing argument
# To Locate the Current Working Directory
getwd()

# na.strings takes the strings to read as NA (here: empty fields and NA)
employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"", 
                        na.strings = c("", "NA"), strip.white = TRUE,
                        comment.char = "$", blank.lines.skip = TRUE)
print(employees)
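
Because the modified file itself is not reproduced here, the following is a minimal sketch of the same arguments on a hypothetical inline sample passed through the text argument. The lines in raw_lines are made up: they contain a $ comment line, a blank line, a padded name, and an empty Sales value.

```r
# Hypothetical input with a '$' comment line, a blank line,
# padded fields, and an empty Sales value
raw_lines <- c("Employee_ID,FirstName,Sales",
               "$ this line is a comment",
               "1,  John  ,3578.27",
               "",
               "2,Rob,",
               "3,Ruben,NA")

employees <- read.table(text = raw_lines, header = TRUE, sep = ",",
                        quote = "\"",
                        na.strings = c("", "NA"),  # empty or NA -> missing
                        strip.white = TRUE,        # trim "  John  "
                        comment.char = "$",        # drop the '$' line
                        blank.lines.skip = TRUE)   # drop the empty line
print(employees)
```

The result has three records: the comment and blank lines are skipped, and the Sales values of the second and third records are read as NA.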

Testing R Read table arguments

In this example, we show how to rename the columns, skip a number of rows, and change the default conversion to factors.

  • col.names: A character vector that contains the column names for the returned data frame.
  • as.is: A Boolean vector with one value per column (shorter vectors are recycled). It controls whether character columns are converted to factors: TRUE keeps the column as it is, and FALSE converts it to a factor. For example, with two columns (FirstName, Occupation) and as.is = c(TRUE, FALSE), FirstName stays a character (not an implicit factor) and Occupation becomes a Factor.
  • skip: Specify the number of lines you want to skip from the top of the text file before beginning to read the data. For example, if you want to skip the first 3 lines, use skip = 3.
# Testing argument
# To Locate the Current Working Directory
getwd()
# With check.names = TRUE (the default), "First Name" becomes First.Name
employeeNames <- c("Employee_ID", "First Name", "Last Name", "Education", "Profession","Salary","Sales")
employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"", 
                        na.strings = c("", "NA"), strip.white = TRUE, skip = 3,
                        # one Boolean value per column
                        as.is = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE),
                        col.names = employeeNames, 
                        comment.char = "$", blank.lines.skip = TRUE)
print(employees)
# str() prints the structure itself, so no print() is needed
str(employees)
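
The three arguments can also be tried in isolation. The following is a minimal sketch, assuming a hypothetical inline sample with two note lines above the header; raw_lines and new_names are made up for illustration.

```r
# Hypothetical input: two note lines, then the header, then the data
raw_lines <- c("Exported by HR",
               "Do not edit",
               "Employee_ID,FirstName,Occupation",
               "1,John,Professional",
               "2,Rob,Management",
               "3,Ruben,Professional")

new_names <- c("ID", "Name", "Profession")   # made-up replacement names

employees <- read.table(text = raw_lines, sep = ",",
                        skip = 3,               # skip the notes and the old header
                        header = FALSE,         # names come from col.names instead
                        col.names = new_names,
                        as.is = c(TRUE, TRUE, FALSE))  # only Profession becomes a factor
str(employees)
```

Skipping the original header line and supplying col.names is one way to rename the columns while reading. Renaming after the import with names() would work as well.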

StringsAsFactors in R Read table function

If your text file contains both character and numeric variables, then in R versions before 4.0.0 the character variables are automatically converted to factors. To prevent this conversion, specify stringsAsFactors = FALSE explicitly. Since R 4.0.0, the default is already FALSE, so character columns stay as characters unless you set stringsAsFactors = TRUE.

  • stringsAsFactors: A Boolean value indicating whether the text fields in the .txt file are converted to factors or not. The default is default.stringsAsFactors() in R versions before 4.0.0 and FALSE since R 4.0.0.
# stringsAsFactors argument
# To Locate the Current Working Directory
getwd()

# It will keep the Character Columns as they are
Company.employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"", 
                        stringsAsFactors = FALSE)

# It will convert all the Character Columns to factors
# (this was the implicit default before R 4.0.0)
employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"",
                        stringsAsFactors = TRUE)

# str() prints the structure itself, so no print() is needed
str(Company.employees)
str(employees)
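
The effect is easy to verify on a small sample. This is a minimal sketch, assuming a hypothetical two-record input passed through the text argument; setting the argument explicitly in both calls makes the result independent of the R version.

```r
# Made-up two-record sample with one character and one numeric column
raw_lines <- c("FirstName,YearlyIncome",
               "John,90000",
               "Rob,80000")

# Character columns stay character
as_text <- read.table(text = raw_lines, header = TRUE, sep = ",",
                      stringsAsFactors = FALSE)

# Character columns are converted to factors
as_factor <- read.table(text = raw_lines, header = TRUE, sep = ",",
                        stringsAsFactors = TRUE)

class(as_text$FirstName)       # "character"
class(as_factor$FirstName)     # "factor"
class(as_factor$YearlyIncome)  # "integer": numeric columns are unaffected
```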

Accessing text file Data

The read.table function in R programming automatically stores the data in a Data Frame. So, all the functions supported by the Data Frame can be used on the text data. Please refer to the Data Frame article for a description of these functions.

# Access Data

# To Locate the Current Working Directory
getwd()

Company.employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"")
print(Company.employees)

# Accessing all the Rows (Elements) Present in the 4th Column (i.e., Education)
# Index Values: 1 = Employee_ID, 2 = FirstName, 3 = LastName, 4 = Education, 5 = Occupation, 6 = YearlyIncome, and 7 = Sales
Company.employees[[4]] 

# Accessing all the Elements (Rows) Present in the YearlyIncome Item (Column)
Company.employees$YearlyIncome

# Accessing Element at 9th Row and 7th Column 
Company.employees[9, 7] 

# Accessing Item at 3rd, 5th, 7th, 13th Rows and 3rd, 4th, 5th, 6th, 7th Columns 
Company.employees[c(3, 5, 7, 13), c(3:7)]

Common Functions in R read.table

While working with text files in R programming, the following functions are the most commonly used.

  • max: This method returns the maximum value within the column.
  • min: This method returns the minimum value within the column.
  • mean: It calculates the Mean value.
  • median: It calculates the median value of the specified column.
  • subset(data, condition): This method returns the subset of data, and the data depends on the condition.
# Common Methods
# To Locate the Current Working Directory
getwd()

Company.employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"")
summary(Company.employees)

# It returns the Maximum Value present in the Yearly Income Column
max.salary <- max(Company.employees$YearlyIncome)
print(max.salary)

# It returns the Minimum Value present in the Sales Column
min.sales <- min(Company.employees$Sales)
print(min.sales)

# It calculates and returns the Median of the Sales Column
median.sales <- median(Company.employees$Sales)
print(median.sales)

# It calculates and returns the Mean value of the Sales Column
mean.sales <- mean(Company.employees$Sales)
print(mean.sales)

# It returns all the records whose Education is equal to Bachelors
data1 <- subset(Company.employees, Education == "Bachelors")
print(data1)

# It will return all the records, whose Education is equal to Bachelors and Yearly Income > 70000
data <- subset(Company.employees, Education == "Bachelors" & YearlyIncome > 70000)
print(data)

Head and Tail Functions in R read text file

The head and tail functions below are very useful for working with external data (text files). If your text file has millions of records and you only want to look at the top and bottom records (for example, the top 10 or bottom 10 records), then use these functions.

  • head(Data Frame, limit): This function returns the top six records (if you omit the limit). If you specify the limit as 3, it returns the first three records. It is something like selecting the top N records.
  • tail(Data Frame, limit): It returns the last six records (if you omit the limit). If you specify the limit as 4, then it returns the last four records. It is something like selecting the bottom N records.
#  Head and Tail Functions
# To Locate the Current Working Directory
getwd()

Company.employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"")
print(Company.employees)

# No limit - It will Display Top Six Records 
head(Company.employees)

# Limit = 5 - It will Display Top Five Records
head(Company.employees, 5)

# No limit - It will Display Bottom Six Records 
tail(Company.employees)

# Limit = 4 - It will Display Bottom Four Records
tail(Company.employees, 4)
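
The default and explicit limits can be checked on a small sample. This is a minimal sketch, assuming a made-up ten-record input built in memory and read through the text argument; raw_lines and records are hypothetical names.

```r
# Build ten made-up records: ids 1 to 10 with sales 100 to 1000
raw_lines <- c("id,sales",
               paste(1:10, seq(100, 1000, by = 100), sep = ","))
records <- read.table(text = raw_lines, header = TRUE, sep = ",")

head(records)      # first six records (ids 1 to 6)
head(records, 3)   # first three records (ids 1 to 3)
tail(records)      # last six records (ids 5 to 10)
tail(records, 4)   # last four records (ids 7 to 10)
```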

R Read text file Important Functions

The following functions are some of the most useful functions while working with or reading text files in R programming.

  • typeof: This function tells you the internal type of the variable. Since the data frame is a kind of list, this function returns "list".
  • class: This function tells you the class of the object that holds the imported data. For the result of read.table, it is "data.frame".
  • names: It returns the column names.
  • length: This function counts the number of items (columns) in the data frame.
  • dim: It returns the total number of rows & columns present in the data frame.
  • nrow: This function returns the number of rows read from the text file.
  • ncol: This returns the total number of columns read from the text file.
# Important Functions
# To Locate the Current Working Directory
getwd()
Company.employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"")
print(Company.employees)

class(Company.employees)
typeof(Company.employees)
names(Company.employees)

length(Company.employees)
dim(Company.employees)
nrow(Company.employees)
ncol(Company.employees)

Special Functions

The following two functions are very useful on the data returned by the read.table function in R programming. It is always good to check the structure of the external data before we start manipulating it or inserting new records.

  • summary(Data Frame): It returns the nature of the external data and a statistical summary such as Minimum, Median, Mean, Maximum, etc.
  • str(Data Frame): This function displays the structure of the data read from a text file.
# Special Functions
# To Locate the Current Working Directory
getwd()
Company.employees <- read.table("EmployeeSales.txt", TRUE, sep = ",", quote="\"")
print(Company.employees)

# str() prints the structure itself, so no print() is needed
str(Company.employees)
print(summary(Company.employees))