Specifying the Row Names when Reading Excel File in R

In this post, let us look at how to specify row names while reading an Excel file in R.

Here we are using the row.names argument of the read.xlsx2() function to specify row names when reading a file.

This is the simplest method for specifying row names when reading a file in the R programming language because the user only needs to call the read.xlsx() function.

The read.xlsx() function is specifically used for importing the excel type size, and the row.names argument of this function is used with the required values into it, and this will be read to the specifying row names as per mentioned as the argument value with reading o file in the R programming language.

xlsx package:

The excel package is a robust Java-based language for reading, writing, and updating data from and to excel files.

Installation of xlsx package

Type the below command in the R console:

install.packages("xlsx")

Output:

Installing package into ‘C:/Users/vicky/AppData/Local/R/win-library/4.2’
(as ‘lib’ is unspecified)
also installing the dependencies ‘rJava’, ‘xlsxjars’

trying URL 'https://cloud.r-project.org/bin/windows/contrib/4.2/rJava_1.0-6.zip'
Content type 'application/zip' length 1245703 bytes (1.2 MB)
downloaded 1.2 MB

trying URL 'https://cloud.r-project.org/bin/windows/contrib/4.2/xlsxjars_0.6.1.zip'
Content type 'application/zip' length 9485708 bytes (9.0 MB)
downloaded 9.0 MB

trying URL 'https://cloud.r-project.org/bin/windows/contrib/4.2/xlsx_0.6.5.zip'
Content type 'application/zip' length 375143 bytes (366 KB)
downloaded 366 KB

package ‘rJava’ successfully unpacked and MD5 sums checked
package ‘xlsxjars’ successfully unpacked and MD5 sums checked
package ‘xlsx’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
C:\Users\vicky\AppData\Local\Temp\Rtmp84RZij\downloaded_packages

read.xlsx2() Function in R:

The read.xlsx2() function reads data from an Excel file or Workbook object and converts it to data.frame.

Syntax:

read.xlsx(xlsxFilepath, sheet, startRow = 1, colNames = TRUE, rowNames = FALSE,
detectDates = FALSE, skipEmptyRows = TRUE, skipEmptyCols = TRUE, rows = NULL,
cols = NULL, check.names = FALSE, sep.names = ".", namedRegion = NULL,
na.strings = "NA", fillMergedCells = FALSE, row.names)

Here we now take a sampleExcelFile.xlsx excel as an example which is shown below:

Parameters:

xlsxFilepath: It is the path of the excel file.

sheet: It is the index or sheet name we wanted to work with and read the data.

startRow: The Default value is 1. It is the first row to begin and check for data.

colNames: TRUE by default. If it is True, the first row will be selected as column names.

rowNames: FALSE by default. If it is True, the first column will be selected as row names.

detectDates: FALSE by default. If True, this method will attempt to recognize Dates and convert them if required.

skipEmptyRows: TRUE by default. If set to True, empty rows are skipped during loading. Other empty rows will be returned as NAs after the first row containing data.

skipEmptyCols: TRUE by default. If this value is True, empty columns will be skipped during loading.

rows: NULL by default.

  • Null: When rows are set to NULL, all rows are read.
  • Vector: If rows are set to a vector of numeric values, the rows specified will be read.

cols: NULL by default.

  • Null: When cols is set to NULL, all columns are read.
  • Vector: If cols is set to a vector of numeric values, the columns given will be read.

check.names: FALSE by default. If it is True, it will verify whether or not each variable is syntactically valid.

sep.names: The Default is “.”. It Separates each variable in the column names.

namedRegion: NULL by default.

  • NULL: The values of the rows, cols, and startRow parameters will be taken into account within a workbook.
  • Not NULL: If the rows, cols, and startRow arguments their values are NOT considered.

na.strings: The Default is “NA.” Blank cells are returned as NAs.

fillMergedCells: FALSE by default. If True, the value in a merged cell (Parent) is given to all cells in the merge (Child).

rowNamesExcelFile.xlsx:

 

row names excel file

Specifying the Row Names when Reading Excel File in R

Approach:

  • Import xlsx package using the library() function.
  • Pass the excel file, sheetIndex,and row.names as argument to the xlsx2() function.
  • Print the given excel file.
  • The Exit of the Program.

Below is the implementation:

# Import     xlsx package using the library() function
library(xlsx)

# Pass the excel file, sheetIndex,and row.names as argument to the xlsx2() function
excelFile = read.xlsx2("C:/Users/cirus/Downloads/rowNamesExcelFile.xlsx",
                    sheetIndex = 1,
                    row.names = 1)
# Print the given excel file
excelFile

Output:

    B  C  D  E
1   2  3  4  5
6   7  8  9 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
26 27 28 29 30

We will utilize the same xlsx file with 6 rows and 5 columns as in the previous example, and we will specify the row names to 3 while reading this file using the row. names option.

Below is the implementation:

# Import     xlsx package using the library() function
library(xlsx)

# Pass the excel file, sheetIndex,and row.names as argument to the xlsx2() function
excelFile = read.xlsx2("C:/Users/cirus/Downloads/rowNamesExcelFile.xlsx",
                    sheetIndex = 1,
                    row.names = 3)
# Print the given excel file
excelFile

Output:

    A  B  D  E
3   1  2  4  5
8   6  7  9 10
13 11 12 14 15
18 16 17 19 20
23 21 22 24 25
28 26 27 29 30

 

Leave a Comment