Reading an Excel File and Performing More Operations on it using Python

Reading Excel Files in Python using Pandas: If you are an Excel user who wants to retrieve the data from the spreadsheet in Python, then this page is for you. To read the Excel data in Python, we can use the Pandas read_excel() function. The read_excel() function in Python will extract the information from Excel and help the users read the same data in Python.

Read_excel() in Python can fetch various information from the spreadsheet, such as reading data from the first sheet, a specific sheet, multiple sheets, or even all the spreadsheets available in the Excel workbook. In this article, we have provided various methods with the help of which you can read the Excel files in Python using Pandas. Read further to find out more.

How Do You Read Data From an Excel File Using Python?

Microsoft Excel is a spreadsheet tool created by Microsoft that is available for Windows, macOS, Android, and iOS. It was released in 1987 and is used for a variety of calculations, graphic tools, pivot tables, and a macro programming language similar to Visual Basic for Applications. MS Excel has grown in popularity over the years because of the ease with which data can be saved and updated or withdrawn.

Python xlrd module:

The xlrd module can be used to obtain data from a spreadsheet. Python, for example, can be used to read, write, or alter data. In addition, the user may have to navigate through several sheets and get data based on some criteria, or edit some rows and columns, among other things.

The xlrd module is used to extract the data from the spreadsheet.

NOTE: Support for reading xlsx sheets has been explicitly removed from xlrd.

Installation of xlrd:

Use the following command to install the xlrd module:

pip install xlrd

Here we will use the following Excel spreadsheet as an example.

SampleExcelFile:
SampleExcelFile

Python Code to Read an Excel File and Perform More Operations on it

Method #1: Extracting a Specific Cell of Excel File

Approach:

  • Import openpyxl module to read an excel file using the import keyword
  • Pass the excel file path to the open_workbook() function of xlrd module to open a excel file and store it in a variable.
  • Get the Sheet1 of the above workbook object and store it in another variable
  • Print the 0th row and 0th column cell value of the excel file using the cell_value() function by applying it to the above workbook.
  • Print the 1st row and 1st column cell value of the excel file using the cell_value() function.
  • The Exit of the Program.

Below is the implementation:

# Import openpyxl module to read an excel file using the import keyword
import xlrd

# Pass the excel file path to the open_workbook() function of xlrd 
# module to open a excel file and store it in a variable
workbookObj = xlrd.open_workbook("SampleExcelFile.xlsx")
# Get the Sheet1 of the above workbook object and store it in another variable
worksheet = workbookObj.sheet_by_index(0)

# Print the 0th row and 0th column cell value of the excel file using the cell_value() function
# by applying it on the above workbook.
print("0th row and 0th column cell value of the excel file = ", worksheet.cell_value(0, 0))
# Print the 1st row and 1st column cell value of the excel file using the cell_value() function
print("1st row and 1st column cell value of the excel file = ", worksheet.cell_value(1, 1))

Output:

0th row and 0th column cell value of the excel file = EmployeeId
1st row and 1st column cell value of the excel file = Danny

Method #2: Extracting the Number of Rows Present in an Excel File

Approach:

  • Import openpyxl module to read an excel file using the import keyword
  • Pass the excel file path to the open_workbook() function of xlrd module to open a excel file and store it in a variable.
  • Get the Sheet1 of the above workbook object and store it in another variable.
  • Apply nrows attribute on the above worksheet to get the number of rows of the given Excel File and print the result.
  • The Exit of the Program.

Below is the implementation:

# Import openpyxl module to read an excel file using the import keyword
import xlrd

# Pass the excel file path to the open_workbook() function of xlrd 
# module to open a excel file and store it in a variable
workbookObj = xlrd.open_workbook("SampleExcelFile.xlsx")
# Get the Sheet1 of the above workbook object and store it in another variable
worksheet = workbookObj.sheet_by_index(0)

# Apply nrows attribute on the above worksheet to get the number of rows of the 
# Excel File and print the result.
print("Number of rows present in the given Excel File = ", worksheet.nrows)

Output:

Number of rows present in the given Excel File = 7

Method #3: Extracting the Number of Columns Present in an Excel File

Approach:

  • Import openpyxl module to read an excel file using the import keyword
  • Pass the excel file path to the open_workbook() function of xlrd module to open a excel file and store it in a variable.
  • Get the Sheet1 of the above workbook object and store it in another variable.
  • Apply ncols attribute on the above worksheet to get the number of columns present in the given Excel file and print the result.
  • The Exit of the Program.

Below is the implementation:

# Import openpyxl module to read an excel file using the import keyword
import xlrd

# Pass the excel file path to the open_workbook() function of xlrd 
# module to open a excel file and store it in a variable
workbookObj = xlrd.open_workbook("SampleExcelFile.xlsx")
# Get the Sheet1 of the above workbook object and store it in another variable
worksheet = workbookObj.sheet_by_index(0)

# Apply ncols attribute on the above worksheet to get the number of columns 
# present in the given Excel File and print the result.
print("Number of columns present in the given Excel File = ", worksheet.ncols)

Output:

Number of columns present in the given Excel File = 3

Method #4: Extracting all the Column Names of an Excel File

Approach:

  • Import openpyxl module to read an excel file using the import keyword
  • Pass the excel file path to the open_workbook() function of xlrd module to open a excel file and store it in a variable.
  • Get the Sheet1 of the above workbook object and store it in another variable.
  • Loop till the last column of the worksheet using the for loop.
  • Print the value of each column in the above worksheet using the cell_value() function.
  • The Exit of the Program.

Below is the implementation:

# Import openpyxl module to read an excel file using the import keyword
import xlrd

# Pass the excel file path to the open_workbook() function of xlrd 
# module to open a excel file and store it in a variable
workbookObj = xlrd.open_workbook("SampleExcelFile.xlsx")
# Get the Sheet1 of the above workbook object and store it in another variable
worksheet = workbookObj.sheet_by_index(0)

# Loop till the last column of the worksheet using the for loop
for col in range(worksheet.ncols):
  # Print the value of each column in the above worksheet using the cell_value() function
    print(worksheet.cell_value(0, col))

Output:

EmployeeId
EmployeeName
Salary

Method #5: Extracting the First Column of an Excel File

Approach:

  • Import openpyxl module to read an excel file using the import keyword
  • Pass the excel file path to the open_workbook() function of xlrd module to open a excel file and store it in a variable.
  • Get the Sheet1 of the above workbook object and store it in another variable.
  • Loop till the last row of the worksheet using the for loop and nrows attribute.
  • Print all the row values of the 1st column of the given excel file using the cell_value() function.
  • The Exit of the Program.

Below is the implementation:

# Import openpyxl module to read an excel file using the import keyword
import xlrd

# Pass the excel file path to the open_workbook() function of xlrd 
# module to open a excel file and store it in a variable
workbookObj = xlrd.open_workbook("SampleExcelFile.xlsx")
# Get the Sheet1 of the above workbook object and store it in another variable
worksheet = workbookObj.sheet_by_index(0)

# Loop till the last row of the worksheet using the for loop and nrows attribute
for row in range(worksheet.nrows):
  # Print all the row values of 1st column of the given excel file using
  # the cell_value() function
  print(worksheet.cell_value(row, 0))

Output:

EmployeeId
2121.0
2122.0
2123.0
2124.0
2125.0
2126.0

Method #6: Extracting a Specific Row Values

Approach:

  • Import openpyxl module to read an excel file using the import keyword
  • Pass the excel file path to the open_workbook() function of xlrd module to open a excel file and store it in a variable.
  • Get the Sheet1 of the above workbook object and store it in another variable.
  •  Apply row_values() function on the above worksheet by passing the row number as an argument to it to extract the corresponding row data of the excel file and print it.
  • Here we print the 3rd-row values(as the index starts from 0).
  • The Exit of the Program.

Below is the implementation:

# Import openpyxl module to read an excel file using the import keyword
import xlrd

# Pass the excel file path to the open_workbook() function of xlrd 
# module to open a excel file and store it in a variable
workbookObj = xlrd.open_workbook("SampleExcelFile.xlsx")
# Get the Sheet1 of the above workbook object and store it in another variable
worksheet = workbookObj.sheet_by_index(0)

# Apply row_values() function on the above worksheet by passing the row number as 
# an argument to it to extract the corresponding row data of the excel file and print it
# Here we print the 3rd row values(as index starts from 0) 
print(worksheet.row_values(2))

Output:

[2122.0, 'Sindhu', 50000.0]

With all the necessary information on the best and fastest ways to read the Excel file in Python, provided above, start reading the Excel data in Python. But make sure the Excel files have the extension of.xlsx or.xls for making Pandas extract the data in Python.

Leave a Comment