When working with data in pandas, we often need to remove one or more columns from a DataFrame, typically because they are not relevant to the analysis at hand. Dropping them lets us focus on the columns that remain. There are several ways to do this, and the .drop() method is the most common and flexible.
Columns can be removed by passing label names together with the matching axis, or by naming them directly through the index or columns arguments. With a MultiIndex, setting the level removes labels at a specific level. This article covers these approaches with examples.
First, build a simple DataFrame from a dictionary of lists, with cars, laptops, companies, fruits, and clubs as the column names. The sections that follow use it to demonstrate different techniques for removing specific columns.
# Import pandas package
import pandas as pd
# Build a dictionary with five fields for each entry.
vals = {
    'cars': ['toyota', 'ford', 'mercedes', 'nissan', 'chevrolet'],
    'laptops': ['toshiba', 'apple', 'lenovo', 'ibm', 'chrome-book'],
    'companies': ['microsoft', 'hp', 'google', 'uber', 'amazon'],
    'fruits': ['mango', 'orange', 'pineapple', 'beetroot', 'passion'],
    'clubs': ['chelsea', 'manchester', 'liverpool', 'villa', 'arsenal'],
}
# Dictionary to DataFrame conversion
df = pd.DataFrame(vals)
print(df)
Drop Columns from a Dataframe using the drop() method
The drop() method removes the specified labels from rows or columns. We can drop rows or columns by passing label names with the corresponding axis, or by naming them directly through the index or columns arguments. With a MultiIndex, the level argument selects the level from which labels are removed. In short, .drop() can remove one or more columns from a pandas DataFrame.
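As a minimal sketch of the multi-level case, the snippet below builds a small DataFrame with a two-level column index and drops labels at a chosen level. The level names "category" and "item" and the sample values are made up for illustration and are not part of the article's dataset.

```python
import pandas as pd

# Hypothetical two-level column index: (category, item).
columns = pd.MultiIndex.from_tuples(
    [("vehicles", "cars"), ("vehicles", "bikes"), ("tech", "laptops")],
    names=["category", "item"],
)
df_multi = pd.DataFrame(
    [["toyota", "trek", "toshiba"], ["ford", "giant", "apple"]],
    columns=columns,
)

# Drop every column whose "item" level equals "bikes".
without_bikes = df_multi.drop(columns="bikes", level="item")

# Drop every column that sits under the top-level "vehicles" label.
tech_only = df_multi.drop(columns="vehicles", level="category")

print(without_bikes.columns.tolist())
print(tech_only.columns.tolist())
```

The level can be given by name, as here, or by its integer position.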
Syntax:
The drop() function’s syntax can be described as follows:
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
According to the syntax shown above, the parameters are as follows:
- labels: A single label or a list of labels to drop. Depending on the axis, these are row index values or column names.
- index: An alternative to specifying the axis for rows: index=labels is equivalent to labels, axis=0.
- level: For a MultiIndex, the level from which the labels are removed. It accepts either a level name or a level position.
- axis: Whether to drop labels from the rows (0 or 'index') or the columns (1 or 'columns'). The default is 0, which drops rows.
- columns: An alternative to specifying the axis for columns: columns=labels is equivalent to labels, axis=1. It accepts a single column label or a list of column labels.
- inplace: If False (the default), a new DataFrame is returned. If True, the existing DataFrame is modified and None is returned.
- errors: If 'ignore', errors for missing labels are suppressed and only the existing labels are dropped. The default, 'raise', raises an error.
Returns
- It returns the DataFrame with the specified labels removed, or None if inplace=True. It raises a KeyError if any of the given labels is not found in the selected axis, unless errors='ignore' is set.
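The effect of the errors parameter can be sketched as follows. The column name "stadiums" is a deliberately missing label chosen for illustration, and df_demo is a cut-down stand-in for the article's DataFrame.

```python
import pandas as pd

df_demo = pd.DataFrame({"cars": ["toyota", "ford"], "clubs": ["chelsea", "arsenal"]})

# errors='raise' (the default): a label that does not exist raises KeyError.
try:
    df_demo.drop(columns=["clubs", "stadiums"])
    missing_label_raised = False
except KeyError:
    missing_label_raised = True

# errors='ignore': existing labels are dropped, missing ones are skipped.
tolerant = df_demo.drop(columns=["clubs", "stadiums"], errors="ignore")

print(missing_label_raised)
print(tolerant.columns.tolist())
```

Using errors='ignore' is convenient in pipelines where a column may or may not be present, at the cost of hiding typos in column names.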
With the .drop() method, we can delete a single column or several columns from a DataFrame. The examples below demonstrate each case.
Remove specific single columns
Example 1
# Import pandas package
import pandas as pd
# Build a dictionary with five fields for each entry.
vals = {
    'cars': ['toyota', 'ford', 'mercedes', 'nissan', 'chevrolet'],
    'laptops': ['toshiba', 'apple', 'lenovo', 'ibm', 'chrome-book'],
    'companies': ['microsoft', 'hp', 'google', 'uber', 'amazon'],
    'fruits': ['mango', 'orange', 'pineapple', 'beetroot', 'passion'],
    'clubs': ['chelsea', 'manchester', 'liverpool', 'villa', 'arsenal'],
}
# Dictionary to DataFrame conversion
df = pd.DataFrame(vals)
# Delete the "clubs" column; drop() returns a new DataFrame, so keep the result.
df = df.drop(['clubs'], axis=1)
print(df)
Example 2
In the example below, the ‘age’ column is removed from the DataFrame using df.drop(columns = ‘col name’).
import pandas as pd
employee_dict = {"name": ["Joy", "Green"], "age": [32, 29], "salary": [1185.10, 1077.80]}
# Creation of the DataFrame based on the dict
employee_df = pd.DataFrame(employee_dict)
print(employee_df)
# Dropping the specified column
employee_df = employee_df.drop(columns='age')
print(employee_df)
Utilizing the drop function with an axis of “columns” or 1
Here, we use the axis argument of drop() to remove columns.
With drop(), the axis can refer to rows or columns. The string “columns” or the number 1 designates the column axis. Pass a list of the column names to be removed and set the axis to 1 or “columns.” Starting again from the original employee DataFrame, the example below shows the drop function with axis = “columns” and with axis = 1.
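import pandas as pd
employee_df = pd.DataFrame({"name": ["Joy", "Green"], "age": [32, 29], "salary": [1185.10, 1077.80]})
# Drop two columns by naming the column axis.
result = employee_df.drop(['age', 'salary'], axis='columns')
# The alternative below generates the same result. It is applied to the
# original DataFrame, because dropping the same columns twice raises a KeyError.
result_alt = employee_df.drop(['age', 'salary'], axis=1)
print(result)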
Remove specific multiple columns
The drop() method offers two ways to remove several columns at once. Either pass a list of column names to the columns argument, or pass the list as labels while setting the axis to 1.
Example 1
# Import the pandas package
import pandas as pd
# Build a dictionary with five fields for each entry.
vals = {
    'cars': ['toyota', 'ford', 'mercedes', 'nissan', 'chevrolet'],
    'laptops': ['toshiba', 'apple', 'lenovo', 'ibm', 'chrome-book'],
    'companies': ['microsoft', 'hp', 'google', 'uber', 'amazon'],
    'fruits': ['mango', 'orange', 'pineapple', 'beetroot', 'passion'],
    'clubs': ['chelsea', 'manchester', 'liverpool', 'villa', 'arsenal'],
}
# Dictionary to DataFrame conversion
df = pd.DataFrame(vals)
# Remove the columns named "companies" and "fruits".
df = df.drop(['companies', 'fruits'], axis=1)
# Equivalent: df = df.drop(columns=['companies', 'fruits'])
print(df)
Example 2
import pandas as pd
employee_dict = {"name": ["Joy", "Green"], "age": [32, 29], "salary": [1177.29, 1069.15]}
employee_df = pd.DataFrame(employee_dict)
print(employee_df.columns.values)
# Drop 2 columns at a time
employee_df = employee_df.drop(columns=['age', 'salary'])
print(employee_df.columns.values)
Remove columns based on the column index
Columns can also be chosen by position: indexing df.columns with a list of positions returns the matching labels, which are then passed to drop(). By default the drop is not in place, so pandas returns a new DataFrame and leaves the original untouched. The inplace argument determines whether the existing DataFrame is modified or a modified copy is returned.
If inplace=True, the current DataFrame is updated and nothing is returned. If inplace is False, a new DataFrame with the changes is created and returned. The examples below drop columns in place.
Example 1
# Import the pandas package
import pandas as pd
# Build a dictionary with five fields for each entry.
vals = {
    'cars': ['toyota', 'ford', 'mercedes', 'nissan', 'chevrolet'],
    'laptops': ['toshiba', 'apple', 'lenovo', 'ibm', 'chrome-book'],
    'companies': ['microsoft', 'hp', 'google', 'uber', 'amazon'],
    'fruits': ['mango', 'orange', 'pineapple', 'beetroot', 'passion'],
    'clubs': ['chelsea', 'manchester', 'liverpool', 'villa', 'arsenal'],
}
# Dictionary to DataFrame conversion
df = pd.DataFrame(vals)
# Delete three columns by position (0, 4, and 2), in place.
df.drop(df.columns[[0, 4, 2]], axis=1, inplace=True)
print(df)
Example 2
import pandas as pd
employee_dict = {"name": ["Joy", "Green"], "age": [32, 29], "salary": [1177.29, 1069.15]}
employee_df = pd.DataFrame(employee_dict)
print(employee_df.columns.values)
# Dropping the columns in place
employee_df.drop(columns=['age', 'salary'], inplace=True)
print(employee_df.columns.values)
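To make the difference in return values concrete, here is a small sketch that reuses the employee data. The variable names trimmed and result are introduced only for this illustration.

```python
import pandas as pd

employee_df = pd.DataFrame(
    {"name": ["Joy", "Green"], "age": [32, 29], "salary": [1177.29, 1069.15]}
)

# inplace=False (the default): the original is untouched, a new DataFrame is returned.
trimmed = employee_df.drop(columns="age")
print(employee_df.columns.tolist())
print(trimmed.columns.tolist())

# inplace=True: the original is modified and the call returns None.
result = employee_df.drop(columns="salary", inplace=True)
print(result)
print(employee_df.columns.tolist())
```

A common pitfall follows from this: writing employee_df = employee_df.drop(..., inplace=True) rebinds the name to None, so use either reassignment or inplace=True, not both.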
Using iloc[] and the drop() method to remove columns from a DataFrame
Remove all columns within a range of column positions.
# Import the pandas package
import pandas as pd
# Build a dictionary with five fields for each entry.
vals = {
    'cars': ['toyota', 'ford', 'mercedes', 'nissan', 'chevrolet'],
    'laptops': ['toshiba', 'apple', 'lenovo', 'ibm', 'chrome-book'],
    'companies': ['microsoft', 'hp', 'google', 'uber', 'amazon'],
    'fruits': ['mango', 'orange', 'pineapple', 'beetroot', 'passion'],
    'clubs': ['chelsea', 'manchester', 'liverpool', 'villa', 'arsenal'],
}
# Dictionary to DataFrame conversion
df = pd.DataFrame(vals)
# Remove the columns at positions 1 and 2 (the slice 1:3 excludes position 3).
df.drop(df.iloc[:, 1:3].columns, axis=1, inplace=True)
print(df)
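Using the ix[] indexer and drop() to remove columns from a DataFrame
Remove every column from one column name through another. Note that .ix is an indexer rather than a method; it was deprecated in pandas 0.20.0 and removed in pandas 1.0.0, so on current versions use loc[] instead.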
# Import the pandas package
import pandas as pd
# Build a dictionary with five fields for each entry.
vals = {
    'cars': ['toyota', 'ford', 'mercedes', 'nissan', 'chevrolet'],
    'laptops': ['toshiba', 'apple', 'lenovo', 'ibm', 'chrome-book'],
    'companies': ['microsoft', 'hp', 'google', 'uber', 'amazon'],
    'fruits': ['mango', 'orange', 'pineapple', 'beetroot', 'passion'],
    'clubs': ['chelsea', 'manchester', 'liverpool', 'villa', 'arsenal'],
}
# Dictionary to DataFrame conversion
df = pd.DataFrame(vals)
# Legacy form (pandas < 1.0 only):
# df.drop(df.ix[:, 'laptops':'fruits'].columns, axis=1)
# On current pandas, .loc performs the same label-based slice.
df = df.drop(df.loc[:, 'laptops':'fruits'].columns, axis=1)
print(df)
Using the loc[] and drop() methods to remove columns from a dataframe
Remove every column from one column name through another, with both endpoints included.
# Import the pandas package
import pandas as pd
# Build a dictionary with five fields for each entry.
vals = {
    'cars': ['toyota', 'ford', 'mercedes', 'nissan', 'chevrolet'],
    'laptops': ['toshiba', 'apple', 'lenovo', 'ibm', 'chrome-book'],
    'companies': ['microsoft', 'hp', 'google', 'uber', 'amazon'],
    'fruits': ['mango', 'orange', 'pineapple', 'beetroot', 'passion'],
    'clubs': ['chelsea', 'manchester', 'liverpool', 'villa', 'arsenal'],
}
# Dictionary to DataFrame conversion
df = pd.DataFrame(vals)
# Remove every column from "laptops" through "fruits", inclusive.
df = df.drop(df.loc[:, 'laptops':'fruits'].columns, axis=1)
print(df)
Note that an iloc[] slice excludes its end position, whereas a loc[] label slice includes both endpoints.
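The endpoint behavior can be checked directly. The sketch below uses a one-row DataFrame with the same column names as the article's example; the row values are placeholders.

```python
import pandas as pd

df_cols = pd.DataFrame(
    [["toyota", "toshiba", "microsoft", "mango", "chelsea"]],
    columns=["cars", "laptops", "companies", "fruits", "clubs"],
)

# Positional slice: positions 1 and 2 only, the end position 3 is excluded.
by_position = df_cols.iloc[:, 1:3].columns.tolist()

# Label slice: both "laptops" and "fruits" are included.
by_label = df_cols.loc[:, "laptops":"fruits"].columns.tolist()

print(by_position)
print(by_label)
```

So to drop the same three columns by position, the slice would need to be 1:4.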
Iteratively remove columns from a dataframe
Loop over the column names and delete each column whose name matches a condition.
# Import the pandas package
import pandas as pd
# Build a dictionary with five fields for each entry.
vals = {
    'cars': ['toyota', 'ford', 'mercedes', 'nissan', 'chevrolet'],
    'laptops': ['toshiba', 'apple', 'lenovo', 'ibm', 'chrome-book'],
    'companies': ['microsoft', 'hp', 'google', 'uber', 'amazon'],
    'fruits': ['mango', 'orange', 'pineapple', 'beetroot', 'passion'],
    'clubs': ['chelsea', 'manchester', 'liverpool', 'villa', 'arsenal'],
}
# Dictionary to DataFrame conversion (the dictionary is named vals, not data)
df = pd.DataFrame(vals)
# Iterate over a copy of the column names so deleting is safe during the loop.
for col in list(df.columns):
    if 'cars' in col:
        del df[col]
print(df)
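As an alternative to deleting inside a loop, the matching names can be collected first and removed with a single drop() call. This is only a sketch of that variation; the extra column "car_models" is hypothetical and was added so that the condition matches more than one column.

```python
import pandas as pd

df_loop = pd.DataFrame(
    {
        "cars": ["toyota", "ford"],
        "car_models": ["corolla", "focus"],  # hypothetical extra column
        "clubs": ["chelsea", "arsenal"],
    }
)

# Collect every column whose name contains the substring "car".
to_drop = [col for col in df_loop.columns if "car" in col]

# Remove them all at once; the original DataFrame is left unchanged.
df_filtered = df_loop.drop(columns=to_drop)

print(to_drop)
print(df_filtered.columns.tolist())
```

Building the list first keeps the selection logic separate from the removal and avoids modifying the DataFrame while iterating over it.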
Using the DataFrame.pop() method
The pop() method removes a single column from the DataFrame in place and returns it as a Series.
# Import pandas package
import pandas as pd
# Build a dictionary with five fields for each entry.
vals = {
    'cars': ['toyota', 'ford', 'mercedes', 'nissan', 'chevrolet'],
    'laptops': ['toshiba', 'apple', 'lenovo', 'ibm', 'chrome-book'],
    'companies': ['microsoft', 'hp', 'google', 'uber', 'amazon'],
    'fruits': ['mango', 'orange', 'pineapple', 'beetroot', 'passion'],
    'clubs': ['chelsea', 'manchester', 'liverpool', 'villa', 'arsenal'],
}
# Dictionary to DataFrame conversion
df = pd.DataFrame(vals)
# Remove the "laptops" column in place.
df.pop('laptops')
print(df)
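Because pop() hands back the removed column, it is handy when the column is still needed separately, for example as a target variable. A short sketch with a reduced version of the example data:

```python
import pandas as pd

df_pop = pd.DataFrame(
    {"cars": ["toyota", "ford"], "laptops": ["toshiba", "apple"]}
)

# pop() removes the column from df_pop and returns it as a Series.
laptops = df_pop.pop("laptops")

print(type(laptops).__name__)
print(laptops.tolist())
print(df_pop.columns.tolist())
```

Unlike drop(), pop() accepts only one column label at a time and always modifies the DataFrame it is called on.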
Conclusion
This post has reviewed several ways to remove columns from a pandas DataFrame. The aptly named .drop() method is the most common choice: it removes one or more rows or columns by label, by position via df.columns, or by a slice selected with loc[] or iloc[]. For a single column, del and pop() are concise alternatives, and pop() also returns the removed column.