Use Pandas to check whether a cell value is NaN

This article explores how to use Pandas to determine whether a cell value is NaN (np.nan). NaN stands for Not a Number, and Pandas uses numpy.nan to represent it. To check whether the value at a particular position in a Pandas DataFrame is NaN, pass that value to the numpy.isnan() function. To check whether a DataFrame has NaN/None values in any cell, across all rows and columns, use isnull().values.any(). This expression returns True if NaN/None is found in any cell of the DataFrame; otherwise, it returns False.

The pandas documentation describes null values as missing values. Like most developers, we can denote missing or null data in pandas as NaN. NaN, short for Not a Number, is one of the usual ways to mark a value that is missing from a data set. It is usually good practice to check whether a dataframe has any missing data and to replace it with values that make sense, such as an empty string or numeric zero. NaN values are one of the main issues in data analysis because they have side effects on operations: for example, arithmetic involving NaN yields NaN.
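
As a minimal sketch of that check-then-replace practice, the snippet below uses a small made-up DataFrame (the column names score and name are illustrative assumptions, not from a real data set) and fills missing values with zero or an empty string via DataFrame.fillna.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value per column
df = pd.DataFrame({"score": [1.5, np.nan, 3.0],
                   "name": ["a", None, "c"]})

# True if any cell in the dataframe is missing
has_missing = bool(df.isna().values.any())

# Replace missing values column by column: 0 for numbers, "" for text
filled = df.fillna({"score": 0, "name": ""})

print(has_missing)
print(filled)
```

Which replacement value makes sense depends on the data; zero and the empty string are only the two options mentioned above.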

Developers can mark missing values in a dataframe with either NaN or None. Conveniently, pandas treats NaN and None the same way when checking for missing data. If the cell contains NaN or None, pandas.notnull returns False, which tells us the value is missing. In this article, we examine various techniques to determine whether a specific cell value is null (NaN or None). In other words, we want to find which values in a pandas DataFrame are NaN.
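
The equal treatment of NaN and None can be seen in a short sketch. The sample values below are arbitrary and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Scalars: NaN and None count as missing, ordinary values do not
checks = {repr(v): pd.isna(v) for v in (np.nan, None, 0, "")}
print(checks)

# Inside a numeric Series, None is stored as NaN
s = pd.Series([1.0, None, np.nan])
missing_mask = s.isna().tolist()
present_mask = s.notnull().tolist()

print(missing_mask)
print(present_mask)
```

Note that zero and the empty string are real values, so neither is reported as missing.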

Using Pandas to check whether a cell value is NaN

The approaches we’ll discuss are:

  • isnull
  • isnan
  • isna
  • notnull

Let’s go over each technique in more depth.

Using the isnull function

In this method, we use the isnull() function to determine whether a given cell contains a NaN value.

# python isnull.py

import pandas as pd
import numpy as np

vals = {'x': [6,7,8,9,10,np.nan,11,12,np.nan,13,14,15,np.nan],
        'y': [16,17,np.nan,18,19,np.nan,20,21,np.nan,np.nan,22,np.nan,24]}
df = pd.DataFrame(vals)

print(df)

# Check the single cell at row position 5, column position 0
nan_in_df = pd.isnull(df.iloc[5, 0])
print(nan_in_df)

We import the numpy and pandas libraries. We then build a dictionary with the keys x and y whose values include some np.nan entries, convert the dictionary to a dataframe, and print it.

We use the isnull function to determine whether the value of one specific cell, [5, 0], is null. Note that DataFrame.isnull() takes no arguments and returns a mask for the whole dataframe, so for a single cell we pass the value to pd.isnull() instead. In this instance, we are checking a single dataframe cell rather than the entire dataframe. As a result, the program prints True. The first number, 5, is the row position, and the second number, 0, is the column position.
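
Positions and labels coincide in a dataframe with the default index, but they differ once the rows carry their own labels. The following sketch, which assumes hypothetical row labels r0 to r2, contrasts positional access (iloc) with label access (at), and also shows picking a cell out of the full isnull() mask.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string row labels so positions and labels differ
df = pd.DataFrame({"x": [6.0, np.nan, 8.0]}, index=["r0", "r1", "r2"])

by_position = pd.isnull(df.iloc[1, 0])   # row position 1, column position 0
by_label = pd.isnull(df.at["r1", "x"])   # row label "r1", column label "x"
from_mask = df.isnull().iloc[1, 0]       # build the full mask, then pick one cell

print(by_position, by_label, from_mask)
```

All three expressions inspect the same cell, so they agree.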

Using the isnan() technique

In the example above, we checked for a NaN value with the isnull function. We will now use a different technique, isnan. This function belongs to numpy, not to the dataframe. The program below checks only the specified cell.

# We can also check a dataframe cell for NaN with numpy.
import pandas as pd
import numpy as np

vals = {'x': [6,7,8,9,10,np.nan,11,12,np.nan,13,14,15,np.nan],
        'y': [16,17,np.nan,18,19,np.nan,20,21,np.nan,np.nan,22,np.nan,24]}
df = pd.DataFrame(vals)
print(df)
value = df.at[5, 'x']  # nan
isNaN = np.isnan(value)
print("Is value at df[5, 'x'] NaN :", isNaN)

As before, we build a dictionary with the keys x and y whose values include some np.nan entries, convert the dictionary to a dataframe, and print it.

We select the cell by index label and column name, [5, ‘x’], and assign its value to the variable value. The first element, 5, is the index label of the row, and the second, ‘x’, is the column name. We then check whether the value is NaN with np.isnan. Finally, we print the result, which is True because the cell holds NaN.
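
One caveat is that numpy.isnan() only accepts numeric input and raises a TypeError for values such as strings or None, whereas pandas.isna() accepts them. The sketch below illustrates the difference; the helper name numpy_isnan_or_error is hypothetical and exists only for this comparison.

```python
import numpy as np
import pandas as pd

def numpy_isnan_or_error(value):
    """Hypothetical helper: np.isnan(value), or 'TypeError' if numpy rejects the input."""
    try:
        return bool(np.isnan(value))
    except TypeError:
        return "TypeError"

samples = [np.nan, 1.0, "text", None]
numpy_results = [numpy_isnan_or_error(v) for v in samples]
pandas_results = [bool(pd.isna(v)) for v in samples]

print(numpy_results)
print(pandas_results)
```

For cells that may hold text or None, pandas.isna() is therefore the more forgiving choice.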

Using isnan to determine a series’ cell NaN values

In the preceding example, we looked for a NaN value in a dataframe cell. We can also determine whether a cell value in a pandas Series is NaN. Let’s see how to put that into practice.

# We can also check a Series element for NaN.
import pandas as pd
import numpy as np

series_vals = pd.Series([2,3,np.nan,7,25])

print(series_vals)
value = series_vals[2]  # nan
isNaN = np.isnan(value)

print("Is value at series_vals[2] NaN :", isNaN)

We start by creating the pandas Series shown in the code block above. We then assign the cell value we want to check to another variable. Finally, we determine whether that variable’s value is NaN.
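
When more than one element matters, the whole Series can be checked at once. This is a small sketch using the same values as above; the variable names mask and nan_labels are illustrative.

```python
import numpy as np
import pandas as pd

series_vals = pd.Series([2, 3, np.nan, 7, 25])

mask = series_vals.isna()                      # one boolean per element
nan_labels = series_vals.index[mask].tolist()  # index labels where the value is NaN

print(mask.tolist())
print(nan_labels)
```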

Utilizing pandas.isna

Another approach is to use the pandas.isna function to determine whether a specific dataframe cell value is null.

vals = {'x': [6,7,8,9,10,np.nan,11,12,np.nan,13,14,15,np.nan],
        'y': [16,17,np.nan,18,19,np.nan,20,21,np.nan,np.nan,22,np.nan,24]}
df = pd.DataFrame(vals)

print(df)

print("checking NaN value in cell [5, 0]")
print(pd.isna(df.iloc[5, 0]))

Using np.nan, we build a dictionary with the keys x and y and their values. We then convert the dictionary to a dataframe and print it.

We determine whether the value of cell [5, 0] is NaN. The first value, 5, is the row position, and the second value, 0, is the column position. Finally, we print the result, True, which shows that the cell holds NaN.

Using the pandas.notnull method

The pandas.notnull function is the inverse check: it returns True when a value is present and False when it is missing.

vals = {'x': [6,7,8,9,10,np.nan,11,12,np.nan,13,14,15,np.nan],
        'y': [16,17,np.nan,18,19,np.nan,20,21,np.nan,np.nan,22,np.nan,24]}
df = pd.DataFrame(vals)

print(df)

print("checking for a non-NaN value in cell [5, 0]")
print(pd.notnull(df.iloc[5, 0]))

Using np.nan, we build a dictionary with the keys x and y and their values. We then convert the dictionary to a dataframe and print it.

We determine whether the value in cell [5, 0] is not NaN. The first value, 5, is the row position, and the second value, 0, is the column position. The result we print is False: the cell holds NaN, so the check for a non-null value fails.
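
To make the relationship between the two checks concrete, the sketch below compares the notnull() and isna() masks for the same dataframe and then uses notnull() to keep only the values that are present in column x. The names masks_agree and present_x are illustrative.

```python
import numpy as np
import pandas as pd

vals = {'x': [6,7,8,9,10,np.nan,11,12,np.nan,13,14,15,np.nan],
        'y': [16,17,np.nan,18,19,np.nan,20,21,np.nan,np.nan,22,np.nan,24]}
df = pd.DataFrame(vals)

# notnull() is the element-wise negation of isna()
masks_agree = bool((df.notnull() == ~df.isna()).all().all())

# Keep only the rows where column x holds a real value
present_x = df.loc[df['x'].notnull(), 'x']

print(masks_agree)
print(len(present_x))
```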

Example: Iteratively check if Cell Value is NaN in a Pandas DataFrame

In this example, we’ll use a DataFrame that has NaN values in a few places. We iterate over every cell of this DataFrame and check whether its value is NaN.

import pandas as pd
import numpy as np

df = pd.DataFrame(
    [[np.nan, 92, 87],
     [43, 98, 82],
     [52, 94, np.nan],
     [np.nan, 74, 96]])

for i in range(df.shape[0]): #iterate over rows
    for j in range(df.shape[1]): #iterate over columns
        value = df.at[i, j] #get cell value
        print(np.isnan(value), end="\t")
    print()
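
The same True/False grid can also be produced without an explicit loop. The following is a sketch of a vectorized alternative, assuming the same data as above; it additionally lists the [row, column] positions of the NaN cells with numpy.argwhere.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[np.nan, 92, 87],
     [43, 98, 82],
     [52, 94, np.nan],
     [np.nan, 74, 96]])

mask = df.isna()                                        # boolean grid, one entry per cell
nan_positions = np.argwhere(mask.to_numpy()).tolist()   # [row, column] pairs of NaN cells

print(mask)
print(nan_positions)
```

The loop is useful for seeing what happens cell by cell, while the mask is usually the more idiomatic choice for whole-dataframe checks.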

Example: Verify whether a cell value in a Pandas dataframe is NaN

In this example, we’ll use a DataFrame that has NaN values in a few places. We’ll determine whether certain values are NaN or not.

import pandas as pd
import numpy as np

vals = pd.DataFrame(
	[[np.nan, 92, 87],
	[43, 98, 82],
	[52, 94, np.nan],
	[np.nan, 74, 96]],
	columns=['a', 'b', 'c'])

value = vals.at[0, 'a']  # nan
isNaN = np.isnan(value)
print("Is value at vals[0, 'a'] NaN :", isNaN)

value = vals.at[0, 'b']  # 92
isNaN = np.isnan(value)
print("Is value at vals[0, 'b'] NaN :", isNaN)
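
A dedicated function is needed here because NaN never compares equal to anything, including itself, so a test such as value == np.nan is always False. The sketch below demonstrates this on a shortened version of the dataframe above and shows the standard-library math.isnan() as another option for float cells.

```python
import math
import numpy as np
import pandas as pd

vals = pd.DataFrame(
    [[np.nan, 92, 87],
     [43, 98, 82]],
    columns=['a', 'b', 'c'])

value = vals.at[0, 'a']  # nan

equals_nan = bool(value == np.nan)        # False: NaN never compares equal
differs_from_self = bool(value != value)  # True only for NaN
via_math = math.isnan(value)              # standard-library check for floats

print(equals_nan, differs_from_self, via_math)
```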

Example: Using isnull().values.any() method

# begin by importing the libraries below
import pandas as pd
import numpy as np


vals = {'Integers': [13, 18, 33, 43, 58, np.nan,
					78, np.nan, 93, 153, np.nan]}

# Create the dataframe
val_df = pd.DataFrame(vals, columns=['Integers'])

# Check whether the column contains any NaN value
validate_nan = val_df['Integers'].isnull().values.any()

# Print the result
print(validate_nan)

Additionally, it is feasible to determine precisely where NaN values are located. We can achieve this by removing .values.any() from isnull().values.any() as follows:

val_df['Integers'].isnull()
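
The boolean result of isnull() can in turn be used as a mask to list the affected rows. This is a minimal sketch that assumes the same Integers column as above; the names nan_rows and rows_with_nan are illustrative.

```python
import numpy as np
import pandas as pd

val_df = pd.DataFrame({'Integers': [13, 18, 33, 43, 58, np.nan,
                                    78, np.nan, 93, 153, np.nan]})

mask = val_df['Integers'].isnull()
nan_rows = val_df.index[mask].tolist()   # row labels that hold NaN
rows_with_nan = val_df[mask]             # the matching rows themselves

print(nan_rows)
print(rows_with_nan)
```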

Example: Using isnull().sum().sum() Method

# start by importing the libraries
import pandas as pd
import numpy as np

vals = {'Integers_1': [31, 36, 51, 61, 76, np.nan, 96,
					np.nan, 111, 171, np.nan],
		'Integers_2': [np.nan, 42, 43, 44, np.nan, 45, 46,
					np.nan, 47, np.nan, np.nan]}

# Establish the dataframe.
val_df = pd.DataFrame(vals, columns=['Integers_1', 'Integers_2'])

# Count the NaN values: the first sum() counts per column, the second adds the columns up
val_nan_in_df = val_df.isnull().sum().sum()

# Show the total number of NaN values in the entire dataframe
print('Count of NaN values present: ' + str(val_nan_in_df))
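
The intermediate results of that chain are useful on their own. As a sketch on the same data, the snippet below keeps the per-column counts and also counts rows that contain at least one NaN or only NaN values; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

vals = {'Integers_1': [31, 36, 51, 61, 76, np.nan, 96,
                       np.nan, 111, 171, np.nan],
        'Integers_2': [np.nan, 42, 43, 44, np.nan, 45, 46,
                       np.nan, 47, np.nan, np.nan]}
val_df = pd.DataFrame(vals)

nan_mask = val_df.isnull()
per_column = nan_mask.sum()                      # NaN count for each column
total = int(per_column.sum())                    # NaN count for the whole dataframe
rows_with_any = int(nan_mask.any(axis=1).sum())  # rows with at least one NaN
rows_all_nan = int(nan_mask.all(axis=1).sum())   # rows where every cell is NaN

print(per_column)
print(total, rows_with_any, rows_all_nan)
```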

Example: Utilizing the isnull().sum() Method

# begin by importing the libraries
import pandas as pd
import numpy as np


vals = {'Integers': [13, 18, 33, 43, 58, np.nan,
					78, np.nan, 93, 153, np.nan]}

# Create the dataframe
val_df = pd.DataFrame(vals, columns=['Integers'])

# Count the NaN values in the column
var_count_nan = val_df['Integers'].isnull().sum()

# Print the number of NaN values in the column
print('Count of the NaN values present: ' + str(var_count_nan))
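
A related sketch, assuming the same column, compares the NaN count with Series.count(), which counts only the values that are present, and derives the share of missing values from the mean of the boolean mask.

```python
import numpy as np
import pandas as pd

val_df = pd.DataFrame({'Integers': [13, 18, 33, 43, 58, np.nan,
                                    78, np.nan, 93, 153, np.nan]})
column = val_df['Integers']

nan_count = int(column.isnull().sum())    # number of missing values
present_count = int(column.count())       # count() skips NaN
nan_share = float(column.isnull().mean()) # fraction of missing values

print(nan_count, present_count, round(nan_share, 3))
```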

Conclusion

Sometimes we only need to know about a single cell value rather than the entire dataframe, so this article has covered a variety of approaches to determine whether a certain cell value is NaN or None. In particular, we learned how to use the numpy.isnan() function to determine whether a particular cell value in Pandas is NaN.

We have seen both the pandas and numpy techniques for checking missing values. Most of the examples avoid an iteration loop and rely on direct cell access to keep the approach straightforward. Further, if you wish to check the entire dataframe, methods such as isnull().values.any() and isnull().sum() are quick to execute.
