# How to Check If Any Value is NaN in a Pandas DataFrame

The official documentation for pandas defines what most developers would know as `null`

values as `missing`

or `missing data`

in pandas. Within pandas, a `missing`

value is denoted by `NaN`

.

In most cases, the terms `missing`

and `null`

are interchangeable, but to abide by the standards of pandas, we'll continue using `missing`

throughout this tutorial.

## Evaluating for Missing Data

At the base level, pandas offers two functions to test for `missing`

data, `isnull()`

and `notnull()`

. As you may suspect, these are simple functions that return a `boolean`

value indicating whether the passed in argument value is in fact `missing`

data.

In addition to the above functions, pandas also provides two methods to check for `missing`

data on Series and DataFrame objects. These methods evaluate each object in the **Series** or **DataFrame** and provide a `boolean`

value indicating if the data is `missing`

or not.

For example, let's create a simple **Series** in pandas:

import pandas as pd import numpy as np s = pd.Series([2,3,np.nan,7,"The Hobbit"])

Now evaluating the **Series** `s`

, the output shows each value as expected, including index `2`

which we explicitly set as `missing`

.

In [2]: s Out[2]: 0 2 1 3 2 NaN 3 7 4 The Hobbit dtype: object

To test the `isnull()`

method on this series, we can use `s.isnull()`

and view the output:

In [3]: s.isnull() Out[3]: 0 False 1 False 2 True 3 False 4 False dtype: bool

As expected, the only value evaluated as `missing`

is index `2`

.

## Determine if ANY Value in a Series is Missing

While the `isnull()`

method is useful, sometimes we may wish to evaluate whether any value is `missing`

in a **Series**.

There are a few possibilities involving chaining multiple methods together.

The **fastest** method is performed by chaining `.values.any()`

:

In [4]: s.isnull().values.any() Out[4]: True

In some cases, you may wish to determine **how many** `missing`

values exist in the collection, in which case you can use `.sum()`

chained on:

In [5]: s.isnull().sum() Out[5]: 1

## Count Missing Values in DataFrame

While the chain of `.isnull().values.any()`

will work for a **DataFrame** object to indicate if any value is `missing`

, in some cases it may be useful to also *count* the number of `missing`

values across the entire **DataFrame**. Since **DataFrames** are inherently multidimensional, we must invoke *two* methods of summation.

For example, first we need to create a simple **DataFrame** with a few `missing`

values:

In [6]: df = pd.DataFrame(np.random.randn(5,5)) df[df > 0.9] = pd.np.nan

Now if we chain a `.sum()`

method on, instead of getting the total sum of `missing`

values, we're given a list of all the summations of each `column`

:

In [7]: df.isnull().sum() Out[7]: 0 3 1 0 2 1 3 1 4 0 dtype: int64

We can see in this example, our first column contains three `missing`

values, along with one each in column `2`

and `3`

as well.

In order to get the **total summation** of all `missing`

values in the **DataFrame**, we chain two `.sum()`

methods together:

In [8]: df.isnull().sum().sum() Out[8]: 5