Pandas Get Index Values

When examining real datasets, which are frequently very large, we may need to retrieve the row labels (the index) to carry out specific operations. In a dataframe, the index refers to the row labels, whereas the column names form the column index. Most of the time, we use the index to look up or store data within a dataframe. But by using the .index property, we can also retrieve the index itself.

In this lesson, we’ll show you how to retrieve an index as a list object, transform the index into a dataframe column, and use several conditions and the index property of pandas to extract the index.

Let’s talk about retrieving row names from a Pandas dataframe.

How Does Pandas Get Index Values?

The index of a Pandas DataFrame can be found using the DataFrame.index attribute. The DataFrame.index property returns an Index object that holds the DataFrame’s row labels. Such an object can also be built directly with the pandas.Index constructor.

Syntax:

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>pandas.Index(data = None, dtype = None, copy = False, name = None, tupleize_cols = True, **kwargs)</pre>
</div>

From the above syntax, we can extrapolate the following information:

  • data: array-like (one-dimensional)
  • dtype: a NumPy dtype. If the dtype is None, pandas selects the dtype that best fits the data, falling back to “object” when nothing more specific applies. If a dtype is specified, the data is coerced to it when that is safe; otherwise, an error is raised.
  • copy: a boolean that indicates whether to make a copy of the provided ndarray.
  • name: an object giving the name to be stored in the index.
  • tupleize_cols: a boolean that is True by default. When True, pandas attempts to create a MultiIndex if possible.
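
As a minimal sketch of the parameters above, the snippet below builds an Index directly. The values and the name "row_id" are made up for illustration and do not come from the examples in this lesson.

```python
import pandas as pd

# Build an Index directly; the values and the name "row_id" are illustrative assumptions
idx = pd.Index([10, 20, 30], dtype="int64", name="row_id")

print(idx.name)      # the name stored on the index
print(idx.dtype)     # the dtype requested above
print(idx.tolist())  # the labels as a plain Python list
```

In everyday use we rarely call the constructor ourselves, because pandas builds the index when a dataframe is created.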

The examples below show several ways to obtain the dataframe’s index.

Example #1: Using the index Property to Get the Dataframe Row Index

First, we construct a dataframe with several rows to show how to obtain the row index using the pandas index property. We import the pandas module first so that we can use its functions to build the dataframe.

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>import pandas as pd

# Each tuple holds a student's name and math score
math_score = [('Joy', 78),
	('Bright', 64),
	('Green', 97),
	('Winy', 35),
	('Sally', 69),
	('Ken', 54),
	('Mercy', 28),
	('Evans', 58)
]
df = pd.DataFrame(math_score, columns=['name', 'score'])
print(df)</pre>
</div>

We have built our dataframe by passing a list of tuples to the pd.DataFrame() function. Our dataframe has two columns: name and score. The names of a few arbitrary students are stored in the “name” column: “Joy,” “Bright,” “Green,” “Winy,” “Sally,” “Ken,” “Mercy,” and “Evans.” The math score of each student is listed in the “score” column (78, 64, 97, 35, 69, 54, 28, 58). Each row has an index value that the pandas constructor creates automatically. We will now extract this index using the index attribute.

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>df.index

Out[7]: RangeIndex(start=0, stop=8, step=1)</pre>
</div>

The output shows that the row labels begin at 0, increase by 1, and stop before 8, so the last label is 7, which matches the eight rows of the dataframe. We can also loop over the index and use the print() function to output each index value.

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>row_index = df.index
for label in row_index:
	print(label)</pre>
</div>

All the values from index 0 to 7 have been printed.
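
The index can also be turned into a list or into a regular column. The sketch below assumes a shortened three-row version of the dataframe and uses tolist() and reset_index(); the variable names are illustrative.

```python
import pandas as pd

# A shortened version of the student dataframe (assumption: three rows only)
df = pd.DataFrame({"name": ["Joy", "Bright", "Green"], "score": [78, 64, 97]})

# Retrieve the index as a plain Python list
index_list = df.index.tolist()
print(index_list)  # [0, 1, 2]

# Turn the index into a regular column; for an unnamed index the new column is called "index"
df_with_column = df.reset_index()
print(df_with_column.columns.tolist())  # ['index', 'name', 'score']
```

reset_index() returns a new dataframe, so the original df keeps its index unchanged.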

Using a Condition, Extract the Dataframe Row Index

A condition can be used to retrieve the index values. The index property will retrieve the dataframe’s index values that fall under the criteria provided. Then, to return the retrieved values as a list, we will utilize the tolist() function. Let’s first make our dataframe using the pd.DataFrame() function.

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>import pandas as pd

df = pd.DataFrame({
	'laptop':['dell','hp','ibm','toshiba','lenovo','chrome-book','mac','tecra'],
	'cost':[250,300,450,185,300,411,650,300],
	'code':['d','h','i','t','l','c','m','t']
})
</pre>
</div>

We have produced a dataframe by passing a Python dictionary to the pd.DataFrame() function. Our dataframe has eight rows, labeled 0 to 7, and three columns. The “laptop” column stores the strings ‘dell’, ‘hp’, ‘ibm’, ‘toshiba’, ‘lenovo’, ‘chrome-book’, ‘mac’, and ‘tecra’. The “cost” column contains the price of each laptop (250, 300, 450, 185, 300, 411, 650, 300). The “code” column holds the values (‘d’, ‘h’, ‘i’, ‘t’, ‘l’, ‘c’, ‘m’, ‘t’).

Let’s use the script below to retrieve the index values right away.

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>i = df.index
index = df["cost"] ==300
result =i[index]

result.tolist()</pre>
</div>

Using the index attribute, we obtained the index of the “df” dataframe. Then, we defined a condition that selects only the rows whose cost value equals 300, which produces a boolean mask. Indexing the index with that mask keeps the index values of the rows that meet the condition. Finally, the output is converted into a list object using the tolist() function. The function returns a list with three index values: [1, 4, 7].
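
The same lookup can be written in one line, and several conditions can be combined with & (and) or | (or). The sketch below assumes the cost values listed above; the price range in the second lookup is an arbitrary choice for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "laptop": ["dell", "hp", "ibm", "toshiba", "lenovo", "chrome-book", "mac", "tecra"],
    "cost": [250, 300, 450, 185, 300, 411, 650, 300],
})

# Single condition: index values of the rows whose cost equals 300
matches = df.index[df["cost"] == 300].tolist()
print(matches)  # [1, 4, 7]

# Two conditions: each one needs its own parentheses when combined with &
in_range = df.index[(df["cost"] > 200) & (df["cost"] < 420)].tolist()
print(in_range)  # [0, 1, 4, 5, 7]
```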

Using the get_loc() Function to Extract the Dataframe Column Index

We learned how to get the values of the row indexes in a dataframe. But we can also get the contents of the column indexes in a dataframe. The get_loc() function can obtain any dataframe column’s index value. We only need to pass the column label to the get_loc() function to get the index. To access the index position or index value, let’s construct a dataframe with multiple columns.

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>import pandas as pd

df = pd.DataFrame({
	'department':['IT','IT','HR','HR','Sales','Sales','Transport','Transport'],
	'participants':[10,9,10,10,9,11,9,9],
	'allowance':[450,300,400,380,110,520,480, 260],
	'housing':[11450,10300,2400,9380,7110,6520,3480, 1260]
	
})

print(df)</pre>
</div>

We have established four columns in our dataframe: department, participants, allowance, and housing. The department column stores the labels ‘IT’, ‘HR’, ‘Sales’, and ‘Transport’. The values in the participants column are (10, 9, 10, 10, 9, 11, 9, 9). The numbers in the allowance and housing columns are (450, 300, 400, 380, 110, 520, 480, 260) and (11450, 10300, 2400, 9380, 7110, 6520, 3480, 1260), respectively. Every column must contain the same number of values, otherwise pd.DataFrame() raises a ValueError. Let’s say we need to determine the allowance column’s index value:

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>df.columns.get_loc("allowance")
</pre>
</div>

The function has located the provided column’s index, which is 2.
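
Once the position is known, it can be used with iloc to select the column by position. The sketch below assumes a shortened three-row version of the dataframe; the variable names are illustrative.

```python
import pandas as pd

# Shortened version of the dataframe (assumption: three rows, three columns)
df = pd.DataFrame({
    "department": ["IT", "HR", "Sales"],
    "participants": [10, 9, 11],
    "allowance": [450, 300, 520],
})

# Integer position of the "allowance" column
position = df.columns.get_loc("allowance")
print(position)  # 2

# Use the position to select the column with iloc
allowance_values = df.iloc[:, position].tolist()
print(allowance_values)  # [450, 300, 520]

# Map every column label to its position
positions = {name: df.columns.get_loc(name) for name in df.columns}
print(positions)  # {'department': 0, 'participants': 1, 'allowance': 2}
```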

Using the get_loc() Function to Extract the Values of the Specified Row Indexes

If labels are supplied for the row index, we can also use the get_loc() function to retrieve the integer position of a row label. We specify the labels by passing a list with one name per row as the index parameter. Let’s update the dataframe we prepared in the previous example by adding the index labels.

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>import pandas as pd

df = pd.DataFrame({
	'department':['IT','IT','HR','HR','Sales','Sales','Transport','Transport'],
	'participants':[10,9,10,10,9,11,9,9],
	'allowance':[450,300,400,380,110,520,480, 260],
	'housing':[11450,10300,2400,9380,7110,6520,3480, 1260]},
	index =['I1','I2','I3','I4','I5','I6','I7','I8'])
	
print(df)
</pre>
</div>

We have supplied a list of labels from I1 to I8 as the index parameter inside the pd.DataFrame() function. The default integer index of the dataframe has been replaced with the labels “I1,” “I2,” “I3,” “I4,” “I5,” “I6,” “I7,” and “I8.” Let’s now locate the index position for a certain label.

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>print(df.index.get_loc("I8"))</pre>
</div>

We first access the index property of the dataframe to obtain its index. The get_loc() function then returns the integer position of the supplied row label. Because positions start at 0, the label “I8,” which is the eighth label, is at position 7.
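
A minimal sketch of the same lookup on a shortened dataframe follows. It also shows that get_loc() raises a KeyError when the label is missing; the label "I9" is a made-up example of a missing label.

```python
import pandas as pd

# Shortened version of the labeled dataframe (assumption: three rows only)
df = pd.DataFrame({"allowance": [450, 300, 400]}, index=["I1", "I2", "I3"])

# Integer position of the row labeled "I3"
pos = df.index.get_loc("I3")
print(pos)  # 2

# The position can be passed to iloc to fetch that row
row = df.iloc[pos]
print(row["allowance"])  # 400

# A label that is not in the index raises KeyError
try:
    df.index.get_loc("I9")
    found = True
except KeyError:
    found = False
print(found)  # False
```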

Extracting Row Index Values Using NumPy’s where() Function

The where() function of NumPy allows us to obtain the index values by defining a condition inside of it. To use NumPy’s functions, we first import both the pandas and NumPy packages and then create a dataframe.

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre># First import the Pandas library
import pandas as pd

# secondly import NumPy library
import numpy as np

df = pd.DataFrame({
	'item':['01','02','03','04','05','06','07'],
	'cost':[150,200,180,250,170,220,170],
	'discount':[20,30,0,10,50,0,20]
})

print(df)</pre>
</div>

Our dataframe is built after the necessary libraries have been imported. We have three columns in our dataframe (item, cost, and discount). The data values (’01’,’02’,’03’,’04’,’05’,’06’,’07’), (150,200,180,250,170,220,170), and (20,30,0,10,50,0,20), respectively, are stored in the columns item, cost, and discount. Let’s use the where() method inside the list() function to get the row index value.

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>list(np.where(df["discount"] > 0))</pre>
</div>

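We passed a condition to the where() function so that it returns the positions of the rows where the value in the “discount” column is greater than zero. When called with only a condition, where() returns a tuple of arrays, one per dimension, so wrapping it in list() gives a list that holds a single array with the positions 0, 1, 3, 4, and 6.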
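
To get a flat list of integers instead, we can take the first element of the tuple before converting it. This is a small sketch using the same discount values; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "item": ["01", "02", "03", "04", "05", "06", "07"],
    "discount": [20, 30, 0, 10, 50, 0, 20],
})

# np.where returns a tuple; element 0 holds the row positions
positions = np.where(df["discount"] > 0)[0]
print(positions.tolist())  # [0, 1, 3, 4, 6]

# Positions equal the labels only for a default integer index;
# df.index[positions] gives the labels in the general case
labels = df.index[positions].tolist()
print(labels)  # [0, 1, 3, 4, 6]
```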

Example: Getting the Index and Values of a Series

<div class="wp-block-codemirror-blocks-code-block code-block">
<pre>import pandas as pd

# creation of  a series
new_series = pd.Series({97:'a', 98:'b', 99:'c', 100:'d', 101:'e', 102:'f'})
print(new_series)

# Finding values and index information
index  = new_series.index
values = new_series.values

print('')
# showing the outputs
print(index)
print(values)</pre>
</div>

We used a Python dictionary with integer keys and string values to create a pandas Series. The index attribute of the series returns an Index object holding the labels, and the values attribute returns a NumPy ndarray holding the data. We store the results of new_series.index and new_series.values in the index and values variables, respectively. Finally, we use the print function to output the results.

The first block of the output shows the pandas Series that was made from the Python dictionary, with each value next to its label. The second block displays the index and the values separately. Each output’s data type is also visible: in this instance, the values have the object dtype, and the index has the int64 dtype.
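
If plain Python lists are more convenient than an Index object and an ndarray, both can be converted with tolist(). The sketch below assumes a shortened version of the series; the variable names are illustrative.

```python
import pandas as pd

# Shortened version of the series (assumption: three entries only)
new_series = pd.Series({97: "a", 98: "b", 99: "c"})

# Labels and values as plain Python lists
labels = new_series.index.tolist()
values = new_series.tolist()
print(labels)  # [97, 98, 99]
print(values)  # ['a', 'b', 'c']

# items() yields (label, value) pairs when both are needed together
pairs = list(new_series.items())
print(pairs)  # [(97, 'a'), (98, 'b'), (99, 'c')]
```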

Conclusion

A pandas Series contains labeled data, and we can use the labels to access the elements of the series and manipulate the data. Occasionally, though, we need to obtain the labels and the values separately. In that case, we can treat the labels as the index and the data as the values, and read them through the Series object’s index and values attributes.

This article explained how to use Pandas to retrieve dataframe index values. We used various functions to extract the row and column index of the dataframe. We also used several examples to demonstrate how to use the index property, conditions, and NumPy’s where() function to extract the dataframe row index. Additionally, we discussed how to use the get_loc() function to retrieve the position of a column label or a row label.
