How To Iterate Over Pandas DataFrame

Iteration is a basic operation in most programming languages. In Python, for example, a for loop iterates over a list. How does Pandas iterate over Series and DataFrame objects? The two are different data structures, so they are traversed differently. This article explains how iteration differs between Series and DataFrame and shows examples of traversing a DataFrame by column and by row.

1. How To Iterate Over Pandas DataFrame Columns.

  1. A Series object can be traversed like a one-dimensional array. A DataFrame object has a two-dimensional table structure, and traversing it is similar to traversing a Python dictionary.
  2. Both are traversed with a for loop. Iterating over a Series yields its values directly, while iterating over a DataFrame yields its column labels. Each label can then be used to get the corresponding Series object. Below is an example.
    import pandas as pd
    
    import numpy as np
    
    def create_example_dataframe_object():
        
        # create a 1 dimensional array with numbers.
        array = np.arange(15)
        print('the original array : ')
        print(array)
        
        print('\r\n')
        
        # convert the  1 dimensional array to a 2 dimensional array that has 5 rows and 3 columns.
        array_5_rows_3_columns = array.reshape(5, 3)
        
        print('reshape the original array to 5 rows & 3 columns.\n\r')
        
        print(array_5_rows_3_columns)
        
        print('\r\n')
        
        # create the DataFrame object based on the above 2D array.
        df = pd.DataFrame(array_5_rows_3_columns, columns=['python', 'java', 'javascript'])
        
        print('the DataFrame object created by the above 2 dimensional array : \r\n')
        print(df)
        
        print('\r\n')
        
        return df
    
    
    def pandas_dataframe_iterate_by_column():
        
        df_obj = create_example_dataframe_object()
        
        for col in df_obj:
            
            # get the column label related Series object.
            value = df_obj[col]
            
            print('\ncol: ', col)
            
            print('type(col): ', type(col))
            
            print('df_obj[col]: ', value)
            
            print('type(df_obj[col]): ', type(value))
            
             
    
    if __name__ == '__main__':
        
        pandas_dataframe_iterate_by_column()
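  3. Below is the execution result of the above example. The dtype may be shown as int64 instead of int32, depending on the platform and NumPy version.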
    the original array : 
    [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
    
    
    
    reshape the original array to 5 rows & 3 columns.
    
    
    [[ 0  1  2]
     [ 3  4  5]
     [ 6  7  8]
     [ 9 10 11]
     [12 13 14]]
    
    
    
    the DataFrame object created by the above 2 dimensional array : 
    
    
       python  java  javascript
    0       0     1           2
    1       3     4           5
    2       6     7           8
    3       9    10          11
    4      12    13          14
    
    
    
    
    col:  python
    type(col):  <class 'str'>
    df_obj[col]:  0     0
    1     3
    2     6
    3     9
    4    12
    Name: python, dtype: int32
    type(df_obj[col]):  <class 'pandas.core.series.Series'>
    
    col:  java
    type(col):  <class 'str'>
    df_obj[col]:  0     1
    1     4
    2     7
    3    10
    4    13
    Name: java, dtype: int32
    type(df_obj[col]):  <class 'pandas.core.series.Series'>
    
    col:  javascript
    type(col):  <class 'str'>
    df_obj[col]:  0     2
    1     5
    2     8
    3    11
    4    14
    Name: javascript, dtype: int32
    type(df_obj[col]):  <class 'pandas.core.series.Series'>
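
The difference described in this section between iterating over a Series and iterating over a DataFrame can be sketched in a few lines. This is only a minimal illustration. The sample data and the variable names (series, df, series_items, column_labels, column_types) are assumptions made for this sketch and are not part of the example above.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
series = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
df = pd.DataFrame({'python': [0, 3], 'java': [1, 4]})

# Iterating over a Series yields its values, not its index labels.
series_items = [value for value in series]

# Iterating over a DataFrame yields its column labels, like dict keys.
column_labels = [col for col in df]

# Each label can then be used to look up the column as a Series.
column_types = [type(df[col]).__name__ for col in df]

print(series_items)   # [10, 20, 30]
print(column_labels)  # ['python', 'java']
print(column_types)   # ['Series', 'Series']
```

Note that the Series loop never sees the index labels 'a', 'b', 'c', while the DataFrame loop never sees the cell values until a column is looked up by its label.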

2. How To Iterate Over Pandas DataFrame Rows.

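  1. Pandas DataFrame provides 3 methods for iterating over its contents.
  2. items(): iterate over the columns as (column label, Series) pairs. This method was named iteritems() in older versions. iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0.
  3. iterrows(): iterate over the rows as (row_index, row) pairs, where each row is a Series.
  4. itertuples(): iterate over the rows as named tuples.
  5. The example below shows how to use these 3 methods.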
    import pandas as pd
    
    import numpy as np
    
    def pandas_dataframe_iterate_by_row():
        
        # create_example_dataframe_object() is defined in the example in section 1.
        df_obj = create_example_dataframe_object()
        
        
        print('---------- dataframe iteritems() example ----------')
        
        # items() is the current name of iteritems(), which was removed in pandas 2.0.
        for key, value in df_obj.items():
       
            print ('\nkey: ', key) 
            
            print ('value: ', value) 
         
            
        print('\n---------- dataframe iterrows() example ----------')
            
        for row_index,row in df_obj.iterrows():
                
            print ('\nrow_index: ', row_index) 
            
            print ('row: ', row)  
                 
            
        print('\n---------- dataframe itertuples() example ----------')
            
        for row in df_obj.itertuples():
            
            print(row)       
             
             
    
    if __name__ == '__main__':
        
        pandas_dataframe_iterate_by_row()
  6. Below is the execution output of the above example.
    the original array : 
    [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
    
    
    
    reshape the original array to 5 rows & 3 columns.
    
    
    [[ 0  1  2]
     [ 3  4  5]
     [ 6  7  8]
     [ 9 10 11]
     [12 13 14]]
    
    
    
    the DataFrame object created by the above 2 dimensional array : 
    
    
       python  java  javascript
    0       0     1           2
    1       3     4           5
    2       6     7           8
    3       9    10          11
    4      12    13          14
    
    
    
    ---------- dataframe iteritems() example ----------
    
    key:  python
    value:  0     0
    1     3
    2     6
    3     9
    4    12
    Name: python, dtype: int32
    
    key:  java
    value:  0     1
    1     4
    2     7
    3    10
    4    13
    Name: java, dtype: int32
    
    key:  javascript
    value:  0     2
    1     5
    2     8
    3    11
    4    14
    Name: javascript, dtype: int32
    
    ---------- dataframe iterrows() example ----------
    
    row_index:  0
    row:  python        0
    java          1
    javascript    2
    Name: 0, dtype: int32
    
    row_index:  1
    row:  python        3
    java          4
    javascript    5
    Name: 1, dtype: int32
    
    row_index:  2
    row:  python        6
    java          7
    javascript    8
    Name: 2, dtype: int32
    
    row_index:  3
    row:  python         9
    java          10
    javascript    11
    Name: 3, dtype: int32
    
    row_index:  4
    row:  python        12
    java          13
    javascript    14
    Name: 4, dtype: int32
    
    ---------- dataframe itertuples() example ----------
    Pandas(Index=0, python=0, java=1, javascript=2)
    Pandas(Index=1, python=3, java=4, javascript=5)
    Pandas(Index=2, python=6, java=7, javascript=8)
    Pandas(Index=3, python=9, java=10, javascript=11)
    Pandas(Index=4, python=12, java=13, javascript=14)
