How to Utilize Fancy Indexing in NumPy Arrays

Fancy indexing is NumPy’s term for indexing an array with lists or arrays of integers. It lets you select arbitrary subsets of the data, in any order, in a single concise expression. Let’s look at how to use it to manipulate NumPy arrays.

1. Selecting Rows in a Specified Order.

  1. Imagine you have a NumPy array with dimensions 8 × 4. Let’s generate such an array:
    import numpy as np
    
    def create_data_set():
        arr = np.zeros((8, 4))
        print(arr)
        print("*******************")
        for i in range(8):
            arr[i] = i
        print(arr)
        return arr
    
    
    if __name__ == "__main__":
        create_data_set()
  2. Output.
    [[0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]]
    *******************
    [[0. 0. 0. 0.]
     [1. 1. 1. 1.]
     [2. 2. 2. 2.]
     [3. 3. 3. 3.]
     [4. 4. 4. 4.]
     [5. 5. 5. 5.]
     [6. 6. 6. 6.]
     [7. 7. 7. 7.]]
  3. This generates an array where each row contains the same value as the row index.
  4. To select a subset of rows in a particular order, pass a list or ndarray of integers specifying the desired order. The example below fills row `i` with `i + 3` so that the selected values can be told apart from the row indices:
    import numpy as np
    
    def create_data_set():
        arr = np.zeros((8, 4))
        print(arr)
        print("*******************")
        for i in range(8):
            arr[i] = i+3
        print(arr)
        return arr
    
    def select_subset(arr):
        selected_rows = arr[[4,3,6]]
        print("*******************")
        print(selected_rows)
    
    if __name__ == "__main__":
        arr = create_data_set()
        select_subset(arr)
  5. Output.
    [[0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]]
    *******************
    [[ 3.  3.  3.  3.]
     [ 4.  4.  4.  4.]
     [ 5.  5.  5.  5.]
     [ 6.  6.  6.  6.]
     [ 7.  7.  7.  7.]
     [ 8.  8.  8.  8.]
     [ 9.  9.  9.  9.]
     [10. 10. 10. 10.]]
    *******************
    [[7. 7. 7. 7.]
     [6. 6. 6. 6.]
     [9. 9. 9. 9.]]
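
As a small sketch of the point that either a list or an ndarray works as the index, the snippet below builds the same 8 × 4 array (row `i` holds `i + 3`) and compares both forms. The variable names are illustrative only, and the repeated-index case is an extra illustration that is not part of the example above.

```python
import numpy as np

# Same layout as the array above: every element of row i equals i + 3.
arr = np.zeros((8, 4)) + np.arange(3, 11).reshape(8, 1)

rows_from_list = arr[[4, 3, 6]]
rows_from_array = arr[np.array([4, 3, 6])]

# A list and an integer ndarray select the same rows in the same order.
print(np.array_equal(rows_from_list, rows_from_array))  # True

# An index may appear more than once; the result has one row per index.
repeated = arr[[0, 0, 7]]
print(repeated.shape)  # (3, 4)
```

The result always has as many rows as there are indices, regardless of how many rows the original array has.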

2. Negative Indices.

  1. Negative indices can also be used to select rows from the end of the array:
    import numpy as np
    
    def create_data_set():
        arr = np.zeros((8, 4))
        print(arr)
        print("*******************")
        for i in range(8):
            arr[i] = i+3
        print(arr)
        return arr
    
    def negative_indices(arr):
        selected_rows_negative = arr[[-3, -5, -7]]
        print("*******************")
        print(selected_rows_negative)
    
    if __name__ == "__main__":
        arr = create_data_set()
        negative_indices(arr)
  2. Output.
    [[0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]]
    *******************
    [[ 3.  3.  3.  3.]
     [ 4.  4.  4.  4.]
     [ 5.  5.  5.  5.]
     [ 6.  6.  6.  6.]
     [ 7.  7.  7.  7.]
     [ 8.  8.  8.  8.]
     [ 9.  9.  9.  9.]
     [10. 10. 10. 10.]]
    *******************
    [[8. 8. 8. 8.]
     [6. 6. 6. 6.]
     [4. 4. 4. 4.]]
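
To make the mapping explicit, here is a minimal sketch that converts each negative index to its positive equivalent (`len(arr) + k`) and checks that both select the same rows. The names `negative` and `positive` are illustrative.

```python
import numpy as np

# Same layout as the array above: every element of row i equals i + 3.
arr = np.zeros((8, 4)) + np.arange(3, 11).reshape(8, 1)

negative = [-3, -5, -7]
# A negative index k refers to row len(arr) + k, counting from the end.
positive = [len(arr) + k for k in negative]
print(positive)  # [5, 3, 1]

print(np.array_equal(arr[negative], arr[positive]))  # True
```

Rows 5, 3, and 1 hold the values 8, 6, and 4, which matches the output above.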

3. Selecting Elements Using Multiple Index Arrays.

  1. Passing multiple index arrays selects a one-dimensional array of elements corresponding to each tuple of indices:
    import numpy as np
    
    def select_elements_using_multiple_index_arrays():
        arr = np.arange(32).reshape((8, 4))
        print(arr)
        print("*******************")
        selected_elements = arr[[1, 5, 7, 2], [0, 3, 1, 2]]
        print(selected_elements)
    
    if __name__ == "__main__":
        select_elements_using_multiple_index_arrays()
  2. Output.
    [[ 0  1  2  3]
     [ 4  5  6  7]
     [ 8  9 10 11]
     [12 13 14 15]
     [16 17 18 19]
     [20 21 22 23]
     [24 25 26 27]
     [28 29 30 31]]
    *******************
    [ 4 23 29 10]
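
One way to read this result is as a lookup over (row, column) pairs: the two lists are matched position by position, giving the elements at (1, 0), (5, 3), (7, 1), and (2, 2). The sketch below compares the fancy-indexed result with an explicit loop; the names `rows`, `cols`, and `looped` are illustrative.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

fancy = arr[rows, cols]
# Equivalent element-by-element lookup over the paired (row, column) indices.
looped = np.array([arr[r, c] for r, c in zip(rows, cols)])

print(fancy)  # [ 4 23 29 10]
print(np.array_equal(fancy, looped))  # True
```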

4. Obtaining a Rectangular Region.

  1. To obtain the rectangular region formed by a subset of the matrix’s rows and columns, index the rows first and then index the columns of the result:
    import numpy as np
    
    def create_data_set():
        arr = np.zeros((8, 4))
        print(arr)
        print("*******************")
        for i in range(8):
            arr[i] = i+3
        print(arr)
        print("*******************")
        return arr
    
    
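    def obtaining_a_rectangular_region(arr):
        # Paired index arrays pick individual elements, so this is 1-D, not a region.
        paired_elements = arr[[1, 5, 7, 2], [0, 3, 1, 2]]
        print(paired_elements)
        print("*******************")

        # Select the rows first, then reorder all four columns: a 4 x 4 region.
        rectangular_region = arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]
        print(rectangular_region)
        print("*******************")

        # Same rows, but only columns 0, 1, and 2: a 4 x 3 region.
        rectangular_region = arr[[1, 5, 7, 2]][:, [0, 1, 2]]
        print(rectangular_region)
        print("*******************")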
    
    
    if __name__ == "__main__":
        arr = create_data_set()
        obtaining_a_rectangular_region(arr)
  2. Output.
    [[0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]]
    *******************
    [[ 3.  3.  3.  3.]
     [ 4.  4.  4.  4.]
     [ 5.  5.  5.  5.]
     [ 6.  6.  6.  6.]
     [ 7.  7.  7.  7.]
     [ 8.  8.  8.  8.]
     [ 9.  9.  9.  9.]
     [10. 10. 10. 10.]]
    *******************
    [ 4.  8. 10.  5.]
    *******************
    [[ 4.  4.  4.  4.]
     [ 8.  8.  8.  8.]
     [10. 10. 10. 10.]
     [ 5.  5.  5.  5.]]
    *******************
    [[ 4.  4.  4.]
     [ 8.  8.  8.]
     [10. 10. 10.]
     [ 5.  5.  5.]]
    *******************
  3. Explanation.
  4. `arr`: This is the 8 × 4 NumPy array returned by `create_data_set()`, in which every element of row `i` equals `i + 3`.
  5. `[[1, 5, 7, 2]]`: This part of the code selects specific rows from `arr`. The list `[1, 5, 7, 2]` contains the row indices we want, so `arr[[1, 5, 7, 2]]` selects the rows with indices 1, 5, 7, and 2, in that order. This operation produces a new array containing only those rows.
  6. `[:, [0, 3, 1, 2]]`: This part indexes the result of the previous step. The `:` before the comma means that all rows are kept. `[0, 3, 1, 2]` is a list of the column indices we want, so `[:, [0, 3, 1, 2]]` selects the columns with indices 0, 3, 1, and 2 for every row.
  7. Combining these steps, `arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]` first selects the rows with indices 1, 5, 7, and 2 from the original array `arr`, and then selects the columns with indices 0, 3, 1, and 2 from those rows.
  8. Because every row of `arr` holds a single repeated value, the column reordering is not visible in the output; only the shape changes, from 4 × 4 to 4 × 3 when three columns are selected. By contrast, the first expression, `arr[[1, 5, 7, 2], [0, 3, 1, 2]]`, pairs the two lists and returns the four elements at (1, 0), (5, 3), (7, 1), and (2, 2) as a one-dimensional array.
  9. In essence, the resulting `rectangular_region` array is a subset of the original array `arr` containing the specified rows and columns. It is a new array (a copy), not a view into `arr`.
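
As an alternative to chaining two indexing operations, NumPy provides `np.ix_`, which builds an open mesh from the row and column lists so that a single indexing step returns the same region. This helper is not used in the example above; the sketch below uses `np.arange(32).reshape((8, 4))` instead of the constant-row array so that the column order is visible in the result.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

# Two-step selection, as in the example above.
chained = arr[rows][:, cols]
# One-step selection: np.ix_ turns the lists into broadcastable index arrays.
with_ix = arr[np.ix_(rows, cols)]

print(with_ix.shape)  # (4, 4)
print(np.array_equal(chained, with_ix))  # True
print(with_ix[0])  # [4 7 5 6]
```

The first row of the result is row 1 of `arr` (`[4 5 6 7]`) with its columns taken in the order 0, 3, 1, 2.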

5. Modifying Indexed Values.

  1. It’s essential to note that fancy indexing, unlike slicing (which returns a view), always copies the data into a new array when the result is assigned to a new variable.
  2. If you assign values with fancy indexing, the indexed values will be modified:
    import numpy as np
    
    def create_data_set():
        arr = np.zeros((8, 4))
        print(arr)
        print("*******************")
        for i in range(8):
            arr[i] = i+3
        print(arr)
        print("*******************")
        return arr
    
    def modifying_indexed_values(arr):
        selected_values = arr[[1, 5, 7, 2], [0, 3, 1, 2]]
        print(selected_values)
    
        arr[[1, 5, 7, 2], [0, 3, 1, 2]] = 0
        print(arr)
    
    if __name__ == "__main__":
        arr = create_data_set()
        modifying_indexed_values(arr)
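
The difference between reading and assigning can be checked directly. The following is a minimal sketch, using `np.shares_memory` and an `np.arange` array chosen only for illustration, that contrasts a fancy-indexed copy, a slice view, and in-place assignment through a fancy index.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))

fancy_copy = arr[[1, 5]]  # fancy indexing returns a new array
slice_view = arr[1:3]     # basic slicing returns a view of arr

print(np.shares_memory(arr, fancy_copy))  # False
print(np.shares_memory(arr, slice_view))  # True

# Writing to the copy leaves arr untouched.
fancy_copy[:] = -1
print(arr[1])  # [4 5 6 7]

# Writing to the view changes arr.
slice_view[:] = -1
print(arr[1])  # [-1 -1 -1 -1]

# A fancy index on the left-hand side of an assignment modifies arr in place.
arr[[5, 7], [3, 1]] = 0
print(arr[5, 3], arr[7, 1])  # 0 0
```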

6. Conclusion.

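  1. Fancy indexing lets you select, reorder, and assign to arbitrary rows and elements of a NumPy array using lists or arrays of integers. Keep in mind that reading through a fancy index returns a copy, while assigning through one modifies the original array.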
