Fancy indexing is NumPy’s term for indexing an array with lists or arrays of integers. It lets you select arbitrary subsets of data, in any order, with concise syntax. The sections below show how to use it to select and modify the contents of NumPy arrays.
1. Selecting Rows in a Specified Order.
- Imagine you have a NumPy array with dimensions 8 × 4. Let’s generate such an array:

    import numpy as np


    def create_data_set():
        arr = np.zeros((8, 4))
        print(arr)
        print("*******************")
        # Fill every element of row i with the value i
        for i in range(8):
            arr[i] = i
        print(arr)
        return arr


    if __name__ == "__main__":
        create_data_set()

- Output.

    [[0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]]
    *******************
    [[0. 0. 0. 0.]
     [1. 1. 1. 1.]
     [2. 2. 2. 2.]
     [3. 3. 3. 3.]
     [4. 4. 4. 4.]
     [5. 5. 5. 5.]
     [6. 6. 6. 6.]
     [7. 7. 7. 7.]]

- This generates an array where each row contains the same value as the row index.
- To select a subset of rows in a particular order, pass a list or ndarray of integers specifying the desired order. The example below fills row `i` with `i + 3`, so that the selected values can be told apart from the row indices:

    import numpy as np


    def create_data_set():
        arr = np.zeros((8, 4))
        print(arr)
        print("*******************")
        # Fill every element of row i with the value i + 3
        for i in range(8):
            arr[i] = i + 3
        print(arr)
        return arr


    def select_subset(arr):
        # Rows 4, 3 and 6, returned in exactly that order
        selected_rows = arr[[4, 3, 6]]
        print("*******************")
        print(selected_rows)


    if __name__ == "__main__":
        arr = create_data_set()
        select_subset(arr)

- Output.

    [[0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]]
    *******************
    [[ 3.  3.  3.  3.]
     [ 4.  4.  4.  4.]
     [ 5.  5.  5.  5.]
     [ 6.  6.  6.  6.]
     [ 7.  7.  7.  7.]
     [ 8.  8.  8.  8.]
     [ 9.  9.  9.  9.]
     [10. 10. 10. 10.]]
    *******************
    [[7. 7. 7. 7.]
     [6. 6. 6. 6.]
     [9. 9. 9. 9.]]

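The text says the index can be either a list or an ndarray. The short sketch below checks that both forms give the same rows for the array used in this section. It also shows that an index may be repeated, which is an extra illustration not covered in the example above; the variable names are chosen here for illustration only.

```python
import numpy as np

# Same 8 x 4 array as in this section: row i is filled with i + 3.
arr = np.zeros((8, 4))
for i in range(8):
    arr[i] = i + 3

from_list = arr[[4, 3, 6]]             # index with a Python list
from_array = arr[np.array([4, 3, 6])]  # index with an integer ndarray
repeated = arr[[0, 0, 7]]              # the same row may be selected twice

print(np.array_equal(from_list, from_array))  # True
print(from_list.shape)                        # (3, 4)
print(repeated[:, 0])                         # [ 3.  3. 10.]
```

The result always has one row per index supplied, in the order supplied, regardless of how the rows are laid out in the original array.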
2. Negative Indices.
- Negative indices can also be used to select rows from the end of the array:

    import numpy as np


    def create_data_set():
        arr = np.zeros((8, 4))
        print(arr)
        print("*******************")
        # Fill every element of row i with the value i + 3
        for i in range(8):
            arr[i] = i + 3
        print(arr)
        return arr


    def negative_indices(arr):
        # -3, -5 and -7 count back from the last row
        selected_rows_negative = arr[[-3, -5, -7]]
        print("*******************")
        print(selected_rows_negative)


    if __name__ == "__main__":
        arr = create_data_set()
        negative_indices(arr)

- Output.

    [[0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]]
    *******************
    [[ 3.  3.  3.  3.]
     [ 4.  4.  4.  4.]
     [ 5.  5.  5.  5.]
     [ 6.  6.  6.  6.]
     [ 7.  7.  7.  7.]
     [ 8.  8.  8.  8.]
     [ 9.  9.  9.  9.]
     [10. 10. 10. 10.]]
    *******************
    [[8. 8. 8. 8.]
     [6. 6. 6. 6.]
     [4. 4. 4. 4.]]

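To see why `-3`, `-5` and `-7` return the rows holding 8, 6 and 4, it can help to convert each negative index to its positive counterpart by adding the number of rows. The sketch below does this for the same array; the names `negative` and `positive` are introduced here purely for illustration.

```python
import numpy as np

# Same 8 x 4 array as in this section: row i is filled with i + 3.
arr = np.zeros((8, 4))
for i in range(8):
    arr[i] = i + 3

negative = [-3, -5, -7]
# Adding the row count (8) maps each negative index to a positive one.
positive = [i + arr.shape[0] for i in negative]

print(positive)                                      # [5, 3, 1]
print(np.array_equal(arr[negative], arr[positive]))  # True
```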
3. Selecting Elements Using Multiple Index Arrays.
- Passing multiple index arrays selects a one-dimensional array of elements corresponding to each tuple of indices:

    import numpy as np


    def select_elements_using_multiple_index_arrays():
        arr = np.arange(32).reshape((8, 4))
        print(arr)
        print("*******************")
        # Picks the elements at (1, 0), (5, 3), (7, 1) and (2, 2)
        selected_elements = arr[[1, 5, 7, 2], [0, 3, 1, 2]]
        print(selected_elements)


    if __name__ == "__main__":
        select_elements_using_multiple_index_arrays()

- Output.

    [[ 0  1  2  3]
     [ 4  5  6  7]
     [ 8  9 10 11]
     [12 13 14 15]
     [16 17 18 19]
     [20 21 22 23]
     [24 25 26 27]
     [28 29 30 31]]
    *******************
    [ 4 23 29 10]

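The pairing of row and column indices can be made explicit by writing the same lookup as a plain loop. This is only a sketch for comparison, with the names `rows`, `cols` and `paired` introduced here; the fancy-indexing form is the one you would normally use.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

# Fancy indexing with two index lists
fancy = arr[rows, cols]

# The same selection written one (row, column) pair at a time
paired = np.array([arr[r, c] for r, c in zip(rows, cols)])

print(fancy)                          # [ 4 23 29 10]
print(np.array_equal(fancy, paired))  # True
```

Whatever the dimensionality of the source array, two index lists of equal length produce a one-dimensional result with one element per pair.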
4. Obtaining a Rectangular Region.
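- To obtain the rectangular region formed by a subset of the matrix’s rows and columns, index the rows first and then the columns. Passing both index lists at once, as the first statement below does, returns a one-dimensional array of paired elements instead: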

    import numpy as np


    def create_data_set():
        arr = np.zeros((8, 4))
        print(arr)
        print("*******************")
        # Fill every element of row i with the value i + 3
        for i in range(8):
            arr[i] = i + 3
        print(arr)
        print("*******************")
        return arr


    def obtaining_a_rectangular_region(arr):
        # Two index lists together: one element per (row, column) pair
        rectangular_region = arr[[1, 5, 7, 2], [0, 3, 1, 2]]
        print(rectangular_region)
        print("*******************")
        # Rows first, then columns: a 4 x 4 region
        rectangular_region = arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]
        print(rectangular_region)
        print("*******************")
        # Rows first, then three columns: a 4 x 3 region
        rectangular_region = arr[[1, 5, 7, 2]][:, [0, 1, 2]]
        print(rectangular_region)
        print("*******************")


    if __name__ == "__main__":
        arr = create_data_set()
        obtaining_a_rectangular_region(arr)

- Output.

    [[0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]]
    *******************
    [[ 3.  3.  3.  3.]
     [ 4.  4.  4.  4.]
     [ 5.  5.  5.  5.]
     [ 6.  6.  6.  6.]
     [ 7.  7.  7.  7.]
     [ 8.  8.  8.  8.]
     [ 9.  9.  9.  9.]
     [10. 10. 10. 10.]]
    *******************
    [ 4.  8. 10.  5.]
    *******************
    [[ 4.  4.  4.  4.]
     [ 8.  8.  8.  8.]
     [10. 10. 10. 10.]
     [ 5.  5.  5.  5.]]
    *******************
    [[ 4.  4.  4.]
     [ 8.  8.  8.]
     [10. 10. 10.]
     [ 5.  5.  5.]]
    *******************

- Explanation.
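- `arr`: This is the 8 × 4 NumPy array returned by `create_data_set()`, in which every element of row `i` equals `i + 3`. Because all values within a row are identical, the reordering of columns is not visible in the output above; with an array such as `np.arange(32).reshape((8, 4))` the new column order would show.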
- `[[1, 5, 7, 2]]`: This part of the code selects specific rows from the `arr` array. The list `[1, 5, 7, 2]` contains the row indices we want to select. So, `arr[[1, 5, 7, 2]]` selects the rows with indices 1, 5, 7, and 2. This operation produces a new array containing only those selected rows.
- `[:, [0, 3, 1, 2]]`: This part is applied to the result of the previous step. The `:` before the comma keeps all rows, and `[0, 3, 1, 2]` lists the column indices to select. So, `[:, [0, 3, 1, 2]]` selects columns 0, 3, 1, and 2, in that order, for every row.
- Combining these steps, `arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]` first selects rows with indices 1, 5, 7, and 2 from the original array `arr`, and then from these selected rows, it selects columns with indices 0, 3, 1, and 2.
- In essence, the resulting `rectangular_region` is a new array holding the chosen rows and columns of `arr`, in the order given by the index lists. It is a rectangular subset copied out of the original array.
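NumPy also provides `np.ix_`, which builds the same rectangular selection in a single indexing step. It is not used in the example above and is shown here only as an alternative sketch. The array below comes from `np.arange(32).reshape((8, 4))` rather than `create_data_set()`, an assumption made so that the column reordering is visible in the values.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

# Two chained steps, as in the explanation above
chained = arr[rows][:, cols]

# One step: np.ix_ turns the two lists into an open mesh of indices
with_ix = arr[np.ix_(rows, cols)]

print(with_ix)
# [[ 4  7  5  6]
#  [20 23 21 22]
#  [28 31 29 30]
#  [ 8 11  9 10]]
print(np.array_equal(chained, with_ix))  # True
```

Both forms return a 4 × 4 array. The chained form creates an intermediate copy of the selected rows before the columns are picked, which the `np.ix_` form avoids.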
5. Modifying Indexed Values.
- It’s essential to note that fancy indexing, unlike slicing, always copies the data into a new array when the result is assigned to a new variable.
- If you assign values through a fancy index, however, the indexed elements of the original array are modified in place:

    import numpy as np


    def create_data_set():
        arr = np.zeros((8, 4))
        print(arr)
        print("*******************")
        # Fill every element of row i with the value i + 3
        for i in range(8):
            arr[i] = i + 3
        print(arr)
        print("*******************")
        return arr


    def modifying_indexed_values(arr):
        # Reading through a fancy index returns a copy of the four elements
        selected_values = arr[[1, 5, 7, 2], [0, 3, 1, 2]]
        print(selected_values)
        # Assigning through the same index changes arr itself
        arr[[1, 5, 7, 2], [0, 3, 1, 2]] = 0
        print(arr)


    if __name__ == "__main__":
        arr = create_data_set()
        modifying_indexed_values(arr)

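The difference between copying on read and modifying on assignment can be checked directly. The sketch below is an illustration under a few assumptions: it uses an `np.arange(32)` array instead of `create_data_set()`, and it relies on `np.shares_memory`, which does not appear in the original example, to test whether two arrays use the same underlying data.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))

fancy = arr[[1, 5]]  # fancy indexing: new array with copied data
sliced = arr[1:3]    # basic slicing: a view onto arr's data

print(np.shares_memory(arr, fancy))   # False
print(np.shares_memory(arr, sliced))  # True

# Writing to the copy leaves the original untouched
fancy[:] = -1
print(arr[1])  # [4 5 6 7]

# Assigning through a fancy index writes into arr itself
arr[[1, 5, 7, 2], [0, 3, 1, 2]] = 0
print(arr[1, 0], arr[5, 3], arr[7, 1], arr[2, 2])  # 0 0 0 0
```

In short, `x = arr[index_list]` gives an independent array, while `arr[index_list] = value` updates `arr`.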
6. Conclusion.
- Fancy indexing lets you select rows in any order, pick individual elements by paired indices, extract rectangular regions, and assign to selected elements, all without writing explicit loops.