Pandas provides two ways to sort a DataFrame: by label and by value. This article shows how to use both.
1. Pandas Sort Example DataFrame Data.
- The code below creates the example DataFrame object used for sorting.
import numpy as np
import pandas as pd


def pandas_dataframe_sorting_example():
    # Create a 2-dimensional array with 5 rows and 3 columns; each element is a random float.
    data_array = np.random.randn(5, 3)
    # The index array contains unsorted row labels.
    index_array = [0, 2, 1, 6, 3]
    # The column name array, also unsorted.
    columns_array = ['column-3', 'column-1', 'column-2']
    # Create the unsorted DataFrame object from the arrays above.
    dataframe_unsorted = pd.DataFrame(data=data_array, index=index_array, columns=columns_array)
    print('dataframe_unsorted')
    print(dataframe_unsorted)


if __name__ == '__main__':
    pandas_dataframe_sorting_example()
- Running the code above produces output like the following (the numbers are random, so they differ on every run). Neither the row labels nor the values are sorted. The following examples sort them by label and by value respectively.
dataframe_unsorted
   column-3  column-1  column-2
0  0.395101 -0.051456  0.327673
2 -1.417987  0.636136 -0.068395
1  0.088765  0.672521 -0.195716
6  0.600821  0.814108 -0.086112
3  1.243266  0.558752 -0.703006
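Because the example data is random, the printed numbers change on every run. The sketch below is one possible way to make the example repeatable and to confirm that the labels really are out of order. The seed value 42 and the variable name df are assumptions for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed (42 is arbitrary) so repeated runs produce the same numbers.
rng = np.random.default_rng(42)

df = pd.DataFrame(
    rng.standard_normal((5, 3)),
    index=[0, 2, 1, 6, 3],
    columns=['column-3', 'column-1', 'column-2'],
)

# Neither the row labels nor the column labels are in ascending order.
print(df.index.is_monotonic_increasing)    # False
print(df.columns.is_monotonic_increasing)  # False
```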
2. Pandas DataFrame Sort By Label Example.
- The DataFrame method sort_index(axis, ascending) sorts a DataFrame object by its labels. It returns a new, sorted DataFrame and leaves the original unchanged.
2.1 Sort DataFrame Object By Row Label.
- When you call the method without arguments, it sorts the DataFrame object by row label in ascending order.
dataframe_unsorted.sort_index()
- This is because the axis parameter defaults to 0 (rows) and the ascending parameter defaults to True, so the call above is equivalent to:
dataframe_unsorted.sort_index(axis = 0, ascending = True)
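As a quick check of the defaults described above, here is a minimal sketch on a small hand-made DataFrame. The column name value and the numbers are made up for illustration; the row labels match the article's example.

```python
import pandas as pd

# Hypothetical data with the same unsorted row labels as the article's example.
df = pd.DataFrame(
    {'value': [10, 20, 30, 40, 50]},
    index=[0, 2, 1, 6, 3],
)

by_default = df.sort_index()                      # relies on the defaults
explicit = df.sort_index(axis=0, ascending=True)  # spells the defaults out
descending = df.sort_index(ascending=False)       # reverse row-label order

print(by_default.index.tolist())    # [0, 1, 2, 3, 6]
print(by_default.equals(explicit))  # True
print(descending.index.tolist())    # [6, 3, 2, 1, 0]
print(df.index.tolist())            # [0, 2, 1, 6, 3], the original is unchanged
```

Each row keeps its own data when the labels are reordered; only the order of the rows changes.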
2.2 Sort DataFrame Object By Column Label.
- To sort the DataFrame object by column label, pass axis = 1 to the sort_index() method.
dataframe_sort_by_column_label = dataframe_unsorted.sort_index(axis = 1, ascending=False)
- Passing ascending = False, as in the call above, sorts the column labels in descending order. Without it, the column labels are sorted in ascending order.
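The following sketch illustrates column-label sorting in both directions. The integer values are made up so that it is easy to see each column's data moving together with its label.

```python
import pandas as pd

# Hypothetical data; the column labels match the article's example.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=['column-3', 'column-1', 'column-2'],
)

ascending_columns = df.sort_index(axis=1)
descending_columns = df.sort_index(axis=1, ascending=False)

print(ascending_columns.columns.tolist())   # ['column-1', 'column-2', 'column-3']
print(descending_columns.columns.tolist())  # ['column-3', 'column-2', 'column-1']

# The values travel with their labels: row 0 was [1, 2, 3] under the original order.
print(ascending_columns.iloc[0].tolist())   # [2, 3, 1]
```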
3. Pandas DataFrame Sort By Values Example.
- The DataFrame object’s sort_values(by, kind) method sorts the rows of the DataFrame object by the values in one or more columns.
- The by parameter specifies the column, or the list of columns, to sort by.
- The kind parameter specifies the sorting algorithm. The three classic values are quicksort, mergesort, and heapsort; recent pandas versions also accept stable.
- The kind parameter only takes effect when sorting by a single column. Its default value is quicksort. Of the three classic algorithms, only mergesort is stable, meaning rows with equal values keep their original relative order.
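The points above can be sketched with a small hand-made DataFrame. The column names group and score, the letter row labels, and the values are all hypothetical. The data contains ties in group so that the effect of a stable algorithm is visible.

```python
import pandas as pd

# Hypothetical data: 'group' contains repeated values, 'score' does not.
df = pd.DataFrame(
    {'group': [2, 1, 2, 1, 2], 'score': [5, 9, 3, 8, 7]},
    index=['a', 'b', 'c', 'd', 'e'],
)

# Single column with a stable algorithm: rows that tie on 'group'
# keep their original relative order (b before d, then a, c, e).
stable = df.sort_values(by='group', kind='mergesort')
print(stable.index.tolist())  # ['b', 'd', 'a', 'c', 'e']

# Multiple columns: sort by 'group' ascending, then break ties
# by 'score' descending. 'ascending' accepts one flag per column.
multi = df.sort_values(by=['group', 'score'], ascending=[True, False])
print(multi.index.tolist())   # ['b', 'd', 'e', 'a', 'c']
```

With quicksort or heapsort the order of tied rows is not guaranteed, so a stable algorithm is the safer choice when that order matters.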
4. Pandas DataFrame Sorting Example Source Code.
- Below is the full source code of this example.
import numpy as np
import pandas as pd


def pandas_dataframe_sorting_example():
    # Create a 2-dimensional array with 5 rows and 3 columns; each element is a random float.
    data_array = np.random.randn(5, 3)
    # The index array contains unsorted row labels.
    index_array = [0, 2, 1, 6, 3]
    # The column name array, also unsorted.
    columns_array = ['column-3', 'column-1', 'column-2']
    # Create the unsorted DataFrame object from the arrays above.
    dataframe_unsorted = pd.DataFrame(data=data_array, index=index_array, columns=columns_array)
    print('dataframe_unsorted')
    print(dataframe_unsorted)

    # Sort the DataFrame by row label in ascending order.
    dataframe_sort_by_row_index_label_ascending = dataframe_unsorted.sort_index(ascending=True)
    print('\ndataframe_sort_by_row_index_label_ascending = dataframe_unsorted.sort_index(ascending=True)')
    print(dataframe_sort_by_row_index_label_ascending)

    # Sort the DataFrame by row label in descending order.
    dataframe_sort_by_row_index_label_descending = dataframe_unsorted.sort_index(ascending=False)
    print('\ndataframe_sort_by_row_index_label_descending = dataframe_unsorted.sort_index(ascending=False)')
    print(dataframe_sort_by_row_index_label_descending)

    # Sort the DataFrame by column label in descending order.
    dataframe_sort_by_column_label = dataframe_unsorted.sort_index(axis=1, ascending=False)
    print('\ndataframe_sort_by_column_label = dataframe_unsorted.sort_index(axis = 1, ascending=False)')
    print(dataframe_sort_by_column_label)

    # Sort the DataFrame by the values of one column (ascending by default).
    dataframe_sort_by_column_value = dataframe_unsorted.sort_values(by='column-1')
    print("\ndataframe_sort_by_column_value = dataframe_unsorted.sort_values(by='column-1')")
    print(dataframe_sort_by_column_value)

    # Sort by column-1 in descending order; when two rows have the same column-1 value,
    # order them by their column-2 value (also descending).
    dataframe_sort_by_multiple_columns_value = dataframe_unsorted.sort_values(by=['column-1', 'column-2'], ascending=False)
    print("\ndataframe_sort_by_multiple_columns_value = dataframe_unsorted.sort_values(by=['column-1','column-2'], ascending=False)")
    print(dataframe_sort_by_multiple_columns_value)

    # Sort by one column and choose the sorting algorithm with the kind parameter.
    dataframe_sorting_algorithm = dataframe_unsorted.sort_values(by='column-1', kind='heapsort')
    print("\ndataframe_sorting_algorithm = dataframe_unsorted.sort_values(by='column-1' ,kind='heapsort')")
    print(dataframe_sorting_algorithm)


if __name__ == '__main__':
    pandas_dataframe_sorting_example()
- Below is the output of the example source code above.
dataframe_unsorted
   column-3  column-1  column-2
0  0.395101 -0.051456  0.327673
2 -1.417987  0.636136 -0.068395
1  0.088765  0.672521 -0.195716
6  0.600821  0.814108 -0.086112
3  1.243266  0.558752 -0.703006

dataframe_sort_by_row_index_label_ascending = dataframe_unsorted.sort_index(ascending=True)
   column-3  column-1  column-2
0  0.395101 -0.051456  0.327673
1  0.088765  0.672521 -0.195716
2 -1.417987  0.636136 -0.068395
3  1.243266  0.558752 -0.703006
6  0.600821  0.814108 -0.086112

dataframe_sort_by_row_index_label_descending = dataframe_unsorted.sort_index(ascending=False)
   column-3  column-1  column-2
6  0.600821  0.814108 -0.086112
3  1.243266  0.558752 -0.703006
2 -1.417987  0.636136 -0.068395
1  0.088765  0.672521 -0.195716
0  0.395101 -0.051456  0.327673

dataframe_sort_by_column_label = dataframe_unsorted.sort_index(axis = 1, ascending=False)
   column-3  column-2  column-1
0  0.395101  0.327673 -0.051456
2 -1.417987 -0.068395  0.636136
1  0.088765 -0.195716  0.672521
6  0.600821 -0.086112  0.814108
3  1.243266 -0.703006  0.558752

dataframe_sort_by_column_value = dataframe_unsorted.sort_values(by='column-1')
   column-3  column-1  column-2
0  0.395101 -0.051456  0.327673
3  1.243266  0.558752 -0.703006
2 -1.417987  0.636136 -0.068395
1  0.088765  0.672521 -0.195716
6  0.600821  0.814108 -0.086112

dataframe_sort_by_multiple_columns_value = dataframe_unsorted.sort_values(by=['column-1','column-2'], ascending=False)
   column-3  column-1  column-2
6  0.600821  0.814108 -0.086112
1  0.088765  0.672521 -0.195716
2 -1.417987  0.636136 -0.068395
3  1.243266  0.558752 -0.703006
0  0.395101 -0.051456  0.327673

dataframe_sorting_algorithm = dataframe_unsorted.sort_values(by='column-1' ,kind='heapsort')
   column-3  column-1  column-2
0  0.395101 -0.051456  0.327673
3  1.243266  0.558752 -0.703006
2 -1.417987  0.636136 -0.068395
1  0.088765  0.672521 -0.195716
6  0.600821  0.814108 -0.086112