Pandas Series Tutorial And Examples

The Series structure is one of the most commonly used data structures in pandas. It is similar to a one-dimensional array and consists of a set of data values and a set of labels, with each label corresponding to exactly one data value.

A Series can hold any data type, such as integers, strings, floating-point numbers, or Python objects. By default, its labels are integers starting from 0. The index labels make it easier to see where each value is located.

1. How To Create Series Object.

  1. Pandas uses the pd.Series() constructor to create a Series object.
  2. Through this Series object, you can call the corresponding methods and properties to process data.
    import pandas as pd
    
    series = pd.Series( data, index, dtype, copy)
    
    data: The input data; it can be a list, a scalar constant, an ndarray, a dictionary, etc.
    
    index: The labels for the data. They must be hashable and the same length as the data, but they do not have to be unique. If no index is passed, it defaults to the integers 0 to n-1, equivalent to np.arange(n).
    
    dtype: The data type of the elements. If it is not provided, it is inferred from the data.
    
    copy: Indicates whether to copy the input data. The default value is False.
  3. We can also create Series objects from arrays, dictionaries, scalar values, or other Python objects. The examples below show each approach.
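
The constructor parameters described above can be tried out with a short script. This is a minimal sketch, assuming pandas is installed; the variable names (labels, default_s, dup) are illustrative and do not come from the original examples.

```python
import pandas as pd

# Input data: a plain Python list of integers.
data = [10, 20, 30]

# Explicit labels; they must match the length of the data.
labels = ["a", "b", "c"]

# dtype forces the element type instead of letting pandas infer int64.
s = pd.Series(data, index=labels, dtype="float64")
print(s)

# Without an index, pandas assigns the default positions 0..n-1.
default_s = pd.Series(data)
print(default_s.index)

# Duplicate labels are accepted; a label lookup then returns every match.
dup = pd.Series([1, 2, 3], index=["x", "x", "y"])
print(dup["x"])
```

Passing dtype explicitly is mainly useful when the inferred type is not the one you want, for example when integer input should be stored as floats.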

1.1 Create An Empty Series Object.

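  1. Create an empty Series object. In some pandas versions, calling pd.Series() without a dtype emits a DeprecationWarning, because the default dtype of an empty Series is changing from float64 to object. Creating an empty Series is not itself deprecated.
    >>> import pandas as pd
    >>> # create an empty Series object without a dtype; this triggers the warning below.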
    ... s = pd.Series()
    __main__:2: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
    >>> print(s)
    Series([], dtype: float64)
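
As the warning message suggests, specifying a dtype explicitly avoids depending on the version-specific default. A minimal sketch, assuming pandas is installed; the variable names are illustrative:

```python
import pandas as pd

# Passing dtype explicitly avoids relying on the version-dependent default.
empty_obj = pd.Series(dtype="object")
empty_float = pd.Series(dtype="float64")

print(empty_obj.dtype)
print(empty_float.dtype)

# Both objects contain no elements.
print(len(empty_obj), empty_obj.empty)
```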

1.2 Create Series Object From NumPy ndarray Object.

  1. ndarray is the array type in NumPy. When the data is an ndarray, the index that you pass must have the same length as the array.
  2. If nothing is passed to the index parameter, the index is generated as range(n) by default, where n is the length of the array.
  3. The example below creates a Series object with the default index. Because no index is passed, the labels are allocated from 0 and run from 0 to len(data) - 1, that is, 0 to 3. This is called an “implicit index”.
    >>> # import the pandas module.
    ... import pandas as pd
    >>> 
    >>> # import the numpy module.
    ... import numpy as np
    >>> 
    >>> # create a ndarray object.
    ... data = np.array(['python','javascript','java','php'])
    >>> 
    >>> # create a Series object with the above ndarray object.
    ... s = pd.Series(data)
    >>> 
    >>> # print out the Series object.
    ... print (s)
    0        python
    1    javascript
    2          java
    3           php
    dtype: object
  4. You can also use an “explicit index” to define the index labels, as in the example below.
    >>> # import pandas, numpy python library.
    ... import pandas as pd
    >>> import numpy as np
    >>> 
    >>> # create a numpy ndarray object. 
    ... data = np.array(['python','javascript','java'])
    >>> 
    >>> # create a list object.
    ... idx = ['language-1', 2, 'language-3'] 
    >>> 
    >>> # set the Series object's index explicitly. 
    >>> s = pd.Series(data=data,index=idx)
    >>> 
    >>> print(s)
    language-1        python
    2             javascript
    language-3          java
    dtype: object
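
The length requirement mentioned above can be checked directly. The sketch below assumes pandas and NumPy are installed; the labels 'a', 'b', 'c' and the variable mismatch_error are made up for illustration.

```python
import numpy as np
import pandas as pd

data = np.array(["python", "javascript", "java"])

# Three values but only two labels: pandas rejects the mismatch.
try:
    pd.Series(data, index=["a", "b"])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
    print("ValueError:", exc)

# With matching lengths the construction succeeds.
ok = pd.Series(data, index=["a", "b", "c"])
print(ok["b"])
```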
    

1.3 Create Pandas Series Object From A Python Dictionary Object.

  1. You can use a Python dictionary object as the input data to create a Series object.
  2. If no index is passed in, the index is built from the keys of the dictionary object.
    >>> # import the python pandas library
    >>> import pandas as pd
    >>>
    >>> # create the python dictionary object.
    >>> dict_data = {'PL' : 'Python', 'OS' : 'Windows', 'DB' : 'MySQL'}
    >>>
    >>> # create the pandas Series object based on the above python dictionary object, do not pass in the index parameter.
    >>> series_obj = pd.Series(dict_data)
    >>>
    >>> # print out the pandas Series object; its index labels are the dictionary keys.
    >>> print(series_obj)
    PL     Python
    OS    Windows
    DB      MySQL
    dtype: object
    
  3. When the index parameter is passed, each index label is matched against the dictionary keys. A label that has a matching key takes that key's value, and a label without a matching key gets NaN.
    >>> # import the python pandas library
    >>> import pandas as pd
    >>>
    >>> # create the python dictionary object.
    >>> dict_data = {'PL' : 'Python', 'OS' : 'Windows', 'DB' : 'MySQL'}
    >>> 
    >>> # define the Series object's index list; labels that match a dictionary key take that key's value.
    >>> index_data= ['DB', 'OS', 'PL', 'ISO']
    >>>
    >>> # create the pandas Series object based on the above python dictionary object, pass in the index parameter.
    >>> series_obj = pd.Series(dict_data, index=index_data)
    >>>
    >>> # print out the pandas Series object; the elements follow the order of the index list.
    >>> print(series_obj)
    DB      MySQL
    OS    Windows
    PL     Python
    ISO       NaN # The dictionary has no 'ISO' key, so this element's value is NaN.
    dtype: object
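
One way to think about the index parameter with dictionary input is as a reindexing step: the Series is built from the dictionary and then aligned to the requested labels. The sketch below compares the two forms. It assumes pandas is installed and reuses the dictionary from the example above; with_index and reindexed are illustrative names.

```python
import pandas as pd

dict_data = {"PL": "Python", "OS": "Windows", "DB": "MySQL"}
index_data = ["DB", "OS", "PL", "ISO"]

# Passing index= selects and orders the dictionary entries by label.
with_index = pd.Series(dict_data, index=index_data)

# Building from the dictionary first and reindexing gives the same result.
reindexed = pd.Series(dict_data).reindex(index_data)
print(with_index.equals(reindexed))

# Labels without a matching key hold a missing value.
print(with_index.isna()["ISO"])
```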

1.4 Create Pandas Series Object From Scalar Values.

  1. You can create a pandas Series object from a scalar value. Pass an index as well, so that the scalar is repeated once for each index label, as in the example below.
    >>> # import python pandas module with the pd name alias.
    >>> import pandas as pd
    >>>
    >>> # create a list object.
    >>> index_list=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    >>>
    >>> # create a pandas Series object from the scalar value 10; the scalar is repeated once for each label in the index list.
    >>> s = pd.Series(10, index = index_list)
    >>>
    >>> # print out the pandas Series object
    >>> print(s)
    1     10
    2     10
    3     10
    4     10
    5     10
    6     10
    7     10
    8     10
    9     10
    10    10
    dtype: int64
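
To see what the index contributes here, it helps to compare a scalar with and without one. A minimal sketch, assuming pandas is installed; the names filled and single are illustrative:

```python
import pandas as pd

# The scalar is repeated once for every label in the index.
filled = pd.Series(0.5, index=["a", "b", "c"])
print(filled)

# Without an index, the result holds the scalar once, at position 0.
single = pd.Series(10)
print(len(single), single.iloc[0])
```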

2. How To Read Series Object.

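  1. The sections above explain various ways to create pandas Series objects. This section shows how to read the elements of a Series object.
  2. We can read Series elements by integer position, by index label, or by slice. Note that in recent pandas versions, using [] with an integer position on a Series that has non-integer labels is deprecated in favor of .iloc.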
  3. Below are some examples.
    >>> # import the python pandas library 
    >>> import pandas as pd 
    >>> 
    >>> # create the python dictionary object. 
    >>> dict_data = {'PL' : 'Python', 'OS' : 'Windows', 'DB' : 'MySQL'} 
    >>> 
    >>> # define the Series object's index list; labels that match a dictionary key take that key's value.
    >>> index_data= ['DB', 'OS', 'PL', 'ISO']
    >>>
    >>> # create the pandas Series object based on the above python dictionary object, pass in the index parameter. 
    >>> series_obj = pd.Series(dict_data, index=index_data)
    >>> 
    >>> print(series_obj)
    DB       MySQL
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
    >>> print(series_obj[0]) # get a Series element by integer position.
    MySQL
    >>>
    >>> print(series_obj['OS']) # get a Series element by index label.
    Windows
    >>>
    >>> print(series_obj[:3]) # get the first three Series elements by position slice.
    DB      MySQL
    OS    Windows
    PL     Python
    dtype: object
    >>>
    >>> print(series_obj[-3:]) # get the last three Series elements by position slice.
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
    >>>
    >>> print(series_obj[['OS','DB']]) # get Series elements by a list of index labels.
    OS    Windows
    DB      MySQL
    dtype: object
    >>>
    >>> print(series_obj[['OS','DB','BD']]) # when an index label does not exist, a KeyError is raised.
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Users\zhaosong\anaconda3\envs\MyPythonEnv\lib\site-packages\pandas\core\series.py", line 966, in __getitem__
        return self._get_with(key)
      File "C:\Users\zhaosong\anaconda3\envs\MyPythonEnv\lib\site-packages\pandas\core\series.py", line 1006, in _get_with
        return self.loc[key]
      File "C:\Users\zhaosong\anaconda3\envs\MyPythonEnv\lib\site-packages\pandas\core\indexing.py", line 931, in __getitem__
        return self._getitem_axis(maybe_callable, axis=axis)
      File "C:\Users\zhaosong\anaconda3\envs\MyPythonEnv\lib\site-packages\pandas\core\indexing.py", line 1153, in _getitem_axis
        return self._getitem_iterable(key, axis=axis)
      File "C:\Users\zhaosong\anaconda3\envs\MyPythonEnv\lib\site-packages\pandas\core\indexing.py", line 1093, in _getitem_iterable
        keyarr, indexer = self._get_listlike_indexer(key, axis)
      File "C:\Users\zhaosong\anaconda3\envs\MyPythonEnv\lib\site-packages\pandas\core\indexing.py", line 1314, in _get_listlike_indexer
        self._validate_read_indexer(keyarr, indexer, axis)
      File "C:\Users\zhaosong\anaconda3\envs\MyPythonEnv\lib\site-packages\pandas\core\indexing.py", line 1377, in _validate_read_indexer
        raise KeyError(f"{not_found} not in index")
    KeyError: "['BD'] not in index"
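
The same reads can be written with the explicit .iloc (position) and .loc (label) accessors, which avoid any ambiguity between positions and labels. The following is a sketch under the assumption that pandas is installed; it rebuilds the Series from the example above, and the variable names are illustrative.

```python
import pandas as pd

series_obj = pd.Series(
    {"PL": "Python", "OS": "Windows", "DB": "MySQL"},
    index=["DB", "OS", "PL", "ISO"],
)

# .iloc always works by integer position.
first = series_obj.iloc[0]
last_three = series_obj.iloc[-3:]

# .loc always works by label; a label slice includes both endpoints.
by_label = series_obj.loc["OS"]
label_slice = series_obj.loc["DB":"PL"]

# .get returns a default instead of raising KeyError for a missing label.
missing = series_obj.get("BD", "not found")

print(first, by_label, missing)
print(list(label_slice.index))
```

Using get() is one option when a missing label is an expected case rather than an error.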
    

3. How To Remove Series Object Data.

  1. Call the Series object’s drop() method to remove elements by index label. drop() returns a new Series object and leaves the original unchanged.
    >>> import pandas as pd
    >>>
    >>> dict_data = {'PL' : 'Python', 'OS' : 'Windows', 'DB' : 'MySQL'}
    >>>
    >>> index_data= ['DB', 'OS', 'PL', 'ISO']
    >>>
    >>> series_obj = pd.Series(dict_data, index=index_data)
    >>>
    >>> print(series_obj)
    DB       MySQL
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
    >>>
    >>> series_obj.drop(['DB'])
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
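
Because drop() returns a new object, the result has to be assigned if the change should persist. The sketch below illustrates this and also shows the errors="ignore" option for labels that may not exist. It assumes pandas is installed; without_db and unchanged are illustrative names.

```python
import pandas as pd

series_obj = pd.Series(
    {"PL": "Python", "OS": "Windows", "DB": "MySQL"},
    index=["DB", "OS", "PL", "ISO"],
)

# drop() returns a new Series; the original still contains 'DB'.
without_db = series_obj.drop(["DB"])
print("DB" in series_obj.index, "DB" in without_db.index)

# Reassign the result to keep the change.
series_obj = series_obj.drop(["DB"])

# A missing label raises KeyError unless errors="ignore" is passed.
unchanged = series_obj.drop(["BD"], errors="ignore")
print(list(unchanged.index))
```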

4. General Pandas Series Object Methods.

  1. head(n): Returns the first n rows of data. If the argument n is not provided, it defaults to 5 and the first 5 rows are returned.
  2. tail(n): Returns the last n rows of data. By default, it returns the last 5 rows.
    >>> import pandas as pd
    >>>
    >>> dict_data = {'PL' : 'Python', 'OS' : 'Windows', 'DB' : 'MySQL'}
    >>>
    >>> index_data= ['DB', 'OS', 'PL', 'ISO']
    >>>
    >>> series_obj = pd.Series(dict_data, index=index_data)
    >>>
    >>> print(series_obj)
    DB       MySQL
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
    >>>
    >>> print(series_obj.head(3)) # return the first 3 rows in the Series object.
    DB      MySQL
    OS    Windows
    PL     Python
    dtype: object
    >>>
    >>> print(series_obj.tail(3))
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
  3. isnull() / notnull(): Detect whether each element of the Series object is null. If an element’s value is null, isnull() returns True and notnull() returns False for that element.
    >>> import pandas as pd
    >>>
    >>> dict_data = {'PL' : 'Python', 'OS' : 'Windows', 'DB' : 'MySQL'}
    >>>
    >>> index_data= ['DB', 'OS', 'PL', 'ISO']
    >>>
    >>> series_obj = pd.Series(dict_data, index=index_data)
    >>>
    >>> print(series_obj)
    DB       MySQL
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
    >>>
    >>> print(pd.isnull(series_obj))
    DB     False
    OS     False
    PL     False
    ISO     True
    dtype: bool
    >>>
    >>> print(pd.notnull(series_obj))
    DB      True
    OS      True
    PL      True
    ISO    False
    dtype: bool
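
The null checks are also available as methods on the Series itself, and the resulting boolean mask can be used to count or filter elements. A minimal sketch, assuming pandas is installed; null_mask and present are illustrative names.

```python
import pandas as pd

series_obj = pd.Series(
    {"PL": "Python", "OS": "Windows", "DB": "MySQL"},
    index=["DB", "OS", "PL", "ISO"],
)

# Method form of the null checks; equivalent to pd.isnull / pd.notnull.
null_mask = series_obj.isnull()
print(int(null_mask.sum()))  # number of missing values

# A boolean mask can filter the Series down to the non-null elements.
present = series_obj[series_obj.notnull()]
print(list(present.index))

# head() with no argument asks for 5 rows; shorter Series are returned whole.
print(len(series_obj.head()))
```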

5. General Pandas Series Object Attributes.

  1. axes: Returns a list that contains the row axis labels (the Index object).
    >>> import pandas as pd
    >>>
    >>> dict_data = {'PL' : 'Python', 'OS' : 'Windows', 'DB' : 'MySQL'}
    >>>
    >>> index_data= ['DB', 'OS', 'PL', 'ISO']
    >>>
    >>> series_obj = pd.Series(dict_data, index=index_data)
    >>>
    >>> print(series_obj)
    DB       MySQL
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
    >>>
    >>>
    >>> print(series_obj.axes)
    [Index(['DB', 'OS', 'PL', 'ISO'], dtype='object')]
  2. dtype: Returns the data type of the Series elements.
    >>> import pandas as pd
    >>>
    >>> dict_data = {'PL' : 'Python', 'OS' : 'Windows', 'DB' : 'MySQL'}
    >>>
    >>> index_data= ['DB', 'OS', 'PL', 'ISO']
    >>>
    >>> series_obj = pd.Series(dict_data, index=index_data)
    >>>
    >>> print(series_obj)
    DB       MySQL
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
    >>>
    >>> print(series_obj.dtype)
    object
  3. empty: Returns a Boolean value used to determine whether the Series object is empty or not.
    >>> import pandas as pd
    >>>
    >>> dict_data = {'PL' : 'Python', 'OS' : 'Windows', 'DB' : 'MySQL'}
    >>>
    >>> index_data= ['DB', 'OS', 'PL', 'ISO']
    >>>
    >>> series_obj = pd.Series(dict_data, index=index_data)
    >>>
    >>> print(series_obj)
    DB       MySQL
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
    >>>
    >>> print(series_obj.empty)
    False
  4. index: Returns the Index object that holds the labels of the Series. It is a RangeIndex only when the Series uses the default integer index.
    >>> import pandas as pd
    >>>
    >>> dict_data = {'PL' : 'Python', 'OS' : 'Windows', 'DB' : 'MySQL'}
    >>>
    >>> index_data= ['DB', 'OS', 'PL', 'ISO']
    >>>
    >>> series_obj = pd.Series(dict_data, index=index_data)
    >>>
    >>> print(series_obj)
    DB       MySQL
    OS     Windows
    PL      Python
    ISO        NaN
    dtype: object
    >>>
    >>> print(series_obj.index)
    Index(['DB', 'OS', 'PL', 'ISO'], dtype='object')
    >>>
    >>> print(series_obj.index.array)
    <PandasArray>
    ['DB', 'OS', 'PL', 'ISO']
    Length: 4, dtype: object
    >>>
    >>> for key in series_obj.index.array:
    ...      print(key, ' = ', series_obj[key])
    ...
    DB  =  MySQL
    OS  =  Windows
    PL  =  Python
    ISO  =  nan
  5. ndim: Returns the dimension of the Series object. By definition, a Series object is a one-dimensional data structure, so it always returns 1.
    >>> print(series_obj.ndim)
    1
  6. size: Returns the size (length) of the series object.
    >>> print(series_obj.size)
    4
  7. values: Returns the data in the series object as an array.
    >>> print(series_obj.values)
    ['MySQL' 'Windows' 'Python' nan]
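
The attributes above can be compared side by side on a Series with the default index and one with explicit labels. This is a sketch assuming pandas is installed; default_s and labeled_s are illustrative names, and to_numpy() is shown as the method counterpart of values.

```python
import pandas as pd

# No index passed: the default index is a RangeIndex.
default_s = pd.Series(["python", "javascript", "java", "php"])
print(type(default_s.index).__name__)

# Explicit labels: the index is a regular Index of those labels.
labeled_s = pd.Series([1, 2, 3], index=["a", "b", "c"])
print(type(labeled_s.index).__name__)

# ndim, size and empty describe the shape of the data.
print(labeled_s.ndim, labeled_s.size, labeled_s.empty)

# to_numpy() is the method counterpart of the values attribute.
print(labeled_s.to_numpy())
```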
