The Difference Between PyTorch .to(device) and .cuda() Functions in Python

This article explains the difference between the PyTorch .to(device) and .cuda() functions in Python.

1. The .to(device) Function Can Specify Either CPU or GPU.

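import torch
import torch.nn as nn

# Single GPU or CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# If it is multi GPU (device_ids=[0, 1, 2] assumes at least three GPUs)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model, device_ids=[0, 1, 2])
model = model.to(device)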
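As a minimal sketch of how .to(device) is typically used, the example below moves both a model and a batch of inputs to the same device and runs a forward pass. The nn.Linear layer and the random batch are placeholders chosen for illustration; they are not part of the original article.

```python
import torch
import torch.nn as nn

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Placeholder model and input batch (assumptions for illustration only).
model = nn.Linear(4, 2).to(device)    # a module is moved in place and returned
batch = torch.randn(8, 4).to(device)  # a tensor is returned as a copy only if a move is needed

# Model and inputs must live on the same device for the forward pass to work.
output = model(batch)
print(output.device)
```

The same script runs unchanged on a machine with or without a GPU, because the only place the device is named is the torch.device line.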

2. The .cuda() Function Can Only Specify a GPU.

import os
import torch

# Specify a GPU

# If it is multi GPU
# (set CUDA_VISIBLE_DEVICES before the first CUDA call, or it has no effect)
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3'
device_ids = [0, 1, 2, 3]
net = torch.nn.DataParallel(net, device_ids=device_ids)

# Or use all visible devices by default
net = torch.nn.DataParallel(net)
net = net.cuda()
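Because .cuda() has no CPU fallback, calling it on a machine without a usable GPU raises an error, so scripts that rely on it usually need their own guard. The sketch below shows one way to write that guard; the nn.Linear model is a placeholder assumed for illustration.

```python
import torch
import torch.nn as nn

net = nn.Linear(4, 2)  # placeholder model (assumption)

# .cuda() can only target a GPU, so check for one before calling it.
if torch.cuda.is_available():
    net = net.cuda(0)  # the optional argument selects the GPU index
# Otherwise the model simply stays on the CPU, where it was created.

param_device = next(net.parameters()).device
print(param_device)
```

With .to(device) the same fallback is expressed once, in the torch.device line, instead of in an if statement around every move.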

3. The Concept of Device-Agnostic Code.

  1. Device-agnostic means that the same code can run on any device.
  2. Code written with the PyTorch .to() method can run on either kind of device (CUDA or CPU) without changes.
  3. Writing device-agnostic code was difficult in versions of PyTorch before 0.4.0.
  4. PyTorch 0.4.0 makes it much easier in two ways: the device attribute of a tensor gives the torch.device it is stored on, and the .to() method of tensors and modules moves them to a given device.
  5. Below is some example source code.
    # At the start of the script, create the device object once
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    # Whenever you get a new tensor or module, move it with .to(device);
    # if it is already on the target device, no copy is performed
    input = data.to(device)  # "data" is a placeholder for the tensor you loaded
    model = MyModule(...).to(device)
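The device-agnostic pattern can be sketched end to end as follows. This is an illustrative example under stated assumptions, not the article's own code: the body of MyModule, the tensor shapes, and the names data, inputs, and offset are all hypothetical. It uses both features mentioned above, the .to() method and the device attribute of a tensor.

```python
import torch
import torch.nn as nn

# Name the device exactly once, at the top of the script.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


class MyModule(nn.Module):
    """Placeholder module (assumption); any nn.Module behaves the same way."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        # Create a new tensor on whatever device x lives on,
        # without hard-coding "cuda" or "cpu" inside the module.
        offset = torch.ones(2, device=x.device)
        return self.fc(x) + offset


model = MyModule().to(device)
data = torch.randn(3, 4)   # created on the CPU by default
inputs = data.to(device)   # moved only if it is not already on `device`
output = model(inputs)

# Calling .to() again on a tensor that is already in place returns the same object.
same_object = inputs.to(device) is inputs
print(output.device, same_object)
```

Reading x.device inside forward keeps the module itself free of any device name, so moving the model with .to() is the only step that changes between a CPU run and a GPU run.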

