

Getting Started with PyTorch

Deep Learning, PyTorch, Machine Learning, Python · 3 min read


TL;DR Learn how to create and manipulate Tensors on the CPU and GPU. Find out about common errors you might encounter while working with PyTorch and some easy ways to prevent them!

PyTorch is:

An open source machine learning framework that accelerates the path from research prototyping to production deployment.

In my humble opinion, PyTorch is the sweet way to solve Machine Learning problems in the real world! The vast community allows you to work with state-of-the-art models and deploy them to production in no time (relatively speaking). Let’s get started!

!pip install -q -U torch watermark

%load_ext watermark
%watermark -v -p numpy,torch

CPython 3.6.9
IPython 5.5.0

numpy 1.17.5
torch 1.4.0

Run the complete notebook in your browser

The complete project on GitHub

PyTorch ❤ NumPy

Do you know NumPy? If you do, learning PyTorch will be a breeze! If you don’t, prepare to learn the skills that will guide you on your journey to Machine Learning mastery!

Let’s start with something simple:

import torch
import numpy as np

a = np.array([1, 2])
b = np.array([8, 9])

c = a + b
c

array([ 9, 11])

Adding the same arrays with PyTorch looks like this:

a = torch.tensor([1, 2])
b = torch.tensor([8, 9])

c = a + b
c

tensor([ 9, 11])

Fortunately, you can go from PyTorch to NumPy:

a = torch.tensor([1, 2])
a.numpy()

array([1, 2])

and vice versa:

a = np.array([1, 2])
torch.from_numpy(a)

tensor([1, 2])

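The good news is that the conversions cost almost nothing in terms of performance. NumPy and PyTorch store data in memory in the same way, so a CPU Tensor and the array it was converted to (or from) share the same underlying memory instead of copying it. That is, PyTorch is reusing the work done by NumPy.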
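Here is a minimal sketch of what that sharing means in practice. It assumes CPU data and uses `torch.from_numpy` (which wraps the existing buffer) next to `torch.tensor` (which copies); the variable names are just for illustration:

```python
import numpy as np
import torch

arr = np.array([1, 2])

# torch.from_numpy wraps the existing NumPy buffer instead of copying it
shared = torch.from_numpy(arr)

# torch.tensor copies the data it is given
copied = torch.tensor(arr)

# Change the original array in place
arr[0] = 100

print(shared)  # reflects the change made through the NumPy array
print(copied)  # still holds the original values
```

The flip side of the cheap conversion is that a change made through one object is visible through the other. Copy explicitly when you need independent data.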


Tensors are just n-dimensional containers for numbers (including booleans). You can find the complete list of supported data types at PyTorch’s Tensor Docs.

So, how can you create a Tensor (try to ignore that I’ve already shown you how to do it)?

torch.tensor([[1, 2], [2, 1]])

tensor([[1, 2],
        [2, 1]])

You can create a tensor of floats:

torch.FloatTensor([[1, 2], [2, 1]])

tensor([[1., 2.],
        [2., 1.]])

Or define the type like so:

torch.tensor([[1, 2], [2, 1]], dtype=torch.bool)

tensor([[True, True],
        [True, True]])
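If you are unsure which type a Tensor ended up with, you can inspect and convert it. The sketch below assumes PyTorch's default settings (the default float type has not been changed); the variable names are made up for the example:

```python
import torch

ints = torch.tensor([[1, 2], [2, 1]])
floats = torch.tensor([[1.0, 2.0], [2.0, 1.0]])

print(ints.dtype)    # Python ints become 64-bit integers
print(floats.dtype)  # Python floats become 32-bit floats by default

# .to() returns a new Tensor with the requested type
as_float = ints.to(torch.float32)
as_bool = ints.to(torch.bool)  # every non-zero value becomes True

print(as_float.dtype, as_bool.dtype)
```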

You can use a wide range of factory methods to create Tensors without manually specifying each number. For example, you can create a matrix with random numbers like this:

torch.rand(3, 2)

tensor([[0.6686, 0.7622],
        [0.0341, 0.5835],
        [0.2423, 0.0651]])

Or one full of ones:

torch.ones(3, 2)

tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])
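A few more factory methods you are likely to reach for, as a quick sketch. The seed value is arbitrary and only there to make the random numbers repeatable:

```python
import torch

torch.manual_seed(42)  # arbitrary seed so torch.rand gives repeatable values

zeros = torch.zeros(3, 2)            # 3x2 matrix of zeros
counting = torch.arange(6)           # integers 0 through 5
identity = torch.eye(2)              # 2x2 identity matrix
noise = torch.rand(3, 2)             # uniform random values in [0, 1)
same_shape = torch.ones_like(noise)  # ones with the shape and type of noise

print(zeros.shape, counting.shape, identity.shape, same_shape.shape)
```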

PyTorch has a variety of useful operations:

x = torch.tensor([[2, 3], [1, 2]])
print(x)
print(f'sum: {x.sum()}')

tensor([[2, 3],
        [1, 2]])
sum: 8

Get the transpose of a 2-D tensor:

x.t()

tensor([[2, 1],
        [3, 2]])

Get the size of each dimension:

x.size()

torch.Size([2, 2])
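Sums and transposes get more interesting once you pick a dimension or use a non-square matrix. A small sketch (the variable names are only for illustration):

```python
import torch

x = torch.tensor([[2, 3], [1, 2]])

col_sums = x.sum(dim=0)  # collapse the rows: one sum per column
row_sums = x.sum(dim=1)  # collapse the columns: one sum per row
print(col_sums, row_sums)

# Transposing swaps the two dimensions, which shows up in the size
tall = torch.ones(3, 2)
print(tall.size(), tall.t().size())
```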

Generally, performing some operation creates a new Tensor:

y = torch.tensor([[2, 2], [5, 1]])
z = x.add(y)
print(z)

tensor([[4, 5],
        [6, 3]])

But you can do it in-place:

x.add_(y)
print(x)

tensor([[4, 5],
        [6, 3]])

Almost all operations have an in-place version: the name of the operation, followed by an underscore (add becomes add_).
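To make the difference concrete, here is a small sketch comparing the two versions side by side. The names `x_before` and `result` are just for the example:

```python
import torch

x = torch.tensor([[2, 3], [1, 2]])
y = torch.tensor([[2, 2], [5, 1]])

z = x.add(y)           # out-of-place: returns a new Tensor
x_before = x.tolist()  # x still holds its original values here

result = x.add_(y)     # in-place: overwrites the contents of x
print(x)
print(result is x)     # the in-place call hands back x itself
```

In-place operations save memory, but they overwrite values you might still need, so use them deliberately.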

Running on GPU

At this point, you might be like: “Why do I need PyTorch at all? All of this is perfectly doable with NumPy.” PyTorch has three major superpowers:

Doing your Deep Learning computations on the GPU can speed up your experiments by a lot! And PyTorch makes it ridiculously easy to do. Let’s start by checking if a GPU is available:

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

Good, we have a CUDA-enabled GPU device on our hands. Let’s store a Tensor on it:

x = torch.tensor([[2, 3], [1, 2]])
x = x.to(device)
x

tensor([[2, 3],
        [1, 2]], device='cuda:0')

Notice that our Tensor is now on device cuda:0. What can we do with it? Pretty much everything as before:

x = x.to(device)
y = torch.tensor([[2, 2], [5, 1]])
y = y.to(device)
x + y

tensor([[4, 5],
        [6, 3]], device='cuda:0')
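If you'd rather skip the separate move, you can also create Tensors directly on the target device. This is a sketch of that pattern; it falls back to the CPU when no GPU is present, so it runs anywhere:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The device argument places the Tensor on the chosen device at creation time
x = torch.tensor([[2, 3], [1, 2]], device=device)
y = torch.tensor([[2, 2], [5, 1]], device=device)
z = x + y

print(z.device)

# NumPy only works with CPU memory, so bring the result back first
result = z.cpu().numpy()
print(result)
```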

Common Issues

I’ve got to be honest with you. You will fuck up, multiple times, before understanding how this whole thing works. That’s alright!

However, there are a couple of things you can do that might minimize the frustrations along your journey:

  • Doing operations between GPU and CPU Tensors is not allowed. Move both Tensors to the same device first
  • Size mismatch between Tensors occurs often and is (almost every time) easy to fix:
a = torch.ones(2, 2)
b = torch.ones(1, 3)
a * b

RuntimeError Traceback (most recent call last)
<ipython-input-21-b2a4e8765762> in <module>()
      1 a = torch.ones(2, 2)
      2 b = torch.ones(1, 3)
----> 3 a * b

RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 1

PyTorch is very descriptive in this case. When doing more complex stuff, you would want to check the shape of your Tensors obsessively, after every operation. Just print the size!
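Here is a sketch of that habit in action. It also shows why `(2, 2) * (1, 2)` works while `(2, 2) * (1, 3)` does not: a dimension of size 1 gets stretched (broadcast) to match, anything else must line up exactly. The Tensor `c` and the `failed` flag are additions for the example:

```python
import torch

a = torch.ones(2, 2)
b = torch.ones(1, 3)
c = torch.ones(1, 2)

# Print the shapes before combining anything
print(a.shape, b.shape, c.shape)

# Works: the size-1 dimension of c is broadcast across the rows of a
ok = a * c
print(ok.shape)

# Fails: 2 and 3 cannot be matched at dimension 1
try:
    a * b
    failed = False
except RuntimeError as err:
    failed = True
    print(f"shape mismatch: {err}")
```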

  • Running out of GPU memory: You might be leaking memory, or your dataset/model might be too large. A faster/better GPU always helps. But remember, you can solve really large problems with a single powerful GPU these days. If that is not enough for you, think carefully about why that is.
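For the first item in the list (mixing GPU and CPU Tensors), one way to avoid the error is to check the `.device` attribute and move one Tensor before combining them. The helper below, `add_on_same_device`, is a hypothetical name invented for this sketch and not part of PyTorch:

```python
import torch


def add_on_same_device(a, b):
    """Hypothetical helper: move b to a's device if needed, then add."""
    if a.device != b.device:
        b = b.to(a.device)
    return a + b


# Falls back to the CPU when no GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

on_device = torch.ones(2, 2, device=device)
on_cpu = torch.ones(2, 2)

total = add_on_same_device(on_device, on_cpu)
print(total.device)
```

On a machine without a GPU both Tensors already live on the CPU, so the helper simply adds them.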


Welcome to the dark side! You might’ve been working with Keras, TensorFlow, or another Deep Learning framework until recently. Almost every framework is great, but PyTorch has really solid roots. It is easy to use and understand, allows for fast experimentation, and works with standard debugging tools! Enjoy!
