PyTorch Tutorial: Tensor Operations and Building Neural Networks

남생이a 2024. 8. 16. 23:18

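PyTorch¶

Facebook reimplemented Torch, a library originally written in Lua, for Python and released it publicly in 2017
Torch itself started out as a scientific computing library, similar to NumPy
It later grew into a deep learning framework with GPU-backed tensor operations and dynamically built neural networks
PyTorch is designed to feel Pythonic and offers flexibility together with accelerated computation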

PyTorch module structure¶

Source: Deep Learning with PyTorch by Eli Stevens and Luca Antiga. MEAP Publication. https://livebook.manning.com/#!/book/deep-learning-with-pytorch/welcome/v-7/

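PyTorch components¶

torch: the main namespace; contains tensors and a wide range of math functions
torch.autograd: library providing automatic differentiation
torch.nn: data structures and layers for building neural networks
torch.multiprocessing: library providing parallel processing
torch.optim: parameter optimization algorithms, centered on SGD (Stochastic Gradient Descent)
torch.utils: utilities such as data handling
torch.onnx: ONNX (Open Neural Network Exchange) support, used to share models between different frameworks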

Tensors¶

Tensors are the basic structure for representing data
A tensor is a container for data, usually numeric data
Similar to NumPy's ndarray
Computation can be accelerated on a GPU

In [ ]:

import torch

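Tensor initialization and data types¶

Tensor created from a list of values (an uninitialized tensor would instead come from torch.empty)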

In [ ]:

t = torch.FloatTensor([0., 1., 2., 3., 4., 5., 6.])
print(t)

tensor([0., 1., 2., 3., 4., 5., 6.])

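Randomly initialized tensor

In [ ]:

Tensor filled with zeros, with data type (dtype) long

In [ ]:

Tensor initialized from user-provided values

In [ ]:

x = 10

In [ ]:

2 x 4 tensor of type double, filled with ones

In [ ]:

Tensor with the same size as x, of type float, filled with random values

In [ ]:

Computing the size of a tensor

In [ ]: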
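
The initialization variants listed above have no code in their cells, so here is a minimal sketch of one way to write them. The variable names (`empty`, `rand`, `zeros`, `from_value`, `ones`, `like`) and the 4 x 2 shape are assumptions made for illustration; only `x = 10` and the 2 x 4 double tensor come from the text.

```python
import torch

# Uninitialized memory: the values are arbitrary until they are written.
empty = torch.empty(4, 2)

# Uniform random values in [0, 1).
rand = torch.rand(4, 2)

# Zeros with an explicit 64-bit integer (long) dtype.
zeros = torch.zeros(4, 2, dtype=torch.long)

# Tensor built from a user-provided value (the scalar x = 10 from the cell above).
x = 10
from_value = torch.tensor(x)

# 2 x 4 tensor of ones in double precision.
ones = torch.ones(2, 4, dtype=torch.double)

# Same shape as `ones`, but float32 and drawn from a standard normal distribution.
like = torch.randn_like(ones, dtype=torch.float)

print(empty.shape, rand.shape, zeros.dtype)
print(from_value, ones.dtype, like.dtype)
print(like.size())  # size() and .shape report the same torch.Size
```

`torch.empty` only allocates memory, which is why it is the call to use when every element will be overwritten anyway.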

Data types¶

Data type	dtype	CPU tensor	GPU tensor
32-bit floating point	`torch.float32` or `torch.float`	`torch.FloatTensor`	`torch.cuda.FloatTensor`
64-bit floating point	`torch.float64` or `torch.double`	`torch.DoubleTensor`	`torch.cuda.DoubleTensor`
16-bit floating point	`torch.float16` or `torch.half`	`torch.HalfTensor`	`torch.cuda.HalfTensor`
8-bit integer(unsigned)	`torch.uint8`	`torch.ByteTensor`	`torch.cuda.ByteTensor`
8-bit integer(signed)	`torch.int8`	`torch.CharTensor`	`torch.cuda.CharTensor`
16-bit integer(signed)	`torch.int16` or `torch.short`	`torch.ShortTensor`	`torch.cuda.ShortTensor`
32-bit integer(signed)	`torch.int32` or `torch.int`	`torch.IntTensor`	`torch.cuda.IntTensor`
64-bit integer(signed)	`torch.int64` or `torch.long`	`torch.LongTensor`	`torch.cuda.LongTensor`

In [ ]:
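
The cell under the table is empty. As a rough sketch of how the dtypes in the table are used in practice, the following casts one tensor to several of them. The tensor `ft` and its values are assumptions chosen for illustration.

```python
import torch

ft = torch.tensor([1.0, 2.0, 3.0])   # floating-point literals default to torch.float32
print(ft.dtype)

it = ft.to(torch.int32)              # explicit cast with .to()
print(it, it.dtype)

half = ft.half()                     # shorthand for .to(torch.float16)
long = ft.long()                     # shorthand for .to(torch.int64)
print(half.dtype, long.dtype)
```

Casting returns a new tensor; the original `ft` keeps its float32 dtype.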

CUDA Tensors¶

The .to method moves a tensor to any device (CPU or GPU)

In [ ]:

x = torch.randn(1)
print(x)
print(x.item())
print(x.dtype)

tensor([-0.5172])
-0.5171635746955872
torch.float32

In [41]:

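# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)
y = torch.ones_like(x, device=device)  # create y directly on the target device
print(y)
x = x.to(device)                       # move x (defined in the previous cell) to the same device
print(x)
z = x + y
print(z)
print(z.to('cpu', torch.double))       # .to can change device and dtype in one call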

cpu

(In the original run this cell stopped at line 3 with `NameError: name 'x' is not defined`, because the previous cell that defines x had not been executed in that session. Run the cell above first.)

Representing multi-dimensional tensors¶

0D Tensor(Scalar)

A tensor that holds a single number
It has no axes, and its shape is empty

In [ ]:

t0 = torch.tensor(0)
print(t0.ndim)
print(t0.shape)
print(t0)

0
torch.Size([])
tensor(0)

1D Tensor(Vector)

A tensor that stores values much like a list
It has a single axis

In [ ]:

t1 = torch.tensor([1, 2, 3])
print(t1.ndim)
print(t1.shape)
print(t1)

1
torch.Size([3])
tensor([1, 2, 3])

2D Tensor(Matrix)

Shaped like a matrix, with two axes
Typical numeric and statistical datasets take this form
Usually structured as (samples, features)

In [ ]:

t2 = torch.tensor([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
print(t2.ndim)
print(t2.shape)
print(t2)

2
torch.Size([3, 3])
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

3D Tensor

Shaped like a cube, with three axes
Covers sequence data and time-series data that include a time axis
Examples include stock price datasets and disease incidence over time
Usually structured as (samples, timesteps, features)

In [ ]:
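
The 3D cell has no code. A minimal sketch in the same style as the 0D to 2D cells could look like this; the name `t3` and its values are assumptions for illustration.

```python
import torch

# Two "samples", each with two "timesteps" of three "features".
t3 = torch.tensor([[[1, 2, 3],
                    [4, 5, 6]],
                   [[7, 8, 9],
                    [10, 11, 12]]])
print(t3.ndim)   # number of axes
print(t3.shape)  # size along each axis
print(t3)
```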

4D Tensor

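Four axes
Color image data is the typical example (grayscale images can be stored as 3D tensors)
Usually structured as (samples, height, width, channels); PyTorch layers such as nn.Conv2d expect the channels-first order (samples, channels, height, width)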

5D Tensor

Five axes
Video data is the typical example
Usually structured as (samples, frames, height, width, channels)
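
The 4D and 5D cases come without code. The sketch below builds placeholder tensors with those layouts and converts a channels-last image batch to the channels-first layout that PyTorch convolution layers expect. All shapes and names here are assumptions chosen for illustration.

```python
import torch

# 4D: a batch of 8 RGB images, 28 x 28 pixels, stored channels-last (N, H, W, C).
images_nhwc = torch.zeros(8, 28, 28, 3)

# permute reorders the axes: (N, H, W, C) -> (N, C, H, W).
images_nchw = images_nhwc.permute(0, 3, 1, 2)
print(images_nchw.ndim, images_nchw.shape)

# 5D: 2 video clips of 16 frames each, stored as (N, frames, H, W, C).
video = torch.zeros(2, 16, 28, 28, 3)
print(video.ndim, video.shape)
```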

Tensor operations¶

PyTorch provides math operations, trigonometric functions, bitwise operations, comparisons, aggregations, and more on tensors

In [ ]:

import math

a = torch.rand(1, 2) * 2 - 1
print(a)
print(torch.abs(a))
print(torch.ceil(a))
print(torch.floor(a))
print(torch.clamp(a, -0.5, 0.5))

tensor([[-0.7389, -0.0339]])
tensor([[0.7389, 0.0339]])
tensor([[-0., -0.]])
tensor([[-1., -1.]])
tensor([[-0.5000, -0.0339]])

In [ ]:

print(a)
print(torch.min(a))
print(torch.max(a))
print(torch.mean(a))
print(torch.std(a))
print(torch.prod(a))
print(torch.unique(torch.tensor([1, 2, 3, 1, 2, 3])))

tensor([[-0.7389, -0.0339]])
tensor(-0.7389)
tensor(-0.0339)
tensor(-0.3864)
tensor(0.4986)
tensor(0.0250)
tensor([1, 2, 3])

When given a dim argument, max and min also return argmax and argmin

argmax: index of the maximum value
argmin: index of the minimum value

In [ ]:

x = torch.rand(2, 2)
print(x)
print(x.max(dim=0))
print(x.max(dim=1))

tensor([[0.7870, 0.3448],
        [0.5940, 0.6871]])
torch.return_types.max(
values=tensor([0.7870, 0.6871]),
indices=tensor([0, 1]))
torch.return_types.max(
values=tensor([0.7870, 0.6871]),
indices=tensor([0, 1]))

In [ ]:

print(x)
print(x.min(dim=0))
print(x.min(dim=1))

tensor([[0.7870, 0.3448],
        [0.5940, 0.6871]])
torch.return_types.min(
values=tensor([0.5940, 0.3448]),
indices=tensor([1, 0]))
torch.return_types.min(
values=tensor([0.3448, 0.5940]),
indices=tensor([1, 0]))

In [ ]:

x = torch.rand(2, 2)
print(x)
y = torch.rand(2, 2)
print(y)

tensor([[0.9778, 0.2322],
        [0.9288, 0.9593]])
tensor([[0.4161, 0.8909],
        [0.9978, 0.6027]])

torch.add: addition

In [ ]:

print(x + y)
print(torch.add(x, y))

tensor([[1.3939, 1.1232],
        [1.9266, 1.5620]])
tensor([[1.3939, 1.1232],
        [1.9266, 1.5620]])

Passing the result tensor as an argument (out=)

In [ ]:

result = torch.empty(2, 2)  # must match the 2 x 2 shape of x + y
torch.add(x, y, out=result)
print(result)

tensor([[1.3939, 1.1232],
        [1.9266, 1.5620]])

In-place operations

Operations that modify a tensor's values in place have a trailing underscore (_) in their name
Examples: x.copy_(y), x.t_()

In [ ]:

print(x)
print(y)
y.add_(x)
print(y)

tensor([[0.9778, 0.2322],
        [0.9288, 0.9593]])
tensor([[0.4161, 0.8909],
        [0.9978, 0.6027]])
tensor([[1.3939, 1.1232],
        [1.9266, 1.5620]])

torch.sub: subtraction

In [ ]:

print(x)
print(y)
print(x - y)
print(torch.sub(x, y))
print(x.sub(y))

tensor([[0.9778, 0.2322],
        [0.9288, 0.9593]])
tensor([[1.3939, 1.1232],
        [1.9266, 1.5620]])
tensor([[-0.4161, -0.8909],
        [-0.9978, -0.6027]])
tensor([[-0.4161, -0.8909],
        [-0.9978, -0.6027]])
tensor([[-0.4161, -0.8909],
        [-0.9978, -0.6027]])

torch.mul: multiplication (element-wise)

In [ ]:

print(x)
print(y)
print(x * y)
print(torch.mul(x, y))
print(x.mul(y))

tensor([[0.9778, 0.2322],
        [0.9288, 0.9593]])
tensor([[1.3939, 1.1232],
        [1.9266, 1.5620]])
tensor([[1.3629, 0.2608],
        [1.7893, 1.4984]])
tensor([[1.3629, 0.2608],
        [1.7893, 1.4984]])
tensor([[1.3629, 0.2608],
        [1.7893, 1.4984]])

torch.div: division

In [ ]:

print(x)
print(y)
print(x / y)
print(torch.div(x, y))
print(x.div(y))

tensor([[0.9778, 0.2322],
        [0.9288, 0.9593]])
tensor([[1.3939, 1.1232],
        [1.9266, 1.5620]])
tensor([[0.7015, 0.2068],
        [0.4821, 0.6142]])
tensor([[0.7015, 0.2068],
        [0.4821, 0.6142]])
tensor([[0.7015, 0.2068],
        [0.4821, 0.6142]])

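torch.mm: matrix multiplication (torch.svd, used below, is deprecated in recent PyTorch releases in favor of torch.linalg.svd)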

In [ ]:

print(x)
print(y)
print(torch.matmul(x, y))
z = torch.mm(x, y)
print(z)
print(torch.svd(z))

tensor([[0.9778, 0.2322],
        [0.9288, 0.9593]])
tensor([[1.3939, 1.1232],
        [1.9266, 1.5620]])
tensor([[1.8103, 1.4609],
        [3.1428, 2.5416]])
tensor([[1.8103, 1.4609],
        [3.1428, 2.5416]])
torch.return_types.svd(
U=tensor([[-0.4988, -0.8667],
        [-0.8667,  0.4988]]),
S=tensor([4.6635e+00, 2.0650e-03]),
V=tensor([[-0.7777, -0.6286],
        [-0.6286,  0.7777]]))
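
Element-wise multiplication, matrix multiplication, and the vector dot product are easy to mix up, so here is a small sketch with hand-checkable numbers. The tensors `a` and `b` are assumptions chosen for illustration.

```python
import torch

a = torch.tensor([[1., 2.],
                  [3., 4.]])
b = torch.tensor([[5., 6.],
                  [7., 8.]])

elementwise = a * b           # same as torch.mul(a, b): [[5, 12], [21, 32]]
matrix = torch.mm(a, b)       # rows of a times columns of b: [[19, 22], [43, 50]]
same = a @ b                  # the @ operator calls matmul
dot = torch.dot(a[0], b[0])   # 1D vectors only: 1*5 + 2*6 = 17

print(elementwise)
print(matrix)
print(torch.equal(matrix, same))
print(dot)
```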

Tensor manipulations¶

Indexing: NumPy-style indexing is supported

In [ ]:

x = torch.Tensor([[1, 2],
                  [3, 4]])
print(x)
print(x[0, 0])
print(x[0, 1])
print(x[1, 0])
print(x[1, 1])

print(x[:, 0])
print(x[:, 1])
print(x[0, :])
print(x[1, :])

tensor([[1., 2.],
        [3., 4.]])
tensor(1.)
tensor(2.)
tensor(3.)
tensor(4.)
tensor([1., 3.])
tensor([2., 4.])
tensor([1., 2.])
tensor([3., 4.])

view: changes the size or shape of a tensor

The total number of elements must stay the same before and after the change
A dimension given as -1 is inferred from the remaining dimensions

In [ ]:

x = torch.randn(4, 5)
print(x)
y = x.view(20)
print(y)
z = x.view(5, -1)
print(z)

tensor([[ 1.3183,  0.4399,  0.4315,  0.7125, -0.4232],
        [ 0.3619, -1.0817, -0.5947, -0.5867,  0.4631],
        [-2.2517,  0.8459,  1.5883, -0.3597, -0.1833],
        [-1.4298, -0.5841, -1.0301, -0.6693,  0.1959]])
tensor([ 1.3183,  0.4399,  0.4315,  0.7125, -0.4232,  0.3619, -1.0817, -0.5947,
        -0.5867,  0.4631, -2.2517,  0.8459,  1.5883, -0.3597, -0.1833, -1.4298,
        -0.5841, -1.0301, -0.6693,  0.1959])
tensor([[ 1.3183,  0.4399,  0.4315,  0.7125],
        [-0.4232,  0.3619, -1.0817, -0.5947],
        [-0.5867,  0.4631, -2.2517,  0.8459],
        [ 1.5883, -0.3597, -0.1833, -1.4298],
        [-0.5841, -1.0301, -0.6693,  0.1959]])

item: returns the value as a Python number when the tensor contains exactly one element

In [ ]:

x = torch.rand(1)
print(x)
print(x.item())
print(x.dtype)

tensor([0.5269])
0.5268673896789551
torch.float32

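item() works only when the tensor holds a single scalar value

In [ ]:

x = torch.rand(2)
print(x)
print(x.item())  # raises RuntimeError: the tensor has two elements
print(x.dtype)

tensor([0.6151, 0.8299])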

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[24], line 3
      1 x = torch.rand(2)
      2 print(x)
----> 3 print(x.item())
      4 print(x.dtype)

RuntimeError: a Tensor with 2 elements cannot be converted to Scalar

squeeze: removes dimensions of size 1

In [ ]:

tensor = torch.rand(1, 3, 3)
print(tensor)
print(tensor.shape)

tensor([[[0.9809, 0.5227, 0.9596],
         [0.2491, 0.3174, 0.0162],
         [0.4459, 0.2969, 0.0172]]])
torch.Size([1, 3, 3])

In [ ]:

t = tensor.squeeze()
print(t)
print(t.shape)

tensor([[0.9809, 0.5227, 0.9596],
        [0.2491, 0.3174, 0.0162],
        [0.4459, 0.2969, 0.0172]])
torch.Size([3, 3])

unsqueeze: adds a dimension of size 1

In [ ]:

t = torch.rand(3, 3)
print(t)
print(t.shape)

tensor([[0.7453, 0.8929, 0.4483],
        [0.2229, 0.6966, 0.1367],
        [0.0719, 0.9958, 0.3260]])
torch.Size([3, 3])

In [ ]:

tensor = t.unsqueeze(dim=0)
print(tensor)
print(tensor.shape)

tensor([[[0.7453, 0.8929, 0.4483],
         [0.2229, 0.6966, 0.1367],
         [0.0719, 0.9958, 0.3260]]])
torch.Size([1, 3, 3])

In [ ]:

tensor = t.unsqueeze(dim=2)
print(tensor)
print(tensor.shape)

tensor([[[0.7453],
         [0.8929],
         [0.4483]],

        [[0.2229],
         [0.6966],
         [0.1367]],

        [[0.0719],
         [0.9958],
         [0.3260]]])
torch.Size([3, 3, 1])

stack: joins tensors along a new dimension

In [ ]:

x = torch.FloatTensor([1, 4])
print(x)
y = torch.FloatTensor([2, 5])
print(y)
z = torch.FloatTensor([3, 6])
print(z)

print(torch.stack([x, y, z]))

tensor([1., 4.])
tensor([2., 5.])
tensor([3., 6.])
tensor([[1., 4.],
        [2., 5.],
        [3., 6.]])

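cat: concatenates tensors

Similar to stacking in NumPy, but the dimension to join along must already exist
The tensors are joined along that dimension, which grows accordingly; all other dimensions must match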

In [ ]:

a = torch.randn(1, 3, 3)
print(a)
b = torch.randn(1, 3, 3)
print(b)
c = torch.cat((a, b), dim=0)
print(c)
print(c.size())

tensor([[[ 0.9720,  0.5302,  0.3912],
         [-0.3124, -0.6052,  0.6506],
         [ 2.0312,  0.1410, -0.0298]]])
tensor([[[-1.1560, -0.2555,  1.0128],
         [ 0.7819,  0.3953,  1.7806],
         [-1.0710,  0.2716, -1.2741]]])
tensor([[[ 0.9720,  0.5302,  0.3912],
         [-0.3124, -0.6052,  0.6506],
         [ 2.0312,  0.1410, -0.0298]],

        [[-1.1560, -0.2555,  1.0128],
         [ 0.7819,  0.3953,  1.7806],
         [-1.0710,  0.2716, -1.2741]]])
torch.Size([2, 3, 3])

In [ ]:

c = torch.cat((a, b), dim=1)
print(c)
print(c.size())

tensor([[[ 0.9720,  0.5302,  0.3912],
         [-0.3124, -0.6052,  0.6506],
         [ 2.0312,  0.1410, -0.0298],
         [-1.1560, -0.2555,  1.0128],
         [ 0.7819,  0.3953,  1.7806],
         [-1.0710,  0.2716, -1.2741]]])
torch.Size([1, 6, 3])

In [ ]:

c = torch.cat((a, b), dim=2)
print(c)
print(c.size())

tensor([[[ 0.9720,  0.5302,  0.3912, -1.1560, -0.2555,  1.0128],
         [-0.3124, -0.6052,  0.6506,  0.7819,  0.3953,  1.7806],
         [ 2.0312,  0.1410, -0.0298, -1.0710,  0.2716, -1.2741]]])
torch.Size([1, 3, 6])

chunk: splits a tensor into several pieces (how many pieces?)

In [ ]:

tensor = torch.rand(3, 6)
print(tensor)

t1, t2, t3 = torch.chunk(tensor, 3, dim=1)
print(t1)
print(t2)
print(t3)

tensor([[0.9582, 0.5981, 0.6283, 0.6279, 0.8309, 0.7105],
        [0.9168, 0.6787, 0.3635, 0.7603, 0.0968, 0.7248],
        [0.5706, 0.6225, 0.7284, 0.8518, 0.3439, 0.8861]])
tensor([[0.9582, 0.5981],
        [0.9168, 0.6787],
        [0.5706, 0.6225]])
tensor([[0.6283, 0.6279],
        [0.3635, 0.7603],
        [0.7284, 0.8518]])
tensor([[0.8309, 0.7105],
        [0.0968, 0.7248],
        [0.3439, 0.8861]])

split: like chunk, but the argument is the size of each piece rather than the number of pieces (how large is each piece?)

In [ ]:

tensor = torch.rand(3, 6)
t1, t2 = torch.split(tensor, 3, dim=1)

print(tensor)
print(t1)
print(t2)

tensor([[0.6629, 0.7750, 0.7590, 0.0481, 0.2992, 0.2832],
        [0.4073, 0.6713, 0.6036, 0.0977, 0.4953, 0.0680],
        [0.4750, 0.0259, 0.5710, 0.3239, 0.5928, 0.6802]])
tensor([[0.6629, 0.7750, 0.7590],
        [0.4073, 0.6713, 0.6036],
        [0.4750, 0.0259, 0.5710]])
tensor([[0.0481, 0.2992, 0.2832],
        [0.0977, 0.4953, 0.0680],
        [0.3239, 0.5928, 0.6802]])
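
With six columns the two calls above happen to look alike. A sketch with a width that does not divide evenly makes the difference visible; the 2 x 5 tensor `t` is an assumption chosen for illustration.

```python
import torch

t = torch.arange(10).reshape(2, 5)

# chunk: "give me 3 pieces" -> widths 2, 2, 1
chunk_widths = [c.shape[1] for c in torch.chunk(t, 3, dim=1)]

# split with an int: "each piece is 3 wide" -> widths 3, 2
split_widths = [s.shape[1] for s in torch.split(t, 3, dim=1)]

# split with a list: explicit widths that must sum to the dimension size
custom_widths = [s.shape[1] for s in torch.split(t, [1, 4], dim=1)]

print(chunk_widths, split_widths, custom_widths)
```

In both cases the last piece is simply smaller when the size does not divide evenly.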

torch ↔ numpy

A Torch Tensor and a NumPy array can be converted into each other
- numpy()
- from_numpy()
When the Tensor lives on the CPU, it shares memory with the NumPy array, so changing one changes the other

In [ ]:

a = torch.ones(7)
print(a)

tensor([1., 1., 1., 1., 1., 1., 1.])

In [ ]:

b = a.numpy()
print(b)

[1. 1. 1. 1. 1. 1. 1.]

In [ ]:

a.add_(1)
print(a)
print(b)

tensor([2., 2., 2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2. 2. 2.]

In [ ]:

import numpy as np

a = np.ones(7)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)

[2. 2. 2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2., 2., 2.], dtype=torch.float64)

In [ ]:

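Autograd (automatic differentiation)¶

The torch.autograd package provides automatic differentiation for all operations on Tensors
Backpropagation is defined by how the code actually runs (define-by-run), so it can differ from one execution to the next
The gradients needed for backprop are computed automatically

Setting requires_grad to True makes autograd track every operation performed on that tensor

To stop tracking history, call .detach() to separate the tensor from the computation history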

In [ ]:

a = torch.randn(3, 3)
a = a * 3
print(a)
print(a.requires_grad)

tensor([[-4.2719,  1.3259, -0.3379],
        [ 6.9943,  2.0913,  1.3107],
        [ 2.7305,  0.7880, -2.9714]])
False

requires_grad_(...) changes the requires_grad flag of an existing tensor in place

grad_fn: references the function that produced the tensor, which autograd uses to compute the gradient during backprop

In [ ]:

a.requires_grad_(True)
print(a.requires_grad)

b = (a * a).sum()
print(b)
print(b.grad_fn)

True
tensor(92.0392, grad_fn=<SumBackward0>)
<SumBackward0 object at 0x0000019C1347E770>

Gradient¶

In [ ]:

x = torch.ones(3, 3, requires_grad=True)
print(x)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], requires_grad=True)

In [ ]:

y = x + 5
print(y)

tensor([[6., 6., 6.],
        [6., 6., 6.],
        [6., 6., 6.]], grad_fn=<AddBackward0>)

In [ ]:

z = y * y
out = z.mean()
print(z, out)

tensor([[36., 36., 36.],
        [36., 36., 36.],
        [36., 36., 36.]], grad_fn=<MulBackward0>) tensor(36., grad_fn=<MeanBackward0>)

Once the computation is finished, calling .backward() runs backpropagation automatically, and the gradients are accumulated in the .grad attribute

In [ ]:

print(out)
out.backward()

tensor(36., grad_fn=<MeanBackward0>)

grad: stores the gradient of the output with respect to this tensor, propagated back through the operations the data passed through

In [ ]:

print(x)
print(x.grad)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], requires_grad=True)
tensor([[1.3333, 1.3333, 1.3333],
        [1.3333, 1.3333, 1.3333],
        [1.3333, 1.3333, 1.3333]])
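
The value 1.3333 in x.grad can be checked by hand. The indexed symbols $y_{ij}$ and $z_{ij}$ are introduced here only for this derivation; they stand for the elements of the tensors y and z in the cells above, and every $x_{ij}$ equals 1.

```latex
y_{ij} = x_{ij} + 5, \qquad z_{ij} = y_{ij}^2, \qquad out = \frac{1}{9} \sum_{i,j} z_{ij}

\frac{\partial out}{\partial x_{ij}}
  = \frac{1}{9} \cdot 2\, y_{ij} \cdot \frac{\partial y_{ij}}{\partial x_{ij}}
  = \frac{2 (x_{ij} + 5)}{9}
  = \frac{2 \cdot 6}{9} \approx 1.3333
```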

In [ ]:

x = torch.randn(3, requires_grad=True)

y = x * 2
while y.data.norm() < 1000:
  y = y * 2
print(y)

tensor([ 924.5911, -436.0419,  840.3173], grad_fn=<MulBackward0>)

In [ ]:

v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(v)
print(x.grad)

tensor([1.0240e+02, 1.0240e+03, 1.0240e-01])

Use with torch.no_grad() to disable gradient tracking

Wrapping a code block in with torch.no_grad() stops autograd from recording history. This is useful when evaluating a model whose trainable parameters have requires_grad=True but for which no gradients are needed

In [ ]:

print(x.requires_grad)
print((x ** 2).requires_grad)

with torch.no_grad():
  print((x ** 2).requires_grad)

True
True
False

detach(): returns a new Tensor with the same content but with requires_grad=False

In [ ]:

print(x.requires_grad)
y = x.detach()
print(y.requires_grad)
print(x.eq(y).all())

True
False
tensor(True)

Automatic differentiation flow example¶

Computation flow: $a \rightarrow b \rightarrow c \rightarrow out$

$\quad \frac{\partial out}{\partial a} = ?$¶

Calling backward() walks the graph in reverse, $a \leftarrow b \leftarrow c \leftarrow out$, and stores the value of $\frac{\partial out}{\partial a}$ in a.grad

In [ ]:

a = torch.ones(2, 2)
print(a)

tensor([[1., 1.],
        [1., 1.]])

In [ ]:

a = torch.ones(2, 2, requires_grad=True)
print(a)

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

In [ ]:

print(a.data)
print(a.grad)
print(a.grad_fn)

tensor([[1., 1.],
        [1., 1.]])
None
None

$b = a + 2$

In [ ]:

b = a + 2
print(b)

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)

$c = b^2$

In [ ]:

c = b ** 2
print(c)

tensor([[9., 9.],
        [9., 9.]], grad_fn=<PowBackward0>)

In [ ]:

out = c.sum()
print(out)

tensor(36., grad_fn=<SumBackward0>)

In [ ]:

print(out)
out.backward()

tensor(36., grad_fn=<SumBackward0>)

a.grad_fn is None because a is a leaf tensor created directly by the user, not the result of an operation

In [ ]:

print(a.data)
print(a.grad)
print(a.grad_fn)

tensor([[1., 1.],
        [1., 1.]])
tensor([[6., 6.],
        [6., 6.]])
None

In [ ]:

print(b.data)
print(b.grad)
print(b.grad_fn)

tensor([[3., 3.],
        [3., 3.]])
None
<AddBackward0 object at 0x0000019C7DCD14B0>

(PyTorch emits a UserWarning here: b is not a leaf tensor, so its .grad is not populated by backward(). Call b.retain_grad() before backward() if the gradient of a non-leaf tensor is needed.)

In [ ]:

print(out.data)
print(out.grad)
print(out.grad_fn)

tensor(36.)
None
<SumBackward0 object at 0x0000019C7DCD2110>

(The same non-leaf UserWarning is emitted for out.grad.)
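
The warnings above mention retain_grad(). The following is a minimal sketch of the same a, b, c, out flow with the gradient of the non-leaf tensor b kept, assuming the same values as in the cells above.

```python
import torch

a = torch.ones(2, 2, requires_grad=True)  # leaf tensor created by the user
b = a + 2
b.retain_grad()                           # ask autograd to keep b.grad as well
c = b ** 2
out = c.sum()
out.backward()

print(a.is_leaf, b.is_leaf)
print(a.grad)  # d(out)/da = 2 * (a + 2) = 6 for every element
print(b.grad)  # d(out)/db = 2 * b = 6 for every element
```

Without the retain_grad() call, b.grad would stay None, as in the original output.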

In [ ]:

Data preparation¶

PyTorch provides Dataset and DataLoader in torch.utils.data for preparing data

Many ready-made datasets are available (MNIST, FashionMNIST, CIFAR10, ...)
- Vision Dataset: https://pytorch.org/vision/stable/datasets.html
- Text Dataset: https://pytorch.org/text/stable/datasets.html
- Audio Dataset: https://pytorch.org/audio/stable/datasets.html
Arguments such as batch_size, train, and transform, passed to DataLoader and Dataset, determine how the data is loaded
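
To show how Dataset and DataLoader fit together without downloading anything, here is a minimal sketch with a toy dataset. The class `SquaresDataset` is hypothetical and exists only for this illustration; it is not part of PyTorch or of the original notebook.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i**2)."""

    def __init__(self, n):
        self.x = torch.arange(n, dtype=torch.float32).unsqueeze(1)
        self.y = self.x ** 2

    def __len__(self):
        # DataLoader uses this to know how many samples exist.
        return len(self.x)

    def __getitem__(self, idx):
        # DataLoader calls this per index and stacks the results into a batch.
        return self.x[idx], self.y[idx]


dataset = SquaresDataset(10)
loader = DataLoader(dataset, batch_size=4, shuffle=False)

batch_shapes = [tuple(xb.shape) for xb, yb in loader]
print(len(dataset), len(loader))
print(batch_shapes)
```

A map-style Dataset only needs `__len__` and `__getitem__`; batching and shuffling are left to DataLoader.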

In [1]:

import torch
import numpy as np
from torch.utils.data import Dataset, DataLoader

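torchvision is a companion package to PyTorch that bundles datasets, pretrained models, and image transforms

transforms: preprocessing operations (https://pytorch.org/docs/stable/torchvision/transforms.html)
Preprocessing that the classes in transforms do not cover is usually done by writing a custom class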

In [2]:

import torchvision.transforms as transforms
from torchvision import datasets

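A transform can be defined ahead of time and passed to the dataset; Compose applies the preprocessing steps in list order

ToTensor() is needed because torchvision datasets return samples as PIL Images, which must be converted to Tensors before a model can process them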

In [3]:

mnist_transform = transforms.Compose([transforms.ToTensor(),
                                      transforms.Normalize(mean=(0.5,),std=(1.0,))])

In [4]:

trainset = datasets.MNIST(root='/content/',
                          train=True, download=True,
                          transform=mnist_transform)
testset = datasets.MNIST(root='/content/',
                          train=False, download=True,
                          transform=mnist_transform)

DataLoader wraps the whole dataset and, during training, serves the data in batches of batch_size samples

In [5]:

train_loader = DataLoader(trainset, batch_size=8, shuffle=True, num_workers=2)
testloader = DataLoader(testset, batch_size=8, shuffle=False, num_workers=2)

In [6]:

dataiter = iter(train_loader)
images, labels = next(dataiter)
images.shape, labels.shape

Out[6]:

(torch.Size([8, 1, 28, 28]), torch.Size([8]))

In [7]:

torch_image = torch.squeeze(images[0])
torch_image.shape

Out[7]:

torch.Size([28, 28])

In [8]:

import matplotlib.pyplot as plt
figure = plt.figure(figsize=(12, 6))
cols, rows = 4, 2
for i in range(1, cols * rows + 1):
  sample_idx = torch.randint(len(trainset), size=(1,)).item()  # pick a random sample
  img, label = trainset[sample_idx]
  figure.add_subplot(rows, cols, i)
  plt.title(label)
  plt.axis('off')
  plt.imshow(img.squeeze(), cmap='gray')
plt.show()  # show once, after all eight subplots are drawn

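Building neural networks¶

Layer: the core data structure of a neural network; takes one or more tensors as input and outputs one or more tensors
Module: made up of one or more layers
Model: made up of one or more modules

`torch.nn` package¶

Mainly used for layers whose weights and biases are created automatically inside the layer (the weights are not declared by hand)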

https://pytorch.org/docs/stable/nn.html

In [9]:

import torch
import matplotlib.pyplot as plt
import torch.nn as nn
import numpy as np

nn.Linear layer example

In [10]:

input = torch.randn(128, 20)
print(input)

m = nn.Linear(20, 30)
print(m)

output = m(input)
print(output)
print(output.size())

tensor([[ 0.1611, -0.1770,  0.9294,  ..., -1.3216,  1.1687,  0.7508],
        [ 0.0091,  0.2195, -0.1656,  ...,  1.5742, -0.1580, -0.6254],
        [-0.0739,  0.4897, -0.3660,  ...,  2.4579,  0.1372,  0.7276],
        ...,
        [-0.3231,  1.0198, -1.8128,  ..., -0.3414, -0.1527,  0.0857],
        [ 0.2109, -1.0355,  0.2443,  ...,  1.2963, -1.0275,  0.9258],
        [-2.5506,  0.8304,  1.2549,  ..., -0.7072,  1.2520, -0.8628]])
Linear(in_features=20, out_features=30, bias=True)
tensor([[ 0.9625, -0.8421, -0.2202,  ..., -0.6258, -0.4394, -0.1915],
        [-0.9019,  0.1840, -0.6796,  ...,  1.5266, -0.2464, -0.8951],
        [-0.2721, -0.2364,  0.2282,  ...,  0.3653,  0.5424, -1.0521],
        ...,
        [-0.1131, -0.7157, -0.3249,  ..., -0.2302,  0.4510, -0.3833],
        [ 0.9199, -0.7164,  0.2842,  ..., -1.1771,  0.0924,  0.3473],
        [-0.6566,  1.4463,  0.9695,  ...,  0.8163,  0.6002,  0.5698]],
       grad_fn=<AddmmBackward0>)
torch.Size([128, 30])
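
To make concrete what the automatically created weight and bias do, this sketch recomputes the output of nn.Linear by hand. The fixed seed and the tolerance are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)           # assumption: fixed seed, only for reproducibility
x = torch.randn(128, 20)
m = nn.Linear(20, 30)

# nn.Linear stores weight as (out_features, in_features) and computes x @ W^T + b.
manual = x @ m.weight.T + m.bias

print(m.weight.shape, m.bias.shape)
print(torch.allclose(m(x), manual, atol=1e-5))
```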

nn.Conv2d layer example

In [11]:

input = torch.rand(20, 16, 50, 100)
print(input.size())

torch.Size([20, 16, 50, 100])

In [12]:

m = nn.Conv2d(16, 33, 3, stride=2)
print(m)
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 2), padding=(4,2))
print(m)
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 2), padding=(4,2), dilation=(3, 1))
print(m)

Conv2d(16, 33, kernel_size=(3, 3), stride=(2, 2))
Conv2d(16, 33, kernel_size=(3, 5), stride=(2, 2), padding=(4, 2))
Conv2d(16, 33, kernel_size=(3, 5), stride=(2, 2), padding=(4, 2), dilation=(3, 1))

In [13]:

output = m(input)
print(output.size())

torch.Size([20, 33, 26, 50])
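
The output size [26, 50] follows from the output-size formula given in the nn.Conv2d documentation. The helper `conv2d_out` below is a hypothetical function written only for this check; it is not a PyTorch API.

```python
def conv2d_out(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis of a 2D convolution.

    Implements floor((size + 2*padding - dilation*(kernel - 1) - 1) / stride + 1).
    """
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Last layer from the cell above: kernel (3, 5), stride (2, 2),
# padding (4, 2), dilation (3, 1), applied to a 50 x 100 input.
h = conv2d_out(50, kernel=3, stride=2, padding=4, dilation=3)
w = conv2d_out(100, kernel=5, stride=2, padding=2, dilation=1)
print(h, w)  # 26 50
```

The same helper gives 24 for a 28-pixel axis with kernel 5 and stride 1, matching the (1, 20, 24, 24) output seen for the MNIST image later on.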

Convolution layers¶

nn.Conv2d example

in_channels: number of input channels
out_channels: number of output channels
kernel_size: size of the kernel (filter)

In [14]:

nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5, stride=1)

Out[14]:

Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))

In [15]:

layer = nn.Conv2d(1, 20, 5, 1).to(torch.device('cpu'))
layer

Out[15]:

Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))

Inspecting the weight

In [16]:

weight = layer.weight
weight.shape

Out[16]:

torch.Size([20, 1, 5, 5])

The weight requires grad, so it must be detached with detach() before it can be converted with numpy()

In [17]:

weight = weight.detach()

In [18]:

weight = weight.numpy()
weight.shape

Out[18]:

(20, 1, 5, 5)

In [19]:

plt.imshow(weight[0, 0, :, :], 'jet')
plt.colorbar()
plt.show()

In [20]:

print(images.shape)
print(images[0].size())

input_image = torch.squeeze(images[0])
print(input_image.size())

torch.Size([8, 1, 28, 28])
torch.Size([1, 28, 28])
torch.Size([28, 28])

In [21]:

input_data = torch.unsqueeze(images[0], dim=0)
print(input_data.size())

output_data = layer(input_data)
output = output_data.detach()  # detach from the graph (preferred over .data) before converting
output_arr = output.numpy()
output_arr.shape

torch.Size([1, 1, 28, 28])

Out[21]:

(1, 20, 24, 24)

In [22]:

plt.figure(figsize=(15, 30))

plt.subplot(131)
plt.title("input")
plt.imshow(input_image, 'gray')

plt.subplot(132)
plt.title("Weight")
plt.imshow(weight[0, 0, :, :], 'jet')

plt.subplot(133)
plt.title("Output")
plt.imshow(output_arr[0, 0, :, :], 'gray')
plt.show()

Pooling layers¶

F.max_pool2d
- stride
- kernel_size
torch.nn.MaxPool2d is also widely used

In [23]:

import torch.nn.functional as F

pool = F.max_pool2d(output, 2, 2)
pool.shape

Out[23]:

torch.Size([1, 20, 12, 12])

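A max-pooling layer has no weights, and its input here was already separated from the graph, so the result can be converted with numpy() directly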

In [24]:

pool_arr = pool.numpy()
pool_arr.shape

Out[24]:

(1, 20, 12, 12)

In [25]:

plt.figure(figsize=(10, 15))

plt.subplot(121)
plt.title("Input")
plt.imshow(input_image, 'gray')

plt.subplot(122)
plt.title("output")
plt.imshow(pool_arr[0, 0, :, :], 'gray')
plt.show()

Linear layers¶

nn.Linear acts on the last dimension only, so the 2D image must first be flattened into a 1D feature vector with .view()

In [26]:

flatten = input_image.view(1, 28 * 28)
flatten.shape

Out[26]:

torch.Size([1, 784])

In [27]:

lin = nn.Linear(784, 10)(flatten)
lin.shape

Out[27]:

torch.Size([1, 10])

In [28]:

lin

Out[28]:

tensor([[-0.5786,  0.0164, -0.2729,  0.3160,  0.2966, -0.1504, -0.1433,  0.4315,
          0.3447, -0.4311]], grad_fn=<AddmmBackward0>)

In [29]:

plt.imshow(lin.detach().numpy(), 'jet')
plt.colorbar()
plt.show()

Non-linear activations¶

Activation functions such as F.softmax

In [30]:

with torch.no_grad():
  flatten = input_image.view(1, 28 * 28)
  lin = nn.Linear(784, 10)(flatten)
  softmax = F.softmax(lin, dim=1)

softmax

Out[30]:

tensor([[0.0863, 0.0940, 0.0719, 0.1549, 0.0652, 0.0969, 0.0874, 0.0771, 0.1595,
         0.1067]])

In [31]:

np.sum(softmax.numpy())

Out[31]:

1.0

F.relu

Applies the ReLU function
Also available as the nn.ReLU layer

In [33]:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
inputs = torch.randn(4, 3, 28, 28).to(device)
inputs.shape

Out[33]:

torch.Size([4, 3, 28, 28])

In [34]:

layer = nn.Conv2d(3, 20, 5, 1).to(device)
output = F.relu(layer(inputs))
output.shape

Out[34]:

torch.Size([4, 20, 24, 24])

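Types of neural networks¶

Defining a model¶

Defining a class that inherits from `nn.Module`¶

Define a class that inherits from nn.Module
__init__(): defines the modules, activation functions, and so on that the model uses
forward(): defines the computation the model performs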

In [35]:

class Model(nn.Module):
  def __init__(self, inputs):
    super(Model, self).__init__()
    self.layer = nn.Linear(inputs, 1)
    self.activation = nn.Sigmoid()

  def forward(self, x):
    x = self.layer(x)
    x = self.activation(x)
    return x

In [36]:

model = Model(1)
print(list(model.children()))
print(list(model.modules()))

[Linear(in_features=1, out_features=1, bias=True), Sigmoid()]
[Model(
  (layer): Linear(in_features=1, out_features=1, bias=True)
  (activation): Sigmoid()
), Linear(in_features=1, out_features=1, bias=True), Sigmoid()]

Defining a network with `nn.Sequential`¶

An nn.Sequential object runs the modules it contains in order
The network blocks used in __init__() can be defined with nn.Sequential
This keeps the computation in forward() readable

In [37]:

class Model(nn.Module):
  def __init__(self):
    super(Model, self).__init__()
    self.layer1 = nn.Sequential(
        nn.Conv2d(in_channels=3, out_channels=64, kernel_size=5),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2)
    )

    self.layer2 = nn.Sequential(
        nn.Conv2d(in_channels=64, out_channels=30, kernel_size=5),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2)
    )

    self.layer3 = nn.Sequential(
        nn.Linear(in_features=30*5*5, out_features=10, bias=True),
        nn.ReLU(inplace=True)
    )

  def forward(self, x):
    x = self.layer1(x)
    x = self.layer2(x)
    x = x.view(x.shape[0], -1)  # flatten to (batch, 30*5*5) before the linear layer
    x = self.layer3(x)
    return x

In [38]:

model = Model()
print(list(model.children()))
print(list(model.modules()))

[Sequential(
  (0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
  (1): ReLU(inplace=True)
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
), Sequential(
  (0): Conv2d(64, 30, kernel_size=(5, 5), stride=(1, 1))
  (1): ReLU(inplace=True)
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
), Sequential(
  (0): Linear(in_features=750, out_features=10, bias=True)
  (1): ReLU(inplace=True)
)]
[Model(
  (layer1): Sequential(
    (0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer2): Sequential(
    (0): Conv2d(64, 30, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer3): Sequential(
    (0): Linear(in_features=750, out_features=10, bias=True)
    (1): ReLU(inplace=True)
  )
), Sequential(
  (0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
  (1): ReLU(inplace=True)
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
), Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1)), ReLU(inplace=True), MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False), Sequential(
  (0): Conv2d(64, 30, kernel_size=(5, 5), stride=(1, 1))
  (1): ReLU(inplace=True)
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
), Conv2d(64, 30, kernel_size=(5, 5), stride=(1, 1)), ReLU(inplace=True), MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False), Sequential(
  (0): Linear(in_features=750, out_features=10, bias=True)
  (1): ReLU(inplace=True)
), Linear(in_features=750, out_features=10, bias=True), ReLU(inplace=True)]
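
The two listings above differ because `children()` yields only the immediate submodules, while `modules()` walks the whole tree and includes the model itself. A minimal sketch of that difference, using a hypothetical two-block container (`TinyNet` and its layer sizes are assumptions, not part of the original model):

```python
import torch.nn as nn

# Hypothetical stand-in for the model above: two nested Sequential blocks.
class TinyNet(nn.Module):
  def __init__(self):
    super().__init__()
    self.block1 = nn.Sequential(nn.Linear(4, 8), nn.ReLU())
    self.block2 = nn.Sequential(nn.Linear(8, 2))

tiny = TinyNet()

# children(): only the direct submodules (block1, block2)
n_children = len(list(tiny.children()))

# modules(): the model itself, both blocks, and every layer inside them
n_modules = len(list(tiny.modules()))

print(n_children, n_modules)
```

Here `modules()` counts `TinyNet`, the two `Sequential` blocks, two `Linear` layers and one `ReLU`, which is why its list is so much longer than the one from `children()`.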

PyTorch pretrained models¶

https://pytorch.org/vision/stable/models.html

Model parameters¶
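
As a rough illustration of what inspecting model parameters can look like, the sketch below lists the parameter shapes of a single layer and counts its trainable values. The layer and the variable names are assumptions chosen for the example.

```python
import torch.nn as nn

# Hypothetical small layer, used only to illustrate parameter inspection.
layer = nn.Linear(in_features=3, out_features=2, bias=True)

# named_parameters() yields (name, tensor) pairs: 'weight' (2x3) and 'bias' (2)
shapes = {name: tuple(p.shape) for name, p in layer.named_parameters()}

# Total number of trainable scalars: 2*3 + 2 = 8
n_trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)

print(shapes, n_trainable)
```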

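Loss function¶

Measures the error between predicted and actual values
Indicates how well training is progressing
The value minimized while the model trains; it serves as the measure of success for the given problem
Learnable parameters are adjusted according to the loss value
In optimization theory, this is the function to be minimized
A differentiable function is used
Main loss functions in PyTorch
- torch.nn.BCELoss: used for binary classification
- torch.nn.CrossEntropyLoss: used for multi-class classification
- torch.nn.MSELoss: used for regression models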

In [39]:

criterion = nn.MSELoss()            # for regression
criterion = nn.CrossEntropyLoss()   # for multi-class classification (replaces the line above)
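
To make the three loss functions listed above more concrete, here is a small sketch that evaluates each on hand-picked values whose result can be checked by hand. The tensors are made-up inputs, not data from this post.

```python
import torch
import torch.nn as nn

# Regression: mean of squared differences
pred = torch.tensor([2.0, 4.0])
true = torch.tensor([1.0, 2.0])
mse = nn.MSELoss()(pred, true)              # (1^2 + 2^2) / 2 = 2.5

# Binary classification: BCELoss expects probabilities in [0, 1]
prob = torch.tensor([0.5, 0.5])
label = torch.tensor([1.0, 0.0])
bce = nn.BCELoss()(prob, label)             # -ln(0.5), about 0.6931

# Multi-class: CrossEntropyLoss expects raw logits and integer class indices
logits = torch.zeros(1, 4)                  # equal scores for 4 classes
target = torch.tensor([2])
ce = nn.CrossEntropyLoss()(logits, target)  # ln(4), about 1.3863

print(mse.item(), bce.item(), ce.item())
```

Note that `CrossEntropyLoss` applies log-softmax internally, so the model should output raw scores rather than probabilities.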

Optimizer¶

Determines how the model should be updated based on the loss function (implements a particular variant of stochastic gradient descent)
Calling step() on the optimizer updates the model parameters that were passed to it
All optimizers are built on the base class torch.optim.Optimizer(params, defaults)
zero_grad() resets the gradients of the parameters managed by the optimizer
torch.optim.lr_scheduler adjusts the learning rate as the epochs progress
Main optimizers in PyTorch: optim.Adadelta, optim.Adagrad, optim.Adam, optim.RMSprop, optim.SGD
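
The `zero_grad()` / `backward()` / `step()` cycle described above can be sketched on a single scalar parameter. The objective `(w - 3)^2` and the learning rate are assumptions picked so the arithmetic is easy to follow.

```python
import torch
import torch.optim as optim

# A single learnable scalar; minimize f(w) = (w - 3)^2
w = torch.tensor([0.0], requires_grad=True)
optimizer = optim.SGD([w], lr=0.1)

optimizer.zero_grad()            # clear gradients left over from earlier steps
loss = ((w - 3.0) ** 2).sum()
loss.backward()                  # d(loss)/dw = 2 * (0 - 3) = -6
grad_before_step = w.grad.item()
optimizer.step()                 # w <- w - lr * grad = 0 - 0.1 * (-6) = 0.6

print(grad_before_step, w.item())
```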

In [39]:

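Learning rate scheduler¶

Adjusts the learning rate during training according to a given condition so that optimization proceeds well
For example, decaying the learning rate after a set number of epochs, or lowering it as training nears the global minimum
Learning rate schedulers in PyTorch
- optim.lr_scheduler.LambdaLR: multiplies the initial learning rate by the value returned by a user-supplied lambda function
- optim.lr_scheduler.StepLR: multiplies the learning rate by gamma every step_size epochs
- optim.lr_scheduler.MultiStepLR: similar to StepLR, but multiplies by gamma only at the specified epochs (milestones) rather than at a fixed interval
- optim.lr_scheduler.ExponentialLR: multiplies the previous learning rate by gamma every epoch
- optim.lr_scheduler.CosineAnnealingLR: varies the learning rate along a cosine curve, so it can increase as well as decrease
- optim.lr_scheduler.ReduceLROnPlateau: changes the learning rate dynamically, lowering it when the monitored metric stops improving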
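
As one example of the schedulers listed above, the following sketch records the learning rate that `StepLR` produces over six epochs. The dummy parameter, `step_size=2` and `gamma=0.5` are assumptions for illustration only.

```python
import torch
import torch.optim as optim

param = torch.zeros(1, requires_grad=True)   # dummy parameter
optimizer = optim.SGD([param], lr=0.1)
# Every 2 epochs, multiply the learning rate by gamma=0.5
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

lrs = []
for epoch in range(6):
  lrs.append(optimizer.param_groups[0]['lr'])
  optimizer.step()      # the training step for this epoch would go here
  scheduler.step()      # advance the schedule once per epoch

print(lrs)
```

The learning rate stays at 0.1 for two epochs, halves to 0.05 for the next two, then halves again to 0.025. Calling `scheduler.step()` after `optimizer.step()` is the order PyTorch expects.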

Metrics¶

Monitor the training and testing stages of the model

In [40]:

!pip install torchmetrics

In [41]:

import torchmetrics

preds = torch.rand(10, 5).softmax(dim=-1)
target = torch.randint(5, (10, ))
print(preds, target)

acc = torchmetrics.functional.accuracy(preds, target, task="multiclass", num_classes=5)
print(acc)

tensor([[0.1885, 0.2010, 0.1557, 0.1867, 0.2681],
        [0.1426, 0.1439, 0.2550, 0.3109, 0.1475],
        [0.1143, 0.1913, 0.2336, 0.2436, 0.2172],
        [0.1712, 0.1424, 0.1882, 0.2578, 0.2404],
        [0.1272, 0.2463, 0.2691, 0.1844, 0.1730],
        [0.1754, 0.1645, 0.1275, 0.2694, 0.2633],
        [0.1431, 0.2439, 0.2712, 0.1571, 0.1847],
        [0.2170, 0.2167, 0.1969, 0.2300, 0.1394],
        [0.2083, 0.2762, 0.1354, 0.2309, 0.1492],
        [0.2302, 0.2255, 0.2538, 0.1488, 0.1417]]) tensor([3, 0, 1, 4, 2, 3, 3, 2, 0, 2])
tensor(0.3000)

In [42]:

metric = torchmetrics.Accuracy(task="multiclass", num_classes=5)

n_batches = 10
for i in range(n_batches):
  preds = torch.rand(10, 5).softmax(dim=1)
  target = torch.randint(5, (10, ))

  acc = metric(preds, target)  # batch accuracy; the metric object also accumulates state across batches
  print(acc)

tensor(0.1000)
tensor(0.1000)
tensor(0.1000)
tensor(0.1000)
tensor(0.2000)
tensor(0.1000)
tensor(0.3000)
tensor(0.1000)
tensor(0.4000)
tensor(0.3000)
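
Per-batch accuracies like those printed above are usually combined into one figure for the whole run. The sketch below does this by hand in plain PyTorch, keeping running counts of correct predictions across batches. It only illustrates the idea of accumulating state and makes no claim about how torchmetrics works internally; the fixed seed and variable names are assumptions.

```python
import torch

torch.manual_seed(0)       # fixed seed so reruns give the same numbers

correct, total = 0, 0
batch_accs = []

for _ in range(10):
  preds = torch.rand(10, 5).softmax(dim=1)   # fake class probabilities
  target = torch.randint(5, (10,))           # fake labels

  hits = (preds.argmax(dim=1) == target).sum().item()
  batch_accs.append(hits / target.size(0))   # accuracy of this batch only

  correct += hits                            # running totals over all batches
  total += target.size(0)

overall_acc = correct / total
print(batch_accs, overall_acc)
```

Because every batch here has the same size, the overall accuracy equals the mean of the batch accuracies. With unequal batch sizes only the count-based figure is correct.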

Linear Regression Model¶

Generating the data¶

In [43]:

X = torch.randn(200, 1) * 10
y = X + 3 * torch.randn(200, 1)
plt.scatter(X.numpy(), y.numpy())
plt.ylabel('y')
plt.xlabel('x')
plt.grid()
plt.show()

Model definition and parameters¶

In [44]:

class LinearRegressionModel(nn.Module):
  def __init__(self):
    super(LinearRegressionModel, self).__init__()
    self.linear = nn.Linear(1, 1)

  def forward(self, x):
    pred = self.linear(x)
    return pred

In [45]:

model = LinearRegressionModel()
print(model)
print(list(model.parameters()))

LinearRegressionModel(
  (linear): Linear(in_features=1, out_features=1, bias=True)
)
[Parameter containing:
tensor([[0.7913]], requires_grad=True), Parameter containing:
tensor([-0.3978], requires_grad=True)]

In [46]:

w, b = model.parameters()

w1, b1 = w[0][0].item(), b[0].item()
x1 = np.array([-30, 30])
y1 = w1 * x1 + b1

plt.plot(x1, y1, 'r')
plt.scatter(X, y)
plt.grid()
plt.show()

Loss function and optimizer¶

In [47]:

import torch.optim as optim
import matplotlib.pyplot as plt
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.001)

Training the model¶

In [48]:

epochs = 100
losses = []

for epoch in range(epochs):
  optimizer.zero_grad()

  y_pred = model(X)
  loss = criterion(y_pred, y)
  losses.append(loss.item())
  loss.backward()

  optimizer.step()

In [49]:

plt.plot(range(epochs), losses)
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.show()

In [50]:

w1, b1 = w[0][0].item(), b[0].item()
x1 = np.array([-30, 30])
y1 = w1 * x1 + b1

plt.plot(x1, y1, 'r')
plt.scatter(X, y)
plt.grid()
plt.show()

FashionMNIST classification model¶

GPU setup

In [51]:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

Out[51]:

device(type='cuda')
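
Creating a `device` object does not move anything by itself: modules and tensors stay where they were created (the CPU by default) until `.to(device)` is called on them. A minimal sketch, assuming a hypothetical small layer and a random batch:

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

layer = nn.Linear(4, 2).to(device)       # moves the layer's parameters
batch = torch.randn(8, 4).to(device)     # inputs must live on the same device

out = layer(batch)
print(out.device, out.shape)
```

In a training loop the same pattern would apply to the model once and to each batch of inputs and labels.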

Loading the data¶

In [52]:

transform = transforms.Compose(([transforms.ToTensor(),
                                transforms.Normalize((0.5, ), (0.5, ))]))
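
`ToTensor()` scales pixel values to the range [0, 1], and `Normalize((0.5, ), (0.5, ))` then computes `(x - 0.5) / 0.5` per channel, which maps them to [-1, 1]. The arithmetic can be sketched in plain PyTorch on a few made-up pixel values:

```python
import torch

# A few example values in the [0, 1] range that ToTensor() produces
pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])

mean, std = 0.5, 0.5
normalized = (pixels - mean) / std      # same formula Normalize applies per channel

# Inverse, used when displaying images: x = normalized * std + mean
restored = normalized * std + mean

print(normalized, restored)
```

The inverse is the same `image / 2 + 0.5` step used by the `imshow` helper in the testing section of this post.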

In [53]:

trainset = datasets.FashionMNIST(root='/content/',
                                 train=True, download=True,
                                 transform=transform)
testset = datasets.FashionMNIST(root='/content/',
                                 train=False, download=True,
                                 transform=transform)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to /content/FashionMNIST/raw/train-images-idx3-ubyte.gz

100%|██████████| 26421880/26421880 [00:02<00:00, 12384588.47it/s]

Extracting /content/FashionMNIST/raw/train-images-idx3-ubyte.gz to /content/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to /content/FashionMNIST/raw/train-labels-idx1-ubyte.gz

100%|██████████| 29515/29515 [00:00<00:00, 201294.13it/s]

Extracting /content/FashionMNIST/raw/train-labels-idx1-ubyte.gz to /content/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to /content/FashionMNIST/raw/t10k-images-idx3-ubyte.gz

100%|██████████| 4422102/4422102 [00:01<00:00, 3720955.07it/s]

Extracting /content/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to /content/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to /content/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz

100%|██████████| 5148/5148 [00:00<00:00, 19313306.79it/s]

Extracting /content/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to /content/FashionMNIST/raw

In [54]:

train_loader = DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)
test_loader = DataLoader(testset, batch_size=128, shuffle=False, num_workers=2)

In [55]:

images, labels = next(iter(train_loader))
images.shape, labels.shape

Out[55]:

(torch.Size([128, 1, 28, 28]), torch.Size([128]))

In [56]:

labels_map = {
    0: 'T-Shirt',
    1: 'Trouser',
    2: 'Pullover',
    3: 'Dress',
    4: 'Coat',
    5: 'Sandal',
    6: 'Shirt',
    7: 'Sneaker',
    8: 'Bag',
    9: 'Ankle Boot'
}

figure = plt.figure(figsize=(12, 12))
cols, rows = 4, 4
for i in range(1, cols * rows + 1):
  image = images[i].squeeze()
  label_idx = labels[i].item()
  label = labels_map[label_idx]

  figure.add_subplot(rows, cols, i)
  plt.title(label)
  plt.axis('off')
  plt.imshow(image, cmap='gray')

plt.show()

Model definition and parameters¶

In [57]:

class NeuralNet(nn.Module):
  def __init__(self):
    super(NeuralNet, self).__init__()

    self.conv1 = nn.Conv2d(1, 6, 3)
    self.conv2 = nn.Conv2d(6, 16, 3)
    self.fc1 = nn.Linear(16 * 5 * 5, 120)
    self.fc2 = nn.Linear(120, 84)
    self.fc3 = nn.Linear(84, 10)

  def forward(self, x):
    x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
    x = F.max_pool2d(F.relu(self.conv2(x)), 2)
    x = x.view(-1, self.num_flat_features(x))
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return x

  def num_flat_features(self, x):
    size = x.size()[1:]
    num_features = 1
    for s in size:
      num_features *= s

    return num_features

net = NeuralNet()
print(net)

NeuralNet(
  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
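
The `16 * 5 * 5` input size of `fc1` follows from how each layer shrinks a 28x28 image. The sketch below works through that arithmetic with the standard output-size formula and cross-checks it against real layers; `conv_out` is a helper name introduced here for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_out(size, kernel, stride=1, padding=0):
  # Output length along one spatial axis of a convolution or pooling window
  return (size + 2 * padding - kernel) // stride + 1

size = 28
size = conv_out(size, kernel=3)             # conv1: 28 -> 26
size = conv_out(size, kernel=2, stride=2)   # max pool: 26 -> 13
size = conv_out(size, kernel=3)             # conv2: 13 -> 11
size = conv_out(size, kernel=2, stride=2)   # max pool: 11 -> 5
flat = 16 * size * size                     # 16 channels * 5 * 5 = 400

# Cross-check with real layers on a dummy 28x28 input
x = torch.randn(1, 1, 28, 28)
x = F.max_pool2d(F.relu(nn.Conv2d(1, 6, 3)(x)), 2)
x = F.max_pool2d(F.relu(nn.Conv2d(6, 16, 3)(x)), 2)

print(size, flat, tuple(x.shape))
```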

In [58]:

params = list(net.parameters())
print(len(params))
print(params[0].size())

10
torch.Size([6, 1, 3, 3])

In [59]:

input = torch.randn(1, 1, 28, 28)
out = net(input)
print(out)

tensor([[-0.0871,  0.1913, -0.0433, -0.0459,  0.0379,  0.0232,  0.0585, -0.0060,
         -0.0409,  0.0344]], grad_fn=<AddmmBackward0>)

Loss function and optimizer¶

In [60]:

import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

Training the model¶

Check the number of batches

In [61]:

total_batch = len(train_loader)
print(total_batch)

In [63]:

for epoch in range(10):

  running_loss = 0.0

  for i, data in enumerate(train_loader, 0):
    inputs, labels = data

    optimizer.zero_grad()

    outputs = net(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

    running_loss += loss.item()

    if i % 100 == 99:
      # Note: running_loss sums 100 batches, so dividing by 2000 prints 1/20 of the mean batch loss
      print('Epoch : {}, Iter: {}, Loss: {}'.format(epoch+1, i+1, running_loss/2000))
      running_loss = 0.0

Epoch : 1, Iter: 100, Loss: 0.11502504658699035
Epoch : 1, Iter: 200, Loss: 0.11446980822086335
Epoch : 1, Iter: 300, Loss: 0.11365943157672882
Epoch : 1, Iter: 400, Loss: 0.11183640277385712
Epoch : 2, Iter: 100, Loss: 0.09142237794399262
Epoch : 2, Iter: 200, Loss: 0.05995461705327034
Epoch : 2, Iter: 300, Loss: 0.04482015699148178
Epoch : 2, Iter: 400, Loss: 0.040373206377029416
Epoch : 3, Iter: 100, Loss: 0.03683351635932922
Epoch : 3, Iter: 200, Loss: 0.03587702712416649
Epoch : 3, Iter: 300, Loss: 0.03434920717775822
Epoch : 3, Iter: 400, Loss: 0.0334646110534668
Epoch : 4, Iter: 100, Loss: 0.032616506576538085
Epoch : 4, Iter: 200, Loss: 0.03125044773519039
Epoch : 4, Iter: 300, Loss: 0.032371528938412664
Epoch : 4, Iter: 400, Loss: 0.030714029729366304
Epoch : 5, Iter: 100, Loss: 0.03034444074332714
Epoch : 5, Iter: 200, Loss: 0.029525935858488082
Epoch : 5, Iter: 300, Loss: 0.02892155006527901
Epoch : 5, Iter: 400, Loss: 0.028165019646286964
Epoch : 6, Iter: 100, Loss: 0.028564947932958603
Epoch : 6, Iter: 200, Loss: 0.027898967817425728
Epoch : 6, Iter: 300, Loss: 0.02741909073293209
Epoch : 6, Iter: 400, Loss: 0.0271149540245533
Epoch : 7, Iter: 100, Loss: 0.026726744800806047
Epoch : 7, Iter: 200, Loss: 0.026773720502853395
Epoch : 7, Iter: 300, Loss: 0.026580355644226075
Epoch : 7, Iter: 400, Loss: 0.02655362620949745
Epoch : 8, Iter: 100, Loss: 0.026225275576114655
Epoch : 8, Iter: 200, Loss: 0.025761234283447267
Epoch : 8, Iter: 300, Loss: 0.024994379952549935
Epoch : 8, Iter: 400, Loss: 0.024353671863675118
Epoch : 9, Iter: 100, Loss: 0.02474271248281002
Epoch : 9, Iter: 200, Loss: 0.02464385850727558
Epoch : 9, Iter: 300, Loss: 0.02411820262670517
Epoch : 9, Iter: 400, Loss: 0.024146224185824395
Epoch : 10, Iter: 100, Loss: 0.02376581420004368
Epoch : 10, Iter: 200, Loss: 0.023817880272865296
Epoch : 10, Iter: 300, Loss: 0.023328649133443832
Epoch : 10, Iter: 400, Loss: 0.02291729202866554

Saving and loading the model¶

torch.save: saves net.state_dict() to a file
torch.load: reads the saved state dict, which load_state_dict then restores into the model

In [64]:

PATH = './fashion_mnist.pth'
torch.save(net.state_dict(),PATH)

In [65]:

net = NeuralNet()
net.load_state_dict(torch.load(PATH))

Out[65]:

<All keys matched successfully>
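
The save and load steps above can be checked with a small round trip. This sketch uses a hypothetical `nn.Linear` stand-in and a temporary file (both assumptions) and adds `map_location='cpu'`, which is one way to load a checkpoint saved on a GPU onto a CPU-only machine.

```python
import os
import tempfile

import torch
import torch.nn as nn

model_a = nn.Linear(3, 1)                 # hypothetical stand-in for the trained net
path = os.path.join(tempfile.mkdtemp(), 'demo.pth')   # hypothetical file name

torch.save(model_a.state_dict(), path)    # stores only the tensors, not the class

model_b = nn.Linear(3, 1)                 # must be built with the same architecture
# map_location='cpu' remaps any GPU tensors in the file onto the CPU
state = torch.load(path, map_location='cpu')
model_b.load_state_dict(state)

same = all(torch.equal(p, q)
           for p, q in zip(model_a.parameters(), model_b.parameters()))
print(same)
```

Because only the state dict is stored, the loading side has to construct the model class itself before calling `load_state_dict`.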

In [66]:

net.parameters

Out[66]:

torch.nn.modules.module.Module.parameters
def parameters(recurse: bool=True) -> Iterator[Parameter]

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py
Return an iterator over module parameters.

This is typically passed to an optimizer.

Args:
    recurse (bool): if True, then yields parameters of this module
        and all submodules. Otherwise, yields only parameters that
        are direct members of this module.

Yields:
    Parameter: module parameter

Example::

    >>> # xdoctest: +SKIP("undefined vars")
    >>> for param in model.parameters():
    >>>     print(type(param), param.size())
    <class 'torch.Tensor'> (20L,)
    <class 'torch.Tensor'> (20L, 1L, 5L, 5L)

Testing the model¶

In [67]:

def imshow(image):
  image = image / 2 + 0.5
  npimg = image.numpy()

  fig = plt.figure(figsize=(16, 8))
  plt.imshow(np.transpose(npimg, (1, 2, 0)))
  plt.show()

In [69]:

import torchvision

dataiter = iter(test_loader)
images, labels = next(dataiter)

imshow(torchvision.utils.make_grid(images[:6]))

In [70]:

outputs = net(images)

_, predicted = torch.max(outputs, 1)
print(predicted)

tensor([9, 2, 1, 1, 6, 1, 2, 6, 5, 7, 4, 5, 5, 3, 4, 1, 2, 6, 8, 0, 2, 7, 7, 5,
        1, 2, 6, 0, 9, 4, 8, 8, 3, 3, 8, 0, 7, 5, 7, 9, 0, 1, 0, 9, 6, 7, 2, 1,
        2, 6, 6, 2, 5, 8, 4, 2, 8, 6, 8, 0, 7, 7, 8, 5, 1, 1, 0, 4, 7, 8, 7, 0,
        2, 6, 4, 3, 1, 2, 8, 4, 1, 8, 5, 9, 5, 0, 3, 2, 0, 2, 5, 3, 6, 7, 1, 8,
        0, 1, 4, 2, 3, 4, 7, 6, 7, 8, 5, 9, 9, 4, 2, 5, 7, 0, 5, 2, 8, 4, 7, 8,
        0, 0, 9, 9, 3, 0, 8, 4])

In [71]:

print(''.join('{}. '.format(labels_map[int(predicted[j].numpy())]) for j in range(6)))

Ankle Boot. Pullover. Trouser. Trouser. Shirt. Trouser.

In [72]:

correct = 0
total = 0

with torch.no_grad():
  for data in test_loader:
    images, labels = data
    outputs = net(images)
    _, predicted = torch.max(outputs.data, 1)
    total += labels.size(0)
    correct += (predicted == labels).sum().item()

print(100 * correct / total)

81.81
