Today we'll take a look at RNNs using PyTorch.

 

https://www.youtube.com/watch?v=bPRfnlG6dtU&t=2674s

If you're not familiar with the basic structure of an RNN, I recommend watching the video linked above.

 

Let's check the RNN entry in the PyTorch documentation.

https://pytorch.org/docs/stable/nn.html

 

RNN with Pytorch

 

RNN with Book

 

 

1. RNN (default)

 

The input to an RNN has the shape [sequence, batch_size, input_size].

 

import torch
import torch.nn as nn


input = torch.randn(4, 7, 5)
print(input.size())
# Result
# torch.Size([4, 7, 5])

This creates random input data with sequence = 4, batch_size = 7, and input_size = 5.
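To make the three axes concrete, here is a small sketch of how such a tensor can be sliced. The variable names x, step0, and sample0 are my own, chosen only for illustration.

```python
import torch

# Same layout as above: [sequence, batch_size, input_size]
x = torch.randn(4, 7, 5)

# One time step: every sample in the batch at t = 0
step0 = x[0]          # shape [batch_size, input_size]

# One sample: the whole sequence for batch element 0
sample0 = x[:, 0]     # shape [sequence, input_size]

print(step0.shape)    # torch.Size([7, 5])
print(sample0.shape)  # torch.Size([4, 5])
```

Indexing the first axis picks a time step, and indexing the second axis picks a sample.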

 

2. nn.RNN

 

The basic arguments of nn.RNN are input_size, hidden_size, and num_layers; here we pass input_size=5, hidden_size=4, and num_layers=3.

rnn_layer = nn.RNN(input_size=5, hidden_size=4, num_layers=3)
print(rnn_layer)

# Result
# RNN(5, 4, num_layers=3)

input_size must match the input_size of the input tensor, so set it to 5.

 

hidden_size is the size of the output that comes out of nn.RNN; it corresponds to the orange box in the figure above.

 

Since hidden_size = 4, the last dimension of the output is also 4.

 

num_layers is easiest to think of as the number of stacked layers. With num_layers = 3, three layers are created.
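One way to see what input_size, hidden_size, and num_layers do is to look at the weights that nn.RNN creates. This is a minimal sketch; the names such as weight_ih_l0 are the ones PyTorch assigns to the parameters of nn.RNN.

```python
import torch.nn as nn

rnn_layer = nn.RNN(input_size=5, hidden_size=4, num_layers=3)

# Each layer k has an input-to-hidden and a hidden-to-hidden weight matrix.
for name, param in rnn_layer.named_parameters():
    if name.startswith("weight"):
        print(name, tuple(param.shape))

# Layer 0 maps input_size (5) to hidden_size (4);
# layers 1 and 2 take the previous layer's 4-dim output instead.
# weight_ih_l0 (4, 5)
# weight_hh_l0 (4, 4)
# weight_ih_l1 (4, 4)
# weight_hh_l1 (4, 4)
# weight_ih_l2 (4, 4)
# weight_hh_l2 (4, 4)
```

Only the first layer ever sees input_size; every later layer works purely in hidden_size dimensions.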

 

(output, hidden) = rnn_layer(input)

print("Output size : {}".format(output.size()))
print("Hidden size : {}".format(hidden.size()))

# Result
# Output size : torch.Size([4, 7, 4])
# Hidden size : torch.Size([3, 7, 4])

 

 

RNN Structure

If you have ever searched for RNNs, you have probably seen the figure above many times.

 

When I first studied RNNs I didn't understand what the figure meant and paid little attention to it, but it is actually quite important.

 

Calling nn.RNN returns two values, output and hidden, as in the code above.

 

The reason two values are returned is easy to understand from the figure.

RNN return output, hidden

When the input X0 enters box A, the output is h0, while the hidden state is passed on to box A at the next sequence step. That is why both output and hidden are returned.
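The role of hidden as the value handed to the next step can be checked with a small experiment: process the sequence in two chunks and feed the hidden state from the first chunk into the second. This is a sketch under my own setup (the split at step 2 and the variable names are arbitrary), not something taken from the figure.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn_layer = nn.RNN(input_size=5, hidden_size=4, num_layers=3)
x = torch.randn(4, 7, 5)  # [sequence, batch_size, input_size]

# Whole sequence in one call
full_output, full_hidden = rnn_layer(x)

# Same sequence in two chunks, carrying the hidden state across
out_a, hidden_a = rnn_layer(x[:2])
out_b, hidden_b = rnn_layer(x[2:], hidden_a)
chunked_output = torch.cat([out_a, out_b], dim=0)

print(torch.allclose(full_output, chunked_output, atol=1e-5))
print(torch.allclose(full_hidden, hidden_b, atol=1e-5))
```

Both comparisons should agree up to floating-point tolerance, which suggests that hidden carries everything the next step needs from the earlier steps.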

 

Output size : torch.Size([4, 7, 4])

Hidden size : torch.Size([3, 7, 4])

 

Printing the sizes gives shapes that may look odd at first glance.

 

output : [sequence, batch_size, hidden_size] 

hidden : [num_layers, batch_size, hidden_size]

 

The returned shapes follow the rules above.

 

batch_size = 7

sequence = 4

input_size = 5

hidden_size = 4

num_layers = 3

 

Checking these one by one against the arguments used above, the values match.
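The same rule can be checked for other sizes. The sketch below tries a few arbitrary configurations (the numbers are made up for illustration) and asserts that the returned shapes follow [sequence, batch_size, hidden_size] for output and [num_layers, batch_size, hidden_size] for hidden.

```python
import torch
import torch.nn as nn

# (sequence, batch_size, input_size, hidden_size, num_layers)
configs = [(4, 7, 5, 4, 3), (10, 2, 8, 16, 1), (1, 3, 6, 2, 2)]

for seq, batch, in_size, hid_size, layers in configs:
    rnn_layer = nn.RNN(input_size=in_size, hidden_size=hid_size, num_layers=layers)
    output, hidden = rnn_layer(torch.randn(seq, batch, in_size))
    # output keeps one vector per time step, hidden one per layer
    assert tuple(output.shape) == (seq, batch, hid_size)
    assert tuple(hidden.shape) == (layers, batch, hid_size)

print("all shapes match")
```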

 

Let's look a little more closely.

 

output[-1], hidden[-1]

What output[-1] means:

In output : [sequence, batch_size, hidden_size], it is the last element along the sequence axis, that is, the final time step. (purple box)

What hidden[-1] means:

In hidden : [num_layers, batch_size, hidden_size], it is the last element along the num_layers axis, that is, the final layer. (gray box)

 

In the end, output[-1] and hidden[-1] both refer to hT(k), the hidden state of the last layer at the final time step, so they hold the same values. Let's print them.

print(output[-1])
print(hidden[-1])

# Result
tensor([[ 0.8153, -0.1394,  0.3829,  0.0173],
        [ 0.8094,  0.2289,  0.4830, -0.2243],
        [ 0.7936, -0.0426,  0.3890, -0.0643],
        [ 0.7714,  0.3094,  0.4685, -0.2558],
        [ 0.8282,  0.1141,  0.4310, -0.1885],
        [ 0.8027, -0.0607,  0.3745, -0.0249],
        [ 0.8292, -0.1473,  0.4938,  0.0935]], grad_fn=<SelectBackward>)

tensor([[ 0.8153, -0.1394,  0.3829,  0.0173],
        [ 0.8094,  0.2289,  0.4830, -0.2243],
        [ 0.7936, -0.0426,  0.3890, -0.0643],
        [ 0.7714,  0.3094,  0.4685, -0.2558],
        [ 0.8282,  0.1141,  0.4310, -0.1885],
        [ 0.8027, -0.0607,  0.3745, -0.0249],
        [ 0.8292, -0.1473,  0.4938,  0.0935]], grad_fn=<SelectBackward>)

The printed values are indeed identical.
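Comparing printed tensors by eye is error-prone, so the same check can also be done with torch.allclose. This is a minimal sketch; it additionally compares against hidden[0], the first layer's final state, which in general differs from output[-1] because output only contains the last layer.

```python
import torch
import torch.nn as nn

rnn_layer = nn.RNN(input_size=5, hidden_size=4, num_layers=3)
output, hidden = rnn_layer(torch.randn(4, 7, 5))

# Last time step of the last layer, read in two different ways
print(torch.allclose(output[-1], hidden[-1]))  # True

# hidden[0] is the first layer's final state; output holds only the
# last layer, so this comparison is not expected to match.
print(torch.allclose(output[-1], hidden[0]))
```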

 

 

3. batch_first=True (the default is False)

 

In PyTorch, data fed to a model usually has batch_size as its first dimension.

 

With RNNs and LSTMs, however, the first dimension is sequence rather than batch_size.

 

input : [sequence, batch_size, input_size]

 

The output of the model also has sequence as its first dimension.

 

output : [sequence, batch_size, hidden_size]

 

Having sequence come first like this can make the data dimensions confusing at times.

 

That is what batch_first=True is for. The default is False.

 

Setting batch_first to True moves batch_size to the front.

 

input : [batch_size, sequence, input_size]

   

output : [batch_size, sequence, hidden_size]

 

hidden stays the same: its shape is still [num_layers, batch_size, hidden_size].
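One way to convince yourself that batch_first only changes the layout, not the computation, is to give two layers identical weights and feed them the same data in the two layouts. This is a sketch under that assumption; copying the weights with load_state_dict is my own device for making the comparison possible.

```python
import torch
import torch.nn as nn

seq_first = nn.RNN(input_size=5, hidden_size=4, num_layers=3)
batch_first = nn.RNN(input_size=5, hidden_size=4, num_layers=3, batch_first=True)
batch_first.load_state_dict(seq_first.state_dict())  # identical weights

x = torch.randn(4, 7, 5)                       # [sequence, batch_size, input_size]
out_s, hid_s = seq_first(x)
out_b, hid_b = batch_first(x.transpose(0, 1))  # [batch_size, sequence, input_size]

print(out_b.shape)  # torch.Size([7, 4, 4])
print(hid_b.shape)  # torch.Size([3, 7, 4])

# Swapping the first two axes of the batch-first output recovers the
# sequence-first output; hidden needs no swapping at all.
print(torch.allclose(out_s, out_b.transpose(0, 1), atol=1e-5))
print(torch.allclose(hid_s, hid_b, atol=1e-5))
```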

 

Using it is simple.

 

import torch
import torch.nn as nn


input = torch.randn(4, 7, 5)
print(input.size())

rnn_layer = nn.RNN(input_size=5, hidden_size=4, num_layers=3, batch_first=True)
print(rnn_layer)

# Result
# RNN(5, 4, num_layers=3, batch_first=True)

 

With batch_first set to True, batch_size now comes first.

 

As a result, the same input tensor is now interpreted with different values for each dimension:

 

batch_size = 4

sequence = 7

input_size = 5

hidden_size = 4

num_layers = 3

 

These are the new values.

 

# Run the forward pass with the batch_first layer defined above
(output, hidden) = rnn_layer(input)

print("Output size : {}".format(output.size()))
print("Hidden size : {}".format(hidden.size()))

# output : [batch_size, sequence, hidden_size]
# hidden : [num_layers, batch_size, hidden_size]
# Output size : torch.Size([4, 7, 4])
# Hidden size : torch.Size([3, 4, 4])

The shapes are different, but apart from the sizes nothing much changes.

 

For indexing, apply the same idea as before: take the last element along the sequence axis of output and along the num_layers axis of hidden.

 

print(output[:, -1, :])
print()
print(hidden[-1])

# Result
tensor([[-0.2105,  0.4423,  0.8115, -0.2838],
        [-0.4891,  0.1518,  0.8839, -0.3675],
        [-0.5495,  0.3221,  0.8500, -0.4782],
        [-0.5822,  0.3100,  0.7938, -0.4242]], grad_fn=<SliceBackward>)

tensor([[-0.2105,  0.4423,  0.8115, -0.2838],
        [-0.4891,  0.1518,  0.8839, -0.3675],
        [-0.5495,  0.3221,  0.8500, -0.4782],
        [-0.5822,  0.3100,  0.7938, -0.4242]], grad_fn=<SelectBackward>)
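A pitfall worth noting: with batch_first=True, output[-1] no longer selects the last time step; it selects the last sample in the batch. Here is a short sketch of the difference, using the same sizes as above (the names last_step and last_sample are my own).

```python
import torch
import torch.nn as nn

rnn_layer = nn.RNN(input_size=5, hidden_size=4, num_layers=3, batch_first=True)
output, hidden = rnn_layer(torch.randn(4, 7, 5))  # [batch_size, sequence, input_size]

last_step = output[:, -1, :]   # last time step for every sample
last_sample = output[-1]       # every time step for the last sample

print(last_step.shape)    # torch.Size([4, 4])
print(last_sample.shape)  # torch.Size([7, 4])
print(torch.allclose(last_step, hidden[-1]))  # True
```

Only the slice along the sequence axis lines up with hidden[-1].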

 

 

18진수