Posts tagged 'TRANSFORMER'

List: TRANSFORMER (1)

TechNOTE

[Paper Review] Sparse Transformer

A brief summary of Sparse Transformer. arxiv.org/pdf/1904.10509.pdf Introduction: The Transformer is a model that can learn to capture the context of a sequence input well, but it takes up a very large amount of memory. The memory complexity of the attention layer is O(n^2), because every token in the input sequence computes an attention value with respect to every other token. Sparse Transformer is one of the earliest attempts to reduce this memory complexity, and it does so through sparse factorization. Key Idea Factorized Self Atte..
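
To make the O(n^2) point concrete, here is a minimal sketch that counts how many (query, key) score entries a dense attention layer has to hold, next to a sparse alternative. It is not taken from the post or from the paper's code. The function names and the particular sparse pattern (a causal local window plus every `stride`-th earlier position) are assumptions chosen purely for illustration.

```python
import math


def dense_pairs(n: int) -> int:
    """Score entries when every token attends to every token: n * n."""
    return n * n


def strided_pairs(n: int, stride: int) -> int:
    """Count (query, key) pairs under an illustrative causal sparse pattern.

    Assumption for this sketch: query i attends to key j <= i only if j lies
    within the last `stride` positions, or (i - j) is a multiple of `stride`.
    """
    count = 0
    for i in range(n):
        for j in range(i + 1):
            if i - j < stride or (i - j) % stride == 0:
                count += 1
    return count


if __name__ == "__main__":
    for n in (64, 256, 1024):
        stride = math.isqrt(n)  # hypothetical choice: stride near sqrt(n)
        print(n, dense_pairs(n), strided_pairs(n, stride))
```

Running the script prints, for each sequence length, the dense entry count next to the sparse one, which shows how quickly the dense count grows as n increases.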

NLP 2020. 11. 20. 13:28
