Posts

Paper reading

Adversarial Training Methods for Semi-supervised Text Classification

Miyato et al. (ICLR 2017) apply two ideas. 1) Because the data is discrete, they perturb the (normalized) embeddings rather than the input itself. 2) The original embedding and the perturbed embeddin...

Paraphrase Generation with Deep Reinforcement Learning

Li et al. (EMNLP 2018) An evaluator provides the reward during the generator's fine-tuning stage. The evaluator may be used after being pre-trained as a binary classifier on positive/negative samples ...

A Multi-Task Approach for Disentangling Syntax and Semantics in Sentence Representations (VGVAE)

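Chen et al. (NAACL 2019) propose a VAE that separates a semantic latent variable from a syntactic latent variable in sentence representations. In addition, a paraphrase discrimination loss applied to the semantic variable, a word position loss applied to the syntactic variable, and ...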

AutoSeM: Automatic Task Selection and Mixing in Multi-Task Learning

Guo et al. (NAACL 2019) split the problem of choosing auxiliary tasks that improve primary-task performance into two stages: 1) task selection and 2) mixing ratio learning. Stage 1 picks the set of auxiliary tasks with a Beta-Bernoulli multi-a...

Cooperative Learning of Disjoint Syntax and Semantics

Havrylov et al. (NAACL 2019) To address the coadaptation problem that arises when a separate parser (syntax) and composition function (semantics) are trained jointly, they add SCT and PPO to the Gumbel Tree-LSTM and downstream ta...

Unsupervised Recurrent Neural Network Grammars

Kim et al. (NAACL 2019) This paper uses amortized variational inference to resolve the problem of marginalizing over the latent tree space and trains an RNNG without supervision. For the generative model, a stackLSTM and t...

Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders (DIORA)

Drozdov et al. (NAACL 2019) This paper uses inside-outside to compute a score and a representation for every binary (sub-)tree, then finds the max-scored tree with CYK. Constituent represe...

Structured Alignment Networks for Matching Sentences

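Liu et al. (EMNLP 2018) Focusing on the compositionality of meaning, the paper aims to determine the semantic relation between two sentences by comparing latent subtrees. Rather than building a tree for each sentence and then comparing at the sentence level, it builds the trees by comparing the two sentences at the span level (C...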

Neural Language Modeling by Jointly Learning Syntax and Lexicon (PRPN)

Shen et al. (ICLR 2018) This paper expresses dependencies between constituents as skip-connections between LSTM states. A CNN computes the syntactic distance between two adjacent word nodes, and based on that distance, the preceding nodes...

Rainbow: Combination of DQN Extensions (Part 1)

This post covers the six DQN extensions used in DeepMind's Rainbow: Combining Improvements in Deep Reinforcement Learning. It also includes excerpts of code implementing each extension, modified and annotated.

Pay Less Attention with Lightweight and Dynamic Convolutions

Wu et al. (ICLR 2019) Dynamic convolution is position-based attention. Built on depthwise convolution, it shares weights across some channels (lightweight) and, at each time step, a different con...

Topic

Automatic Evaluation Metrics for NLG

This post summarizes the main evaluation metrics used in NLG (mostly NMT). With the exception of MEWR, they measure how much the output sequence overlaps with the reference sequence.
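As a rough illustration of what "overlap with the reference" can mean, here is a minimal sketch of clipped n-gram precision, the building block behind BLEU-style scores. It is not taken from the post itself; the function names `ngrams` and `clipped_ngram_precision` are hypothetical, and whitespace tokenization is assumed purely for the example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_ngram_precision(output, reference, n=1):
    """Fraction of output n-grams that also appear in the reference.

    Each output n-gram is credited at most as many times as it occurs
    in the reference (clipping), so repeating a matching word does not
    inflate the score.
    """
    out_counts = Counter(ngrams(output, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(out_counts.values())
    if total == 0:
        # Output shorter than n tokens: no n-grams to score.
        return 0.0
    overlap = sum(min(count, ref_counts[gram]) for gram, count in out_counts.items())
    return overlap / total


if __name__ == "__main__":
    output = "the cat sat on the mat".split()
    reference = "the cat is on the mat".split()
    print(clipped_ngram_precision(output, reference, n=1))
    print(clipped_ngram_precision(output, reference, n=2))
```

Full metrics add more on top of this (for example, combining several n-gram orders and penalizing short outputs), so treat the sketch only as the overlap-counting core.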

Metrics

Written with reference to Sungbin Lim's Wasserstein GAN 수학 이해하기 (Understanding the Math of Wasserstein GAN).

Tips

Connecting Colab to a remote server through a local runtime

When you try to connect Google Colab to an external server through a local runtime, following the official documentation often produces an "Unable to connect to the runtime" error. The method below connected reliably.

bo-son
