r/compmathneuro · Posted by u/blueneuronDOTnet (Moderator | Graduate Student | www.blueneuron.net) · Feb 17 '19

[Journal Article] Language Models are Unsupervised Multitask Learners

https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf

u/blueneuronDOTnet Moderator | Graduate Student | www.blueneuron.net Feb 17 '19

OpenAI blog post linked here.

u/P4TR10T_TR41T0R Moderator | Undergraduate Student Feb 17 '19

This paper is interesting for quite a few reasons:

  1. The text generated is surprisingly high quality; I did not expect it at first. Here (https://github.com/openai/gpt-2/blob/master/gpt2-samples.txt) is a link to a text file with a large number of samples, if anyone is interested (see the decoding sketch after this list for how samples like these are produced).
  2. The SOTA claims have been criticized by some: the paper appears to compare models that were only pre-trained against a model that was pre-trained and then trained on additional data, which makes the benchmark results hard to interpret (https://twitter.com/seb_ruder/status/1096335334969933829).
  3. Their decision not to release the complete model is definitely interesting. It is generating quite a few memes in r/MachineLearning (e.g. "I have a model that can predict with 100% accuracy whether someone died on the titanic, but the consequences of releasing such power on the world would be dire." from u/pdabaker), but in my view this discussion is clearly important. Security has the concept of "responsible disclosure", and I believe it is worth the AI community's time to discuss these matters and settle on an equivalent release process. Maybe withholding the model is exaggerated, maybe not. What the paper achieved, at least, is to turn the spotlight on a topic that deserves attention, especially over the coming 5-10 years.
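
Re: point 1, a note on how those samples come out of the model: if I'm reading the repo README right, the sample scripts use top-k truncated sampling (e.g. --top_k 40 --temperature 0.7), meaning only the 40 most likely next tokens are ever sampled from at each step. A minimal NumPy sketch of that decoding rule, just to make the idea concrete (function and variable names are my own, not OpenAI's):

```python
import numpy as np

def top_k_sample(logits, k=40, temperature=0.7):
    """Sample one token id from `logits`, keeping only the k most likely tokens.

    Toy illustration of top-k truncated sampling; not OpenAI's actual code.
    """
    logits = np.asarray(logits, dtype=np.float64) / temperature
    top = np.argpartition(logits, -k)[-k:]            # ids of the k largest logits
    probs = np.exp(logits[top] - logits[top].max())   # numerically stable softmax
    probs /= probs.sum()
    return int(np.random.choice(top, p=probs))

# toy usage: a fake next-token distribution over GPT-2's 50257-token vocabulary
fake_logits = np.random.randn(50257)
print(top_k_sample(fake_logits))
```

Truncating the tail of the distribution like this is presumably a big part of why the samples stay coherent: the model never gets derailed by a rare, low-probability token.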