Topic: A question for those who know about GPT fine-tuning

This topic has [2] replies, 0 voices, and was last updated 2 years ago by agdsf.

Hello.

As I understand it, pretraining a GPT model relies on a next token prediction task. For example:

Input -> The GPT models are general-purpose language models that can perform ... (2048 tokens)

Output -> GPT models are general-purpose language models that can perform a ... (2048 tokens)
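
To make the shift between the Input and Output lines concrete, here is a minimal sketch of how I picture the pairs being built. Whitespace splitting is only a stand-in for a real subword tokenizer, and `make_next_token_pair` is a name I made up for illustration, not part of any library.

```python
def make_next_token_pair(tokens):
    """Return (inputs, targets), where targets is inputs shifted left by one."""
    if len(tokens) < 2:
        raise ValueError("need at least two tokens to form a pair")
    return tokens[:-1], tokens[1:]


# Assumption: whitespace splitting replaces a real subword tokenizer.
text = "The GPT models are general-purpose language models"
tokens = text.split()
inputs, targets = make_next_token_pair(tokens)

# At position i the model has seen inputs[: i + 1] and should predict targets[i].
for i, target in enumerate(targets):
    context = " ".join(inputs[: i + 1])
    print(f"{context!r} -> {target!r}")
```

Every position in the sequence yields one training signal, which is why a single 2048-token window gives many predictions at once.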

I think we can do pre-training with this next token prediction task. However, I don't understand how a GPT model is fine-tuned on question-and-answer sentences.

For example, suppose my question and answer are as follows:

Q) What is the GPT model?

A) The GPT models are general-purpose language models that can perform a broad range of tasks from creating original content to write code, summarizing text, and extracting data from documents.

In this case, is fine-tuning done with the question as the GPT model's input and the answer as its output? Or,

are the question and answer concatenated as below and then used for fine-tuning (in which case it is again a next token prediction task)?

Input -> What is the GPT model? The GPT models are general-purpose language models that can perform a broad range of tasks from creating original content to write code, summarizing text, and extracting data from documents.

Output -> is the GPT model? The GPT models are general-purpose language models that can perform a broad range of tasks from creating original content to write code, summarizing text, and extracting data from documents. [Padding]
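
The concatenated layout above could be sketched roughly as follows. This only illustrates the second option as I wrote it; it is not a claim about how any particular GPT model was actually fine-tuned. `build_qa_example` and the `mask_question` flag are hypothetical names. The flag shows one variant sometimes used with this layout, where the loss is counted only on answer tokens; whether that applies here is an assumption.

```python
PAD = "[Padding]"  # placeholder token, named as in the example above


def build_qa_example(question_tokens, answer_tokens, mask_question=False):
    """Concatenate question and answer, then shift by one for the targets.

    loss_mask[i] is 1 where targets[i] counts toward the loss, 0 otherwise.
    With mask_question=True (an assumed variant), only answer tokens count.
    """
    sequence = question_tokens + answer_tokens
    inputs = list(sequence)
    targets = sequence[1:] + [PAD]  # the final slot has nothing left to predict

    n_question = len(question_tokens)
    last = len(sequence) - 1
    loss_mask = []
    for i in range(len(targets)):
        is_pad = i == last
        # targets[i] is sequence[i + 1]; it is a question token if i + 1 < n_question
        is_question = mask_question and (i + 1) < n_question
        loss_mask.append(0 if (is_pad or is_question) else 1)
    return inputs, targets, loss_mask


# Assumption: whitespace splitting replaces a real subword tokenizer.
question = "What is the GPT model?".split()
answer = "The GPT models are".split()

inputs, targets, mask_all = build_qa_example(question, answer)
_, _, mask_answer_only = build_qa_example(question, answer, mask_question=True)

for tok_in, tok_out, m_all, m_ans in zip(inputs, targets, mask_all, mask_answer_only):
    print(f"{tok_in:>8} -> {tok_out:<10} all={m_all} answer_only={m_ans}")
```

Either way the objective stays next token prediction; the only difference between the two masks is which positions contribute to the loss.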

I would like to know which approach was used to fine-tune GPT models.
