Weekends are fun. For a tech guru, weekends are not only fun but also full of innovation. This article describes how one such guru, Zhang Faen, CTO of AInnovation (创新奇智), used a generative language model to play the 24 Game, purely out of personal interest.
Training a Generative Language Model (GPT-LM) from Scratch to Solve the 24 Game: A Hobbyist Implementation of the Large Generative Language Model Behind ChatGPT
By Zhang Faen
Almost everyone has played the game of combining four numbers with addition, subtraction, multiplication, and division to get 24. ChatGPT, which has recently exploded in popularity, can generate high-quality text from a "prompt" (a question) provided by a person. This article shows how to train a miniature GPT, using nothing but PyTorch, to solve the 24 Game; in other words, how to play the 24 Game with a generative language model.
Here is what the interface looks like once the model is running:
The rest of this article is copied from the README of a repo on my GitHub [https://github.com/zhangfaen/24GameGPT/]. If you are interested, head over to GitHub to read it, git clone the repo, and run the code yourself.
This repository contains a PyTorch implementation of GPT, which includes both training and inference. GPT is a relatively simple model, as described in the model.py file. It takes a sequence of indices as input, applies a Transformer model (as described in the Transformer paper), and generates a probability distribution over the next index in the sequence.
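To make the idea concrete, here is a minimal sketch of such a model in PyTorch, using nn.TransformerEncoder with a causal mask. It is only an illustration, not the repo's model.py; the class and parameter names here are mine.

```python
import torch
import torch.nn as nn

class TinyGPT(nn.Module):
    """Minimal GPT-style decoder: token indices in, next-token logits out.
    Illustrative sketch only; the repo's model.py may differ in detail."""
    def __init__(self, vocab_size, block_size, n_embd=384, n_head=6, n_layer=6):
        super().__init__()
        self.block_size = block_size
        self.tok_emb = nn.Embedding(vocab_size, n_embd)   # token embeddings
        self.pos_emb = nn.Embedding(block_size, n_embd)   # learned positional embeddings
        layer = nn.TransformerEncoderLayer(
            d_model=n_embd, nhead=n_head, dim_feedforward=4 * n_embd,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layer)
        self.ln_f = nn.LayerNorm(n_embd)
        self.head = nn.Linear(n_embd, vocab_size, bias=False)

    def forward(self, idx):
        # idx: (batch, seq_len) tensor of token indices
        b, t = idx.shape
        pos = torch.arange(t, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)                    # (b, t, n_embd)
        causal = nn.Transformer.generate_square_subsequent_mask(t).to(idx.device)
        x = self.blocks(x, mask=causal)                              # causal self-attention
        logits = self.head(self.ln_f(x))                             # (b, t, vocab_size)
        return logits  # at each position: unnormalized distribution over the next index
```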
Language modeling (LM) refers to the use of statistical and probabilistic techniques to determine the likelihood of a specific sequence of words occurring in a sentence. LM is often used for text generation tasks. The well-known ChatGPT, for example, takes a prompt as input and generates text to complete that prompt.
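Text generation with such a model is just repeated next-token prediction: feed the sequence so far, sample the next index from the predicted distribution, append it, and repeat. A rough sketch, assuming a model like the TinyGPT above:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx, max_new_tokens, temperature=1.0):
    """Autoregressive completion: `idx` is a (batch, seq_len) tensor holding the
    encoded prompt; one new index is sampled and appended per iteration."""
    model.eval()
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -model.block_size:]               # crop to the context window
        logits = model(idx_cond)[:, -1, :] / temperature    # logits for the next index
        probs = F.softmax(logits, dim=-1)
        next_idx = torch.multinomial(probs, num_samples=1)  # sample (or use argmax)
        idx = torch.cat([idx, next_idx], dim=1)
    return idx
```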
In this repository (24GameGPT), I have trained a GPT model specifically to solve the 24 Game.
Note: This repo is for educational purposes; the code is not performance-optimized and may be buggy. It is meant for those who want to learn about the Transformer and GPT language models. I recommend using 'pdb' to step through any code you are interested in.
The 24 Game involves taking four numbers in the range 0 to 9 and combining them with the four arithmetic operators (+, -, *, /) to obtain a result of 24. For example, given the numbers 4, 2, 5, 3, there are multiple valid reasoning paths to achieve the desired result, such as "5 + 3 = 8, 8 - 2 = 6, 6 * 4 = 24" or "5 + 4 = 9, 9 + 3 = 12, 12 * 2 = 24".
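Such paths can be found by brute force: repeatedly pick two numbers from the pool, apply an operator, and put the result back until a single number remains. The sketch below illustrates the idea; it keeps only integer intermediate results and is not the repo's 24_game_data_generator.py, whose details may differ.

```python
from itertools import permutations

def reasoning_paths(numbers):
    """Yield step lists like ['5 + 3 = 8', '8 - 2 = 6', '6 * 4 = 24'] that reach 24.
    Only integer intermediate results are kept; duplicates from commutative ops are not filtered."""
    def search(pool, steps):
        if len(pool) == 1:
            if pool[0] == 24:
                yield list(steps)
            return
        for i, j in permutations(range(len(pool)), 2):
            a, b = pool[i], pool[j]
            rest = [pool[k] for k in range(len(pool)) if k not in (i, j)]
            candidates = [(a + b, '+'), (a - b, '-'), (a * b, '*')]
            if b != 0 and a % b == 0:
                candidates.append((a // b, '/'))
            for result, op in candidates:
                steps.append(f"{a} {op} {b} = {result}")
                yield from search(rest + [result], steps)
                steps.pop()
    yield from search(list(numbers), [])

# Print a few reasoning paths for 4, 2, 5, 3
for path in list(reasoning_paths([4, 2, 5, 3]))[:3]:
    print(", ".join(path))
```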
I have trained a GPT model to predict these reasoning paths based on a given prompt (in the source code, the term "problem" is used to represent a prompt). The trained model takes a prompt like "[3, 7, 5, 5]: " as input, and we expect the model to output something like "7 + 5 = 12, 5 - 3 = 2, 2 * 12 = 24".
Since we are using a language model to solve the 24 Game, we treat "[3, 7, 5, 5]: 7 + 5 = 12, 5 - 3 = 2, 2 * 12 = 24" and "[3, 5, 5, 7]: 7 + 5 = 12, 5 - 3 = 2, 2 * 12 = 24" as different data samples.
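For illustration, one solved instance could be expanded into several text samples by permuting the numbers in the prompt, roughly as follows (the exact sample format used in dataset.py may differ):

```python
from itertools import permutations

def expand_sample(numbers, solution):
    """Turn one solved instance into several text samples, one per ordering of the prompt numbers."""
    samples = set()  # a set removes duplicate orderings caused by repeated numbers
    for perm in permutations(numbers):
        samples.add(f"[{', '.join(str(n) for n in perm)}]: {solution}")
    return sorted(samples)

for s in expand_sample([3, 7, 5, 5], "7 + 5 = 12, 5 - 3 = 2, 2 * 12 = 24")[:3]:
    print(s)
```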
I used 40% of all the data samples (refer to 24_game_data_generator.py for information on generating all possible data samples) to train the model.
The model achieved an accuracy of about 99.1% when evaluated on all data samples.
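Accuracy here means that the generated reasoning path is actually a valid solution for the prompted numbers. One way to check a completion is sketched below; this is only an illustration, not the repo's evaluation code.

```python
import re

def is_valid_path(numbers, completion):
    """Check that every step is arithmetically correct, each operand comes from the
    remaining pool (original numbers plus earlier results), and the final result is 24."""
    pool = list(numbers)
    result = None
    for step in (s.strip() for s in completion.split(',')):
        m = re.fullmatch(r'(-?\d+) ([+\-*/]) (-?\d+) = (-?\d+)', step)
        if not m:
            return False
        a, op, b, result = int(m.group(1)), m.group(2), int(m.group(3)), int(m.group(4))
        for operand in (a, b):
            if operand in pool:
                pool.remove(operand)
            else:
                return False
        expected = {'+': a + b, '-': a - b, '*': a * b,
                    '/': a // b if b != 0 and a % b == 0 else None}[op]
        if expected != result:
            return False
        pool.append(result)
    return result == 24 and len(pool) == 1

print(is_valid_path([3, 7, 5, 5], "7 + 5 = 12, 5 - 3 = 2, 2 * 12 = 24"))  # True
print(is_valid_path([3, 3, 8, 8], "3 * 8 = 24, 8 - 3 = 5, 24 / 1 = 24"))  # False
```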
Note: For some combinations of numbers, there may not be a valid reasoning path to obtain 24. In such cases, the model may output nonsensical results. For example, given the numbers [3, 3, 8, 8], the model might output "3 * 8 = 24, 8 - 3 = 5, 24 / 1 = 24".
To use the pre-trained 24 Game GPT model, follow these steps:
That's it! The pre-trained 24 Game GPT model will be loaded, and you can provide your own inputs. For example, you can enter "4 3 3 4", and the model may generate a reasoning path based on your input prompt.
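For illustration, here is a rough sketch of how an input like "4 3 3 4" could be turned into a prompt string and character-level token indices; the real vocabulary and encoding live in dataset.py and may differ.

```python
# Hedged sketch of a character-level vocabulary for prompts like "[4, 3, 3, 4]: ".
chars = sorted(set("0123456789[], :+-*/="))
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> index
itos = {i: ch for ch, i in stoi.items()}       # index -> char

def encode(text):
    return [stoi[ch] for ch in text]

def decode(indices):
    return "".join(itos[i] for i in indices)

numbers = "4 3 3 4".split()
prompt = f"[{', '.join(numbers)}]: "           # -> "[4, 3, 3, 4]: "
print(prompt, encode(prompt))
```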
To train a 24 Game GPT model from scratch, follow these steps:
That's it! If you have a GPU, the training process will complete in a few minutes.
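At its core, such a training run is an ordinary next-token cross-entropy loop. A minimal sketch, assuming a TinyGPT-like model and a hypothetical get_batch() helper that returns (input, target) index tensors; this is an outline, not the repo's 24_game_train.py.

```python
import torch
import torch.nn.functional as F

def train(model, get_batch, steps=20_000, lr=3e-4,
          device="cuda" if torch.cuda.is_available() else "cpu"):
    """Next-token prediction: targets are the input indices shifted by one position.
    Positions to be ignored in the loss (e.g. padding) can be set to -1 in y."""
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for step in range(steps):
        x, y = get_batch()                       # each (batch, seq_len) of token indices
        x, y = x.to(device), y.to(device)
        logits = model(x)                        # (batch, seq_len, vocab_size)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1), ignore_index=-1)
        optimizer.zero_grad(set_to_none=True)
        loss.backward()
        optimizer.step()
        if step % 1000 == 0:
            print(f"step {step}: loss {loss.item():.4f}")
```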
The default configuration for the Transformer model is as follows: 384 embedding dimensions, 6 heads, and 6 layers.
The default configuration for the trainer is as follows: 20,000 steps, batch size of 64, and 40% of all data samples used for training.
You can modify these hyperparameters and configurations in the dataset.py and 24_game_train.py files.
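For reference, these defaults could be collected into small config objects along the following lines (field names are illustrative; see dataset.py and 24_game_train.py for the real ones):

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    n_embd: int = 384        # embedding dimensions
    n_head: int = 6          # attention heads
    n_layer: int = 6         # Transformer layers

@dataclass
class TrainConfig:
    max_steps: int = 20_000
    batch_size: int = 64
    train_fraction: float = 0.4   # 40% of all data samples used for training
```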
Note: The first step is optional, as the data generated by this script is already included in the repo.
If you find this repository useful and want to cite it:
Faen Zhang, 24GameGPT, GitHub, https://github.com/zhangfaen/24GameGPT