Weekends are fun. For a tech guru, weekends are not only fun but also full of innovation. This article describes how one such guru, Zhang Faen, CTO of AInnovation (创新奇智), used a generative language model to play the 24 Game, purely out of personal interest.
Training a Generative Language Model (GPT-LM) from Scratch to Solve the 24 Game: A Hobbyist Implementation of the Large Generative Language Model Behind ChatGPT
By Zhang Faen
Almost everyone has played the game of combining four numbers with addition, subtraction, multiplication, and division to get 24. ChatGPT, which has recently exploded in popularity, can generate high-quality text from a "prompt" (a question) provided by a person. This article shows how to train a miniature GPT, using nothing but PyTorch, to solve the 24 Game; in other words, how to play the 24 Game with a generative language model.
Here is what the interface looks like once the model is running:
The rest of this article is copied from the README of a repo on my GitHub [https://github.com/zhangfaen/24GameGPT/]. If you are interested, head over to GitHub to read it, git clone the repo, and run the code yourself.
This repository contains a PyTorch implementation of GPT, which includes both training and inference. GPT is a relatively simple model, as described in the model.py file. It takes a sequence of indices as input, applies a Transformer model (as described in the Transformer paper), and generates a probability distribution over the next index in the sequence.
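To make the idea concrete, here is a minimal sketch of such a model in PyTorch, using nn.TransformerEncoder with a causal mask. It is only an illustration, not the repo's model.py; the class and parameter names here are mine.

```python
import torch
import torch.nn as nn

class TinyGPT(nn.Module):
    """Minimal GPT-style decoder: token indices in, next-token logits out.
    Illustrative sketch only; the repo's model.py may differ in detail."""
    def __init__(self, vocab_size, block_size, n_embd=384, n_head=6, n_layer=6):
        super().__init__()
        self.block_size = block_size
        self.tok_emb = nn.Embedding(vocab_size, n_embd)   # token embeddings
        self.pos_emb = nn.Embedding(block_size, n_embd)   # learned positional embeddings
        layer = nn.TransformerEncoderLayer(
            d_model=n_embd, nhead=n_head, dim_feedforward=4 * n_embd,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layer)
        self.ln_f = nn.LayerNorm(n_embd)
        self.head = nn.Linear(n_embd, vocab_size, bias=False)

    def forward(self, idx):
        # idx: (batch, seq_len) tensor of token indices
        b, t = idx.shape
        pos = torch.arange(t, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)                    # (b, t, n_embd)
        causal = nn.Transformer.generate_square_subsequent_mask(t).to(idx.device)
        x = self.blocks(x, mask=causal)                              # causal self-attention
        logits = self.head(self.ln_f(x))                             # (b, t, vocab_size)
        return logits  # at each position: unnormalized distribution over the next index
```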
Language modeling (LM) refers to the use of statistical and probabilistic techniques to determine the likelihood of a specific sequence of words occurring in a sentence. LM is often used for text generation tasks. The well-known ChatGPT, for example, takes a prompt as input and generates text to complete that prompt.
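Text generation with such a model is just repeated next-token prediction: feed the sequence so far, sample the next index from the predicted distribution, append it, and repeat. A rough sketch, assuming a model like the TinyGPT above:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx, max_new_tokens, temperature=1.0):
    """Autoregressive completion: `idx` is a (batch, seq_len) tensor holding the
    encoded prompt; one new index is sampled and appended per iteration."""
    model.eval()
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -model.block_size:]               # crop to the context window
        logits = model(idx_cond)[:, -1, :] / temperature    # logits for the next index
        probs = F.softmax(logits, dim=-1)
        next_idx = torch.multinomial(probs, num_samples=1)  # sample (or use argmax)
        idx = torch.cat([idx, next_idx], dim=1)
    return idx
```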
In this repository (24GameGPT), I have trained a GPT model specifically to solve the 24 Game.
Note: This repo is for educational purposes; the code is not performance-optimized and may be buggy. It is meant for those who want to learn about the Transformer and GPT language models. I recommend using 'pdb' to step through any code you are interested in.
The 24 Game involves taking four numbers in the range 0 to 9 and combining them with the four arithmetic operators (+, -, *, /) to obtain a result of 24. For example, given the numbers 4, 2, 5, 3, there are multiple valid reasoning paths to achieve the desired result, such as "5 + 3 = 8, 8 - 2 = 6, 6 * 4 = 24" or "5 + 4 = 9, 9 + 3 = 12, 12 * 2 = 24".
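Such paths can be found by brute force: repeatedly pick two numbers from the pool, apply an operator, and put the result back until a single number remains. The sketch below illustrates the idea; it keeps only integer intermediate results and is not the repo's 24_game_data_generator.py, whose details may differ.

```python
from itertools import permutations

def reasoning_paths(numbers):
    """Yield step lists like ['5 + 3 = 8', '8 - 2 = 6', '6 * 4 = 24'] that reach 24.
    Only integer intermediate results are kept; duplicates from commutative ops are not filtered."""
    def search(pool, steps):
        if len(pool) == 1:
            if pool[0] == 24:
                yield list(steps)
            return
        for i, j in permutations(range(len(pool)), 2):
            a, b = pool[i], pool[j]
            rest = [pool[k] for k in range(len(pool)) if k not in (i, j)]
            candidates = [(a + b, '+'), (a - b, '-'), (a * b, '*')]
            if b != 0 and a % b == 0:
                candidates.append((a // b, '/'))
            for result, op in candidates:
                steps.append(f"{a} {op} {b} = {result}")
                yield from search(rest + [result], steps)
                steps.pop()
    yield from search(list(numbers), [])

# Print a few reasoning paths for 4, 2, 5, 3
for path in list(reasoning_paths([4, 2, 5, 3]))[:3]:
    print(", ".join(path))
```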
I have trained a GPT model to predict these reasoning paths based on a given prompt (in the source code, the term "problem" is used to represent a prompt). The trained model takes a prompt like "[3, 7, 5, 5]: " as input, and we expect the model to output something like "7 + 5 = 12, 5 - 3 = 2, 2 * 12 = 24".
Since we are using a language model to solve the 24 Game, we treat "[3, 7, 5, 5]: 7 + 5 = 12, 5 - 3 = 2, 2 * 12 = 24" and "[3, 5, 5, 7]: 7 + 5 = 12, 5 - 3 = 2, 2 * 12 = 24" as different data samples.
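For illustration, one solved instance could be expanded into several text samples by permuting the numbers in the prompt, roughly as follows (the exact sample format used in dataset.py may differ):

```python
from itertools import permutations

def expand_sample(numbers, solution):
    """Turn one solved instance into several text samples, one per ordering of the prompt numbers."""
    samples = set()  # a set removes duplicate orderings caused by repeated numbers
    for perm in permutations(numbers):
        samples.add(f"[{', '.join(str(n) for n in perm)}]: {solution}")
    return sorted(samples)

for s in expand_sample([3, 7, 5, 5], "7 + 5 = 12, 5 - 3 = 2, 2 * 12 = 24")[:3]:
    print(s)
```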
I used 40% of all the data samples (refer to 24_game_data_generator.py for information on generating all possible data samples) to train the model.
The model achieved an accuracy of about 99.1% when evaluated on all data samples.
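Accuracy here means that the generated reasoning path is actually a valid solution for the prompted numbers. One way to check a completion is sketched below; this is only an illustration, not the repo's evaluation code.

```python
import re

def is_valid_path(numbers, completion):
    """Check that every step is arithmetically correct, each operand comes from the
    remaining pool (original numbers plus earlier results), and the final result is 24."""
    pool = list(numbers)
    result = None
    for step in (s.strip() for s in completion.split(',')):
        m = re.fullmatch(r'(-?\d+) ([+\-*/]) (-?\d+) = (-?\d+)', step)
        if not m:
            return False
        a, op, b, result = int(m.group(1)), m.group(2), int(m.group(3)), int(m.group(4))
        for operand in (a, b):
            if operand in pool:
                pool.remove(operand)
            else:
                return False
        expected = {'+': a + b, '-': a - b, '*': a * b,
                    '/': a // b if b != 0 and a % b == 0 else None}[op]
        if expected != result:
            return False
        pool.append(result)
    return result == 24 and len(pool) == 1

print(is_valid_path([3, 7, 5, 5], "7 + 5 = 12, 5 - 3 = 2, 2 * 12 = 24"))  # True
print(is_valid_path([3, 3, 8, 8], "3 * 8 = 24, 8 - 3 = 5, 24 / 1 = 24"))  # False
```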
Note: For some combinations of numbers, there may not be a valid reasoning path to obtain 24. In such cases, the model may output nonsensical results. For example, given the numbers [3, 3, 8, 8], the model might output "3 * 8 = 24, 8 - 3 = 5, 24 / 1 = 24".
To use the pre-trained 24 Game GPT model, follow these steps:
That's it! The pre-trained 24 Game GPT model will be loaded, and you can provide your own inputs. For example, you can enter "4 3 3 4", and the model may generate a reasoning path based on your input prompt.
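For illustration, here is a rough sketch of how an input like "4 3 3 4" could be turned into a prompt string and character-level token indices; the real vocabulary and encoding live in dataset.py and may differ.

```python
# Hedged sketch of a character-level vocabulary for prompts like "[4, 3, 3, 4]: ".
chars = sorted(set("0123456789[], :+-*/="))
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> index
itos = {i: ch for ch, i in stoi.items()}       # index -> char

def encode(text):
    return [stoi[ch] for ch in text]

def decode(indices):
    return "".join(itos[i] for i in indices)

numbers = "4 3 3 4".split()
prompt = f"[{', '.join(numbers)}]: "           # -> "[4, 3, 3, 4]: "
print(prompt, encode(prompt))
```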
To train a 24 Game GPT model from scratch, follow these steps:
That's it! If you have a GPU, the training process will complete in a few minutes.
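At its core, such a training run is an ordinary next-token cross-entropy loop. A minimal sketch, assuming a TinyGPT-like model and a hypothetical get_batch() helper that returns (input, target) index tensors; this is an outline, not the repo's 24_game_train.py.

```python
import torch
import torch.nn.functional as F

def train(model, get_batch, steps=20_000, lr=3e-4,
          device="cuda" if torch.cuda.is_available() else "cpu"):
    """Next-token prediction: targets are the input indices shifted by one position.
    Positions to be ignored in the loss (e.g. padding) can be set to -1 in y."""
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for step in range(steps):
        x, y = get_batch()                       # each (batch, seq_len) of token indices
        x, y = x.to(device), y.to(device)
        logits = model(x)                        # (batch, seq_len, vocab_size)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1), ignore_index=-1)
        optimizer.zero_grad(set_to_none=True)
        loss.backward()
        optimizer.step()
        if step % 1000 == 0:
            print(f"step {step}: loss {loss.item():.4f}")
```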
The default configuration for the Transformer model is as follows: 384 embedding dimensions, 6 heads, and 6 layers.
The default configuration for the trainer is as follows: 20,000 steps, batch size of 64, and 40% of all data samples used for training.
You can modify these hyperparameters and configurations in the dataset.py and 24_game_train.py files.
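For reference, these defaults could be collected into small config objects along the following lines (field names are illustrative; see dataset.py and 24_game_train.py for the real ones):

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    n_embd: int = 384        # embedding dimensions
    n_head: int = 6          # attention heads
    n_layer: int = 6         # Transformer layers

@dataclass
class TrainConfig:
    max_steps: int = 20_000
    batch_size: int = 64
    train_fraction: float = 0.4   # 40% of all data samples used for training
```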
Note: The first step is optional, as the data generated by this script is already included in the repo.
If you find this repository useful and want to cite it:
Faen Zhang, 24GameGPT, GitHub, https://github.com/zhangfaen/24GameGPT