Tuesday, May 28, 2024

GPT language model with 1.3 billion parameters, capable of generating advanced sentences!

rinna Inc. has developed and released a 1.3 billion parameter GPT language model specialized for Japanese.

In April 2021, rinna released a medium-sized Japanese GPT-2 model (GPT2-medium). On August 25 of the same year, it released two more GPT-2 models (GPT2-small and GPT2-xsmall) along with RoBERTa, an improved version of BERT. These models are published on GitHub, Hugging Face, and elsewhere.

This time, we have developed a GPT language model specialized for Japanese with 1.3 billion parameters, which is larger than any model we have released so far. It is released on Hugging Face under the MIT license, which permits commercial use.

The Japanese GPT released by rinna can automatically generate fluent Japanese sentences that reflect the characteristics of general Japanese text. For example, by crafting the text given to GPT as input (the text prompt), it can be used for sentence generation, classification, translation, and similar tasks, with output such as the following. Because the output is sampled according to probability, the generated text differs each time.

GPT input: Miyamoto Musashi
GPT output: He was a person who devoted his whole life to training and training his mind. And there are many records of Miyamoto Musashi’s disciples.

GPT input: The worst is negative. Likes are positive. Great is positive. not good
GPT output: Negative.

GPT input: Hello means hello. Good morning means good morning. Thank you
GPT output: means thank you. It can be a simple phrase to express your gratitude.
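The classification example follows a few-shot pattern: a handful of labeled statements, then an unlabeled query that the model is expected to complete. Here is a minimal sketch of assembling such a prompt. The function name `build_few_shot_prompt` and the English "X is Y." wording are assumptions made for illustration. The actual prompts are Japanese, and this is not rinna's own tooling.

```python
def build_few_shot_prompt(examples, query):
    """Join labeled examples into one prompt that ends with the unlabeled query.

    `examples` is a list of (text, label) pairs. The "X is Y." phrasing is an
    illustrative assumption, not a format prescribed by the model.
    """
    shots = [f"{text} is {label}." for text, label in examples]
    # Leave the query without a label so the model continues with one.
    shots.append(f"{query} is")
    return " ".join(shots)


examples = [
    ("The worst", "negative"),
    ("Like", "positive"),
    ("Great", "positive"),
]
prompt = build_few_shot_prompt(examples, "Not good")
print(prompt)
```

The resulting string would then be passed to the model as the text prompt. Swapping the label vocabulary (for example, source phrase and translated phrase) turns the same pattern into the translation-style prompt.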

Features of rinna’s Japanese GPT model

  • Uses the open-source Japanese C4, CC-100, and Wikipedia datasets as training data
  • The model was trained to a perplexity of about 14. A perplexity of 14 means that, when GPT predicts the next word, it has effectively narrowed the candidates down to about 14.
  • The model is published on Hugging Face under the MIT license, which permits commercial use, so that users can access it easily
  • Various tasks (domain-specific sentence generation, classification, translation, etc.) can be realized through text prompts and fine-tuning according to the user’s purpose.
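To make the perplexity figure in the list above concrete, the sketch below computes perplexity as the exponential of the average negative log-probability the model assigns to each correct next token. The probability lists are made-up illustrative inputs, not measurements from rinna's model, and `perplexity` is a hypothetical helper name.

```python
import math


def perplexity(token_probs):
    """Return exp of the mean negative log-probability over the given tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# A model that always spreads its belief evenly over 14 candidates
# assigns probability 1/14 to every correct token.
uniform_over_14 = [1 / 14] * 100
print(round(perplexity(uniform_over_14), 6))

# A more confident model gives higher probabilities and a lower perplexity.
print(round(perplexity([0.5, 0.25, 0.5, 0.25]), 6))
```

The first case comes out at 14, which matches the "narrowed down to 14 candidates" reading. Lower values mean the model is, on average, less uncertain about the next word.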

In the future, rinna will continue research on AI and develop high-performance products. In addition, in order to contribute to the research and development community, we plan to publish our research results and also plan to collaborate with other companies.
