In recent years, the potential of natural language processing technology has attracted renewed attention.
Natural language processing, which allows machines to understand the language humans use (natural language), already powers chatbots and has brought benefits such as improved document search.
In 2020, general-purpose language models such as “GPT-3”, the text-generation AI developed by the American AI research organization OpenAI, attracted attention. A general-purpose language model trained on a huge amount of text data is expected to greatly improve conventional natural language processing technology and further transform society.
Amid this attention, on November 25, 2020, LINE announced that it would develop the world’s first super-large language model specialized in Japanese and build the infrastructure needed to process it.
This time, we interviewed Mr. Isago, Mr. Eto, and Mr. Hashimoto of LINE AI Company, who are involved in developing this gigantic language model.
- Impact of super-giant language model | Concept of database changes
- Learn over 10 billion pages of Japanese data
- Barriers to Japanese correspondence | “Number of characters” problem
- Linking with text and voice | Keywords are multimodal
- In the future, we are also considering providing it to external parties | Collaborative research with professionals in various fields?
- Human words become art | Liberation from routine work
Impact of super-giant language model | Concept of database changes
A general-purpose language model is first trained on a large amount of language data, such as newspaper articles, encyclopedias, novels, and source code, and is then retrained on a small amount of data so that it achieves highly accurate language processing suited to a particular application.
This makes it possible to perform a wide range of language processing (dialogue, translation, input completion, document generation, programming, and so on), and such models are expected to handle diverse use cases easily and with higher accuracy than current technology.
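The two-stage idea described above (train on a large general corpus, then retrain on a small amount of task data) can be illustrated with a deliberately tiny sketch. This is not how LINE's model or GPT-3 works internally: it swaps in a word-bigram counter for a neural network, and the corpora, the `train` and `predict_next` helpers, and the `weight=5` setting are all hypothetical choices made only for illustration.

```python
from collections import Counter, defaultdict

def train(counts, text, weight=1):
    """Add word-bigram counts from `text` into `counts`, scaled by `weight`."""
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += weight
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if `word` is unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Stage 1: "pre-training" on a larger, general-purpose corpus (toy data).
general_corpus = (
    "please open the door . please open the window . "
    "please close the door . please open the book ."
)
model = train(defaultdict(Counter), general_corpus)
before = predict_next(model, "open")

# Stage 2: adaptation with a small amount of task-specific data.
# The extra weight is an assumed way to let the small corpus matter.
domain_corpus = "please open a ticket . open a ticket now ."
train(model, domain_corpus, weight=5)
after = predict_next(model, "open")

print(before, "->", after)
```

In this toy setting the prediction after "open" shifts from the general-corpus habit to the task-specific one, while everything else learned in stage 1 stays available. That is the intuition behind reusing one large model for many applications.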
LINE plans to train its general-purpose language model with more than 175 billion parameters, using more than 10 billion pages of Japanese data as training data. Once this gigantic language model is realized, LINE intends to use it in its own services, for example to develop new conversational AI and improve the quality of search, and is also considering joint development with third parties and providing APIs externally.
ーーPlease tell us about the social impact that a gigantic language model will have.
Mr. Eto: This language model will have a big impact, but I think the challenge is how to turn it into applications.
If this super-giant language model is realized, I think the very concept of a database will change.
With today’s databases, you enter a search word and get back the results that match that word. The language model that LINE is trying to develop would, for example, return the postal code when asked, “What is the address in Tokyo?” I believe that all the sentences humans have spoken and written so far will be absorbed into one unified language model, and that this will change the concept of a database.
ーーPlease tell us about the impetus for development.
Mr. Eto: There are various reasons, but one of the most striking was the announcement of “GPT-3” developed by OpenAI.
Recently, there has been a trend in English-speaking and Chinese-speaking countries to develop their own language models, but no one had done so for less widely used languages such as Japanese and Korean.
Therefore, LINE decided to develop a language model in Japan.
So far, we have not decided on a clear exit strategy, but I think it is a very interesting research area and a meaningful project that LINE should take on from the perspective of handling the Japanese language.
The project was decided on around May 2020 and announced about half a year later. We have now purchased the hardware needed for development and have it up and running.
ーーHow was it when you first learned about this project?
Mr. Eto: Frankly speaking, I thought it was reckless. I think it was a project that alarmed everyone, managers and engineers alike.
In the field of natural language processing, “GPT-3” has 175 billion parameters, and “GPT-4” and “GPT-5” are expected to aim for even larger scale, so I think scale is becoming more and more meaningful.
BERT, which was a hot topic before GPT-3, could be used for tasks such as judging whether two sentences have the same meaning or quantifying how closely related two sentences are, but it was limited to tasks of that kind. With a model like GPT-3, however, I believe that once the scale passes a certain threshold, such as 30 billion, 50 billion, or 100 billion, we can create a world in which text is generated automatically.
Learn over 10 billion pages of Japanese data
ーーWhat kind of data do you assume for the 10 billion pages you want your language model to learn?
Mr. Hashimoto: The training data is roughly divided into two types, “static content” and “dynamic content”.
Static content is relatively stable data that does not change, such as newspapers, books, and encyclopedias.
Dynamic content is content that is updated in chronological order, such as SNS, blogs, and news articles.
However, these sources alone do not provide enough volume, so we are also extracting and collecting data that was gathered for web search.
ーーHave you run into any problems now that development is actually under way?
Mr. Hashimoto: The training data is everything. Just as a person’s answers depend on their past experience and personality, the model’s output is ultimately shaped by the data actually used for training and by what is written there.
For example, if the AI learns a lot of content about political issues, religious issues, and gender discrimination, it will make remarks along those lines and become an AI that tends to produce radical text.
I think the most difficult point for LINE this time is how to build a fair, smart, and intelligent language model.
In addition, we need know-how to run the training computation in parallel, so we are going through repeated trial and error.
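A rough back-of-the-envelope calculation suggests why parallel training is unavoidable at this scale. The sketch below takes the 175 billion parameter figure mentioned in this article and assumes 2 bytes per parameter (16-bit weights) and a hypothetical accelerator with 40 GB of memory. Both of those numbers are assumptions for illustration, not details of LINE's infrastructure, and the estimate ignores gradients, optimizer state, and activations, which raise the real requirement considerably.

```python
import math

def min_devices(num_params, bytes_per_param, device_mem_gb):
    """Lower bound on devices needed just to hold the weights."""
    total_bytes = num_params * bytes_per_param
    device_bytes = device_mem_gb * 10**9
    return math.ceil(total_bytes / device_bytes)

PARAMS = 175 * 10**9      # parameter count quoted in the article
BYTES_PER_PARAM = 2       # assumption: 16-bit weights
DEVICE_MEM_GB = 40        # assumption: hypothetical 40 GB accelerator

weights_gb = PARAMS * BYTES_PER_PARAM / 10**9
devices = min_devices(PARAMS, BYTES_PER_PARAM, DEVICE_MEM_GB)

print(f"weights alone: {weights_gb:.0f} GB, at least {devices} devices")
```

Under these assumptions the weights alone come to 350 GB, so they cannot sit on a single device and the model has to be split across several. Deciding how to split it is the kind of know-how the answer above refers to.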
ーーI don’t think there are more than 10 billion pages of digitized Japanese data.
Mr. Hashimoto: In Japan, where digital transformation (DX) has not yet progressed very far, that is one of the difficult parts.
However, this will be a long-term project, so I would like to keep learning and solve the problem over time.
Barriers to Japanese correspondence | “Number of characters” problem
ーーLanguage models for the English-speaking world, such as GPT-3, are making progress. What makes it difficult to support Japanese?
Mr. Hashimoto: The number of characters is large. In the English-speaking world, most text can be expressed with about 100 characters in total, counting the letters of the alphabet and symbols.
Japanese uses on the order of 3,000 to 4,000 characters, which makes it difficult. We therefore need to prepare more Japanese data, and we also have to think about ways to overcome this language barrier.
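The character-count gap can be seen with nothing more than the Python standard library. The following is a small sketch, with sample sentences chosen only for illustration: it counts the ASCII letters, digits, and punctuation available for English, lists the scripts that appear in one short Japanese sentence, and compares how many UTF-8 bytes each character takes. It says nothing about how LINE actually tokenizes text.

```python
import string
import unicodedata

# Letters, digits, and punctuation available in plain ASCII.
ascii_symbols = string.ascii_letters + string.digits + string.punctuation

english = "The weather is nice today."
japanese = "今日は天気がいいですね。"  # "The weather is nice today, isn't it."

def scripts(text):
    """First word of each character's Unicode name, e.g. CJK or HIRAGANA."""
    return {unicodedata.name(ch).split()[0] for ch in text}

def utf8_bytes_per_char(text):
    """Average number of UTF-8 bytes needed per character."""
    return len(text.encode("utf-8")) / len(text)

print(len(ascii_symbols))             # size of the ASCII symbol inventory
print(sorted(scripts(japanese)))      # scripts mixed in one sentence
print(utf8_bytes_per_char(english), utf8_bytes_per_char(japanese))
```

Even this one short Japanese sentence mixes kanji, hiragana, and full-width punctuation, and each character costs three bytes rather than one. A model therefore has to cover a far larger symbol inventory, with fewer examples per symbol for the same amount of text.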
Mr. Eto: There is also the issue of context.
On the other hand, there are also expectations that this gigantic language model may solve the problem of context.
In Japanese, the meaning of a word changes depending on the previous sentence or context. For example, there are times when it is difficult to know if the word “okay” is an affirmation or a denial.
In this language model, we look at all contexts, so I think there is a possibility that language processing will evolve greatly in that respect.
ーーWhat kind of problems did you face with conventional language processing?
Mr. Eto: From the perspective of a database, we have been able to answer “what” questions, but not “how” and “why” questions.
Also, the writing style was not consistent. A conventional language model might speak in the polite desu-masu style one moment and then suddenly switch to a plain style, saying “It’s …”. In this project the style will be kept consistent, so I think you will be able to use it without it feeling unnatural.
ーーHow big is your development system?
Mr. Isago: A few dozen people. We do not divide up the work, so a very small number of specialists run the whole development effort.
The development team does everything from organizing data to running algorithms and infrastructure. In parallel, the planner team is thinking about what kinds of services the language model can be used for once it is built.
However, at the moment there are parts where we do not know what it will be able to do, so I feel we have not yet left the stage of daydreaming. Even so, it is important to work closely with the development team while thinking about where the model can be used.
Linking with text and voice | Keywords are multimodal
Mr. Eto: There has long been a problem known as symbol grounding, and my dream was to turn images into text and voice into text.
Now it is becoming possible to link images and text, and voice and text. From now on, I think we will be in a world where we can make use of this connection to generate images from text, or music from images. I think that an interesting world is coming in five years.
ーーIn terms of voice, what do you think about the connection with “LINE AiCall”?
Mr. Isago: If we can solve problems such as real-time performance, I think using the smartphone as an agent is a great idea.
If you say, “Please make a reservation here,” it will go ahead and make the reservation on its own. I think the range of applications is very wide, but it is not yet clear when we will be able to deliver a user experience that is more accurate than a human.
If we can find a domain where this alone seems to be enough, I would like to start experimental efforts as soon as possible.
ーーUntil now, AI that specialized only in images or languages was the mainstream, but from now on, a multimodal world that transcends media formats is coming. The importance of planners is likely to increase in the future.
Mr. Isago: We do not yet know the end goal, so I am not sure whether what I am working on is the right answer.
I think that using a newly completed language model in a mission-critical area such as “inquiry support” will cause various accidents.
Therefore, the key point is how many use cases we can create that are not mission-critical, in entertainment and casual settings, such as “conversation with the language model is fun” and “generated sentences are fun”. In that sense, I think LINE is in a good position.
Also, I do not think it will be well suited to real-time use at first. Ultimately, I would like it to hold real-time conversations and return sentences immediately, but we will aim for real-time responses while building up content that can be enjoyed even when there is a time lag.
If real-time problems can be solved, I think the range of applications is very wide. I think it would be even better if we could put a reduced language model on the smartphone side and put it in as an agent.
In the future, we are also considering providing it to external parties | Collaborative research with professionals in various fields?
ーーAre you planning to provide the developed language model externally?
Mr. Isago: Once things have stabilized, we would like to provide it externally, but we would be very grateful if partners could conduct joint research with LINE before then.
Initially, we will work on the core part, but after that, I think it would be interesting if we could develop it jointly with an organization that is well-versed in each field.
Mr. Eto: Ultimately, I think that providing it as an API and having people propose ways to use it in the form of code would be the best fit for LINE’s policy. Language models have the problem of bias, so I would be happy if partners could get involved in that point from an early stage and work on the research together.
Mr. Hashimoto: A world that the law does not anticipate will also come. With a huge language model, issues of so-called AI fairness, such as bias and ethics, are important, so we need people to join the research and think about them together.
ーーBy when do you plan to publish the results of your research?
Mr. Hashimoto: We are on schedule with the initial plan, so if we can keep scaling up the model’s training smoothly, we would like to produce results by the end of 2021.
Human words become art | Liberation from routine work
ーーDo you have an image of the world view that a gigantic language model will bring?
Mr. Hashimoto: I think the quality of text will definitely improve. For example, in combination with e-commerce (EC), I think it can be applied to create copy that makes everyone want to buy.
Also, I think that the concept of LINE is to simplify the generation of sentences that everyone can understand and make life easier.
Mr. Eto: It’s liberation from the existing routine writing work.
I hope I won’t have to write similar documents and applications. I believe that a world will come in which the processing of word processors and fixed form texts will completely change.
I think that when sentences are automatically generated and there is no opportunity to hit the keyboard, the words we speak will become more artistic.
The skills that humans need will change, and the education system will change greatly along with them, just as we no longer learn calligraphy or the abacus.
Mr. Isago: I want something that automatically replies to LINE messages. Whether it can be scaled down to a model that runs on a smartphone matters a great deal given how LINE currently works. Messaging is basically end-to-end encrypted, so there is no point in trying to put something in the middle of the route.
An agent will not work unless it runs at the front, on the device itself. It’s difficult at the moment, but I think it will be possible if the model can be made lighter.
The same goes for the art aspect that Eto-san talked about, but the content we are currently working on is personalized. We are still working on personalizing LINE NEWS and advertisements.
If dark data can be added when the language model is developed and sentences are generated, it will be easier to customize.
If you can add dark data such as the utterances you have created in your life, you will be able to create your own language model. If that is realized, I think that when I want to create my own work, I will be able to create works that stand out more with their individuality.
If AI standardizes everything and makes dry communication the norm, that is not the world view that LINE wants to realize. Communication is fun, and self-expression is fun, so I would like to take on the challenge of “realizing a world where AI expands the range of expression” while improving efficiency.
Japanese is said to be the most difficult language in the world to learn. By developing a language model specialized for the Japanese language and generating highly accurate sentences, it will gradually bring about changes in various fields.
While the end goal is still undecided, LINE is pressing ahead with an unprecedented project. I’m really looking forward to seeing what kind of world the developers and planners will create together.