Kathleen Martin
Guest
Meta’s AI lab has created a massive new language model that shares both the remarkable abilities and the harmful flaws of OpenAI’s pioneering neural network GPT-3. And in an unprecedented move for Big Tech, it is giving it away to researchers—together with details about how it was built and trained.
“We strongly believe that the ability for others to scrutinize your work is an important part of research. We really invite that collaboration,” says Joelle Pineau, a longtime advocate for transparency in the development of technology, who is now managing director at Meta AI.
Meta’s move is the first time that a fully trained large language model will be made available to any researcher who wants to study it. The news has been welcomed by many concerned about the way this powerful technology is being built by small teams behind closed doors.
“I applaud the transparency here,” says Emily M. Bender, a computational linguist at the University of Washington and a frequent critic of the way language models are developed and deployed.
“It’s a great move,” says Thomas Wolf, chief scientist at Hugging Face, the AI startup behind BigScience, a project in which more than 1,000 volunteers around the world are collaborating on an open-source language model. “The more open models the better,” he says.
Large language models—powerful programs that can generate paragraphs of text and mimic human conversation—have become one of the hottest trends in AI in the last couple of years. But they have deep flaws, parroting misinformation, prejudice, and toxic language.
In theory, putting more people to work on the problem should help. Yet because language models require vast amounts of data and computing power to train, they have so far remained projects for rich tech firms. The wider research community, including ethicists and social scientists concerned about their misuse, has had to watch from the sidelines.
Meta AI says it wants to change that. “Many of us have been university researchers,” says Pineau. “We know the gap that exists between universities and industry in terms of the ability to build these models. Making this one available to researchers was a no-brainer.” She hopes that others will pore over their work and pull it apart or build on it. Breakthroughs come faster when more people are involved, she says.
Meta is making its model, called Open Pretrained Transformer (OPT), available for non-commercial use. It is also releasing its code and a logbook that documents the training process. The logbook contains daily updates from members of the team about the training data: how it was added to the model and when, what worked and what didn’t. In more than 100 pages of notes, the researchers log every bug, crash, and reboot in a three-month training process that ran nonstop from October 2021 to January 2022.
With 175 billion parameters (the values in a neural network that get tweaked during training), OPT is the same size as GPT-3. This was by design, says Pineau. The team built OPT to match GPT-3 both in its accuracy on language tasks and in its toxicity. OpenAI has made GPT-3 available as a paid service but has not shared the model itself or its code. The idea was to provide researchers with a similar language model to study, says Pineau.
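For a sense of what studying the model looks like in practice, the smaller OPT checkpoints that Meta published can be loaded with Hugging Face's transformers library. The snippet below is a minimal sketch, assuming the facebook/opt-1.3b checkpoint and the standard transformers text-generation API; access to the full 175-billion-parameter model requires a separate research request and is not covered by this example.

```python
# Minimal sketch: load a smaller OPT checkpoint and generate text.
# Assumes the facebook/opt-1.3b weights on the Hugging Face Hub;
# the full OPT-175B model is gated behind a research-access request.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # assumption: one of the openly released variants
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Encode a prompt and sample a short continuation.
inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```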
Continue reading: https://www.technologyreview.com/2022/05/03/1051691/meta-ai-large-language-model-gpt3-ethics-huggingface-transparency/