Annual index finds AI is ‘industrializing’ but needs better metrics and testing

China has overtaken the United States in total number of AI research citations, fewer AI startups are receiving funding, and Congress is talking about AI more than ever. Those are three major trends highlighted in the 2021 AI Index, an annual report released today by Stanford University. Now in its fourth year, the AI Index attempts to document advances in artificial intelligence, as well as the technology’s impact on education, startups, and government policy. The report details progress in the performance of major subdomains of AI, like deep learning, image recognition, and object detection, as well as in areas like protein folding.

The AI Index is compiled by the Stanford Institute for Human-Centered Artificial Intelligence and an 11-member steering committee, with contributors from Harvard University, the OECD, the Partnership on AI, and SRI International. It draws on datasets from a range of sources, including AI research data from arXiv, funding data from Crunchbase, and surveys of groups like Black in AI and Queer in AI. A major trend identified in the report is the industrialization of AI, said Jack Clark, head of an OECD group working on algorithm impact assessment and former policy director for OpenAI.

Other major takeaways from the report:

  • Brazil, India, Canada, Singapore, and South Africa saw the highest levels of AI hiring from 2016 to 2020, according to data provided by LinkedIn.
  • Total global AI investment, including private investment and mergers and acquisitions, grew 40% in 2020. But for the third year in a row, that funding went to fewer startups.
  • In 2019, about two out of three graduates with a Ph.D. in AI in North America went into industry, up from 44% in 2010.
  • The majority of AI Ph.D. graduates come from outside the United States, and four out of five stay in the country after graduating.
  • A news analysis found that AI ethics stories were among the most popular AI-related stories in 2020, including coverage of topics like Google firing Timnit Gebru and ethics initiatives introduced by the European Commission, the United Nations, and the Vatican.
  • Attendance at major AI research conferences doubled in 2020 as most groups chose to hold virtual gatherings.
  • Women made up 18% of AI Ph.D. graduates.
  • China overtook the U.S. in total paper citations, but the U.S. maintained its two-decade lead in citations at AI research conferences.
  • TensorFlow is still the most popular AI software library, followed by Keras and PyTorch.
  • AI-related papers on arXiv grew from roughly 5,500 in 2015 to nearly 35,000 in 2020.
  • A Queer in AI 2020 member survey found that roughly half of respondents have experienced harassment or discrimination and encountered issues around inclusiveness.
  • Academic researchers lead in total papers published worldwide. But in the U.S., corporate research ranks second, while government research ranks second in Europe and China.

In the portion of the report dedicated to progress toward technical challenges, highlights include advances in effective chemical and molecular synthesis.

The AI Index report shows progress in AI systems that can be used for surveillance, like the object detection system YOLO. Considerable progress has also been made on VoxCeleb, a benchmark that measures the ability to identify a voice from a dataset containing audio of 6,000 people. The AI Index charts a decline in equal error rate from about 8% in 2017 to less than 1% in 2020.

“This metric is telling us that AI systems have gone from having an 8% equal error rate to about 0.5%, which tells you that this capability is going to be being deployed quietly across the world,” Clark said.
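The equal error rate Clark cites is the operating point where a verification system's false-accept rate (impostors admitted) equals its false-reject rate (genuine speakers turned away). A minimal sketch of how that metric is computed from trial scores (the scores below are illustrative, not VoxCeleb data):

```python
def equal_error_rate(genuine, impostor):
    """Return the EER given similarity scores for genuine (same-speaker)
    and impostor (different-speaker) trials; higher score means 'same'."""
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        far = sum(s >= t for s in impostor) / len(impostor)  # false-accept rate
        frr = sum(s < t for s in genuine) / len(genuine)     # false-reject rate
        if abs(far - frr) < best_gap:  # closest point to FAR == FRR
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

# Illustrative scores: overlapping distributions yield a nonzero EER.
print(equal_error_rate([0.9, 0.8, 0.4], [0.5, 0.2, 0.1]))
```

A drop from 8% to roughly 0.5% EER means both error types shrink together at the balanced threshold, which is why Clark treats it as a signal of deployable capability.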

A panel of experts on technical progress cited AlphaFold’s ability to predict how proteins fold and GPT-3 as two of the most talked-about AI systems of 2020. Though the AI Index acknowledges few- and zero-shot learning gains made by GPT-3, it cites a paper by former Google Ethical AI team co-lead Timnit Gebru and others that takes a critical look at large language models and their capacity to perpetuate bias. It also mentions a paper published last month by OpenAI and Stanford on the need to address large language models’ societal impact before it’s too late. In an interview with VentureBeat in 2019, AI Index founding director Yoav Shoham expressed doubts about the value of judging language models by their performance on narrow tasks.

VentureBeat has reported extensively on both of the research papers named in the index. Other reports cited in the index that VentureBeat has covered include McKinsey’s State of AI survey, which found little progress among business leaders in addressing the risks of deploying AI, and a paper warning that the de-democratization of AI in the age of deep learning can perpetuate inequality.

The AI Index report included a call for more benchmarks and testing in the fields of computer vision, ethics, and NLP. As demonstrated by benchmarks like GLUE and SuperGLUE, Clark said, “We’re running out of tests as fast as we can build them.” The creation of new benchmarks and testing is also an opportunity to make metrics that reflect people’s values and measure progress toward addressing grand challenges, like deforestation.

“I think one of the ways to get holistic accountability in a space is to have the same test that you run everything against, or the same set of tests. And until we have that, it’s going to be really fuzzy to talk about biases and other ethical issues in these systems, which I think would just hold us back as a community and also make it easier for people who want to pretend these issues don’t exist to continue to pretend they don’t exist or not mention them,” he said.

In previous years, the AI Index expanded to include tools like an arXiv monitor for searching preprint papers. The AI Index’s Global Vibrancy Tool, which serves up comparisons between national AI initiatives, now works for 26 countries across 23 categories.

Perhaps as interesting as what’s included in the report is what’s missing. This year the report removed data related to progress on self-driving cars, and Clark said it does not include information about fully autonomous weaponry due to a lack of data.

