
Meta FAIR and Samaya AI teams use AI to improve Wikipedia's verifiability


Editor | Cabbage Leaf

Verifiability is a core content policy of Wikipedia: claims need to be supported by citations. Maintaining and improving the quality of Wikipedia references is an important challenge, and better tools are urgently needed to help humans do this job.

Here, researchers from Samaya AI and Meta FAIR show that the process of improving references can be tackled with the help of artificial intelligence (AI) powered by information retrieval systems and language models.

This neural-network-based system, called SIDE, identifies Wikipedia citations that are unlikely to support their claims and then recommends better citations from across the web. The team trained the model on existing Wikipedia references, so it learns from the contributions and collective wisdom of thousands of Wikipedia editors. Using crowdsourcing, the researchers observed that for the top 10% of citations most likely to be flagged by the system as unverifiable, people preferred the system's suggested alternative to the originally cited reference 70% of the time.

To test the system's applicability in practice, the researchers built a demo and engaged the English Wikipedia community. For the same top 10% of claims that SIDE judged most likely to fail verification, SIDE's first citation recommendation was preferred twice as often as the existing Wikipedia citation. The results show that AI-based systems can work alongside humans to improve Wikipedia's verifiability.

The research was titled "Improving Wikipedia verifiability with AI" and was published in "Nature Machine Intelligence" on October 19, 2023.


Wikipedia is one of the most visited websites, with five trillion page views annually, making it one of the most important sources of knowledge today. It is therefore crucial that the knowledge on Wikipedia is almost always verifiable: Wikipedia users should be able to find and confirm its claims using reliable external sources. To make this possible, Wikipedia articles provide inline citations to background material that supports their claims. Readers who question a claim can follow these citations and verify the information themselves.

In practice, however, this process can fail: a citation may not actually contain the challenged claim, or its source may be of questionable reliability. Such statements may still be true, but an attentive reader cannot easily verify them from the cited source. Assuming a Wikipedia claim is true, its verification process can be divided into two stages: (1) checking whether the existing source is consistent with the claim; (2) if it is not, finding new evidence.
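This two-stage process can be pictured as a simple control flow. The Python sketch below is only an illustration of that workflow; supports() and search_web are hypothetical placeholders standing in for a learned verification model and a web-scale retriever, not code released with the paper.

```python
# Minimal sketch of the two-stage verification workflow described above.
# supports() and search_web are hypothetical stand-ins, not the paper's code.

def supports(claim: str, source_text: str) -> bool:
    """Toy stand-in: does the cited source plausibly contain the claim?"""
    # A real system would use a trained entailment / verification model here.
    return all(word in source_text.lower() for word in claim.lower().split())

def verify_claim(claim: str, cited_source: str, search_web) -> list[str]:
    """Stage 1: check the existing citation; stage 2: look for new evidence."""
    if supports(claim, cited_source):
        return [cited_source]                      # existing citation is consistent
    return [doc for doc in search_web(claim) if supports(claim, doc)]
```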

As noted above, verifying Wikipedia claims requires a deep understanding of language and mastery of online search. To what extent can machines learn this behavior? This question matters from the perspective of fundamental progress in artificial intelligence. For example, verification requires the ability to detect logical entailment in natural language and to translate a claim and its context into the search terms best suited to finding evidence, two long-standing problems that have mostly been studied in somewhat synthetic settings.
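To make the second of these problems concrete, the toy function below turns a claim and its surrounding article context into a keyword-style search query. This is purely illustrative and is not the paper's query-generation method; the claim and context strings are invented examples.

```python
# Toy illustration (not the paper's method) of turning a claim and its
# article context into a symbolic search query by keeping informative terms.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "on", "is", "was", "and", "to", "by", "for", "after"}

def claim_to_query(claim: str, context: str, max_terms: int = 8) -> str:
    tokens = re.findall(r"[a-z0-9']+", f"{context} {claim}".lower())
    content = [t for t in tokens if t not in STOPWORDS]
    # Keep the most frequent content words as query terms.
    return " ".join(term for term, _ in Counter(content).most_common(max_terms))

# Example with an invented claim and article context.
print(claim_to_query(
    claim="The bridge was completed in 1932 after eight years of construction.",
    context="Sydney Harbour Bridge, History",
))
```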

From a practical perspective, this is equally important. A machine verifier can help Wikipedia editors by flagging which citations are likely to fail verification and, where a citation does not currently support its claim, suggesting a replacement. This matters: searching for potential evidence and reading through the results takes time and considerable cognitive effort. Integrating an AI assistant into the process can help reduce both.


Illustration: SIDE overview. (Source: Paper)

In the latest work, researchers at Samaya AI and Meta FAIR developed SIDE, an AI-based Wikipedia citation verifier. SIDE finds claims on Wikipedia that may not be verifiable from their current citation and scans a snapshot of the web for alternatives.

Its behavior is learned from Wikipedia itself: using a curated corpus of English Wikipedia claims and their current citations, the researchers trained (1) a retriever component that translates a claim and its context into symbolic and neural search queries optimized to find candidate citations in a web-scale corpus, and (2) a verification model that ranks the existing and retrieved citations by how likely they are to verify the given claim.
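The sketch below shows one way these two components could fit together. All class and function names here are hypothetical stand-ins under the assumption of a sparse index, a dense index, and a claim-passage scoring model; it is a schematic of the described architecture, not the released SIDE code.

```python
# Schematic sketch of the retriever + verifier pipeline described above.
# All interfaces (sparse_index, dense_index, scoring_model) are assumptions.
from dataclasses import dataclass

@dataclass
class Candidate:
    url: str
    passage: str
    score: float = 0.0

class Retriever:
    """Turns a claim and its context into sparse (keyword) and dense
    (embedding) queries and returns candidate citations from a web corpus."""
    def __init__(self, sparse_index, dense_index):
        self.sparse_index = sparse_index
        self.dense_index = dense_index

    def retrieve(self, claim: str, context: str, k: int = 10) -> list[Candidate]:
        query = f"{context} {claim}"
        hits = self.sparse_index.search(query, k) + self.dense_index.search(query, k)
        return [Candidate(url=h["url"], passage=h["text"]) for h in hits]

class Verifier:
    """Scores how likely each passage is to verify the claim, then ranks the
    existing citation against the retrieved alternatives."""
    def __init__(self, scoring_model):
        self.scoring_model = scoring_model   # e.g. a fine-tuned cross-encoder

    def rank(self, claim: str, existing: Candidate,
             retrieved: list[Candidate]) -> list[Candidate]:
        candidates = [existing] + retrieved
        for c in candidates:
            c.score = self.scoring_model(claim, c.passage)
        return sorted(candidates, key=lambda c: c.score, reverse=True)
```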

The team evaluated the model with automated metrics and human annotations. To measure the system's accuracy automatically, they examined how well SIDE recovered the existing citations in high-quality Wikipedia articles (as defined by Wikipedia's featured-article class).

The researchers found that in nearly 50% of cases, SIDE returned exactly the source already used in Wikipedia as its top suggestion. Notably, this does not mean the other 50% of suggestions are wrong, only that they are not the current Wikipedia sources.
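This "citation recovery" evaluation can be expressed as a simple metric: the fraction of claims for which the system's top-ranked suggestion matches the citation already present in the featured article. The helper below is an illustrative sketch of that computation; the data layout is an assumption, not the paper's evaluation code.

```python
# Sketch of the automatic evaluation described above: how often the system's
# top suggestion matches the citation already used in a featured article.

def citation_recovery_rate(examples: list, suggest_top_citation) -> float:
    """`examples` is a list of (claim, existing_citation_url) pairs;
    `suggest_top_citation(claim)` returns the system's best URL for a claim."""
    matches = sum(1 for claim, gold_url in examples
                  if suggest_top_citation(claim) == gold_url)
    return matches / len(examples)
```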

The team also tested SIDE's capabilities as a citation assistant. In user studies, they placed existing Wikipedia citations next to citations produced by SIDE. Users then evaluated how well each provided citation supported the claim and whether SIDE's citation or Wikipedia's was more suitable for verification.

Overall, users preferred SIDE's citations to Wikipedia's more than 60% of the time, and this proportion rose to more than 80% when SIDE assigned a very low verification score to the existing Wikipedia citation.

Paper link: https://www.nature.com/articles/s42256-023-00726-1

Source: jiqizhixin.com