DeepMind AI with built-in fact-checker makes mathematical discoveries

December 14, 2023


DeepMind’s FunSearch AI can tackle mathematical problems

alengo/Getty Images

Google DeepMind claims to have made the first ever scientific discovery with an AI chatbot by building a fact-checker to filter out useless outputs, leaving only reliable solutions to mathematical or computing problems.

Previous DeepMind achievements, such as using AI to predict the weather or protein shapes, have relied on models created specifically for the task at hand, trained on accurate and specific data. Large language models (LLMs), such as GPT-4 and Google’s Gemini, are instead trained on vast amounts of varied data to create a breadth of abilities. But that approach also makes them susceptible to “hallucination”, a term researchers use for producing false outputs.

Gemini – which was released earlier this month – has already demonstrated a propensity for hallucination, getting even simple facts such as the winners of this year’s Oscars wrong. Google’s previous AI-powered search engine even made errors in the advertising material for its own launch.

One common fix for this phenomenon is to add a layer above the AI that verifies the accuracy of its outputs before passing them to the user. But creating a comprehensive safety net is an enormously difficult task given the broad range of topics that chatbots can be asked about.

Alhussein Fawzi at Google DeepMind and his colleagues have created a generalised LLM called FunSearch, based on Google’s PaLM 2 model, with a fact-checking layer that they call an “evaluator”. The model is constrained to providing computer code that solves problems in mathematics and computer science, which DeepMind says is a much more manageable task because these new ideas and solutions are inherently and quickly verifiable.

The underlying AI can still hallucinate and provide inaccurate or misleading results, but the evaluator filters out erroneous outputs and leaves only reliable, potentially useful concepts.

“We think that perhaps 90 per cent of what the LLM outputs is not going to be useful,” says Fawzi. “Given a candidate solution, it’s very easy for me to tell you whether this is actually a correct solution and to evaluate the solution, but actually coming up with a solution is really hard. And so mathematics and computer science fit particularly well.”

DeepMind claims the model can generate new scientific knowledge and ideas – something LLMs haven’t done before.

To start with, FunSearch is given a problem and a very basic solution in source code as an input, then it generates a database of new solutions that are checked by the evaluator for accuracy. The best of the reliable solutions are given back to the LLM as inputs with a prompt asking it to improve on the ideas. DeepMind says the system produces millions of potential solutions, which eventually converge on an efficient result – sometimes surpassing the best known solution.
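The generate, check and feed-back loop described above can be illustrated with a toy sketch. This is not DeepMind's code: the `propose` function is a random stand-in for the LLM, the `evaluate` function is a made-up scoring rule, and candidates are pairs of numbers rather than programs. All names here are hypothetical.

```python
import random

def evaluate(candidate):
    """Stand-in evaluator: reject invalid candidates, score the valid ones."""
    a, b = candidate
    if a < 0 or b < 0:
        return None  # an unusable output is filtered out, never stored
    # Toy objective (an assumption for illustration): product near 36, small sum.
    return -abs(a * b - 36) - (a + b)

def propose(parents, rng):
    """Stand-in for the LLM: perturb one of the best candidates so far."""
    a, b = rng.choice(parents)
    return (a + rng.randint(-2, 2), b + rng.randint(-2, 2))

def search(seed, rounds=200, keep=5, rng=None):
    """Start from a basic valid solution and keep only checked improvements."""
    rng = rng or random.Random(0)
    database = {seed: evaluate(seed)}  # the seed is assumed to be valid
    for _ in range(rounds):
        # Hand the best checked solutions back to the generator.
        best = sorted(database, key=database.get, reverse=True)[:keep]
        child = propose(best, rng)
        score = evaluate(child)
        if score is not None:
            database[child] = score
    return max(database, key=database.get)

best = search((1, 1))
```

Because the seed stays in the database and only checked candidates are added, the returned solution is always valid and never scores worse than the starting point.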

For mathematical problems, the model writes computer programs that can find solutions rather than trying to solve the problem directly.

Fawzi and his colleagues challenged FunSearch to find solutions to the cap set problem, which involves determining patterns of points where no three points make a straight line. The problem gets rapidly more computationally intensive as the number of points grows. The AI found a solution consisting of 512 points in eight dimensions, larger than any previously known.
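Checking a proposed cap set is simple, which is what makes the problem suitable for an evaluator. In the standard formulation (background that the article does not spell out), points sit on an n-dimensional grid with coordinates taken modulo 3, and three distinct points lie on a line exactly when their coordinates sum to zero modulo 3. Under that assumption, a minimal sketch of a checker and a greedy builder might look like this. The function names and the `priority` hook are illustrative, not DeepMind's code.

```python
from itertools import product

def third_point(a, b):
    """The unique third point on the line through a and b (coordinates mod 3)."""
    return tuple((-(x + y)) % 3 for x, y in zip(a, b))

def is_cap_set(points):
    """True if no three distinct points in the set lie on a line."""
    pts = {tuple(p) for p in points}
    return all(third_point(a, b) not in pts
               for a in pts for b in pts if a != b)

def greedy_cap_set(n, priority=lambda p: 0):
    """Add points in priority order, skipping any that would complete a line."""
    cap = set()
    candidates = sorted(product(range(3), repeat=n), key=priority, reverse=True)
    for p in candidates:
        if all(third_point(p, q) not in cap for q in cap):
            cap.add(p)
    return cap

cap = greedy_cap_set(2)
```

In this sketch, the part a search system would try to improve is the `priority` function, while `is_cap_set` plays the role of the fact-checker.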

When tasked with the bin-packing problem, where the aim is to efficiently place objects of various sizes into containers, FunSearch found solutions that outperform commonly used algorithms – a result that has immediate applications for transport and logistics companies. DeepMind says FunSearch could lead to improvements in many more mathematical and computing problems.
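The article does not name the algorithms FunSearch was compared against. As an illustration only, first fit and best fit are two standard bin-packing heuristics, and the sketch below shows how a single placement rule changes the number of containers used. The `pack` function and its `choose` parameter are hypothetical names, and every item is assumed to be no larger than the container capacity.

```python
def pack(items, capacity, choose):
    """Place items one at a time; return the number of bins used."""
    free = []  # remaining space in each open bin
    for size in items:
        fits = [i for i, space in enumerate(free) if space >= size]
        if fits:
            free[choose(free, fits, size)] -= size
        else:
            free.append(capacity - size)  # open a new bin
    return len(free)

def first_fit(free, fits, size):
    """Use the earliest bin that has room."""
    return fits[0]

def best_fit(free, fits, size):
    """Use the bin that would be left with the least spare room."""
    return min(fits, key=lambda i: free[i] - size)

items = [5, 7, 3, 5]
bins_first = pack(items, 10, first_fit)  # 3 bins
bins_best = pack(items, 10, best_fit)    # 2 bins
```

A system like the one described would search over placement rules of this kind, keeping those that use fewer bins on test inputs.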

Mark Lee at the University of Birmingham, UK, says the next breakthroughs in AI won’t come from scaling up LLMs to ever-larger sizes, but from adding layers that ensure accuracy, as DeepMind has done with FunSearch.

“The strength of a language model is its ability to imagine things, but the problem is hallucinations,” says Lee. “And this research is breaking that problem: it’s reining it in, or fact-checking. It’s a neat idea.”

Lee says AIs shouldn’t be criticised for producing large amounts of inaccurate or useless outputs, as this is not dissimilar to the way that human mathematicians and scientists operate: brainstorming ideas, testing them and following up on the best ones while discarding the worst.

