{"id":260861,"date":"2024-09-25T23:09:14","date_gmt":"2024-09-25T23:09:14","guid":{"rendered":"https:\/\/michigandigitalnews.com\/index.php\/2024\/09\/25\/ais-get-worse-at-answering-simple-questions-as-they-get-bigger\/"},"modified":"2025-06-25T17:11:08","modified_gmt":"2025-06-25T17:11:08","slug":"ais-get-worse-at-answering-simple-questions-as-they-get-bigger","status":"publish","type":"post","link":"https:\/\/michigandigitalnews.com\/index.php\/2024\/09\/25\/ais-get-worse-at-answering-simple-questions-as-they-get-bigger\/","title":{"rendered":"AIs get worse at answering simple questions as they get bigger"},"content":{"rendered":"<div id=\"\">\n<figure class=\"ArticleImage\">\n<div class=\"Image__Wrapper\"><img fetchpriority=\"high\" decoding=\"async\" class=\"Image\" width=\"1350\" height=\"900\" alt=\"\" src=\"https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg\" sizes=\"(min-width: 1288px) 837px, (min-width: 1024px) calc(57.5vw + 55px), (min-width: 415px) calc(100vw - 40px), calc(70vw + 74px)\" srcset=\"https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=300 300w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=400 400w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=500 500w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=600 600w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=700 700w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=800 800w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=837 837w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=900 900w, 
https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1003 1003w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1100 1100w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1200 1200w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1300 1300w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1400 1400w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1500 1500w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1600 1600w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1674 1674w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1700 1700w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1800 1800w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=1900 1900w, https:\/\/images.newscientist.com\/wp-content\/uploads\/2024\/09\/25105114\/SEI_223012460.jpg?width=2006 2006w\" loading=\"eager\" data-image-context=\"Article\" data-image-id=\"2449432\" data-caption=\"Large language models are capable of answering a wide range of questions \u2013 but not always accurately\" data-credit=\"Jamie Jin\/Shutterstock\"\/><\/div><figcaption class=\"ArticleImageCaption\">\n<div class=\"ArticleImageCaption__CaptionWrapper\">\n<p class=\"ArticleImageCaption__Title\">Large language models are capable of answering a wide range of questions \u2013 but not always accurately<\/p>\n<p class=\"ArticleImageCaption__Credit\">Jamie Jin\/Shutterstock<\/p>\n<\/div>\n<\/figcaption><\/figure>\n<p>Large language 
models (LLMs) seem to get less reliable at answering simple questions when they get bigger and learn from human feedback.<\/p>\n<p>AI developers try to improve the power of LLMs in two main ways: scaling up \u2013 giving them more training data and more computational power \u2013 and shaping up, or fine-tuning them in response to human feedback.<\/p>\n<p><a href=\"https:\/\/josephorallo.webs.upv.es\/\">Jos\u00e9 Hern\u00e1ndez-Orallo<\/a> at the Polytechnic University of Valencia, Spain, and his colleagues examined the performance of LLMs as they scaled up and shaped up. They looked at OpenAI\u2019s GPT series of chatbots, Meta\u2019s LLaMA AI models, and BLOOM, developed by a group of researchers called BigScience.<\/p>\n<p>The researchers tested the AIs by posing five types of task: arithmetic problems, solving anagrams, geographical questions, scientific challenges and pulling out information from disorganised lists.<\/p>\n<p>They found that scaling up and shaping up can make LLMs better at answering tricky questions, such as rearranging the anagram \u201cyoiirtsrphaepmdhray\u201d into \u201chyperparathyroidism\u201d. But this isn\u2019t matched by improvement on basic questions, such as \u201cwhat do you get when you add together 24427 and 7120\u201d, which the LLMs continue to get wrong.<\/p>\n<p>While their performance on difficult questions got better, the likelihood that an AI system would avoid answering any one question \u2013 because it couldn\u2019t \u2013 dropped. As a result, the likelihood of an incorrect answer rose.<\/p>\n<p>The results highlight the dangers of presenting AIs as omniscient, as their creators often do, says Hern\u00e1ndez-Orallo \u2013 a claim some users <a href=\"https:\/\/www.newscientist.com\/article\/2442233-using-an-ai-chatbot-or-voice-assistant-makes-it-harder-to-spot-errors\/\">are too ready to believe<\/a>. 
\u201cWe have an overreliance on these systems,\u201d he says. \u201cWe rely on and we trust them more than we should.\u201d<\/p>\n<p>That is a problem because AI models aren\u2019t honest about the extent of their knowledge. \u201cPart of what makes human beings super smart is that sometimes we don\u2019t realise that we don\u2019t know something that we don\u2019t know, but compared to large language models, we are quite good at realising that,\u201d says <a href=\"https:\/\/www.hertford.ox.ac.uk\/staff\/carissa-veliz\">Carissa V\u00e9liz<\/a> at the University of Oxford. \u201cLarge language models do not know the limits of their own knowledge.\u201d<\/p>\n<p>OpenAI, Meta and BigScience didn\u2019t respond to <em>New Scientist<\/em>\u2019s request for comment.<\/p>\n<\/div>\n<p><a href=\"https:\/\/www.newscientist.com\/article\/2449427-ais-get-worse-at-answering-simple-questions-as-they-get-bigger\/?utm_campaign=RSS%7CNSNS&#038;utm_source=NSNS&#038;utm_medium=RSS&#038;utm_content=home\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Large language models (LLMs) 
seem<\/p>\n","protected":false},"author":1,"featured_media":260862,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[177],"tags":[],"_links":{"self":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/posts\/260861"}],"collection":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/comments?post=260861"}],"version-history":[{"count":0,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/posts\/260861\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/media\/260862"}],"wp:attachment":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/media?parent=260861"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/categories?post=260861"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/tags?post=260861"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}