{"id":535884,"date":"2026-03-12T16:30:29","date_gmt":"2026-03-12T16:30:29","guid":{"rendered":"https:\/\/www.newsbeep.com\/au\/535884\/"},"modified":"2026-03-12T16:30:29","modified_gmt":"2026-03-12T16:30:29","slug":"ai-doesnt-see-the-way-that-you-do-and-that-could-be-a-problem-when-it-categorizes-objects-and-scenes","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/au\/535884\/","title":{"rendered":"AI doesn\u2019t \u2018see\u2019 the way that you do, and that could be a problem when it categorizes objects and scenes"},"content":{"rendered":"<p>Even with no fur in frame, you can easily see that a photo of a hairless Sphynx cat depicts a cat. You wouldn\u2019t mistake it for an elephant. <\/p>\n<p>But many artificial intelligence vision systems would. Why? Because when AI systems learn to categorize objects, they often rely on visual cues \u2013 like <a href=\"https:\/\/doi.org\/10.48550\/arXiv.1811.12231\" rel=\"nofollow noopener\" target=\"_blank\">surface texture<\/a> or simple patterns in pixels. 
This tendency leaves them vulnerable to small changes that have little effect on human perception.<\/p>\n<p>A vision system aligned more closely with human perception \u2013 one that emphasizes shape, for instance \u2013 might still mistake the cat for another similarly shaped mammal, such as a tiger, but it is unlikely to suggest an elephant.<\/p>\n<p>The kinds of mistakes an AI makes reveal how it organizes visual information, with potential limitations that become concerning in higher-stakes settings.<\/p>\n<p>            <a href=\"https:\/\/images.theconversation.com\/files\/722834\/original\/file-20260309-58-902y88.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=1000&amp;fit=clip\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" alt=\"red stop sign with stickers and graffiti\" src=\"https:\/\/www.newsbeep.com\/au\/wp-content\/uploads\/2026\/03\/file-20260309-58-902y88.jpg\" class=\"native-lazy\" loading=\"lazy\"  \/><\/a><\/p>\n<p>              Stickers and graffiti on a stop sign could serve as an adversarial attack, confusing AI in autonomous vehicles.<br \/>\n              <a class=\"source\" href=\"https:\/\/www.flickr.com\/photos\/spine\/263214639\" rel=\"nofollow noopener\" target=\"_blank\">rick\/Flickr<\/a>, <a class=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by\/4.0\/\" rel=\"nofollow noopener\" target=\"_blank\">CC BY<\/a><\/p>\n<p>Imagine an autonomous vehicle approaching a vandalized stop sign. 
While a human driver recognizes the sign from its shape and context, an AI that relies on pixel patterns may misclassify it, pushing the altered sign <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2307.08278\" rel=\"nofollow noopener\" target=\"_blank\">out of the category \u201csign\u201d altogether<\/a> and into a different group of images that it identifies as similar, such as a billboard, advertisement or other roadside object.<\/p>\n<p>Together, these problems point to a misalignment between how humans perceive the visual world and how AI represents it. <\/p>\n<p><a href=\"https:\/\/scholar.google.com\/citations?user=Z9t4ZG0AAAAJ&amp;hl=en&amp;oi=ao\" rel=\"nofollow noopener\" target=\"_blank\">We are experts in visual perception<\/a>, and <a href=\"https:\/\/scholar.google.com\/citations?user=820ooMMAAAAJ&amp;hl=en&amp;oi=ao\" rel=\"nofollow noopener\" target=\"_blank\">we work at the intersection of human<\/a> <a href=\"https:\/\/scholar.google.com\/citations?user=RzltjfkAAAAJ&amp;hl=en&amp;oi=ao\" rel=\"nofollow noopener\" target=\"_blank\">and machine perception<\/a>. People organize visual input into objects, meaning and relationships shaped by experience and context. AI models don\u2019t organize visual information the same way. This key difference explains why AI sometimes fails in surprising ways.<\/p>\n<p>Seeing objects, not features<\/p>\n<p>Imagine that in front of you is a small, opaque object with both straight and curved edges. But you don\u2019t see those features; you just see your coffee mug.<\/p>\n<p>Vision isn\u2019t a camera, passively recording the world. 
Instead, your brain rapidly turns the light your eyes absorb into objects you recognize and understand, organizing experience into <a href=\"https:\/\/mitpress.mit.edu\/9780262514620\/vision\/\" rel=\"nofollow noopener\" target=\"_blank\">structured mental representations<\/a>.<\/p>\n<p>Researchers can understand how these representations are structured by examining how people <a href=\"https:\/\/doi.org\/10.1038\/s41562-024-01980-y\" rel=\"nofollow noopener\" target=\"_blank\">judge similarity<\/a>. Your coffee mug is not like your computer, but it\u2019s similar to a glass of water despite differences in appearance. That judgment reflects how the mug is mentally represented: not just in terms of appearance, but also what the mug is used for and how it fits into everyday activities.<\/p>\n<p>            <a href=\"https:\/\/images.theconversation.com\/files\/722924\/original\/file-20260309-71-1egatg.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=1000&amp;fit=clip\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" alt=\"clear glass of water next to white ceramic mug in saucer on table\" src=\"https:\/\/www.newsbeep.com\/au\/wp-content\/uploads\/2026\/03\/file-20260309-71-1egatg.jpg\" class=\"native-lazy\" loading=\"lazy\"  \/><\/a><\/p>\n<p>              Very alike in how you use them; less similar in looks.<br \/>\n              <a class=\"source\" href=\"https:\/\/www.gettyimages.com\/detail\/photo\/cup-of-coffee-and-a-glass-of-water-on-wooden-table-royalty-free-image\/1489553437\" rel=\"nofollow noopener\" target=\"_blank\">Oscar Wong\/Moment via Getty Images<\/a><\/p>\n<p>Importantly, the mental organization of representations is flexible. The aspects of an object that stand out change with <a href=\"https:\/\/doi.org\/10.1016\/0010-0285(83)90012-9\" rel=\"nofollow noopener\" target=\"_blank\">context and goals<\/a>. If you\u2019re packing a moving box, shape and size matter most, so your mug might be placed anywhere it fits. 
But when you\u2019re putting it away in a cupboard, it goes next to other drinkware. The mug hasn\u2019t changed, only the way it is organized in your mind.<\/p>\n<p>Human visual perception is adaptive, driven by meaning and tied to how we interact with the world. <\/p>\n<p>Aligning AI with humans<\/p>\n<p>AI systems, however, organize visual input in fundamentally different ways than people \u2013 not because they are machines, but because of how narrowly they are trained. When an AI is trained to categorize a cat or an elephant, it only needs to learn which visual patterns lead to the correct label, not how those animals relate to each other or fit into the broader world.<\/p>\n<p>In contrast, humans learn within a broader context. When we learn what an elephant is, we weave that representation into the tapestry of everything else we have learned: animals, size, habitats and more. Because AI is graded only on label accuracy, it can rely on shortcuts that work in training but sometimes <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2004.07780\" rel=\"nofollow noopener\" target=\"_blank\">fail in the real world<\/a>.<\/p>\n<p><a href=\"https:\/\/doi.org\/10.48550\/arXiv.2310.13018\" rel=\"nofollow noopener\" target=\"_blank\">Representational alignment<\/a> refers to whether AI organizes information in ways that resemble how people do. It\u2019s not to be confused with <a href=\"https:\/\/doi.org\/10.1007\/s11023-020-09539-2\" rel=\"nofollow noopener\" target=\"_blank\">value alignment<\/a>, the challenge of making sure AI systems pursue outcomes and goals that humans intend.<\/p>\n<p>Because human learning embeds new information into a web of prior knowledge, the relationships between new and existing concepts can be studied and measured. 
This means that representational alignment may be a solvable problem and a step toward addressing broader alignment challenges.<\/p>\n<p>One approach to representational alignment focuses on building AI systems that behave like humans on psychological tasks, allowing researchers to compare representations directly. For example, if people judge a cat as more similar to a dog than to an elephant, the goal is to build AI models that arrive at those same judgments.<\/p>\n<p>One <a href=\"https:\/\/www.proquest.com\/openview\/2c45dc4f5d789ed146fdd941d27eb99a\/1?pq-origsite=gscholar&amp;cbl=18750&amp;diss=y\" rel=\"nofollow noopener\" target=\"_blank\">promising technique<\/a> involves <a href=\"https:\/\/doi.org\/10.1007\/s42113-020-00073-z\" rel=\"nofollow noopener\" target=\"_blank\">training AI on human similarity judgments<\/a> collected in the lab. In these studies, human participants might be shown three images and asked which two objects are more similar; for example, whether a mug is more like a glass or a bowl. 
Including this data during training encourages AI systems to learn how objects relate to one another, producing representations that better reflect how people understand the world.<\/p>\n<p>            <a href=\"https:\/\/images.theconversation.com\/files\/722926\/original\/file-20260309-71-5a2m3p.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=1000&amp;fit=clip\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" alt=\"view from behind of man looking at X-rays of chest and other body parts\" src=\"https:\/\/www.newsbeep.com\/au\/wp-content\/uploads\/2026\/03\/file-20260309-71-5a2m3p.jpg\" class=\"native-lazy\" loading=\"lazy\"  \/><\/a><\/p>\n<p>              Health care providers want AI systems that flag real issues, without a lot of misses or false positives.<br \/>\n              <a class=\"source\" href=\"https:\/\/www.gettyimages.com\/detail\/photo\/doctor-examining-xray-images-on-wall-royalty-free-image\/1134298909\" rel=\"nofollow noopener\" target=\"_blank\">REB Images\/Connect Images via Getty Images<\/a><\/p>\n<p>Alignment beyond vision<\/p>\n<p>Representational alignment matters <a href=\"https:\/\/representational-alignment.github.io\/2025\/\" rel=\"nofollow noopener\" target=\"_blank\">beyond vision systems<\/a>, and AI researchers are <a href=\"https:\/\/doi.org\/10.1038\/s41586-025-09631-6\" rel=\"nofollow noopener\" target=\"_blank\">taking notice<\/a>. As AI increasingly supports high-stakes decisions, differences between how machines and humans represent the world will have real consequences, even when an AI system <a href=\"https:\/\/www.uchicagomedicine.org\/forefront\/research-and-discoveries-articles\/artificial-intelligence-models-to-analyze-cancer-images-can-take-shortcuts-that-introduce-bias-for-minority-patients\" rel=\"nofollow noopener\" target=\"_blank\">appears highly accurate<\/a>. 
For example, if an AI analyzing medical images learns to associate the source of an image or repeated image artifacts with disease rather than the real visual signs of the disease itself, it can appear accurate in testing yet fail when those shortcut cues are absent.<\/p>\n<p>AI doesn\u2019t necessarily need to process information exactly the way people think, but training AI using principles drawn from human perception and cognition \u2013 such as similarity, context and relational structure \u2013 can lead to safer, more accurate and more ethical systems.<\/p>\n","protected":false},"excerpt":{"rendered":"Even with no fur in frame, you can easily see that a photo of a hairless Sphynx cat&hellip;\n","protected":false},"author":2,"featured_media":535885,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[256,254,255,64,63,105],"class_list":{"0":"post-535884","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-au","12":"tag-australia","13":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts\/535884","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/comments?post=535884"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts\/535884\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/media\/535885"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/media?parent=535884"}]
,"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/categories?post=535884"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/tags?post=535884"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}