{"id":153278,"date":"2025-11-22T06:17:31","date_gmt":"2025-11-22T06:17:31","guid":{"rendered":"https:\/\/www.newsbeep.com\/ie\/153278\/"},"modified":"2025-11-22T06:17:31","modified_gmt":"2025-11-22T06:17:31","slug":"apple-machine-learning-research-at-neurips-2025","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/ie\/153278\/","title":{"rendered":"Apple Machine Learning Research at NeurIPS 2025"},"content":{"rendered":"<p>Apple researchers advance AI and ML through fundamental research, and to support the broader research community and help accelerate progress in this field, we share much of this work through publications and engagement at conferences.<\/p>\n<p>Next month, the 39th annual <a href=\"https:\/\/neurips.cc\" target=\"_blank\" aria-label=\"Conference on Neural Information Processing Systems (NeurIPS) - Opens in a new window\" class=\"icon icon-after icon-external\" rel=\"noopener nofollow\">Conference on Neural Information Processing Systems (NeurIPS)<\/a>, will be held in San Diego, California, with a satellite event also taking place in Mexico City, Mexico. Apple is proud to once again to participate in this important event for the community and to support it with our sponsorship.<\/p>\n<p>At the main conference and associated workshops, Apple researchers will present many papers across a variety of topics in ML. As highlighted below, this includes new works <a href=\"#advancing-privacy\">advancing privacy-preserving ML<\/a>, <a href=\"#understanding-reasoning\">understanding the strengths and limitations of reasoning models<\/a>, sharing <a href=\"#gen-ai\">innovative approaches to generative AI<\/a>, and detailing <a href=\"#data-mixtures\">a principled approach to determining training data mixtures<\/a>.<\/p>\n<p>NeurIPS attendees will be able to experience <a href=\"#demos\">demonstrations of Apple\u2019s ML research<\/a> in our booth # 1103, during exhibition hours. 
Apple is also sponsoring and participating in a number of <a href=\"#supporting\">affinity group-hosted events<\/a> that support underrepresented groups in the ML community. A comprehensive overview of Apple\u2019s participation in and contributions to NeurIPS 2025 can be found <a href=\"https:\/\/machinelearning.apple.com\/updates\/apple-at-neurips-2025\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">here<\/a>, and a selection of highlights follows below.<\/p>\n<p>Advancing Privacy-Preserving ML<\/p>\n<p>At Apple, we believe privacy is a fundamental human right, and advancing privacy-preserving techniques in AI and ML is an important area of ongoing research. The work Apple researchers will present at NeurIPS this year includes several papers sharing progress in this area.<\/p>\n<p>Accurately estimating a discrete distribution from samples is a fundamental task in statistical ML. Measuring accuracy by the Kullback-Leibler (KL) divergence error is useful for promoting diversity and smoothness in the estimated distribution, and is important in a range of contexts, including data compression, speech recognition, and language modeling. In the Spotlight paper, <a href=\"https:\/\/machinelearning.apple.com\/research\/instance-optimality\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">Instance-Optimality for Private KL Distribution Estimation<\/a>, Apple researchers explore how to estimate probability distributions accurately while protecting privacy. The work focuses on instance-optimality &#8211; designing algorithms that adapt to each specific dataset and perform nearly as well as the best possible method for that case. 
The paper shares new algorithms that achieve this balance both with and without differential privacy, showing that distributions can be estimated accurately under KL error, while mathematically guaranteeing that no single person\u2019s data can be inferred.<\/p>\n<p>In differential privacy, randomizing which data points are used in computations can amplify privacy, making it more difficult to connect data to an individual. In the Spotlight paper, <a href=\"https:\/\/machinelearning.apple.com\/research\/privacy-amplification\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">Privacy Amplification by Random Allocation<\/a>, Apple researchers analyze a new sampling strategy referred to as random allocation. In this sampling scheme, a user\u2019s data is used in k steps chosen randomly and uniformly from a sequence (or set) of t steps. The paper provides the first theoretical guarantees and numerical estimation algorithms for this scheme. This allows for better privacy analyses (and hence better privacy-utility tradeoffs) for a host of important algorithms, such as popular variants of differentially private SGD and algorithms for efficient secure aggregation, including those presented in <a href=\"https:\/\/machinelearning.apple.com\/research\/preamble\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">PREAMBLE: Private and Efficient Aggregation via Block Sparse Vectors<\/a>, another paper that Apple researchers will present at NeurIPS this year.<\/p>\n<p>Understanding the Strengths and Limitations of Reasoning Models<\/p>\n<p>Reasoning is an important capability for AI, enabling systems to accomplish complex objectives that require planning and multiple steps &#8211; such as solving math and coding problems, as well as tasks for robots and virtual assistants. 
While the field has made significant progress in developing reasoning models, fundamental research that rigorously investigates the strengths and limitations of current approaches is essential to further advancing this capability for the future.<\/p>\n<p>At NeurIPS, Apple researchers will present <a href=\"https:\/\/machinelearning.apple.com\/research\/illusion-of-thinking\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity<\/a>, which explores how current AI models handle complex reasoning tasks. With controllable puzzle environments, the work systematically tests how these models\u2019 performance changes as problems increase in complexity (see <a href=\"#figure1\">Figure 1<\/a>). The paper shows that the accuracy of frontier Large Reasoning Models (LRMs) collapses beyond certain complexities, and finds that LRMs\u2019 reasoning effort increases along with the complexity of a challenge &#8211; up to a point &#8211; and then it declines, despite having a sufficient token budget. The work also compares the performance of LRMs and LLMs with equal inference compute, finding that LLMs outperform LRMs for low-complexity tasks, LRMs show an advantage in medium-complexity tasks, and both types fail for high-complexity tasks. 
The paper provides insight into LRMs\u2019 strengths and limitations, raising crucial questions about these models\u2019 reasoning capabilities today, which may ultimately illuminate opportunities to make LRMs more capable in the future.<\/p>\n<p>One of the authors of the above paper will also deliver an <a href=\"https:\/\/neurips.cc\/virtual\/2025\/loc\/san-diego\/128665\" target=\"_blank\" aria-label=\"Expo Talk on the topic of reasoning - Opens in a new window\" class=\"icon icon-after icon-external\" rel=\"noopener nofollow\">Expo Talk on the topic of reasoning<\/a> on Tuesday, December 2, at 8:30am PST in the Upper Level Ballroom 20AB. The talk will provide a critical review of reasoning in language models, highlight why current evaluations can be misleading, and emphasize that reasoning is not just about \u201cwhat\u201d models answer, but \u201chow\u201d they solve problems.<\/p>\n<p><a href=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/main_figure_f794f49488.png\" tabindex=\"-1\" target=\"_blank\" class=\"mt-0\"><img decoding=\"async\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/main_figure_f794f49488.png\" loading=\"lazy\" class=\"bg-gray-light\"\/><\/a>Figure 1: Our setup enables verification of both final answers and intermediate reasoning traces, allowing detailed analysis of model thinking behavior.<\/p>\n<p>Innovative Approaches to Generative AI<\/p>\n<p>The industry has made impressive progress in high-resolution image generation models, but the dominant approaches also have undesirable characteristics. 
Diffusion models are computationally expensive in both training and inference; autoregressive generative models can be expensive at inference and require quantization that can adversely affect their output\u2019s fidelity; and hybrid models that apply autoregressive techniques directly in continuous space are complex.<\/p>\n<p>In the NeurIPS Spotlight paper, <a href=\"https:\/\/machinelearning.apple.com\/research\/starflow\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis<\/a>, Apple researchers share a scalable approach that generates high-resolution images of comparable quality (see <a href=\"#figure2\">Figure 2<\/a>), without the computational cost and complexity of prior methods. This method builds on the Transformer Autoregressive Flow (<a href=\"https:\/\/machinelearning.apple.com\/research\/normalizing-flows\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">TARFlow<\/a>), which combines normalizing flows (NF) and the autoregressive transformer architecture. STARFlow produces images at resolutions and quality levels previously thought unreachable for NF models, rivaling top diffusion and autoregressive methods while maintaining exact likelihood modeling and faster inference. 
This work is the first successful demonstration of normalizing flows at this scale and resolution, and it shows that normalizing flows are a powerful alternative to diffusion models for AI image generation.<\/p>\n<p><a href=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/t2i_examples_variable_128d878973.jpeg\" tabindex=\"-1\" target=\"_blank\" class=\"mt-0\"><img decoding=\"async\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/t2i_examples_variable_128d878973.jpeg\" loading=\"lazy\" class=\"bg-gray-light\"\/><\/a>Figure 2: Generated samples from our model with variable aspect ratios.<\/p>\n<p>As generative AI models become more widely used, efficient methods to control their generations &#8211; for example, to ensure they produce safe content or provide users with the ability to explore style changes &#8211; are becoming increasingly important. Ideally, these methods should maintain output quality, and not require a large amount of data or computational cost at training or inference time.<\/p>\n<p>Apple researchers have <a href=\"https:\/\/machinelearning.apple.com\/research\/transporting-activations\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">previously demonstrated<\/a> that an effective and efficient approach to this challenge is intervening exclusively on model activations, with the goal of correcting distributional differences between activations seen when using prompts from a source vs. a target set (e.g. toxic and non-toxic sentences). 
At NeurIPS, Apple researchers will present <a href=\"https:\/\/machinelearning.apple.com\/research\/end-to-end-learning\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">LinEAS: End-to-end Learning of Activation Steering with a Distributional Loss<\/a>, which describes linear end-to-end activation steering (LinEAS), an approach trained with a global loss that accounts simultaneously for all layer-wise distributional shifts (see <a href=\"#figure3\">Figure 3<\/a>). LinEAS only requires a handful of unpaired samples to be effective, and beats similar baselines on toxicity mitigation in language models. Its global optimization allows including a sparsity regularization, resulting in more precise and targeted interventions that remain effective while preserving the base model\u2019s fluency. This method is modality-agnostic and is shown to outperform existing activation-steering methods at mitigating and inducing new concepts at the output of single-step text-to-image generation models.<\/p>\n<p><a href=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/Lin_EAS_fig_1_fc2912b6dc.jpg\" tabindex=\"-1\" target=\"_blank\" class=\"mt-0\"><img decoding=\"async\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/Lin_EAS_fig_1_fc2912b6dc.jpg\" loading=\"lazy\" class=\"bg-gray-light\"\/><\/a>Figure 3: LinEAS learns lightweight maps to steer pretrained model activations. With LinEAS, we gain fine-grained control on text-to-image generation to induce precise styles (in the figure) or remove objects. The same procedure also allows controlling LLMs.<\/p>\n<p>A Principled Approach to Determining Training Data Mixtures<\/p>\n<p>Large foundation models are typically trained on data from multiple domains, and the data mixture &#8211; the proportion of each domain used in training &#8211; plays a critical role in model performance. 
The standard approach to selecting this mixture relies on trial and error, which becomes impractical for large-scale pretraining.<\/p>\n<p>At NeurIPS, Apple researchers will present <a href=\"https:\/\/machinelearning.apple.com\/research\/optimal-data-mixtures\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">Scaling Laws for Optimal Data Mixtures<\/a>, which provides a better approach to this fundamental challenge. The paper shares a systematic method to determine the optimal data mixture for any target domain using scaling laws (see <a href=\"#figure4\">Figure 4<\/a>). The scaling law predicts the loss of a model of size N trained on D tokens with a domain mixture h. The paper shows that these scaling laws are universal, and demonstrates their predictive power for large-scale pretraining of large language models (LLMs), native multimodal models (NMMs), and large vision models (LVMs). It also shows that these scaling laws can extrapolate to new data mixtures and across scales: their parameters can be accurately estimated using a few small-scale training runs, and used to estimate the performance at larger scales and unseen domain weights. The scaling laws allow practitioners to derive the optimal domain weights for any target domain under a given training budget (N, D), providing a principled alternative to costly trial-and-error methods.<\/p>\n<p><a href=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/Screenshot_2025_09_15_at_18_03_19_1c2361c269.png\" tabindex=\"-1\" target=\"_blank\" class=\"mt-0\"><img decoding=\"async\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/Screenshot_2025_09_15_at_18_03_19_1c2361c269.png\" loading=\"lazy\" class=\"bg-gray-light\"\/><\/a>Figure 4: Left: We derive scaling laws that predict the loss of a model as a function of model size N, number of training tokens D, and the domain weights used to train the model (represented by the color of each point). 
The scaling law is fitted with small-scale runs with different domain weights, and used to accurately predict the loss of large-scale models trained with new, unseen domain weights. Right: We find the data mixture scaling law based on small-scale experiments (e.g., below 1B parameters) and use it to predict the optimal data mixture at larger scales (e.g., 8B parameters). Both our additive and joint laws lead to similar performance, and both outperform other mixtures (in the gray area).<\/p>\n<p>Demonstrating ML Research in the Apple Booth<\/p>\n<p>During exhibition hours, NeurIPS attendees will be able to interact with live demos of Apple ML research in booth #1103. These include:<\/p>\n<p><a href=\"https:\/\/mlx-framework.org\" target=\"_blank\" aria-label=\"MLX - Opens in a new window\" class=\"icon icon-after icon-external\" rel=\"noopener nofollow\">MLX<\/a> &#8211; an open source array framework designed for Apple silicon that enables fast and flexible ML and scientific computing on Apple hardware. The framework is optimized for Apple silicon\u2019s unified memory architecture and leverages both the CPU and GPU. Visitors will be able to experience two MLX demos:<\/p>\n<p>Image generation with a large diffusion model on an iPad Pro with the M5 chip<br \/>\nDistributed compute with MLX and Apple silicon: Visitors will be able to explore text and code generation with a 1 trillion-parameter model running in Xcode on a cluster of four Mac Studios equipped with M3 Ultra chips, each operating with 512 GB of unified memory.<\/p>\n<p><a href=\"https:\/\/machinelearning.apple.com\/research\/fastvlm-efficient-vision-encoding\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">FastVLM<\/a> &#8211; a family of mobile-friendly vision language models, built using MLX. These models use a mix of CNN and Transformer architectures for vision encoding designed specifically for processing high-resolution images. 
Together, they demonstrate an approach that achieves a strong balance between accuracy and speed. Visitors will get to experience a real-time visual question-and-answer demo on iPhone 17 Pro Max.<\/p>\n<p>Supporting the ML Research Community<\/p>\n<p>Apple is committed to supporting underrepresented groups in the ML community, and we are proud to again sponsor several affinity groups hosting events onsite at NeurIPS 2025 in San Diego, including Women in Machine Learning (WiML) (<a href=\"https:\/\/sites.google.com\/wimlworkshop.org\/wimlworkshopneurips2025\/home\" target=\"_blank\" aria-label=\"workshop - Opens in a new window\" class=\"icon icon-after icon-external\" rel=\"noopener nofollow\">workshop<\/a> on December 2), LatinX in AI (<a href=\"https:\/\/www.latinxinai.org\/neurips-2025?srsltid=AfmBOooG5eApVP7J5eUKAD_MRCdWkgyWe9CiYQWPX2Lg4odXUH4K7Jql\" target=\"_blank\" aria-label=\"workshop - Opens in a new window\" class=\"icon icon-after icon-external\" rel=\"noopener nofollow\">workshop<\/a> on December 2), and Queer in AI (<a href=\"https:\/\/www.queerinai.com\/neurips-2025\" target=\"_blank\" aria-label=\"workshop and evening social - Opens in a new window\" class=\"icon icon-after icon-external\" rel=\"noopener nofollow\">workshop and evening social<\/a> on December 4). 
In addition to supporting these workshops with sponsorship, Apple employees will also be participating at each of these, as well as other events taking place during the conference.<\/p>\n<p>Learn More about Apple ML Research at NeurIPS 2025<\/p>\n<p>This post highlights just a handful of the works Apple ML researchers will present at NeurIPS 2025, and a comprehensive overview and schedule of our participation can be found <a href=\"https:\/\/machinelearning.apple.com\/updates\/apple-at-neurips-2025\" class=\"more\" rel=\"nofollow noopener\" target=\"_blank\">here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"Apple researchers advance AI and ML through fundamental research, and to support the broader research community and help&hellip;\n","protected":false},"author":2,"featured_media":85030,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[220,218,219,61,60,80],"class_list":{"0":"post-153278","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-ie","12":"tag-ireland","13":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/posts\/153278","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/comments?post=153278"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/posts\/153278\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/media\/85030"}],"wp:
attachment":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/media?parent=153278"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/categories?post=153278"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/tags?post=153278"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}