{"id":354333,"date":"2026-03-23T20:23:09","date_gmt":"2026-03-23T20:23:09","guid":{"rendered":"https:\/\/www.newsbeep.com\/il\/354333\/"},"modified":"2026-03-23T20:23:09","modified_gmt":"2026-03-23T20:23:09","slug":"startup-gimlet-labs-is-solving-the-ai-inference-bottleneck-in-a-surprisingly-elegant-way","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/il\/354333\/","title":{"rendered":"Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way"},"content":{"rendered":"<p id=\"speakable-summary\" class=\"wp-block-paragraph\">Stanford adjunct professor and successfully exited founder Zain Asgar just raised an $80 million Series A for a startup that solves the AI inference bottleneck problem in an astute way. The round was led by Menlo Ventures.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">The company, <a rel=\"nofollow noopener\" href=\"https:\/\/gimletlabs.ai\" target=\"_blank\">Gimlet\u202fLabs<\/a>, has created what it claims is the first and only \u201cmulti-silicon inference cloud,\u201d software that allows an AI workload to run simultaneously across diverse types of hardware. 
It can split an AI app\u2019s work across both traditional CPUs and AI-tuned GPUs, as well as high-memory systems.\u00a0\u00a0<\/p>\n<p class=\"wp-block-paragraph\">\u201cWe basically run across whatever different hardware that\u2019s available,\u201d Asgar told TechCrunch.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">A single agent may chain together multiple steps, and each \u201crequires different hardware: Inference is compute-bound; decode is memory-bound; and tool calls are network-bound,\u201d writes lead investor, Menlo\u2019s Tim Tully, in a blog post about the funding.\u00a0\u00a0<\/p>\n<p class=\"wp-block-paragraph\">No chip yet does it all, but as new hardware gets rolled out, and aging GPUs get redeployed, \u201cthe multi-silicon fleet is ready \u2014 it\u2019s just missing the software layer to make it work.\u201d That\u2019s what Tully believes Gimlet\u202fLabs\u202foffers.<\/p>\n<p class=\"wp-block-paragraph\">If the current deploy-more-compute trend continues, <a rel=\"nofollow noopener\" href=\"https:\/\/www.mckinsey.com\/industries\/technology-media-and-telecommunications\/our-insights\/the-cost-of-compute-a-7-trillion-dollar-race-to-scale-data-centers\" target=\"_blank\">McKinsey estimates<\/a> data center spending will tally nearly $7 trillion by 2030. Asgar says that apps are only using the existing hardware already deployed \u201csomewhere between 15 to 30 percent\u201d of the time.\u00a0\u00a0<\/p>\n<p class=\"wp-block-paragraph\">\u201cAnother way to think about this: you\u2019re wasting hundreds of billions of dollars because you\u2019re just leaving idle resources,\u201d he said. 
\u201cOur goal was basically to try to figure out how you can get AI workloads to be 10x more efficient than ever, today.\u201d\u00a0<\/p>\n<p class=\"wp-block-paragraph\">So he and his cofounders, Michelle Nguyen, Omid Azizi, and Natalie Serrino, set about building orchestration software that slices up agentic workloads so that they can be simultaneously spread across all kinds of hardware.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">Gimlet Labs claims it reliably speeds up AI inference by 3x to 10x for the same cost and power. Gimlet\u202fsays it can even slice the underlying model so that it runs across different architectures, using the best chip for each portion of the model.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">The company has already partnered with chip makers NVIDIA, AMD, Intel, ARM, Cerebras and d-Matrix.\u00a0\u00a0<\/p>\n<p class=\"wp-block-paragraph\">Gimlet\u2019s product, delivered either as software or through an API to its own Gimlet Cloud, isn\u2019t for the rank-and-file AI app developer. It\u2019s for the largest AI model labs and data centers.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">The company publicly launched <a href=\"https:\/\/finance.yahoo.com\/news\/gimlet-labs-emerges-stealth-8-120300188.html\" rel=\"nofollow noopener\" target=\"_blank\">in October<\/a> with, it said, eight-figure revenues out of the gate (so at least $10 million). Asgar said that his customer base has more than doubled in the last four months and now includes a major model maker and an extremely large cloud computing company, although he declined to name them.\u00a0\u00a0<\/p>\n<p class=\"wp-block-paragraph\">The cofounders had previously worked together at Pixie, a startup that created an open source observability tool for Kubernetes. 
Pixie was <a href=\"https:\/\/techcrunch.com\/2020\/12\/10\/new-relic-acquires-kubernetes-observability-platform-pixie-labs\/\" rel=\"nofollow noopener\" target=\"_blank\">acquired<\/a> by New Relic in 2020, just two months after it launched with a $9 million Series A led by Benchmark. (Pixie\u2019s tech is now part of the open source org that oversees Kubernetes.)\u00a0\u00a0<\/p>\n<p class=\"wp-block-paragraph\">After Asgar randomly ran into Tully about a year ago and also received angel investments from Stanford professors, VCs started calling. After launch, a term sheet landed on Asgar\u2019s desk. When VCs heard Asgar was looking at offers, \u201cwe got a pretty big swarm of funding,\u201d and the round was quickly oversubscribed, he said.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">With the previous seed, the startup has now raised a total of $92 million, including from a slew of angels like Sequoia\u2019s Bill Coughran, Stanford Professor Nick McKeown, former CEO of VMware Raghu Raghuram and Intel CEO Lip-Bu Tan. 
The company currently employs 30 people.<\/p>\n<p class=\"wp-block-paragraph\">Other investors include Factory, who led the seed, Eclipse Ventures, Prosperity7 and Triatomic.<\/p>\n","protected":false},"excerpt":{"rendered":"Stanford adjunct professor and successfully exited founder Zain Asgar just raised an $80 million Series A for a&hellip;\n","protected":false},"author":2,"featured_media":354334,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[345,343,344,85,46,125],"class_list":{"0":"post-354333","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-il","12":"tag-israel","13":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/posts\/354333","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/comments?post=354333"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/posts\/354333\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/media\/354334"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/media?parent=354333"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/categories?post=354333"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/il\/wp-json\/wp\/v2\/tags?post=354333"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","te
mplated":true}]}}