🍦 AI Just Solved Math Problems Humans Couldn't For 80 Years
openai's reasoning model, deepmind's alphaproof, and anthropic's mythos each cracked a conjecture that's been open since 1946. mathematicians are not handling it well
hey everyone, foma ice cream guy is back with some more ai news, so this week the pope wrote to the ai industry to slow down, of course everyone listened and spacex didnt just immediately approached cursor with $60 billion, then three frontier labs simultaneously claimed to have solved an 80-year-old erdős conjecture and made the math community furious, anthropic raised $30 billion to keep paying elon’s compute bill, and starbucks fired its ai inventory robot for being bad at counting milk. anywayyyyyy, here are the top stories.
pope leo xiv published an entire encyclical telling the ai industry to take a cold shower
pope leo xiv dropped his first encyclical and the topic was, of all things, ai. magnifica humanitas: on safeguarding the human person in the time of artificial intelligence runs the full vatican playbook — autonomous weapons are “beyond human control,” governments should slow ai development, ai data should not be solely privately owned, workers’ rights, kids’ rights, and ai companies should “reduce competition” with each other in the name of humanity. five separate newsletters led with this, including one whose subject line was literally “ai vs the pope.” and yes, anthropic’s chris olah was reportedly in the room when it dropped, which is the kind of detail you cannot make up — a frontier-lab interpretability researcher attending a papal text-drop on whether his employer is contributing to spiritual decline. local pontiff reportedly thrilled to be the only public figure in 2026 willing to issue a binding moral statement on agentic systems, noting that the ten commandments were also originally written before transformer architectures. the funniest part is that “reduce competition between ai companies” is the one ask the labs will absolutely comply with, just not in the way he meant it.
spacex is trying to buy cursor for $60 billion and cursor just hit $3b in arr
cursor is reportedly in talks to be acquired by spacex for $60 billion, days after announcing it crossed $3 billion in annualized revenue. xai has already been renting compute from spacex’s colossus cluster to train cursor’s models, and now elon would like to just buy the customer instead of selling to it. the same week, elon also confirmed grok v9-medium finished training on “significant amounts of cursor data,” which is one way to phrase “we trained our model on the editor we are about to acquire.” composer 2.5 was last week’s story; this week the bigger plot is that the most successful ai-native dev tool of the cycle might end up living inside a rocket company that owns a model lab that owns a social network. the actual product surface stays the same. the ownership chart looks like a mind map drawn by someone who took ketamine. somewhere a sequoia partner is updating the cursor entry on his portfolio page from “growth-stage” to “vertically integrated under one billionaire.”
openai, deepmind, and anthropic all claimed to solve erdős this week and the math community is fuming
three frontier labs released math wins on the same week. openai’s internal reasoning model disproved the erdős unit-distance conjecture — an 80-year-old open problem in discrete geometry — by applying algebraic number theory to a geometric question nobody asked it to. deepmind’s alphaproof nexus (gemini 3.1 pro paired with the lean formal proof assistant) autonomously solved nine open erdős problems and proved 44 open oeis conjectures, at “a few hundred dollars” of inference cost per problem, two of them unsolved for 56 years. anthropic’s claude mythos also took a swing at the same unit-distance conjecture using parallel hypothesis exploration. then a math grad student took to twitter to call openai’s announcement “exceedingly tacky and in bad taste,” noting the proof was previously considered “unapproachable” and that turning it into a launch tweet was, somehow, the actual cultural innovation. this is the moment math became another benchmark — three labs racing to claim the same 80-year-old conjecture, the proofs themselves now competing with the press releases, and the discipline that took its time for a century learning what it feels like to have an exec deck attached to its open problems. the era of patient mathematics ended on a tuesday.
anthropic is raising $30b at a $900b valuation while spending $1.25b a month on someone else’s gpus
anthropic is reportedly raising over $30 billion at a $900 billion post-money valuation, the largest single round in tech history. and they need every cent of it: anthropic is spending $1.25 billion a month renting compute from xai's colossus facility, and is in active talks to use microsoft's maia 200 inference chips for claude workloads. the second-place lab is now both the most expensive private company in the world and the largest known tenant of its biggest competitor's data center. it's also negotiating to put its own production traffic on chips made by the company that funds openai. the same week, the inference middle layer set its own records — fireworks reportedly raising at $15b, baseten at $11b, and modal, exa, and turbopuffer all crossing the unicorn line on the same news cycle, which is the bill anthropic and everyone else has to keep paying. the chip war, the talent war, and the model war have collapsed into one thing: a circular pile of money where every frontier lab is somehow paying every other frontier lab. somewhere a corporate-ai-strategy consultant is drawing a venn diagram with one circle. the actual takeaway for builders is simpler — claude code is so big that anthropic is willing to lease its core compute from elon musk to keep shipping it, which should tell you something about who is actually winning the dev-tool half of this race.
alibaba’s qwen3.7 max ran for 35 hours straight and speaks anthropic’s api protocol
qwen3.7 max dropped with a 1-million-token context window, 35 hours of continuous autonomous reasoning, and — the actually-interesting bit — native support for the anthropic api protocol. the alibaba team didn't make claude code users switch SDKs; they just made qwen pretend to be claude on the wire so claude code, claude agent sdk, and any tool already wired to anthropic can swap the endpoint and run on a chinese model instead. on artificial analysis it's reportedly tied with gpt-5.4 xhigh and ahead of gemini 3.5 flash; on code arena: frontend it debuted at #4. the geopolitical play is louder than the model itself: the open-weight chinese frontier just made compatibility with anthropic the default surface for using it. the moat the api economy thought it had — "our protocol, our customers, our retention" — turns out to be a config flag. somewhere a sales engineer at openai is realizing the next time a customer says "we built on the anthropic SDK," it might mean nothing about which model is actually answering the request.
xai shipped grok build cli and finished training a 1.5-trillion-parameter model on cursor data
xai launched grok build in beta — a terminal coding agent with a cli, a plan mode that shows the agent's plan before it executes, and image+video generation wired into the same loop. it's the fifth major terminal coding agent to ship into a market that already has claude code, codex cli, gemini cli, and antigravity. days later elon confirmed grok v9-medium, a 1.5-trillion-parameter foundation model, had finished training and would be public in 2-3 weeks, "incorporating significant amounts of cursor data" for programming tasks. so to summarize: xai built a coding agent, trained a frontier model on the editor that ships the leading coding agent, and is reportedly trying to buy that editor outright for $60 billion. the strategy is no longer "compete on intelligence" — it's "buy the entire feedback loop." everyone else is benchmarking on swe-bench. xai is benchmarking on m&a.
anthropic’s project glasswing has logged 10,000 critical vulns and mythos 1 is going public
the security-agent arms race we covered last week now has a stat line. anthropic published the first project glasswing update: 10,000+ high- or critical-severity vulnerabilities found in essential open-source software in roughly one month of running mythos and partner harnesses. mythos 1 is preparing for broader public release, anthropic is starting to publish the discovered vulns publicly, and qualifying customers are getting access to the security tool stack — threat-model builders, vulnerability harnesses, the works. last month "ai for security" was four launches looking for a use case. this month it's a number — five digits of working exploits a single lab surfaced before lunch. the soc team's job, the bug-bounty program, and the entire concept of a software vendor's "responsible disclosure window" are about to get rewritten by a single agent run. the only real question is whether mythos 1 ships to enterprise before the first state actor figures out how to fork the open-source version.
hark raised $700m at $6b with nvidia, amd, intel, and qualcomm all on the cap table
brett adcock — the same brett who founded figure to put humanoid robots in factories — raised $700 million at a $6 billion post-money valuation for hark, his "personal intelligence" lab. the cap table is the actual story: nvidia, amd, intel,
a 65-line karpathy-inspired claude.md is the most-starred ai tool on github right now
forrest chang took four observations andrej karpathy posted on x about how llms fail at coding — silent assumptions, overcomplicated solutions, scope creep, vague execution — and turned them into a 65-line
hermes agent, the open-source claude-code rival that uses skill.md, just shipped on ios
nous research's hermes agent — the open-source agent harness that's been quietly eating dev-twitter share for the last month — passed openclaw on the daily-token-consumption leaderboard and shipped on ios this week, letting you remote-control your self-hosted ai from your phone. it integrates with about 20 messaging services, runs on a wide variety of llms locally or in the cloud, uses the same
spotify and universal music shipped a paid ai-remix add-on so fans can legally cover taylor swift
spotify and universal music group struck a licensing deal that lets paying premium subscribers generate ai-powered covers and remixes of participating umg artists' songs, with revenue sharing back to artists and songwriters. universal also separately renewed its tiktok deal the same week, with new measures to scrub unauthorized ai music. for two years the music industry's official position on ai-generated covers was "we are suing." this week the position became "we are charging for it." the suno and udio lawsuits are still live, but the labels just demonstrated the actual playbook: don't ban it, license it, gate it behind a premium tier, and put your name on the receipt. ai-generated covers were always going to win — the only question was who collected the rent. somewhere a lawyer who spent eighteen months drafting takedown notices is realizing his next slide deck is titled "monetization framework."
starbucks fired its ai inventory robot for being bad at counting milk
starbucks is discontinuing its automated counting ai tool — codename nomadgo — across more than 11,000 north american stores after nine months of operation, and is going back to baristas counting milk and beverage stock by hand. the cited reason: the ai was inaccurate at counting products. local barista reportedly thrilled to learn that after nine months of training a computer-vision system to recognize the difference between a half-gallon of oat milk and a half-gallon of whole milk, the machine has been demoted in favor of a person with eyes and a clipboard. this is what every "agentic transformation" deck looks like at the 18-month mark — a major brand quietly rolling back the pilot, reverting the workflow, and burying the announcement on a tuesday afternoon. the funniest part is that the same week starbucks gave up on ai counting cartons, anthropic's mythos discovered ten thousand cve-grade software vulnerabilities. it turns out an ai agent will absolutely find a 0-day in a linux kernel, but cannot reliably tell two jugs of milk apart. this is the entire 2026 ai economy in one fact pattern.













