Managers at AI startups direct engineers to use different models depending on task difficulty. Coinbase CEO Brian Armstrong projected that 80 percent of workloads will shift to cheaper models within 12 to 18 months. Model router adoption among firms rose from 1 percent last year to 5 percent this year.
forbes.com
Morgan Linton, chief technology officer of AI startup Bold Metrics, tells his 16 engineers twice a week which models to apply to specific tasks. Business Insider spoke with the Lake Tahoe-based executive 50 minutes before a team standup. He planned to direct one group to Claude Fable on low settings and another to GPT-5.5 on high settings.
A third group uses Cursor with Composer 2.5 and reports flawless results. Linton said the approach removes the need for hard token caps. "My team is getting to use the best stuff, but they're using it a lot more efficiently," he said.
The practice reflects a broader shift: after reviewing employee AI bills, companies have stopped encouraging maximum usage of any single model. Coinbase CEO Brian Armstrong posted on X on June 7 that 80 percent of workloads will run on models that are 99 percent cheaper within 12 to 18 months.
The remaining 20 percent of workloads will stay on the latest models where higher performance matters most, he stated.
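To put Armstrong's projection in spending terms, the short calculation below is a rough sketch, not a figure from his post. It assumes every workload costs the same on a frontier model today and that usage volume stays constant; the function name `blended_cost` is hypothetical.

```python
def blended_cost(cheap_share: float, discount: float) -> float:
    """Return total spend as a fraction of an all-frontier baseline.

    Assumption (not from the article): each workload costs the same on the
    frontier model, so shares of workloads equal shares of baseline spend.
    """
    cheap_cost = cheap_share * (1.0 - discount)  # workloads moved to cheaper models
    frontier_cost = 1.0 - cheap_share            # workloads kept on the latest models
    return cheap_cost + frontier_cost


# 80 percent of workloads on models that are 99 percent cheaper
relative = blended_cost(cheap_share=0.80, discount=0.99)
print(f"{relative:.3f}")  # 0.208
```

Under those assumptions the total bill falls to roughly a fifth of its original size, and nearly all of what remains comes from the 20 percent of workloads left on the latest models.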
Chris Maconi, cofounder of Huntsville-based AI startup Hechura, runs operations with a human-in-the-loop approach and avoids overnight autonomous bots. He started an OpenClaw setup with cheaper Gemini models before switching to Anthropic's Haiku. "I'm not afraid to go and try some of these lower-end models to see if they can provide the intelligence that we need," Maconi said.
Tanvi Pisal, a 29-year-old Big Tech user-experience designer, pays for the basic $20-per-month Claude Pro package in addition to a company ChatGPT subscription. She now designs interfaces in Figma first, then uploads screenshots to Claude with instructions to preserve the UI while building functionality. "Doing this design-first process really helps me save tokens," she said.
Alejandra Thomas, a New York City software engineer and tech content creator, tests every new model and uses lighter models, or none at all, for simple tasks. Ed Stevens, CEO of AI sales company Scoot, said his engineers select one model, test it for several months, and switch if a cheaper option delivers comparable results.
Dan Ariely, a behavioral economics professor at Duke University, compared token budgets to old cellphone minute plans that prompted users to make unnecessary calls near month-end. "Tokens create a model of scarcity where people can't use as much as they want," he said.
Ara Kharazian, Ramp's lead economist, reported that the share of firms using model routers increased from around 1 percent last year to 5 percent this year. San Francisco investment firm BlockSpaceForce routes requests through OpenRouter, Fireworks, and Together AI.
Its managing partner Spencer Yang said models have improved at assessing task complexity and recommending cheaper options first. David Gilmore, who runs routing startup Rayline, said many clients experience a "FOMO moment" with new models before receiving large API bills that prompt a switch to lower-cost alternatives.
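The routing idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how OpenRouter, Rayline, or any firm in the article works: the tier names, the 1-to-10 difficulty score, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str            # hypothetical model label, not a real product
    max_difficulty: int  # highest difficulty score this tier should handle


# Hypothetical tiers, ordered cheapest first so the cheapest option is tried first.
TIERS = (
    Tier("small-model", 3),
    Tier("mid-model", 7),
    Tier("frontier-model", 10),
)


def route(difficulty: int) -> str:
    """Pick the cheapest tier whose ceiling covers the task's difficulty (1-10)."""
    if not 1 <= difficulty <= 10:
        raise ValueError("difficulty must be between 1 and 10")
    for tier in TIERS:
        if difficulty <= tier.max_difficulty:
            return tier.name
    raise AssertionError("unreachable: the last tier covers difficulty 10")


print(route(2))  # small-model
print(route(9))  # frontier-model
```

In practice the difficulty score is the hard part; this sketch simply takes it as an input, whether it comes from a manager's judgment or from a model's own assessment of the task.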
ndtv.com
French President Emmanuel Macron and Indian Prime Minister Narendra Modi have met with technology executives this year to discuss data center and cloud infrastructure projects. The two leaders hosted separate events that produced investment commitments from several companies.
Meta CEO Mark Zuckerberg told employees Thursday that development of AI agent technology has fallen behind internal targets. The company also paused a mandatory employee monitoring program last month after a leak and cut 10 percent of its workforce in May.
thenextweb.com
Meta released Pocket, an app that lets users generate and share interactive mini games through text prompts. The app first appeared on the App Store and Google Play on June 29, 2026, though it remained unavailable for download in the United States as of July 2.