Data Crossroads #71
How Mark Zuckerberg has fully rebuilt Meta around Llama
Sharon Goldman writing for Fortune:
It was the summer of 2023, and the question at hand was whether to release a Llama into the wild.
The Llama in question wasn’t an animal: Llama 2 was the follow-up release of Meta’s generative AI model—a would-be challenger to OpenAI’s GPT-4. The first Llama had come out a few months earlier. It had originally been intended only for researchers, but after it leaked online, it caught on with developers, who loved that it was free—unlike the large language models (LLMs) from OpenAI, Google, and Anthropic—as well as state-of-the-art. Also unlike those rivals, it was open source, which meant researchers, developers, and other users could access the underlying code and its “weights” (which determine how the model processes information) to use, modify, or improve it.
Yann LeCun, Meta’s chief AI scientist, and Joelle Pineau, VP of AI research and head of Meta’s FAIR (Fundamental AI Research) team, wanted to give Llama 2 a wide open-source release. They felt strongly that open-sourcing Llama 2 would enable the model to become more powerful more quickly, at a lower cost. It could help the company catch up in a generative AI race in which it was seen as lagging badly behind its rivals, even as the company struggled to recover from a pivot to the metaverse whose meager offerings and cheesy, legless avatars had underwhelmed investors and customers.
But there were also weighty reasons not to take that path. Once customers got accustomed to a free product, how could you ever monetize it? And as other execs pointed out in debates on the topic, the legal repercussions were potentially ugly: What if someone hijacked the model to go on a hacking spree? It didn’t help that two earlier releases of Meta open-source AI products had backfired badly, earning the company tongue-lashings from everyone from scientists to U.S. senators.
It would fall to CEO Mark Zuckerberg, Meta’s founder and controlling shareholder, to break the deadlock. Zuckerberg has long touted open-source technology (Facebook itself was built on open-source software), but he likes to gather all opinions; he spoke to “everybody who was either for, anti, or in the middle” on the open-source question, recalls Ahmad Al-Dahle, Meta’s head of generative AI. But in the end it was Zuckerberg himself, LeCun says, who made the final decision to release Llama 2 as an open source model: “He said, ‘Okay, we’re just going to do it.’” On July 18, 2023, Meta released Llama 2 “free for research and commercial use.”
In a post on his personal Facebook page, Zuckerberg doubled down on his decision. He emphasized his belief that open-source drives innovation by enabling more developers to build with a given technology. “I believe it would unlock more progress if the ecosystem were more open,” he wrote.
Llama has to be one of the best general-purpose open-weights LLMs.
Nvidia’s CEO defends his moat as AI labs change how they improve their AI models
Maxwell Zeff writing for TechCrunch:
Nvidia raked in more than $19 billion in net income during the last quarter, the company reported on Wednesday, but that did little to assure investors that its rapid growth would continue. On its earnings call, analysts prodded CEO Jensen Huang about how Nvidia would fare if tech companies start using new methods to improve their AI models.
The method that underpins OpenAI’s o1 model, or “test-time scaling,” came up quite a lot. It’s the idea that AI models will give better answers if you give them more time and computing power to “think” through questions. Specifically, it adds more compute to the AI inference phase, which is everything that happens after a user hits enter on their prompt.
Nvidia’s CEO was asked whether he was seeing AI model developers shift over to these new methods and how Nvidia’s older chips would work for AI inference.
Huang indicated that o1, and test-time scaling more broadly, could play a larger role in Nvidia’s business moving forward, calling it “one of the most exciting developments” and “a new scaling law.” Huang did his best to assure investors that Nvidia is well positioned for the change.
The Nvidia CEO’s remarks aligned with what Microsoft CEO Satya Nadella said onstage at a Microsoft event on Tuesday: o1 represents a new way for the AI industry to improve its models.
This is a big deal for the chip industry because it places a greater emphasis on AI inference. While Nvidia’s chips are the gold standard for training AI models, there’s a broad set of well-funded startups creating lightning-fast AI inference chips, such as Groq and Cerebras. It could be a more competitive space for Nvidia to operate in.
Despite recent reports that improvements in generative models are slowing, Huang told analysts that AI model developers are still improving their models by adding more compute and data during the pretraining phase.
With trillions of dollars at stake, this is going to be very interesting to follow. Even more interesting is the direction the industry is taking.
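The “test-time scaling” idea described above is often implemented as best-of-n sampling: spend more inference compute by drawing several candidate answers and keeping the one a verifier scores highest. Here is a toy sketch of that loop; `generate_answer` and `score_answer` are hypothetical stand-ins for an LLM sampler and a reward/verifier model, not real APIs.

```python
def generate_answer(question: str, sample_id: int) -> int:
    # Stand-in for one stochastic LLM sample; in a real system each
    # sample_id would correspond to a different sampled completion.
    return [3, 5, 4, 5][sample_id % 4]

def score_answer(question: str, answer: int) -> float:
    # Stand-in for a verifier or reward model scoring a candidate.
    return 1.0 if answer == 4 else 0.0

def best_of_n(question: str, n: int) -> int:
    # Test-time scaling: a larger n means more inference compute,
    # more candidates, and a better chance the best one is correct.
    candidates = [generate_answer(question, i) for i in range(n)]
    return max(candidates, key=lambda a: score_answer(question, a))

print(best_of_n("What is 2 + 2?", n=3))  # the correct candidate wins
```

The point of the sketch is the trade-off Huang is betting on: quality improves not by training a bigger model, but by raising `n` at inference time, which directly increases demand for inference chips.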
Microsoft quietly assembles the largest AI agent ecosystem—and no one else is close
Matt Marshall writing for VentureBeat:
Microsoft has quietly built the largest enterprise AI agent ecosystem, with over 100,000 organizations creating or editing AI agents through its Copilot Studio since launch – a milestone that positions the company ahead in one of enterprise tech’s most closely watched and exciting segments.
“That’s a lot faster than we thought, and it’s a lot faster than any other kind of cutting edge technology we’ve released,” Charles Lamanna, Microsoft’s executive responsible for the company’s agent vision, told VentureBeat. “And that was like a 2x growth in just a quarter.”
The rapid adoption comes as Microsoft significantly expands its agent capabilities. At its Ignite conference starting today, the company announced it will allow enterprises to use any of the 1,800 large language models (LLMs) in the Azure catalog within these agents – a significant move beyond its exclusive reliance on OpenAI’s models. The company also unveiled autonomous agents that can work independently, detecting events and orchestrating complex workflows with minimal human oversight. (See our full coverage of today’s agent announcements from Microsoft here.)
These AI agents – software that can reason and perform specific business tasks using generative AI – are emerging as a powerful tool for enterprise automation and productivity. Microsoft’s platform enables organizations to build these agents for tasks ranging from customer service to complex business process automation, while maintaining enterprise-grade security and governance.
It’s still early days for agents, especially when many of them are chained together.
Apple Working on 'LLM Siri' for 2026 Launch
Juli Clover writing for MacRumors:
A chatbot version of Siri would be able to hold ongoing conversations, much like ChatGPT. Apple wants customers to be able to better converse with the personal assistant, with Siri responding more like a human. The use of large language models will also allow Siri to perform much more complex tasks, which Apple has to rely on OpenAI's ChatGPT for in iOS 18.2.
Apple is working on improving what Siri can do in and between apps with Apple Intelligence in iOS 18, and that will lay some of the groundwork for the updated version of Siri. For that functionality, Apple will use a first-generation Apple LLM to evaluate requests to determine whether the existing Siri infrastructure should be used, or if a second LLM that's able to handle more complex requests should be queried.
Apple is testing the new Siri in a separate app on iPhones, iPads, and Macs, but it will eventually replace the current version of Siri. The Siri update could be announced as soon as 2025, likely as part of the June Worldwide Developers Conference that will see Apple unveil iOS 19.
While Siri will be previewed early, Apple does not intend to launch the update until several months after it is unveiled. As of now, Apple is aiming for a spring 2026 launch date, but Apple's plans could change.
LLMs are actually Siri’s only hope. See also: How a big Siri upgrade could make Apple Intelligence the game-changer we've been waiting for
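The report describes a router design: a small first-pass model inspects each request and decides whether the existing Siri pipeline can handle it or whether a larger LLM should be queried. A minimal sketch of that pattern, with entirely hypothetical function names and a trivial complexity heuristic standing in for the first-generation classifier model:

```python
def looks_complex(request: str) -> bool:
    # Stand-in for the small first-pass model: a real router would be
    # a trained classifier, not this toy length/question heuristic.
    return len(request.split()) > 6 or "?" in request

def handle(request: str) -> str:
    # Simple requests stay on the cheap legacy pipeline; complex ones
    # are escalated to the larger, more capable (and costlier) LLM.
    if looks_complex(request):
        return "large-llm:" + request
    return "legacy-siri:" + request

print(handle("set a timer"))
print(handle("what is the best sushi place nearby tonight?"))
```

The design choice is about cost and latency: routing lets the common, simple commands avoid the expense of the big model entirely.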
Amazon to invest another $4 billion in Anthropic, OpenAI’s biggest rival
Hayden Field writing for CNBC:
Amazon on Friday announced it would invest an additional $4 billion in Anthropic, the artificial intelligence startup founded by ex-OpenAI research executives.
The new funding brings the tech giant’s total investment to $8 billion, though Amazon will retain its position as a minority investor.
Amazon Web Services will also become Anthropic’s “primary cloud and training partner,” according to a blog post.
Amazon still does not have its own foundation model. It looks more and more like Anthropic is to Amazon what OpenAI is to Microsoft.
Leaked Amazon documents identify critical flaws in the delayed AI reboot of Alexa
Jason Del Rey writing for Fortune:
Amazon’s race to create an AI-based successor to its voice assistant Alexa has hit more snags after a series of earlier setbacks over the past year. Employees have found there is too much of a delay between asking the technology for something and the new Alexa providing a response or completing a task.
The problem, known as latency, is a critical shortcoming, employees said in an internal memo from earlier this month obtained by Fortune. If released as is, customers could become frustrated and the product—a particularly critical one to Amazon as it tries to keep up in the crucial battle to launch blockbuster consumer AI products—could end up as a failure, some employees fear.
“Latency remains a critical issue requiring significant improvements,” before the new version of Alexa could launch, the memo said.
The latency problem—a common challenge when building complex generative AI applications—is just one of several concerns that Amazon employees cited in internal communications over the last few months that were viewed by Fortune. They show the hurdles that Amazon must overcome to ultimately release the updated Alexa, a major priority at the company because it could open a new door to selling subscriptions to access the new voice assistant and may supercharge sales of Amazon’s Echo line of smart devices.
The updated Alexa is also a key barometer for Amazon’s standing in the race among Big Tech companies to dominate consumer-facing AI and for the financial windfall that’s expected to come with it. Currently, many industry observers consider the company to be trailing its peers like Google parent Alphabet and Microsoft, along with buzzy newcomers such as OpenAI and Perplexity AI in creating breakthrough generative AI applications for consumers.
I have to say, Alexa is known for how quickly it can respond. If latency is an issue, then it’s a blocker for sure.
Perplexity introduces a shopping feature for Pro users in the US
Ivan Mehta writing for TechCrunch:
AI-powered search engine Perplexity is venturing into e-commerce. On Monday, the company debuted a new shopping feature for its paid customers in the U.S. which offers shopping recommendations within Perplexity’s search results as well as the ability to place an order without going to a retailer’s website.
With the move, Perplexity is taking on Google and Amazon, intending to capture a portion of shopping search results.
For shopping-related search queries, the tool presents users with visual cards that have details of the product, pricing, and seller info, a short description, and the pros and cons of the item in question. Users can click or tap on the card to read more information, including reviews and detailed key features.
[…]
Apart from Big Tech, startups like Daydream, Deft, and Remark have also raised millions of dollars from venture capitalists to build AI-powered shopping searches.
Amazon debuted AI-powered assistant Rufus earlier this year in the U.S. and expanded to other countries late last month. After its Prime Day sales event in July, the company noted that Rufus helped “millions” of customers find the right items. Google also primed its Shopping tab with AI in October for better search results.
With the onset of large language models, companies working in the e-commerce industry have realized that there is an opportunity to suggest better options as these models can parse user queries and match items from the catalog using organized and unorganized data.
The promise of this new wave of search is that e-commerce search has been bad for years, but now you can write long sentences to describe an item you need, and AI will do the work for you.
The future of search is going to be very interesting, and this move matters because a lot of search volume is actually e-commerce-related.
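The core mechanism the TechCrunch piece describes is matching a long, free-form query against both structured and unstructured catalog fields. A toy sketch of that ranking step, where simple token overlap stands in for what a real system would do with an LLM or embedding model (the catalog data and function names here are invented for illustration):

```python
# Hypothetical mini-catalog mixing structured (name) and
# unstructured (description) fields.
CATALOG = [
    {"name": "Trail runner",
     "description": "lightweight waterproof running shoe for muddy trails"},
    {"name": "Office loafer",
     "description": "leather slip-on shoe for the office"},
    {"name": "Rain jacket",
     "description": "waterproof breathable shell for hiking"},
]

def score(query: str, item: dict) -> int:
    # Count query words appearing in the item's combined fields.
    # (Substring matching; a real system would use embeddings.)
    text = (item["name"] + " " + item["description"]).lower()
    return sum(1 for word in query.lower().split() if word in text)

def search(query: str, k: int = 1) -> list:
    # Rank the catalog by relevance score and return the top k names.
    ranked = sorted(CATALOG, key=lambda item: score(query, item),
                    reverse=True)
    return [item["name"] for item in ranked[:k]]

print(search("waterproof shoe for running on muddy trails"))
```

The promise of LLM-era shopping search is exactly this: a whole descriptive sentence as the query, instead of the two or three keywords legacy e-commerce search boxes expect.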
Mistral unleashes Pixtral Large and upgrades Le Chat into full-on ChatGPT competitor
Carl Franzen writing for VentureBeat:
Mistral, the French startup that made waves last year with a record-setting seed funding amount for Europe, has launched a slew of updates today including a new, large foundational model named Pixtral Large.
The company is further upgrading its free web-based chatbot, Le Chat, adding image generation, web search, and an interactive “canvas,” matching the features of OpenAI’s ChatGPT and turning Le Chat into a more serious and direct competitor.
As Mistral AI CEO and co-founder Arthur Mensch wrote on his account on the social network X, “At Mistral, we’ve grown aware that to create the best AI experience, one needs to co-design models and product interfaces. Pixtral was trained with high-impact front-end applications in mind and is a good example of that.”
Users who want to try out the new Le Chat features will need to enable them as beta features on the web interface. Note that Le Chat access does require a free Mistral, Google, or Microsoft account to use.
This is mere catch-up.