Mistral AI
Company type: Private
Industry: Artificial intelligence
Founded: 28 April 2023
Founders
  • Arthur Mensch
    (Co-Founder & CEO)
  • Guillaume Lample
    (Co-Founder & Chief Scientist)
  • Timothée Lacroix
    (Co-Founder & CTO)
Headquarters: Paris, France
Products
  • Mistral 7B
  • Mixtral 8x7B
  • Mistral Medium
  • Mistral Large
  • Mixtral 8x22B
Website: mistral.ai

Mistral AI is a French company selling artificial intelligence (AI) products. It was founded in April 2023 by former employees of Meta Platforms and Google DeepMind. [1] The company raised €385 million in December 2023, [2] at a valuation of more than $2 billion. [3] [4] [5]

It produces open-source large language models, [6] citing the foundational importance of open-source software and presenting its releases as a response to proprietary models. [7]

As of March 2024, two models had been published and were available as weights; [8] a third, Mixtral 8x22B, followed in April 2024. Three further models, Small, Medium and Large, are available via API only. [9] [10]

History

Mistral AI was co-founded in April 2023 by Arthur Mensch, Guillaume Lample and Timothée Lacroix.[citation needed]

Prior to co-founding Mistral AI, Arthur Mensch worked at Google DeepMind, Google's artificial intelligence laboratory, while Guillaume Lample and Timothée Lacroix worked at Meta Platforms. [11] The co-founders met as students at École polytechnique. The company is named after the mistral, a strong wind that blows in France. [12]

In June 2023, the start-up raised €105 million ($117 million) in its first funding round, with investors including the American fund Lightspeed Venture Partners, Eric Schmidt, Xavier Niel and JCDecaux. The Financial Times estimated the company's valuation at the time at €240 million ($267 million).

On 27 September 2023, the company made its language model “Mistral 7B” available under the free Apache 2.0 license. The model has about 7 billion parameters, which is small compared with its competitors.

On 10 December 2023, Mistral AI announced that it had raised €385 million ($428 million) in its second funding round. The round notably involved the Californian fund Andreessen Horowitz, BNP Paribas and the software publisher Salesforce. [13]

On 11 December 2023, the company released the Mixtral 8x7B model, which has 46.7 billion parameters but uses only 12.9 billion per token thanks to its mixture-of-experts architecture. The model handles five languages (French, Spanish, Italian, English and German) and, according to its developers' tests, outperforms the "LLaMA 2 70B" model from Meta. A version trained to follow instructions, called “Mixtral 8x7B Instruct”, is also offered. [14]

On 26 February 2024, Microsoft announced a new partnership with the company to expand its presence in the artificial intelligence industry. Under the agreement, Mistral's language models would be made available on Microsoft's Azure cloud, while the multilingual conversational assistant "Le Chat" would be launched in the style of ChatGPT. [15]

On 10 April 2024, the company released the mixture-of-experts model Mixtral 8x22B, offering high performance on various benchmarks compared to other open models.[citation needed]

On 16 April 2024, it was reported that Mistral was in talks to raise €500 million, a deal that would more than double its valuation at the time, to at least €5 billion. [16]

Models

Open Weight Models

Mistral 7B

Mistral 7B is a 7.3B parameter language model using the transformer architecture. It was officially released on September 27, 2023, via a BitTorrent magnet link [17] and Hugging Face. [18] The model was released under the Apache 2.0 license. The release blog post claimed the model outperforms LLaMA 2 13B on all benchmarks tested, and is on par with LLaMA 34B on many benchmarks tested. [19]

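Mistral 7B uses grouped-query attention (GQA), which is a variant of the standard multi-head attention mechanism. Instead of giving every query head its own key and value head, GQA divides the query heads into groups, and the heads in each group share a single key head and value head. This reduces the size of the key-value cache and speeds up inference. [20]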
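The head-sharing idea behind grouped-query attention can be illustrated with a minimal pure-Python sketch. It is not Mistral's implementation: the function names `kv_head_for` and `grouped_query_attention` are hypothetical, causal masking is assumed, and plain lists stand in for tensors. The head counts in the usage example (32 query heads, 8 key-value heads) are those reported for Mistral 7B in the paper cited above. [20]

```python
import math

def kv_head_for(query_head, n_q_heads, n_kv_heads):
    """Index of the key/value head shared by the group containing query_head."""
    assert n_q_heads % n_kv_heads == 0, "query heads must split evenly into groups"
    return query_head // (n_q_heads // n_kv_heads)

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def grouped_query_attention(queries, keys, values):
    """Causal attention where several query heads share one K/V head.

    queries: [n_q_heads][seq_len][dim]
    keys, values: [n_kv_heads][seq_len][dim]
    Returns one output per query head: [n_q_heads][seq_len][dim].
    """
    n_q_heads, n_kv_heads = len(queries), len(keys)
    outputs = []
    for h, q_head in enumerate(queries):
        kv = kv_head_for(h, n_q_heads, n_kv_heads)
        k_head, v_head = keys[kv], values[kv]
        dim = len(q_head[0])
        scale = 1.0 / math.sqrt(dim)
        head_out = []
        for t, q in enumerate(q_head):
            # Position t attends only to positions 0..t (causal mask).
            scores = [scale * sum(a * b for a, b in zip(q, k_head[s]))
                      for s in range(t + 1)]
            weights = softmax(scores)
            head_out.append([sum(w * v_head[s][d] for s, w in enumerate(weights))
                             for d in range(dim)])
        outputs.append(head_out)
    return outputs

# With 32 query heads and 8 K/V heads, each group holds 4 query heads.
print([kv_head_for(h, 32, 8) for h in (0, 3, 4, 31)])
```

In this sketch only 8 key and value heads need to be cached per layer instead of 32, which is where the inference-time memory saving comes from.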

Both a base model and an "instruct" model were released, with the latter receiving additional tuning to follow chat-style prompts. The fine-tuned model is intended only for demonstration purposes and does not have guardrails or moderation built in. [19]

Mixtral 8x7B

Much like Mistral's first model, Mixtral 8x7B was released via a BitTorrent link posted on Twitter on December 9, 2023; [6] a Hugging Face release and a blog post followed two days later. [14]

Unlike the previous Mistral model, Mixtral 8x7B uses a sparse mixture of experts architecture. The model has 8 distinct groups of "experts", giving the model a total of 46.7B usable parameters. [21] [22] For each token, a router selects two of the eight experts in every layer, so a single token uses only 12.9B parameters and the model runs at roughly the speed and cost of a 12.9B parameter model. [14]
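The relationship between the 46.7B total and the 12.9B per-token figure can be illustrated with a short sketch. It assumes top-2 routing over 8 experts, as described in Mistral's announcement, [14] and it assumes that every parameter outside the experts is shared by all tokens. The function names (`top_k_route`, `moe_layer`, `split_parameters`) and the toy experts are hypothetical and are not taken from Mistral's code.

```python
import math

def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalise their scores."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_layer(x, router_logits, experts, k=2):
    """Weighted sum of the outputs of the selected experts only.

    The unselected experts are never evaluated, which is why per-token
    compute tracks the active parameter count, not the total.
    """
    out = [0.0] * len(x)
    for i, weight in top_k_route(router_logits, k):
        y = experts[i](x)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out

def split_parameters(total_b, active_b, n_experts=8, k=2):
    """Solve total = shared + n_experts * e and active = shared + k * e.

    Returns (shared, per_expert) in billions of parameters.
    """
    per_expert = (total_b - active_b) / (n_experts - k)
    shared = active_b - k * per_expert
    return shared, per_expert

# Toy experts: expert i simply scales its input by i.
experts = [lambda x, s=s: [s * v for v in x] for s in range(8)]
print(moe_layer([1.0, 2.0], [0, 0, 0, 5, 0, 5, 0, 0], experts))

shared, per_expert = split_parameters(46.7, 12.9)
print(round(shared, 2), round(per_expert, 2))  # roughly 1.63 and 5.63
```

Under these assumptions each expert accounts for about 5.6B parameters and about 1.6B are shared, which is why the total is well below 8 × 7B = 56B.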

Mistral AI's testing shows the model beats both LLaMA 2 70B and GPT-3.5 in most benchmarks. [23]

In March 2024, research conducted by Patronus AI compared the performance of LLMs on a 100-question test with prompts to generate text from books protected under U.S. copyright law. It found that OpenAI's GPT-4, Mixtral, Meta AI's LLaMA 2, and Anthropic's Claude 2 generated copyrighted text verbatim in 44%, 22%, 10%, and 8% of responses respectively. [24] [25]

Mixtral 8x22B

Similar to Mistral's previous open models, Mixtral 8x22B was released via a BitTorrent link on Twitter on April 10, 2024, with a release on Hugging Face soon after.[citation needed]

API-Only Models

Unlike Mistral 7B, Mixtral 8x7B and Mixtral 8x22B, the following models are closed-source and only available through the Mistral API. [26]

Mistral Large

Mistral Large was launched on February 26, 2024, and Mistral claims it is second in the world only to OpenAI's GPT-4.

It is fluent in English, French, Spanish, German, and Italian, and Mistral claims it understands both grammar and cultural context. It also provides coding capabilities. As of early 2024, it is Mistral's flagship model. [27] It is also available on Microsoft Azure.

Mistral Medium

Mistral Medium is trained on various languages, including English, French, Italian, German and Spanish, as well as code, and has a score of 8.6 on MT-Bench. [28] It is ranked in performance above Claude and below GPT-4 on the LMSys Chatbot Arena Elo leaderboard. [29]

The number of parameters and the architecture of Mistral Medium are not known, as Mistral has not published information about them.

Mistral Small

Like the Large model, Small was launched on February 26, 2024. It is intended to be a lightweight model for low-latency use, with better performance than Mixtral 8x7B. [30]

References

  1. ^ "France's unicorn start-up Mistral AI embodies its artificial intelligence hopes". Le Monde. 2023-12-12. Retrieved 2023-12-16.
  2. ^ Metz, Cade (10 December 2023). "Mistral, French A.I. Start-Up, Is Valued at $2 Billion in Funding Round". The New York Times.
  3. ^ Fink, Charlie. "This Week In XR: Epic Triumphs Over Google, Mistral AI Raises $415 Million, $56.5 Million For Essential AI". Forbes. Retrieved 2023-12-16.
  4. ^ "A French AI start-up may have commenced an AI revolution, silently". Hindustan Times. December 12, 2023.
  5. ^ "French AI start-up Mistral secures €2bn valuation". Financial Times.
  6. ^ a b "Buzzy Startup Just Dumps AI Model That Beats GPT-3.5 Into a Torrent Link". Gizmodo. 2023-12-12. Retrieved 2023-12-16.
  7. ^ "Bringing open AI models to the frontier". Mistral AI. 27 September 2023. Retrieved 4 January 2024.
  8. ^ "Open-weight models and Mistral AI Large Language Models". docs.mistral.ai. Retrieved 2024-01-04.
  9. ^ "Endpoints and Mistral AI Large Language Models". docs.mistral.ai.
  10. ^ "Endpoints and benchmarks | Mistral AI Large Language Models". docs.mistral.ai. Retrieved 2024-03-06.
  11. ^ "France's unicorn start-up Mistral AI embodies its artificial intelligence hopes". Le Monde. 12 December 2023.
  12. ^ Schechner, Sam. "The 9-Month-Old AI Startup Challenging Silicon Valley's Giants". The Wall Street Journal. Photographs by Edouard Jacquinet. Retrieved 2024-03-31.
  13. ^ "Mistral lève 385 M€ et devient une licorne française". Le Monde Informatique. 11 December 2023.
  14. ^ a b c "Mixtral of experts". mistral.ai. 2023-12-11. Retrieved 2024-01-04.
  15. ^ Bableshwar (2024-02-26). "Mistral Large, Mistral AI's flagship LLM, debuts on Azure AI Models-as-a-Service". techcommunity.microsoft.com. Retrieved 2024-02-26.
  16. ^ "Mistral in talks to raise €500mn at €5bn valuation". Financial Times. Retrieved 2024-04-19.
  17. ^ Goldman, Sharon (2023-12-08). "Mistral AI bucks release trend by dropping torrent link to new open source LLM". VentureBeat. Retrieved 2024-01-04.
  18. ^ Coldewey, Devin (27 September 2023). "Mistral AI makes its first large language model free for everyone". TechCrunch. Retrieved 4 January 2024.
  19. ^ a b "Mistral 7B". mistral.ai. Mistral AI. 27 September 2023. Retrieved 4 January 2024.
  20. ^ Jiang, Albert Q.; Sablayrolles, Alexandre; Mensch, Arthur; Bamford, Chris; Chaplot, Devendra Singh; Casas, Diego de las; Bressand, Florian; Lengyel, Gianna; Lample, Guillaume (2023-10-10). "Mistral 7B". arXiv:2310.06825v1 [cs.CL].
  21. ^ "Mixture of Experts Explained". huggingface.co. Retrieved 2024-01-04.
  22. ^ Marie, Benjamin (2023-12-15). "Mixtral-8x7B: Understanding and Running the Sparse Mixture of Experts". Medium. Retrieved 2024-01-04.
  23. ^ Franzen, Carl (2023-12-11). "Mistral shocks AI community as latest open source model eclipses GPT-3.5 performance". VentureBeat. Retrieved 2024-01-04.
  24. ^ Field, Hayden (March 6, 2024). "Researchers tested leading AI models for copyright infringement using popular books, and GPT-4 performed worst". CNBC. Retrieved March 6, 2024.
  25. ^ "Introducing CopyrightCatcher, the first Copyright Detection API for LLMs". Patronus AI. March 6, 2024. Retrieved March 6, 2024.
  26. ^ "Pricing and rate limits | Mistral AI Large Language Models". docs.mistral.ai. Retrieved 2024-01-22.
  27. ^ Mistral AI (2024-02-26). "Au Large". mistral.ai. Retrieved 2024-03-06.
  28. ^ Mistral AI (2023-12-11). "La plateforme". mistral.ai. Retrieved 2024-01-22.
  29. ^ "LMSys Chatbot Arena Leaderboard - a Hugging Face Space by lmsys". huggingface.co. Retrieved 2024-01-22.
  30. ^ Mistral AI (2024-02-26). "Au Large". mistral.ai. Retrieved 2024-03-06.