2024-02-24 03:12:01
American AI chip startup Groq has become popular recently with its newly developed language processing unit (LPU) chip. Benchmarks suggest its large language model (LLM) inference runs roughly 13 times faster than Microsoft's Azure cloud service powered by Nvidia GPUs, which could make it a competitor to Nvidia.
(Previous summary: Buterin: It’s time to add complex functions to L1 to reduce the burden on L2! I look forward to AI helping to catch bugs)
(Background supplement: Envious of NVIDIA's huge stock gains? Keep an eye on these promising crypto AI token projects)
AI chip company Groq has recently attracted widespread attention on social media, claiming to have achieved "the world's fastest large language model speed." In a demonstration video that went viral, its chatbot responded almost instantly, making the current versions of ChatGPT, Gemini, Grok, and a series of other AI chatbots look sluggish by comparison.
The first public demo using Groq: a lightning-fast AI Answers Engine.
It writes factual, cited answers with hundreds of words in less than a second.
More than 3/4 of the time is spent searching, not generating!
The LLM runs in a fraction of a second. pic.twitter.com/QaDXixgSzp
— Matt Shumer (@mattshumer_) February 19, 2024
In recent benchmarks conducted by ArtificialAnalysis.ai, Groq far outperformed the other eight competitors on several key performance indicators, such as throughput and total response time. Groq generates about 247 tokens per second, compared with about 18 tokens per second for Microsoft Azure. If ChatGPT were run on Groq, its generation speed would therefore increase by roughly 13 times.
Source: ArtificialAnalysis.ai
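The "13 times" figure follows directly from the two throughput numbers quoted above; a minimal sketch of the arithmetic:

```python
# Throughput figures (tokens per second) as reported by ArtificialAnalysis.ai.
groq_tps = 247    # Groq LPU
azure_tps = 18    # Microsoft Azure (GPU-backed)

# Relative speedup: how many times faster Groq generates tokens.
speedup = groq_tps / azure_tps
print(f"Speedup: {speedup:.1f}x")  # roughly 13.7x, i.e. "about 13 times"
```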
According to a Cryptopolitan report, Groq achieves these results because it has developed a new AI chip, the "language processing unit" (LPU), and uses this self-developed chip to run the AI language interfaces of other open-source models. The LPU is designed to overcome the limits of older technologies such as CPUs and GPUs.
Traditional processing architectures often cannot keep up with the huge computing requirements of large language models (LLMs), but Groq implements LLM operations with its new Tensor Streaming Processor (TSP) architecture. With their fast inference and reduced power consumption, the TSP and LPU are expected to change the way data is processed.
The LPU is designed for deterministic AI operations and breaks away from the GPU's traditional Single Instruction, Multiple Threads (SIMT) model. This shift can improve performance and reduce energy consumption, making the LPU a more environmentally friendly choice in the future.
Senior risk architect k_zer0s tweeted that Groq's LPU is about 20 times faster than a GPU. Because inference runs use much less data than model training, the LPU is more energy-efficient: compared with Nvidia GPUs on inference tasks, it reads less data from external memory and consumes less power.
Groq’s LPU is faster than Nvidia GPUs, handling requests and responding more quickly.
Groq’s LPUs don’t need speedy data delivery like Nvidia GPUs do because they don’t have HBM in their system. They use SRAM, which is about 20 times faster than what GPUs use. Since inference… pic.twitter.com/mLKn81KzhP
— k_zer0s (@k_zer0s) February 19, 2024
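To see why memory speed matters for memory-bound inference, here is a toy back-of-the-envelope calculation. The model size and bandwidth values are illustrative assumptions, not published specs; only the ~20x SRAM-vs-HBM ratio echoes the tweet above.

```python
# Toy model: time to stream a model's weights once per generated token.
# All concrete numbers are illustrative assumptions, NOT vendor specs;
# only the ~20x SRAM-vs-HBM speed ratio comes from the claim quoted above.
model_bytes = 14e9        # e.g. a 7B-parameter model at 2 bytes per weight
hbm_bw = 2e12             # assumed HBM bandwidth: 2 TB/s
sram_bw = hbm_bw * 20     # per the quoted ~20x figure

t_hbm = model_bytes / hbm_bw    # seconds per token if HBM-bandwidth-bound
t_sram = model_bytes / sram_bw  # seconds per token if SRAM-bandwidth-bound

print(f"HBM-bound:  {t_hbm * 1e3:.2f} ms/token -> {1 / t_hbm:.0f} tokens/s")
print(f"SRAM-bound: {t_sram * 1e3:.2f} ms/token -> {1 / t_sram:.0f} tokens/s")
```

Under these assumptions, a 20x faster memory translates directly into 20x more tokens per second whenever weight streaming is the bottleneck, which is the usual case for single-request LLM inference.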
Open for free trial
Groq currently offers a free trial on its official website, providing three AI chatbots running different LLM models: Meta's Llama 2, and Mistral AI's Mixtral-8x7B and Mistral 7B. Users can experience the fast, LPU-powered chatbots for free.
📍Related reports📍
Nvidia's financial report is out: revenue increased 265% year-on-year, exceeding expectations, and the stock rose 9% after the market close! AI concept coins are rising (TAO, RNDR, AGIX..)
SoftBank's Masayoshi Son launches "Izanagi" project to take on NVIDIA! New AI chip venture plans to raise US$100 billion
"Stock goddess" Cathie Wood's favorite is neither NVIDIA nor Tesla: Ark Investment's largest holding is this crypto company.