Hacker News
Meta to release largest Llama 3 model on July 23 [405B] (theinformation.com)
30 points by htrp on July 12, 2024 | 14 comments


Meta’s strategy seems pretty obvious: dumping so no company can build a moat or successful business around LLM supremacy.


wonderful. keep at it, Meta. they've given me so much hope this past year. they're heading off a future where an oligopoly of massive tech companies gets AGI first, engages in regulatory capture and locks the doors behind them on the most important technology in a century. that was absolutely OpenAI's stated goal, even. LeCun is truly doing the Lord's work dumping model weights.


I’ve personally run the Llama 3 models locally on a Mac Studio with 128GB of RAM, and found them (and other open source models) to perform highly erratically.

And while I’ve also encountered quality control issues with the open source models in “professionally hosted” settings (e.g., Groq, OpenRouter), it was much worse locally.

Sentences ending abruptly and nonsensical or unrelated words were common problems.


Those were still quantized, right? My understanding is that Llama 3 does not quantize well at all.
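
For context on why a local run on a 128GB machine is usually quantized, here is a rough back-of-the-envelope sketch in Python. It estimates memory for the weights alone as parameters × bits per weight ÷ 8. The function and table names are made up for illustration, and the figures are assumptions rather than measurements: real 4-bit formats typically store somewhat more than 4 bits per weight on average, and the KV cache and activations need memory on top of this.

```python
# Rough estimate of weight memory at different precisions.
# Assumption: memory ~= n_params * bits_per_weight / 8 bytes; ignores
# KV cache, activations, and per-format overhead.

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Parameter counts mentioned in the thread (8B, 70B, 405B).
MODEL_SIZES = {"8B": 8e9, "70B": 70e9, "405B": 405e9}
# Nominal bit widths, chosen for illustration only.
BIT_WIDTHS = {"16-bit": 16.0, "8-bit": 8.0, "4-bit": 4.0}

def memory_table() -> dict:
    """Return {model: {precision: estimated GB}} for the sizes above."""
    return {
        name: {label: weight_memory_gb(n, bits) for label, bits in BIT_WIDTHS.items()}
        for name, n in MODEL_SIZES.items()
    }

if __name__ == "__main__":
    for name, row in memory_table().items():
        cells = ", ".join(f"{label}: {gb:.1f} GB" for label, gb in row.items())
        print(f"{name}: {cells}")
```

Under these assumptions, the 70B weights at 16 bits already exceed 128GB, which is consistent with local runs of that model being quantized.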


I run Llama 3 8B Q4_K_M locally, and it's amazing! This is my system prompt:

You are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.


At least they put all the relevant information into the title, so that it is not necessary to actually read the article.


Announcements about announcements. I thought it was against HN policy.


Can anyone speak to the practical differences between the 70B model and this?

What text-based cases have their efficacy greatly enhanced by this?


But are they releasing the weights?


I believe so, presuming it’s released the same as 70B recently was. IIRC, I had to consent on Hugging Face to their policy on “research and non-commercial use only” before downloading the model.

https://github.com/ggerganov/llama.cpp/discussions/4576


Well, that's why I am asking: Facebook has not made any legally binding promise to release the weights, and people just assume/hope they will release weights. But OP's 'releasing it' will cover anything from a blog post to an API to a chatbot service in WhatsApp. The exact date matters far less than whether there is any new information on the weights being released.


Excited


I can’t pass the paywall. Does the article state whether it will be under the same license as the smaller models?


Nah, it’s a “brief” (as they refer to their blog-like posts) article, so it was short. But it appears Meta is releasing it the same as their other open models.

“Meta Platforms plans to release the largest version of its open-source Llama 3 model on July 23, according to a Meta employee. This version, with 405 billion parameters, or the "settings" that determine how AI models respond to questions, will also be multimodal, meaning that it will be able to understand and generate images and text, The Information previously reported.

Meta in April released two smaller models within the Llama 3 family, with 8 billion and 70 billion parameters, which developers quickly embraced. The earlier launches served to build excitement for the largest Llama 3 model, which was expected to be launched around now. Its release comes about a year after the launch of Llama 2. Meta declined to comment.”



