A new study compares the performance of different language models, and its main finding is: size doesn’t always matter.
Large language models (LLMs) reach up to 530 billion parameters, which drives significant training and inference costs. The study shows that much smaller models such as Chinchilla (70 billion parameters) can outperform their larger peers, especially when the number of training tokens is increased.
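The trade-off behind that finding is often summarized as a rule of thumb from the Chinchilla study: for compute-optimal training, use roughly 20 training tokens per model parameter. A minimal sketch of that arithmetic (the 20:1 ratio is an approximation, not an exact law):

```python
def chinchilla_optimal_tokens(n_params: float) -> float:
    """Rough compute-optimal token budget: ~20 tokens per parameter."""
    return 20 * n_params

# Chinchilla itself: 70 billion parameters -> ~1.4 trillion training tokens
tokens = chinchilla_optimal_tokens(70e9)
print(f"{tokens / 1e12:.1f} trillion tokens")
```

Under this rule, a smaller model fed far more data can match or beat a much larger one trained on fewer tokens.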
The conclusions we can draw from this are:
- It is indeed possible to train a well-performing language model using only publicly available data. AI is here to stay, regardless of the licensing wars we will see with OpenAI and others.
- It is possible for companies to add their own "language" to existing models at a manageable price tag.
- You should not stick to one model, but stay flexible and ready to switch, comparing results by testing, testing, testing.
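The last point can be sketched as a tiny harness that routes the same prompt through several interchangeable models. The "models" below are hypothetical stubs (plain functions) standing in for real API calls; the point is the pattern, not the providers:

```python
# Hypothetical stubs standing in for real model backends.
def model_a(prompt: str) -> str:
    return f"[model_a] answer to: {prompt}"

def model_b(prompt: str) -> str:
    return f"[model_b] answer to: {prompt}"

# Registry of interchangeable models: swapping one out is a one-line change.
MODELS = {"model_a": model_a, "model_b": model_b}

def compare(prompt: str) -> dict:
    """Run the same prompt through every registered model for side-by-side testing."""
    return {name: fn(prompt) for name, fn in MODELS.items()}

for name, answer in compare("Summarize the study in one sentence.").items():
    print(name, "->", answer)
```

Keeping the call sites behind a registry like this makes the "testing, testing, testing" loop cheap: each candidate model is just another entry in the dict.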
(Dall-E prompt for header picture: "Create a picture where the language model "Goliath" is being beaten by the language model "Chinchilla". Make it a fantasy picture, with Goliath as a big, fat bear, whereas Chinchilla is a very strong mouse.")