Model Quantization - Search News

1 day ago

"QAT" sharply cuts the memory consumption of "Gemma 4" while keeping quality intact ...

On June 5 (local time), US-based Google DeepMind, for its "Gemma 4" family of open models, the "Quantization-Aware ...

3 days ago

The latest Gemma 4 models use a training trick to slash their on-device memory footprint

You can now download Gemma 4 models with quantization-aware training to reduce the amount of mobile memory required to 1GB.

1 day ago on MSN

Google brings "QAT," a memory-saving technique for running AI locally on smartphones and laptops, to Gemma 4; Gemma 4 E2B runs in just 0.84GB of memory

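Running AI requires a large amount of memory, and "quantization" is widely used as a technique for reducing the memory usage of AI models. Now, using an approach that "simulates quantization during the training stage," Google's memory-saving Gemma ...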

InfoWorld

What is model quantization? Smaller, faster LLMs

Reducing the precision of model weights can make deep neural networks run faster in less GPU memory, while preserving model accuracy. If ever there were a salient example of a counter-intuitive ...
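
To make the idea in the snippet above concrete, here is a minimal sketch of weight quantization in plain Python. It assumes symmetric, per-tensor 8-bit quantization, which is only one of several common schemes and is not taken from any of the articles listed here. The function names `quantize_int8` and `dequantize` are hypothetical, chosen for illustration.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] plus one shared scale.

    Assumption: symmetric per-tensor quantization (a single scale for
    the whole list, zero maps to zero).
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


weights = [0.4, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Each restored weight differs from the original by at most half a step.
max_error = max(abs(r - w) for r, w in zip(restored, weights))
print(codes, max_error <= scale / 2 + 1e-12)
```

Each code fits in one byte instead of the four bytes of a 32-bit float, which is where the memory saving comes from; the cost is a rounding error bounded by half the scale per weight.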

19 days ago

Cohere cracks lossless quantization and native citations with first full Apache 2.0 ...

Using special tags embedded in the output, the model directly links every factual claim it makes to the specific source ...

PC Watch on MSN

Google offers free QAT models that run Gemma 4 in under 1GB of memory with little quality degradation

On June 5, Google DeepMind released "QAT (Quantization-Aware Training)" optimized checkpoints that reduce the memory requirements of the large language model "Gemma 4" while maximizing performance. Hugging ...

TechCrunch

A popular technique to make AI more efficient has drawbacks

One of the most widely used techniques to make AI models more efficient, quantization, has limits — and the industry could be fast approaching them. In the context of AI, quantization refers to ...

GIGAZINE

Dynamic quantization model that reduces the size of DeepSeek-R1 by up to 80% is now available

DeepSeek-R1, released by a Chinese AI company, has the same performance as OpenAI's inference model o1, but its model data is open source. Unsloth, an AI development team run by two brothers, Daniel ...
