Deepscaler - Search News

DeepScaler Tiny 1.5B DeepSeek R1 Clone Beats OpenAI o1-Preview at Maths

A research team at Berkeley has introduced an innovative artificial intelligence model, DeepScaler, that challenges traditional assumptions about AI performance. With a modest size of just 1.5 billion ...

GitHub

jacovanwyk/DeepScaleR_1.5B

DeepScaleR is an open-source project to fully democratize reinforcement learning (RL) for LLMs and reproduce DeepSeek R1 and OpenAI O1/O3 at scale on real tasks. For all releases, we open source all ...

note

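[Generative AI News] "Magic 1-For-1", "VidCRAFT3", "Enhance-A-Video", "Huginn ...

This is an archive of past news, but it introduces AI tools that can be useful when you are stuck. It also covers upgraded versions and detailed information on tools featured in the latest news. Because this is a one-time-purchase magazine rather than a monthly subscription, once you buy it ...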

PR TIMES

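Ultra-compact LLM specialized in mathematical reasoning and multitasking: "QwQ-32B-Distill-Qwen-1.5B ...

Axcxept Inc. today released as open source the ultra-compact language model (LLM) "QwQ-32B-Distill-Qwen-1.5B-Alpha", which uses just two days of reinforcement learning to further improve multitask and mathematical reasoning performance that had previously plateaued. As for this model, deepseek-ai's long-reasoning model, DeepSeek-R1 ...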

GitHub

grpo-deepscaler.md

uv run examples/run_grpo.py --config=examples/configs/recipes/llm/grpo-deepscaler-1.5b-8K.yaml
uv run examples/run_grpo.py --config=examples/configs/recipes/llm/grpo ...

marktechpost

This AI Paper Introduces FASTCURL: A Curriculum Reinforcement Learning Framework with ...

Large language models have transformed how machines comprehend and generate text, especially in complex problem-solving areas like mathematical reasoning. These systems, known as R1-like models, are ...
