A Reinforcement Learning-based orchestrator for a multi-agent code generation system, enhanced with Retrieval-Augmented Generation (RAG) for improved code quality. The RL agent learns to optimally ...
This repo implements an RL attribution project. Unlike in supervised learning, the policy decides what data to collect, so attribution is more complicated. The main thing it is trying to ...