In a new study, Redwood Research, an AI alignment research lab, reports that large language models (LLMs) can master "encoded reasoning," a form of steganography. This intriguing phenomenon ...
Chinese tech giant Baidu has unveiled a breakthrough in artificial intelligence that could make language models more reliable and trustworthy. Researchers at the company have created a novel ...
This study establishes a novel framework for systematically evaluating the moral reasoning capabilities of large language models (LLMs) as they are increasingly integrated into critical societal domains.
We thank Sorin et al. for highlighting critical considerations for future red teaming of large language models (LLMs) in healthcare. We agree that analyzing only final answers overlooks failures ...