Publications

SmileyLlama: Modifying Large Language Models for Directed Chemical Space Exploration

Joseph M. Cavanagh, Kunyang Sun, Andrew Gritsevskiy, Dorian Bagni, Thomas D. Bannister, Teresa Head-Gordon

Link: https://arxiv.org/abs/2409.02231, or cvnd.sh/sl

Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits

Andis Draguns*, Andrew Gritsevskiy*, Sumeet Ramesh Motwani, Charlie Rogers-Smith, Jeffrey Ladish, Christian Schroeder de Witt

Link: https://arxiv.org/abs/2406.02619, or cvnd.sh/backdoor

REBUS: A Robust Evaluation Benchmark of Understanding Symbols

Andrew Gritsevskiy, Arjun Panickssery, Aaron Kirtland, Derik Kauffman, Hans Gundlach, Irina Gritsevskaya, Joe Cavanagh, Jonathan Chiang, Lydia La Roux, Michelle Hung

Link: https://arxiv.org/abs/2401.05604, or cvnd.sh/rebus

Inverse Scaling: When Bigger Isn't Better

Ian R. McKenzie, Alexander Lyzhov, Michael Pieler, Alicia Parrish, Aaron Mueller, Ameya Prabhu, Euan McLean, Aaron Kirtland, Alexis Ross, Alisa Liu, Andrew Gritsevskiy, Daniel Wurgaft, Derik Kauffman, Gabriel Recchia, Jiacheng Liu, Joe Cavanagh, Max Weiss, Sicong Huang, The Floating Droid, Tom Tseng, Tomasz Korbak, Xudong Shen, Yuhui Zhang, Zhengping Zhou, Najoung Kim, Samuel R. Bowman, Ethan Perez

Link: https://arxiv.org/abs/2306.09479, or cvnd.sh/invs

Predicting the Future of AI with AI: High-quality link prediction in an exponentially growing knowledge network

Mario Krenn, Lorenzo Buffoni, Bruno Coutinho, Sagi Eppel, Jacob Gates Foster, Andrew Gritsevskiy, Harlin Lee, Yichao Lu, Joao P. Moutinho, Nima Sanjabi, Rishi Sonthalia, Ngoc Mai Tran, Francisco Valente, Yangxinyu Xie, Rose Yu, Michael Kopp

Link: https://www.nature.com/articles/s42256-023-00735-0, or cvnd.sh/futureai

An Unstructured Mesh Approach to Nonlinear Noise Reduction for Coupled Systems

Aaron Kirtland, Jonah Botvinick-Greenhouse, Marianne DeBrito, Megan Osborne, Casey Johnson, Robert S. Martin, Samuel J. Araki, Daniel Q. Eckhardt

Link: https://arxiv.org/abs/2209.05944, or cvnd.sh/mesh
