Search | arXiv e-print repository

Showing 1–11 of 11 results for author: Schrimpf, M

  1. arXiv:2406.15109  [pdf, other]

    cs.CL cs.LG

    Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network

    Authors: Badr AlKhamissi, Greta Tuckute, Antoine Bosselut, Martin Schrimpf

    Abstract: Large Language Models (LLMs) have been shown to be effective models of the human language system, with some models predicting most explainable variance of brain activity in current datasets. Even in untrained models, the representations induced by architectural priors can exhibit reasonable alignment to brain data. In this work, we investigate the key architectural components driving the surprisin…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Preprint

  2. arXiv:2312.00575  [pdf, other]

    cs.CL

    Instruction-tuning Aligns LLMs to the Human Brain

    Authors: Khai Loong Aw, Syrielle Montariol, Badr AlKhamissi, Martin Schrimpf, Antoine Bosselut

    Abstract: Instruction-tuning is a widely adopted method of finetuning that enables large language models (LLMs) to generate output that more closely resembles human responses to natural language queries, in many cases leading to human-level performance on diverse testbeds. However, it remains unclear whether instruction-tuning truly makes LLMs more similar to how humans process language. We investigate the…

    Submitted 1 December, 2023; originally announced December 2023.

  3. Beyond linear regression: mapping models in cognitive neuroscience should align with research goals

    Authors: Anna A. Ivanova, Martin Schrimpf, Stefano Anzellotti, Noga Zaslavsky, Evelina Fedorenko, Leyla Isik

    Abstract: Many cognitive neuroscience studies use large feature sets to predict and interpret brain activity patterns. Feature sets take many forms, from human stimulus annotations to representations in deep neural networks. Of crucial importance in all these studies is the mapping model, which defines the space of possible relationships between features and neural data. Until recently, most encoding and de…

    Submitted 22 August, 2022; originally announced August 2022.

    Comments: Accepted at Neurons, Behavior, Data analysis, and Theory

    Journal ref: Neurons, Behavior, Data analysis, and Theory, 2022

  4. arXiv:2007.04954  [pdf, other]

    cs.CV cs.GR cs.LG cs.RO

    ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation

    Authors: Chuang Gan, Jeremy Schwartz, Seth Alter, Damian Mrowca, Martin Schrimpf, James Traer, Julian De Freitas, Jonas Kubilius, Abhishek Bhandwaldar, Nick Haber, Megumi Sano, Kuno Kim, Elias Wang, Michael Lingelbach, Aidan Curtis, Kevin Feigelis, Daniel M. Bear, Dan Gutfreund, David Cox, Antonio Torralba, James J. DiCarlo, Joshua B. Tenenbaum, Josh H. McDermott, Daniel L. K. Yamins

    Abstract: We introduce ThreeDWorld (TDW), a platform for interactive multi-modal physical simulation. TDW enables simulation of high-fidelity sensory data and physical interactions between mobile agents and objects in rich 3D environments. Unique properties include: real-time near-photo-realistic image rendering; a library of objects and environments, and routines for their customization; generative procedu…

    Submitted 28 December, 2021; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: Oral Presentation at NeurIPS 21 Datasets and Benchmarks Track. Project page: http://www.threedworld.org

  5. arXiv:1912.04783  [pdf, other]

    cs.LG cs.CV stat.ML

    Frivolous Units: Wider Networks Are Not Really That Wide

    Authors: Stephen Casper, Xavier Boix, Vanessa D'Amario, Ling Guo, Martin Schrimpf, Kasper Vinken, Gabriel Kreiman

    Abstract: A remarkable characteristic of overparameterized deep neural networks (DNNs) is that their accuracy does not degrade when the network's width is increased. Recent evidence suggests that developing compressible representations is key for adjusting the complexity of large networks to the learning task at hand. However, these compressible representations are poorly understood. A promising strand of r…

    Submitted 31 May, 2021; v1 submitted 10 December, 2019; originally announced December 2019.

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 2021

  6. arXiv:1909.06161  [pdf, other]

    cs.CV cs.LG cs.NE eess.IV q-bio.NC

    Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs

    Authors: Jonas Kubilius, Martin Schrimpf, Kohitij Kar, Ha Hong, Najib J. Majaj, Rishi Rajalingham, Elias B. Issa, Pouya Bashivan, Jonathan Prescott-Roy, Kailyn Schmidt, Aran Nayebi, Daniel Bear, Daniel L. K. Yamins, James J. DiCarlo

    Abstract: Deep convolutional artificial neural networks (ANNs) are the leading class of candidate models of the mechanisms of visual processing in the primate ventral stream. While initially inspired by brain anatomy, over the past years, these ANNs have evolved from a simple eight-layer architecture in AlexNet to extremely deep and branching architectures, demonstrating increasingly better object categoriz…

    Submitted 28 October, 2019; v1 submitted 13 September, 2019; originally announced September 2019.

    Comments: NeurIPS 2019 (Oral). Code available at https://github.com/dicarlolab/neurips2019

  7. arXiv:1904.09330  [pdf, other]

    cs.NE

    Continual Learning with Self-Organizing Maps

    Authors: Pouya Bashivan, Martin Schrimpf, Robert Ajemian, Irina Rish, Matthew Riemer, Yuhai Tu

    Abstract: Despite remarkable successes achieved by modern neural networks in a wide range of applications, these networks perform best in domain-specific stationary environments where they are trained only once on large-scale controlled data repositories. When exposed to non-stationary learning environments, current neural networks tend to forget what they had previously learned, a phenomenon known as catast…

    Submitted 19 April, 2019; originally announced April 2019.

    Comments: Continual Learning Workshop - NeurIPS 2018

  8. arXiv:1712.07316  [pdf, other]

    cs.CL cs.LG stat.ML

    A Flexible Approach to Automated RNN Architecture Generation

    Authors: Martin Schrimpf, Stephen Merity, James Bradbury, Richard Socher

    Abstract: The process of designing neural architectures requires expert knowledge and extensive trial and error. While automated architecture search may simplify these requirements, the recurrent neural network (RNN) architectures generated by existing methods are limited in both flexibility and components. We propose a domain-specific language (DSL) for use in automated architecture search which can produc…

    Submitted 19 December, 2017; originally announced December 2017.

  9. arXiv:1706.02240  [pdf]

    q-bio.NC cs.AI cs.CV cs.LG

    Recurrent computations for visual pattern completion

    Authors: Hanlin Tang, Martin Schrimpf, Bill Lotter, Charlotte Moerman, Ana Paredes, Josue Ortega Caro, Walter Hardesty, David Cox, Gabriel Kreiman

    Abstract: Making inferences from partial information constitutes a critical aspect of cognition. During visual perception, pattern completion enables recognition of poorly visible or occluded objects. We combined psychophysics, physiology and computational models to test the hypothesis that pattern completion is implemented by recurrent computations and present three pieces of evidence that are consistent w…

    Submitted 6 April, 2018; v1 submitted 7 June, 2017; originally announced June 2017.

  10. arXiv:1703.08245  [pdf, other]

    cs.LG cs.CV

    On the Robustness of Convolutional Neural Networks to Internal Architecture and Weight Perturbations

    Authors: Nicholas Cheney, Martin Schrimpf, Gabriel Kreiman

    Abstract: Deep convolutional neural networks are generally regarded as robust function approximators. So far, this intuition is based on perturbations to external stimuli such as the images to be classified. Here we explore the robustness of convolutional neural networks to perturbations to the internal weights and architecture of the network itself. We show that convolutional networks are surprisingly robu…

    Submitted 23 March, 2017; originally announced March 2017.

    Comments: Under review at ICML 2017

  11. arXiv:1611.08903  [pdf, other]

    cs.LG stat.ML

    Should I use TensorFlow

    Authors: Martin Schrimpf

    Abstract: Google's Machine Learning framework TensorFlow was open-sourced in November 2015 [1] and has since built a growing community around it. TensorFlow is supposed to be flexible for research purposes while also allowing its models to be deployed productively. This work is aimed towards people with experience in Machine Learning considering whether they should use TensorFlow in their environment. Sever…

    Submitted 27 November, 2016; originally announced November 2016.

    Comments: Seminar Paper