synopsis
About the workshop.

Representation learning
Representation learning is the current de facto research paradigm in Computer Vision, building on deep learning methods trained on large datasets to learn visual representations. In this ECCV workshop we explore what makes representation learning research “scientific”. Other scientific research fields revolve around empirical theories, e.g., the theory of evolution (Darwin, 1859) in Biology, the theory of relativity (Einstein, 1916) in Physics, and dual process theory (Kahneman, 2011) in Psychology. In representation learning, however, the role of empirical theories is less clear. This workshop promotes empirical theory research in representation learning. While doing so, it explores questions such as: What is empirical theory? What empirical theories do we (implicitly) have? How can we encourage rigorous empirical research methods? How can we build empirical theories?
Machine learning theory vs empirical theory
Empirical theory differs from what is typically called theory in machine learning. In machine learning, computational learning theory (Shalev-Shwartz, 2014) is rigorously math-driven, exploring mathematical performance bounds, time complexity, and feasibility of learning. It has important and deep mathematical results, including the no-free-lunch theorems (Wolpert & Macready, 1997), universal approximation (Cybenko, 1989), VC theory (Vapnik, 1995), PAC learning (Valiant, 1984), Rademacher complexity (Shalev-Shwartz, 2014), and the neural tangent kernel (Jacot et al., 2021). In representation learning, however, models are trained by searching through a high-dimensional non-convex solution space while navigating an exponential search space of hyper-parameters (Choi et al., 2019); (Feurer & Hutter, 2019). This precludes standard ML practices for hyper-parameter tuning such as (nested) cross-validation (Krstajic et al., 2014), which is prohibitively computationally expensive given the large datasets involved and is thus simply not done in practice (Bouthillier et al., 2021); (Bouthillier et al., 2019). Precise, rigorous, theoretical optimality is considered too restrictive and is seen as a practically unnecessary, artificial hurdle. Optimality is replaced with empirical ‘good enough’ approximate optimization. This workshop does not focus on mathematical proofs; instead, it focuses on empirical experimental evidence.
Empirical evidence and benchmarking in representation learning
Empirical evidence is crucial in representation learning. Many papers experimentally demonstrate that a method can be engineered to substantially improve upon existing state-of-the-art benchmark scores. That such a benchmark-breaking method even exists is valuable existential empirical evidence and propels the field (Hardt, 2025); (Krizhevsky, 2014); (Russakovsky et al., 2015). Often, however, it is not clear where the improvements originate (de Boer et al., 2023); (Bouthillier et al., 2021); (Lucic et al., 2018); (Musgrave et al., 2020); (Musgrave et al., 2021). Without such understanding, these empirical results lack empirical theoretical hypotheses: a clear, causal link to the reasons that underlie the improvement. If, for example, an improvement is due to better-tuned hyper-parameters, this does not increase our understanding, because we already know that better hyper-parameter tuning helps (Anand et al., 2020); (Bouthillier et al., 2019); (Brigato et al., 2021); (Picard, 2021). What would make such a result interesting is evidence that the method for finding these hyper-parameters generalizes to other work, together with an account of how this hypothesis can be empirically justified. In this workshop, we aim to go beyond individual systems that work well and instead aim for empirical theory: findings that generalize beyond idiosyncratic combinations of datasets, hyper-parameter settings, and accidental optimization minima. We promote hypothesis-driven empirical research that gives insight; breaking SOTA is neither sufficient nor necessary.
Empirical theory: rigor in experimental evidence
With empirical theory, we aim for a sweet spot between theoretical mathematical models on one side and purely empirical benchmark-breaking systems on the other. It is about tracing the sources of empirical gain (Lipton & Steinhardt, 2019), and explicitly providing experimental, hypothesis-driven, empirical evidence that separates explanation from speculation (Lipton & Steinhardt, 2019). It is about understanding the training and evaluation of deep learning models, their design and their components, optimizers, and losses, and how this generalizes over problem and dataset types. We aim to understand when a method is applicable in other work, and what makes it so. This type of research, of course, already exists in the broader representation learning literature. Examples include the shift-invariance of CNNs (Chaman & Dokmanic, 2021); (Kayhan & Gemert, 2020); (Zhang, 2019), their kernel size (Ding et al., 2022); (Grabinski et al., 2023); (Tomen & van Gemert, 2021), residual connection variants (Greff et al., 2017); (Veit et al., 2016); (Zhu et al., 2024), summing or multiplying activations (Ma et al., 2024), gating (Qiu et al., 2025), registers in transformers (Darcet et al., 2024); (Jiang et al., 2025); (Shi et al., 2026), object-centric learning (Rubinstein et al., 2025), subliminal learning (Schrodi et al., 2025), and even empirical reproducibility (Bouthillier et al., 2019); (Pineau et al., 2021); (Raff, 2019); (Yildiz et al., 2021), among many others. Beyond such insight-driven papers, there is work on explicit empirical theory building, including empirical neural scaling laws (Bahri et al., 2024), the lottery ticket hypothesis (Frankle & Carbin, 2018); (Pinson, 2026), and the Platonic representation hypothesis (Huh et al., 2024). This workshop shines a spotlight on such work, aiming to incentivize empirical insights and empirical theory building for the entire field of representation learning.
Relation to other workshops and initiatives
Pre-registration in machine learning was explored at NeurIPS in 2021 in the Pre-registration workshop: An alternative publication model for machine learning research (Albanie et al., 2021). The ML-Retrospectives, Surveys & Meta-Analyses workshop (Yadav et al., 2020), hosted at NeurIPS 2019, ICML 2020, and NeurIPS 2020, and the recent Metascience for Machine Learning (Hung et al., 2025) initiative are related to meta-science for representation learning. The recent Mechanistic Interpretability Workshop (Nanda et al., 2025) at ICML 2024 and NeurIPS 2025 and the Workshop on Scientific Methods for Understanding Deep Learning (Kadkhodaie et al., 2026) at ICLR 2025/2026 are great examples of what we aim to achieve, albeit now in representation learning. We will make use of pre-registration to separate interesting questions from outcomes, and are inspired by meta-science.
Here, we aim to foster and build a community for understanding-based research, which is currently scattered over multiple venues and mixed in with improvement-based research.
Bibliography
2026
- Vision Transformers Need More Than Registers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2026
- It’s not a Lottery, it’s a Race: Understanding How Gradient Descent Adapts the Network’s Capacity to the Task. 2026
2025
- Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free. arXiv preprint arXiv:2505.06708, 2025
- Vision Transformers Don’t Need Trained Registers. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025
- Are we done with object-centric learning? arXiv preprint arXiv:2504.07092, 2025
- Towards Understanding Subliminal Learning: When and How Hidden Biases Transfer. arXiv preprint arXiv:2509.23886, 2025
2024
- Hyper-Connections. In The Thirteenth International Conference on Learning Representations, 2024
- Rewrite the Stars. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2024
- Vision Transformers Need Registers. In International Conference on Learning Representations (ICLR), 2024
- Explaining neural scaling laws. Proceedings of the National Academy of Sciences, 2024
- Position: The platonic representation hypothesis. In Forty-first International Conference on Machine Learning, 2024
2023
- Is there progress in activity progress prediction? In 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2023
- Fix your downsampling asap! be natively more robust via aliasing and spectral artifact free pooling. arXiv preprint arXiv:2307.09804, 2023
2022
- Scaling up your kernels to 31x31: Revisiting large kernel design in cnns. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022
2021
- Neural tangent kernel: convergence and generalization in neural networks. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, 2021
- Accounting for variance in machine learning benchmarks. Proceedings of Machine Learning and Systems, 2021
- Unsupervised domain adaptation: A reality check. arXiv preprint arXiv:2111.15672, 2021
- Tune It or Don’t Use It: Benchmarking Data-Efficient Image Classification. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, Oct 2021
- Torch.manual_seed(3407) is all you need: On the influence of random seeds in deep learning architectures for computer vision. arXiv, 2021
- Truly Shift-Invariant Convolutional Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2021
- Spectral leakage and rethinking the kernel size in cnns. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021
- Improving reproducibility in machine learning research (a report from the neurips 2019 reproducibility program). Journal of machine learning research, 2021
- ReproducedPapers.org: Openly teaching and structuring machine learning reproducibility. In International Workshop on Reproducible Research in Pattern Recognition, 2021
- The pre-registration workshop: An alternative publication model for machine learning research. 2021
2020
- A metric learning reality check. In European Conference on Computer Vision, 2020
- Black magic in deep learning: How human skill impacts network training. arXiv preprint arXiv:2008.05981, 2020
- On translation invariance in cnns: Convolutional layers can exploit absolute spatial location. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020
2019
- On empirical comparisons of optimizers for deep learning. arXiv preprint arXiv:1910.05446, 2019
- Hyperparameter optimization. In Automated machine learning: Methods, systems, challenges, 2019
- Unreproducible Research is Reproducible. In Proceedings of the 36th International Conference on Machine Learning, 2019
- Research for practice: troubling trends in machine-learning scholarship. Communications of the ACM, 2019
- Making convolutional networks shift-invariant again. In International conference on machine learning, 2019
- A Step Toward Quantifying Independently Reproducible Machine Learning Research. 2019
2018
- Are gans created equal? a large-scale study. Advances in neural information processing systems, 2018
- The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635, 2018
2017
- Highway and Residual Networks learn Unrolled Iterative Estimation. In International Conference on Learning Representations, 2017
2016
- Residual networks behave like ensembles of relatively shallow networks. Advances in neural information processing systems, 2016
2015
- ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 2015
2014
- Understanding Machine Learning. 2014
- Cross-validation pitfalls when selecting and assessing regression and classification models. Journal of cheminformatics, 2014
- One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997, 2014
2011
- Thinking, Fast and Slow. 2011
1997
- No Free Lunch Theorems for Optimization. IEEE Transactions on Evolutionary Computation, 1997
1995
- The nature of statistical learning theory. 1995
1989
- Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 1989
1984
- A theory of the learnable. Commun. ACM, 1984
1916
- The Foundation of the General Theory of Relativity. 1916
1859
- On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. 1859