User profiles for Sepp Hochreiter

Sepp Hochreiter

Institute for Machine Learning, Johannes Kepler University Linz
Verified email at ml.jku.at
Cited by 149688

Long short-term memory

S Hochreiter, J Schmidhuber - Neural computation, 1997 - ieeexplore.ieee.org
Learning to store information over extended time intervals by recurrent backpropagation takes
a very long time, mostly because of insufficient, decaying error backflow. We briefly review …

The vanishing gradient problem during learning recurrent neural nets and problem solutions

S Hochreiter - International Journal of Uncertainty, Fuzziness and …, 1998 - World Scientific
Recurrent nets are in principle capable to store past inputs to produce the currently desired
output. Because of this property recurrent nets are used in time series prediction and process …

GANs trained by a two time-scale update rule converge to a local Nash equilibrium

…, B Nessler, S Hochreiter - Advances in neural …, 2017 - proceedings.neurips.cc
Generative Adversarial Networks (GANs) excel at creating realistic images with complex
models for which maximum likelihood is infeasible. However, the convergence of GAN training …

Fast and accurate deep network learning by exponential linear units (ELUs)

DA Clevert, T Unterthiner, S Hochreiter - arXiv preprint arXiv:1511.07289, 2015 - arxiv.org
We introduce the "exponential linear unit" (ELU) which speeds up learning in deep neural
networks and leads to higher classification accuracies. Like rectified linear units (ReLUs), …

Self-normalizing neural networks

…, T Unterthiner, A Mayr, S Hochreiter - Advances in neural …, 2017 - proceedings.neurips.cc
Deep Learning has revolutionized vision via convolutional neural networks (CNNs) and natural
language processing via recurrent neural networks (RNNs). However, success stories of …

[PDF] Gradient flow in recurrent nets: the difficulty of learning long-term dependencies

S Hochreiter, Y Bengio, P Frasconi, J Schmidhuber - 2001 - researchgate.net
Recurrent networks (crossreference Chapter 12) can, in principle, use their feedback
connections to store representations of recent input events in the form of activations. The most …

[HTML] DeepTox: toxicity prediction using deep learning

…, G Klambauer, T Unterthiner, S Hochreiter - Frontiers in …, 2016 - frontiersin.org
The Tox21 Data Challenge has been the largest effort of the scientific community to
compare computational methods for toxicity prediction. This challenge comprised 12,000 …

LSTM can solve hard long time lag problems

S Hochreiter, J Schmidhuber - Advances in neural …, 1996 - proceedings.neurips.cc
Standard recurrent nets cannot deal with long minimal time lags between relevant signals.
Several recent NIPS papers propose alternative methods. We first show: problems …

Flat minima

S Hochreiter, J Schmidhuber - Neural computation, 1997 - direct.mit.edu
We present a new algorithm for finding low-complexity neural networks with high
generalization capability. The algorithm searches for a “flat” minimum of the error function. A flat …

Learning to learn using gradient descent

S Hochreiter, AS Younger, PR Conwell - Artificial Neural Networks …, 2001 - Springer
This paper introduces the application of gradient descent methods to meta-learning. The
concept of “meta-learning”, i.e., of a system that improves or discovers a learning algorithm, has …