User profiles for Jacob Steinhardt
Jacob Steinhardt, Stanford University. Verified email at cs.stanford.edu. Cited by 16707
Natural adversarial examples
…, K Zhao, S Basart, J Steinhardt… - Proceedings of the …, 2021 - openaccess.thecvf.com
We introduce two challenging datasets that reliably cause machine learning model performance
to substantially degrade. The datasets are collected with a simple adversarial filtration …
Certified defenses against adversarial examples
While neural networks have achieved high accuracy on standard image classification
benchmarks, their accuracy drops to nearly zero in the presence of small adversarial perturbations …
Unsolved problems in ML safety
Machine learning (ML) systems are rapidly increasing in size, are acquiring new capabilities,
and are increasingly deployed in high-stakes settings. As with other powerful technologies, …
The malicious use of artificial intelligence: Forecasting, prevention, and mitigation
This report surveys the landscape of potential security threats from malicious uses of AI, and
proposes ways to better forecast, prevent, and mitigate these threats. After analyzing the …
Concrete problems in AI safety
Rapid progress in machine learning and artificial intelligence (AI) has brought increasing
attention to the potential impacts of AI technologies on society. In this paper we discuss one …
Grounding representation similarity through statistical testing
To understand neural network behavior, recent works quantitatively compare different
networks' learned representations using canonical correlation analysis (CCA), centered kernel …
Robust moment estimation and improved clustering via sum of squares
We develop efficient algorithms for estimating low-degree moments of unknown distributions
in the presence of adversarial outliers and design a new family of convex relaxations for k-…
Measuring massive multitask language understanding
We propose a new test to measure a text model's multitask accuracy. The test covers 57 tasks
including elementary mathematics, US history, computer science, law, and more. To attain …
Certified defenses for data poisoning attacks
J Steinhardt, PWW Koh… - Advances in neural …, 2017 - proceedings.neurips.cc
Machine learning systems trained on user-provided data are susceptible to data
poisoning attacks, whereby malicious users inject false training data with the aim of corrupting …
The many faces of robustness: A critical analysis of out-of-distribution generalization
We introduce four new real-world distribution shift datasets consisting of changes in image
style, image blurriness, geographic location, camera operation, and more. With our new …