Google Scholar

User profiles for Mengdi Wang

Mengdi Wang

Center for Statistics & Machine Learning, ECE, Princeton University

Verified email at princeton.edu

Cited by 5375

[PDF] arxiv.org

Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions

M Wang, EX Fang, H Liu - Mathematical Programming, 2017 - Springer

Classical stochastic gradient methods are well suited for minimizing expected-value objective
functions. However, they do not apply to the minimization of a nonlinear function involving …

Cited by 257

[PDF] mlr.press

Model-based reinforcement learning with value-targeted regression

…, Z Jia, C Szepesvari, M Wang… - … on Machine Learning, 2020 - proceedings.mlr.press

This paper studies model-based reinforcement learning (RL) for regret minimization. We focus
on finite-horizon episodic RL where the transition model $P$ belongs to a known family …

Cited by 302

[HTML] plos.org

Vascularized human cortical organoids (vOrganoids) model cortical development in vivo

Y Shi, L Sun, M Wang, J Liu, S Zhong, R Li, P Li… - PLoS …, 2020 - journals.plos.org

Modeling the processes of neuronal progenitor proliferation and differentiation to produce
mature cortical neuron subtypes is essential for the study of human brain development and the …

Cited by 242

[PDF] mlr.press

Minimax-optimal off-policy evaluation with linear function approximation

Y Duan, Z Jia, M Wang - International Conference on …, 2020 - proceedings.mlr.press

This paper studies the statistical theory of off-policy evaluation with function approximation
in batch data reinforcement learning problem. We consider a regression-based fitted Q-…

Cited by 155

[PDF] mlr.press

Sample-optimal parametric Q-learning using linearly additive features

L Yang, M Wang - International conference on machine …, 2019 - proceedings.mlr.press

Consider a Markov decision process (MDP) that admits a set of state-action features, which
can linearly express the process’s probabilistic transition model. We propose a parametric Q-…

Cited by 324

[PDF] arxiv.org

1xn pattern for pruning convolutional neural networks

…, Y Li, B Chen, F Chao, M Wang… - … on Pattern Analysis …, 2022 - ieeexplore.ieee.org

Though network pruning receives popularity in reducing the complexity of convolutional
neural networks (CNNs), it remains an open issue to concurrently maintain model accuracy as …

Cited by 42

[PDF] neurips.cc

Near-optimal time and sample complexities for solving Markov decision processes with a generative model

A Sidford, M Wang, X Wu, L Yang… - Advances in Neural …, 2018 - proceedings.neurips.cc

In this paper we consider the problem of computing an $\epsilon$-optimal policy of a discounted
Markov Decision Process (DMDP) provided we can only access its transition function …

Cited by 234

[PDF] arxiv.org

Approximation methods for bilevel programming

S Ghadimi, M Wang - arXiv preprint arXiv:1802.02246, 2018 - arxiv.org

In this paper, we study a class of bilevel programming problem where the inner objective
function is strongly convex. More specifically, under some mild assumptions on the partial …

Cited by 204

[PDF] mlr.press

Reinforcement learning in feature space: Matrix bandit, kernels, and regret bound

L Yang, M Wang - International Conference on Machine …, 2020 - proceedings.mlr.press

Exploration in reinforcement learning (RL) suffers from the curse of dimensionality when the
state-action space is large. A common practice is to parameterize the high-dimensional …

Cited by 303

[PDF] aaai.org

Visual adversarial examples jailbreak aligned large language models

…, K Huang, A Panda, P Henderson, M Wang… - Proceedings of the …, 2024 - ojs.aaai.org

Warning: this paper contains data, prompts, and model outputs that are offensive in nature.
Recently, there has been a surge of interest in integrating vision into Large Language Models …

Cited by 41
