User profiles for Mengdi Wang

Mengdi Wang

Center for Statistics & Machine Learning, ECE, Princeton University
Verified email at princeton.edu
Cited by 5375

Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions

M Wang, EX Fang, H Liu - Mathematical Programming, 2017 - Springer
Classical stochastic gradient methods are well suited for minimizing expected-value objective
functions. However, they do not apply to the minimization of a nonlinear function involving …

Model-based reinforcement learning with value-targeted regression

…, Z Jia, C Szepesvari, M Wang… - … on Machine Learning, 2020 - proceedings.mlr.press
This paper studies model-based reinforcement learning (RL) for regret minimization. We focus
on finite-horizon episodic RL where the transition model $P$ belongs to a known family …

Vascularized human cortical organoids (vOrganoids) model cortical development in vivo

Y Shi, L Sun, M Wang, J Liu, S Zhong, R Li, P Li… - PLoS …, 2020 - journals.plos.org
Modeling the processes of neuronal progenitor proliferation and differentiation to produce
mature cortical neuron subtypes is essential for the study of human brain development and the …

Minimax-optimal off-policy evaluation with linear function approximation

Y Duan, Z Jia, M Wang - International Conference on …, 2020 - proceedings.mlr.press
This paper studies the statistical theory of off-policy evaluation with function approximation
in batch data reinforcement learning problem. We consider a regression-based fitted Q-…

Sample-optimal parametric Q-learning using linearly additive features

L Yang, M Wang - International conference on machine …, 2019 - proceedings.mlr.press
Consider a Markov decision process (MDP) that admits a set of state-action features, which
can linearly express the process’s probabilistic transition model. We propose a parametric Q-…

1xn pattern for pruning convolutional neural networks

…, Y Li, B Chen, F Chao, M Wang… - … on Pattern Analysis …, 2022 - ieeexplore.ieee.org
Though network pruning receives popularity in reducing the complexity of convolutional
neural networks (CNNs), it remains an open issue to concurrently maintain model accuracy as …

Near-optimal time and sample complexities for solving Markov decision processes with a generative model

A Sidford, M Wang, X Wu, L Yang… - Advances in Neural …, 2018 - proceedings.neurips.cc
In this paper we consider the problem of computing an $\epsilon$-optimal policy of a discounted
Markov Decision Process (DMDP) provided we can only access its transition function …

Approximation methods for bilevel programming

S Ghadimi, M Wang - arXiv preprint arXiv:1802.02246, 2018 - arxiv.org
In this paper, we study a class of bilevel programming problem where the inner objective
function is strongly convex. More specifically, under some mild assumptions on the partial …

Reinforcement learning in feature space: Matrix bandit, kernels, and regret bound

L Yang, M Wang - International Conference on Machine …, 2020 - proceedings.mlr.press
Exploration in reinforcement learning (RL) suffers from the curse of dimensionality when the
state-action space is large. A common practice is to parameterize the high-dimensional …

Visual adversarial examples jailbreak aligned large language models

…, K Huang, A Panda, P Henderson, M Wang… - Proceedings of the …, 2024 - ojs.aaai.org
Warning: this paper contains data, prompts, and model outputs that are offensive in nature.
Recently, there has been a surge of interest in integrating vision into Large Language Models …