User profiles for Mengdi Wang
Mengdi Wang, Center for Statistics & Machine Learning, ECE, Princeton University. Verified email at princeton.edu. Cited by 5375
Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions
Classical stochastic gradient methods are well suited for minimizing expected-value objective
functions. However, they do not apply to the minimization of a nonlinear function involving …
Model-based reinforcement learning with value-targeted regression
This paper studies model-based reinforcement learning (RL) for regret minimization. We focus
on finite-horizon episodic RL where the transition model $ P $ belongs to a known family …
Vascularized human cortical organoids (vOrganoids) model cortical development in vivo
Y Shi, L Sun, M Wang, J Liu, S Zhong, R Li, P Li… - PLoS …, 2020 - journals.plos.org
Modeling the processes of neuronal progenitor proliferation and differentiation to produce
mature cortical neuron subtypes is essential for the study of human brain development and the …
Minimax-optimal off-policy evaluation with linear function approximation
This paper studies the statistical theory of off-policy evaluation with function approximation
in batch data reinforcement learning problem. We consider a regression-based fitted Q-…
Sample-optimal parametric Q-learning using linearly additive features
Consider a Markov decision process (MDP) that admits a set of state-action features, which
can linearly express the process’s probabilistic transition model. We propose a parametric Q-…
1xn pattern for pruning convolutional neural networks
Though network pruning receives popularity in reducing the complexity of convolutional
neural networks (CNNs), it remains an open issue to concurrently maintain model accuracy as …
Near-optimal time and sample complexities for solving Markov decision processes with a generative model
In this paper we consider the problem of computing an $\epsilon $-optimal policy of a discounted
Markov Decision Process (DMDP) provided we can only access its transition function …
Approximation methods for bilevel programming
In this paper, we study a class of bilevel programming problem where the inner objective
function is strongly convex. More specifically, under some mild assumptions on the partial …
Reinforcement learning in feature space: Matrix bandit, kernels, and regret bound
Exploration in reinforcement learning (RL) suffers from the curse of dimensionality when the
state-action space is large. A common practice is to parameterize the high-dimensional …
Visual adversarial examples jailbreak aligned large language models
Warning: this paper contains data, prompts, and model outputs that are offensive in nature.
Recently, there has been a surge of interest in integrating vision into Large Language Models …