<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>RobotWhisperer</title>
	<atom:link href="http://robotwhisperer.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://robotwhisperer.org</link>
	<description>... the website of the Learning, Artificial Intelligence, and Robotics Laboratory (LAIRLab) at Carnegie Mellon led by Drew Bagnell</description>
	<lastBuildDate>Mon, 18 Jul 2011 14:44:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Congrats Kevin and Brian! ICML 2011 Best Paper Award</title>
		<link>http://robotwhisperer.org/uncategorized/congrats-kevin-and-brian-icml-2011-best-paper-award/</link>
		<comments>http://robotwhisperer.org/uncategorized/congrats-kevin-and-brian-icml-2011-best-paper-award/#comments</comments>
		<pubDate>Mon, 18 Jul 2011 14:37:25 +0000</pubDate>
		<dc:creator>lairlab</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robotwhisperer.org/?p=275</guid>
		<description><![CDATA[Computational Rationalization: The Inverse Equilibrium Problem Abstract: Modeling the purposeful behavior of imperfect agents from a small number of observations is a challenging task. When restricted to the single-agent decision-theoretic setting, inverse optimal control techniques assume that observed behavior is an approximately optimal solution to an unknown decision problem. These techniques learn a utility function that [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><a href="http://www.ri.cmu.edu/publication_view.html?pub_id=6841"><strong>Computational Rationalization: The Inverse Equilibrium Problem</strong></a></p>
<p>Abstract: Modeling the purposeful behavior of imperfect agents from a small number of observations is a challenging task. When restricted to the single-agent decision-theoretic setting, inverse optimal control techniques assume that observed behavior is an approximately optimal solution to an unknown decision problem. These techniques learn a utility function that explains the example behavior and can then be used to accurately predict or imitate future behavior in similar observed or unobserved situations. In this work, we consider similar tasks in competitive and cooperative multi-agent domains. Here, unlike single-agent settings, a player cannot myopically maximize its reward &#8212; it must speculate on how the other agents may act to influence the game&#8217;s outcome. Employing the game-theoretic notion of regret and the principle of maximum entropy, we introduce a technique for predicting and generalizing behavior, as well as recovering a reward function in these domains.</p>
]]></content:encoded>
			<wfw:commentRss>http://robotwhisperer.org/uncategorized/congrats-kevin-and-brian-icml-2011-best-paper-award/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Welcome to new post-doctoral fellows</title>
		<link>http://robotwhisperer.org/uncategorized/welcome-to-new-post-doctoral-fellows/</link>
		<comments>http://robotwhisperer.org/uncategorized/welcome-to-new-post-doctoral-fellows/#comments</comments>
		<pubDate>Mon, 18 Jul 2011 14:44:07 +0000</pubDate>
		<dc:creator>lairlab</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robotwhisperer.org/?p=278</guid>
		<description><![CDATA[Moslem Kazemi: working on perception and force guided manipulation. Working with Nancy Pollard and Drew Bagnell. Kris Kitani: working with Martial Hebert and Drew Bagnell on activity prediction. Paul Vernaza: working with Drew Bagnell on compressed information space reasoning.]]></description>
			<content:encoded><![CDATA[<p></p><p><a href="http://sites.google.com/site/moslemk/">Moslem Kazemi</a>: working on perception and force guided manipulation. Working with Nancy Pollard and Drew Bagnell.<br />
<a href="http://www.cs.cmu.edu/~kkitani/">Kris Kitani</a>: working with Martial Hebert and Drew Bagnell on activity prediction.<br />
<a href="http://www.seas.upenn.edu/~vernaza/">Paul Vernaza</a>: working with Drew Bagnell on compressed information space reasoning.</p>
]]></content:encoded>
			<wfw:commentRss>http://robotwhisperer.org/uncategorized/welcome-to-new-post-doctoral-fellows/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Preprint: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning</title>
		<link>http://robotwhisperer.org/uncategorized/254/</link>
		<comments>http://robotwhisperer.org/uncategorized/254/#comments</comments>
		<pubDate>Thu, 17 Mar 2011 16:25:31 +0000</pubDate>
		<dc:creator>tommyliu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robotwhisperer.org/?p=254</guid>
		<description><![CDATA[Stéphane Ross, Geoffrey J. Gordon, J. Andrew Bagnell, Carnegie Mellon University To Appear in Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS), 2011 Link to Paper Abstract: Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning. [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>Stéphane Ross, Geoffrey J. Gordon, J. Andrew Bagnell,<br />
Carnegie Mellon University</p>
<p>To Appear in Proceedings of the 14th International Conference on<br />
<em> Artificial Intelligence and Statistics (AISTATS)</em>, 2011</p>
<p><a href="http://www.cs.cmu.edu/~sross1/publications/Ross-AIStats11-NoRegret.pdf">Link to Paper</a></p>
<p>Abstract: Sequential prediction problems such as imitation learning, where<br />
future observations depend on previous predictions (actions), violate the<br />
common i.i.d. assumptions made in statistical learning. This leads to poor<br />
performance in theory and often in practice. Some recent approaches<br />
provide stronger guarantees in this setting, but remain somewhat<br />
unsatisfactory as they train either non-stationary or stochastic policies<br />
and require a large number of iterations. In this paper, we propose a new<br />
iterative algorithm, which trains a stationary deterministic policy, that<br />
can be seen as a no regret algorithm in an online learning setting. We<br />
show that any such no regret algorithm, combined with additional reduction<br />
assumptions, must find a policy with good performance under the<br />
distribution of observations it induces in such sequential settings. We<br />
demonstrate that this new approach outperforms previous approaches on two challenging imitation learning problems and a benchmark sequence labeling<br />
problem.</p>
]]></content:encoded>
			<wfw:commentRss>http://robotwhisperer.org/uncategorized/254/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Preprint: Maximum Causal Entropy Correlated Equilibria for Markov Games</title>
		<link>http://robotwhisperer.org/uncategorized/preprint-maximum-causal-entropy-correlated-equilibria-for-markov-games/</link>
		<comments>http://robotwhisperer.org/uncategorized/preprint-maximum-causal-entropy-correlated-equilibria-for-markov-games/#comments</comments>
		<pubDate>Tue, 22 Feb 2011 16:16:42 +0000</pubDate>
		<dc:creator>tommyliu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robotwhisperer.org/?p=247</guid>
		<description><![CDATA[Brian D. Ziebart, J. Andrew Bagnell, Anind K. Dey Carnegie Mellon University To appear at International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011). Link to Paper Motivated by a machine learning perspective that game theoretic equilibria constraints should serve as guidelines for predicting agents&#8217; strategies, we introduce maximum causal entropy correlated equilibria (MCECE), a [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><a href="http://robotwhisperer.org/wp-content/uploads/2011/02/Untitled.png"><img class="size-medium wp-image-249 alignleft" title="MarkovianGame" src="http://robotwhisperer.org/wp-content/uploads/2011/02/Untitled-300x192.png" alt="" width="144" height="92" /></a>Brian D. Ziebart, J. Andrew Bagnell, Anind K. Dey<br />
Carnegie Mellon University<br />
To appear at <em>International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011).</em></p>
<p><a href="http://www.cs.cmu.edu/~bziebart/publications/maxcausalent-correlated.pdf"><cite>Link to Paper</cite></a></p>
<p>Motivated by a machine learning perspective that game theoretic<br />
equilibria constraints should serve as guidelines for<br />
predicting agents&#8217; strategies, we introduce maximum causal<br />
entropy correlated equilibria (MCECE), a novel solution<br />
concept for general-sum Markov games. In line with this<br />
perspective, a MCECE strategy profile is a uniquely-defined<br />
joint probability distribution over actions for each game<br />
state that minimizes the worst-case prediction of agents&#8217; actions<br />
under log-loss. Equivalently, it maximizes the worst-case<br />
growth rate for gambling on the sequences of agents&#8217;<br />
joint actions under uniform odds. We present a convex optimization<br />
technique for obtaining MCECE strategy profiles<br />
that resembles value iteration in finite-horizon games. We<br />
assess the predictive benefits of our approach by predicting<br />
the strategies generated by previously proposed correlated<br />
equilibria solution concepts, and compare against those previous<br />
approaches on that same prediction task.</p>
]]></content:encoded>
			<wfw:commentRss>http://robotwhisperer.org/uncategorized/preprint-maximum-causal-entropy-correlated-equilibria-for-markov-games/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Amusing New Dodge Commercial</title>
		<link>http://robotwhisperer.org/uncategorized/amusing-new-dodge-commercial/</link>
		<comments>http://robotwhisperer.org/uncategorized/amusing-new-dodge-commercial/#comments</comments>
		<pubDate>Sun, 20 Feb 2011 22:32:49 +0000</pubDate>
		<dc:creator>tommyliu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robotwhisperer.org/?p=230</guid>
		<description><![CDATA[Dodge has released an amusing new commercial referencing self-driving cars and other robotics advances. Well, those of us on Team Robot take it as a compliment.]]></description>
			<content:encoded><![CDATA[<p></p><p>Dodge has released an amusing new commercial referencing self-driving cars and other robotics advances. Well, those of us on Team Robot take it as a compliment.</p>
]]></content:encoded>
			<wfw:commentRss>http://robotwhisperer.org/uncategorized/amusing-new-dodge-commercial/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Preprint: 3-D Scene Analysis via Sequenced Predictions over Points and Regions</title>
		<link>http://robotwhisperer.org/uncategorized/preprint-3-d-scene-analysis-via-sequenced-predictions-over-points-and-regions/</link>
		<comments>http://robotwhisperer.org/uncategorized/preprint-3-d-scene-analysis-via-sequenced-predictions-over-points-and-regions/#comments</comments>
		<pubDate>Mon, 31 Jan 2011 22:17:54 +0000</pubDate>
		<dc:creator>dmunoz</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robotwhisperer.org/?p=216</guid>
		<description><![CDATA[3-D Scene Analysis via Sequenced Predictions over Points and Regions Xuehan Xiong, Daniel Munoz, J. Andrew Bagnell, Martial Hebert To appear: ICRA 2011. Preprint (pdf) We address the problem of understanding scenes from 3-D laser scans via per-point assignment of semantic labels. In order to mitigate the difficulties of using a graphical model for modeling [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><strong><a href="http://robotwhisperer.org/wp-content/uploads/2011/01/icra11.png"><img class="alignleft size-full wp-image-213" title="ICRA 2011" src="http://robotwhisperer.org/wp-content/uploads/2011/01/icra11.png" alt="" width="164" height="100" /></a>3-D Scene Analysis via Sequenced Predictions over Points and Regions</strong><em><br />
Xuehan Xiong, Daniel Munoz, J. Andrew Bagnell, Martial Hebert</em><br />
To appear: ICRA 2011.<a href="http://www.ri.cmu.edu/pub_files/2011/5/xiong_icra_11.pdf"><br />
Preprint (pdf)</a></p>
<p>We address the problem of understanding scenes from 3-D laser scans via per-point assignment of semantic labels. In order to mitigate the difficulties of using a graphical model for modeling the contextual relationships among the 3-D points, we instead propose a multi-stage inference procedure to capture these relationships. More specifically, we train this procedure to use point cloud statistics and learn relational information (e.g., tree-trunks are below vegetation) over fine (point-wise) and coarse (region-wise) scales. We evaluate our approach on three different datasets that were obtained from different sensors, and demonstrate improved performance.</p>
]]></content:encoded>
			<wfw:commentRss>http://robotwhisperer.org/uncategorized/preprint-3-d-scene-analysis-via-sequenced-predictions-over-points-and-regions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Merry Christmas and Happy Holidays!</title>
		<link>http://robotwhisperer.org/uncategorized/merry-christmas-and-happy-holidays/</link>
		<comments>http://robotwhisperer.org/uncategorized/merry-christmas-and-happy-holidays/#comments</comments>
		<pubDate>Sat, 11 Dec 2010 16:42:06 +0000</pubDate>
		<dc:creator>lairlab</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robotwhisperer.org/?p=199</guid>
		<description><![CDATA[Andy, our ARM (Autonomous Robot Manipulation), wishes you a merry Christmas! Mihail Pivtoraiko, our motion planning for manipulation expert, demonstrates Andy&#8217;s current dexterity.]]></description>
			<content:encoded><![CDATA[<p></p><p>Andy, our ARM (Autonomous Robot Manipulation), wishes you a merry Christmas! <a href="http://www.cs.cmu.edu/~mihail/">Mihail Pivtoraiko</a>, our motion planning for manipulation expert, demonstrates Andy&#8217;s current dexterity.</p>
]]></content:encoded>
			<wfw:commentRss>http://robotwhisperer.org/uncategorized/merry-christmas-and-happy-holidays/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Preprint: Stacked Hierarchical Labeling</title>
		<link>http://robotwhisperer.org/uncategorized/preprint-stacked-hierarchical-labeling/</link>
		<comments>http://robotwhisperer.org/uncategorized/preprint-stacked-hierarchical-labeling/#comments</comments>
		<pubDate>Mon, 05 Jul 2010 04:17:15 +0000</pubDate>
		<dc:creator>dmunoz</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robotwhisperer.org/?p=180</guid>
		<description><![CDATA[Stacked Hierarchical Labeling Daniel Munoz, J. Andrew Bagnell, Martial Hebert To appear: ECCV 2010. Preprint (pdf) In this work we propose a hierarchical approach for labeling semantic objects and regions in scenes. Our approach is reminiscent of early vision literature in that we use a decomposition of the image in order to encode relational and [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><strong>Stacked Hierarchical Labeling</strong><em><br />
Daniel Munoz, J. Andrew Bagnell, Martial Hebert</em><br />
To appear: ECCV 2010.<a href="../wp-content/uploads/2010/07/munoz_eccv_10.pdf"><br />
Preprint  (pdf)</a></p>
<p>In this work we propose a hierarchical approach for labeling semantic objects and regions in scenes. Our approach is reminiscent of early vision literature in that we use a decomposition of the image in order to encode relational and spatial information. In contrast to much existing work on structured prediction for scene understanding, we bypass a global probabilistic model and instead directly train a hierarchical inference <em>procedure</em> inspired by the message passing mechanics of some approximate inference procedures in graphical models. This approach mitigates both the theoretical and empirical difficulties of learning probabilistic models when exact inference is intractable. In particular, we draw from recent work in machine learning and break the complex inference process into a hierarchical series of simple machine learning subproblems. Each subproblem in the hierarchy is designed to capture the image and contextual statistics in the scene. This hierarchy spans coarse-to-fine regions and explicitly models the mixtures of semantic labels that may be present due to imperfect segmentation. To avoid cascading of errors and overfitting, we train the learning problems in sequence to ensure robustness to likely errors earlier in the inference sequence and leverage the stacking approach developed by Cohen <em>et al.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://robotwhisperer.org/uncategorized/preprint-stacked-hierarchical-labeling/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Congrats to Brian! Best Paper Runner up: Modeling Interaction via the Principle of Maximum Causal Entropy</title>
		<link>http://robotwhisperer.org/uncategorized/modeling-interaction-via-the-principle-of-maximum-causal-entropy/</link>
		<comments>http://robotwhisperer.org/uncategorized/modeling-interaction-via-the-principle-of-maximum-causal-entropy/#comments</comments>
		<pubDate>Sun, 20 Jun 2010 01:42:03 +0000</pubDate>
		<dc:creator>lairlab</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[entropy]]></category>
		<category><![CDATA[learning]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[preprint]]></category>

		<guid isPermaLink="false">http://robotwhisperer.org/?p=162</guid>
		<description><![CDATA[Modeling Interaction via the Principle of Maximum Causal Entropy ICML runner up for best student paper by Brian Ziebart, J. Andrew Bagnell, and Anind Dey. @inproceedings{bziebart-maxcausalent, author = {Brian D. Ziebart and J. Andrew Bagnell and Anind K. Dey}, title = {Modeling Interaction via the Principle of Maximum Causal Entropy}, year = {2010}, booktitle = [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><a href="http://robotwhisperer.org/wp-content/uploads/2010/06/maxCausalEnt.pdf"><strong>Modeling Interaction via the Principle of Maximum Causal Entropy</strong></a><br />
ICML runner up for best student paper by Brian Ziebart, J. Andrew Bagnell, and Anind Dey. </p>
<p>@inproceedings{bziebart-maxcausalent,<br />
   author = {Brian D. Ziebart and J. Andrew Bagnell<br />
            and Anind K. Dey},<br />
   title = {Modeling Interaction via the Principle of Maximum Causal Entropy},<br />
   year = {2010},<br />
   booktitle = {International Conference on Machine Learning}<br />
}</p>
<blockquote><p>The principle of maximum entropy provides a powerful framework for statistical models of joint, conditional, and marginal distributions. However, there are many important distributions with elements of interaction and feedback where its applicability has not been established. This work presents the principle of maximum causal entropy &#8212; an approach based on causally conditioned probabilities that can appropriately model the availability and influence of sequentially revealed side information. Using this principle, we derive Maximum Causal Entropy Influence Diagrams, a new probabilistic graphical framework for modeling decision making in settings with latent information, sequential interaction, and feedback. We describe the theoretical advantages of this model and demonstrate its applicability for statistically framing inverse optimal control and decision prediction tasks.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://robotwhisperer.org/uncategorized/modeling-interaction-via-the-principle-of-maximum-causal-entropy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Preprint: Reinforcement Planning: RL for Optimal Planners</title>
		<link>http://robotwhisperer.org/uncategorized/preprint-reinforcement-planning-rl-for-optimal-planners/</link>
		<comments>http://robotwhisperer.org/uncategorized/preprint-reinforcement-planning-rl-for-optimal-planners/#comments</comments>
		<pubDate>Tue, 20 Apr 2010 18:01:47 +0000</pubDate>
		<dc:creator>mzucker</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robotwhisperer.org/?p=158</guid>
		<description><![CDATA[Reinforcement Planning: RL for Optimal Planners Matt Zucker and J. Andrew Bagnell PDF Search-based planners such as A* and Dijkstra&#8217;s algorithm are proven methods for guiding today&#8217;s robotic systems. Although such planners are typically based upon a coarse approximation of reality, they are nonetheless valuable due to their ability to reason about the future, [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="alignleft" title="rp-mmaze-test2.png" src="http://robotwhisperer.org/wp-content/uploads/2010/04/test2-150.png" alt="Marble maze test board used to verify Reinforcement Planning method." width="150" height="124" /> <strong>Reinforcement Planning: RL for Optimal Planners</strong><br />
<em>Matt Zucker and J. Andrew Bagnell</em></p>
<p><a href="http://robotwhisperer.org/wp-content/uploads/2010/04/rp-techreport.pdf">PDF</a></p>
<p>Search-based planners such as A* and Dijkstra&#8217;s algorithm are proven methods for guiding today&#8217;s robotic systems. Although such planners are typically based upon a coarse approximation of reality, they are nonetheless valuable due to their ability to reason about the future, and to generalize to previously unseen scenarios. However, encoding the desired behavior of a system into the underlying cost function used by the planner can be a tedious and error-prone task. We introduce <em>Reinforcement Planning</em>, which extends gradient-based reinforcement learning algorithms to automatically learn useful cost functions for optimal planners. Reinforcement Planning presents several advantages over other learning applications involving planners in that it is not limited by the expertise of a human demonstrator, and that it also recognizes that the domain of the planner is a simplified model of the world. We demonstrate the effectiveness of our method in learning to solve a noisy physical simulation of the well-known &#8220;marble maze&#8221; toy.</p>
]]></content:encoded>
			<wfw:commentRss>http://robotwhisperer.org/uncategorized/preprint-reinforcement-planning-rl-for-optimal-planners/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

