One of the significant challenges in solving optimization problems is handling possibly inaccurate or inconsistent function evaluations. Surprisingly, this problem is far from trivial even in one of the most basic settings: deciding which of two options is better when the value of each option is a random variable (a stochastic dilemma). Problems in this space have long been studied in the statistics, operations-research, and computer-science communities under the name of "multi-armed bandits". While most previous work has focused on handling noise in an online setting, this dissertation focuses on offline optimization, where the goal is to return a good solution with high probability using a finite number of samples. I will discuss a set of problem settings of increasing complexity that allow one to derive formal algorithmic bounds. I will also point out and discuss interesting connections between stochastic optimization and noisy data annotation, the problem of identifying the label of an object from a series of noisy evaluations. As a first contribution, I will introduce and formally analyze a set of novel algorithms that improve on the state of the art and provide new insights for solving the stochastic-optimization and noisy-data-annotation problems. I will then formally prove a novel result: that a widely used derivative-free optimization algorithm, the cross-entropy method, optimizes for quantiles rather than the expectation in stochastic optimization settings. I will support the theoretical claims on the optimization side with experimental results in a set of non-trivial planning and reinforcement-learning domains. Finally, I will discuss the application of the above algorithms to noisy data-annotation problems in a setting involving real crowdsourcing experiments.