This event has ended. Visit the official site or create your own event on Sched.
Click here to return to main conference site. For a one page, printable overview of the schedule, see this.
View analytic
Wednesday, June 29 • 11:06am - 11:24am
Resource-Aware Scheduling Strategies for Parallel Machine Learning R Programs though RAMBO

Log in to save this to your schedule and see who's attending!

We present resource-aware scheduling strategies for parallel R programs leading to efficient utilization of parallel computer architectures by estimating resource demands. We concentrate on applications that consist of independent tasks. The R programming language is increasingly used to process large data sets in parallel, which requires a high amount of resources. One important application is parameter tuning of machine learning algorithms where evaluations need to be executed in parallel to reduce runtime. Here, resource demands of tasks heavily vary depending on the algorithm configuration. Running such an application in a naive parallel way leads to inefficient resource utilization and thus to long runtimes. Therefore, the R package “parallel” offers a scheduling strategy, called “load balancing”. It dynamically allocates tasks to worker processes. This option is recommended when tasks have widely different computation times or if computer architectures are heterogeneous. We analyzed memory and CPU utilization of parallel applications with our TraceR profiling tool and found that the load balancing mechanism is not sufficient for parallel tasks with high variance in resource demands. A scheduling strategy needs to know resource demands of a task before execution to efficiently map applications to available resources. Therefore, we build a regression model to estimate resource demands based on previous evaluated tasks. Resource estimates like runtime are then used to guide our scheduling strategies. Those strategies are integrated in our RAMBO (Resource-Aware Model-Based Optimization) Framework. Compared to standard mechanisms of the parallel package our approach yields improved resource utilization.

avatar for Dirk  Eddelbuettel

Dirk Eddelbuettel

Debian and R Projects

avatar for Helena  Kotthaus

Helena Kotthaus

Department of Computer Science 12, TU Dortmund University, Dortmund, Germany

Wednesday June 29, 2016 11:06am - 11:24am
Econ 140 579 Serra Mall, Stanford, CA 94305

Attendees (66)