TU Berlin

Database Systems and Information Management Group, SS15

Talks DIMA Research Seminar

Talks SS15
4.15 pm
EN 719
Frank McSherry
"Scalability! But at what COST?"


Abstract: Many distributed graph processing systems are built with scalability in mind. The more machines you add, the faster they can go. But how fast do they actually go? We used measurements from a recent evaluation of several popular graph processing frameworks (Gonzalez et al., OSDI 2014) and found that the reported running times for all systems, for all datasets, for all problems, were slower than a single thread running on the speaker’s laptop. We claim that performance evaluation in the current crop of scalable systems is deeply lacking. Rather than evaluate scalability, we challenge systems builders to report the COST, or Configuration that Outperforms a Single Thread. This metric indicates the cross-over point at which a scalable system’s existence is first justified. Our experience indicates that the COST of many systems for most problems is surprisingly high, and in some cases unbounded.

This work is joint with Michael Isard and Derek Murray.
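
The COST metric described in the abstract can be illustrated with a minimal sketch. It assumes that COST is read off as the smallest measured configuration (here, a core count) whose running time beats a single-threaded baseline, and that the COST is unbounded when no measured configuration does. The function name `cost` and all timings below are hypothetical, chosen for illustration only; they are not measurements from the talk or the cited evaluation.

```python
from typing import Dict, Optional


def cost(single_thread_seconds: float,
         scaling_runs: Dict[int, float]) -> Optional[int]:
    """Return the smallest core count that outperforms the single thread.

    scaling_runs maps a core count to the measured running time in seconds.
    Returns None when no measured configuration beats the baseline, which
    corresponds to an unbounded COST over the measured range.
    """
    for cores in sorted(scaling_runs):
        if scaling_runs[cores] < single_thread_seconds:
            return cores
    return None


# Hypothetical timings (seconds), for illustration only.
runs = {16: 520.0, 32: 410.0, 64: 280.0, 128: 190.0}

print(cost(300.0, runs))  # 64: first configuration faster than 300 s
print(cost(100.0, runs))  # None: no configuration beats a 100 s baseline
```

A system that scales well can still have a high COST: in the hypothetical numbers above the running time keeps dropping as cores are added, yet 64 cores are needed before the system beats one thread.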


Frank McSherry is an independent researcher interested in data-parallel computation. Before the lab's dissolution, Frank was a senior researcher at Microsoft Research SVC, where he led the Naiad dataflow project and co-invented differential privacy. He is currently deeply enamored of building performant scalable systems in Rust.

