The diverse and non-trivial parallelism challenges of data analytics require computing infrastructures that go beyond the demands of traditional simulation-based sciences. Growing data volume and complexity have outpaced the processing capacity of single-node machines in these areas, making massively parallel systems indispensable. However, programming high-performance computing (HPC) systems poses significant productivity and scalability challenges, so an abstraction layer is needed that provides programming flexibility and productivity while ensuring high system performance. As we enter the post-Moore’s Law era, effective programming of specialized architectures is critical for continued performance gains in HPC. Moreover, as large-scale systems become more heterogeneous, using them efficiently for new, often irregular and communication-intensive data-analysis computations becomes increasingly complex. In this talk, we discuss how sparse linear algebra can be used to achieve performance and scalability on extreme-scale systems while maintaining productivity for emerging data-intensive scientific challenges.
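To make the abstraction concrete, here is a minimal sketch (not the speaker's implementation, and single-node rather than extreme-scale) of the core idea: an irregular graph computation such as breadth-first search can be recast as repeated sparse matrix-vector products, so that a tuned sparse-linear-algebra backend, rather than hand-written traversal code, carries the performance burden. The example assumes only numpy and scipy; `bfs_levels` is a hypothetical helper name.

```python
# A minimal sketch: breadth-first search expressed as sparse
# matrix-vector products, the idea underlying graph-analytics-as-
# linear-algebra systems. Assumes numpy/scipy; illustrative only.
import numpy as np
import scipy.sparse as sp

def bfs_levels(adj: sp.csr_matrix, source: int) -> np.ndarray:
    """Return the BFS level of each vertex reachable from `source`
    (-1 for unreachable vertices). adj[i, j] != 0 means edge i -> j."""
    n = adj.shape[0]
    levels = np.full(n, -1)
    frontier = np.zeros(n, dtype=bool)
    frontier[source] = True
    level = 0
    while frontier.any():
        levels[frontier] = level
        # One BFS step is one SpMV: A^T x marks neighbors of the frontier.
        reached = adj.T.dot(frontier.astype(np.int8)) > 0
        frontier = reached & (levels == -1)  # mask out visited vertices
        level += 1
    return levels

# Example: a small directed path graph 0 -> 1 -> 2
A = sp.csr_matrix(np.array([[0, 1, 0],
                            [0, 0, 1],
                            [0, 0, 0]]))
print(bfs_levels(A, 0))  # [0 1 2]
```

In a distributed setting, the same formulation lets the runtime partition the sparse matrix across nodes and reuse optimized communication patterns, which is what makes the linear-algebra abstraction attractive for productivity and scalability alike.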