Efficient sparse matrix-vector multiplication on GPUs using the CSR storage format JL Greathouse, M Daga SC'14: Proceedings of the International Conference for High Performance …, 2014 | 270 | 2014 |
On the efficacy of a fused CPU+GPU processor (or APU) for parallel computing M Daga, AM Aji, W Feng 2011 Symposium on Application Accelerators in High-Performance Computing …, 2011 | 190 | 2011 |
Exploiting coarse-grained parallelism in B+ tree searches on an APU M Daga, M Nutter 2012 SC Companion: High Performance Computing, Networking Storage and …, 2012 | 54 | 2012 |
Structural agnostic SpMV: Adapting CSR-adaptive for irregular matrices M Daga, JL Greathouse 2015 IEEE 22nd International Conference on High Performance Computing (HiPC …, 2015 | 52 | 2015 |
MIOpen: An open source library for deep learning primitives J Khan, P Fultz, A Tamazov, D Lowell, C Liu, M Melesse, ... arXiv preprint arXiv:1910.00078, 2019 | 44 | 2019 |
Efficient breadth-first search on a heterogeneous processor M Daga, M Nutter, M Meswani 2014 IEEE International Conference on Big Data (Big Data), 373-382, 2014 | 42 | 2014 |
Architecture-aware mapping and optimization on a 1600-core GPU M Daga, T Scogland, W Feng 2011 IEEE 17th International Conference on Parallel and Distributed Systems …, 2011 | 42 | 2011 |
Bounding the effect of partition camping in GPU kernels AM Aji, M Daga, W Feng Proceedings of the 8th ACM International Conference on Computing Frontiers, 1-10, 2011 | 38 | 2011 |
Implementing directed acyclic graphs with the heterogeneous system architecture S Puthoor, AM Aji, S Che, M Daga, W Wu, BM Beckmann, G Rodgers Proceedings of the 9th Annual Workshop on General Purpose Processing using …, 2016 | 30 | 2016 |
Efficient sparse matrix-vector multiplication on parallel processors M Daga, JL Greathouse US Patent 9,697,176, 2017 | 27 | 2017 |
Exploring parallel programming models for heterogeneous computing systems M Daga, ZS Tschirhart, C Freitag 2015 IEEE International Symposium on Workload Characterization, 98-107, 2015 | 26 | 2015 |
clSPARSE: A vendor-optimized open-source sparse BLAS library JL Greathouse, K Knox, J Poła, K Varaganti, M Daga Proceedings of the 4th International Workshop on OpenCL, 1-4, 2016 | 24 | 2016 |
An n log n Generalized Born Approximation R Anandakrishnan, M Daga, AV Onufriev Journal of Chemical Theory and Computation 7 (3), 544-559, 2011 | 24 | 2011 |
Towards accelerating molecular modeling via multi-scale approximation on a GPU M Daga, W Feng, T Scogland 2011 IEEE 1st International Conference on Computational Advances in Bio and …, 2011 | 13 | 2011 |
Graph matching for optimized deep network processing M Breternitz, M Daga US Patent App. 15/498,943, 2018 | 9 | 2018 |
Method and apparatus for performing a search operation on heterogeneous computing systems M Daga US Patent 10,031,947, 2018 | 5 | 2018 |
On the performance, energy, and power of data-access methods in heterogeneous computing systems R Kalidas, M Daga, K Krommydas, W Feng 2015 IEEE International Parallel and Distributed Processing Symposium …, 2015 | 5 | 2015 |
Multi-dimensional characterization of electrostatic surface potential computation on graphics processors M Daga, W Feng BMC Bioinformatics 13, 1-12, 2012 | 5 | 2012 |
CampProf: a visual performance analysis tool for memory bound GPU kernels AM Aji, M Daga, W Feng | 5 | 2010 |
Architecture-Aware Optimization on a 1600-core Graphics Processor M Daga, TRW Scogland, W Feng | 4 | 2011 |