Gunho Park
Verified email at postech.ac.kr
Title · Cited by · Year
LUT-GEMM: Quantized matrix multiplication based on LUTs for efficient inference in large-scale generative language models
G Park, B Park, M Kim, S Lee, J Kim, B Kwon, SJ Kwon, B Kim, Y Lee, ...
arXiv preprint arXiv:2206.09557, 2023
68 · 2023
Design and Analysis of Approximate Compressors for Balanced Error Accumulation in MAC Operator
G Park, J Kung, Y Lee
IEEE Transactions on Circuits and Systems I: Regular Papers 68 (7), 2950-2961, 2021
36 · 2021
Simplified Compressor and Encoder Designs for Low-Cost Approximate Radix-4 Booth Multiplier
G Park, J Kung, Y Lee
IEEE Transactions on Circuits and Systems II: Express Briefs 70 (3), 1154-1158, 2022
7 · 2022
No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization
JY Yang, B Kim, J Bae, B Kwon, G Park, E Yang, SJ Kwon, D Lee
arXiv preprint arXiv:2402.18096, 2024
1 · 2024
Energy-Efficient RISC-V-Based Vector Processor for Cache-Aware Structurally-Pruned Transformers
JG Min, D Kam, Y Byun, G Park, Y Lee
2023 IEEE/ACM International Symposium on Low Power Electronics and Design …, 2023
1 · 2023
TF-MVP: Novel Sparsity-Aware Transformer Accelerator with Mixed-Length Vector Pruning
E Yoo, G Park, JG Min, SJ Kwon, B Park, D Lee, Y Lee
2023 60th ACM/IEEE Design Automation Conference (DAC), 1-6, 2023
2023
Sparsity-Aware Memory Interface Architecture using Stacked XORNet Compression for Accelerating Pruned-DNN Models
Y Byun, S Moon, B Park, SJ Kwon, D Lee, G Park, E Yoo, JG Min, Y Lee
Proceedings of Machine Learning and Systems 5, 2023
2023
nuQmm: Quantized MatMul for Efficient Inference of Large-Scale Generative Language Models
G Park, B Park, SJ Kwon, B Kim, Y Lee, D Lee
arXiv preprint arXiv:2206.09557, 2022
2022