Takuya Yoshioka

Cited by

	All	Since 2019
Citations	10436	7633
h-index	46	38
i10-index	129	93

Citations per year
Year	Citations
2008	55
2009	55
2010	82
2011	83
2012	125
2013	201
2014	235
2015	280
2016	400
2017	562
2018	633
2019	712
2020	774
2021	1141
2022	1522
2023	2036
2024	1435

Public access

2 articles

3 articles

available

not available

Based on funding mandates

Co-authors

Tomohiro Nakatani - NTT Communication Science Laboratories - Verified email at ieee.org
Keisuke Kinoshita - Research Scientist at Google - Verified email at ieee.org
Marc Delcroix - NTT Communication Science Laboratories - Verified email at ieee.org
Shoko Araki - NTT Communication Science Laboratories - Verified email at ieee.org
Masakiyo Fujimoto - Senior researcher, National Institute of Information and Communications Technology - Verified email at nict.go.jp
Nobutaka Ito - University of Tokyo, Japan (formerly NTT) - Verified email at k.u-tokyo.ac.jp
Armin Sehr - OTH Regensburg - Verified email at oth-regensburg.de
Roland Maas - Sr. Science Manager at Amazon - Verified email at amazon.com
Shinji Watanabe - Carnegie Mellon University - Verified email at cmu.edu
Takaaki Hori - Apple - Verified email at apple.com
Hiroshi G Okuno - Professor Emeritus, Kyoto University, Adjunct Researcher, Waseda University - Verified email at nue.org
Takuya Higuchi - Apple - Verified email at apple.com
Atsushi Nakamura - Graduate School of Natural Sciences, Nagoya City University - Verified email at ieee.org
Yotaro Kubo - Google Speech - Verified email at ieee.org
Chengzhu Yu （俞承柱） - Amazon - Verified email at amazon.com
Mehrez Souden - Sr. Manager, Apple Inc. - Verified email at gatech.edu
Hirokazu Kameoka - Senior Distinguished Researcher at NTT, Adjunct Associate Professor at NII - Verified email at hco.ntt.co.jp
Mark Gales - Cambridge University - Verified email at eng.cam.ac.uk

Takuya Yoshioka

AssemblyAI

Verified email at assemblyai.com

speech recognition, speech enhancement, speaker diarization, machine learning


Title	Authors	Venue	Cited by	Year
WavLM: Large-scale self-supervised pre-training for full stack speech processing	S Chen, C Wang, Z Chen, Y Wu, S Liu, Z Chen, J Li, N Kanda, T Yoshioka, ...	IEEE Journal of Selected Topics in Signal Processing 16 (6), 1505-1518, 2022	1209	2022
Dual-path RNN: Efficient long sequence modeling for time-domain single-channel speech separation	Y Luo, Z Chen, T Yoshioka	ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020	735	2020
Speech dereverberation based on variance-normalized delayed linear prediction	T Nakatani, T Yoshioka, K Kinoshita, M Miyoshi, BH Juang	IEEE Transactions on Audio, Speech, and Language Processing 18 (7), 1717-1731, 2010	486	2010
The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech	K Kinoshita, M Delcroix, T Yoshioka, T Nakatani, E Habets, ...	2013 IEEE Workshop on Applications of Signal Processing to Audio and …, 2013	459	2013
A summary of the REVERB challenge: State-of-the-art and remaining challenges in reverberant speech processing research	K Kinoshita, M Delcroix, S Gannot, EA P. Habets, R Haeb-Umbach, ...	EURASIP Journal on Advances in Signal Processing 2016, 1-19, 2016	406	2016
Making machines understand us in reverberant rooms: Robustness against reverberation for automatic speech recognition	T Yoshioka, A Sehr, M Delcroix, K Kinoshita, R Maas, T Nakatani, ...	IEEE Signal Processing Magazine 29 (6), 114-126, 2012	331	2012
CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings	S Watanabe, M Mandel, J Barker, E Vincent, A Arora, X Chang, ...	arXiv preprint arXiv:2004.09249, 2020	307	2020
Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening	T Yoshioka, T Nakatani	IEEE Transactions on Audio, Speech, and Language Processing 20 (10), 2707-2720, 2012	305	2012
The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices	T Yoshioka, N Ito, M Delcroix, A Ogawa, K Kinoshita, M Fujimoto, C Yu, ...	2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU …, 2015	261	2015
Robust MVDR beamforming using time-frequency masks for online/offline ASR in noise	T Higuchi, N Ito, T Yoshioka, T Nakatani	2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016	258	2016
Continuous speech separation: Dataset and analysis	Z Chen, T Yoshioka, L Lu, T Zhou, Z Meng, Y Luo, J Wu, X Xiao, J Li	ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020	205	2020
Blind separation and dereverberation of speech mixtures by joint optimization	T Yoshioka, T Nakatani, M Miyoshi, HG Okuno	IEEE Transactions on Audio, Speech, and Language Processing 19 (1), 69-84, 2010	196	2010
Blind speech dereverberation with multi-channel linear prediction based on short time Fourier transform representation	T Nakatani, T Yoshioka, K Kinoshita, M Miyoshi, BH Juang	2008 IEEE International Conference on Acoustics, Speech and Signal …, 2008	188	2008
ICASSP 2023 deep noise suppression challenge	H Dubey, A Aazami, V Gopal, B Naderi, S Braun, R Cutler, A Ju, ...	IEEE Open Journal of Signal Processing, 2024	170	2024
End-to-end microphone permutation and number invariant multi-channel speech separation	Y Luo, Z Chen, N Mesgarani, T Yoshioka	ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020	166	2020
Multi-channel overlapped speech recognition with location guided speech extraction network	Z Chen, X Xiao, T Yoshioka, H Erdogan, J Li, Y Gong	2018 IEEE Spoken Language Technology Workshop (SLT), 558-565, 2018	136	2018
Multi-microphone neural speech separation for far-field multi-talker speech recognition	T Yoshioka, H Erdogan, Z Chen, F Alleva	2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018	133	2018
Continuous speech separation with conformer	S Chen, Y Wu, Z Chen, J Wu, J Li, T Yoshioka, C Wang, S Liu, M Zhou	ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and …, 2021	130	2021
Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB challenge	M Delcroix, T Yoshioka, A Ogawa, Y Kubo, M Fujimoto, N Ito, K Kinoshita, ...	Reverb workshop, 2014	127	2014
Online MVDR beamformer based on complex Gaussian mixture model with spatial prior for noise robust ASR	T Higuchi, N Ito, S Araki, T Yoshioka, M Delcroix, T Nakatani	IEEE/ACM Transactions on Audio, Speech, and Language Processing 25 (4), 780-793, 2017	124	2017
