Juri Ganitkevitch
PhD Student
Center for Language and Speech Processing
Johns Hopkins University
About Me
I'm a second-year PhD student in the Computer Science Department at Johns Hopkins University. More precisely, I work at the Center for Language and Speech Processing.
My advisor is Chris Callison-Burch. I also frequently consult with Ben Van Durme, Matt Post, Alexandre Klementiev, and Adam Lopez.
I am primarily interested in large-scale statistical natural language transformation models and applications, including paraphrasing, text-to-text generation, and machine translation. I work on grammar acquisition, as well as decoding approaches and algorithms. I'm also curious about efficient processing of vast amounts of data, particularly randomized and approximate algorithms, probabilistic data structures, and online methods. I'm quite convinced that semi-supervised learning is a pretty good idea.
I hold a Master's degree in Computer Science from JHU as well as a Diplom (equivalent to a Master's) of Computer Science from RWTH Aachen University. My Diplom thesis project as well as some prior research assistant work was done at the Human Language Technology and Pattern Recognition Group, advised by Sasa Hasan and Hermann Ney.
I also spent a year as a visiting Master's student at the ENST/Télécom Paris and took a year off school to work with the Voice Technology Group at IBM Germany Research & Development as a full-time intern. While working on my Diplom thesis I held a part-time software engineering position with Nuance Communications in Aachen, Germany.
I interned with the Google Translate team in Mountain View twice (Summers 2010 and 2011), where I worked with Ashish Venugopal, David Talbot, and Jakob Uszkoreit.
My legal name, as per my passport, is Jurij Ganitkevic. It's the result of an unfortunate transliteration accident and I much prefer the old spelling of my name that you see above. I continue to use it in publications, and generally wherever I can get away with it.
Projects
- I'm involved in the Joshua decoder, an open-source statistical machine translation system developed at JHU and written in Java. We're trying to make it easily accessible. Have a go.
- I also do some work on the cdec decoder, another open-source statistical machine translation system. This one is written by Chris Dyer at UMD College Park.
- Another neat project I'm involved in is Jonny Weese's Thrax, a MapReduce grammar extractor for SCFGs (it does both Hiero and SAMT). It's open-source as well, so come and lend a hand.
- Feel free to check on my most recent misadventures on GitHub.
Publications
- Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
  J. Ganitkevitch, C. Callison-Burch, C. Napoles, and B. Van Durme
  In Proceedings of EMNLP; Edinburgh, United Kingdom, July 2011.
- Watermarking the Outputs of Structured Prediction with an Application in Statistical Machine Translation
  A. Venugopal, J. Uszkoreit, D. Talbot, F. Och, and J. Ganitkevitch
  In Proceedings of EMNLP; Edinburgh, United Kingdom, July 2011.
- Paraphrastic Sentence Compression with a Character-based Metric: Tightening without Deletion
  C. Napoles, C. Callison-Burch, J. Ganitkevitch, and B. Van Durme
  In Proceedings of Workshop on Monolingual Text-To-Text Generation; Portland, USA, June 2011.
- Joshua 3.0: Syntax-based Machine Translation with the Thrax Grammar Extractor
  J. Weese, J. Ganitkevitch, C. Callison-Burch, M. Post, and A. Lopez
  In Proceedings of the Sixth Workshop on Statistical Machine Translation; Edinburgh, United Kingdom, July 2011.
- cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models
  C. Dyer, A. Lopez, J. Ganitkevitch, J. Weese, F. Ture, P. Blunsom, H. Setiawan, V. Eidelman, and P. Resnik
  In Proceedings of ACL, Software Demonstrations; Uppsala, Sweden, July 2010.
- Joshua 2.0: A Toolkit for Parsing-Based Machine Translation with Syntax, Semirings, Discriminative Training and Other Goodies
  Z. Li, C. Callison-Burch, C. Dyer, J. Ganitkevitch, A. Irvine, L. Schwartz, W. Thornton, Z. Wang, J. Weese, and O. Zaidan
  In Proceedings of the Fifth Workshop on Statistical Machine Translation; Uppsala, Sweden, July 2010.
- An Enriched MT Grammar for Under $100
  O. Zaidan and J. Ganitkevitch
  In Proceedings of the Workshop on Creating Speech and Language Data With Amazon's Mechanical Turk; Los Angeles, USA, June 2010.
- Demonstration of Joshua: An Open Source Toolkit for Parsing-Based Machine Translation
  Z. Li, C. Callison-Burch, C. Dyer, J. Ganitkevitch, S. Khudanpur, L. Schwartz, W. Thornton, J. Weese, and O. Zaidan
  In Proceedings of ACL/IJCNLP, Software Demonstrations; Suntec, Singapore, August 2009.
- Joshua: An Open Source Toolkit for Parsing-Based Machine Translation
  Z. Li, C. Callison-Burch, C. Dyer, J. Ganitkevitch, S. Khudanpur, L. Schwartz, W. Thornton, J. Weese, and O. Zaidan
  In Proceedings of the Fourth Workshop on Statistical Machine Translation; Athens, Greece, March 2009.
- Triplet Lexicon Models for Statistical Machine Translation
  Sasa Hasan, Juri Ganitkevitch, Hermann Ney, and J. Andrés-Ferrer
  In Proceedings of EMNLP; Honolulu, Hawaii, October 2008.
- Speaker Adaptation using Maximum Likelihood Linear Regression
  Juri Ganitkevitch
  Seminar paper at RWTH Aachen University; Aachen, Germany, Summer 2005.
Contact
My email address is juri at CS dot JHU dot edu. You can also
follow my rather unprofessional musings on Twitter.