Michael Paul
 
Ph.D. Student [CV]

Department of Computer Science
Center for Language and Speech Processing
Johns Hopkins University
Hackerman Hall (CSEB) 321
3400 North Charles Street
Baltimore, MD 21218


Scholar | Twitter

About me: I am a second-year Ph.D. student in computer science at Johns Hopkins University, advised by Jason Eisner and Mark Dredze. Before coming here, I was an undergraduate at the University of Illinois at Urbana-Champaign and later worked as a programmer for the Cognitive Computation Group. I am currently supported by an NSF Graduate Research Fellowship and a Dean's Fellowship from the Whiting School of Engineering. My research interests include natural language processing, text mining, and machine learning, with an emphasis on building unsupervised models that find meaningful patterns in text. I am also interested in applications to social media and health informatics.

Research
2012
Michael J. Paul. Mixed Membership Markov Models for Unsupervised Conversation Modeling. To appear in the 2012 Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2012), Jeju, Korea. July 2012.
Michael J. Paul and Jason Eisner. Implicitly Intersecting Weighted Automata using Dual Decomposition. To appear in the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2012), Montreal, Canada. June 2012. [paper] [web]
William M. Darling, Michael J. Paul and Fei Song. Unsupervised Part-of-Speech Tagging in Noisy and Esoteric Domains With a Syntactic-Semantic Bayesian HMM. To appear in the EACL 2012 Workshop on Semantic Analysis in Social Media, Avignon, France. April 2012. [paper] [web]

2011
Michael J. Paul and Mark Dredze. You Are What You Tweet: Analyzing Twitter for Public Health. In the proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM 2011), Barcelona, Spain. July 2011. [24% acceptance] [paper] [slides] [video] [web]
Michael J. Paul and Mark Dredze. A Model for Mining Public Health Topics from Twitter. Technical Report. Johns Hopkins University. 2011. [paper] [web]
Delip Rao, Michael Paul, Clayton Fink, David Yarowsky, Timothy Oates and Glen Coppersmith. Hierarchical Bayesian Models for Latent Attribute Detection in Social Media. In the proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM 2011), Barcelona, Spain. July 2011. [short paper] [paper] [web]
Roxana Girju and Michael J. Paul. Modeling Reciprocity in Social Interactions with Probabilistic Latent Space Models. Natural Language Engineering 17(1), pages 1-36. Cambridge University Press, 2011. [paper] [link] [data] [web]

2010
Michael J. Paul, ChengXiang Zhai and Roxana Girju. Summarizing Contrastive Viewpoints In Opinionated Text. In the proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP 2010), pages 65-75, MIT, Cambridge, Massachusetts. October 2010. [25% acceptance] [paper] [slides] [data] [web]
Michael Paul and Roxana Girju. Comparative Scientific Research Analysis with a Language-Independent Cross-Collection Model. In the proceedings of XXVI Congreso de la Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN 2010), Valencia, Spain. September 2010. [paper] [web]
Michael Paul and Roxana Girju. A Two-Dimensional Topic-Aspect Model for Discovering Multi-Faceted Topics. In the proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI-10), pages 545-550, Atlanta, Georgia. July 2010. [26.9% acceptance] [paper] [slides] [code] [web]

2009
Michael Paul. Cross-Collection Topic Models: Automatically Comparing and Contrasting Text. Undergraduate Thesis, advised by Roxana Girju. Department of Computer Science, University of Illinois at Urbana-Champaign. 2009. [paper] [slides]
Michael Paul and Roxana Girju. Topic Modeling of Research Fields: An Interdisciplinary Perspective. In the proceedings of Recent Advances in Natural Language Processing (RANLP 2009), Borovets, Bulgaria. September 2009. [paper] [web]
Michael Paul and Roxana Girju. Cross-Cultural Analysis of Blogs and Forums with Mixed-Collection Topic Models. In the proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP 2009), pages 1408-1417, Singapore. August 2009. [paper] [code] [data] [web]
Michael Paul, Roxana Girju and Chen Li. Mining the Web for Reciprocal Relationships. In the proceedings of the 13th Conference on Computational Natural Language Learning (CoNLL 2009), Boulder, Colorado. June 2009. [paper] [data] [web]

2008
Michael Paul and Roxana Girju. AIRTA: An Automatic Interdisciplinary Research Topic Advisor. NSF-sponsored Symposium on Semantic Knowledge Discovery, Organization and Use - Demo session, New York University. November 2008. [extended abstract] [paper] [poster] [demo]