University Calendar

Linguistics Brown Bag on Language & Information: Na'im Tyson

February 22, 2017, 1:00 pm
Location: Schmitt Hall - 104
Sponsor: Linguistics Department
Posted In: College of Humanities and Social Sciences

Lecturer: Dr. Na'im Tyson

Topic: Evaluation of Anchor Texts for Automated Links Discovery in Semi-structured Web Documents

  • Date: Wednesday, February 22
  • Time: 1:00pm
  • Place: Schmitt Hall 104
  • Refreshments will be provided
  • No RSVP needed

This talk was adapted from an earlier one presented at the Language Resources and Evaluation Conference (May 2016). Starting from the English noun phrase grammar defined by Hulth (2004), we built an English noun phrase chunker to extract anchor text candidates from web-based articles. Freelance annotators, with little to no training outside their respective fields, used an annotation environment to evaluate articles that received these machine-generated anchor texts. In other large-scale linguistic annotation projects, annotators are evaluated against a reference corpus; here, there was not sufficient time or funding to create such a corpus for comparing anchor texts among the annotators. Instead, we treated the anchor text generator as another annotator and computed the average Cohen’s Kappa Coefficient (Landis and Koch, 1977) across all pairings of the anchor text generator with an annotator. On average, this approach showed a fair level of agreement, as described in Pustejovsky and Stubbs (2013, pp. 131–132).
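For readers unfamiliar with the agreement measure, the averaging step described in the abstract can be sketched as follows. This is only an illustration, not the speaker's implementation: the binary labeling scheme (1 = a candidate phrase is marked as anchor text, 0 = it is not), the toy data, and the function names `cohen_kappa` and `mean_kappa_vs_generator` are all assumptions made for the example.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items that both raters labeled identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used the same single label throughout
    return (p_o - p_e) / (1 - p_e)


def mean_kappa_vs_generator(generator, annotators):
    """Average kappa over every (generator, annotator) pairing."""
    kappas = [cohen_kappa(generator, labels) for labels in annotators.values()]
    return sum(kappas) / len(kappas)


# Hypothetical toy data, not taken from the study.
generator = [1, 1, 0, 1, 0, 0, 1, 0]
annotators = {
    "annotator_1": [1, 0, 0, 1, 0, 1, 1, 0],
    "annotator_2": [1, 1, 0, 0, 0, 0, 1, 1],
}

print(mean_kappa_vs_generator(generator, annotators))
```

The value printed for this toy data says nothing about the agreement level reported in the talk; it only shows how the generator can stand in for a reference annotator when no reference corpus exists.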


About Dr. Na'im Tyson:

Dr. Na'im Tyson has been a practitioner in Computational Linguistics and Natural Language Processing (NLP) in academic and corporate research for over fifteen years. His most recent engagement is with About.com, where he independently conducts research in Natural Language Processing, Data Analytics and Data Science. His research projects include automated link discovery and linking text to corresponding entities such as people, places and organizations. Prior to About.com, he co-sponsored four patents for an educational technology startup named Voxy. These patents dealt with the creation of algorithms that drive English language exercises for a web and mobile language learning platform.

Dr. Tyson’s doctoral research at The Ohio State University pertained to the exploration of acoustic and auditory features for discriminating between instances of canonical vowels excised from spontaneous speech. This research was inspired by his work as a Graduate Research Associate for the Buckeye Corpus of Spontaneous Speech.