By Adam Fourney, Ben Lafreniere, Richard Mann, Michael Terry | published 2012-05-05
This paper presents a recognizer for identifying references to user interface components in online documentation. The recognizer first extracts phrases matching a list of known components, then employs a classifier to reject coincidental matches. We describe why this seemingly straightforward problem is challenging, then show how informal conventions in documentation writing can be leveraged to perform classification. Using the features identified in this paper, our approach achieves an average F1 score of 0.81, and can correctly distinguish between actual command references and coincidental matches in 93.7% of test cases.
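The two-stage structure described above can be illustrated with a minimal sketch. It is not the paper's implementation: the component list, the cue words, and the function names are hypothetical, and the second stage uses a hand-written rule as a stand-in for the trained classifier. The two cues shown (exact capitalization and a preceding instruction verb) are only assumed examples of the kind of informal writing convention such a classifier might exploit.

```python
import re

# Hypothetical list of known UI component names (not taken from the paper).
KNOWN_COMPONENTS = ["Save As", "Layer", "Crop"]

# Hypothetical instruction verbs that may precede a command reference.
CUE_WORDS = {"click", "select", "choose", "open", "press"}


def extract_candidates(text, components=KNOWN_COMPONENTS):
    """Stage 1: find every case-insensitive match of a known component name."""
    candidates = []
    for name in components:
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            candidates.append((m.start(), m.group(0), name))
    # Sort by position in the text so results follow reading order.
    return sorted(candidates)


def looks_like_reference(text, start, matched, name):
    """Stage 2 stand-in: a simple rule in place of a trained classifier."""
    # Cue 1: the matched text keeps the component's exact capitalization.
    exact_case = matched == name
    # Cue 2: the word immediately before the match is an instruction verb.
    preceding = re.findall(r"[A-Za-z]+", text[:start])
    cued = bool(preceding) and preceding[-1].lower() in CUE_WORDS
    return exact_case and cued


def recognize(text):
    """Run both stages and return the matches accepted as references."""
    return [matched for start, matched, name in extract_candidates(text)
            if looks_like_reference(text, start, matched, name)]
```

For the sentence "Click Crop to trim the photo, then crop the layer by hand.", the first stage yields three candidates ("Crop", "crop", "layer") and the rule keeps only the first, rejecting the other two as coincidental matches. A real system would replace `looks_like_reference` with a classifier trained on labeled examples.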