WordSmith Tools is ‘an integrated
suite of programs for looking at how words behave in texts.’ It ‘controls’ the
programs it contains: Concord (makes a concordance from plain text or
web text files), KeyWords (locates and identifies key words in a given
corpus), and WordList (generates word lists shown in alphabetical
and frequency order).
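To make the two simpler operations concrete, the following is a minimal Python sketch of what a word list and a concordance amount to. It is not WordSmith's own implementation: the function names (`tokenize`, `word_list`, `concordance`), the crude tokenizer, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())

def word_list(tokens):
    """Return the word counts twice: in alphabetical order and in frequency order."""
    counts = Counter(tokens)
    alphabetical = sorted(counts.items())
    # Highest count first; ties are broken alphabetically.
    by_frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return alphabetical, by_frequency

def concordance(tokens, node, width=3):
    """Return keyword-in-context triples: up to `width` tokens either side of each hit."""
    lines = []
    for i, token in enumerate(tokens):
        if token == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# Invented sample text, used only to exercise the functions.
sample = "The press reported the strike. The strike ended after the press left."
tokens = tokenize(sample)
alphabetical, by_frequency = word_list(tokens)
hits = concordance(tokens, "strike", width=2)
```

Here `by_frequency` begins with the most frequent token, and `hits` holds one (left context, node word, right context) triple per occurrence of the search word.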
Johnson
& Ensslin (2006) discussed methodological concerns about keyword
analysis, in particular how reliable the BNC is as a reference corpus against which
research corpora can be neutrally analyzed. They
identified two problems. The BNC, constructed by Scott and composed of a set of
90.7 million words taken from the late 1980s and early 1990s, fails to cover
themes outside that time frame, which results in the “problem of age
disparity”. The other problem is related to “proper names” in newspapers and
media discourse. Proper names may appear as “key” keywords in any newspaper
corpus. Scott (2000), however, ruled out proper names of any kind on the
grounds that they change over time. Sinclair (2004) argued that
articles including proper names should be excluded on the basis that they put
the homogeneity of the research corpus at risk. But what about articles containing
household names that are deeply related to the area one is investigating?
Johnson & Ensslin (2006) suggested two ways out, each with a serious drawback:
either build one’s own comparator corpus from scratch to generate a more reliable list
of the most frequent words, which is time-consuming, or edit the keyword lists
extensively, which eventually calls the reliability
and objectivity of the study into question.
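Both problems can be illustrated with a small keyness calculation. The sketch below uses the log-likelihood statistic, one commonly used keyness measure, and is not claimed to reproduce the exact procedure followed by Johnson & Ensslin (2006). The function names and all counts are invented for illustration: a word that is absent from the reference corpus, such as a recent proper name, rises to the top of the keyword list.

```python
import math
from collections import Counter

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood (G2) for one word across a study corpus and a reference corpus."""
    total = freq_study + freq_ref
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    g2 = 0.0
    for observed, expected in ((freq_study, expected_study), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

def keywords(study_counts, ref_counts, min_g2=3.84):
    """Rank words that are relatively more frequent in the study corpus.

    min_g2=3.84 is the chi-square critical value for p < 0.05 with one
    degree of freedom; the cut-off itself is an assumption of this sketch.
    """
    size_study = sum(study_counts.values())
    size_ref = sum(ref_counts.values())
    ranked = []
    for word, freq_study in study_counts.items():
        freq_ref = ref_counts.get(word, 0)
        overused = freq_study / size_study > freq_ref / size_ref
        g2 = log_likelihood(freq_study, size_study, freq_ref, size_ref)
        if overused and g2 >= min_g2:
            ranked.append((word, g2))
    return sorted(ranked, key=lambda pair: -pair[1])

# Invented toy counts: "blair" does not occur in the (older) reference corpus.
study = Counter({"the": 60, "said": 20, "minister": 12, "blair": 8})
reference = Counter({"the": 600, "said": 200, "minister": 10, "of": 190})
result = keywords(study, reference)
```

In this toy data "the" and "said" have the same relative frequency in both corpora and therefore drop out, while "blair" outranks "minister" purely because the reference corpus contains no instance of it.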
To get around these setbacks, some other analysts,
like Baker (2004), have promoted a carefully triangulated
quantitative and qualitative methodology that combines statistical
findings with what Baker (2004) called “inclusive and subjective”
interpretations, avoiding both a lexical-only approach and
subjectively collected data.
Johnson, S. & Ensslin, A. (2006). Language in the News: Some
Reflections on Keyword Analysis Using Wordsmith Tools and the BNC. Leeds Working Papers in Linguistics and Phonetics, 11.