English 455
Literary and Linguistic Computing
Clai Rice
University of Louisiana at Lafayette
	
14 Most Frequent Nouns (by Lemma) from the British National Corpus
time	1833
year	1639
people	1256
way	1108
man	1003
day	940
thing	776
child	710
Mr	673
government	670
work	653
life	645
woman	631
system	619
Office: Griffin 357
Phone: 2-1327
Email: crice@louisiana.edu
Office Hours: MW 9:00-12:00 and by appointment



Course Description:

The computer has revolutionized the practice of every humanities discipline. As society in general becomes more reliant on computing technology, and as the forms of human social interaction begin to presuppose computer-readability, all humanities scholars will benefit from being familiar with the many ways to read and write with a computer. The immediate availability of millions of digital texts, at Amazon, Google, and elsewhere, has already begun to transform our basic concepts of reading, scholarship, and reference. Jerome McGann forecasts that “in the next 50 years, the entirety of our inherited archive of cultural works will have to be re-edited within a network of digital storage, access, and dissemination.” Will we be ready to do the job?

This course will introduce students to basic concepts in linguistic and literary computing, and to several tools useful for researching, producing, and delivering electronic text. Students will learn how to plan and create a corpus of electronic text from written or oral sources; use WordSmith Tools to find, analyze, and display information about a text; and mark up a text with XML tags to make specific information easier to retrieve. Students will also use some of the text and tool archives already available on the World Wide Web, including the Whitman and Dickinson Archives, the Rossetti Archive, the British Library’s Shakespeare in Quarto, the Proceedings of the Old Bailey (criminal trials in London from 1674 to 1834), and online concordances for many authors and publications. We will also examine the practices involved in doing Humanities research in the computer age and ask how these practices might be redefining the Humanities as an academic discipline.
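To make "find, analyze, and display information about a text" concrete, here is a minimal Python sketch of two basic corpus operations: a word-frequency list and a keyword-in-context (KWIC) concordance. It is only an illustration of the general idea, not how WordSmith Tools works internally. The sample sentence and the function names `tokenize`, `frequency_list`, and `kwic` are invented for this example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_list(tokens, n=5):
    """Return the n most frequent word forms with their counts."""
    return Counter(tokens).most_common(n)

def kwic(tokens, node, span=3):
    """Return one keyword-in-context line per hit, with `span` tokens on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            # Right-align the left context so the node words line up in a column.
            lines.append(f"{left:>25} [{tok}] {right}")
    return lines

# Invented sample text, for illustration only.
sample = ("The time of the year matters. People find a way, "
          "and in time the people change the way they work.")
tokens = tokenize(sample)

print(frequency_list(tokens, 4))
for line in kwic(tokens, "time"):
    print(line)
```

Note that this sketch counts word forms, not lemmas; grouping "child" and "children" under one headword, as a lemmatized frequency list does, requires an additional lemmatization step.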

This project-oriented course will be useful for anyone interested in working with a body of text, whether it be a specific literary text or group of texts, a critical edition, data from interviews, collections of folk tales from a variety of sources, or genre-specific language such as political speeches or trial transcripts.  Individual student projects may treat literary or corpus-based linguistics topics.  Readings and discussion will be heavily supplemented by hands-on computer work with texts and corpora.  Only a basic familiarity with everyday computer tools like email and web-browsing will be assumed.
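As a taste of the hands-on work, the sketch below shows how markup can make a text such as a trial transcript searchable by speaker. It uses Python's standard `xml.etree.ElementTree` module. The transcript is invented, and the element and attribute names (`trial`, `u`, `who`) and the helper `utterances_by` are assumptions chosen for illustration, not a required schema.

```python
import xml.etree.ElementTree as ET

# Invented transcript. The element and attribute names (<trial>, <u>, who)
# are illustrative assumptions, not a prescribed encoding scheme.
doc = """<trial id="t001">
  <u who="judge">What is your name?</u>
  <u who="witness">My name is Ann Smith.</u>
  <u who="judge">Where do you live?</u>
</trial>"""

root = ET.fromstring(doc)

def utterances_by(root, speaker):
    """Return the text of every <u> element whose who attribute matches speaker."""
    return [u.text for u in root.iter("u") if u.get("who") == speaker]

print(utterances_by(root, "judge"))
```

Once the speakers are tagged, a question like "what does the judge say?" becomes a simple query over the markup, which would be difficult to answer reliably from plain, untagged text.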

Texts:

Words and Phrases, by Michael Stubbs (Blackwell, 2001). ISBN: 0-631-20833-X
A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth (Blackwell, 2004). (http://www.digitalhumanities.org/companion/)
Corpus-Based Language Studies: An Advanced Resource Book, by Anthony McEnery, Richard Xiao, and Yukio Tono (Routledge, 2005). ISBN: 978-0-415-28623-7
Graphs, Maps, Trees: Abstract Models for a Literary History, by Franco Moretti (Verso, 2005). ISBN: 978-1844671854
Also, all students will need to have their university computer account ID and password.


Daily Syllabus
  Monday Wednesday
Aug 24
Course Intro. Empirical Linguistics
Homework assignment.
Background: CDH 7: Linguistics
26
Basic electronic text and text repositories. HW
Companion to Digital Humanities, Ch 18: Electronic Texts
optional: A Brief Introduction to Humanities Computing and Electronic Text (http://www.ceth.rutgers.edu/intromat/introtext.html)
Aug 31
Unix environment (Telnet, FTP); WordSmith Tools HW 2
examine inaugural speech texts

Sep 7
No Class--Labor Day
9
Stubbs, Chap 1; use the BNC sampler to do exercise 2, p. 22 for Monday
Sep 14
Stubbs, Chap 2. HW4 due
Sinclair: Corpus and Text: Basic Principles
16
Stubbs, Chap 3, Working with phrases
HW5 due.
Sep 21
lexical profiles
Sinclair: How to build a corpus
23
Stubbs Chap 4
HW6 due.
Sep 28
fixed phrases
http://linserv1.cims.nyu.edu:23232/ngram_oanc/
30
phrases
Oct 5
Using statistics for collocations:
Stubbs 3.9 (73-75); Stubbs 1995, sec. 4; McEnery sec. A6
(helpful summary of chi-square rationale)
7
Midterm project due
Oct 12
Midterm in class
14
hand tagging techniques--Leech: Adding Linguistic Annotation
Nested Markup (HTML, SGML, XML)
Edward Vanhoutte An Introduction to the TEI and the TEI Consortium (2004)
Oct 19
CDH 17: Renear, "Text Encoding."
CDH 18: Willett, "Audiences and Purposes"
Text Encoding Initiative (TEI), "A Gentle Introduction to XML" (http://www.tei-c.org/P4X/SG.html)
21
Markup.
Allen Renear, Elli Mylonas, and David Durand. "Refining Our Notion of What Text Really Is: The Problem of Overlapping Hierarchies." Research in Humanities Computing (1996)
Oct 26
HW7 due.
CDH 16: McGann, "Marking Texts"
28
CDH 22: Smith, "Scholarly Editing"; CDH 24: Palmer, "Thematic Collections"
Dickinson Electronic Archives; Emily Dickinson Lexicon; William Blake Archive
Nov 2
McGann and Samuels, Electronic texts and deformative reading; Travesty 4
"Ubu web wants to be free"; UbuWeb; Aspen
Nov 9
Moretti: Graphs, Maps, Trees (Grad Students Only)
11
Site Reviews Due
Nov 16
Stubbs, Chap 6
18
Stubbs: "Conrad in the Computer" (Moodle)
Nov 23
Toolan: "Keyword Abridgment" (Moodle); "Narrative Progression" (Moodle)
25
No Class
Nov 30
Dillon: "Corpus, Creativity, Cliche" (Moodle)
2
Stubbs, Chap 9
Dec 7
Final Exam / Project Due
9
Mid-Exam Break
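
The Oct 5 topic, using statistics for collocations, can be previewed with a small worked example. The sketch below computes Pearson's chi-square for a 2x2 contingency table of word-pair counts, which is one common way to ask whether two words co-occur more often than chance would predict. The function names and all counts are invented for illustration; the assigned readings discuss which measures are appropriate and what their limits are.

```python
def collocation_table(pair_count, w1_count, w2_count, total_bigrams):
    """Build the 2x2 table of observed counts for the word pair (w1, w2).

    o11: w1 followed by w2        o12: w1 followed by another word
    o21: another word before w2   o22: neither w1 first nor w2 second
    """
    o11 = pair_count
    o12 = w1_count - pair_count
    o21 = w2_count - pair_count
    o22 = total_bigrams - o11 - o12 - o21
    return o11, o12, o21, o22

def chi_square_2x2(o11, o12, o21, o22):
    """Pearson's chi-square for a 2x2 table, using the shortcut formula
    N * (o11*o22 - o12*o21)^2 / (row1 * row2 * col1 * col2)."""
    n = o11 + o12 + o21 + o22
    row1, row2 = o11 + o12, o21 + o22
    col1, col2 = o11 + o21, o12 + o22
    denom = row1 * row2 * col1 * col2
    if denom == 0:
        raise ValueError("a row or column total is zero; chi-square is undefined")
    return n * (o11 * o22 - o12 * o21) ** 2 / denom

# Invented counts: the pair occurs 10 times, w1 occurs 30 times as a first word,
# w2 occurs 40 times as a second word, and the sample holds 100 bigrams.
table = collocation_table(10, 30, 40, 100)
print(table, chi_square_2x2(*table))
```

With one degree of freedom, a chi-square value above 3.84 is conventionally treated as significant at the 0.05 level. The test is unreliable when expected counts are very small, which is frequent with rare words.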


Attendance: This course is cumulative and we will move through the material at a rapid pace.  Missing class for any reason usually causes students to fall woefully behind. University policy is that you may miss 10% of the class meetings without institutional consequences.  Subsequent absences will cause your grade to suffer.  No make-up tests will be given unless you tell me in advance of class that you will be absent for some (important) reason.

Americans With Disabilities Act Compliance Statement
It is the policy of the University of Louisiana at Lafayette to afford equal opportunity in education to qualified students. If you have a disability that may prevent you from meeting course requirements, contact the instructor immediately to file a student disability statement and to develop an accommodation plan. Course requirements will not be waived but reasonable accommodations will be developed to assist you in meeting the requirements.

Percentages of Each Assignment:
Homework: 10%
Tests (2): 15% each
Midterm Project: 20%
Final Project: 40%

The tests will be take-home exercise sets. The projects will be of your own design and may deal with any type of corpus you want to work with. Some possibilities include literary, web-based, child language, historical, documentary (such as legal or legislative proceedings), comprehensive, or self-created corpora. Graduate students will be encouraged to create and work with corpora that can serve as the basis for their dissertation study.

