There are five workshops, described below. One of them, “Machine Learning for the Computational Humanities,” will be offered on Friday afternoon, October 24. The remaining four will be scheduled on Saturday, October 25. The workshops are open equally to TEI and DHCS attendees. Applications will be handled on a first-come, first-served basis.

Register for a workshop by sending an email to, with the subject line “Register | {name of workshop}”. Please be sure to include your full name and, if applicable, institutional affiliation.

Registration for the TAPAS workshop has closed.

Using and Customizing TEI Boilerplate

John Walsh (Indiana)

Saturday, October 25, 9:00 – 12:00. Place TBA

An introduction to TEI Boilerplate, a practical, no-nonsense system for publishing TEI documents on the Web.

Full description of workshop

TAPAS: TEI Archiving Publishing and Access Service

Syd Bauman, Benjamin Doyle, Julia Flanders (Northeastern University)

Saturday, October 25, 9:00 – 17:00. Place TBA

A roll-out of the TAPAS project, a service for scholars and other creators of TEI data who need a place to publish their materials in different forms and ensure that they remain accessible over time. The goal of TAPAS is to provide TEI publishing and repository services at low cost to those who lack institutional resources: faculty, students, librarians, archivists, teachers, and anyone else with TEI data who wants to store, share, and publish it.

To register, send an email to, with the subject line: Register | October 2014 TAPAS Workshop. Please be sure to include your full name and, if applicable, institutional affiliation.

Full description of workshop

An introduction to TEI’s ODD: One Document Does it All

Lou Burnard, Sebastian Rahtz (Oxford)

Saturday, October 25, 9:00 – 17:00. Place TBA

A general, hands-on introduction to the TEI’s “literate programming” method of documentation.

Full description of workshop

The Music Encoding Initiative: a one-day survey

Perry Roland (Virginia) and Laurent Pugin (Répertoire International des Sources Musicales (RISM))

Saturday, October 25, 9:00 – 17:00. Place TBA

Perry Roland, a librarian at the University of Virginia, is the inventor of MEI, the musical cousin of the TEI. He will be joined by the co-director of the Swiss RISM in giving an overview of MEI, about whose history you can find out more at

Full description of workshop

Machine Learning for the Computational Humanities

David Bamman
School of Computer Science, Carnegie Mellon University

Friday, October 24, 13:00 – 17:00. Place TBA

Machine learning is a branch of computer science that helps drive much of the exciting work in the computational corners of the humanities and social sciences; its methods underlie topic models, classifiers, clustering algorithms, syntactic parsers, and named entity recognizers (among many others). Tools such as MALLET and Weka have made the application of machine learning techniques widespread, but it’s easy to see them as black boxes; the goal of this tutorial is to break open these boxes and have a look inside.
We’ll survey a range of existing methods in machine learning, and answer the following questions for each one:
* What’s the basic intuition behind it?
* What assumptions does it make about the world (or the data)?
* Why would we prefer this method over others?
* What tools can we use to implement this method?
* How might you use this method for research in the humanities?
Machine learning techniques that we’ll cover include:
* Topic modeling and other probabilistic graphical models
* Classification methods (logistic regression, Naive Bayes, CRFs, HMMs, etc.)
* Clustering (EM, K-means, hierarchical clustering)
* Representation learning (including “deep learning”)
* Supervised vs. unsupervised learning
By the end of the tutorial, participants will be able to explain at a high level how each of these methods works, understand when it is (and is not) appropriate to apply each one, and know where to go for more information. No prior computational background is required. This tutorial is free and open to the public.
David Bamman is a PhD student in Computer Science at Carnegie Mellon University.  His research applies natural language processing and machine learning to empirical questions in the humanities and social sciences, including modeling linguistic variation (ACL 2014, Journal of Sociolinguistics 2014), inferring character types in movie plot summaries (ACL 2013) and novels (ACL 2014), inferring social rank in an Old Assyrian trade network (DH 2013) and detecting censorship in Chinese social media (First Monday 2012).  David designed and co-taught an interdisciplinary (English/Computer Science) course at CMU on “Digital Literary and Cultural Studies,” for which he received Carnegie Mellon’s 2014 Alan J. Perlis Graduate Student Teaching Award.  Prior to CMU, David was a senior researcher at the Perseus Project.