Natural Language Processing and Machine Learning in the Humanities
The March 2016 failure of Microsoft’s prototype chatbot, Tay, was not a technological failure. It was a disciplinary failure: a failure of computer science to incorporate a critical humanist perspective when building systems for a complex cultural and social environment. Tay, which stands for “thinking about you,” was an artificial intelligence chatbot for Twitter that was quickly corrupted by users and began spewing racist, sexist, and homophobic slurs. Pundits leapt to conclusions about the political beliefs of internet users, but they failed to understand that this hacking of Tay was in fact a critique of chatbots in the real world. Twitter users were exposing a fundamental error made by the Microsoft development team. Because the system learned directly from user input without editorial control or content awareness, Tay was quickly trained to repeat slurs by users eager to embarrass Microsoft.
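The failure mode is easy to sketch: a bot that learns by absorbing user input will repeat whatever it is fed unless some editorial layer sits between input and training data. A minimal illustration in Python (the `ParrotBot` class and blocklist are hypothetical simplifications for this essay, not Tay’s actual architecture):

```python
# A toy "learn by parroting" bot: it stores user phrases and repeats them.
# Without moderation, anything users type becomes part of its repertoire.

class ParrotBot:
    def __init__(self, moderate=None):
        self.learned = []          # phrases absorbed from users
        self.moderate = moderate   # optional editorial filter

    def listen(self, phrase):
        # Editorial control: reject phrases the filter flags.
        if self.moderate and not self.moderate(phrase):
            return False
        self.learned.append(phrase)
        return True

# A crude blocklist stands in for genuine content awareness.
BLOCKLIST = {"slur"}

def simple_filter(phrase):
    return not any(word in BLOCKLIST for word in phrase.lower().split())

naive = ParrotBot()                      # Tay-style: no editorial control
guarded = ParrotBot(moderate=simple_filter)

naive.listen("you are a slur")           # absorbed uncritically
guarded.listen("you are a slur")         # rejected by the filter
```

A real system would need far more than a word blocklist, of course; the point is only that the editorial layer must exist at all.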
This moment in technological development makes for an interesting anecdote, but it also represents the moment that chatbots entered the public consciousness in the 21st century. It represents a halting beginning to a period in which chatbots will become nothing less than the future direction of a unified interface for the whole of the web. Of course, chatbots captured imaginations in the late 1990s as well. Systems like Cleverbot, Jabberwacky, and Splotchy were fascinating to play with, but they had no real application. Today, text-based AI has been identified as the successor to keyword search. No longer will we plug keywords into Google, comb through lists of results, and depend on search engine optimization (SEO) to deliver the best content. Search will be around for a long time and is becoming increasingly conversational, but in the near future much more content will be delivered through dialogue-based text messenger services and voice-controlled systems. We’ve seen the early stages of this change in products like Amazon’s Alexa, Apple’s Siri, Google Now, and Microsoft’s Cortana. We are now approaching a world that Apple envisioned in 1987 with a mockup system called the Knowledge Navigator, which sought to give users an interactive and intelligent tool to access, synthesize, present, and share information seamlessly.
Humanities in the Loop
We are likely decades away from a true “knowledge navigator,” but a new generation of chatbots is now in development. The team that developed Siri for Apple is now in the final stages of development on a system called Viv, which aims to be the first viable unified interface for text- and speech-based AI assistants. Facebook has tested Project M within its Messenger app to allow users to issue commands, access services, and make purchases through text input. The remarkable thing about M is that Facebook has built a system with “humans in the loop.” When a service is accessed, perhaps to purchase movie tickets, a human fine-tunes the AI-generated results for each transaction. There is an understanding within the machine learning community that human-assisted training not only produces more accurate results now but will also yield more robust systems going forward. The current need for “human in the loop” systems means that we are at a crucial moment for humanists to lend their experience and critical abilities to the development and training of AI systems. Training a system to answer humanities-based problems will show how these systems succeed or fail and will also demonstrate the value of the humanities in a digital world. If the purpose of the humanities is to better understand what it is to be human, training AI to answer philosophical, historical, or cultural questions will help us understand our own experiences as we become more accustomed to intelligent systems. Grappling with AI, whether in a mundane consumer exchange or in matters of grave ethical importance, is rapidly becoming a practical problem in our lives.
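The human-in-the-loop cycle described above can be sketched in a few lines: the model proposes an answer, a human reviewer corrects it, and the corrected pair flows back into the training data. This is a hypothetical simplification (the functions and the ticket-booking example are invented for illustration), not Facebook’s actual pipeline:

```python
# A minimal human-in-the-loop training cycle: the model proposes an answer,
# a human reviewer corrects it, and the corrected pair is added to the
# training data so future predictions improve.

def model_predict(query, training_data):
    # Stand-in "model": return the stored answer if the query has been seen.
    return training_data.get(query, "I don't know yet.")

def human_review(query, prediction, correct_answer):
    # The human in the loop approves or overrides the model's output.
    return prediction if prediction == correct_answer else correct_answer

training_data = {}
query, truth = "Two tickets for the 7pm show", "Booked: 2 tickets, 7pm"

first = model_predict(query, training_data)     # the model fails at first
corrected = human_review(query, first, truth)   # a human fixes the result
training_data[query] = corrected                # the fix becomes training data

second = model_predict(query, training_data)    # the model now answers correctly
```

The design choice worth noting is that the human correction is not discarded after the transaction; it is captured as new supervision, which is why these systems are said to grow more robust over time.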
It is for this reason that I believe that the next great advances in digital technology will be aided by humanists. With humanists in the loop, we will better understand the social and cultural contexts in which these systems appear and avoid regrettable failures like Tay in the future. We are on the cusp of a revolution in the applicability of natural language understanding, artificial intelligence, and conversation-based interface design. These technologies will have wide-ranging social, cultural, and economic consequences in the coming decade, and they are deeply connected to the contexts in which they appear. My goal is to train machines to learn humanistically. It is the literary critic’s ability to close read and interpret complex philosophical, historical, and artistic meaning that these systems lack. It is the historian’s ability to contextualize political and technological change within the breadth of human progress that will help these systems anticipate problems. It is the dramatist’s ability to understand performance and dialogue that will breathe life into our conversations with computers. Digital humanists are well situated to make the most of natural language processing and to find culturally significant training sets. As just one point of connection with our 20th-century humanist tradition, my work aims to reorient Mikhail Bakhtin’s “dialogic imagination” for the 21st century. I wish to discover how the “double-voicedness” of the novel can become a participatory way of exposing the ideological boundaries in new media. I seek to reveal the “co-existing boundaries” between authors, artists, and other cultural figures and their audiences through this near-ubiquitous chat interface. Key philosophical and technical questions relate to how language is hybridized between user, author, and artificially intelligent system.
If a system learns from users, how do we treat the surviving legacies of authors as they blend with contemporary voices? Can this machine-learned “heteroglossia” of coexisting and conflicting dialects and languages teach us about the ideologies bound up in authorial voices from our past? How can a polyglot corpus drawn from contemporary sources (social media, comment boards, public records, etc.) bring today’s ideologies and prejudices together for coherent analysis and critique?
Faulknerbot is just the first of several systems capable of making large archival datasets available through a conversational interface. I am planning systems based on non-literary sources as well, such as the archival documents of the Truth and Reconciliation Commission and the National Inquiry into Missing and Murdered Indigenous Women and Girls. We are at a watershed moment to help communicate and disseminate these testimonies in as powerful and respectful a way as possible. We must also find ways to share these stories with young people. There is tremendous potential for conversational content discovery in these contexts, and I am excited to collaborate with multiple knowledge stakeholders to build these systems.
While the public appeal is important, expert researchers could also use these systems to find passages related to their work and to explore archival content in a conversational way. We are able to train a system with all the writings and interviews available. Interviews make an excellent training set because the questions asked by the interviewer anticipate user interests and model a conversational style of response. Chatbot-based content discovery is a low-barrier access point well suited to mobile interfaces and to younger scholars just becoming acquainted with archival resources. Expert researchers already well acquainted with those resources may benefit from serendipitous discoveries that emerge through organic exploration of large textual corpora. This kind of interface holds potential for new media to engage non-scholarly communities and new knowledge stakeholders.
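Interviews translate naturally into training data because they already arrive in question-and-answer form. A sketch of the preprocessing involved, in Python (the `Q:`/`A:` transcript format is invented for illustration; real interview archives would need format-specific parsing):

```python
# Convert an interview transcript into (prompt, response) training pairs.
# Lines are assumed to alternate between interviewer ("Q:") and subject ("A:").

def transcript_to_pairs(transcript):
    pairs, question = [], None
    for line in transcript.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question:
            pairs.append((question, line[2:].strip()))
            question = None      # each question is consumed once
    return pairs

interview = """
Q: What draws you to the South as a subject?
A: My own little postage stamp of native soil was worth writing about.
Q: Do you revise heavily?
A: I wrote The Sound and the Fury five separate times.
"""

pairs = transcript_to_pairs(interview)
```

The interviewer’s questions stand in for future user queries, which is exactly why this material anticipates user interests so well.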
Neural Conversation Models
There is a deep emotional resonance carried through conversation. The blurring of lines between social media, search, and messaging will result in a seamless, unified interface for digital technology. Driven by the mobile space’s demand for streamlined UI design, we will become more reliant on assistive technologies that can anticipate, learn, and adapt to user input. The system that created Faulknerbot, for example, is a generative neural conversation model built with TensorFlow. This generative model uses sequence-to-sequence learning with neural networks (Ilya Sutskever, Oriol Vinyals, Quoc V. Le). The model links words statistically to determine “flows” of meaning through a tensor of words, what Geoffrey Hinton calls a “thought vector,” which allows for dynamic conversation in real time. In other words, the system links neural networks end-to-end to encode user input and decode dynamic responses. These neural conversation systems are not merely a retrieval method, which limits the scope of the conversation to a fixed set of responses. The system learns dynamically and retains what has been said: the generative model allows for context-based discussion without resorting to an enormous conversation log. In TensorFlow, this operates on a Long Short-Term Memory (LSTM) network. The cultural, social, and pedagogical consequences of these systems have yet to be fully understood, and it is important for the humanities to anticipate this new cultural space. When Google’s autocomplete system was introduced to search, many cultural commentators decried the loss of independent thought and the potential for entrenching sexist and racist ideas. We have an urgent political and social need to demonstrate and build ethical systems. Technology that offends our sense of what it is to be essentially human is usually the next important media type, but we must be careful to retain our fundamental humanity.
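The LSTM network mentioned above manages that retention of what has been said through gates that decide, at each step, what to keep, what to forget, and what to emit. A bare-bones single LSTM cell in pure Python, with hand-picked toy weights and no training (a real TensorFlow model stacks many such cells over learned word vectors, and this one-dimensional sketch only shows the gate arithmetic):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h, c, w):
    """One step of a 1-dimensional LSTM cell.
    x: current input; h: previous hidden state; c: previous cell (memory) state;
    w: dict with one (w_x, w_h, bias) triple per gate."""
    i = sigmoid(w["i"][0] * x + w["i"][1] * h + w["i"][2])    # input gate
    f = sigmoid(w["f"][0] * x + w["f"][1] * h + w["f"][2])    # forget gate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h + w["o"][2])    # output gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h + w["g"][2])  # candidate memory
    c_new = f * c + i * g          # memory: keep some old, admit some new
    h_new = o * math.tanh(c_new)   # hidden state exposed to the next layer
    return h_new, c_new

# Toy weights: every gate sees the input, the history, and a zero bias.
weights = {k: (0.5, 0.5, 0.0) for k in ("i", "f", "o", "g")}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.2]:         # a "sentence" of three token encodings
    h, c = lstm_step(x, h, c, weights)
```

The cell state `c` is the conversation’s running memory: earlier inputs persist in it only to the degree the forget gate allows, which is how the model carries context without storing a conversation log.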
As scholars and citizens, we must begin training our students and faculty to make use of these systems. Technological change represents a moment of tremendous anxiety and uncertainty, but technological literacy represents the best possible means of making productive use of that uncertainty. As with past periods of rapid change, it will be that most human of qualities, literacy, that defines the future.
Generalizing these systems is a difficult task, to be sure. AIML (Artificial Intelligence Markup Language) is the longest-running technique for building chatbots. However, the tree structures developed in markup are not dynamic or scalable enough for a robust chatbot; they amount to crude if/else statements. Many other platforms help build similar tree structures in a GUI. The platforms wit.ai, pandorabots.com, and chatfuel.com have proven robust enough to train a system to answer specific questions. However, once larger training sets are established within a machine learning framework, a much more reliable and scalable system is possible. By building machine learning systems with scikit-learn, TensorFlow, NLTK, and spaCy, static tree-based systems can be augmented with large archives of both labeled and unlabeled data. With a team of undergraduate and graduate students lending the human oversight, effectively training these systems becomes a question of scaling machine learning. Collecting corpora and encoding the material to define user intention within any given expression becomes the task for students within the “bot factory.” When we consider the limits of machine learning, scholarly communication through chat interfaces is certainly the next logical step. However, these systems need human oversight. They require thoughtful and critical reflection. They require an attention to deep and nuanced meaning. They require a humanist in the loop.
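The rigidity of AIML-style pattern trees can be caricatured in a few lines of Python. This is not a real AIML interpreter, and the patterns are invented for illustration; the point is the brittleness itself:

```python
# A caricature of the AIML approach: responses hang off literal patterns,
# so the "tree" is really a chain of if/else checks. Anything outside the
# anticipated patterns falls through to a default.

RULES = [
    ("WHO ARE YOU", "I am a chatbot trained on archival texts."),
    ("WHAT IS *",   "Let me search the archive for that."),
    ("HELLO",       "Hello! Ask me about the collection."),
]

def match(pattern, text):
    # AIML-style matching: '*' is a wildcard for the rest of the input.
    if pattern.endswith("*"):
        return text.startswith(pattern[:-1].strip())
    return text == pattern

def respond(user_input):
    text = user_input.upper().strip()
    for pattern, reply in RULES:
        if match(pattern, text):
            return reply
    return "I don't understand."   # everything unanticipated lands here

reply_known = respond("what is heteroglossia?")
reply_unknown = respond("tell me about Yoknapatawpha")
```

Every input outside the anticipated patterns falls through to the default response, which is precisely the brittleness that machine-learned intent classification, trained on large corpora with humanists supervising the labels, is meant to overcome.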