What to expect?
Python is widely used in many scientific domains and is readily accessible to scholars from the Humanities. It is an excellent choice for working with the linguistic and literary textual data that is so typical of the Humanities. This tutorial introduces you thoroughly to the language and teaches you to program basic algorithmic procedures. It expects no prior experience with programming, although we hope to provide some interesting insights and skills for more advanced programmers as well. The tutorial consists of six chapters.
- Chapter 1 starts with the very basics, where we will try to whet your appetite. You will be asked to do many short quizzes to test whether you really understand the material.
- Chapter 2 will introduce you to the task of text processing. You will learn how to read files from your computer, how to clean them, and how to compute a frequency distribution over words.
- Chapter 3 deals with preprocessing text. You will learn some of the elementary tools to analyse your data.
- Chapter 4 is a more theoretical chapter that explains some of the basic programming principles and common practices, and shows where to find documentation.
- In Chapter 5 things become more difficult. First, you will write a program to compute the readability of texts. Next, you will implement the basic algorithm that is behind authorship attribution!
- In Chapter 6 we will introduce you to the concept of Object Oriented Programming. You will implement a network structure with which you can analyze relations between people on Twitter.
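To give a flavour of the kind of program Chapter 2 works towards, here is a minimal sketch of a word frequency count. It is not taken from the course material: the function name `word_frequencies`, the sample sentence and the choice to clean the text with a regular expression are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercased word to its count."""
    # Cleaning step (an assumption): lowercase the text and keep only
    # runs of word characters, which drops punctuation and whitespace.
    words = re.findall(r"\w+", text.lower())
    return Counter(words)


# A hypothetical sample; in practice the text would be read from a file.
sample = "The cat saw the dog. The dog saw the cat!"
frequencies = word_frequencies(sample)

# Show the single most frequent word together with its count.
print(frequencies.most_common(1))
```

The course itself builds up to this step by step, so do not worry if the code is not yet clear.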
In the course we will be using the IPython Notebook software, which works best with Google Chrome. Firefox and Safari will also work. Internet Explorer is not supported!
We will be using Python 3 in this tutorial, so earlier versions of Python are not sufficient. Below we describe the installation procedure for Python and the necessary dependencies for this tutorial.
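If you are unsure which version of Python you have, a small check along the following lines may help. This is only a sketch: the helper name `is_python3` is made up for this example and is not part of the course files.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple belongs to Python 3 or later."""
    # The first element of the version tuple is the major version number.
    return version_info[0] >= 3


if is_python3():
    print("Python", sys.version.split()[0], "is suitable for this tutorial.")
else:
    print("Please install Python 3 before continuing.")
```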
We also recommend that you install a good text editor, such as Sublime Text 2. You are of course free to use your own favorite editor.
- We strongly advise you to install the Anaconda Python Distribution. This distribution contains all the modules and packages needed for this course. It is available for all platforms and provides a simple installation procedure. You can download it from here. More detailed installation instructions can be found here. After you have successfully installed Anaconda, simply double-click the file start-windows.bat (if you work with Windows), start-osx.command (if you work on a Mac) or start-unix.sh (if you work with Linux).
- Download and install the Anaconda Python Distribution (see above).
- Double-click the file start-windows.bat. You can find this file in the folder of the course.
- If everything goes right, this should open your browser (preferably Google Chrome or Firefox) on the page http://127.0.0.1:8888/ (or something similar), which says 'IP[y]: Notebook'. If for some reason the notebook is opened by Internet Explorer, copy the URL and paste it into either Google Chrome or Firefox.
Only take these steps if you know what you are doing. Otherwise, simply download and install the Anaconda Python Distribution (see above).
Ubuntu 12.10 and above:
If you are on another distribution, look for similar packages. If no package like ipython3-notebook is available for your distribution (as on Ubuntu 12.04 and below), follow the procedure below instead. Adapt the apt-get lines to your package manager; on Fedora/RedHat/CentOS this will be yum, and on SuSE zypper:
- Open a terminal
- First uninstall ipython3 if it is already installed through your package manager, since we will be reinstalling it from a newer source:
sudo apt-get remove ipython3
- Then type:
sudo apt-get install python3 python3-setuptools
sudo easy_install3 tornado
sudo easy_install3 zmq
sudo easy_install3 ipython
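After running the commands above, you may want to check that the packages can actually be found by Python 3. The following is a rough sketch rather than an official part of the installation procedure. The helper `missing_modules` is hypothetical, and the import names `tornado`, `zmq` and `IPython` are assumed to correspond to the packages installed above.

```python
import importlib.util

# Import names assumed to match the packages installed by the commands above.
REQUIRED_MODULES = ["tornado", "zmq", "IPython"]


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    # find_spec returns None when no top-level module of that name is found.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All required modules were found.")
```

Run it with the same Python 3 interpreter that you used for the installation, for example by saving it to a file and starting it with python3.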
If you do not want to install the IPython Notebook, or just want to see what the tutorial is about, you can check out the static notebooks below.
Chapter 1 - Getting started
Chapter 2 - First steps into text processing
Chapter 3 - Text Analysis
Chapter 4 - Programming principles
Chapter 5 - Building NLP applications
Chapter 6 - Object Oriented Programming