Our team of PhD data science specialists serves the Harvard social science community. We perform free in-person consulting on publication research issues as a part of the Institute for Quantitative Social Science. Our services are available to faculty, postdocs, graduate students, staff, and undergraduates writing a senior thesis. Specifically, we aim to provide advice on:

  • Data analysis and programming
  • Organization, secure storage, and sharing of data
  • Research project planning
  • Training in the use of both established software packages and emerging tools

To schedule a consultation or request help, please send a detailed description of your specific research problem, your affiliation (School / Dept.), and position (e.g., grad student) to help@iq.harvard.edu or use our contact form.

Our team also offers additional data science services. If you have a data science problem and you wish to speak to us, please contact us at help@iq.harvard.edu.


Introduction to R

2017 Sep 15

Introduction to R (1:00 -- 3:30)

Date: 

Friday, September 15, 2017, 1:00pm to 3:30pm

Location: 

K018, CGIS Knafel building, concourse level

Get an introduction to R, the open-source system for statistical computation and graphics. 

With hands-on exercises, learn how to import and manage datasets, create R objects, install and load R packages, conduct basic statistical analyses, and create common graphical displays. This workshop is appropriate for those with little or no prior experience with R.

 

More details including workshop materials are available at http://dss.iq.harvard.edu/workshop-materials#widget-2.

This workshop is free for Harvard and MIT affiliates. Click here to sign up!

 

2018 Feb 16

Introduction to R (Feb. 16th, 9:30 -- 12:30)

Date: 

Friday, February 16, 2018, 9:30am to 12:30pm

Location: 

K018, CGIS Knafel building, concourse level

Get an introduction to R, the open-source system for statistical computation and graphics. 

With hands-on exercises, learn how to import and manage datasets, create R objects, install and load R packages, conduct basic statistical analyses, and create common graphical displays. This workshop is appropriate for those with little or no prior experience with R.

More details including workshop materials are available at http://dss.iq.harvard.edu/workshop-materials#widget-2.

This workshop is free for Harvard and MIT affiliates. Click here to sign up!

This workshop is full and registration is closed.

Regression Models in R

2017 Sep 22

Regression Models in R

9:30am to 12:00pm

Location: 

Belfer Case Study Room, CGIS South building, concourse level

This hands-on, intermediate R course will demonstrate a variety of statistical procedures using the open-source statistical software program, R. 


Introduction to R Graphics with ggplot2

Basic R Programming for Data Analysis

Introduction to Stata

2018 Feb 23

Introduction to Stata (FULL)

9:30am to 12:30pm

Location: 

K018, CGIS Knafel building, concourse level

This class will provide a hands-on introduction to Stata. You will learn how to navigate Stata’s graphical user interface, import data, calculate descriptive statistics and manage data and value labels. This workshop is designed for individuals who have little or no experience using Stata software.


Data Management in Stata

2016 Apr 01

Data Management in Stata

9:00am to 12:00pm

Location: 

Rm K018, 1737 Cambridge St (CGIS Knafel Building)

This class will introduce common data management techniques in Stata. Topics covered include basic data manipulation tasks such as recoding variables, creating new variables, working with missing data, and generating variables based on complex selection criteria. Participants will also be introduced to strategies for merging datasets (adding both variables and observations) and collapsing datasets. This workshop is intended for users who have an introductory level of knowledge of Stata software.

This workshop is free for Harvard and MIT affiliates. Click here to sign up!


Regression and Graphing in Stata

2017 Sep 08

Regression and Graphing in Stata

1:00pm to 3:30pm

Location: 

K018, CGIS Knafel building, concourse level

This hands-on class will provide a comprehensive introduction to graphics in Stata.  Topics for the class include graphing principles, descriptive graphs, linear regression, factor variables, and post-estimation graphs.  This is an introductory workshop appropriate for those with only basic familiarity with Stata.


Introduction to Python

2016 Apr 28

Introduction to Python

9:30am to 12:30pm

Location: 

Rm K018, 1737 Cambridge St (CGIS Knafel Building)

Have you always wanted to learn a programming language but weren't sure how to get started? This workshop teaches the basic grammar of the Python programming language, a powerful but easy-to-use tool for getting more out of your computer. Little to no knowledge of Python or programming is assumed.

This workshop is free for Harvard and MIT affiliates. Click here to sign up!


2017 Mar 23

Introduction to Python for Data Analysis 2

6:00pm to 8:00pm

Location: 

1737 Cambridge St. Cambridge, CGIS Knafel Building, K018

The modules for manipulating tabular data in Python are different enough that working with them sometimes feels like using a different language from basic Python. Building on the foundation from Introduction to Python for Data Analysis 1, we will explore this aspect of Python together, loading simple datasets into Python.

Prerequisites: Introduction to Python for Data Analysis 1

Workshop Preparation

In preparation for the workshop please install the Anaconda distribution of Python 3.5.  It can be found here:
https://www.continuum.io/downloads...

2017 Oct 06

Introduction to Python

9:30am to 12:00pm

Location: 

1737 Cambridge St., CGIS Knafel Building, Room K018 (Concourse Level)

This workshop introduces the basic elements of Python, a general purpose programming language commonly used for data cleaning, analysis, visualization, and other applications. Participants will learn how to use the language as well as how to set up a development environment for Python on their personal computer. This workshop is intended for social scientists who are new to programming. No experience is required.


2018 Mar 09

Introduction to Python (FULL)

9:30am to 12:00pm

Location: 

1737 Cambridge St., CGIS Knafel Building, Room K018 (Concourse Level)

 

This workshop introduces the basic elements of Python, a general purpose programming language commonly used for data cleaning, analysis, visualization, and other applications. Participants will learn how to use the language as well as how to set up a development environment for Python on their personal computer. This workshop is intended for social scientists who are new to programming. No experience is required.


Intermediate Python

2016 Apr 29

Intermediate Python

9:30am to 12:30pm

Location: 

Rm K018, 1737 Cambridge St (CGIS Knafel Building)

This course is a survey of advanced features of the Python programming language that are relevant to data analysis. This includes exposure to some of the most powerful features of Python, such as functional and object-oriented programming. In addition, we will learn how to use inspection to learn about the undocumented features of new modules and data structures.
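As a rough illustration of what inspecting an unfamiliar module can look like, the sketch below uses Python's built-in dir() and the standard inspect module. It is not taken from the workshop materials; the json module is only a convenient stand-in for "a new module", and reading "inspection" as runtime introspection is an assumption.

```python
import inspect
import json  # stand-in for an unfamiliar module; any importable module works

# List the public names the module exposes.
public_names = [name for name in dir(json) if not name.startswith("_")]

# Look up the call signature and the first docstring line of one function.
signature = inspect.signature(json.dumps)
summary = inspect.getdoc(json.dumps).splitlines()[0]

print(public_names)
print(signature)
print(summary)
```

The same three calls work on most classes and functions, which makes them a reasonable first step when documentation is thin.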

This workshop is free for Harvard and MIT affiliates. Click here to sign up!


2016 Nov 03

Visualization in Python

6:00pm to 8:00pm

Location: 

1737 Cambridge St. Cambridge, CGIS Knafel Building, K018

In this course, we explore what it takes to create beautiful visualizations in Python using the matplotlib and seaborn packages.

Prerequisites: Introduction to Python for Data Analysis 1; Introduction to Python for Data Analysis 2

 

Workshop Preparation

In preparation for the workshop please install the Anaconda distribution of Python 3.5.  It can be found here:

https://www.continuum.io/downloads

This is the only version of Python that will be supported. If you are having trouble installing this...

2017 Mar 29

Text Analysis in Python

6:00pm to 8:00pm

Location: 

1737 Cambridge St. Cambridge, CGIS Knafel Building, K018

Python is an extremely powerful tool for text analysis. We will explore the use of TextBlob, nltk and scipy for text analysis.


Prerequisites: This is an advanced Python workshop. To get the most out of the material, you should be comfortable with base Python and have some familiarity with numpy and scipy, plus some exposure to pandas.

 

Workshop Preparation

In preparation for the workshop please install the Anaconda distribution of Python 3.5.  It can be found here:
https://www.continuum.io/downloads

2017 Oct 13

Introduction to Using APIs With Python

1:00pm to 3:30pm

Location: 

K018, CGIS Knafel building, concourse level

An application programming interface (API) is a tool that allows computers to communicate and share information. For social scientists, APIs can be useful for accessing data or services from firms, organizations, or government agencies. This workshop will introduce the use of APIs to obtain data from sources such as Survey Monkey, Twitter, or Data.gov. This workshop is intended for social scientists who are new to working with APIs, but have some familiarity with Python or have attended the Introduction to Python workshop.

2018 Mar 23

Introduction to Web Scraping With Python (FULL)

9:30am to 12:00pm

Location: 

K018, CGIS Knafel building, concourse level

Web scraping is a method of extracting and restructuring information from web pages. This workshop will introduce basic techniques for web scraping using the popular Python libraries BeautifulSoup and Requests. Participants will practice accessing websites, parsing information, and storing data in a CSV file. This workshop is intended for social scientists who are new to web scraping, but have some familiarity with Python or have attended the Introduction to Python workshop.
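The parse-then-store workflow described above can be sketched as follows. This is not the workshop's code: to stay runnable without installs or network access, it swaps in the standard library's html.parser for BeautifulSoup and a hard-coded HTML string for a page downloaded with Requests. The HTML snippet, the "workshop" class name, and the WorkshopParser helper are all made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page content; a real script would download this instead.
HTML = """
<ul>
  <li class="workshop">Introduction to R</li>
  <li class="workshop">Introduction to Stata</li>
  <li class="other">Not a workshop</li>
</ul>
"""


class WorkshopParser(HTMLParser):
    """Collect the text of every <li class="workshop"> element."""

    def __init__(self):
        super().__init__()
        self.in_item = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "li" and ("class", "workshop") in attrs:
            self.in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_item = False

    def handle_data(self, data):
        if self.in_item and data.strip():
            self.titles.append(data.strip())


parser = WorkshopParser()
parser.feed(HTML)

# Store the parsed rows as CSV (an in-memory buffer here; a file works the same way).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["title"])
for title in parser.titles:
    writer.writerow([title])

csv_text = buffer.getvalue()
print(csv_text)
```

With BeautifulSoup the parsing step is typically much shorter, but the overall shape (fetch, parse, select, write rows) stays the same.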


 

My Data Science Tool Box

This post describes the tools I currently use for working with data. People often ask me to recommend specific tools, and I always hesitate, because so much boils down to personal preference. I recently added a workshop to the DSS lineup providing an overview of popular tools for working with data. The core idea is that researchers have a lot of choices available when it comes to choosing tools to implement a reproducible workflow. For example, it doesn't really...


Update: Stata v14.1 and 15 Advisory: xtreg, fe does what you expect; manual is incorrect

UPDATE: Sunday, March 4

Yesterday, I received the following message from David Drukker, the Executive Director of Econometrics at Stata:

"The xtreg-fe command in Stata produces consistent point estimates and
standard errors for all the model parameters.  There is a typo on page 27 of
https://www.stata.com/manuals/xtxtreg.pdf .  The formula for bar(bar(y))
should be the grand mean instead of the average of the panel-level means.

William Gould explicitly derived the grand mean as the term to add back in
to...
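The two quantities named in the message above, the grand mean and the average of the panel-level means, coincide when every panel has the same number of observations and can differ when the panel is unbalanced. The minimal sketch below shows the difference on made-up numbers. It only illustrates the arithmetic and is not Stata's implementation of xtreg, fe.

```python
from statistics import mean

# Hypothetical unbalanced panel: unit "a" has three observations, unit "b" has one.
panel = {"a": [1.0, 2.0, 3.0], "b": [10.0]}

# Grand mean: average over every observation, ignoring panel membership.
all_obs = [y for ys in panel.values() for y in ys]
grand_mean = mean(all_obs)

# Average of the panel-level means: each panel counts once, whatever its size.
panel_means = [mean(ys) for ys in panel.values()]
mean_of_panel_means = mean(panel_means)

print(grand_mean)           # 4.0
print(mean_of_panel_means)  # 6.0
```

Here the single observation in unit "b" carries half the weight in the average of panel means but only a quarter of the weight in the grand mean, which is why the two values differ.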
