Python resources
MUSA 550 assumes some general familiarity with programming concepts, but there are no formal Python prerequisites. However, we recommend that all students use some of the free online resources for learning Python’s core concepts. Below, we list a number of resources that are freely available to students in the course.
DataCamp courses
DataCamp is providing 6 months of complimentary access to its courses for students in MUSA 550. Whether you have experience with Python or not, this is a great opportunity to learn the basics of Python and practice your skills.
It is strongly recommended that you watch some or all of the introductory videos below to build a stronger Python foundation for the semester. The intermediate and more advanced courses are also great: the more the merrier!
To gain access, use this unique invite link. You will need to create or sign in to a DataCamp account with an “upenn.edu” email address.
Introductory DataCamp courses include:
- Introduction to Python for Data Science
- Python Data Science Toolbox, Part 1
- Python Data Science Toolbox, Part 2
- Introduction to NumPy
And there are also shorter, free tutorials available on some core Python concepts:
A few courses covering more advanced topics include:
There are also courses available to help reinforce topics we will cover in detail during the semester, including:
- Data manipulation with pandas
- Joining data with pandas
- Introduction to Data Visualization with seaborn
- Introduction to Data Visualization with matplotlib
- Intermediate Data Visualization with seaborn
- Supervised Learning with scikit-learn
Check out the full list of available Python courses on DataCamp’s website.
Introductory Python tutorials
The following tutorials assume no background in Python and provide a fairly comprehensive introduction to Python and its core concepts.
- Practical Python Programming by David Beazley
- Python for Social Science (in particular, the first four chapters)
- Scientific Python Basics from the Berkeley Institute for Data Science (notebook version)
More in-depth resources
There are two books available online for free that can serve as good resources for introductory Python basics as well as more advanced data science concepts.
Python for Data Analysis
The Python for Data Analysis book by Wes McKinney (the creator of pandas) is an excellent resource that covers the pandas library in great detail. The first few chapters do a very good job of covering the foundations of Python that we will use in this course:
The Python Data Science Handbook
The Python Data Science Handbook by Jake VanderPlas is a free, online textbook covering the Python basics needed for this course. It is a bit more advanced than the resources in the previous section and assumes some familiarity with Python.
In particular, the first four chapters are excellent:
The data analysis library pandas and the visualization library matplotlib will be covered extensively in this course, but the above chapters provide additional background material on these foundational Python tools.
Note: You can click on the “Open in Colab” button for each chapter and run the examples interactively in a cloud computing environment directly in the browser (using Google Colab).
Additional Resources
- The Berkeley Institute for Data Science has compiled a number of Python resources
- The subreddit r/learnpython is a good place for Python resources — it maintains a comprehensive wiki of resources and tutorials.