This page is generated from a Jupyter notebook and shows examples of embedding interactive charts produced using Altair and hvPlot.
First, let’s load the data for measles incidence in wide format:
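import pandas as pd

# Skip the first two lines of the file and treat "-" as a missing value
url = "https://raw.githubusercontent.com/MUSA-550-Fall-2023/week-2/main/data/measles_incidence.csv"
data = pd.read_csv(url, skiprows=2, na_values="-")
data.head()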
| | YEAR | WEEK | ALABAMA | ALASKA | ARIZONA | ARKANSAS | CALIFORNIA | COLORADO | CONNECTICUT | DELAWARE | ... | SOUTH DAKOTA | TENNESSEE | TEXAS | UTAH | VERMONT | VIRGINIA | WASHINGTON | WEST VIRGINIA | WISCONSIN | WYOMING |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1928 | 1 | 3.67 | NaN | 1.90 | 4.11 | 1.38 | 8.38 | 4.50 | 8.58 | ... | 5.69 | 22.03 | 1.18 | 0.4 | 0.28 | NaN | 14.83 | 3.36 | 1.54 | 0.91 |
| 1 | 1928 | 2 | 6.25 | NaN | 6.40 | 9.91 | 1.80 | 6.02 | 9.00 | 7.30 | ... | 6.57 | 16.96 | 0.63 | NaN | 0.56 | NaN | 17.34 | 4.19 | 0.96 | NaN |
| 2 | 1928 | 3 | 7.95 | NaN | 4.50 | 11.15 | 1.31 | 2.86 | 8.81 | 15.88 | ... | 2.04 | 24.66 | 0.62 | 0.2 | 1.12 | NaN | 15.67 | 4.19 | 4.79 | 1.36 |
| 3 | 1928 | 4 | 12.58 | NaN | 1.90 | 13.75 | 1.87 | 13.71 | 10.40 | 4.29 | ... | 2.19 | 18.86 | 0.37 | 0.2 | 6.70 | NaN | 12.77 | 4.66 | 1.64 | 3.64 |
| 4 | 1928 | 5 | 8.03 | NaN | 0.47 | 20.79 | 2.38 | 5.13 | 16.80 | 5.58 | ... | 3.94 | 20.05 | 1.57 | 0.4 | 6.70 | NaN | 18.83 | 7.37 | 2.91 | 0.91 |
5 rows × 53 columns
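To make the two read_csv options concrete, here is a minimal sketch on an in-memory file. The toy text is invented for illustration (the real file's preamble lines are not reproduced here); it only assumes the same layout of two leading lines to skip and "-" as the missing-value marker.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV: two preamble lines, then the header row.
raw = (
    "Toy preamble line 1\n"
    "Toy preamble line 2\n"
    "YEAR,WEEK,ALABAMA,ALASKA\n"
    "1928,1,3.67,-\n"
    "1928,2,6.25,-\n"
)

# skiprows=2 drops the preamble; na_values="-" turns the dashes into NaN
toy = pd.read_csv(io.StringIO(raw), skiprows=2, na_values="-")
print(toy)
```

Without na_values="-", the ALASKA column would be read as text rather than as a numeric column of missing values.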
Then, sum the weekly values into annual totals and use the pandas.melt() function to convert the result to tidy format:
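# Drop the week column and sum the weekly incidence within each year
annual = data.drop("WEEK", axis=1)
measles = annual.groupby("YEAR").sum().reset_index()

# Reshape from one column per state to one row per (YEAR, state) pair
measles = measles.melt(id_vars="YEAR", var_name="state", value_name="incidence")
measles.head()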
| | YEAR | state | incidence |
|---|---|---|---|
| 0 | 1928 | ALABAMA | 334.99 |
| 1 | 1929 | ALABAMA | 111.93 |
| 2 | 1930 | ALABAMA | 157.00 |
| 3 | 1931 | ALABAMA | 337.29 |
| 4 | 1932 | ALABAMA | 10.21 |
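The same wide-to-tidy steps can be traced on a toy frame small enough to check by hand. The values below are made up; only the column names mirror the measles data.

```python
import pandas as pd

# Hypothetical weekly data: two years, two weeks each, two states
weekly = pd.DataFrame(
    {
        "YEAR": [1928, 1928, 1929, 1929],
        "WEEK": [1, 2, 1, 2],
        "ALABAMA": [1.0, 2.0, 3.0, 4.0],
        "ALASKA": [0.5, 0.5, 1.0, 1.0],
    }
)

# Sum the weeks within each year, keeping YEAR as a regular column
annual = weekly.drop("WEEK", axis=1).groupby("YEAR").sum().reset_index()

# One row per (YEAR, state) pair instead of one column per state
tidy = annual.melt(id_vars="YEAR", var_name="state", value_name="incidence")
print(tidy)
```

Note that sum() skips missing values by default, so a state with no reported weeks in a year ends up with 0 rather than NaN; passing min_count=1 to sum() keeps such cells as NaN if that distinction matters.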
Finally, load altair:
import altair as alt
And generate our final data viz:
# use a custom color map
colormap = alt.Scale(
    domain=[0, 100, 200, 300, 1000, 3000],
    range=[
        "#F0F8FF",
        "cornflowerblue",
        "mediumseagreen",
        "#FFEE00",
        "darkorange",
        "firebrick",
    ],
    type="sqrt",
)
# Vertical line for vaccination year
threshold = pd.DataFrame([{"threshold": 1963}])
# plot YEAR vs state, colored by incidence
chart = (
    alt.Chart(measles)
    .mark_rect()
    .encode(
        x=alt.X("YEAR:O", axis=alt.Axis(title=None, ticks=False)),
        y=alt.Y("state:N", axis=alt.Axis(title=None, ticks=False)),
        color=alt.Color("incidence:Q", sort="ascending", scale=colormap, legend=None),
        tooltip=["state", "YEAR", "incidence"],
    )
    .properties(width=650, height=500)
)

# Draw the vaccination-year line on top of the heatmap
rule = alt.Chart(threshold).mark_rule(strokeWidth=4).encode(x="threshold:O")

out = chart + rule
out
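To get a feel for what type="sqrt" does to the custom color map, the sketch below locates a value between the domain breakpoints after a square-root transform. It is a rough stand-alone approximation, not Altair or Vega code: the helper name sqrt_scale_position is hypothetical, and clamping values to the domain is an assumption made here for simplicity.

```python
from bisect import bisect_right
from math import sqrt

# Breakpoints copied from the chart's color scale
DOMAIN = [0, 100, 200, 300, 1000, 3000]


def sqrt_scale_position(value, domain=DOMAIN):
    """Return (segment index, fraction within that segment) in sqrt space."""
    # Assumption: clamp out-of-range values to the ends of the domain
    value = min(max(value, domain[0]), domain[-1])
    roots = [sqrt(d) for d in domain]
    # Find the segment whose breakpoints bracket the value
    i = min(bisect_right(domain, value) - 1, len(domain) - 2)
    frac = (sqrt(value) - roots[i]) / (roots[i + 1] - roots[i])
    return i, frac


print(sqrt_scale_position(25))  # prints (0, 0.5)
```

Under these assumptions an incidence of 25 falls halfway between the first two colors, whereas a linear scale would place it only a quarter of the way, so the square-root transform gives low incidence values more visual separation.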
Now generate the same data viz with hvPlot:
import hvplot.pandas  # registers the .hvplot accessor on DataFrames
import numpy as np

# Make the heatmap with hvplot
heatmap = measles.hvplot.heatmap(
    x="YEAR",
    y="state",
    C="incidence",  # color each square by the incidence
    reduce_function=np.sum,  # sum the incidence for each state/year
    frame_height=450,
    frame_width=600,
    flip_yaxis=True,
    rot=90,
    colorbar=False,
    cmap="viridis",
    xlabel="",
    ylabel="",
)
# Some additional formatting using holoviews
# For more info: http://holoviews.org/user_guide/Customizing_Plots.html
heatmap = heatmap.redim(state="State", YEAR="Year")
heatmap = heatmap.opts(fontsize={"xticks": 0, "yticks": 6}, toolbar="above")
heatmap
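The grid that the heatmap colors, with reduce_function=np.sum combining any repeated state/year pairs, can be approximated in plain pandas. This is an analogy on made-up values, not what hvPlot runs internally.

```python
import pandas as pd

# Hypothetical tidy data with a duplicated (state, YEAR) pair
toy = pd.DataFrame(
    {
        "YEAR": [1928, 1928, 1928, 1929],
        "state": ["ALABAMA", "ALABAMA", "ALASKA", "ALABAMA"],
        "incidence": [1.0, 2.0, 5.0, 4.0],
    }
)

# One cell per state/year; duplicates are summed, missing pairs stay NaN
grid = toy.pivot_table(
    index="state", columns="YEAR", values="incidence", aggfunc="sum"
)
print(grid)
```

Because the measles frame was already summed to one row per state and year, the reduction in the heatmap call should leave its values unchanged.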