Backtesting a WallStreetBets trading strategy in Python
I’ve spent a large part of the last year working with data from discussion on wallstreetbets, the internet’s most active day trading community, and I want to show how to create and backtest a simple strategy using that data. Implementing this strategy requires a working installation of Python along with access to the Trader plan of Quiver’s API.
The Strategy
Simplicity is the goal here, as I just want to provide a framework which can be built upon as desired:
- Get data on the previous week’s WallStreetBets discussion.
- Identify the five most mentioned stocks.
- Buy those stocks at the start of the trading week, sizing positions based on how much they were talked about in proportion to each other.
- Sell positions at the end of the trading week.
- Repeat
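The selection and sizing steps above can be illustrated on made-up numbers. This is only a minimal sketch: the tickers and mention counts are invented, and `size_positions` is a hypothetical helper name used here for illustration rather than anything from Quiver's package.

```python
import pandas as pd

def size_positions(mentions: pd.DataFrame, capital: float, top_n: int = 5) -> pd.DataFrame:
    """Pick the top_n most mentioned tickers and split capital by mention share."""
    top = mentions.sort_values("Count", ascending=False).head(top_n).copy()
    top["prop"] = top["Count"] / top["Count"].sum()  # share of top-n mentions
    top["buy"] = capital * top["prop"]               # dollars allocated to each ticker
    return top

# Invented mention counts for a single week (illustration only)
week = pd.DataFrame({
    "Ticker": ["GME", "TSLA", "AMC", "AAPL", "PLTR", "SPY"],
    "Count": [500, 200, 150, 100, 50, 10],
})

positions = size_positions(week, capital=100000)
print(positions[["Ticker", "Count", "prop", "buy"]])
```

In this toy week GME accounts for 500 of the 1,000 mentions among the top five, so it receives half of the capital, and SPY (sixth by mentions) is left out.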
One thing I do not incorporate into this strategy is any information on the sentiment of wallstreetbets towards individual stocks. The subreddit generally tends towards long positions (“stocks only go up” is a common saying), but sentiment might be worth adding to a more sophisticated strategy.
Implementation
Getting wallstreetbets discussion data
I used the quiverquant package in Python to easily access wallstreetbets discussion data through Quiver Quantitative’s API.
import quiverquant
import pandas as pd

# Replace <token> with your personal token
quiver = quiverquant.quiver("<token>")
df = quiver.wallstreetbets(date_from="20180901")
Using the above code, I am able to get a Pandas DataFrame of approximately 348k rows with daily data on WallStreetBets discussion going back to September 2018. With the Institutional plan you can dive deeper into analyzing the sentiment of comments and take a more granular look at discussion, but for the purposes of implementing this strategy that isn’t necessary.
I will then group the data to get the number of times each ticker was mentioned each week.
dfWeek = df.groupby([pd.Grouper(key='Date', freq='W-MON'), 'Ticker'])['Count'].sum().reset_index().sort_values('Date')
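To see what the weekly grouping does, here is a small sketch on invented data. It assumes the `Date` column holds datetime values (the toy frame converts it with `pd.to_datetime`); the counts are made up. With `freq='W-MON'`, pandas builds weekly bins that end on a Monday and labels each bin with that Monday.

```python
import pandas as pd

# Invented daily mention counts (illustration only)
toy = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2021-01-05", "2021-01-08", "2021-01-11", "2021-01-12", "2021-01-05"]
    ),
    "Ticker": ["GME", "GME", "GME", "GME", "TSLA"],
    "Count": [10, 5, 3, 7, 4],
})

# Same grouping as above: weekly bins ending on Monday, per ticker
weekly = (
    toy.groupby([pd.Grouper(key="Date", freq="W-MON"), "Ticker"])["Count"]
    .sum()
    .reset_index()
    .sort_values("Date")
)
print(weekly)
```

The GME mentions from Tuesday 5 January through Monday 11 January all land in the week labelled 2021-01-11, while the mention on Tuesday 12 January starts the week labelled 2021-01-18.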
Backtesting
Next up is the implementation of the strategy. I’m not going to go into too much depth on this block of code, because I expect that most of you will be more interested in building your own strategies rather than copying the one that I show here.
import yfinance as yf
import datetime as dt

# Note: "Adj Close" is only returned when yfinance does not auto-adjust prices.
# On recent yfinance releases you may need to pass auto_adjust=False to yf.download.

# Keep ticker-weeks with more than one mention, in chronological order
dfLarge = dfWeek[dfWeek["Count"]>1]
dfLarge = dfLarge.sort_values("Date", ascending=True)
dates = dfLarge["Date"].unique()

#Initial capital of mock portfolio
capital = 100000
started = False
startedDFW = False

for date in dates[:-2]:
    # Top five tickers for the week, weighted by their share of mentions
    dfW = dfLarge[dfLarge["Date"]==date]
    dfW = dfW.sort_values("Count", ascending=False).head(5).copy()
    dfW['prop'] = dfW['Count']/dfW["Count"].sum()
    dfW['buy'] = capital*dfW['prop']
    buyDate = date+pd.Timedelta(days=6)
    dfW['buyDate'] = [buyDate]*len(dfW['buy'])

    # Keep a running record of every week's allocations
    if not startedDFW:
        dfWs = dfW
        startedDFW = True
    else:
        dfWs = pd.concat([dfWs, dfW])

    sellDate = date+pd.Timedelta(days=15)
    startedWeek = False
    print(date)

    for index, row in dfW.iterrows():
        ticker = row["Ticker"]
        print(ticker)

        try:
            ytStock = yf.download(ticker, start=str(buyDate.date()), end=str(sellDate.date()), interval="1d").reset_index()
            # Buy at the first available adjusted close, then drop that row
            shares = row["buy"]/ytStock["Adj Close"].values[0]
            ytStock = ytStock.iloc[1:]
        except Exception:
            # If price data is unavailable, hold SPY for this slot instead
            print("Error")
            ytStock = yf.download("SPY", start=str(buyDate.date()), end=str(sellDate.date()), interval="1d").reset_index()
            shares = row["buy"]/ytStock["Adj Close"].values[0]
            ytStock = ytStock.iloc[1:]

        # Daily dollar value of the position
        ytStock["OpenAmount"] = ytStock["Open"]*shares
        ytStock["CloseAmount"] = ytStock["Adj Close"]*shares
        ytStock["Ticker"] = [ticker]*len(ytStock["OpenAmount"])
        ytStock = ytStock.ffill()
        ytStock = ytStock.bfill()
        ytStock = ytStock.dropna()

        # Positions held this week
        if not startedWeek:
            dfCombined = ytStock
            startedWeek = True
        else:
            dfCombined = pd.concat([dfCombined, ytStock])

        # Positions held across the whole backtest
        if not started:
            dfAll = ytStock
            started = True
        else:
            dfAll = pd.concat([dfAll, ytStock])

    # Roll the week's final position values into next week's capital
    capital = 0
    for ticker in dfCombined["Ticker"].unique():
        dfT = dfCombined[dfCombined["Ticker"]==ticker]
        capital+=dfT["CloseAmount"].values[-1]

    print("Week end capital: ", capital)
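The date arithmetic in the loop is easier to follow with a concrete week. The sketch below is my reading of the offsets, not a statement of intent: it assumes the week label is a Monday (as produced by `freq='W-MON'`), that the `end` date passed to `yf.download` is exclusive, and it ignores market holidays. The example date is arbitrary.

```python
import pandas as pd

week_label = pd.Timestamp("2021-03-01")         # a Monday, as produced by freq="W-MON"
buy_date = week_label + pd.Timedelta(days=6)    # start of the price download window
sell_date = week_label + pd.Timedelta(days=15)  # end of the window (treated as exclusive)

# Business days covered by the window; market holidays are ignored in this sketch
window = pd.bdate_range(buy_date, sell_date - pd.Timedelta(days=1))

print(buy_date.day_name(), sell_date.day_name())
print([d.strftime("%a %Y-%m-%d") for d in window])
```

Under these assumptions the window spans six trading days, from the Monday after the discussion week through the Monday after that. The first row sets the purchase price and is then dropped, so each position is effectively bought at one Monday's adjusted close and last valued at the next Monday's adjusted close.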
Visualization & Analysis
Because I want to compare the performance of this WSB portfolio with the market, I will also get data on the performance of SPY over the same time frame.
dfDay = dfAll.groupby("Date").sum().reset_index()
dfDay["Fund"] = ["WSB"]*len(dfDay["Close"])
dfSPY = yf.download("SPY", start="2018-09-01", end="2021-02-18", interval="1d").reset_index()
dfSPY["Fund"] = ["S&P 500"] * len(dfSPY["Open"])
shares = 100000/dfSPY["Open"].values[0]
dfSPY["OpenAmount"] = dfSPY["Open"]*shares
dfSPY["CloseAmount"] = dfSPY["Close"]*shares
dfCombined = pd.concat([dfDay, dfSPY])
Now I can graph out how the WSB fund did compared to the market using Plotly.
import plotly.express as px
import plotly

fig = px.line(dfCombined, x="Date", y="CloseAmount", title='WSB', color="Fund", color_discrete_sequence=["rgb(229, 81, 39)","rgb(118, 213, 232)" ])
wsbReturn = (capital-100000)/100000*100

# title_font replaces the deprecated titlefont attribute
fig.update_layout(title="<b>+"+str(round(wsbReturn, 2))+"% Return</b><br>Sep 2018 - Feb 2021", title_font=dict(color='rgb(229, 81, 39)', size=20), plot_bgcolor='rgb(32,36,44)', paper_bgcolor='rgb(32,36,44)')
fig.update_xaxes(title_text="",color='white', showgrid=False, tickfont=dict(size=10))
fig.update_yaxes(title_text="$", color='white', showgrid=False, title_font=dict(size=20),gridcolor="rgb(228,49,34)")
fig.update_layout(
    legend=dict(
        title=dict(text="",font=dict(color='white')),
        x=.85, y=1.15,
        font=dict(
            color='white',
            size=15
        )
    )
)
fig.update_traces(line=dict(width=3))
fig.show()
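Alongside the chart, a couple of summary numbers can make the comparison with SPY more concrete. The sketch below is an optional addition, not part of the original analysis: `total_return_pct` and `max_drawdown_pct` are hypothetical helper names, and the equity values are invented. In practice you could pass in the `CloseAmount` column of either fund.

```python
import pandas as pd

def total_return_pct(values: pd.Series) -> float:
    """Percentage change from the first to the last portfolio value."""
    return (values.iloc[-1] / values.iloc[0] - 1) * 100

def max_drawdown_pct(values: pd.Series) -> float:
    """Largest peak-to-trough decline, as a (negative) percentage."""
    running_peak = values.cummax()
    drawdown = values / running_peak - 1
    return drawdown.min() * 100

# Invented portfolio values (illustration only)
equity = pd.Series([100000, 110000, 99000, 120000, 108000], dtype=float)

print(round(total_return_pct(equity), 2))
print(round(max_drawdown_pct(equity), 2))
```

Note that this return is measured from the first value in the series, whereas `wsbReturn` above is measured from the initial 100000 of capital, so the two can differ slightly on real data.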
I can also see what the portfolio was composed of each week.
import plotly.graph_objects as go

fig = go.Figure(px.bar(dfWs, x="buyDate", y="buy", color='Ticker',text='Ticker',color_discrete_sequence=px.colors.qualitative.Light24))

# title_font replaces the deprecated titlefont attribute
fig.update_layout(title="Portfolio by Week", title_font=dict(color='rgb(228,49,34)'), plot_bgcolor='rgb(32,36,44)', paper_bgcolor='rgb(32,36,44)')
fig.update_xaxes(title_text="",color='white', showgrid=False, fixedrange=False)
fig.update_yaxes(title_text="$",color='white', showgrid=False, fixedrange=False,gridwidth=1,gridcolor="rgb(109,177,174)")
fig.update_layout(
    legend=dict(
        title=dict(text="Ticker",font=dict(color="white")),
        font=dict(
            color='white'
        ),
    )
)
fig.show()
This graphic is hard to read as a static image, but I put interactive versions of the visualizations up on this dashboard, where you can zoom and hover to see the details.
Conclusion
It probably goes without saying that the past performance of this strategy is no indication of future results and that this post is not intended as financial advice.
That being said, I do think that, in the right hands, there is strong potential in using data from wallstreetbets discussion to generate alpha.