Backtesting the WallStreetBets trading strategy in Python

May 23, 2021

I’ve spent a large part of the last year working with data from discussion on wallstreetbets, the internet’s most active day trading community, and want to show how to create and backtest a simple strategy using that data. Implementing this strategy requires a working installation of Python along with access to the Trader plan of Quiver’s API.

The Strategy

Simplicity is the goal here, as I just want to provide a framework which can be built upon as desired:

Get data on the previous week’s WallStreetBets discussion.
Identify the five most mentioned stocks.
Buy those stocks at the start of the trading week, sizing positions based on how much they were talked about in proportion to each other.
Sell positions at the end of the trading week.
Repeat
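
To make the sizing rule in step three concrete, here is a minimal sketch on made-up numbers. The helper name `size_positions` and the mention counts are hypothetical and only for illustration; the column names `Ticker` and `Count` match the ones used later in this post.

```python
import pandas as pd

def size_positions(mentions: pd.DataFrame, capital: float, top_n: int = 5) -> pd.DataFrame:
    """Pick the top_n most-mentioned tickers and split capital by mention share."""
    top = mentions.sort_values("Count", ascending=False).head(top_n).copy()
    # Each ticker's weight is its share of the mentions among the selected tickers
    top["prop"] = top["Count"] / top["Count"].sum()
    top["buy"] = capital * top["prop"]
    return top.reset_index(drop=True)

# Made-up mention counts for a single week (illustrative only)
week = pd.DataFrame({
    "Ticker": ["GME", "AMC", "TSLA", "SPY", "AAPL", "PLTR"],
    "Count": [500, 200, 150, 100, 50, 40],
})
positions = size_positions(week, capital=100_000)
print(positions)
```

With these numbers the sixth ticker is dropped, and the most-mentioned ticker receives half of the capital because it accounts for half of the mentions among the top five.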

One thing which I do not incorporate into this strategy is any information on the sentiment of wallstreetbets towards individual stocks. While the subreddit generally tends towards long positions (“stocks only go up” is a common saying), sentiment might be worth incorporating in a more sophisticated strategy.

Implementation

Getting wallstreetbets discussion data

I used the quiverquant package in Python to easily access wallstreetbets discussion data through Quiver Quantitative’s API.

import quiverquant
import pandas as pd

# Replace <token> with your personal token
quiver = quiverquant.quiver("<token>")
df = quiver.wallstreetbets(date_from="20180901")

Using the above code, I am able to get a Pandas DataFrame of approximately 348k rows with daily data on WallStreetBets discussion going back to September 2018. With the Institutional plan you can dive deeper into analyzing the sentiment of comments and take a more granular look at discussion, but for the purposes of implementing this strategy that isn’t necessary.

I will then group the data to get the number of times each ticker was mentioned each week.

dfWeek = df.groupby([pd.Grouper(key='Date', freq='W-MON'), 'Ticker'])['Count'].sum().reset_index().sort_values('Date')
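
If it isn't obvious how `freq='W-MON'` buckets the days, the toy example below may help. The dates and counts are made up; the only assumption carried over from the real data is that `Date` is a datetime column.

```python
import pandas as pd

# Four made-up daily rows: 2021-01-04 and 2021-01-11 are Mondays
daily = pd.DataFrame({
    "Date": pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-11", "2021-01-12"]),
    "Ticker": ["GME", "GME", "GME", "GME"],
    "Count": [10, 20, 30, 40],
})

# Same grouping as above: weekly bins that end on Monday, summed per ticker
weekly = (
    daily.groupby([pd.Grouper(key="Date", freq="W-MON"), "Ticker"])["Count"]
    .sum()
    .reset_index()
    .sort_values("Date")
)
print(weekly)
```

Each weekly bin is labeled with the Monday that closes it, so a Monday's mentions count toward the week carrying that same date, while the following Tuesday already belongs to the next week.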

Backtesting

Next up is the implementation of the strategy. I’m not going to go into too much depth on this block of code, because I expect that most of you will be more interested in building your own strategies rather than copying the one that I show here.

import yfinance as yf
import datetime as dt
dfLarge = dfWeek[dfWeek["Count"]>1]
dfLarge = dfLarge.sort_values("Date", ascending=True)
dates = dfLarge["Date"].unique()
#Initial capital of mock portfolio
capital = 100000
started = False
startedDFW = False
for date in dates[:-2]:
    dfW = dfLarge[dfLarge["Date"]==date]
    dfW = dfW.sort_values("Count", ascending=False).head(5)
    dfW['prop'] = dfW['Count']/dfW["Count"].sum()
    dfW['buy'] = capital*dfW['prop']
    buyDate = date+pd.Timedelta(days=6)
    dfW['buyDate'] = [buyDate]*len(dfW['buy'])
    if not startedDFW:
        dfWs = dfW
        startedDFW = True
    else:
        dfWs = pd.concat([dfWs, dfW])
    sellDate = date+pd.Timedelta(days=15)
    startedWeek = False
    print(date)
    for index, row in dfW.iterrows():
        ticker = row["Ticker"]
        print(ticker)
        try:
            ytStock = yf.download(ticker, start=str(buyDate.date()), end=str(sellDate.date()), interval="1d").reset_index()
            shares = row["buy"]/ytStock["Adj Close"].values[0]
            ytStock = ytStock.iloc[1:]
        except Exception:
            # Fall back to holding SPY when price data for the ticker is unavailable
            print("Error")
            ytStock = yf.download("SPY", start=str(buyDate.date()), end=str(sellDate.date()), interval="1d").reset_index()
            shares = row["buy"]/ytStock["Adj Close"].values[0]
            ytStock = ytStock.iloc[1:]
        ytStock["OpenAmount"] = ytStock["Open"]*shares
        ytStock["CloseAmount"] = ytStock["Adj Close"]*shares
        ytStock["Ticker"] = [ticker]*len(ytStock["OpenAmount"])
        # fillna(method=...) is deprecated in recent pandas; ffill()/bfill() are equivalent
        ytStock = ytStock.ffill()
        ytStock = ytStock.bfill()
        ytStock = ytStock.dropna()
        if not startedWeek:
            dfCombined = ytStock
            startedWeek = True
        else:
            dfCombined = pd.concat([dfCombined, ytStock])
        
        if not started:
            dfAll = ytStock
            started = True
        else:
            dfAll = pd.concat([dfAll, ytStock])
            
    capital = 0
    for ticker in dfCombined["Ticker"].unique():
        dfT = dfCombined[dfCombined["Ticker"]==ticker]
        capital+=dfT["CloseAmount"].values[-1]
    print("Week end capital: ", capital)
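
Stripped of the data downloading, the weekly bookkeeping in that loop comes down to a few lines of arithmetic. The sketch below uses a hypothetical helper, `week_end_capital`, with made-up tickers and prices; it is a simplification of the idea, not a drop-in replacement for the code above.

```python
def week_end_capital(allocations, entry_prices, exit_prices):
    """Value of the portfolio after holding each position from entry to exit."""
    total = 0.0
    for ticker, dollars in allocations.items():
        # Fractional shares bought with the dollars allocated to this ticker
        shares = dollars / entry_prices[ticker]
        total += shares * exit_prices[ticker]
    return total

# Made-up allocations and prices (illustrative only)
allocations = {"AAA": 60_000, "BBB": 40_000}
entry = {"AAA": 10.0, "BBB": 20.0}
exit_ = {"AAA": 11.0, "BBB": 19.0}

end_capital = week_end_capital(allocations, entry, exit_)
print("Week end capital:", end_capital)
```

The result then becomes the starting capital for the next week's allocations, which is how gains and losses compound over the backtest.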

Visualization & Analysis

Because I want to compare the performance of this WSB portfolio with the market, I will also get data on the performance of SPY over the same time frame.

dfDay = dfAll.groupby("Date").sum().reset_index()
dfDay["Fund"] = ["WSB"]*len(dfDay["Close"])
dfSPY = yf.download("SPY", start="2018-09-01", end="2021-02-18", interval="1d").reset_index()
dfSPY["Fund"] = ["S&P 500"] * len(dfSPY["Open"])
shares = 100000/dfSPY["Open"].values[0]
dfSPY["OpenAmount"] = dfSPY["Open"]*shares
dfSPY["CloseAmount"] = dfSPY["Close"]*shares
dfCombined = pd.concat([dfDay, dfSPY])
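
The benchmark is put on the same footing as the WSB portfolio by pretending the full starting capital was used to buy SPY on the first day. A minimal sketch of that scaling, with a hypothetical helper name and made-up prices:

```python
import pandas as pd

def to_portfolio_value(prices: pd.Series, capital: float) -> pd.Series:
    """Scale a price series to the value of `capital` invested at the first price."""
    shares = capital / prices.iloc[0]
    return prices * shares

# Made-up prices (illustrative only)
prices = pd.Series([250.0, 255.0, 245.0])
values = to_portfolio_value(prices, 100_000)
print(values.tolist())
```

Both lines on the chart then start from the same dollar amount, so the gap between them can be read directly as relative performance.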

Now I can graph out how the WSB fund did compared to the market using Plotly.

import plotly.express as px
import plotly
fig = px.line(dfCombined, x="Date", y="CloseAmount", title='WSB', color="Fund", color_discrete_sequence=["rgb(229, 81, 39)","rgb(118, 213, 232)" ])
wsbReturn = (capital-100000)/100000*100
fig.update_layout(title="<b>+"+str(round(wsbReturn, 2))+"% Return</b><br>Sep 2018 - Feb 2021", titlefont=dict(color='rgb(229, 81, 39)', size=20), plot_bgcolor='rgb(32,36,44)', paper_bgcolor='rgb(32,36,44)')
fig.update_xaxes(title_text="",color='white', showgrid=False, tickfont=dict(size=10))
fig.update_yaxes(title_text="$", color='white', showgrid=False, titlefont=dict(size=20),gridcolor="rgb(228,49,34)")
fig.update_layout(
    legend=dict(
        title=dict(text="",font=dict(color='white')),
        x=.85, y=1.15,
        font=dict(
            color='white',
            size=15
        )
    )
)
fig.update_traces(line=dict(width=3))
fig.show()

Performance of wallstreetbets strategy vs. the market
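
One small caveat about the chart title: the plotting code hard-codes a "+" in front of the return, which would read oddly if a run of the backtest ended with a loss. A possible alternative, sketched with a hypothetical helper named `format_return`, lets Python's format specification pick the sign:

```python
def format_return(final_capital: float, initial_capital: float) -> str:
    """Format the total return as a signed percentage for use in a chart title."""
    pct = (final_capital - initial_capital) / initial_capital * 100
    # The "+" flag prints an explicit sign for both gains and losses
    return f"{pct:+.2f}% Return"

print(format_return(104_000, 100_000))
print(format_return(95_000, 100_000))
```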

I can also see which stocks made up the portfolio each week.

import plotly.graph_objects as go
fig = go.Figure(px.bar(dfWs, x="buyDate", y="buy", color='Ticker',text='Ticker',color_discrete_sequence=px.colors.qualitative.Light24))
fig.update_layout(title="Portfolio by Week", titlefont=dict(color='rgb(228,49,34)'), plot_bgcolor='rgb(32,36,44)', paper_bgcolor='rgb(32,36,44)')
fig.update_xaxes(title_text="",color='white', showgrid=False, fixedrange=False)
fig.update_yaxes(title_text="$",color='white', showgrid=False,  fixedrange=False,gridwidth=1,gridcolor="rgb(109,177,174)")
fig.update_layout(
    legend=dict(
        title=dict(text="Ticker",font=dict(color="white")),
        
        font=dict(
            color='white'
        ),
        
    )
)
fig.show()

This graphic is hard to read as a static image, but I put interactive versions of the visualizations up on this dashboard, where you can zoom and hover to see the details.

Conclusion

It probably goes without saying that the past performance of this strategy is no indication of future results and that this post is not intended as financial advice.

That being said, I do think that, in the right hands, there is strong potential in using data from wallstreetbets discussion to generate alpha.