StockTrendsBot: Using Python to Create a Stock Performance Bot for Reddit
Purpose of the Bot
Using Python and the PRAW library, I wanted to create a Reddit bot that crawls new posts in various subreddits and looks for mentions of company names. When it finds one, the bot shows how the company’s stock price has moved over the past week, month, and year, so that whoever is reading the post gets the bigger picture of overall company and stock performance. It’s always important to put things in context, especially in the age of sensationalist headlines.
How It Was Made: Getting Lists of Public Companies
First, I needed a list of all of the public companies across American stock exchanges (currently I’m only covering the NYSE, NASDAQ and AMEX, but this could be extended to international exchanges in the future). NASDAQ.com kindly provides these lists as downloadable CSV files here: http://www.nasdaq.com/screening/company-list.aspx. I created a Django model to hold the company info and formatted the company names to make them more colloquial. I overrode the save method to clean the company names as they were loaded into the database:
import re

from django.db import models


class Company(models.Model):
    symbol = models.CharField(max_length=10)
    name = models.CharField(max_length=255)
    ipo_year = models.CharField(max_length=10)
    sector = models.CharField(max_length=55)
    industry = models.CharField(max_length=55)
    name_has_been_formatted = models.BooleanField(default=False)

    def __str__(self):
        return self.name

    def save(self, *args, **kwargs):
        # remove punctuation from the name and strip trailing suffixes
        name_formatted = re.sub(r'[^\w\s]', '', self.name).strip()
        # 'Corp' has four characters, so suffixes are matched with endswith()
        # rather than a fixed-width slice
        suffix_list = ['Inc', 'Ltd', 'PLC', 'Corp', 'Co', 'LP']
        for suffix in suffix_list:
            if name_formatted.endswith(' ' + suffix):
                name_formatted = name_formatted[:-len(suffix)].strip()
        self.name = name_formatted
        self.name_has_been_formatted = True
        super(Company, self).save(*args, **kwargs)
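The cleanup rule in save() can also be tried outside Django before loading the CSV. The sketch below is a minimal, hypothetical helper: `clean_company_name` and the `SUFFIXES` tuple are illustrative names, not part of the bot's code, and it assumes that only one trailing suffix needs to be removed.

```python
import re

# Illustrative suffix list (an assumption); extend it to suit the data.
SUFFIXES = ('Inc', 'Ltd', 'PLC', 'Corp', 'Co', 'LP')


def clean_company_name(raw_name):
    """Strip punctuation and one trailing corporate suffix from a name."""
    name = re.sub(r'[^\w\s]', '', raw_name).strip()
    words = name.split()
    # Only drop the suffix when it is a whole trailing word and at least
    # one other word would remain.
    if len(words) > 1 and words[-1] in SUFFIXES:
        words = words[:-1]
    return ' '.join(words)


print(clean_company_name('Apple Inc.'))
print(clean_company_name('Ford Motor Co.'))
```

Comparing whole words rather than the last few characters keeps a name that merely ends in the same letters as a suffix from being truncated.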
How It Was Made: Searching Reddit for Mentions of a Company
The bot uses the PRAW library for Python to loop through various subreddits, scanning the 5 newest posts in each. Every time it posts a comment, it records the ID of the submission, so it replies to each post only once. I’m mostly using business- and investing-related subreddits to lessen the chance of errors. If you’re interested, I’d suggest checking out the full code at the link above. The bot runs indefinitely inside a “while True:” loop, with an error handler that emails me if the bot runs into problems.
from datetime import datetime

import praw
from django.core.mail import send_mail


def send_me_email(e):
    send_mail(
        'StockTrendsBot failed!', 'STB failed with an error message of ' + str(e),
        'ericlighthofmann@gmail.com', ['ericlighthofmann@gmail.com']
    )


# time of the last failure email; None until the first one is sent
emailed_datetime = None

while True:
    try:
        praw_object = praw.Reddit(
            client_id=reddit_id,
            client_secret=reddit_secret,
            user_agent=reddit_user_agent,
            password=reddit_password,
            username=reddit_username
        )
        start_stocktrendsbot(praw_object)
    except Exception as e:
        # KeyboardInterrupt is not a subclass of Exception, so Ctrl-C still
        # stops the bot without being caught here.
        # Email on the first failure, then at most once every two hours.
        now = datetime.now()
        if (emailed_datetime is None or
                (now - emailed_datetime).total_seconds() / 60 / 60 > 2):
            send_me_email(e)
            emailed_datetime = now
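The "email me, but not too often" rule is easier to test when it lives in its own small object. The following is a sketch only: `AlertThrottle` is a hypothetical name, and the two-hour interval is taken from the loop above.

```python
from datetime import datetime, timedelta


class AlertThrottle:
    """Allow an alert at most once per `min_interval`."""

    def __init__(self, min_interval=timedelta(hours=2)):
        self.min_interval = min_interval
        self.last_sent = None

    def should_send(self, now=None):
        # `now` can be injected so the rule can be tested without waiting.
        now = now or datetime.now()
        if self.last_sent is None or now - self.last_sent > self.min_interval:
            self.last_sent = now
            return True
        return False


throttle = AlertThrottle()
start = datetime(2018, 1, 1, 12, 0)
print(throttle.should_send(start))                          # first failure
print(throttle.should_send(start + timedelta(minutes=30)))  # too soon
print(throttle.should_send(start + timedelta(hours=3)))     # allowed again
```

In the retry loop, the except block would then reduce to checking `throttle.should_send()` before calling the email function.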
To find an exact match for a company, I split the title on spaces and check whether the company name is one of the resulting words. I also added error handling that waits the required amount of time when Reddit reports that the bot is posting too often.
import logging
import re
import time

for submission in praw_object.subreddit(sr).new(limit=5):
    if submission.id not in PostRepliedTo.objects.all().values_list(
        'submission_id', flat=True
    ):
        for name in Company.objects.all().values_list('name', flat=True):
            if name.lower() in submission.title.lower().replace('\'s', '').split(' '):
                current_company = Company.objects.filter(name=name).first()
                stock_info = StockInfo(current_company)
                try:
                    logging.info('Replying to : ' + str(submission.title))
                    logging.info('reddit.com' + str(submission.permalink))
                    submission.reply(stock_info.text_output)
                    PostRepliedTo.objects.get_or_create(
                        submission_id=submission.id,
                        url='reddit.com' + submission.permalink,
                    )
                except praw.exceptions.APIException as e:
                    # the error message says how long to wait; read the whole
                    # number rather than only its last digit or two
                    wait = re.search(r'(\d+) (minute|second)', str(e))
                    if wait and wait.group(2) == 'minute':
                        time_to_wait = int(wait.group(1))
                        logging.warning('Sleeping for ' + str(time_to_wait) + ' minutes.')
                        time.sleep(time_to_wait * 60 + 70)
                    elif wait:
                        time_to_wait = int(wait.group(1))
                        logging.warning('Sleeping for ' + str(time_to_wait) + ' seconds.')
                        time.sleep(time_to_wait + 10)
time.sleep(10)
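Because the title is split on spaces, a company name can only match when it is a single word; a two-word name never equals one token. One possible alternative, sketched here under the assumption that whole-phrase matching is what we want, is a word-boundary regular expression. The function name `find_mentioned_companies` is hypothetical.

```python
import re


def find_mentioned_companies(title, company_names):
    """Return the names that appear in the title as whole words or phrases."""
    # Drop possessives so "Apple's" still matches "Apple".
    normalized = title.lower().replace("'s", '')
    mentioned = []
    for name in company_names:
        # \b anchors the match at word boundaries, so "Best" does not match
        # inside "bestseller", while multi-word names can still match.
        pattern = r'\b' + re.escape(name.lower()) + r'\b'
        if re.search(pattern, normalized):
            mentioned.append(name)
    return mentioned


names = ['Apple', 'General Electric', 'Best']
print(find_mentioned_companies("Apple's earnings beat expectations", names))
print(find_mentioned_companies('General Electric cuts its dividend', names))
```

This does not solve false positives for names that are also ordinary words: a title containing "the best stocks" would still match a company called Best.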
How It Was Made: Getting the Current and Historical Prices of the Stock
Once a mention of a company is found, the bot creates an instance of the StockInfo class. StockInfo holds the current and historical price data and formats the output text in Reddit's Markdown. We're using the IEX API and its Python wrapper (found here: https://github.com/addisonlynch/iexfinance) to get the current price and the historical prices. The API call returns JSON, which we parse for the dates we're looking for.
import logging
from datetime import datetime

from dateutil.relativedelta import relativedelta
# Stock and get_historical_data are provided by the iexfinance wrapper


class StockInfo():

    def get_current_price(self, current_company):
        logging.info('getting info for ' + str(current_company.name) + ' (' + str(current_company.symbol) + ')')
        stock_object = Stock(current_company.symbol.upper())
        current_price = round(float(stock_object.get_price()), 2)
        return current_price

    def get_historical_change(self, current_company):
        one_week_ago = datetime.now() - relativedelta(weeks=1)
        one_month_ago = datetime.now() - relativedelta(months=1)
        one_year_ago = datetime.now() - relativedelta(years=1)

        def get_historical_price(time_period):
            def format_date(date_input):
                return datetime.strftime(date_input, '%Y-%m-%d')
            historical_price = {}
            # if there is no data for the date (for example, a weekend),
            # step forward one day at a time until a price comes back
            while historical_price == {}:
                historical_price = get_historical_data(current_company.symbol.upper(),
                    format_date(time_period), time_period, output_format='json'
                )
                if historical_price == {}:
                    time_period = time_period + relativedelta(days=1)
            price = historical_price[format_date(time_period)]['close']
            return price

        weekly_price = get_historical_price(one_week_ago)
        monthly_price = get_historical_price(one_month_ago)
        yearly_price = get_historical_price(one_year_ago)
        return weekly_price, monthly_price, yearly_price

    def get_change(self, current_price, historical_price):
        # percentage change, rounded to one decimal place
        change = round((current_price - historical_price) / historical_price * 100, 1)
        return change

    def get_trend_text_output(self, change, time_period):
        def get_change_marker(change):
            if change > 0.0:
                change_marker = '▲ +'
            elif change < 0.0:
                change_marker = '▼'
            else:
                change_marker = 'even at '
            return change_marker
        change_marker = get_change_marker(change)
        # the company is read from self; it is not a parameter of this method
        text_output = ('Over the past ' + time_period + ', ' +
            self.current_company.symbol + ' is ' + change_marker + str(change) + '%' + '\n\n'
        )
        return text_output

    def get_text_output(self, current_company):
        output = ('**' + current_company.name + ' (' + current_company.symbol + ')**' +
            '\n\n' + 'Current price: $' + str(self.current_price) +
            '\n\n' + self.weekly_text_output +
            self.monthly_text_output +
            self.yearly_text_output +
            '***' + '\n\n' + '^Beep ^Boop, ^I ^am ^a ^bot. ' +
            '^I ^delete ^my ^comments ^if ^they ^are ^-3 ^or ^lower. ' +
            '^Message ^[HomerG](/u/HomerG) ^with ^any ^suggestions, ^death ^threats, ^etc.' + '\n\n' +
            '^To ^see ^source ^code ^and ^how ^I ^was ^made, ^click ^[here.](http://www.hofdata.com/blog/stock-trends-bot)')
        return output

    def __init__(self, current_company):
        self.current_company = current_company
        self.current_price = self.get_current_price(current_company)
        self.weekly_price, self.monthly_price, self.yearly_price = \
            self.get_historical_change(current_company)
        self.weekly_change = self.get_change(self.current_price, self.weekly_price)
        self.monthly_change = self.get_change(self.current_price, self.monthly_price)
        self.yearly_change = self.get_change(self.current_price, self.yearly_price)
        self.weekly_text_output = self.get_trend_text_output(self.weekly_change, 'week')
        self.monthly_text_output = self.get_trend_text_output(self.monthly_change, 'month')
        self.yearly_text_output = self.get_trend_text_output(self.yearly_change, 'year')
        self.text_output = self.get_text_output(current_company)
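The price arithmetic can be checked without calling the API. The sketch below assumes the historical data has already been reduced to a dictionary mapping ISO date strings to closing prices, which is a simplification of what the wrapper returns. The function names, the `XYZ` ticker, and the prices are all made up for illustration.

```python
from datetime import date, timedelta


def first_close_on_or_after(closes, start, max_days=7):
    """Return the first available closing price on or after `start`.

    `closes` maps ISO date strings to closing prices; days without
    trading are simply missing from it.
    """
    for offset in range(max_days + 1):
        key = (start + timedelta(days=offset)).isoformat()
        if key in closes:
            return closes[key]
    raise KeyError('no closing price within %d days of %s' % (max_days, start))


def percent_change(current_price, historical_price):
    """Percentage change from the historical price, to one decimal place."""
    return round((current_price - historical_price) / historical_price * 100, 1)


def trend_line(symbol, change, period):
    if change > 0:
        marker = '▲ +'
    elif change < 0:
        marker = '▼'
    else:
        marker = 'even at '
    return 'Over the past %s, %s is %s%s%%' % (period, symbol, marker, change)


# Made-up closing prices for a hypothetical ticker.
closes = {'2018-06-01': 100.0, '2018-06-04': 102.0}
# 2018-06-02 was a Saturday, so the lookup moves forward to Monday's close.
week_ago_close = first_close_on_or_after(closes, date(2018, 6, 2))
print(trend_line('XYZ', percent_change(107.1, week_ago_close), 'week'))
```

Capping the forward search with `max_days` means a delisted or mistyped symbol raises an error instead of looping forever.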
How It Was Made: Posting the Formatted Comment and Self-Moderating
Once the current and historical prices have been found and the output text has been formatted according to Reddit's Markdown, we have the bot leave a comment on the relevant submission and add that submission ID to our database so that we don't post multiple times on the same submission. Lastly, after the bot has looped through all of our subreddits, it goes through all of its comments, looks for those with a score of -3 or lower, and deletes them. Often, the bot misidentifies whether a company is actually being mentioned. For example, there's a public company called Best - you can imagine this word comes up as a false positive quite a bit. Deleting comments when they have a low score helps the bot moderate itself.
# checking for downvoted comments and deleting at <= -3
comments = praw_object.user.me().comments.new(limit=None)
for comment in comments:
    if comment.score <= -3:
        logging.info('Deleting a comment at ' + str(comment.permalink))
        comment.delete()
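The deletion rule can be separated from the Reddit calls so it can be checked offline. This is a sketch with a stand-in comment type: `FakeComment`, `DELETE_THRESHOLD`, and `comments_to_delete` are hypothetical names, and only the two attributes used above are modeled.

```python
from collections import namedtuple

# Stand-in for a PRAW comment; only the fields used here are modeled.
FakeComment = namedtuple('FakeComment', ['permalink', 'score'])

# Matches the "-3 or lower" rule stated in the bot's comment footer.
DELETE_THRESHOLD = -3


def comments_to_delete(comments, threshold=DELETE_THRESHOLD):
    """Return the comments whose score is at or below the threshold."""
    return [comment for comment in comments if comment.score <= threshold]


sample = [
    FakeComment('/r/example/1', 5),
    FakeComment('/r/example/2', -3),
    FakeComment('/r/example/3', -2),
]
for comment in comments_to_delete(sample):
    print('Would delete ' + comment.permalink)
```

With real PRAW comments, the same function could be applied to the listing before calling delete() on each result.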
Conclusion
The bot is currently running on 10 different subreddits and I’ve only received one death threat so far (it was a joke death threat, don’t worry!). In the future, it would be great to have it be able to run in the main subreddit (r/all) and have it scan the language in articles or text posts to determine whether the poster was indeed talking about the company. Some sort of topic analysis using natural language processing will be the next step.
Please let me know if you have any feedback or suggestions for improvement, and thanks for reading!