
Investing with Python: Monitoring the SEC’s Edgar Database to Create an Event Driven Investment Strategy

Dec. 23, 2018


Introduction

It has always been a goal of mine to use Python to collect and parse large amounts of data and build an investment strategy on the resulting indicators. I originally planned to use Quantopian, a browser-based Python framework for automated investing, but my strategy didn’t quite fit what Quantopian was able to do. As a substitute for full automation, my script monitors the SEC’s Edgar database and texts and emails me whenever a stock fits my criteria to buy or sell. I then use Robinhood, a commission-free brokerage, to manually buy and sell those stocks.

Investing Strategy

There is a common argument in investing that when someone high up in a company buys a large amount of their own company’s stock, it is wise to follow suit. The reasoning is that these company leaders may be privy to information about future products or sales, may believe their stock is undervalued, or may simply be confident about their company’s prospects. Essentially, the idea is that they must know something that we, everyday investors, don’t. I wanted to develop a strategy around these insider buys and see whether that theory held water. Whenever an “insider” (CEO, COO, director, etc.) in a public company buys, sells, or is granted shares of that company’s stock, they are required to submit a “Form 4” declaring the transaction to the Securities and Exchange Commission (SEC). These Form 4s are stored digitally in the SEC’s Edgar database. The Edgar database also has an accompanying RSS (Rich Site Summary) feed which updates in near real time. We can use Python to continuously read and parse this feed for news about newly submitted Form 4s.

First, we need to create a list or “screen” of stocks that we want to be looking at. I screened mine based on a few different criteria: market cap, price to book ratio, etc. A robust, free stock screener can be found at Zacks Investment Research. Second, if a Form 4 comes up in the RSS feed from a company in our stock screen, we need to navigate to that Form 4 and parse its XML. Third, we need the script to notify us of a stock to buy if it meets our criteria. I’ve chosen to receive both a text and an email for this using Twilio and Gmail. Fourth, we need to have Python tell us when to sell our stocks by maintaining a portfolio and constantly checking prices through Yahoo Finance.

Explanation of Code: Monitoring the SEC’s RSS Feed

#-------------------------------------------------------------------------------------#
#SCRIPT BODY
#Stock screen: Market Cap < $500m, D/E < 1.5, P/E < 1.5
    
url = 'http://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&type=&company=&dateb=&owner=only&start=0&count=100&output=atom'
print ('monitoring feed...')
run_counter = 0
def job():
    global run_counter
    time.sleep(5)
    run_counter += 1
    if run_counter % 100 == 0:
        print ('Completed ' + str(run_counter) + ' passes.')
        print ('--------------')
    edgar_feed(url)
    check_price()

while True:
    job()

Here is the initial part of the script that makes sure the two functions, edgar_feed and check_price are constantly running. The URL that is listed is the link to the SEC’s Edgar RSS feed. I have a simple run counter just to keep track and make sure the script is running.
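A variation on this loop, sketched below, wraps each task in its own try/except so that one failed request does not stop the monitor. The names run_passes, max_passes and report_every are hypothetical and not part of the original script. The max_passes argument exists only so the loop can be stopped for testing.

```python
import time

def run_passes(tasks, delay_seconds=5.0, max_passes=None, report_every=100):
    """Call every task in order, sleep, and repeat.

    max_passes=None loops forever, like the original `while True` loop;
    an integer makes the loop stop after that many passes.
    """
    completed = 0
    while max_passes is None or completed < max_passes:
        for task in tasks:
            try:
                task()
            except Exception as exc:
                # Keep monitoring even if one task fails on this pass.
                name = getattr(task, '__name__', repr(task))
                print('Task ' + name + ' failed: ' + str(exc))
        completed += 1
        if completed % report_every == 0:
            print('Completed ' + str(completed) + ' passes.')
        time.sleep(delay_seconds)
    return completed
```

With the functions from this post, the call would look like run_passes([lambda: edgar_feed(url), check_price]).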

def edgar_feed(url):
    try:
        d = feedparser.parse(url)
        time.sleep(2)
        #d = feedparser.parse(r'feed_example.txt')
        lower = [x.lower() for x in CompanyNameList]
        lower = [x.replace('.', '') for x in lower]
        lower = [x.replace(',', '') for x in lower]
        for entry in range(len(d.entries)):  # range(0,99) skipped the last of the 100 entries
            company_name = d.entries[entry].title.lower()
            company_name = company_name.split('- ')
            company_name = company_name[1].split(' (')
            company_name = company_name[0]
            company_name = company_name.replace('.', '')
            company_name = company_name.replace(',', '')
            if '&amp;' in company_name:
                company_name = company_name.replace('&amp;', '&')
            if company_name in lower and d.entries[entry].title[0:1:] == '4':
                link = d.entries[entry].link
                last50 = slice(-50, None)
                if link[last50] not in stocks_sent:
                    scrape_xml(link)
                    stocks_sent.append(link[last50])
            else:
                pass
    except Exception as e:
        print (e)
        print (datetime.datetime.today())
        pass

I initially load in the companies in my stock screen using Excel (openpyxl in Python) and append each company to CompanyNameList. Using the feedparser module, we call feedparser.parse on the URL of our RSS feed. Since I have the view set to the latest 100 Forms filed with the SEC, I have it loop over all 100 entries, then format the company names so that they match with the format of the companies in my stock screen. If there is a match in company names and the form is a Form 4, I get the link of that entry and send it to another function, scrape_xml.
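The name matching described above can be isolated into two small helpers, which makes it easier to test. This is a minimal sketch, assuming feed titles have the shape "FORM - COMPANY (CIK) (Role)" that the split calls in edgar_feed imply. The helper names and the sample companies are made up for illustration.

```python
import html

def normalize_name(name):
    """Lowercase a company name, decode HTML entities, drop periods and commas."""
    name = html.unescape(name).lower()
    return name.replace('.', '').replace(',', '').strip()

def parse_title(title):
    """Split a title like '4 - Acme Corp. (0001234567) (Issuer)'
    into (form_type, normalized_company_name)."""
    form_type, _, rest = title.partition(' - ')
    company = rest.split(' (')[0]
    return form_type.strip(), normalize_name(company)

# Hypothetical stock screen and feed title.
screen = {normalize_name(n) for n in ['Acme Corp.', 'Smith & Jones, Inc.']}
form_type, company = parse_title('4 - Smith &amp; Jones, Inc. (0001234567) (Issuer)')
is_match = form_type == '4' and company in screen
```

Comparing the whole form type to '4' also skips amended filings such as '4/A', which the first-character check in edgar_feed would let through. Whether amendments should count is a strategy decision.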

Explanation of Code: Parse the XML of Form 4s

#----------------------------------------------------------------------------------#
#SCAN EDGAR AND SCRAPE XML

def scrape_xml(link):
    TotalValue = 0
    transactionCodeList = []
    DorIList = []
    TitleList = []
    today = datetime.datetime.today()
    today = today.strftime('%m/%d/%Y %I:%M %p')

    headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36",
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "accept-charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
    "accept-encoding": "gzip, deflate, sdch",
    "accept-language": "en-US,en;q=0.8",
    }
    res = requests.get(link, headers=headers)
    time.sleep(2)
    soup = bs4.BeautifulSoup(res.text, 'html.parser')
    time.sleep(2)
    #try:
    for a in soup.find_all('a'):
        if a.getText()[-4:] == '.xml':
            address = 'http://www.sec.gov' + a['href']
            print ('Scraping XML on ' + str(today) + ' at ' + str(link))
            res = requests.get(address, headers=headers, timeout=None)
            tree = ET.fromstring(res.content)
            if tree.find('reportingOwner/reportingOwnerRelationship/isOfficer') is None:
                isOfficer = ''
            elif tree.find('reportingOwner/reportingOwnerRelationship/isOfficer').text == 'true':
                isOfficer = '1'
            elif tree.find('reportingOwner/reportingOwnerRelationship/isOfficer').text == 'false':
                isOfficer = '0'
            else:
                isOfficer = tree.find('reportingOwner/reportingOwnerRelationship/isOfficer')
                isOfficer = isOfficer.text
            # findall returns a (possibly empty) list, never None, so no None checks are needed
            transactionCode = tree.findall('nonDerivativeTable/nonDerivativeTransaction/transactionCoding/transactionCode')
            tradingSymbol = tree.find('issuer/issuerTradingSymbol')
            transactionShares = tree.findall('nonDerivativeTable/nonDerivativeTransaction/transactionAmounts/transactionShares/value')
            transactionPricePerShare = tree.findall('nonDerivativeTable/nonDerivativeTransaction/transactionAmounts/transactionPricePerShare/value')
            DorI = tree.findall('nonDerivativeTable/nonDerivativeTransaction/ownershipNature/directOrIndirectOwnership/value')
            for price, shares, direct, code in zip(transactionPricePerShare, transactionShares, DorI, transactionCode):
                if direct.text == 'D' and code.text == 'P':
                    TotalValue = TotalValue + float(shares.text)*float(price.text)
                else:
                    pass
            for code in transactionCode:
                transactionCodeList.append(code.text)
            for item in DorI:
                DorIList.append(item.text)
            print ('Officer is: ' + str(isOfficer))
            print('Transaction codes are: ' + str(transactionCodeList))
            print ('TotalValue is: ' + str(TotalValue))
            print ('Direct or Indirect list is ' + str(DorIList))
            print ('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
            if isOfficer == str(1) and 'P' in transactionCodeList and TotalValue > 10000 and 'D' in DorIList and tradingSymbol.text not in portfolio:
                print ('Officer is: ' + str(isOfficer))
                print('Transaction codes are: ' + str(transactionCodeList))
                print ('TotalValue is: ' + str(TotalValue))
                print ('Direct or Indirect list is ' + str(DorIList))
                print ('Stock found.')
                print (today)
                print (tradingSymbol.text)
                with open('portfolio.txt', 'a') as f:
                    f.write(tradingSymbol.text + '\n')
                ticker = tradingSymbol.text.lower()
                res = requests.get('http://finance.yahoo.com/q?s=' + ticker)
                soup = bs4.BeautifulSoup(res.text, 'html.parser')
                elems = soup.select('#yfs_l84_'+str(ticker))
                current_price = elems[0].getText()
                with open('bought_price.txt', 'a') as f:
                    f.write(current_price + '\n')
                email (tradingSymbol.text, link)
                text_phone (tradingSymbol.text)
                text_scott (tradingSymbol.text)
                text_carl (tradingSymbol.text)
                print ('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')

We begin by passing the link from the RSS feed to the function. That link is a summary page and, as such, has many hyperlinks on it, but there’s only one relevant XML link that we want to find and parse. The code looks for the hyperlink whose text ends in “.xml” and prepends the SEC domain to its address, which points to the Form 4 XML. Next, we parse that XML using ElementTree and look for the criteria that would tell us that an insider buy is valid for our strategy. My personal strategy was to check that the buyer was an officer (CEO, COO, etc.), that the buy was large enough (greater than $10,000), and that it was a direct acquisition through the market (transaction code “P” with ownership type “D”). This weeds out events like the sale of stock, the granting of stock through options, etc. If an insider buy does match our criteria, we use Python to send out an email alert and a text alert, telling us that there is a stock to buy. That company is also added to a “Portfolio” text file so that we can keep track of our owned stocks, and the current price of the stock (found by querying Yahoo Finance) is saved as the purchase price.
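The filtering criteria can also be checked one transaction at a time instead of zipping four parallel lists. That way the fields stay aligned even if one transaction lacks a price. The sketch below uses the same element paths as scrape_xml. The sample document is a heavily trimmed, made-up Form 4, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up, trimmed Form 4: one open-market purchase (P) and one grant (A).
SAMPLE_FORM4 = """<ownershipDocument>
  <issuer><issuerTradingSymbol>ACME</issuerTradingSymbol></issuer>
  <reportingOwner>
    <reportingOwnerRelationship><isOfficer>1</isOfficer></reportingOwnerRelationship>
  </reportingOwner>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <transactionCoding><transactionCode>P</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>1000</value></transactionShares>
        <transactionPricePerShare><value>12.50</value></transactionPricePerShare>
      </transactionAmounts>
      <ownershipNature>
        <directOrIndirectOwnership><value>D</value></directOrIndirectOwnership>
      </ownershipNature>
    </nonDerivativeTransaction>
    <nonDerivativeTransaction>
      <transactionCoding><transactionCode>A</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>500</value></transactionShares>
        <transactionPricePerShare><value>0</value></transactionPricePerShare>
      </transactionAmounts>
      <ownershipNature>
        <directOrIndirectOwnership><value>D</value></directOrIndirectOwnership>
      </ownershipNature>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
</ownershipDocument>"""

def is_officer(root):
    """True if the isOfficer flag is '1' or 'true' (both appear in scrape_xml)."""
    flag = root.findtext('reportingOwner/reportingOwnerRelationship/isOfficer', default='')
    return flag.strip().lower() in ('1', 'true')

def direct_purchase_value(root):
    """Sum shares * price over direct (D) purchase (P) transactions."""
    total = 0.0
    for txn in root.findall('nonDerivativeTable/nonDerivativeTransaction'):
        code = txn.findtext('transactionCoding/transactionCode')
        direct = txn.findtext('ownershipNature/directOrIndirectOwnership/value')
        shares = txn.findtext('transactionAmounts/transactionShares/value')
        price = txn.findtext('transactionAmounts/transactionPricePerShare/value')
        if code == 'P' and direct == 'D' and shares and price:
            total += float(shares) * float(price)
    return total

root = ET.fromstring(SAMPLE_FORM4)
symbol = root.findtext('issuer/issuerTradingSymbol')
should_alert = is_officer(root) and direct_purchase_value(root) > 10000
```

Here the purchase is worth 1000 * 12.50 = 12500, the grant is ignored, and should_alert is True.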

Explanation of Code: Automatic Emailing and Texting

#----------------------------------------------------------------------------------#
#COMMUNICATION FUNCTIONS

def email(tradingSymbol, link):
    today = datetime.datetime.today()
    today = today.strftime('%m/%d/%Y %I:%M %p')
    smtpObj = smtplib.SMTP('smtp.gmail.com', 587)
    smtpObj.ehlo()
    smtpObj.starttls()
    smtpObj.login('******************@gmail.com', '***************')
    print(smtpObj.sendmail('******************@gmail.com',\
                     '******************@gmail.com',\
                     'Subject: ' + str(today) + ' | Stock order: ' + str(tradingSymbol) + '.\nBuy this stock. Here is the address: ' + str(link) + '\n'))
    smtpObj.quit()

def text_phone(tradingSymbol):
    accountSID = '***************'
    authToken = '***************'
    twilioCli = TwilioRestClient(accountSID, authToken)
    myTwilioNumber = '***************'
    myCellPhone = '+***************'
    message = twilioCli.messages.create(body='Yo, buy this stock: ' + str(tradingSymbol), from_=myTwilioNumber, to=myCellPhone)

To email me an alert, I used the built-in smtplib module in Python. The code is pretty straightforward: just replace the asterisks with your email address and password.

To text me an alert, I used the Twilio module. Twilio offers a free trial account that lets you register a number and send texts from it. You need to register on the Twilio website and create an app/phone number to receive your accountSID and authToken.
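One way to keep the email credentials out of the script is to read them from environment variables and build the message with the standard library’s EmailMessage class. This is only a sketch. The variable names ALERT_GMAIL_USER and ALERT_GMAIL_PASSWORD and the function names are assumptions, not something the original script uses.

```python
import datetime
import os
import smtplib
from email.message import EmailMessage

def build_alert(symbol, link, sender, recipient, now=None):
    """Build the buy alert as an EmailMessage without sending it."""
    now = now or datetime.datetime.today()
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = recipient
    msg['Subject'] = now.strftime('%m/%d/%Y %I:%M %p') + ' | Stock order: ' + symbol
    msg.set_content('Buy this stock. Filing: ' + link + '\n')
    return msg

def send_alert(msg):
    """Send the message through Gmail using credentials from the environment."""
    user = os.environ['ALERT_GMAIL_USER']          # hypothetical variable name
    password = os.environ['ALERT_GMAIL_PASSWORD']  # hypothetical variable name
    with smtplib.SMTP('smtp.gmail.com', 587) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)
```

Separating build_alert from send_alert means the message text can be checked without a network connection.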

Explanation of Code: Maintaining a Portfolio and Checking Prices

def check_price():
    for stock, price in zip(portfolio, bought_price):
        try:
            ticker = stock.lower()
            res = requests.get('http://finance.yahoo.com/q?s=' + ticker)
            soup = bs4.BeautifulSoup(res.text, 'html.parser')
            elems = soup.select('#yfs_l84_'+str(ticker))
            current_price = elems[0].getText()
            ticker = stock.upper()
            if float(current_price) > 1.02*float(price) and stock not in checked:
            #Email
                today = datetime.datetime.today()
                today = today.strftime('%m/%d/%Y %I:%M %p')
                smtpObj = smtplib.SMTP('smtp.gmail.com', 587)
                smtpObj.ehlo()
                smtpObj.starttls()
                smtpObj.login('******************@gmail.com', '***************')
                print(smtpObj.sendmail('******************@gmail.com',\
                                 '******************@gmail.com',\
                                 'Subject: ' + str(today) + ' | Stock to sell after 2% gains: ' + str(ticker) + '.\nSell this stock' + '\n'))
                smtpObj.quit()
            #Text me
                accountSID = '***************'
                authToken = '***************'
                twilioCli = TwilioRestClient(accountSID, authToken)
                myTwilioNumber = '***************'
                myCellPhone = '+***************'

If the current price is more than 2% above my purchase price, Python sends me an email and a text and removes the stock from the “Portfolio” text file. This stock is now out of my portfolio until I purchase it again. I repeat the same behavior for a 5% loss. Of course, you can tweak these numbers to whatever fits your strategy. You can get fancier, if you want, by comparing trends in small cap stocks (Russell 2000) or the market overall (S&P 500). This would allow you to hold onto a stock if the market was trending upwards and vice versa.
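The two thresholds can be expressed as one small function, which makes them easy to tweak. This is a minimal sketch. The name sell_signal and its return values are made up, and the defaults mirror the 2% gain and 5% loss described above.

```python
def sell_signal(bought_price, current_price, gain=0.02, loss=0.05):
    """Return 'take_profit', 'stop_loss', or None for a single holding."""
    if current_price > bought_price * (1 + gain):
        return 'take_profit'
    if current_price < bought_price * (1 - loss):
        return 'stop_loss'
    return None

# Hypothetical holding bought at $10.00.
signal_up = sell_signal(10.0, 10.30)    # above the 2% gain threshold
signal_down = sell_signal(10.0, 9.40)   # below the 5% loss threshold
signal_hold = sell_signal(10.0, 10.10)  # inside both thresholds
```

Inside check_price, a non-None result would trigger the email and text.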

Conclusion

While this script is not fully automated as I had originally hoped, it does come close. I have to manually do the actual purchasing and selling of the stock, but Python does automatically alert me to which stocks to buy and sell. Of course, you can use this script and tailor your strategy to different factors or look at a different universe of stocks. Feel free to give me any tips related to my code, investment strategy, etc!

You can find the complete code here:

https://github.com/ericlighthofmann/EdgarScrape/blob/master/edgar_email.py
