Go to the SEC website and download the FY18 10-K Excel files for Pfizer (PFE), Merck (MRK) and Johnson & Johnson (JNJ). To download the files, search for each company and find its 10-K filing for fiscal year 2018. You can limit the filings shown by typing 10-K into the filing type field under the filter results option. Click the interactive data option next to the 10-K filing information. Directly below the company name you will see an option to “View Excel Document”; click it and download the file. Save all three files to the same folder, and name each file by company, filing type and year (e.g. MRK_10K_FY18). To combine the data, we will use an Excel function called VLOOKUP (Microsoft Excel VLOOKUP).
- Read through how the VLOOKUP function works (Microsoft Excel VLOOKUP).
- Create a new Excel file
Make the following columns
- Company Name
- Year
- Sales
- Cost of Goods Sold
- Gross Profit
- Net Income
Use VLOOKUP to populate the Company Name from your files.
- For example, if using the naming convention above, the following would populate the Pfizer name:
=VLOOKUP("Entity Registrant Name",'[PFE_10K_FY18.xlsx]Document and Entity Information'!$A:$D, 2, FALSE)
The formula looks in the first column of the array for the value “Entity Registrant Name” and returns the value from the second column of the matching row. The array is in the file PFE_10K_FY18.xlsx, on the tab “Document and Entity Information”. In this case the value in that cell is PFIZER INC.
You can see the power of VLOOKUP to quickly pull data from different sources. One could duplicate this formula and only change the file name to pull data from many sources, assuming the files are formatted the same – that is the same array, columns, tab names and lookup value apply.
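For readers who find code easier to follow than formulas, the exact-match lookup described above can be sketched in Python. This is only an illustration of the lookup logic, not how Excel implements it: the function name `vlookup` and the sample rows other than the registrant name are assumptions made up for the example.

```python
def vlookup(lookup_value, table, col_index):
    """Exact-match lookup, similar to VLOOKUP(..., FALSE).

    Scans the first column of `table` from top to bottom and returns the
    value in column `col_index` (1-based) of the first matching row.
    Note: Excel ignores letter case when matching text; this sketch does not.
    """
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None  # Excel would display #N/A here


# Hypothetical stand-in for the first two columns of the
# "Document and Entity Information" tab.
entity_info = [
    ["Document Type", "10-K"],
    ["Entity Registrant Name", "PFIZER INC"],
    ["Document Fiscal Year Focus", "2018"],
]

name = vlookup("Entity Registrant Name", entity_info, 2)
print(name)
```

Swapping in a different table while keeping the lookup value and column index unchanged mirrors the idea of changing only the file name in the Excel formula.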
- Create nine total rows in your Excel file, three for each company. Use VLOOKUP to populate the company name in all three rows for each company, and then fill in the years FY18, FY17 and FY16, so that each company has one row per year. The Find and Replace feature in Excel (Ctrl+H) can help you quickly swap the company ticker in the file name within the formula.
- We can use the same VLOOKUP function to pull data for Sales, Cost of Goods Sold and Net Income. Look through the Excel files for the three companies and identify some barriers to using VLOOKUP. For example, VLOOKUP works best when the files are formatted the same, the tab names are the same and the lookup values are the same. Is that the case with these files? Identify at least three challenges you see with using VLOOKUP.
This is the reality with many data sets such as 10-Ks. They are inconsistent between companies, and often inconsistent even within a company. It is often faster to keep the VLOOKUP formula consistent, and instead change the data sets to fit the VLOOKUP parameters. In this case we would have to:
- Make the Consolidated Statements of Income tab have a consistent name
- Make Sales, Cost of Goods Sold and Net Income consistent lookup values
- Make sure the data we want to pull is within the same column
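The second cleanup step, making the lookup values consistent, can be sketched in Python as a simple label mapping. The label variants and figures below are hypothetical placeholders, not taken from the actual filings; check each company's income statement tab for the wording it really uses.

```python
# Hypothetical label variants mapped to the names our lookups expect.
CANONICAL_LABELS = {
    "revenues": "Sales",
    "sales": "Sales",
    "cost of sales": "Cost of Goods Sold",
    "cost of products sold": "Cost of Goods Sold",
    "net income": "Net Income",
    "net earnings": "Net Income",
}


def normalize_label(label):
    """Map a raw row label to its canonical name, or return it unchanged."""
    key = label.strip().lower()
    return CANONICAL_LABELS.get(key, label)


def normalize_rows(rows):
    """Return copies of the rows with the first-column label normalized."""
    return [[normalize_label(row[0])] + list(row[1:]) for row in rows]


# Made-up rows standing in for an income statement tab.
raw_rows = [
    ["Revenues", 100, 90, 80],
    ["Cost of sales ", 40, 35, 30],
    ["Research and development", 10, 9, 8],
]
clean_rows = normalize_rows(raw_rows)
for row in clean_rows:
    print(row)
```

Labels that are not in the mapping pass through unchanged, so rows we do not need to look up are left alone.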
- Use your downloaded Excel files to practice VLOOKUP and populate the remaining values for Sales, Cost of Goods Sold, and Net Income for each company for years FY18, FY17 and FY16. (Note that the “cleaned up” files only include the Document and Entity Information and Consolidated Statements of Income tabs, and that only those values needed for pulling data have been changed.) Calculate Gross Profit as Sales minus Cost of Goods Sold. When you are finished you should have a data set that is 10 rows (including the header) by 6 columns.
- Pivot tables are a useful tool for analyzing time series data for year over year comparisons (Pivot Table Excel). Create a new sheet and use pivot tables to analyze year over year changes in Sales, Cost of Goods Sold, Gross Profit and Net Income.
- Comment on some individual corporate and industry trends you see.
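As a cross-check on the spreadsheet, the Gross Profit calculation and a year over year comparison can be sketched in Python. The figures below are made up for a single hypothetical company and are not real 10-K data; the function names are illustrative only.

```python
def gross_profit(sales, cogs):
    """Gross Profit = Sales - Cost of Goods Sold."""
    return sales - cogs


def yoy_change(current, prior):
    """Year over year change as a fraction of the prior-year value."""
    return (current - prior) / prior


# Made-up figures for one hypothetical company, oldest year first.
rows = [
    {"year": "FY16", "sales": 80.0, "cogs": 30.0},
    {"year": "FY17", "sales": 100.0, "cogs": 40.0},
    {"year": "FY18", "sales": 110.0, "cogs": 44.0},
]

for row in rows:
    row["gross_profit"] = gross_profit(row["sales"], row["cogs"])

# Compare each year with the one before it.
changes = {}
for prior, current in zip(rows, rows[1:]):
    changes[current["year"]] = {
        metric: yoy_change(current[metric], prior[metric])
        for metric in ("sales", "cogs", "gross_profit")
    }

for year, metrics in changes.items():
    print(year, {m: f"{v:.1%}" for m, v in metrics.items()})
```

The same two formulas apply to Net Income and to each of the three companies; a pivot table arranges the same numbers by company and year.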