Correlation Matrix Example. Wind and Solar energy. How to Read a Profit and Loss Statement

A correlation matrix is a must-have tool for your exploratory data analysis. Societies must invest in wind and solar energy. Knowing how to read an income statement is key to investing.

The Data Science Topic of the Week: The Correlation Matrix

Learning without practice is not effective

The Dataset

A new week, a new exercise!

We will be working with a dataset of the McDonald's India menu. You can download it from Kaggle below:

This dataset contains nutrition information for every item on the Indian McDonald's menu. I'm sure you are wondering what we are going to do with it!

Let's load the necessary Python libraries and look at the dataset:

Extract from the dataset

The Use Case

Instruction: create a correlation matrix to compare the coefficients of correlation between the dataset's features

What is a correlation matrix?

It's a tool I use a lot when performing EDA (Exploratory Data Analysis). It is a table in which each cell holds the coefficient of correlation for one pair of features, so you can compare all the pairs in the dataset at a glance.

 A correlation matrix helps to:  

  • summarize a large dataset, identify patterns, and make decisions based on it.

  • see which variables are most strongly correlated

It's a critical step in the pre-processing of machine learning pipelines. It is particularly valuable in regression techniques such as simple linear regression, multiple linear regression, and lasso regression models.
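To make the idea concrete, here is a minimal sketch of how a Pearson correlation matrix can be computed by hand in plain Python. The column names and values are made-up toy numbers, not taken from the McDonald's dataset, and the helper names `pearson` and `correlation_matrix` are mine. In practice a library such as pandas does this for you.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {row_name: {col_name: r}} for every pair of columns."""
    return {
        a: {b: round(pearson(xs, ys), 2) for b, ys in columns.items()}
        for a, xs in columns.items()
    }


# Toy values for illustration only (NOT from the real dataset).
toy = {
    "energy": [100, 200, 300, 400],
    "fat": [5, 9, 16, 20],
    "fiber": [4, 3, 2, 1],
}

matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, row)
```

Every feature is perfectly correlated with itself, so the diagonal is always 1, and the matrix is symmetric because the correlation of A with B equals the correlation of B with A.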

Please try to do the exercise on your own before looking at the solution if you can.

Here is my code snippet:

You can copy and paste the code from my GitHub gist with the link below: 

Output

Comments

  • Pearson correlation: this calculation assesses the strength of the linear relationship between two variables. Its value ranges from -1 to 1: -1 shows a total negative correlation, 0 no linear correlation, and +1 a total positive correlation. We round it to 2 decimals.

  • plt.figure: we set the figure size and the resolution, given in dpi (dots per inch). It's 600 here to get very high quality.

  • sns.heatmap: we use the heatmap function to plot the correlation matrix:

    • square: we set it to True. Each cell will be square-shaped.

    • vmin, vmax, center: we set the range of values to anchor the colormap

    • cmap: we choose 'RdBu_r'. It's a diverging colormap.

    • cbar_kws: to make the colorbar small, we use the shrink argument with a value smaller than 1.
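For readers who cannot open the gist, the comments above can be put together roughly as follows. This is a sketch under assumptions, not the exact snippet from the gist: the toy DataFrame stands in for the real menu data, and the figure size, `annot=True`, and the `shrink` value of 0.8 are my own choices.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy values for illustration only (NOT the real menu data).
# With the real file you would start from pd.read_csv(...) instead.
df = pd.DataFrame({
    "energy": [100, 200, 300, 400],
    "fat": [5, 9, 16, 20],
    "fiber": [4, 3, 2, 1],
})

# Pearson correlation between the numeric columns, rounded to 2 decimals.
corr = df.select_dtypes("number").corr(method="pearson").round(2)

# Figure size is an assumption; dpi=600 gives a very high resolution.
fig = plt.figure(figsize=(8, 8), dpi=600)

ax = sns.heatmap(
    corr,
    annot=True,                # assumption: print each coefficient in its cell
    square=True,               # square-shaped cells
    vmin=-1, vmax=1, center=0, # anchor the colormap on the full [-1, 1] range
    cmap="RdBu_r",             # diverging colormap
    cbar_kws={"shrink": 0.8},  # value below 1 makes the colorbar smaller
)

# Call plt.show() in a notebook, or fig.savefig(...) to write the image to disk.
```

Anchoring the colormap at -1, 0, and 1 means the same color always stands for the same coefficient, which makes heatmaps from different datasets comparable.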

Analysis

We can conclude that energy, protein, sodium, total fat, and saturated fat are all strongly positively correlated with one another (unsurprisingly, it doesn't seem like a healthy diet).

Energy and total carbohydrate also have a strong positive correlation.

Foods and beverages with a high total sugar content also have a high added sugar content (Yikes!)

The Mindful Data of the Week

The newsletter has a new section! Every Friday, I'll highlight one interesting piece of data that raises awareness and inspires action to better our lives and society.

I picked a chart from the latest report of the Intergovernmental Panel on Climate Change (IPCC), the United Nations body for assessing the science related to climate change, and cropped the part below:

You can view the full chart here

I chose 2 interesting facts on this chart:

  • Wind and solar energy are the most promising and cost-effective sources of energy for combating climate change.

  • Nuclear power, by contrast, appears less effective and much more expensive on this chart.

There is a current narrative that abandoning nuclear power, as Germany did after Fukushima, was a huge mistake, and that societies should return to nuclear power.

Whether abandoning nuclear power is a good or a bad choice depends on the alternative that governments choose instead. Germany was unprepared and substituted nuclear power with coal, which is far more harmful.

Governments should invest heavily in wind and solar energy, as shown in this chart.

The Finance Topic of the Week: How to Read a P&L

“You have to understand accounting and you have to understand the nuances of accounting. It's the language of business and it's an imperfect language, but unless you are willing to put in the effort to learn accounting - how to read and interpret financial statements - you really shouldn't select stocks yourself.”

Warren Buffett - CEO of Berkshire Hathaway

If you invest in stocks, you must be able to read the 3 financial statements:

  • The Income Statement

  • The Balance Sheet

  • The Cash Flow Statement

I'll start a series on each of these three statements. I'll use NVIDIA as a case study for the Income Statement.

The Income Statement Basics

It is also known as the profit and loss statement (P&L). It shows a company's revenue and expenses over a given period of time, such as a quarter or a year.

Let's take it one step at a time:

Always start at the top with revenue and work your way down by deducting various expenses.

  • Revenue (also called the top line): the amount of net sales generated by selling products or services to customers. Net means that discounts, returns, and other deductions have already been subtracted.

  • Cost of Goods Sold (COGS): these are the costs incurred by the company to manufacture the product or deliver the service. It is also known as Direct Costs. The COGS percentage is calculated by dividing COGS by Revenue.

  • Gross Margin (GM): the difference between Revenue and Cost of Goods Sold. It demonstrates the company's efficiency in producing goods and services. It is also commonly divided by Revenue to give the gross margin percentage. The greater the GM value, the better.

  • Operating Expenses: These are the costs incurred by the company to carry out its daily operations. Some businesses divide it into several categories, while others lump it all together.

  • Operating Income: obtained after operating expenses are deducted from the Gross Margin.

  • Non-Operating Income/Expense: all other income and costs unrelated to the operation of the business. It can be an exchange rate loss or gain, a gain or loss from the sale of an asset, or financial charges.

  • Pre-Tax Income (EBT): obtained after adding or deducting the Non-Operating Income/Expense.

  • Income Tax Expense: It includes the income taxes that the company is required to pay.

  • Net Income: This is the actual earnings or profits, also known as the "bottom line."

  • Earnings Per Share (EPS): This is a popular metric. Net Income is divided by the total number of shares outstanding.
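The top-to-bottom walk described above can be sketched in a few lines of Python. All figures are hypothetical round numbers chosen to keep the arithmetic easy to follow; they do not come from any real company's report, and the variable names are mine.

```python
# Hypothetical figures in $ millions, for illustration only
# (NOT taken from any real company's report).
revenue = 1_000.0
cogs = 400.0
operating_expenses = 300.0
non_operating_net = -20.0    # negative = net expense, positive = net income
income_tax_expense = 56.0
shares_outstanding = 100.0   # in millions of shares

# Start at the top line and work down, one deduction at a time.
gross_margin = revenue - cogs
operating_income = gross_margin - operating_expenses
pre_tax_income = operating_income + non_operating_net
net_income = pre_tax_income - income_tax_expense
eps = net_income / shares_outstanding

# Percentages are always taken relative to revenue.
cogs_pct = cogs / revenue
gross_margin_pct = gross_margin / revenue

print(f"Gross margin:     {gross_margin:8.1f} ({gross_margin_pct:.0%} of revenue)")
print(f"Operating income: {operating_income:8.1f}")
print(f"Pre-tax income:   {pre_tax_income:8.1f}")
print(f"Net income:       {net_income:8.1f}")
print(f"EPS:              {eps:8.2f}")
```

With these numbers, the company keeps $60 of gross margin for every $100 of sales and ends with a net income of 224, or $2.24 per share.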

Now that we've covered the fundamentals, let's practice with a real-world recent profit and loss statement.

Practice on a real-world example: Nvidia

DISCLAIMER: None of this is financial advice. This newsletter is strictly educational and is not investment advice or a solicitation to buy or sell any assets or to make any financial decisions. Please be careful and do your own research.

Nvidia is a global corporation that manufactures graphics processors, mobile technologies, and chipsets.

I chose Nvidia's most recent earnings report (link here). You can read my comments on the main highlights:

There are three common practices to follow when assessing the P&L:

  • The ratio of costs to sales: expressing each P&L cost line as a percentage of sales enables you to state:

"For every $100 in sales, the company spends $xxx in COGS." (for example)

  • The comparison with the previous time period (quarter, year...): how is the company doing compared to the previous period? Which lines of the income statement have changed significantly?

  • The comparison with the industry and the competitors: what seems to be a high gross margin may be low in comparison to the industry average. For comparison, I chose AMD, and you can find their Q2 earnings report here.
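The first two practices are simple arithmetic, sketched below in Python. The figures are hypothetical and are not real Nvidia or AMD numbers; the helper names `common_size` and `period_change` are my own.

```python
def common_size(statement):
    """Express every line of the statement as a percentage of revenue."""
    revenue = statement["revenue"]
    return {line: round(100 * value / revenue, 1) for line, value in statement.items()}


def period_change(current, previous):
    """Percentage change of each line versus the previous period."""
    return {
        line: round(100 * (current[line] - previous[line]) / previous[line], 1)
        for line in current
    }


# Hypothetical figures in $ millions (NOT real Nvidia or AMD numbers).
this_quarter = {"revenue": 1200.0, "cogs": 540.0, "operating_expenses": 360.0}
same_quarter_last_year = {"revenue": 1000.0, "cogs": 400.0, "operating_expenses": 300.0}

ratios = common_size(this_quarter)
changes = period_change(this_quarter, same_quarter_last_year)

print(f"For every $100 in sales, the company spends ${ratios['cogs']:.0f} in COGS.")
for line, pct in changes.items():
    print(f"{line}: {pct:+.1f}% vs. the same quarter last year")
```

In this toy case COGS grew by 35% while revenue grew by only 20%, which is the kind of line that deserves a closer look. For the third practice, the same `common_size` helper can be applied to a competitor's statement so that both companies are compared on a per-$100-of-sales basis.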

Let's compare Nvidia and AMD. For Q2 22, the ratios of some of the main cost lines are comparable between the two companies.

Analyzing the income statement is not enough. It must be combined with the balance sheet and the cash flow statement. This will be covered in future newsletter issues!

 That's a wrap for today. Stay curious, practice your Python and see you next week!

I have a Data Science blog on Medium: Khuong Lân Cao Thai – Medium
