# SQL and window-functions

## Difference in BNP (aka GDP)

In BNP
at cbs the relative difference against the previous year are
presented. In the title on this site
R and SQL, and more, the problem
of the missing context was solved using an exponential X-axis. This
title saves the R-code for later and dives into SQL
window-functions. One in particular: *lag*.

CBS presents relative difference, hence I will uncover these also.

In order to deliver numbers to our R-script about relative
difference, we adapt the SQL-statement to deliver the difference
too (as the CBS reports until 2015, we limit *year* in the `where`

-clause):

select year, round( avg( bnp /1000000 / v.value * e.value) ) as bnp, 1 + year - (select min(year) from bnp) as rank, 100 * ( avg(bnp) - LAG( avg(bnp) ) OVER ( ORDER BY year )) / ( LAG( avg(bnp) ) OVER ( ORDER BY year ) ) as bnpreldiff, <<inflation for this year to take into account>> as inflation from bnp JOIN valuta v ON v.name = bnp.valuta JOIN valuta e ON e.name = v.refvaluta where e.name = 'EUR' and year <= 2015

The SQL window-function *lag* is used for each *bnp* to get the
previous *bnp* for calculating the relative difference. The
counterpart of *lag* is *lead*, which looks forward. Read
Detect
values changes for more on window-functions.

The code used by the R code then is contained in *bnps*:

<<bnp>> group by year order by year

We can re-use the R-code written in
R and SQL, and more,
changing
using `mydata$bnp`

into `mydata$bnpreldiff`

and show:

base = 1.06 nsamples = seq( 1, nrow( mydata ) ) plot( base^mydata$rank, mydata$bnpreldiff, type="l", xaxt="n" ) axis( 1, at=base^nsamples, labels=mydata$year, cex.axis=1.2, las=2 ) abline( h=0, col="lightgrey" )

As you can see, I put a grey-line at 0-growth.

The CBS graph is:

In order to compare with it, we choose a base of 1.2 and make the canvas a bit more square:

## Inflation

The purchasing value of money drives inflation in case it decreases. A growing BNP makes inflation possible, if not, is a driving force behind it. It is debatable whether it is accurate to correct BNP for inflation. For the sake of some smart R-code and even smarter SQL we will pursue graphs about BNP and relative differences as we did in previous titles, but then corrected for inflation. I.e. if the BNP is 100 in one year and 102 the next and if the inflation was 2% in that year, the corrected BNP for the next year is 100 as well.

Because we already know the relative difference with the year before, for every year, the inflation shifts every Y-value down (or up) by the inflation-percentage. Hence, graphing a inflation-corrected relative difference graph is easy when the inflation is known per year.

I got the figures from Historische inflatie and (for those before 1962) from: Statline Cbs.

There is one snag though, the inflation to impose upon the BNP of e.g. 1976 is the one reported for 1975.

So for the SQL we use another window-function:

coalesce( LAG( avg( inflation ) ) OVER ( ORDER BY year ), 0 )

The R-code now simply subtracts every *inflation*-number from the
*bnpreldiff* column:

### Still not the same

'Well' I hear you saying, 'it still doesn't look slightly
similar'. This is true for the aesthetics. This blog isn't about
aesthetics. You can try mimicing these yourself by using *ggplot2*,
an R package which CBS probably used themselves.

## about this title

The document to generate the scripts has the same source as the document you are reading now.

Most scripts are bare bone, the amount of fancy stuff is kept to an absolute minimum in order to present only the concepts at hand and only that.

This title was written between 10th and 11th of August 2017