SQL and window-functions
Difference in BNP (aka GDP)
In BNP at cbs the relative difference against the previous year are presented. In the title on this site R and SQL, and more, the problem of the missing context was solved using an exponential X-axis. This title saves the R-code for later and dives into SQL window-functions. One in particular: lag.
CBS presents relative difference, hence I will uncover these also.
In order to deliver numbers to our R-script about relative
difference, we adapt the SQL-statement to deliver the difference
too (as the CBS reports until 2015, we limit year in the
select year, round( avg( bnp /1000000 / v.value * e.value) ) as bnp, 1 + year - (select min(year) from bnp) as rank, 100 * ( avg(bnp) - LAG( avg(bnp) ) OVER ( ORDER BY year )) / ( LAG( avg(bnp) ) OVER ( ORDER BY year ) ) as bnpreldiff, <<inflation for this year to take into account>> as inflation from bnp JOIN valuta v ON v.name = bnp.valuta JOIN valuta e ON e.name = v.refvaluta where e.name = 'EUR' and year <= 2015
The SQL window-function lag is used for each bnp to get the previous bnp for calculating the relative difference. The counterpart of lag is lead, which looks forward. Read Detect values changes for more on window-functions.
The code used by the R code then is contained in bnps:
<<bnp>> group by year order by year
We can re-use the R-code written in
R and SQL, and more,
mydata$bnpreldiff and show:
base = 1.06 nsamples = seq( 1, nrow( mydata ) ) plot( base^mydata$rank, mydata$bnpreldiff, type="l", xaxt="n" ) axis( 1, at=base^nsamples, labels=mydata$year, cex.axis=1.2, las=2 ) abline( h=0, col="lightgrey" )
As you can see, I put a grey-line at 0-growth.
The CBS graph is:
In order to compare with it, we choose a base of 1.2 and make the canvas a bit more square:
The purchasing value of money drives inflation in case it decreases. A growing BNP makes inflation possible, if not, is a driving force behind it. It is debatable whether it is accurate to correct BNP for inflation. For the sake of some smart R-code and even smarter SQL we will pursue graphs about BNP and relative differences as we did in previous titles, but then corrected for inflation. I.e. if the BNP is 100 in one year and 102 the next and if the inflation was 2% in that year, the corrected BNP for the next year is 100 as well.
Because we already know the relative difference with the year before, for every year, the inflation shifts every Y-value down (or up) by the inflation-percentage. Hence, graphing a inflation-corrected relative difference graph is easy when the inflation is known per year.
There is one snag though, the inflation to impose upon the BNP of e.g. 1976 is the one reported for 1975.
So for the SQL we use another window-function:
coalesce( LAG( avg( inflation ) ) OVER ( ORDER BY year ), 0 )
The R-code now simply subtracts every inflation-number from the bnpreldiff column:
Still not the same
'Well' I hear you saying, 'it still doesn't look slightly similar'. This is true for the aesthetics. This blog isn't about aesthetics. You can try mimicing these yourself by using ggplot2, an R package which CBS probably used themselves.
about this title
The document to generate the scripts has the same source as the document you are reading now.
Most scripts are bare bone, the amount of fancy stuff is kept to an absolute minimum in order to present only the concepts at hand and only that.
This title was written between 10th and 11th of August 2017