SQL and window-functions

Difference in BNP (aka GDP)

In BNP at cbs the relative difference against the previous year are presented. In the title on this site R and SQL, and more, the problem of the missing context was solved using an exponential X-axis. This title saves the R-code for later and dives into SQL window-functions. One in particular: lag.

CBS presents relative difference, hence I will uncover these also.

In order to deliver numbers to our R-script about relative difference, we adapt the SQL-statement to deliver the difference too (as the CBS reports until 2015, we limit year in the where-clause):

select year, 
       round( avg( 
          bnp /1000000 / v.value * e.value)
       ) as bnp, 
       1 + year - 
          (select min(year) from bnp) 
         as rank,
       100 * ( avg(bnp) 
               - LAG( avg(bnp) ) 
                 OVER ( ORDER BY year ))
             ( LAG( avg(bnp) ) 
               OVER ( ORDER BY year ) )
         as bnpreldiff,
       <<inflation for this year to take into account>> as inflation
  from bnp 
       JOIN valuta v 
         ON v.name = bnp.valuta
       JOIN valuta e 
         ON e.name = v.refvaluta
 where e.name = 'EUR' and year <= 2015

The SQL window-function lag is used for each bnp to get the previous bnp for calculating the relative difference. The counterpart of lag is lead, which looks forward. Read Detect values changes for more on window-functions.

The code used by the R code then is contained in bnps:

group by year
order by year

We can re-use the R-code written in R and SQL, and more, changing using mydata$bnp into mydata$bnpreldiff and show:

base = 1.06
nsamples = seq( 1, nrow( mydata ) )
plot( base^mydata$rank, mydata$bnpreldiff, 
      type="l", xaxt="n" )
axis( 1, at=base^nsamples, 
      las=2 )
abline( h=0, col="lightgrey" )

Sorry, your browser does not support SVG.

As you can see, I put a grey-line at 0-growth.

The CBS graph is:

Sorry, your browser does not support SVG.

In order to compare with it, we choose a base of 1.2 and make the canvas a bit more square:

Sorry, your browser does not support SVG.


The purchasing value of money drives inflation in case it decreases. A growing BNP makes inflation possible, if not, is a driving force behind it. It is debatable whether it is accurate to correct BNP for inflation. For the sake of some smart R-code and even smarter SQL we will pursue graphs about BNP and relative differences as we did in previous titles, but then corrected for inflation. I.e. if the BNP is 100 in one year and 102 the next and if the inflation was 2% in that year, the corrected BNP for the next year is 100 as well.

Because we already know the relative difference with the year before, for every year, the inflation shifts every Y-value down (or up) by the inflation-percentage. Hence, graphing a inflation-corrected relative difference graph is easy when the inflation is known per year.

I got the figures from Historische inflatie and (for those before 1962) from: Statline Cbs.

There is one snag though, the inflation to impose upon the BNP of e.g. 1976 is the one reported for 1975.

So for the SQL we use another window-function:

coalesce( LAG( avg( inflation ) ) OVER ( ORDER BY year ), 0 )

The R-code now simply subtracts every inflation-number from the bnpreldiff column:

Sorry, your browser does not support SVG.

Still not the same

'Well' I hear you saying, 'it still doesn't look slightly similar'. This is true for the aesthetics. This blog isn't about aesthetics. You can try mimicing these yourself by using ggplot2, an R package which CBS probably used themselves.

about this title

The document to generate the scripts has the same source as the document you are reading now.

Most scripts are bare bone, the amount of fancy stuff is kept to an absolute minimum in order to present only the concepts at hand and only that.

This title was written between 10th and 11th of August 2017