# SQL and window-functions

## Difference in BNP (aka GDP)

In BNP at cbs the relative difference against the previous year are presented. In the title on this site R and SQL, and more, the problem of the missing context was solved using an exponential X-axis. This title saves the R-code for later and dives into SQL window-functions. One in particular: lag.

CBS presents relative difference, hence I will uncover these also.

In order to deliver numbers to our R-script about relative difference, we adapt the SQL-statement to deliver the difference too (as the CBS reports until 2015, we limit year in the `where`-clause):

```select year,
round( avg(
bnp /1000000 / v.value * e.value)
) as bnp,
1 + year -
(select min(year) from bnp)
as rank,
100 * ( avg(bnp)
- LAG( avg(bnp) )
OVER ( ORDER BY year ))
/
( LAG( avg(bnp) )
OVER ( ORDER BY year ) )
as bnpreldiff,
<<inflation for this year to take into account>> as inflation
from bnp
JOIN valuta v
ON v.name = bnp.valuta
JOIN valuta e
ON e.name = v.refvaluta
where e.name = 'EUR' and year <= 2015
```

The SQL window-function lag is used for each bnp to get the previous bnp for calculating the relative difference. The counterpart of lag is lead, which looks forward. Read Detect values changes for more on window-functions.

The code used by the R code then is contained in bnps:

```<<bnp>>
group by year
order by year
```

We can re-use the R-code written in R and SQL, and more, changing using `mydata\$bnp` into `mydata\$bnpreldiff` and show:

```base = 1.06
nsamples = seq( 1, nrow( mydata ) )
plot( base^mydata\$rank, mydata\$bnpreldiff,
type="l", xaxt="n" )
axis( 1, at=base^nsamples,
labels=mydata\$year,
cex.axis=1.2,
las=2 )
abline( h=0, col="lightgrey" )
```

As you can see, I put a grey-line at 0-growth.

The CBS graph is:

In order to compare with it, we choose a base of 1.2 and make the canvas a bit more square:

## Inflation

The purchasing value of money drives inflation in case it decreases. A growing BNP makes inflation possible, if not, is a driving force behind it. It is debatable whether it is accurate to correct BNP for inflation. For the sake of some smart R-code and even smarter SQL we will pursue graphs about BNP and relative differences as we did in previous titles, but then corrected for inflation. I.e. if the BNP is 100 in one year and 102 the next and if the inflation was 2% in that year, the corrected BNP for the next year is 100 as well.

Because we already know the relative difference with the year before, for every year, the inflation shifts every Y-value down (or up) by the inflation-percentage. Hence, graphing a inflation-corrected relative difference graph is easy when the inflation is known per year.

I got the figures from Historische inflatie and (for those before 1962) from: Statline Cbs.

There is one snag though, the inflation to impose upon the BNP of e.g. 1976 is the one reported for 1975.

So for the SQL we use another window-function:

```coalesce( LAG( avg( inflation ) ) OVER ( ORDER BY year ), 0 )
```

The R-code now simply subtracts every inflation-number from the bnpreldiff column:

### Still not the same

'Well' I hear you saying, 'it still doesn't look slightly similar'. This is true for the aesthetics. This blog isn't about aesthetics. You can try mimicing these yourself by using ggplot2, an R package which CBS probably used themselves.