venue: IJCAI 2020
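If the aggregation function of previous GNN layers (e.g. GCN and GAT) is

$$h_v^{(k)} = \mathrm{AGG}\big(\{h_i^{(k-1)}\}_{i \in \tilde{N}(v)}\big) = \sigma\Big(\sum_{i \in \tilde{N}(v)} a_{vi}\, h_i^{(k-1)} W^{(k)}\Big),$$

then the paper extends it with a bilinear aggregator:

$$h_v^{(k)} = (1 - \alpha)\, \mathrm{AGG}\big(\{h_i^{(k-1)}\}_{i \in \tilde{N}(v)}\big) + \alpha\, \mathrm{BA}\big(\{h_i^{(k-1)}\}_{i \in \tilde{N}(v)}\big),$$

where

$$\mathrm{BA}\big(\{h_i^{(k-1)}\}_{i \in \tilde{N}(v)}\big) = \frac{1}{b_v} \sum_{i \in \tilde{N}(v)} \; \sum_{j \in \tilde{N}(v),\, i < j} h_i^{(k-1)} W^{(k)} \odot h_j^{(k-1)} W^{(k)}.$$

Here $\tilde{N}(v)$ is the neighborhood of $v$ including $v$ itself, $a_{vi}$ is the aggregation weight (fixed by the graph in GCN, learned by attention in GAT), $\odot$ is the elementwise product, $b_v = \tilde{d}_v(\tilde{d}_v - 1)/2$ is the number of node pairs in $\tilde{N}(v)$, and $\alpha$ trades off the two aggregators.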
The bilinear aggregator (BA) sums the elementwise products of the representations of every pair of distinct nodes in the target node's neighborhood (self-interactions excluded).
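The pairwise sum can be sketched in NumPy as follows. This is a minimal illustration rather than the paper's implementation: the function names are hypothetical, the linear transform `W` applied before the product and the averaging over the number of pairs are assumptions, and `h` is assumed to hold one row per node in the target node's neighborhood. The second function uses the algebraic identity sum_{i<j} z_i * z_j = ((sum_i z_i)^2 - sum_i z_i^2) / 2, which avoids the explicit double loop.

```python
import numpy as np


def bilinear_aggregate_naive(h, W):
    """Average of elementwise products over all pairs i < j of neighborhood rows.

    h: (n, d_in) array, one row per node in the neighborhood (assumed layout).
    W: (d_in, d_out) array, linear transform applied before the product (assumed).
    """
    z = h @ W
    n = z.shape[0]
    out = np.zeros(z.shape[1])
    # Explicit loop over unordered pairs; i < j excludes self-interactions.
    for i in range(n):
        for j in range(i + 1, n):
            out += z[i] * z[j]
    num_pairs = n * (n - 1) // 2
    # Averaging over pairs is an assumption; with no pairs the result is zero.
    return out / num_pairs if num_pairs > 0 else out


def bilinear_aggregate_fast(h, W):
    """Same quantity in O(n * d) time via the square-of-sum identity."""
    z = h @ W
    n = z.shape[0]
    sum_z = z.sum(axis=0)
    sum_z_sq = (z * z).sum(axis=0)
    pair_sum = 0.5 * (sum_z * sum_z - sum_z_sq)
    num_pairs = n * (n - 1) // 2
    return pair_sum / num_pairs if num_pairs > 0 else pair_sum
```

For a neighborhood of three nodes with features `[1, 2]`, `[3, 4]`, `[5, 6]` and an identity `W`, both functions return `[23/3, 44/3]`, the average of the three pairwise products.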
The experimental results show that BGAT (BGCN) outperforms vanilla GAT (GCN) by 1.5% (1.6%) in node classification accuracy.
A linear combination of the AGG output and the BA output may not be optimal. Other feature aggregation mechanisms could also be used (e.g. concatenation followed by an FFN).
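The two ways of merging the AGG and BA outputs can be contrasted with a small sketch. Everything here is illustrative and assumed: the function names, the trade-off weight `alpha`, and the two-layer ReLU network standing in for the FFN are hypothetical choices, not something the paper evaluates.

```python
import numpy as np


def combine_linear(agg_out, ba_out, alpha=0.5):
    """Fixed convex combination of the two aggregator outputs (alpha is assumed)."""
    return (1.0 - alpha) * agg_out + alpha * ba_out


def combine_concat_ffn(agg_out, ba_out, W1, b1, W2, b2):
    """Concatenate both outputs and pass them through a two-layer ReLU FFN.

    agg_out, ba_out: (n, d) arrays.
    W1: (2 * d, hidden), b1: (hidden,), W2: (hidden, d_out), b2: (d_out,).
    All weights are hypothetical learnable parameters.
    """
    x = np.concatenate([agg_out, ba_out], axis=-1)
    hidden = np.maximum(x @ W1 + b1, 0.0)  # ReLU
    return hidden @ W2 + b2
```

The linear form uses one scalar for all feature dimensions, whereas the concatenation variant lets learned weights decide, per dimension, how much each aggregator contributes, at the cost of extra parameters.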