Abstract: Large-scale pre-trained language models (LMs) are increasingly fine-tuned on small private datasets to improve their performance on downstream applications. In this paper ...