Multilingual natural language understanding, which aims to comprehend multilingual documents, is an important task. Existing efforts have been focusing on the analysis of centrally stored text data, but in real practice, multilingual data is usually distributed. Federated learning is a promising paradigm to solve this problem, which trains local models with decentralized data on local clients and aggregates local models on the central server to achieve a good global model. However, existing federated learning methods assume that data are independent and identically distributed (IID), and cannot handle multilingual data, that are usually non-IID with severely skewed distributions: First, multilingual data is stored on local client devices such that there are only monolingual or bilingual data stored on each client. This makes it difficult for local models to know the information of documents in other languages. To solve the aforementioned challenges of multilingual federated NLU, we propose a plug-and-play knowledge composition~(KC) module, called FedKC, which exchanges knowledge among clients without sharing raw data. pecifically, we propose an effective way to calculate a consistency loss defined based on the shared knowledge across clients, which enables models trained on different clients achieve similar predictions on similar data. Leveraging this consistency loss, joint training is thus conducted on distributed data respecting the privacy constraints. We also analyze the potential risk of FedKC and provide theoretical bound to show that it is difficult to recover data from the corrupted data. We conduct extensive experiments on three public multilingual datasets for three typical NLU tasks, including paraphrase identification, question answering matching, and news classification. The experiment results show that the proposed FedKC can outperform state-of-the-art baselines on the three datasets significantly.