GAL-VNE: Solving the VNE Problem with Global Reinforcement Learning and Local One-Shot Neural Prediction


The NP-hard combinatorial Virtual Network Embedding Problem (VNEP) has attracted increasing attention, which refers to finding the node and edge mapping between a virtual net (user request) and the physical net (server resource). To adapt to different requests and available resources in real-world applications, learning-based methods are recently devised beyond traditional heuristic solvers. However, the efficiency and scalability hinder its applicability as reinforcement learning (RL) is often adopted in an auto-regressive node-by-node mapping manner to handle complex mapping constraints, for each coming request for mapping. Moreover, existing learning-based works often independently consider each online request, limiting the long-term online service performance. This is because the previous mappings would affect the latter when they are deployed on the same physical network. In this paper, we present a synergistic Global-And-Local learning approach for the VNE problem (GAL-VNE). At the global level across requests, RL is employed to capture the cross-request relation for better global resource accommodation to improve overall performance. At the local level within each request, we aim to replace the sequential decision-making procedure whose cost relies much on the network size, with a more efficient one-shot generation scheme. The main challenge for such a one-shot model is how to encode the constraints under an end-to-end learning and inference paradigm. Accordingly, within the “rank-then-search” paradigm, we propose to first pretrain a graph neural network (GNN)-based node ranker with imitation supervision from an off-the-shelf solver (moderately expensive yet high quality), which is meanwhile regularized by a neighboring smooth prior. Then reinforcement learning is used to finetune the GNN ranker whose supervision directly refers to the final (undifferentiable) business objectives concerning revenue and cost, etc. Extensive experimental results on benchmarks show that our method outperforms classic and learning-based methods in both efficacy and efficiency.

In ACM SIGKDD Conference on Knowledge Discovery and Data Mining