Abstract: This paper studies target assignment in formation air defense. A Markov decision process (MDP) model is used to describe the dynamic target assignment process, and a reinforcement learning system for formation air defense target allocation is constructed and its composition described. A solution method based on the Q-Learning algorithm is given, and the model's effect is simulated and analyzed; the simulation results demonstrate the effectiveness of the model.
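To make the approach concrete, the following is a minimal sketch of tabular Q-Learning applied to a toy sequential target assignment problem. It is not the model from this paper: the kill-probability table `KILL_PROB`, the ammunition limits `AMMO`, the state encoding (index of the incoming target plus remaining rounds per fire unit), and all hyperparameters are hypothetical values chosen only for illustration. The reward is assumed to be the expected kill probability of the chosen fire unit against the current target.

```python
import random
from collections import defaultdict

# Hypothetical data: KILL_PROB[unit][target] is the assumed kill probability
# of a fire unit against each of three sequentially arriving targets.
KILL_PROB = [[0.9, 0.3, 0.5],
             [0.4, 0.8, 0.6]]
AMMO = (2, 1)      # hypothetical rounds available to each fire unit
N_TARGETS = 3

def step(state, action):
    """Assign fire unit `action` to the current target; return (next_state, reward, done)."""
    t, ammo = state
    if ammo[action] > 0:
        reward = KILL_PROB[action][t]
        ammo = tuple(a - 1 if i == action else a for i, a in enumerate(ammo))
    else:
        reward = 0.0   # a unit with no rounds left cannot engage
    t += 1
    return (t, ammo), reward, t == N_TARGETS

def train(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = defaultdict(float)            # Q-table keyed by (state, action)
    n_actions = len(KILL_PROB)
    for _ in range(episodes):
        state, done = (0, AMMO), False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)                       # explore
            else:
                action = max(range(n_actions), key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in range(n_actions))
            # Q-Learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q

def greedy_plan(q):
    """Follow the learned Q-table greedily and return the fire unit chosen per target."""
    state, done, plan = (0, AMMO), False, []
    while not done:
        action = max(range(len(KILL_PROB)), key=lambda a: q[(state, a)])
        plan.append(action)
        state, _, done = step(state, action)
    return plan

if __name__ == "__main__":
    print(greedy_plan(train()))
```

In this toy setting the state carries the remaining ammunition, so the learned policy can trade an immediate reward against keeping rounds for a later target, which is the kind of dynamic dependency an MDP formulation is meant to capture.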