Automatic Classification for Full-time and Part-time Car-hailing Drivers

      Abstract: To accurately distinguish full-time from part-time car-hailing drivers, and thereby optimize order dispatching and improve service efficiency and satisfaction, this study mined the patterns of variation in a large volume of order data from full-time and part-time vehicles on a car-hailing platform. An order time series diagram was generated for each individual vehicle from its order data, and an automatic classification model for full-time and part-time drivers was proposed that accounts for the temporal correlation of car-hailing orders, so that the cluster center curves characterize the attributes of full-time and part-time drivers. Precision, recall, and the F1 score were used as indicators of classification accuracy and effectiveness. A Euclidean-distance K-means clustering model (EKmeans) and a shape-distance K-means clustering model based on dynamic time warping (DTWKmeans) were selected as baseline models, and the proposed model was validated on car-hailing data from Didi Chuxing. The results show that, compared with the baseline models, the cluster center curves generated by the proposed model better characterize the dynamic variation in the orders served by full-time and part-time drivers, and the model classifies the two groups more accurately. The F1 scores for full-time and part-time drivers are 0.70 and 0.88, respectively, improvements of up to 55.56% and 37.5% over the baseline models. The proposed model also strikes the best balance between precision and recall among the compared models. Clustering based on curve shape better reflects the shape and features of each cluster. The proposed model classifies full-time and part-time drivers more accurately, which is of great significance for precise driver management, order dispatching optimization, and service level improvement on car-hailing platforms.
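The evaluation indicators named in the abstract can be illustrated with a minimal Python sketch. The confusion-matrix counts below are hypothetical and are not taken from the paper. The baseline F1 values 0.45 and 0.64 are also assumptions: they are back-calculated from the reported improvements of 55.56% and 37.5% and are not stated in the abstract.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for one class from raw counts.

    tp: drivers correctly assigned to the class
    fp: drivers wrongly assigned to the class
    fn: drivers of the class that were assigned elsewhere
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def relative_gain_percent(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Hypothetical counts chosen only to illustrate the formulas.
p, r, f1 = precision_recall_f1(tp=70, fp=30, fn=30)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")

# Reported F1 scores (from the abstract) against assumed baseline values.
full_time_gain = relative_gain_percent(0.70, 0.45)  # baseline 0.45 is an assumption
part_time_gain = relative_gain_percent(0.88, 0.64)  # baseline 0.64 is an assumption
print(f"full-time gain: {full_time_gain:.2f}%")
print(f"part-time gain: {part_time_gain:.2f}%")
```

Because F1 is the harmonic mean of precision and recall, a model scores well on it only when both are reasonably high, which is why the abstract treats it as a measure of balance between the two.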

       
