Main Article Content
This paper compares a Proximal Policy Optimisation (PPO) deep reinforcement-learning agent with an Ant Colony Optimisation (ACO) solver on the offline, heterogeneous-bin three-dimensional bin packing problem (3D-BPP). Both algorithms were evaluated on fifty synthetic instances using a unified composite scoring function covering placement ratio, volume utilisation, bin-count penalty and mean per-bin waste. PPO achieves a higher mean composite score (0.346 vs. 0.283), wins on 38 of 50 instances with an average winning margin of 0.101, and resolves each instance in under 60 seconds on a commodity CPU. ACO exhibits greater score variance and resolves instances in up to 1,706 seconds, but its training-free character makes it relevant when the instance distribution changes too rapidly for policy retraining. The PPO training cost of approximately 5.5 hours is recovered after 58 instances compared with ACO at mean inference times. A paired Wilcoxon signed-rank test is identified as the appropriate significance test once per-instance data are made available.
Article Details
Ansel J., Yang E., He H., Gimelshein N., Jain A., Voznesensky M., … Chintala S. (2024) PyTorch 2: Faster Machine Learning through Dynamic Python Bytecode Transformation and Graph Compilation. Proceedings of ASPLOS ’24. ACM. https://doi.org/10.1145/3620665.3640366 (Crossref)
Christensen H. I., Khan A., Pokutta S., Tetali P. (2016) Multidimensional Bin Packing and Other Related Problems: A Survey. Computer Science Review, 24. (Crossref)
Dorigo M., Gambardella L. M. (1997) Ant Colony System: A Cooperative Learning Approach to the Traveling Salesman Problem. IEEE Transactions on Evolutionary Computation, 1(1), 53-66. (Crossref)
Dorigo M., Stützle T. (2018) Ant Colony Optimization: Overview and Recent Advances. [in:] Handbook of Metaheuristics. Springer, 311-351. (Crossref)
Dyckhoff H. (1990) A Typology of Cutting and Packing Problems. European Journal of Operational Research, 44(2), 145-159. (Crossref)
Dyckhoff H., Finke U. (1992) Cutting and Packing in Production and Distribution: A Typology and Bibliography. Springer. (Crossref)
Elhedhli S., Gzara F., Yildiz B. C. (2019) Three-Dimensional Bin Packing and Mixed-Case Palletization. INFORMS Journal on Optimization, 1(4), 323-352. (Crossref)
Garey M. R., Johnson D. S. (1979) Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman.
gym-BinPack3D [Source Code] (2024) https://github.com/ylchan87/gym-BinPack3D
Gzara F., Elhedhli S., Yildiz B. C. (2020) The Pallet-Loading Problem: Three-Dimensional Bin Packing with Practical Constraints. European Journal of Operational Research, 287, 545-565. (Crossref)
Hu H., Zhang X., Yan X., Wang L., Xu Y. (2017) Solving a New 3D Bin Packing Problem with Deep Reinforcement Learning Method. arXiv preprint arXiv:1708.05930.
Kanna S. K., Jaisree A. D., Balasundaram K., Kumar S. B. (2015) Optimization of 3D Constrained Rectangular Bin Packing Problem using Recursive Ant Colony Algorithm. IOSR Journal of Mechanical and Civil Engineering, 12(4), 65-70.
Karaboga D., Basturk B. (2008) On the Performance of Artificial Bee Colony (ABC) Algorithm. Applied Soft Computing, 8(1), 687-697. (Crossref)
Lampropoulos A. S., Tsihrintzis G. A. (2015) Machine Learning Paradigms: Applications in Recommender Systems. Springer International Publishing. (Crossref)
Levine J., Ducatelle F. (2004) Ant Colony Optimization and Local Search for Bin Packing and Cutting Stock Problems. Journal of the Operational Research Society, 55(7), 705-716. (Crossref)
Li X., Zhang K. (2015) A Hybrid Differential Evolution Algorithm for Multiple Container Loading Problem with Heterogeneous Containers. Computers & Industrial Engineering, 90, 305-313. (Crossref)
Maia D. B., Borenstein D. (2024) A Hybrid Approach Combining Ant Colony Optimization and Column Generation for Solving the 3D Bin Packing Problem with Rotation. Available at SSRN 4978748 [preprint]. (Crossref)
Manthey B., van Rhijn J., Safari A., Vredeveld T. (2025) Convergence and Running Time of Time-Dependent Ant Colony Algorithms. arXiv preprint arXiv:2501.10810 [preprint].
Downloads
- Piotr Jałowiecki, Michał Gostkowski, Krzysztof Zmarzłowski, Tomasz Woźniakowski, PRODUCTIVITY PARADOX IN SELECTED SECTORS OF AGRI-FOOD PRODUCTION BRANCH IN POLAND , Metody Ilościowe w Badaniach Ekonomicznych: Tom 19 Nr 4 (2018)
- Tomasz Woźniakowski, Magdalena Nowakowska, Maksymalizacja współczynnika GFP w produkcji napojów ekologicznych na podstawie danych z gospodarstw z województwa mazowieckiego , Metody Ilościowe w Badaniach Ekonomicznych: Tom 24 Nr 2 (2023)
- Tomasz Woźniakowski, Machine Learning Based Predictions of Sales Leads: Proof of Quality from Polish Business-to-Business Company , Metody Ilościowe w Badaniach Ekonomicznych: Tom 26 Nr 2 (2025)

Utwór dostępny jest na licencji Creative Commons Uznanie autorstwa – Użycie niekomercyjne 4.0 Międzynarodowe.
Publikowane artykuły dostępne są na warunkach Open Access na zasadach licencji Creative Commons CC BY-NC – do celów niekomercyjnych udostępnione materiały mogą być kopiowane, drukowane i rozpowszechniane. Autorzy ponoszą opłatę za opublikowanie artykułu.