Mazyar Taghavi
Biography
Mazyar (he/him) holds an M.Sc. in Applied Mathematics (Optimization) from Iran University of Science and Technology (IUST), where his thesis focused on *Optimization in Deep Reinforcement Learning*, and an M.Eng. in Computer Engineering (Artificial Intelligence and Robotics) from Payame Noor University (PNU), with a thesis on *Autonomous Medical Care for Space Travelers Using Deep Reinforcement Learning*. His academic training spans advanced linear and non-linear programming, optimal control, statistical signal processing, machine learning, and intelligent robotics. He is currently a Ph.D. researcher at the University of Madeira, Funchal, Portugal, working within the OPTIMA project (2025.02832.MAD, funded by FCT-Madeira). His research concentrates on mathematical optimization and reinforcement learning, with a strong emphasis on multi-agent systems, quantum-inspired methods, and optimal control for autonomous and intelligent systems.
At the ITI / University of Madeira, his work within the OPTIMA project focuses on developing AI-driven solutions for sustainable tourism. Concrete examples include: fine-tuning specialized Large Language Models (LLMs) for the Madeira tourism domain using retrieval-augmented generation (RAG) and multimodal data; designing a multi-objective contextual recommendation system that balances individual visitor satisfaction with real-time occupancy and overcrowding mitigation; and implementing a dynamic pricing module, based on reinforcement learning and optimization techniques, that adjusts prices according to demand, visitor mobility, and environmental conditions. He also contributes to digital cultural narrative generation and to the integration of these modules into a unified web platform. These activities build directly on his prior published research, which includes quantum-inspired multi-agent reinforcement learning for UAV-assisted 6G networks, optimal control for autonomous robotics, and mathematical frameworks for zero-shot stochastic meta multi-agent reinforcement learning under uncertainty.
Supervisor: Fábio Mendonça